op4j: bending the Java spoon

Features: a general overview

Here are some quick details about what op4j can do:

Apply functions on objects.
- Including more than 200 predefined functions for data conversion, string handling, calendar operation, math operations, etc...
Manage structures (arrays, lists, maps and sets), including:
- Easily creating them from their elements (building).
- Modifying them (adding / removing / replacing elements).
- Iterating them, applying functions on each element.
- Transforming them into other structures.
Execute conditional logic (execute actions depending on conditions).
Define functions from expressions, for later use (similar to closures).
...and more.

1. Expressions

op4j is used for implementing (mainly) auxiliary code as expressions. Expressions look like this:

output = Op.on(input).[ACTION].[ACTION]...[ACTION].get();
function = Fn.on(inputType).[ACTION].[ACTION]...[ACTION].get();

They start by calling either the Op.on(...) or the Fn.on(...) static methods, which receive the input object (Op.on) or input type (Fn.on) as a parameter. There are some variations of the "on(...)" methods with slightly different names, which are explained in the documentation sections for the specific types of input.

Expressions starting with Op.on are called Op expressions or Operation expressions. They act on a specific input object/s and perform a set of chained actions on it before returning a result.
```
String[] values = ...;
List<String> upperStrs = Op.on(values).toList().map(FnString.toUpperCase()).get();
```
Expressions starting with Fn.on are called Fn expressions or Function expressions. They allow the definition of functions as a set of chained actions of exactly the same kind as op expressions, returning as a result an executable Function object equivalent to the defined chain.
```
Function<String[],List<String>> upperStrsFunc = 
    Fn.onArrayOf(Types.STRING).map(FnString.toUpperCase()).toList().get();
...
String[] values = ...;
List<String> upperStrs = upperStrsFunc.execute(values);
```

Expressions end with get(), which returns the output (op expressions) or the defined function (fn expressions).

2. Actions

An action is a chained method executed in an expression, between the "on(...)" and the "get()" methods.

For example:

String[] values = ...;
List<Integer> intValueList = Op.on(values).toList().forEach().exec(FnString.toInteger()).get();

Here we can see three chained actions: toList, forEach and exec.

3. Operators

Operators are the objects representing each of the states of an expression. They are the objects resulting from the execution of actions (or on(...)).

Op.on(...) and Fn.on(...) return operators.
Each action method chained after these "on" methods returns a different operator.
get() methods stop the action/operator chains and return results.

Looking again at our example:

List<Integer> intValueList = Op.on(values).toList().forEach().exec(FnString.toInteger()).get();

We can explain it as:

The on(...) method returns operator1.
The toList() method (or action) is called on operator1, which returns operator2.
The forEach() method (or action) is called on operator2, which returns operator3.
The exec(...) method (or action) is called on operator3, which returns operator4.
The get() method is called on operator4, which returns the expression result.

All operators are immutable. When an action is executed on them, they return a new and different operator representing the new state, instead of changing their own internal state.

4. Targets

The "target" or "targets" of an operator are the objects that the operator's actions (methods) will execute on.

As operators are linked together by actions in the shape of chains, an operator's target is the result of the execution of an action on the previous operator.

Following the previous example:

List<Integer> intValueList = Op.on(values).toList().forEach().exec(FnString.toInteger()).get();

The on(...) method creates operator1.
- Operator1: TARGET = input (a String[] object called values)
The toList() method (or action) is called on operator1, which returns operator2.
- Operator2: TARGET = a List<String> object created from the original array.
The forEach() method (or action) is called on operator2, which returns operator3.
- Operator3: TARGET = each of the elements of List<String> (an iterating list)
The exec(...) method (or action) is called on operator3, which returns operator4.
- Operator4: TARGET (for each execution) = one of the elements of List<String>
The get() method is called on operator4, which returns the expression result (List<Integer>).

5. The operators as a state machine

The behaviour of operators in op4j allows the operator system to be thought of as a state machine where operators represent states, and actions represent state changes. Thanks to this:

Each operator offers only the actions (methods) that can effectively be executed in the specific state it represents.

What does this mean? It means for example that, if you execute Op.on(...) on a Set<String> object, you will be offered a forEach() method for iterating it (because Sets can be iterated), but you will not be offered a distinct() method for removing duplicates as would have been offered if your object was a List (because Sets can never contain duplicates).

Another example, the following sentence will not compile:

List<Integer> intValueList = Op.on(values).forEach().forEach()...

After the first forEach(), you are already iterating the input array, so there is no point in iterating it again!

The five branches of the state machine

The operator state machine in op4j defines five branches, this is, five sub-machines with almost entirely different sets of operator classes and thus corresponding associated actions.

This allows the special treatment of several kinds of objects by offering specific methods for them, as exemplified above. These five branches are:

Array operators for T[] objects.
List operators for List<T> objects.
Map operators for Map<K,V> objects.
Set operators for Set<T> objects.
Generic operators for any other object.

Operators in each of these five branches will offer different actions depending on the state of the expression (for example, a set operator will offer forEach() only if this action has not been added to the chain yet, or it has but an endFor() has been called afterwards).

The array operators branch will be activated if:

Op.on(input) is called being input an array variable.
Op.onArrayOf(type, input) is called being input an array variable and type the javaRuntype Type object representing the array's elements (e.g: Op.onArrayOf(Types.STRING, myStrArr)).
Op.onArrayFor(e1, e2, e3...) is called being e1, e2, e3... the elements on which the expression will be created, treating them as an array.
Fn.onArrayOf(type) is called being type the type of the elements of the arrays that will be passed as input to the defined function.
The asArrayOf(type) action is called from a generic operator. This action represents a cast operation.

The list operators branch will be activated if:

Op.on(input) is called being input a ListT variable.
Op.onList(input) is called being input a ListT variable.
Op.onListFor(e1, e2, e3...) is called being e1, e2, e3... the elements on which the expression will be created, treating them as a list.
Fn.onListOf(type) is called being type the type of the elements of the lists that will be passed as input to the defined function.
The asListOf(type) action is called from a generic operator. This action represents a cast operation.

The map operators branch will be activated if:

Op.on(input) is called being input a MapK,V variable.
Op.onMap(input) is called being input a MapK,V variable.
Op.onMapFor(key, value) is called being (key,value) the first entry added to a map on which the expression will be created. Before executing any actions, new entries can be added like Op.onMapFor(key, value).and(key2, value2).and(key3, value3)...
Fn.onMapOf(keyType,valueType) is called being keyType and valueType the types of the keys and values of the maps that will be passed as input to the defined function.
The asMapOf(keyType,valueType) action is called from a generic operator. This action represents a cast operation.

The set operators branch will be activated if:

Op.on(input) is called being input a SetT variable.
Op.onSet(input) is called being input a SetT variable.
Op.onSetFor(e1, e2, e3...) is called being e1, e2, e3... the elements on which the expression will be created, treating them as a set.
Fn.onSetOf(type) is called being type the type of the elements of the sets that will be passed as input to the defined function.
The asSetOf(type) action is called from a generic operator. This action represents a cast operation.

The generic operators branch will be activated if:

Op.on(input) is called not being input an array, list, map or set.
Fn.on(type) is called being type the type of the objects that will be passed as input to the defined function.
The generic() action is called from an operator in any of the other four branches.

Flowing code: the advantage of the state machine

op4j's state machine makes it easy for you to create your expressions by just using the Content assist feature in your IDE (usually by pressing Ctrl+Space after ".") and going through the list of actions (methods) available to you at a specific point in the chain, selecting the one that fits your needs.

6. Functions

Functions are a key concept in op4j. A quick definition:

An op4j function is an object of a class which implements the org.op4j.functions.IFunction interface

The IFunction interface is defined as IFunction<T,R>, being T the type of the function input, and R the type of the function result.

IFunction only has one method, R execute(T input, ExecCtx ctx), which receives a T object and a context (an internal structure containing iteration information) and returns an R object.

All op4j predefined functions, as well as all functions returned by Fn.on expressions extend a special implementation of IFunction<T,R> called Function<T,R>. This class adds a simpler R execute(T input) method, useful for the isolated (not inside an expression) execution of a Function object.

Executing functions

op4j includes more than 200 functions out-of-the-box, and most actions available in operators correspond to the execution of some of these predefined functions. For example, the distinct() action on a set operator created for a Set<String> input corresponds to the internal execution of the predefined FnSet.ofString().distinct() function (which implements IFunction<Set<String>,Set<String>>).

Besides, almost every operator offers the exec(IFunction) action, which can take any function as a parameter and execute it on the operator's target. This means that, with exec(...), you can execute:

Any of the predefined functions.
Any function you defined by just implementing the IFunction interface.
Any function you defined by executing an fn expression.

These last kind of functions, the ones defined with fn expressions (which are objects of the special Function class), can also be executed by themselves without being part of any other expression:

function = Fn.on(...)...get();
output = function.execute(input);

7. Types

op4j is tightly integrated with the javaRuntype project, which in fact was born as a part of op4j (later separated as a project on its own during development).

javaRuntype offers a runtime type system which fits perfectly the need op4j has for being able to specify types (for example, of a list's elements) including their type parameters (the so-called generics), which is something not covered by the standard java.lang.Class objects.

javaRuntype's Type objects are extremely easy to use, and are created and managed by means of the org.javaruntype.type.Types class, which in fact contains many useful predefined constants for the most used types.

The op4j Project