Collection Transforms

obeliskos edited this page Jul 19, 2015 · 11 revisions

Collection transforms make Resultset chains more portable and easier to organize, and they allow chains to be created, managed, and executed dynamically, for example from a user interface. You should be familiar with LokiJS Resultset chaining before using this capability.

The basic premise behind transforms is to convert a Resultset 'chain' process into an object definition of that process. This data definition can then optionally be named and saved along with the collections within a database.

A transform is an ordered array of 'step' objects to be executed on a collection chain. Each step encodes one of the following operations as an object: 'find', 'where', 'simplesort', 'compoundsort', 'sort', 'limit', 'offset', 'map', 'mapReduce', or 'eqJoin'. Transform steps may hardcode their parameters or use the parameter substitution mechanism added for loki transforms.

A simple, one step loki transform might appear as follows :

var tx = [
  {
    type: 'find',
    value: {
      'owner': 'odin'
    }
  }
];

This can then optionally be saved into the collection with the command :

userCollection.addTransform('OwnerFilter', tx);

This transform can be executed by either :

userCollection.chain('OwnerFilter').data();

or

userCollection.chain(tx).data();

Parameterization is resolved on any object property whose right-hand value is a string beginning with '[%lktxp]'. An example of this might be :

var tx = [
  {
    type: 'find',
    value: {
      'owner': '[%lktxp]OwnerName'
    }
  }
];

To execute this pipeline you need to pass a parameters object that contains a value for each parameter. An example of this might be :

var params = {
  OwnerName: 'odin'
};

userCollection.chain(tx, params).data();

or, if the parameterized transform was saved under the name 'OwnerFilter' :

userCollection.chain("OwnerFilter", params).data();

Where-filter functions cannot be saved into a database. If you still need them, combining transforms with parameterization lets you save the transform cleanly and supply the function when you execute it. An example might be :

var tx = [
  {
    type: 'where',
    value: '[%lktxp]NameFilter'
  }
];

items.addTransform('ByFilteredName', tx);

// the following may then occur immediately or even across save/load cycles
// this example uses an anonymous function, but a named function reference works as well
var params = {
  NameFilter: function(obj) {
    return (obj.name.indexOf("nir") !== -1);
  }
};

var results = items.chain("ByFilteredName", params).data();

Transforms can contain multiple steps to be executed in succession. Behind the scenes, the chain command instantiates a Resultset and invokes your steps as independent chain operations before returning the result upon completion. A few of the built-in steps, such as 'mapReduce', terminate the transform/chain by returning a data array. In those cases the chain() result is the actual data, not a Resultset on which you would need to call data().
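The difference between a step that keeps the chain going and one that terminates it can be sketched with a toy interpreter over a plain array. This is only an illustration under stated assumptions: runTransform and makeResultset are hypothetical names, only three step types are handled, and none of this is loki's own implementation.

```javascript
// Toy stand-in for a Resultset: it only knows how to hand back its data.
function makeResultset(docs) {
  return {
    data: function () {
      return docs;
    }
  };
}

// Hypothetical interpreter that applies transform steps in order.
function runTransform(docs, tx) {
  var current = docs.slice();
  for (var i = 0; i < tx.length; i++) {
    var step = tx[i];
    if (step.type === 'where') {
      current = current.filter(step.value);
    } else if (step.type === 'limit') {
      current = current.slice(0, step.value);
    } else if (step.type === 'mapReduce') {
      // Terminal step: the reduced value is returned directly.
      return step.reduceFunction(current.map(step.mapFunction));
    }
    // Any other step type is skipped in this sketch.
  }
  // Non-terminal transforms end with something you still call data() on.
  return makeResultset(current);
}

var docs = [
  { name: 'mjolnir', owner: 'thor', weight: 10 },
  { name: 'gungnir', owner: 'odin', weight: 4 },
  { name: 'draupnir', owner: 'odin', weight: 1 }
];

function ownedByOdin(obj) {
  return obj.owner === 'odin';
}

// Ends on 'limit', so data() is still needed to get the documents.
var limited = runTransform(docs, [
  { type: 'where', value: ownedByOdin },
  { type: 'limit', value: 1 }
]);

// Ends on 'mapReduce', so the return value is already the final result.
var total = runTransform(docs, [
  { type: 'where', value: ownedByOdin },
  {
    type: 'mapReduce',
    mapFunction: function (obj) { return obj.weight; },
    reduceFunction: function (weights) {
      return weights.reduce(function (sum, w) { return sum + w; }, 0);
    }
  }
]);
```

Here limited.data() yields the matching documents, while total is already the reduced value and has no data() method to call.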

A more complicated transform example might appear as follows :

var tx = [
  {
    type: 'find',
    value: {
      owner: {
        '$eq': '[%lktxp]customOwner'
      }
    }
  },
  {
    type: 'where',
    value: '[%lktxp]customFilter'
  },
  {
    type: 'limit',
    value: '[%lktxp]customLimit'
  }
];

function myFilter(obj) {
  return (obj.name.indexOf("nir") !== -1);
}

var params = {
  customOwner: 'odin',
  customFilter: myFilter,
  customLimit: 100
};

var results = users.chain(tx, params).data();

As the above example demonstrates, loki scans the object hierarchy (up to 10 levels deep) and performs parameter substitution on right-hand values that appear to be parameters, looking each one up in your params object. The substitution replaces the string with the value held in your params, which can be of any data type.
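The substitution described above can be approximated with a short recursive walk. The following is a minimal sketch, not loki's actual code: resolveParams, PARAM_PREFIX, and MAX_DEPTH are hypothetical names, and leaving an unmatched placeholder untouched is an assumption made for this example.

```javascript
var PARAM_PREFIX = '[%lktxp]';
var MAX_DEPTH = 10; // the scan is described as going up to 10 levels deep

// Returns a copy of node with every '[%lktxp]Name' string replaced
// by params.Name. The input is not modified.
function resolveParams(node, params, depth) {
  depth = depth || 0;
  if (depth > MAX_DEPTH) {
    return node;
  }

  if (typeof node === 'string' && node.indexOf(PARAM_PREFIX) === 0) {
    var name = node.substring(PARAM_PREFIX.length);
    // Assumption: keep the placeholder when no value was supplied.
    return Object.prototype.hasOwnProperty.call(params, name) ? params[name] : node;
  }

  if (Array.isArray(node)) {
    return node.map(function (item) {
      return resolveParams(item, params, depth + 1);
    });
  }

  if (node !== null && typeof node === 'object') {
    var copy = {};
    Object.keys(node).forEach(function (key) {
      copy[key] = resolveParams(node[key], params, depth + 1);
    });
    return copy;
  }

  // Numbers, booleans, functions, and null pass through unchanged.
  return node;
}

var sampleTx = [
  { type: 'find', value: { owner: { '$eq': '[%lktxp]customOwner' } } },
  { type: 'limit', value: '[%lktxp]customLimit' }
];

var resolvedTx = resolveParams(sampleTx, { customOwner: 'odin', customLimit: 100 });
```

After this call, resolvedTx[0].value.owner['$eq'] is the string 'odin' and resolvedTx[1].value is the number 100, which shows that the replacement keeps the data type of the supplied value.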

Certain steps take multiple parameters and therefore require specifically named step properties (other than just type and value). These are demonstrated below as separate steps, which would not necessarily make sense together within a single transform :

var step1 = {
  type: 'simplesort',
  property: 'name',
  desc: true
};

var step2 = {
  type: 'mapReduce',
  mapFunction: myMap,
  reduceFunction: myReduce
};

var step3 = {
  type: 'eqJoin',
  joinData: jd,
  leftJoinKey: ljk,
  rightJoinKey: rjk,
  mapFun: myMapFun
};

Adding meta for custom solutions

One use for transforms might be to have user driven solutions where you might have the user interface constructing, managing, and executing these transforms. Another might be that you just create many individual database files to be analyzed from your own program interfaces. In such situations you might need to add your own metadata to further describe transforms, steps, or parameters.

If you intend to automate transforms in such a data-driven way, you may encode your own metadata within the transforms and steps so that you can access it later.

  • Any step with a 'type' unknown to loki transforms will be ignored. You might decide to always make the first step a 'meta' type with properties holding information about the author, a description, or metadata describing the required parameters.
  • Each step may also include additional properties beyond those we have defined as required, so you might embed step descriptions, last changed dates, etc. within steps.
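As one way to use such metadata, the sketch below puts a 'meta' step first and reads it back before execution to check that the caller supplied every parameter. The property names author, description, and requiredParams, and the helper findMissingParams, are hypothetical choices for this example and are not defined by loki.

```javascript
var describedTx = [
  {
    // Unknown step type: ignored by loki transforms, readable by your own code.
    type: 'meta',
    author: 'jane',
    description: 'Items for one owner, capped at a maximum count',
    requiredParams: ['OwnerName', 'MaxResults']
  },
  {
    type: 'find',
    value: { owner: '[%lktxp]OwnerName' },
    note: 'extra property carried along with the step'
  },
  {
    type: 'limit',
    value: '[%lktxp]MaxResults'
  }
];

// Lists every required parameter name that is absent from params.
function findMissingParams(tx, params) {
  var missing = [];
  tx.forEach(function (step) {
    if (step.type !== 'meta' || !Array.isArray(step.requiredParams)) {
      return;
    }
    step.requiredParams.forEach(function (name) {
      if (!params || !Object.prototype.hasOwnProperty.call(params, name)) {
        missing.push(name);
      }
    });
  });
  return missing;
}

var missingParams = findMissingParams(describedTx, { OwnerName: 'odin' });
```

A user interface could run a check like this and prompt for the names in missingParams (here only 'MaxResults') before passing the transform to chain().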

Summary

Loki transforms establish (with little additional footprint) a process for automating transformations on your data. This functionality is optional and is not intended to replace method chaining, but it presents an option for 'growing into' loki, possibly providing cleaner code organization, logic bootstrapping, and extending its capabilities to various dynamic uses.