Processing while ingesting
DaCHS allows you to process your data while it is being ingested, i.e., inside your RD. This can be done at different levels, from type casting of variables to unit conversion to complex computations involving multiple variables. For the simple, common cases (e.g., unit conversion), DaCHS even provides an API; for more complex tasks, you can define your own custom Python code.
In your RD there are a couple of places where you can trigger such processing calls. One of these places is the apply element, inside the rowmaker element of data; the other is the rowfilter element, inside the grammar element [1]:
- apply
- apply elements allow you to embed Python code. To do so, wrap your code fragment in a code element, as shown right below. The namespace in place provides the data coming from the grammar through the vars dictionary, whose key:value pairs can be not only read but also added or modified.
<apply name="myProcessing">
  <code>
    if something is False:
      pass
  </code>
</apply>
TODO: the apply element can have a procDef attribute [2]
TODO: namespace, variables/structures available to manipulate
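To make the role of the vars dictionary more concrete, here is a minimal stand-alone Python sketch of what the body of an apply code element might do. It does not use DaCHS itself: the function apply_body and the keys flux_mjy and flux_jy are hypothetical, vars is modelled as a plain dict, and the incoming value is assumed to be a string.

```python
def apply_body(vars):
    """Mimic an apply code body: read, cast, and add entries in vars.

    Note: 'vars' shadows the Python builtin here only to mirror the
    name used in the text; it is a plain dict in this sketch.
    """
    # Assumption: the grammar delivered the value as a string, so cast it.
    flux_mjy = float(vars["flux_mjy"])
    # Unit conversion (millijansky to jansky), stored under a new key.
    vars["flux_jy"] = flux_mjy / 1000.0
    # An existing key can be overwritten as well, here with the cast value.
    vars["flux_mjy"] = flux_mjy
    return vars


row_vars = apply_body({"flux_mjy": "250.0"})
print(row_vars)
```

In an actual RD, only the statements inside the function would go into the code element; the surrounding function merely stands in for the namespace that DaCHS provides.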
- rowfilter
- rowfilter is like apply (i.e., it has the same structure), but is meant to be used only inside a grammar element. There you can access (only) whatever comes from the grammar, which is available as a dictionary named row; individual values are read as row[key] [*]. The code block of a rowfilter should contain at least one yield statement. In fact, you can place as many yield statements as you like, each one generating a row.
<rowfilter name="myRowGen">
  <code>
    <!-- do something -->
    yield row
  </code>
</rowfilter>
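The effect of several yield statements can be pictured with an ordinary Python generator. The sketch below is a stand-alone illustration, not DaCHS code: the function my_row_gen and the keys obs_id, bands and band are hypothetical, and row is modelled as a plain dict.

```python
def my_row_gen(row):
    """Mimic a rowfilter code body: emit one output row per band."""
    for band in row["bands"].split(","):
        # Copy the incoming row so every yielded row is independent.
        new_row = dict(row)
        new_row["band"] = band.strip()
        yield new_row


# One input row with three bands turns into three output rows.
rows = list(my_row_gen({"obs_id": "42", "bands": "g, r, i"}))
for r in rows:
    print(r["obs_id"], r["band"])
```

Each yield hands one row on to whatever comes next, so a single input row can be expanded into several, or passed through unchanged with a single yield row.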
Like apply, rowfilter elements can be declared any number of times, or not at all; they are executed in sequence.
The keys available in the row dictionary are determined by the specific grammar in use and by any preceding procedures.
For instance, if you declare two rowfilter elements and the first one uses the procedure definition (procDef) //products#define, the second rowfilter block will see a row containing the grammar source data plus the keys defined by //products#define.
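The way keys accumulate along a sequence of rowfilter elements can be sketched with chained generators. This is a conceptual illustration only: first_filter, second_filter, run_filters and the keys filename, extra_key and saw_extra_key are all hypothetical, and the sketch makes no claim about which keys //products#define actually adds.

```python
def first_filter(row):
    """Stand-in for a first rowfilter that adds a key to the row."""
    row = dict(row)
    row["extra_key"] = "added by first filter"
    yield row


def second_filter(row):
    """Stand-in for a second rowfilter; it sees the key added above."""
    row = dict(row)
    row["saw_extra_key"] = "extra_key" in row
    yield row


def chain_stage(rows, row_filter):
    """Feed every incoming row through one filter, passing on all yields."""
    for row in rows:
        yield from row_filter(row)


def run_filters(source_rows, filters):
    """Apply the filters in declaration order."""
    rows = iter(source_rows)
    for row_filter in filters:
        rows = chain_stage(rows, row_filter)
    return list(rows)


result = run_filters([{"filename": "a.fits"}], [first_filter, second_filter])
print(result)
```

Swapping the order of the two filters would leave saw_extra_key set to False, which is why the declaration order of rowfilter elements matters.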
- setup & bind
- Besides the code element used to declare the Python code fragment, one can use the setup element as a dedicated code block for setting up the (global) namespace that subsequent code blocks will use. It is also worth noting that the bind element offers a direct way of assigning (new) values to variables in that namespace.
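As a rough mental model of how setup, bind and code relate, the following stand-alone Python sketch runs a setup fragment once, assigns bound values into the same namespace, and then executes a code fragment per row. It is an assumption-laden illustration, not how DaCHS is implemented: make_processor, SETUP_SOURCE, BODY_SOURCE, DEG_TO_RAD, scale and the keys ra_deg and ra_rad are all hypothetical.

```python
# Hypothetical setup fragment: executed once to prepare the namespace.
SETUP_SOURCE = "import math\nDEG_TO_RAD = math.pi / 180.0\n"
# Hypothetical code fragment: executed for every row, using names from setup.
BODY_SOURCE = "vars['ra_rad'] = float(vars['ra_deg']) * DEG_TO_RAD * scale\n"


def make_processor(setup_source, body_source, bindings):
    """Build a per-row function sharing one namespace across all calls."""
    namespace = {}
    exec(setup_source, namespace)   # "setup": runs once
    namespace.update(bindings)      # "bind": assign values to names
    body = compile(body_source, "<code>", "exec")

    def process(row_vars):
        namespace["vars"] = row_vars
        exec(body, namespace)       # "code": runs per row
        return row_vars

    return process


process = make_processor(SETUP_SOURCE, BODY_SOURCE, {"scale": 1.0})
result = process({"ra_deg": "180.0"})
print(result["ra_rad"])
```

The point of the split is that anything expensive or shared (imports, constants, helper functions) is prepared once, while the per-row fragment stays short.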
[1] The structure of an RD: data
[2] procDef internal link
[*] key is...?