workflow RDF #478
base: main
Conversation
A preview of the workflow RDF documentation can be found here: |
adapted after further discussion with @oeway and @constantinpape |
hey @oeway! We iterated over several solutions and arrived at this:
Essentially, we don't try to come up with our own logic, but mirror how Python callables behave. |
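As a rough illustration of what "mirroring Python callables" means here (the op and parameter names below are hypothetical, not part of the spec): required inputs behave like positional parameters and optional, named inputs like keyword parameters.

```python
# Hypothetical op: arg0/arg1 are required (positional) inputs, key0/key1 are
# optional (named) inputs -- the step definition is meant to read the same way.
def some_op(arg0, arg1, key0="default0", key1="default1"):
    return arg0, arg1, key0, key1

# a step with inputs [arg0, arg1, {key0: value0}] is meant to behave like:
some_op("arg0", "arg1", key0="value0")
```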
been thinking about this part a bit more... bear with me, it's shorter than it looks!
We can also mix positional inputs and named inputs with list entries containing a dict with a single key (that is the name of an input):

steps:
- op: ...
  inputs:
    - arg0
    - arg1
    - key0: value0
    - key1: value1

Keyword order is ignored:

steps:
- op: ...
  inputs:
    - arg0
    - arg1
    - key1: value1
    - key0: value0  # key0 and key1 are given as keyword arguments, their order is ignored 👍

Assuming the previous step returns [arg0, arg1], this is equivalent to the above:

steps:
- op: ...
  inputs:  # positional inputs (arg0, arg1) are taken from the previous step, optional inputs may be overwritten 👍
    key0: value0
    key1: value1

If inputs is a list, I would ignore positional outputs from the step above:

steps:
- op: ...
  inputs:  # arg0 is NOT filled in from the previous step
    - arg1
    - key0: value0
    - key1: value1

Correct syntax, but in this example it would …

This pattern implies that input values that are dicts with one key need to be given like this:

steps:
- op: ...
  inputs:
    arg0: {key: value}

or

steps:
- op: ...
  inputs:
    - arg0: {key: value}

because

steps:
- op: ...
  inputs:
    - key: value

would be interpreted as the optional input 'key' with value 'value' and not as the first positional argument (here arg0) with value {key: value}. |
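A minimal sketch of the interpretation rules above (not part of the PR; the helper name and error handling are assumptions):

```python
def split_inputs(inputs):
    """Hypothetical helper illustrating the rules above:
    - a plain list entry is a positional argument,
    - a single-key dict entry is a keyword argument,
    - a dict value for 'inputs' means: positional arguments are taken from the
      previous step and the dict only overrides optional (keyword) inputs.
    """
    if isinstance(inputs, dict):
        return None, dict(inputs)  # positionals come from the previous step

    args, kwargs = [], {}
    for entry in inputs:
        if isinstance(entry, dict):
            if len(entry) != 1:
                raise ValueError("named inputs must be single-key dicts")
            kwargs.update(entry)
        else:
            args.append(entry)
    return args, kwargs


# mixed positional and named inputs:
print(split_inputs(["arg0", "arg1", {"key0": "value0"}, {"key1": "value1"}]))
# -> (['arg0', 'arg1'], {'key0': 'value0', 'key1': 'value1'})
```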
@oeway, one more thing @constantinpape and I discussed: |
Hi, at first glance I like the idea of merging inputs and kwargs. However, thinking about it a bit further, I find it actually makes the usage much more complicated and confusing.
I don't really get what this means; the rule is pretty clear to me. It is the same as defining a Python function: if an argument is required, it goes in inputs, and if it is optional with a default, it goes in kwargs. Inputs don't need to be named, but kwargs do. inputs is a list while kwargs is a dictionary. You can of course merge the two by exploiting a list of dicts, or a dict, as you are proposing, but it feels very hacky and needs a lot of additional rules when using it.
But actual Python callables separate the args (a list) and the kwargs (a dict). Using a list of single-key dicts, or a dict, to encapsulate the two into one object makes it feel hacky and hard to use. In the following example:

- op: zero_mean_unit_variance
- op: model_inference
  kwargs:
    model_id: fearless-crab
    preprocessing: false # disable the preprocessing
    postprocessing: false # disable the postprocessing
- op: stardist_postprocessing
  kwargs:
    diameter: 2.3

If I understand correctly, adapting what you proposed, it will become:

- id: step1
  op: zero_mean_unit_variance
  outputs: [out1]
- id: step2
  op: model_inference
  inputs:
    - step1.outputs.out1
    - model_id: fearless-crab
    - preprocessing: false # disable the preprocessing
    - postprocessing: false # disable the postprocessing
  outputs: [out2]
- op: stardist_postprocessing
  inputs:
    - step2.outputs.out2
    - diameter: 2.3

Note that, because we need to set the kwargs in steps 2 and 3, we are forced to name the inputs, and this propagates to the other steps, forcing us to set an id for the first two steps and name the outputs. This introduces a lot of hassle for the naming. We could potentially allow omitting the positional inputs when we only want to change the kwargs; however, this makes the inputs field look confusing, since we are not writing out all the inputs. Overall, I feel that merging inputs and kwargs requires more rules and the list of single-key dicts feels hacky; separating them actually makes more sense to me.
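For comparison, a minimal sketch (helper and op names are hypothetical) of how separate inputs/kwargs fields map onto a Python call without any extra interpretation rules:

```python
def run_op(op, inputs=None, kwargs=None):
    """Hypothetical dispatcher: with separate fields there is nothing to
    disentangle -- 'inputs' is *args and 'kwargs' is **kwargs."""
    return op(*(inputs or []), **(kwargs or {}))


def model_inference(raw, model_id, preprocessing=True, postprocessing=True):
    ...  # stand-in for the real op


run_op(model_inference, inputs=["tensor0"],
       kwargs={"model_id": "fearless-crab", "preprocessing": False, "postprocessing": False})
```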
Sure, maybe we should bring in more of the tensor spec (e.g. shape) too; however, for simplicity, maybe adding axes, shape, dtype etc. as optional fields would help. |
No, I would not force naming/specifying:

- id: step0
  op: zero_mean_unit_variance
- id: step2
  op: model_inference
  inputs:
    model_id: fearless-crab
    preprocessing: false # disable the preprocessing
    postprocessing: false # disable the postprocessing
- op: stardist_postprocessing
  inputs:
    diameter: 2.3

If positional inputs need to be specified, this would become:

- id: step0
  op: zero_mean_unit_variance
  outputs: [out1]
- id: something-in-between # added this step to the example, because otherwise out1 would not need naming
  op: ...
- id: step2
  op: model_inference
  inputs:
    - step0.outputs.out1
    - model_id: fearless-crab
    - preprocessing: false # disable the preprocessing
    - postprocessing: false # disable the postprocessing
- op: stardist_postprocessing
  inputs:
    diameter: 2.3
To refine my idea of merging inputs and kwargs, I wanted to propose a slight update:
The above example then becomes:

- id: step0
  op: zero_mean_unit_variance
  outputs: [out1]
- id: something-in-between # added this step to the example, because otherwise out1 would not need naming
  op: ...
- id: step2
  op: model_inference
  inputs:
    - step0.outputs.out1
    - model_id: fearless-crab
      preprocessing: false # disable the preprocessing
      postprocessing: false # disable the postprocessing
- op: stardist_postprocessing
  inputs:
    diameter: 2.3
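The update itself is not spelled out above, but judging from the example the change appears to be that a single trailing dict may collect all keyword arguments. A sketch under that assumption (hypothetical helper name):

```python
def split_inputs_updated(inputs):
    """Assumed updated rule: entries of 'inputs' are positional arguments,
    except that a single trailing dict collects all keyword arguments."""
    args = list(inputs)
    kwargs = args.pop() if args and isinstance(args[-1], dict) else {}
    return args, kwargs


# e.g. ["step0.outputs.out1", {"model_id": "fearless-crab", "preprocessing": False}]
# -> (["step0.outputs.out1"], {"model_id": "fearless-crab", "preprocessing": False})
```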
However, I have to agree with @oeway: this is still hacky...
I agree that shape and dtype may be added as optional fields, but I think we cannot handle tensors reasonably without knowing their axes. We don't want a bunch of transpose ops littered throughout the workflows, which would be the consequence... |
@oeway @constantinpape I know this is a lot to read through and think about, but it would be really convenient for me if you could take a brief moment and just comment on whether we can proceed with the current state or need more discussion (if so, that does not need to happen right away ;-) ). I tried to comment all discussed/proposed decisions in the following example:

inputs:
- name: ipt0
  type: tensor
  axes: bxy # if type=='tensor' then 'axes' is mandatory
options:
- name: param0 # although options are given as a dict, we specify them as a list to match the inputs/outputs pattern
  type: int
  default: null # options always need a default value (may be null or of type 'type')
outputs:
- name: ...
  type: tensor
  axes: bxy
- name: ...
  type: bool
steps:
- id: step0
  op: zero_mean_unit_variance
  outputs: [out1]
- id: something-in-between # added this step to the example, because otherwise out1 would not need naming
  op: ...
- id: step2
  op: model_inference
  inputs: [step0.outputs.out1]
  # inputs as a list or alternatively by name as a dict, but either way all inputs always need to be specified:
  # inputs:
  #   raw: step0.outputs.out1
  options:
    model_id: fearless-crab
    preprocessing: false # disable the preprocessing
    postprocessing: false # disable the postprocessing
- op: stardist_postprocessing
  options:
    diameter: 2.3
test_steps: # analogous to 'steps', but they do not take any inputs/options (they may call the 'steps' multiple times with different fixed inputs/options though)
- ... |
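To make the step wiring concrete, here is a minimal sketch of how a consumer might resolve 'step_id.outputs.name' references while executing the steps. This is not the bioimage.io implementation; the op lookup and the convention that each op returns a tuple of outputs are assumptions.

```python
def run_workflow(steps, ops):
    """Hypothetical executor: 'ops' maps op names to Python callables that
    return a tuple of outputs; 'steps' follows the example above."""
    results = {}  # "step_id.outputs.name" -> value
    for step in steps:
        op = ops[step["op"]]
        # replace "<step_id>.outputs.<name>" references with earlier results
        args = [results.get(ipt, ipt) for ipt in step.get("inputs", [])]
        outputs = op(*args, **step.get("options", {}))
        for name, value in zip(step.get("outputs", []), outputs):
            results[f"{step['id']}.outputs.{name}"] = value
    return results
```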
I agree with this. But when I discussed this with @FynnBe, he said that you wanted to allow default values for … |
That sounds good to me. About the last point: I would rather not do it, since I think we would need to introduce more rules etc. and it makes reading everything more complex. But I don't have a strong opinion on this, so I would be fine with it either way. I also agree that we should have … |
Looks good from my end. My time to look into this will be very limited in the next few weeks, so please don't wait on my feedback to go ahead here. |
Hi @FynnBe, I like this! No strong opinion about changing … A minor suggestion: would it make sense to make … For axes, would it make sense to bring this new axes definition into the tensor spec? |
applies only for creating an RDF raw node in code
this is ready for a pre-alpha release! |
implements spec part of #477
with a few assumptions on open questions...