A `FunctionTask` is a `Task` that can be created from any *python* function by using the *pydra* decorator `pydra.mark.task`:

```{code-cell} ipython3
import pydra

@pydra.mark.task
def add_var(a, b):
    return a + b
```

Once we decorate the function, we can create a pydra `Task` and specify the input:

```{code-cell} ipython3
task0 = add_var(a=4, b=5)
```

We can check the type of `task0`:

```{code-cell} ipython3
type(task0)
```

We can also check that the task has the correct values of `a` and `b`; they should be saved in the task `inputs`:

```{code-cell} ipython3
print(f'a = {task0.inputs.a}')
print(f'b = {task0.inputs.b}')
```

We can also check the content of the entire `inputs`:

```{code-cell} ipython3
task0.inputs
```

As you can see, `task.inputs` also contains information about the function, which is an inseparable part of the `FunctionTask`.

Once the task has its inputs set, we can run it. Since `Task` is a callable object, we can use the syntax:

```{code-cell} ipython3
task0()
```

As you can see, the result was returned right away, but we can also access it later:

```{code-cell} ipython3
task0.result()
```
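
To make the "callable object" idea concrete, here is a minimal plain-Python sketch that does not use pydra at all. The class `ToyTask`, its attributes, and the helper `add_numbers` are hypothetical names chosen for illustration; they only imitate the call-then-`result()` pattern shown above and do not mirror pydra's internals.

```python
class ToyTask:
    """Hypothetical stand-in for a task; not part of pydra."""

    def __init__(self, func, **inputs):
        self.func = func
        self.inputs = inputs
        self._result = None  # nothing has been computed yet

    def __call__(self):
        # Calling the object runs the function and remembers the outcome.
        self._result = self.func(**self.inputs)
        return self._result

    def result(self):
        # Return whatever the last call stored (None before the first call).
        return self._result


def add_numbers(a, b):
    return a + b


toy = ToyTask(add_numbers, a=4, b=5)
returned = toy()        # run the task
stored = toy.result()   # access the same value later
print(returned, stored)
```

Defining `__call__` is what lets an instance be invoked with parentheses, which is the same syntax used for `task0()`.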

`Result` contains more than just an output, so if we want to get the task output, we can type:

```{code-cell} ipython3
result = task0.result()
result.output.out
```

If we want to see the inputs that were used in the task, we can set the optional argument `return_inputs` to `True`:

```{code-cell} ipython3
task0.result(return_inputs=True)
```

## Type-checking

+++

### What is Type-checking?

Type-checking is the process of verifying the type of a value at compile time or run time. It ensures that operations on variables and assignments to them are semantically meaningful and can be executed without type errors, which improves code reliability and maintainability.
2. **Improved Readability**: Type annotations make it easier to see what types of values a function expects and returns.
3. **Better Documentation**: Explicitly stating expected types acts as inline documentation, simplifying code collaboration and review.
4. **Optimized Performance**: In compiled languages, and with Python tools such as Cython or mypyc, explicitly specified types allow type-related optimizations during compilation; the standard CPython interpreter does not use annotations for this.

+++

### How is Type-checking Implemented in Pydra?

+++

#### Static Type-Checking

Static type-checking is done using Python's type annotations. You annotate the types of your function arguments and the return type, and then use a tool like `mypy` to statically check whether you're using the function correctly according to those annotations.

```{code-cell} ipython3
@pydra.mark.task
def add(a: int, b: int) -> int:
    return a + b
```

```{code-cell} ipython3
# This usage is correct according to static type hints:
task1a = add(a=5, b=3)
task1a()
```

```{code-cell} ipython3
# This usage is incorrect according to static type hints:
task1b = add(a="hello", b="world")
task1b()
```
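
It helps to remember that plain Python only records annotations and does not enforce them when a function is called, so whether the string call fails depends on the checker or framework in use. The sketch below uses an undecorated function, named `add_plain` here purely for illustration, to show what the interpreter itself does with the hints.

```python
def add_plain(a: int, b: int) -> int:
    # Annotations are recorded as metadata; the interpreter does not check them.
    return a + b


# The annotations can be inspected at run time.
print(add_plain.__annotations__)

# Calling with strings violates the hints, yet plain Python simply concatenates.
print(add_plain("hello", "world"))
```

A static checker such as `mypy` is expected to flag the second call as an incompatible argument type, without running the code.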

#### Dynamic Type-Checking

Dynamic type-checking is done at run time. Add dynamic type checks if you want to enforce types when the function is executed.

```{code-cell} ipython3
@pydra.mark.task
def add(a, b):
    if not (isinstance(a, int) and isinstance(b, int)):
        raise TypeError("Both inputs should be integers.")
    return a + b
```

```{code-cell} ipython3
# This usage is correct and will not raise a runtime error:
task1c = add(a=5, b=3)
task1c()
```

```{code-cell} ipython3
# This usage is incorrect and will raise a runtime TypeError:
task1d = add(a="hello", b="world")
task1d()
```
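
Writing `isinstance` checks by hand in every function gets repetitive. One possible alternative, sketched below in plain Python, is a small decorator that reads the annotations with `typing.get_type_hints` and checks the arguments at call time. The names `enforce_types` and `add_checked` are hypothetical, the helper only handles plain classes such as `int` or `str`, and nothing here is part of pydra.

```python
import functools
import inspect
from typing import get_type_hints


def enforce_types(func):
    """Hypothetical helper: check simple (non-generic) annotations at call time."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only plain classes such as int or str are handled in this sketch.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{name} should be {expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@enforce_types
def add_checked(a: int, b: int) -> int:
    return a + b


print(add_checked(5, 3))

try:
    add_checked("hello", "world")
except TypeError as err:
    print(f"TypeError: {err}")
```

The trade-off is that the check runs on every call, so it adds a small run-time cost in exchange for keeping the function body free of validation code.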

#### Checking Complex Types

For more complex types like lists, dictionaries, or custom objects, we can use type hints combined with dynamic checks.
    # Each pair must be a tuple of exactly two integers.
    if not all(isinstance(pair, tuple) and len(pair) == 2 and all(isinstance(x, int) for x in pair) for pair in pairs):
        raise ValueError("Input should be a list of pairs (tuples with 2 integers each).")
    return [sum(pair) for pair in pairs]
```

```{code-cell} ipython3
# Correct usage
task1e = sum_of_pairs(pairs=[(1, 2), (3, 4)])
task1e()
```

```{code-cell} ipython3
# This will raise an error, because "4" is not an integer
task1f = sum_of_pairs(pairs=[(1, 2), (3, "4")])
task1f()
```
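
One detail worth spelling out: `isinstance` cannot be used with subscripted generics such as `List[Tuple[int, int]]`, so a dynamic check has to walk the structure and test each element. The following plain-Python sketch shows one way to do that. The function name `sum_of_pairs_plain` is hypothetical and the function is deliberately left undecorated so that it runs without pydra.

```python
from typing import List, Tuple


def sum_of_pairs_plain(pairs: List[Tuple[int, int]]) -> List[int]:
    """Undecorated variant used only to exercise the validation logic."""
    for pair in pairs:
        is_pair = isinstance(pair, tuple) and len(pair) == 2
        if not (is_pair and all(isinstance(item, int) for item in pair)):
            raise ValueError(f"Expected a tuple of 2 integers, got {pair!r}")
    return [sum(pair) for pair in pairs]


print(sum_of_pairs_plain([(1, 2), (3, 4)]))

try:
    sum_of_pairs_plain([(1, 2), (3, "4")])
except ValueError as err:
    print(f"ValueError: {err}")
```

Checking the element types explicitly matters here: without it, a pair like `(3, "4")` would pass the length test and only fail later inside `sum`.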

## Customizing output names

Note that "out" is the default name for the task output, but we can always customize it. There are two ways of doing so: using a *python* function annotation or using another *pydra* decorator:
And if we try to run the task, an error will be raised:

```{code-cell} ipython3
:tags: [raises-exception]

task3a()
@@ -176,62 +267,61 @@ task3a()

After running the task, we can check where the output directory with the results was created:

```{code-cell} ipython3
task3.output_dir
```

Within the directory you can find the file with the results: `_result.pklz`.

```{code-cell} ipython3
import os
```

```{code-cell} ipython3
os.listdir(task3.output_dir)
```

But we can also provide the path where we want to store the results. If a path is provided for the cache directory, *pydra* will use the cached results of a node instead of recomputing them. Let's create a temporary directory and a specific subdirectory "task4":

```{code-cell} ipython3
from tempfile import mkdtemp
from pathlib import Path
```

```{code-cell} ipython3
cache_dir_tmp = Path(mkdtemp()) / 'task4'
print(cache_dir_tmp)
```

Now we can pass this path to the `cache_dir` argument of `FunctionTask`. To observe the execution time, we define a function that sleeps for 5 s:
If you're running the cell for the first time, it should take around 5 s.

```{code-cell} ipython3
task4()
task4.result()
```

We can check the `output_dir` of our task: it should contain the path of `cache_dir_tmp`, and its last part contains the name of the task class, `FunctionTask`, and the task checksum:

```{code-cell} ipython3
task4.output_dir
```

Let's see what happens when we define an identical task again with the same `cache_dir`:
@@ -240,7 +330,7 @@ This time the result should be ready right away! *pydra* uses available results

*pydra* not only checks for results in `cache_dir`; you can also provide a list of other locations that should be checked. Let's create another directory that will be used as `cache_dir`, and use the previous working directory in `cache_locations`.

```{code-cell} ipython3
cache_dir_tmp_new = Path(mkdtemp()) / 'task4b'

task4b = add_var_wait(
@@ -251,13 +341,13 @@ task4b()

This time the results should also be returned quickly! And we can check that `task4b.output_dir` was not created:

```{code-cell} ipython3
task4b.output_dir.exists()
```

If you want to rerun the task even though the results already exist, you can set `rerun` to `True`. The task will take several seconds and a new `output_dir` will be created:

```{code-cell} ipython3
cache_dir_tmp_new = Path(mkdtemp()) / 'task4c'

task4c = add_var_wait(
@@ -270,15 +360,15 @@ task4c.output_dir.exists()

If we update the input of the task and run it again, a new directory will be created and the task will be recomputed:

```{code-cell} ipython3
task4b.inputs.a = 1
print(task4b())
print(task4b.output_dir.exists())
```

And when we check the `output_dir`, we can see that it's different from last time:

```{code-cell} ipython3
task4b.output_dir
```
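
To see why a changed input leads to a different directory, consider the following toy model in plain Python. It is only a sketch: the helpers `toy_checksum` and `toy_output_dir`, the use of SHA-256 over a JSON dump, the `FunctionTask_<checksum>` naming with an underscore, and the example input values are all assumptions made for illustration, and this is not pydra's actual hashing algorithm.

```python
import hashlib
import json


def toy_checksum(func_name, inputs):
    """Hypothetical hash of a function name plus its inputs (not pydra's algorithm)."""
    payload = json.dumps({"function": func_name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def toy_output_dir(cache_dir, func_name, inputs):
    # Imitate a "<class name>_<checksum>" directory name inside the cache directory.
    return f"{cache_dir}/FunctionTask_{toy_checksum(func_name, inputs)}"


first = toy_output_dir("/tmp/cache", "add_var_wait", {"a": 4, "b": 6})
same = toy_output_dir("/tmp/cache", "add_var_wait", {"b": 6, "a": 4})
changed = toy_output_dir("/tmp/cache", "add_var_wait", {"a": 1, "b": 6})

print(first == same)     # identical inputs map to the same directory
print(first == changed)  # a changed input maps to a different directory
```

Because the directory name is derived from the content of the task, looking up an existing result amounts to checking whether that directory is already present in one of the cache locations.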
@@ -289,23 +379,22 @@ This is because, the checksum changes when we change either input or function.
### Exercise 1

Create a task that takes a list of numbers as input and returns two fields: `mean` with the mean value and `std` with the standard deviation value.
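
As a starting point, the numerical core of the exercise can be sketched in plain Python with the standard `statistics` module. The names `Stats` and `mean_std` are hypothetical, and the choice of the population standard deviation (`pstdev`) rather than the sample one (`stdev`) is an assumption; turning the function into a pydra task with named output fields is left as the actual exercise.

```python
import statistics
from typing import List, NamedTuple


class Stats(NamedTuple):
    mean: float
    std: float


def mean_std(numbers: List[float]) -> Stats:
    """Return the mean and the population standard deviation of `numbers`."""
    return Stats(mean=statistics.fmean(numbers), std=statistics.pstdev(numbers))


stats = mean_std([2, 4, 4, 4, 5, 5, 7, 9])
print(stats.mean, stats.std)
```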