-
Notifications
You must be signed in to change notification settings - Fork 306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add List Comprehension to the SPEC #651
Comments
It does not resemble python in the same way, But I personally kind of like # Filtered scatters without reassignment!
scatter(sample in sample if sample == "blood"){
call foo
call bar
call biz
} |
I like adding the filtering syntax to scatter. You could do:
But then you end up with
would (presumably) yield As for list comprehension, I'm supportive. I'd suggest dropping the parens and just using
or
|
I think the |
@jdidion although the implication with scatter is that the computations WITHIN the scatter are executed remotely but not the scatter itself |
If this filtering functionality gets added to the scatter, is there really still a point in adding a separate list comprehension? scatter(sample in samples if sample.include){String sample_names = sample.name} and Array[String] sample_names = [sample.name for sample in samples if sample.include] would have the same effect. Don't get me wrong, I like list comprehensions, but in this case it seems like it's just adding more syntax for people to learn. While the same effect can be achieved with already existing (except for the filtering) and equally concise syntax. Regarding the distinction between |
@DavyCats good question and I would say they are quite distinct, but the utility of the list comprehension is far greater. Scatters are confined to workflows and introduce an inner scope and any variables set there will be available in the outer scope List comprehension on the other hand
What I just realized is that we already have an implicit iterator
This can be the base form that fits really well into an existing scatter scatter(foo in bar if foo > 3: foo * 2) {
...
} Then you can use this in an expression directly Array[File] files
Array[String] contents = [file in files if size(file, "mb") < 5: read_string(file)] |
@patmagee this could work. I'm not sure about
Another thing to consider is an implicit iterator variable, e.g.
|
I wonder if it's better to just bite the bullet and introduce |
I feel like without introducing a more complex concept like Array[String] = [ read_string(input) for file in files if size(file, "mb") < 5) ] I am on the fence of whether we should add a scatter form at all, or keep the List comprehension constrained to the expression. While the fact that we can use it right in a scatter looks nice # Simple form
scatter(foo in bar if foo > 3){
}
# mutate
scatter(foo * 2 for foo in bar if foo > 3){
} It inevitably opens the door to a for (foo for foo in bar){
} So it may actually be simpler to just use list comprehension inline. Although there is also the recursion issue again: scatter(foo in [b * 2 for b in bar if b > 3]){
} |
I agree with your last point - not necessary to change |
Part of the mental block with me on introducing list (or array in this case) comprehensions directly is because WDL doesn't have the ability to define new functions, nor does it have lambda functions. This means that anything more than the straightforward examples you provide is going to be pretty messy when crammed into the brackets, and expressing those sort of complex transformations succinctly is where list comprehensions shine in my view: the simpler cases, like what you show in your examples, aren't bothersome enough to extend the syntax in my view. The example you gave that stuck out to me is this one: List[String] sample_names = scatter(sample in samples if sample == "blood" ) { sample.name } As has been discussed above, An exampleTake this example, which I've noted where I think improvements could happen. version 1.2
struct Sample {
String kind
String name
}
workflow run {
Array[Sample] samples = [
Sample {
kind: "normal",
name: "sample_one",
},
Sample {
kind: "tumor",
name: "sample_two",
},
Sample {
kind: "normal",
name: "sample_three",
},
Sample {
kind: "tumor",
name: "sample_four",
},
]
scatter (sample in samples) {
# This `if` can be encapsulated in the scatter.
if (sample.kind == "normal") {
# This _could_ be pulled into the `scatter` or a list comprehension,
# though I think the benefit is minimal.
String name = sample.name
}
}
output {
# This `select_all` could be avoided by the scatter filter.
Array[String] names = select_all(name)
}
} As I hope this demonstrates:
So, my end thought at the moment is to modify |
In general I've never been super keen on more complex, non-task based operators. That said, if going this route the standard higher order function to convert As to Bs is |
A common pattern that has evolved in WDL is the need to convert a list of type A into a list of type B. Most programming languages support list comprehension to accomadate this, but WDL has no built in method for it. Coincidentally we have already implemented a similar pattern with ternary operators (
if-then-else
) but have not gone as far as list comprehensionCurrently, you can fake list comprehension using a scatter
While this is technically a form of list comprehension, it has a few drawbacks
Support List Comprehension Directly
The simplest strategy we can take is to support list comprehension directly within WDL, potentially following the Python syntax
One unintended consequence (for better or worse) of the above example is that it adds
for
as a reserved KW to the specification, without introducing afor
construct. I think this is an acceptable tradeoff, but if we wanted to keep with existing semantics we could do somethingAn additional benefit to list comprehension is the ability to easily add filtering
The text was updated successfully, but these errors were encountered: