dxCompiler possibly incorrectly localizes input file expressions #417

Open
LiterallyUniqueLogin opened this issue Jan 31, 2023 · 11 comments
Labels
tracked_internally Issue is tracked internally

Comments

@LiterallyUniqueLogin

I'm compiling the following WDL

task load_shared_covars {
  input {
    String script_dir
    File script = "~{script_dir}/traits/load_shared_covars.py"
    File python_array_utils = "~{script_dir}/traits/python_array_utils.py"
  }

  output {
    ...
  }

  command <<< 
    ~{script}
  >>> 

  runtime {
    ...
  }
}

Inside the load_shared_covars.py script I have the line import python_array_utils. However, this fails with the error ModuleNotFoundError: No module named 'python_array_utils'. I'm guessing this is because the python_array_utils input is being mislocalized.

WDL guarantees that files that originate in the same input directory are localized into the same runtime directory, so I should be able to import a script that resides in the same input directory. But I'm guessing that dxCompiler isn't properly respecting that for inputs that are assigned to expressions, as its docs suggest that it does not consider those to be inputs.

Even if dxCompiler will not consider inputs with default expression values as inputs (which is fine for my use case), can dxCompiler still guarantee that these files get localized to the location that WDL requires?

Happy to provide more details or clarification (or an example run). Also happy to hear if I've misdiagnosed what's going on.

Thanks!

@sclan
Collaborator

sclan commented Feb 1, 2023

The inputs of WDL pipelines on the DNAnexus platform are not meant to be local paths (the workflow is compiled to and run on the platform, using objects from a project on the platform).

To provide acceptable inputs, the dx URI syntax is required. See the examples in the following documentation:
https://github.com/dnanexus/dxCompiler/blob/develop/doc/ExpertOptions.md#storing-a-docker-image-as-a-file

@LiterallyUniqueLogin
Author

Hi @sclan, apologies for the confusion, that is what I'm doing. I always set script_dir to dx://UKB_Test:/imputed_strs_paper. So the full path turns out to be dx://UKB_Test:/imputed_strs_paper/traits/python_array_utils.py.

So I think my description of the problem above still stands. Can you take another look at this?

(The reason I'm not hardcoding script_dir and instead taking it as an input is that I'd like this WDL to work both locally with Cromwell and my downloaded SNP-array genotype files and also on the cloud with dxCompiler. And the base directory for those two locations is not the same).

@sclan
Collaborator

sclan commented Feb 1, 2023

Thanks for the additional information. All inputs are processed at the same time, meaning that dxCompiler would not have the value of "~{script_dir}" at the time the File inputs "script" and "python_array_utils" are parsed.

Have you run your use case through Cromwell? I ran the simplified version below with Cromwell v83, and execution failed.

version 1.0

task test_var {
   input {
     String script_dir
     File input_file = "~{script_dir}/file.py"
   }
   command <<<
     set -euxo pipefail
     ls -l ~{input_file} > result.txt
   >>>
   output {
     File o_u_t = "result.txt"
   }
} 

input.json

{
  "script_dir": "."
}

But when I spelled out the File path ("./file.py"), the result.txt file showed that file.py had been localized correctly. This behavior is not unique to dxCompiler / dxExecutor.

An alternative is to write a submission script that holds "script_dir" as a variable, so that by the time the workflow is submitted, every workflow input has been filled in with a literal value rather than an expression that still needs evaluating.

@LiterallyUniqueLogin
Author

@sclan, thank you for helping with this.

My pipeline routinely succeeds on Cromwell for me - that's where I started my development.

Your example seems to expose an issue with running a task directly. But if you wrap the task in a workflow, it runs just fine in Cromwell. So I think there is precedent for handling this case according to the WDL spec. If I've got that right, is this something that dxCompiler/dxExecutor could support?

test.wdl

version 1.0

workflow main {
    input {
        String script_dir
    }

    call test_var { input:
        script_dir = script_dir
    }

    output {
        File o_u_t = test_var.o_u_t
    }
}

task test_var {
   input {
     String script_dir
     File input_file = "~{script_dir}/file.py"
   }
   command <<<
     set -euxo pipefail
     ls -l ~{input_file} > result.txt
   >>>
   output {
     File o_u_t = "result.txt"
   }
}

input.json

{
        "main.script_dir": "." 
}
> touch file.py
> java -jar cromwell-84.jar run temp.wdl -i input.json
...
...
...
[2023-02-01 16:18:20,57] [info] SingleWorkflowRunnerActor workflow finished with status 'Succeeded'.
{
  "outputs": {
    "main.o_u_t": "/home/cromwell-executions/main/2793491b-0778-4494-965b-79ecf95077a4/call-test_var/execution/result.txt"
  },
  "id": "2793491b-0778-4494-965b-79ecf95077a4"
}
> cat /home/cromwell-executions/main/2793491b-0778-4494-965b-79ecf95077a4/call-test_var/execution/result.txt
-rw-r--r-- 2 user user 14 Feb  1 16:15 /home/cromwell-executions/main/2793491b-0778-4494-965b-79ecf95077a4/call-test_var/inputs/1546792457/file.py

@sclan
Collaborator

sclan commented Feb 2, 2023

The workflow compiled with dxCompiler from the example you provided also worked. The JSON input (input.json) used was

{
  "main.script_dir": "dx://project-xxx:"
}

The compile command:
java -jar dxCompiler-2.10.7.jar compile test.wdl -destination foobar:/ -verbose -archive -inputs input.json
The "file.py" file was uploaded to the same project's root level.
The execution command:
dx run workflow-xxx -f input.dx.json -y

The dx style input json (after the conversion during compilation) was

{
  "stage-common.script_dir": "dx://project-xxx:"
}

@LiterallyUniqueLogin
Author

@sclan Thank you for taking the time to walk through this with me. It seems you're right, dxCompiler is doing this appropriately. I ran the following test to confirm:

test2.wdl

version 1.0 

workflow main2 {
    input {
        String script_dir
    }   

    call test_var2 { input:
        script_dir = script_dir
    }   

    output {
        File o_u_t = test_var2.o_u_t
    }   
}

task test_var2 {
   input {
     String script_dir
     File input_file1 = "~{script_dir}/file.py"
     File input_file2 = "~{script_dir}/file2.py"
   }   
   command <<< 
     set -euxo pipefail
     {   
       echo "input_file1 dir"
       ls -l $(dirname ~{input_file1})
       echo "input_file2 dir"
       ls -l $(dirname ~{input_file2})
     } > result.txt
   >>> 
   output {
     File o_u_t = "result.txt"
   }   
}

compiled and ran it with the same commands (substituting main2 for main), and got the following output:

result.txt

input_file1 dir
total 0
-r-x------ 1 root root 0 Feb  2 18:36 file.py
-r-x------ 1 root root 0 Feb  2 18:36 file2.py
input_file2 dir
total 0
-r-x------ 1 root root 0 Feb  2 18:36 file.py
-r-x------ 1 root root 0 Feb  2 18:36 file2.py

Since this example was working, I went back to recompile and rerun my original workflow to reproduce the error I was getting before and see what else might be causing it. But the load_shared_covars task in my original workflow is now succeeding. I'm not sure what the issue was. The only thing I can think of is that the workflow was somehow pulling an old version of the load_shared_covars task that didn't have the python_array_utils input at all (thus causing the "can't find this file" error), but I don't know why the compilation process wouldn't have overwritten the old tasks with the new ones when I ran it with the -archive flag. Anyway, it's nice that things are now working.

Again, thanks for the help.

@LiterallyUniqueLogin
Author

Never mind, I've just run into this issue with a different task in my pipeline, so I'm reopening the issue.

I can reproduce the issue with this script. No clue why our test scripts above couldn't reproduce the issue.

test.wdl

version 1.0 

workflow a_new_main {
    input {
        String script_dir
    }   

    call a_new_task { input:
        script_dir = script_dir
    }   

    output {
        File o_u_t = a_new_task.o_u_t
    }   
}

task a_new_task {
   input {
     String script_dir
     File input_file = "~{script_dir}/file.sh"
     File text_to_read = "~{script_dir}/hello_world.txt"
   }   
   command <<< 
     set -euxo pipefail
     ~{input_file} > result.txt
   >>> 
   output {
     File o_u_t = "result.txt"
   }   
}

hello_world.txt

Hello, world!

This succeeds in Cromwell. But the same input and the same dxCompiler/dx run commands as in previous posts produce the error

cat: /home/dnanexus/inputs/input7974344978815444566/hello_world.txt: No such file or directory

I'm not sure of your background, but if you're a DNAnexus dev and have sudo permissions, you can see the analysis I ran with the ID analysis-GPK36J0Jv7BKPVfbFk81XY38.

Any idea why this is happening?

@sclan
Collaborator

sclan commented Feb 3, 2023

If "text_to_read" is not mentioned / used in the command <<<>>>, the file will not be downloaded to the worker (lazy load behavior).

If the file (hello_world.txt) is needed, you need to specify it in the command block, not through a script.

@LiterallyUniqueLogin
Author

AFAIK, that contradicts the WDL specification, which makes no reference to the command section when determining what inputs should be localized. If that understanding is correct, can we change the dxCompiler/dxExecutor functionality here to make it work as intended? How difficult would it be to make such a change?

Just thinking about it from a logical standpoint, why would a user specify a file as an input if they didn't want it localized, and thus would have no means of accessing it? When I think about lazy localization paradigms, I think about streaming and dxFuse. This to me feels instead like a design that contradicts the code author's intent.

@sclan
Collaborator

sclan commented Feb 3, 2023

Thanks for the feedback. I will create an internal feature request for adding the compilation switch to turn the file lazy loading behavior on / off.

Since this is something that the WDL spec does not specify (it specifies the WDL syntax rather than the behavior of the execution environment), and a workaround exists (spell out the file target in the command section before the actual processing takes place), the request will start out low on our priority list. If more users request the same feature, its priority will increase over time until it gets onto the developers' todo list.
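
As a rough sketch of that workaround, applied to the a_new_task example from earlier in this thread: it assumes that simply referencing the file in the command section is enough to trigger its download, as the comments above suggest. The `ls -l` line is only an illustration and is not taken from the dxCompiler docs; any command that mentions `~{text_to_read}` before the script runs should serve the same purpose.

```wdl
version 1.0

task a_new_task {
   input {
     String script_dir
     File input_file = "~{script_dir}/file.sh"
     File text_to_read = "~{script_dir}/hello_world.txt"
   }
   command <<<
     set -euxo pipefail
     # Mention text_to_read directly in the command section.
     # Assumption: per the comments above, a file that is never referenced
     # here is not downloaded to the worker (lazy loading).
     ls -l ~{text_to_read}
     # Run the script only after the file has been referenced.
     ~{input_file} > result.txt
   >>>
   output {
     File o_u_t = "result.txt"
   }
}
```

Whether the two files then end up in the same runtime directory is not shown by this sketch; the test2.wdl run above suggests they do when both are referenced in the command section.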

@sclan
Collaborator

sclan commented Feb 3, 2023

For internal tracking: PMUX-1520

@Gvaihir Gvaihir added the tracked_internally Issue is tracked internally label Apr 24, 2023