Changing from signac-flow to row: Row status keeps completion status after completion file is deleted? #83
Yes, this is intentional behavior. Row is fast by default. Fully scanning the entire workspace tree for completion files on every invocation would decrease performance by many orders of magnitude. This section of the documentation addresses the use-case you describe, specifically the "You delete product files in a directory" case: https://row.readthedocs.io/en/0.4.0/guide/concepts/cache.html#completed-directories
After you delete a completion file (or many completion files), you can clear the cache and determine the new set of completed actions with `row clean --completed` followed by `row scan`. To prevent race conditions, you can only take these steps when there are no currently submitted jobs. If your workflow involves regularly invalidating the completion status of actions, consider removing the product entirely; then you can resubmit the job whenever you need to rerun it.
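Putting those steps together (run these only while no jobs are submitted, to avoid the race conditions mentioned above):

```shell
# Rebuild row's completion cache after deleting completion/product files.
row clean --completed   # drop the cached record of completed actions
row scan                # rescan the workspace for completion files
row show status         # statuses now reflect what is actually on disk
```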
The fundamental assumption that row makes is that once an action is complete on a directory, it will never need to be executed again. If you want dynamic scheduling (i.e. based on the phase of the moon), use a different tool. Think of it this way: say you have a 3-step workflow A->B->C. You run A. After A completes, B is eligible and you run it. Now C is eligible. But then you decide to change the completion condition for A. With dynamic completion conditions, both A and C are now eligible to run on the same directory. That violates the notion of C depending on B, which depends on A. Row can't completely prevent you from doing this, but it tries to keep the workflow in a logically consistent state as much as possible.

The row way of doing what you ask is to move that logic into the action script and touch a file when the completion condition is true. In the signac data model, you should create new statepoints to perform additional work; row will automatically detect the addition of new directories the next time it scans the workspace.

Touching a file appears to be a kludge at first glance (prefer to use the existence of actual output files whenever possible), but the only alternative would be a full-fledged client/server database. That is possible on some HPC systems and not on others. In all cases, I think it is impractical to ask researchers to configure and manage the security of a database system solely for the purpose of managing workflow actions. What row is doing is using the filesystem itself as the database.
Jobs are aggregated by default in row; see https://row.readthedocs.io/en/0.4.0/guide/tutorial/group.html. You actually need to take additional steps (a group setting) to submit each directory as its own job. You mention using groups to summarize the results of many directories (e.g. averaging over replicates). I wrote a whole howto on this topic: https://row.readthedocs.io/en/0.4.0/guide/howto/summarize.html#summarize-directory-groups-with-an-action. One solution for the "summary" jobs is to give them no product files. These actions will then always be available to rerun when needed, with no need to manually delete files or clear the cache.
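As a sketch, a no-product summary action might be declared like this in `workflow.toml`. The action name and command here are hypothetical, and the exact keys are based on my reading of the row 0.4 docs rather than a tested configuration:

```toml
[[action]]
name = "summarize"                                    # hypothetical name
command = "python actions.py summarize {directories}" # one job, many dirs
# No `products` key: with nothing to mark it complete, this action stays
# available to rerun -- no manual file deletion or cache clearing needed.

[action.group]
submit_whole = true   # submit the whole group as a single job
```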
Hello!
I am working to change from signac-flow to row for a simple tutorial and example, signac_numpy_tutorial, for the university and the general public. I noticed that when generating the "signac_job_document.json" file to track other calculated or site-specific variables, as done with signac-flow, it works OK and row correctly accepts it as completed (like signac-flow). However, when I delete the file and rerun "row show status", the completion status persists, even though the action is no longer complete.
In signac-flow, everything was rechecked every time via the "labels" (likely another reason it was slower). However, this poses an issue when you are adding more state points, more replicates, etc. Typically, when another workspace folder (state point) was created, I would just delete the combined analysis files or folder in the main project space (where project.py or actions.py lives) covering all the combined workspace folders and replicates. This would force the analysis to rerun, since it would then be marked incomplete.
It looks like this info is held in ".row/completed.postcard" and is static, even if files are deleted. Is the idea to use "row scan" and "row clean --completed" and check? This works, but it may result in errors, since row will not catch these changes by default and may skip parts that are supposed to run or rerun. Maybe I am missing the concept, though. Is there a reason not to scan every time? Is it too costly?
I would be happy to share my current row starting setup as needed; please just ask.
Is there a workaround I am missing that marks a job as incomplete when it is no longer complete or its completion marker file is deleted? Maybe I am doing something wrong?
Instead of just checking whether a specific file exists, is there a way to run a function that returns True/False, like signac-flow allows (example below)?
I will also be looking for a way to aggregate functions and jobs across replicates, or to combine all the jobs into one, etc.
Any other thoughts or comments?
Versions used via micromamba:
row 0.4.0 h0716509_0 conda-forge
signac 2.2.0 pyhd8ed1ab_1 conda-forge
signac-dashboard 0.6.1 pyhd8ed1ab_1 conda-forge