This repository has been archived by the owner on Sep 23, 2023. It is now read-only.

support interaction with queues #20

Closed
antgonza opened this issue Feb 13, 2014 · 20 comments · Fixed by #21

@antgonza
Member

Currently there is no easy way to submit a ton of jobs to a queue and keep track of them. It would also be nice to add info about where each job was run (a hostname will probably suffice).

@josenavas
Member

We have started this discussion in private and the proposed solution is to add a new parameter to make-bench-suite in order to provide the submission script, as it is system-dependent.

For the hostname feature I can easily modify the timing_wrapper.py script to also write it out. Are you interested in having it written to a file, or will stdout suffice?

What do you think @wasade and @antgonza, is this the right approach?

@antgonza
Member Author

I think the output file is better. Also, at least for my use case, it would be nice to add the name of the input file within the output file; I know the name of the output file contains the name of the input, but parsing will be easier if it is also in the text inside the output file.
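
For reference, a minimal sketch of what writing that extra information could look like; the function name and the tab-separated layout are illustrative only, not the actual timing_wrapper.py implementation:

import socket

def write_job_info(out_fp, input_fp, wall_secs):
    """Append hostname, input filename and wall time to the timing output file."""
    with open(out_fp, 'a') as out_f:
        out_f.write("hostname\t%s\n" % socket.gethostname())
        out_f.write("input_file\t%s\n" % input_fp)
        out_f.write("wall_time\t%f\n" % wall_secs)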


@josenavas
Member

Ok, in that case I will create a new output folder where all this information is stored...

@wasade
Member

wasade commented Feb 13, 2014

The cluster utils stuff for the American Gut may be helpful here:

https://github.com/qiime/American-Gut/blob/master/ipynb/cluster_utils.ipy

It does require IPython to execute, but it makes it very easy to track submitted jobs, automatically pulls runtime and max memory from tracejob, etc.
-Daniel


@josenavas
Member

Thanks @wasade! I'll take a look at it and see if it fits the purpose of this repo.

@josenavas
Member

@wasade I've taken a look at the cluster utils. I noticed that it is an IPython notebook; does that mean I cannot run it from the command line?

That looks really useful, although I think it's overkill for the purpose here...

@wasade
Member

wasade commented Feb 13, 2014

It is not a notebook, but it requires the IPython execution environment at this time.


@josenavas
Member

Thanks @wasade.

I'd probably modify some of the code in order to wait for the commands to finish and remove the dependency on the IPython execution environment.

@antgonza, in order to do that (wait for the jobs to finish) I cannot rely on a user-supplied submission script. Instead, I'll add parameters to provide the queue, the job name, and any extra args that you may need to pass to the qsub command. Would that work for you?
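
A minimal sketch of how that could work, assuming a Torque/PBS-style qsub; the function names and the polling approach are hypothetical, not the final implementation:

import subprocess
import time

def submit_job(command, queue, job_name, extra_args=""):
    """Submit a single command through qsub and return its job id."""
    qsub_cmd = "echo '%s' | qsub -q %s -N %s %s" % (command, queue, job_name, extra_args)
    out = subprocess.check_output(qsub_cmd, shell=True).decode()
    # qsub prints something like 12345.headnode; keep only the numeric part
    return out.strip().split('.')[0]

def wait_for_jobs(job_ids, poll_secs=30):
    """Block until none of the given job ids show up in qstat anymore."""
    while True:
        qstat_out = subprocess.check_output(["qstat"]).decode()
        if not any(jid in qstat_out for jid in job_ids):
            return
        time.sleep(poll_secs)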

@antgonza
Member Author

I guess the only thing I want is to be able to submit all my commands via a single command, similar to cluster_jobs.py in compy.

@josenavas
Member

Ok, I'll try to get something in later today.

@wasade
Member

wasade commented Feb 13, 2014

Those utils support submission and optional blocking, just FYI


@josenavas
Member

Yes, I will provide support for blocking, since I'll need it to automatically create the plots once all jobs are done. However, I'll rely on the bash script to submit jobs.

Thanks!

@wasade
Member

wasade commented Feb 13, 2014

Back at a computer. The reason I bring up these utils is that all of that support is already written:

> %run cluster_utils.ipy # or import it, depends on ipython
> job = submit_qsub("sleep 5; hostname")
> res = wait_on(job) # will block until job is complete
> run_info = job_run_details(*res) # runtime, mem used, stdout, stderr, etc

This supports child submissions as well (i.e., how QIIME's parallel scripts submit from nodes)

@wasade
Member

wasade commented Feb 13, 2014

A step further:

> %run cluster_utils.ipy # or import it, depends on ipython
> job1 = submit_qsub("sleep 5; hostname")
> job2 = submit_qsub("sleep 5; hostname")
> job3 = submit_qsub("sleep 5; hostname")
> res = wait_on([job1, job2, job3]) # will block until all jobs are complete
> run_details = [job_run_details(*job) for job in res]

@josenavas
Member

Agreed, but I don't really want to rely on IPython's execution environment. I'd still like to stay with a single bash script that can execute everything.

Also, this code looks really useful: it would probably be better to have it as a standalone package, so other projects that want to use it do not have to require the American-Gut project...

@wasade
Member

wasade commented Feb 13, 2014

Ah, okay.

Agreed, there could easily be benefit outside of AG. Could you create an issue, please? It is a little more involved, though, as it would be good (for a general solution) to get away from the need for IPython.
-Daniel


@josenavas
Member

Sure,

I think you're only using the IPython execution environment for calling bash commands, right? I think this can easily be replaced with system calls. What do you think?

Do you want the new issue in the AG repo?

@wasade
Member

wasade commented Feb 13, 2014

Yes, please

Using it for things like:

import os

def parse_qstat():
    """Process qstat output"""
    user = os.environ['USER']
    # IPython system call; returns an SList with grep/fields helpers
    lines = !qstat -u $user

    jobs = {}
    # pull the job id, name and state columns for this user's jobs
    for id_, name, state in lines.grep(user).fields(0, 3, 9):
        job_id = id_.split('.')[0]
        jobs[job_id] = {'name': name, 'state': state}

    return jobs

The stdout parsing that IPython provides for system calls is very quick and easy to use. It wouldn't be terribly painful to parse the output in a custom format, or better yet, to parse the XML output you can get from qstat, but this was the easy thing to do at the time and I wanted an excuse to get more familiar with it.
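
For comparison, a sketch of the subprocess-based equivalent @josenavas suggests, which drops the IPython requirement; it assumes the same column layout in the qstat -u output as the version above:

import os
import subprocess

def parse_qstat():
    """Process qstat output without IPython's ! syntax."""
    user = os.environ['USER']
    out = subprocess.check_output(["qstat", "-u", user]).decode()

    jobs = {}
    for line in out.splitlines():
        if user not in line:
            continue
        fields = line.split()
        # same columns as above: job id, job name, job state
        job_id = fields[0].split('.')[0]
        jobs[job_id] = {'name': fields[3], 'state': fields[9]}

    return jobs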

@josenavas
Member

Ok, makes sense.

Issue created here

@josenavas
Member

Since I'm going to refactor some of that code, I'll create a new repo under my account and make the changes there. I don't mind refactoring the entire codebase. We can do some review there and then transfer ownership to the BioCore organization.

Does that sound like a good plan?
