
Quick DIRAC Tutorial


Preliminary remarks

This is a brief introduction to the DIRAC system based on examples of command usage. For more detailed tutorials, please visit the web pages of the project:

https://github.com/DIRACGrid/DIRAC/wiki

1. Getting started

Set the DIRAC client environment

All the DIRAC commands are available once the environment is set up. If you have a configured client installation, then do:

$ source /cvmfs/dirac.egi.eu/dirac/bashrc_egi

Or, if you have a local client installation, for example:

$ source /opt/dirac/bashrc

This is usually done in the login script of a user account. Instructions on how to install the DIRAC software locally can be found in the DIRAC documentation.

Getting help

All the DIRAC commands have a -h argument to provide help information, for example:

$ dirac-info -h

Report info about local DIRAC installation
Usage:
  dirac-info [option|cfgfile] ... Site

General options:
  -o  --option <value>         : Option=value to add
  -s  --section <value>        : Set base section for relative parsed options
  -c  --cert <value>           : Use server certificate to connect to Core Services
  -d  --debug                  : Set debug mode (-ddd is extra debug)
  -   --autoreload             : Automatically restart if there's any change in the module
  -   --license                : Show DIRAC's LICENSE
  -h  --help                   : Shows this help

In case of problems, executing commands with the -d or even -ddd flag will provide additional debug output.
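
For example, rerunning the previous command with maximal debug output:

$ dirac-info -ddd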

Setting up user certificate

Use the DIRAC certificate conversion tool to convert your certificate from p12 format into PEM format and store it in the $HOME/.globus directory:

$ dirac-cert-convert usercert.p12
Converting p12 key to pem format
Enter Import Password:
MAC verified OK
Enter PEM pass phrase:
Verifying - Enter PEM pass phrase:
Converting p12 certificate to pem format
Back up /home/dirac/.globus/usercert.pem file
Enter Import Password:
MAC verified OK
Information about your certificate:
subject= /C=FR/O=DIRAC/OU=DIRAC Consortium/CN=DIRAC Tutorial User 06/[email protected]
issuer= /O=DIRAC Consortium/CN=DIRAC EOSC-Hub Certification Authority
Done

First proxy

Generate a certificate proxy as a member of your DIRAC group. Add the -M switch to include the VOMS extension:

$ dirac-proxy-init -g biomed_user -M

Get information

Get information about the client and the services it will work with:

$ dirac-info

Get information about your user credentials:

$ dirac-proxy-info

Client configuration

In this tutorial we will use the COMDIRAC set of commands, which needs some additional configuration. The DIRAC client configuration is stored in the $HOME/.dirac/dcommands.conf file. The configuration can be viewed and edited with the dconfig command:

$ dconfig
[global]
default_profile = dirac_tutorial

[dirac_tutorial]
group_name = dirac_tutorial
home_dir = /vo.france-grilles.fr/user/u/user01
default_se = DIRAC-USER

To change an option:

$ dconfig dirac_tutorial.home_dir=/vo.france-grilles.fr/user/u/user02
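
A single option can also be read back by giving its name without a value (a quick sketch, assuming dconfig supports this form):

$ dconfig dirac_tutorial.home_dir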

Starting DIRAC session

The DIRAC client session is started with dinit command:

$ dinit biomed_user
Generating proxy...
Enter Certificate password:
Uploading proxy for dirac_tutorial...
Proxy generated:
subject      : /C=FR/O=DIRAC/OU=DIRAC Consortium/CN=DIRAC Tutorial User 30/[email protected]/CN=6155783731
issuer       : /C=FR/O=DIRAC/OU=DIRAC Consortium/CN=DIRAC Tutorial User 30/[email protected]
identity     : /C=FR/O=DIRAC/OU=DIRAC Consortium/CN=DIRAC Tutorial User 30/[email protected]
timeleft     : 23:59:57
DIRAC group  : biomed_user
rfc          : True
path         : /Users/atsareg/work/test/DiracTest/localproxy
username     : user30
properties   : NormalUser, ProductionManagement

2. First Job

To submit your first job, you can use the dsub command:

$ dsub /bin/echo "Hello World"

The job status can now be examined with the dstat command:

$ dstat
JobID    Owner   JobName  OwnerGroup     JobGroup  Site  Status   MinorStatus             SubmissionTime
===============================================================================================================
9022022  user40  echo.sh  biomed_user  NoGroup   ANY   Waiting  Pilot Agent Submission  2015-11-08 10:41:34    

This shows all the jobs that have not yet reached their final state. To see the jobs in the Done state:

$ dstat -S Done
JobID    Owner   JobName     OwnerGroup     JobGroup  Site                  Status  MinorStatus         SubmissionTime
=============================================================================================================================  
9022022  user40  echo.sh     training_user  NoGroup   CLOUD.bifi-unizar.es  Done    Execution Complete  2015-11-08 10:41:34

Now it is time to get the output with the doutput command:

$ doutput 9022022
$ ls -l 9022022
total 4 
-rw-r--r-- 1 user40 dirac 12 Nov  8 11:41 std.out  
$ cat 9022022/std.out
Hello World

Web Portal Job Monitor

You can follow the job status in the DIRAC Web Portal Job Monitor application. To use it, you first have to load your certificate in p12 form (usercert.p12) into your web browser. Then go to the portal page:

https://dirac.egi.eu/DIRAC

3. Understanding job description

JDL description

The job description is written in the JDL language, for example:

$ cat echo.jdl
[
JobName = "Test_Hello";
Executable = "echo.sh";
Arguments = "Hello world !";
StdOutput = "std.out";
StdError = "std.err";
InputSandbox = {"echo.sh"};
OutputSandbox = {"std.out","std.err"};
]

Edit the Arguments field and submit a job using its JDL description:

$ dsub -J echo.jdl
9022409

Exercise

  • Modify the JDL parameters in the "echo.jdl" file in your local directory
  • Submit jobs and verify that your modifications are taken into account
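
For example, a modified description might change the job name and the printed message (the new values below are just an illustration):

[
JobName = "Test_Hello_2";
Executable = "echo.sh";
Arguments = "Hello again from the DIRAC tutorial !";
StdOutput = "std.out";
StdError = "std.err";
InputSandbox = {"echo.sh"};
OutputSandbox = {"std.out","std.err"};
]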

4. Getting data on the Grid

DIRAC File System

DIRAC presents the grid/cloud storages as a single file system accessible with Unix-like commands:

dpwd, dls, dcd, dmkdir, dchmod, dchgrp, dchown, dfind, drm

Some examples:

user40@stoor16:~$ dpwd
/training.egi.eu/user/u/user40
user40@stoor16:~$ dls -l
/training.egi.eu/user/u/user40:
-rwxrwxr-x 2 user40 training_user 12 2015-11-07 23:33:43 std.out
-rwxrwxr-x 2 user40 training_user 30 2015-11-07 23:19:06 test.sh
user40@stoor16:~$ dmkdir newdir
user40@stoor16:~$ dls -l
/training.egi.eu/user/u/user40:
drwxrwxr-x 0 user40 training_user  0 2015-11-08 12:14:05 newdir
-rwxrwxr-x 2 user40 training_user 12 2015-11-07 23:33:43 std.out
-rwxrwxr-x 2 user40 training_user 30 2015-11-07 23:19:06 test.sh
user40@stoor16:~$ dchmod 755 test.sh
user40@stoor16:~$ dls -l
/training.egi.eu/user/u/user40:
drwxrwxr-x 0 user40 training_user  0 2015-11-08 12:14:05 newdir
-rwxrwxr-x 2 user40 training_user 12 2015-11-07 23:33:43 std.out
-rwxr-xr-x 2 user40 training_user 30 2015-11-08 12:14:36 test.sh
user40@stoor16:~$ dcd newdir
user40@stoor16:~$ dpwd
/training.egi.eu/user/u/user40/newdir

Uploading data to the Grid

Upload a local file to the Grid with the dput command:

$ dput test.sh test.sh
$ dls -l
/training.egi.eu/user/u/user40/newdir:
-rwxrwxr-x 1 user40 training_user 22 2015-11-08 12:21:09 test.sh

To see the physical replicas of the file, use dreplicas:

$ dreplicas test.sh
/training.egi.eu/user/u/user40/newdir/test.sh:
   CYFRONET-USER dips://dirac-dms.egi.eu:9148/DataManagement/StorageElement/training.egi.eu/user/u/user40/newdir/test.sh

If not specified explicitly, the default Storage Element is used. To choose another Storage Element, do:

$ dput echo.sh echo.sh -D DIRAC-USER
$ dreplicas echo.sh
/training.egi.eu/user/u/user40/newdir/echo.sh:
   TRAINING-USER dips://dirac-dms.egi.eu:9149/DataManagement/TrainingStorageElement/training.egi.eu/user/u/user40/newdir/echo.sh

Downloading data from the Grid

Download a remote file locally with the dget command:

$ dget echo.sh
$ ls -l
total 4
-rw-r--r-- 1 user40 dirac 22 Nov  8 13:32 echo.sh

Replicating data

Make another physical copy of the data with the drepl command:

$ dreplicas test.sh
/training.egi.eu/user/u/user40/newdir/test.sh:
 CYFRONET-USER dips://dirac-dms.egi.eu:9148/DataManagement/StorageElement/training.egi.eu/user/u/user40/newdir/test.sh
$ drepl test.sh -D TRAINING-USER
$ dreplicas test.sh
/training.egi.eu/user/u/user40/newdir/test.sh:
 CYFRONET-USER dips://dirac-dms.egi.eu:9148/DataManagement/StorageElement/training.egi.eu/user/u/user40/newdir/test.sh
 TRAINING-USER dips://dirac-dms.egi.eu:9149/DataManagement/TrainingStorageElement/training.egi.eu/user/u/user40/newdir/test.sh

Removing data

Remove files with the drm command. For example, the following will remove all the replicas of a file from the Grid:

$ drm /training.egi.eu/user/u/user40/newdir/echo.sh

To remove just one copy from a given Storage Element, do the following:

$ drm /training.egi.eu/user/u/user40/newdir/test.sh -D TRAINING-USER

Exercise

  • Repeat examples described above
  • Create your own local file and upload it to the Grid
  • Replicate it to another Storage Element
  • Download the file from the grid
  • Remove one replica of the file
  • Remove the file completely from the Grid.
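
A possible sequence for this exercise, using an illustrative file name (mydata.txt) and the Storage Elements from the examples above; as in those examples, relative names are resolved against the current catalog directory:

$ echo "my test data" > mydata.txt
$ dput mydata.txt mydata.txt
$ drepl mydata.txt -D TRAINING-USER
$ dget mydata.txt
$ drm mydata.txt -D TRAINING-USER
$ drm mydata.txt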

5. Jobs with output data

The following examples use the mandelbrot application, which creates Mandelbrot images.

Specifying output data

Jobs produce data that should be made available after they finish. To instruct a job to upload its output data, use the OutputData JDL parameter. You can also specify the optional OutputSE and OutputPath parameters. For example:

[
JobName = "Mandel_tutorial";
Executable = "mandelbrot";
Arguments = "-X -0.464898 -Y -.564798 -W 600 -H 400 -M 1500 -P 0.0000092 mandel.bmp";
StdOutput = "std.out";
StdError = "std.err";
OutputSandbox = {"std.out","std.err","mandel.bmp"};
InputSandbox = {"mandelbrot"};
CPUTime = 1000;
OutputSE = "TRAINING-USER";
OutputPath = "/special_path";
OutputData = { "mandel.bmp" };
]

By default, the output data are stored in the user's home directory in the DIRAC File Catalog, with a subdirectory created per job, for example:

/training.egi.eu/user/u/user40/9022/9022770/mandel.bmp

OutputPath replaces the job subdirectory. With the example above, the output file will be:

/training.egi.eu/user/u/user40/special_path/mandel.bmp

Finally, the OutputData field can be specified with a full Logical File Name (LFN). This will become an exact path in the catalog, for example:

[
...
OutputData = { "LFN:/training.egi.eu/user/u/user40/tutorial/mandel.bmp" };
]

Note that you must have write access to the specified location.

Downloading output data

The doutput command can also be used to download the output data produced by jobs, for example:

$ doutput --Data 9022409

This command will download the output data of the specified job instead of its sandbox. See the command's help information for more options.

Exercise

  • Submit several mandelbrot jobs with the output data going to the desired location
  • Download the output data to your local disk

Hint: the mandelbrot program is available in the grid storage at this path:

/training.egi.eu/user/u/user40/mandelbrot

Use "mandelbrot.jdl" example job description from your home directory

6. Jobs with input data

Jobs can process input data, which DIRAC downloads to the local disk of the running job. The input data are specified with the InputData JDL parameter. Provide the LFNs of one or more files, for example:

[
JobName = "Input_Test";
Executable = "/bin/cat";
Arguments = "echo.sh";
StdOutput = "std.out";
StdError = "std.err";
InputData = {"/training.egi.eu/user/u/user40/newdir/echo.sh"};
OutputSandbox = {"std.out","std.err"};
]

You can provide input data as part of the InputSandbox rather than InputData. This can be useful if you want to bypass the mechanism that schedules your jobs only to the sites declared close to the storage elements where your data reside. For example:

[
...
InputSandbox = {"LFN:/training.egi.eu/user/u/user40/newdir/echo.sh"};
...
]

Exercise

  • Submit jobs with input data and make sure that the data is processed by the job
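
A quick way to verify, sketched with a hypothetical JDL file name (input_test.jdl) and an illustrative job ID:

$ dsub -J input_test.jdl
9022500
$ doutput 9022500
$ cat 9022500/std.out

If the input data were processed, std.out contains the content of the downloaded echo.sh file.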

7. Multiple job submission

Specifying parametric jobs

You can submit multiple jobs in one go by describing a sequence of parameters in the job JDL. For example:

[
JobName = "Test_param_%n";
JobGroup = "Param_Test_1";
Executable = "echo.sh";
Arguments = "This job is for %s, the job ID is %j";
Parameters = {"Andrei","Alexandre","Victor","Pierre"};
StdOutput = "std.out";
StdError = "std.err";
InputSandbox = {"echo.sh"};
OutputSandbox = {"std.out","std.err"};
]

This description will generate 4 jobs, one for each value of the Parameters field. The placeholders will be substituted as follows:

  • %n -> parameter consecutive number
  • %s -> parameter value
  • %j -> job ID
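
For the description above, four jobs named Test_param_0 through Test_param_3 are generated. The first one runs echo.sh with an argument string like the following (the job ID is illustrative):

This job is for Andrei, the job ID is 9023583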

Numerical parameters can also be specified with a formula:

P(0) = ParameterStart
P(i) = P(i-1)*ParameterFactor + ParameterStep

In the JDL:

[
...
Parameters = 10;
ParameterStart = 0;
ParameterStep = 0.02;
ParameterFactor = 1;
...
]
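
With the values above, the formula yields an arithmetic progression of ten values: 0, 0.02, 0.04, ..., 0.18. One job is created per value, and each value is substituted for the %s placeholder in that job.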

Job groups

All the jobs in the same bulk submission operation will belong to the same Job Group specified in the JDL description. This parameter can be used in various commands. To monitor job progress by group:

$ dstat -g Param_Test_1 -a
JobID    Owner   JobName       OwnerGroup     JobGroup      Site                  Status     MinorStatus                        SubmissionTime
=====================================================================================================================================================
9023573  user40  Test_param_0  training_user  Param_Test_1  ANY                   Waiting    Pilot Agent Submission             2015-11-08 19:09:57
9023574  user40  Test_param_1  training_user  Param_Test_1  ANY                   Waiting    Pilot Agent Submission             2015-11-08 19:09:57
9023575  user40  Test_param_2  training_user  Param_Test_1  CLOUD.cesnet.cz       Done       Execution Complete                 2015-11-08 19:09:57
9023576  user40  Test_param_3  training_user  Param_Test_1  CLOUD.bifi-unizar.es  Done       Execution Complete                 2015-11-08 19:09:57
9023577  user40  Test_param_4  training_user  Param_Test_1  CLOUD.ceta-ciemat.es  Completed  Application Finished Successfully  2015-11-08 19:09:57
9023578  user40  Test_param_5  training_user  Param_Test_1  ANY                   Waiting    Pilot Agent Submission             2015-11-08 19:09:57
9023579  user40  Test_param_6  training_user  Param_Test_1  CLOUD.ukim.mk         Running    Job Initialization                 2015-11-08 19:09:57
9023580  user40  Test_param_7  training_user  Param_Test_1  CLOUD.cesnet.cz       Done       Execution Complete                 2015-11-08 19:09:57
9023581  user40  Test_param_8  training_user  Param_Test_1  ANY                   Waiting    Pilot Agent Submission             2015-11-08 19:09:57
9023582  user40  Test_param_9  training_user  Param_Test_1  CLOUD.ceta-ciemat.es  Done       Execution Complete                 2015-11-08 19:09:57

Get the outputs of all the jobs in a group:

$ doutput -g Param_Test_1

Jobs with several parameter sequences

If you need more than one parameter in a sequence of jobs, you can specify them as several named parameter sequences as in the following example:

[
JobName = "Mandel_%n";
Executable = "mandelbrot";
Arguments = "-X %(X)s -Y %(Y)s -W 600 -H 400 -M 1500 -P 0.0001557 out_%n.bmp";
StdOutput = "std.out";
StdError = "std.err";
Parameters = 10;
ParameterStart.X = -0.464898;
ParameterStep.X = 0.01;
ParameterStart.Y = -0.564798;
ParameterStep.Y = 0.01;
OutputSandbox = {"std.out","std.err","out_%n.bmp"};
InputSandbox = {"mandelbrot"};
]
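
For the first job (%n = 0), the placeholders take the start values of both sequences, so the resulting command line is:

mandelbrot -X -0.464898 -Y -0.564798 -W 600 -H 400 -M 1500 -P 0.0001557 out_0.bmp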

Exercise

  • Modify the parametric.jdl example from your home directory
  • Submit parametric jobs
  • Get the job status by their group
  • Get the output of the jobs by their group

8. Using Python API to submit jobs

All the DIRAC functionality is available via the Python API. You can create your own commands or sophisticated scripts according to your needs. To do that, write a Python script using the Dirac and Job objects, as demonstrated by the example below, which corresponds to a simple echo.sh job. Several API options are described in the comments:

#!/usr/bin/env python
# Magic lines necessary to activate the DIRAC Configuration System
# to discover all the required services
from DIRAC.Core.Base import Script
Script.parseCommandLine( ignoreErrors = True )


from DIRAC.Interfaces.API.Job import Job
from DIRAC.Interfaces.API.Dirac import Dirac

j = Job()
dirac = Dirac()

j.setName('MyFirstJob')

# Files to be sent as part of the InputSandbox
#j.setInputSandbox(['echo.sh'])
# Files to be returned as OutputSandbox
#j.setOutputSandbox(['echo.sh'])

# Specify Input and Output Data
#j.setInputData(['/my/logical/file/name1', '/my/logical/file/name2'])
#j.setOutputData(['output1.data','output2.data'], outputPath='MyFirstAnalysis')

# The job will belong to this job group
j.setJobGroup('MyJobs')

# Specify CPU requirements
#j.setCPUTime(21600)
# Specify the destination site
#j.setDestination('LCG.IN2P3.fr')
# Specify sites to which the job should not go
#j.setBannedSites(['LCG.CNAF.it','LCG.CNAF-t2.it'])

# Specify the log level of the job execution: INFO (default), DEBUG, VERBOSE
j.setLogLevel('DEBUG')

# Executable and arguments can be given in one call
j.setExecutable('echo.sh', arguments = 'Hello world !')

# Specify environment variables needed for the job
#j.setExecutionEnv({'MYVARIABLE':'TOTO'})
# Some example
#j.setExecutable('/bin/echo $MYVARIABLE')

# You can visualize the resulting JDL
#jdl = j._toJDL()
#print(jdl)

result = dirac.submitJob(j)
if not result['OK']:
  print("ERROR:", result['Message'])
else:
  print(result['Value'])

Save the above file as jobapi.py and execute it like:

$ python jobapi.py
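
On success, the script prints the ID of the submitted job, which you can then monitor with dstat as shown earlier.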

Using the API to submit several jobs at once can be done as in the following commented example:

#!/usr/bin/env python

# Magic lines necessary to activate the DIRAC Configuration System
# to discover all the required services
from DIRAC.Core.Base import Script
Script.parseCommandLine( ignoreErrors = True )

from DIRAC.Interfaces.API.Job import Job
from DIRAC.Interfaces.API.Dirac import Dirac

job = Job()
dirac = Dirac()

job.setName('MandelbrotJob')
job.setJobGroup('MandelJobs')

job.setExecutable('mandelbrot',
                  arguments = "-X %(X)s -Y %(Y)s -W 600 -H 400 -M 1500 -P 0.0001557 out_job_%n.bmp")

# Parameter sequences must be of the same length
xList = [-0.464898,-0.454898,-0.444898,-0.434898,-0.424898]
yList = [-.564798, -.554798, -.544798, -.534798, -.524798]

job.setParameterSequence( 'X', xList )
job.setParameterSequence( 'Y', yList )
job.setOutputSandbox( ['out_job_%n.bmp'] )

result = dirac.submitJob( job )
if not result['OK']:
  print("ERROR:", result['Message'])
else:
  print(result['Value'])
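
Save and run this script in the same way as the previous one; it submits five jobs, one per (X, Y) pair, which can be monitored as a group with dstat -g MandelJobs.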

Exercise

Try out the above examples, modifying them as your imagination suggests.

9. Advanced production jobs with the Transformation System

The Transformation System (TS) is used to automate common tasks related to production activities. It allows the automatic creation of large numbers of jobs and the automatic execution of multiple data operations. It can be used to execute a workflow composed of several steps in a fully data-driven manner. For a more detailed tutorial about the Transformation System, refer to:

https://github.com/DIRACGrid/DIRAC/wiki/Transformation-System-Tutorial

In this section we show how to use the Python API to create the simplest example of a transformation (i.e. with no input data) and carry out the first step of the mandelbrot workflow described in the TS tutorial.

9.1. Transformation description

The transformation (image slice production) creates several jobs, each producing an image slice of 200 lines. To produce the whole image (4200 lines), 21 jobs are needed. These jobs execute the mandelbrot application with identical parameters except for the line number parameter -L, which varies from 1 to 21:

./mandelbrot.py -P 0.0005 -M 1000 -L 00i -N 200

where:

  • P is the "precision"
  • M is the number of iterations
  • L is the first line of the image
  • N is the number of lines to compute in the job

Each job produces a data_00i*200.txt ASCII file, which is registered in the File Catalog.

9.2 Transformation creation and monitoring

  • Before creating the actual transformation, edit the submit_wms.py script and submit a simple mandelbrot job. Then inspect the result:

    python submit_wms.py 1
    

This job is similar to those that will be created by the transformation for "image slice production".

  • Edit the submit_ts_step1.py script, which creates and submits a transformation, and observe the different sections (Job description, Transformation definition and submission):

""" Transformation launching Mandelbrot jobs
"""
import json
import os
from DIRAC.Core.Base import Script
Script.parseCommandLine()
import DIRAC
from DIRAC.Interfaces.API.Job import Job
from DIRAC.Core.Workflow.Parameter import Parameter
from DIRAC.TransformationSystem.Client.Transformation import Transformation

def submitTS():
  ########################################
  # Modify here with your dirac username
  owner = 'atsareg'
  ########################################
  ########################################
  # Job description
  ########################################
  job = Job()
  job.setName('mandelbrot raw')
  job.setOutputSandbox( ['*log'] )
  # this is so that the JOB_ID within the transformation can be evaluated on the fly in the job application, see below
  job.workflow.addParameter( Parameter( "JOB_ID", "000000", "string", "", "", True, False, "Initialize JOB_ID" ) )

  ## define the job workflow in 3 steps
  # job step1: setup software
  job.setExecutable('/bin/env')
  # job step2: run mandelbrot application
  # note how the JOB_ID (within the transformation) is passed as an argument and will be evaluated on the fly
  job.setExecutable('./mandelbrot',arguments="-P 0.0005 -M 1000 -S @{JOB_ID}")

  outputPath = os.path.join('LFN:/biomed/user',owner[0],owner,'mandelbrot/images/raw/@{JOB_ID}.bmp')
  outputSE = 'DIRAC-USER'
  outputMetadata = json.dumps( {"application":"mandelbrot","image_format":"ascii", "image_width":7680, "image_height":200, "owner":owner} )

  # job step3: upload data and set metadata
  job.setExecutable( 'dirac-dms-add-file', arguments = "%s out.bmp %s" % (outputPath, outputSE) )

  ########################################
  # Transformation definition
  ########################################
  t = Transformation()
  t.setTransformationName( owner+'_step2' )
  t.setType( "MCSimulation" )
  t.setDescription( "Mandelbrot images production" )
  t.setLongDescription( "Mandelbrot images production" )
  # set the job workflow to the transformation
  t.setBody ( job.workflow.toXML() )

  ########################################
  # Transformation submission
  ########################################
  res = t.addTransformation()
  if not res['OK']:
    print(res['Message'])
    DIRAC.exit( -1 )
  t.setStatus( "Active" )
  t.setAgentType( "Automatic" )
  return res
#########################################################
if __name__ == '__main__':
  try:
    res = submitTS()
    if not res['OK']:
      DIRAC.gLogger.error ( res['Message'] )
      DIRAC.exit( -1 )
  except Exception:
    DIRAC.gLogger.exception()
    DIRAC.exit( -1 )

  • Submit the transformation:

    python submit_ts_step1.py

  • Go to the TransformationMonitor on the web portal: https://cctbdirac01.in2p3.fr/DIRAC/. You should see your transformation (and also those of the other participants). The transformation is created, but there are no associated jobs yet. Click on the transformation and go to Action/Extend in the context menu. Here you can choose how many jobs your transformation will be composed of, so extend the transformation by 21. Observe the status changes in the different columns of your transformation (refresh by clicking on the Submit button). When tasks are in the Submitted status, you can also click on Show Jobs to display the individual jobs. Note that since the jobs are submitted with the Production Shifter identity, you should remove the 'Owner' selection in the JobMonitor to display them.

9.3 Transformation Monitoring

  • Monitor the progress of the transformation from the TransformationMonitor (refresh by clicking the Submit button). You may need to increase the number of transformations shown per page (25 by default) and/or reorder the table by ID, so that newer transformations with higher IDs are shown at the top.
  • Browse the File Catalog to look at your produced files (using COMDIRAC or the File Catalog client directly):

    dls mandelbrot/images/
  • Observe the metadata associated with your produced files:

      dmeta ls mandelbrot/images/raw
      !image_width : 7680
      !image_height : 200
      !application : mandelbrot
      !owner : user02
      !image_format : ascii