Apply check util functions in cron jobs #99

Open
mrceyhun opened this issue Aug 2, 2022 · 1 comment

mrceyhun commented Aug 2, 2022

Description

Since our check_utils.sh functions are in place, we can start applying them in the cron jobs to get a reliable success status for each cron.

Candidate cron jobs for first pitch

We may start with the cron jobs running on vocms092. Here are their static cron definitions, which do not exist in any repository:

0 */4 * * * /data/cms/CMSSpark/bin/cron4aggregation
0 */3 * * * /data/cms/CMSSpark/bin/cron4dbs_condor
0 20 * * * /data/cms/CMSSpark/bin/cron4dbs_condor_df /data/cms/pop-data
0 18 * * * /data/cms/CMSSpark/bin/cron4dbs_events /data/cms/pop-data
0 15 */5 * * /data/cms/CMSEOS/CMSSpark/bin/backfill_dbs_condor.sh 1>/data/cms/CMSEOS/CMSSpark/log/backfill.log 2>&1
1 1 * * * /data/cms/cmsmonitoringbackup/run.sh 2>&1 1>& /data/cms/cmsmonitoringbackup/log
07 08 * * * /data/cms/CMSSpark/bin/cron4rucio_daily.sh /cms/rucio_daily

How

Each cron job has its own definition, output directory and output format. Most of the candidate cron jobs listed above write their output to HDFS, so we can use the check_hdfs function to check their success.

Let's give an example:
CMSSpark/bin/cron4rucio_daily.sh writes its output to the HDFS directory /cms/rucio_daily/rucio/2022/08/01, so the output format is /cms/rucio_daily/rucio/YYYY/MM/DD, which is defined in its Python code. The script only knows "$HDFS_OUTPUT_DIR", given as /cms/rucio_daily, so we need to build /cms/rucio_daily/rucio/YYYY/MM/DD from that variable.
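
A minimal sketch of building that path in bash. It assumes the date is available as YYYY-MM-DD (the actual format of "$CURRENT_DATE" is not shown in this issue), and dated_output_dir is a hypothetical helper name, not something that exists in check_utils.sh:

```shell
#!/bin/bash
# Hypothetical helper (not part of check_utils.sh): build the dated HDFS output path.
# $1: base HDFS output directory, e.g. /cms/rucio_daily
# $2: date as YYYY-MM-DD (assumed format, adjust to what the cron job really uses)
dated_output_dir() {
    local base_dir="${1%/}"   # drop a trailing slash, if any
    local day="$2"
    # Replace every "-" with "/" so 2022-08-01 becomes 2022/08/01
    echo "${base_dir}/rucio/${day//-//}"
}

dated_output_dir /cms/rucio_daily 2022-08-01
```

The result could then be passed as the first argument of check_hdfs instead of the literal YYYY/MM/DD placeholder.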

Example check code for CMSSpark/bin/cron4rucio_daily.sh:

......
/bin/bash "$SCRIPT_DIR/run_rucio_daily.sh" --verbose --output_folder "$HDFS_OUTPUT_DIR" --fdate "$CURRENT_DATE"

# [It can be helpful to add a comment line that separates the check commands from the actual cron job, for example:]
# ----- CRON SUCCESS CHECK -----
. ./utils/check_utils.sh
# This cron job runs once a day, so the threshold should be at most 12 hours: 43200 seconds
# Check the current output sizes with: hadoop fs -du -h /cms/rucio_daily/rucio/2022/08
# The average directory size is about 80 MB, so we can require 50 MB, i.e. 50000000 bytes

check_hdfs "$HDFS_OUTPUT_DIR"/rucio/YYYY/MM/DD 43200 50000000
# !!ATTENTION!! no command should be run after this point
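
The two numbers passed to the check can be derived rather than hard-coded. The sketch below is only an illustration: output_looks_fresh is a hypothetical stand-in, and its logic (output newer than the threshold and at least the minimum size) is an assumption about what a check like check_hdfs decides, not its actual implementation:

```shell
#!/bin/bash
# Hypothetical stand-in for the decision made by a check such as check_hdfs.
# $1: age of the output in seconds   $2: maximum allowed age in seconds
# $3: size of the output in bytes    $4: minimum expected size in bytes
output_looks_fresh() {
    local age_s="$1" max_age_s="$2" size_b="$3" min_size_b="$4"
    [ "$age_s" -le "$max_age_s" ] && [ "$size_b" -ge "$min_size_b" ]
}

max_age_s=$((12 * 60 * 60))        # 12 hours in seconds: 43200
min_size_b=$((50 * 1000 * 1000))   # 50 MB in bytes: 50000000

# Example: output written 1 hour ago with a size of 80 MB
if output_looks_fresh 3600 "$max_age_s" 80000000 "$min_size_b"; then
    echo "ok"
else
    echo "fail"
fi
```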

After the check function, we should not run any other command, so that the exit code of the check function is not overwritten.
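
A small sketch of why the order matters. A bash function returns the status of its last command, just like a script without an explicit exit, so the two functions below model the tail of a cron script. fake_check is a stand-in that always fails, used here only for illustration:

```shell
#!/bin/bash
# Stand-in for check_hdfs, for illustration only: always reports failure.
fake_check() { return 1; }

# Models a cron script whose last command is the check.
job_with_check_last() {
    echo "job output" >/dev/null
    fake_check
}

# Models a cron script that runs one more command after the check.
job_with_trailing_command() {
    fake_check
    echo "cleanup" >/dev/null   # succeeds, so its status replaces the check's
}

if job_with_check_last; then echo "check last: success"; else echo "check last: failure reported"; fi
if job_with_trailing_command; then echo "trailing command: failure hidden"; else echo "trailing command: failure reported"; fi
```

If a command really has to run after the check, one option is to store the status right after the check (status=$?) and finish the script with exit "$status".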

In our tests, we can set $HDFS_OUTPUT_DIR (/cms/rucio_daily) to a personal temporary HDFS directory such as /tmp/username/rucio_daily.

@mrceyhun mrceyhun added the cron label Aug 2, 2022

mrceyhun commented Aug 2, 2022

fyi @kyrylogy @leggerf @brij01
