Running an application
Last updated on 2025-05-22 | Edit this page
Estimated time: 15 minutes
Overview
Questions
- How can you use JUBE to submit batch jobs on an HPC System?
Objectives
- Use the
done_file
attribute to deal with asynchronous commands. - Automatically generate job scripts from templates.
- Automatically submit jobs on an HPC system.
Dealing with asynchronous commands
The commands in the do
clauses so far, were all
synchronous. This means that the effect of the specified command is
observable after the command completed. For example a
mkdir -p my_dir
will return after the directory
my_dir
is created. JUBE will continue execution of the next
do
clause defined in the step of complete the step.
HPC systems however are used differently than personal computers. To ensure the resources are used fairly and efficiently, work is handed to a scheduler to allocate resources for the job. However, those resources may not be available right away. The scheduler therefore does not block the command, but rather return to the user and handles the job without further interaction of the user.
This means, however, the effect of the job script (a completed
application run) will most likely not have materialized when the command
return and JUBE shoud wait for it before continuing execution of the
next do
clause. JUBE handles such asynchronous behavior via
the done_file
attribute of the do
clause. If
specified, JUBE will not assume the successful execution of the
specified command in the do
clause for completion, but the
existence of the the file specified in done_file
as a side
effect of the do
command.
This way, JUBE does not need to understand the details of the
asynchronous command specified, as long as the command eventually
generates the file specified in done_file
. For HPC systems
and batch jobs, this can be exploited by generating a file as part of
the batch job, if the application ran successfully as part of the batch
job.
SH
....
srun ...
JUBE_ERR_CODE=$?
if [ $JUBE_ERR_CODE -ne 0 ]; then
touch error
exit $JUBE_ERR_CODE
fi
...
touch ready
In case of the SLURM scheduler, the corresponding do
clause would then look like so.
OUTPUT
######################################################################
# benchmark: GROMACS
# id: 24
#
# MD Simulation Workflow
######################################################################
Running workpackages (#=done, 0=wait, E=error):
########################################00000000000000000000 ( 2/ 3)
| stepname | all | open | wait | error | done |
|-----------------|-----|------|------|-------|------|
| prepare_sources | 1 | 0 | 0 | 0 | 1 |
| build | 1 | 0 | 0 | 0 | 1 |
| run | 1 | 0 | 1 | 0 | 0 |
>>>> Benchmark information and further useful commands:
>>>> id: 24
>>>> handle: jube_run
>>>> dir: jube_run/000024
>>>> continue: jube continue jube_run --id 24
>>>> analyse: jube analyse jube_run --id 24
>>>> result: jube result jube_run --id 24
>>>> info: jube info jube_run --id 24
>>>> log: jube log jube_run --id 24
######################################################################
Any workpackage with a pending done_file
will be listed
under wait
in the table of open tasks. To try to advance
any waiting workpackage the user needs to needs the
continue
command, as listed in the output given by
JUBE.
OUTPUT
```output
######################################################################
# benchmark: GROMACS
# id: 24
#
# MD Simulation Workflow
######################################################################
Running workpackages (#=done, 0=wait, E=error):
############################################################ ( 3/ 3)
| stepname | all | open | wait | error | done |
|-----------------|-----|------|------|-------|------|
| prepare_sources | 1 | 0 | 0 | 0 | 1 |
| build | 1 | 0 | 0 | 0 | 1 |
| run | 1 | 0 | 0 | 0 | 1 |
>>>> Benchmark information and further useful commands:
>>>> id: 24
>>>> handle: jube_run
>>>> dir: jube_run/000024
>>>> analyse: jube analyse jube_run --id 24
>>>> result: jube result jube_run --id 24
>>>> info: jube info jube_run --id 24
>>>> log: jube log jube_run --id 24
######################################################################
Platform-independent workflows
JUBE offers batch job templates compatible with a variety of batch
systems that utilize this mechanism. These files are available in the
JUBE installation directory unter share/jube/platform
, with
subdirectories per scheduler. Provided with version 2.7.1 of JUBE are
files prepared for LSF, Moab,
PBS, and SLURM.
In each of these directories JUBE provides at least the two files
platform.xml
and submit.job.in
. The former is
a collection of parameter sets, file sets, and substitution sets. The
latter is a template for the corresponding scheduler.
Taking a look into the batch script template reveils the semingly
automatic handling of generating either an error
or a
ready
file as part of the batch job.
SH
...
JUBE_ERR_CODE=$?
if [ $JUBE_ERR_CODE -ne 0 ]; then
#FLAG_ERROR#
exit $JUBE_ERR_CODE
fi
...
#FLAG#
If the return code is not 0 (zero), the action specified by the substitution parameter #FLAG_ERROR# is executed, and the script exits with the corresponding non-zero return code. At the end of the batch script, the substitution parameter #FLAG# will generate the specified file.
If we take a look into the corresponding substitution set in
platform.xml
, we can see te following configuration.
XML
<sub source="#FLAG#" dest="touch $done_file" />
<sub source="#FLAG_ERROR#" dest="touch $error_file" />
as well as the corresponding the parameterset executeset
in platform.xml
contains the definition of
done_file
and error_file
.
Using these pre-defined sets in platform.xml
in
combination with the batch script template, we can easily create the
run step for GROMACS.
Using the indirecttion of the corresponding sets in the
platform.xml
we now have a workflow specification that does
not in itself reference any specific scheduler, as that is handled via
the JUBE_INCLUDE_PATH
from the shell JUBE is executed
in.
Callout
Checkout further substitution patterns available in the batch script template to identify parameters you can use to add commands to your batch script.
Note that those scripts and definitions are only for your convenience. If they don’t fit your need, you can adapt them to your needs or fully rely on self-provided configurations.
Key Points
- JUBE provides a default job script template for different schedulers.
- JUBE allows for asynchronous execution of workflows using the
done_file
attribute. - Asynchronous JUBE executions are dependent on the generation of files by the asynchronous task to continue.