An initial workflow specification

Last updated on 2025-06-24

Overview

Questions

  • How can JUBE be used to automate the build process of a given HPC application?

Objectives

  • Define an initial workflow.
  • Create the benchmark’s system description automatically.

Writing an initial workflow


Mapping our experience with JUBE so far to the steps involved in compiling GROMACS, we can define two different steps:

  • download and unpack the GROMACS sources
  • build the application
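The two steps above could be sketched as the following JUBE XML configuration. This is a minimal sketch, not the lesson's reference solution: the benchmark name, the GROMACS version (2024.5), the download URL, the CMake options, and the parallel job count are all assumptions chosen for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="gromacs_build" outpath="bench_run">
    <comment>Sketch: download and build GROMACS (version and URL are assumptions)</comment>

    <!-- Step 1: fetch and unpack the sources -->
    <step name="download">
      <do>wget https://ftp.gromacs.org/gromacs/gromacs-2024.5.tar.gz</do>
      <do>tar -xzf gromacs-2024.5.tar.gz</do>
    </step>

    <!-- Step 2: configure, compile, and install in one step.
         The work directory of the "download" step is reachable via
         the link named "download"; $$ yields a literal $ for the shell. -->
    <step name="build" depend="download">
      <do>cmake -S download/gromacs-2024.5 -B build -DCMAKE_INSTALL_PREFIX=$$PWD/install/gromacs-2024.5</do>
      <do>cmake --build build --parallel 4</do>
      <do>cmake --install build</do>
    </step>
  </benchmark>
</jube>
```

Assuming the file is saved as `gromacs.xml` (a hypothetical name), it can be started with `jube run gromacs.xml`.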

Discussion

Why don’t we define more or fewer steps?

If we put all actions from the episode on manually building GROMACS into a single step, we would download the GROMACS sources again for every build.

If we separated the configure and make phases into two steps, they would appear independent, yet every make phase belongs to exactly one configure phase. Therefore, these two actions should be part of the same step.

After executing this workflow we can make several observations:

  1. Some of the strings now hardcoded are part of a pattern that could be represented as a parameter instead.

  2. Downloading and unpacking happen inside the workflow’s run directory tree, so every additional run downloads and unpacks the sources again.

So let’s address these issues one at a time.

Reducing code-copy in the specification


In several locations, the version of GROMACS is referenced: in the source archive, the unpacked source directory, and the installation path. If we changed the version in the future, we would have to edit several locations in the workflow configuration. We can avoid this by defining an appropriate parameterset.

Edit your initial GROMACS workflow configuration and replace key values with parameters to make the workflow more flexible.
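One possible shape for such a parameterset is sketched below as a fragment to be placed inside the `<benchmark>` element. The parameter names, the version value, and the download URL are assumptions for illustration; only the idea of deriving the archive name, source directory, and installation path from a single version parameter comes from the text above.

```xml
<!-- The version is defined once; all other values are derived from it. -->
<parameterset name="gromacs_pset">
  <parameter name="gromacs_version">2024.5</parameter>
  <parameter name="gromacs_archive">gromacs-${gromacs_version}.tar.gz</parameter>
  <parameter name="gromacs_src_dir">gromacs-${gromacs_version}</parameter>
  <parameter name="gromacs_install_dir">install/gromacs-${gromacs_version}</parameter>
</parameterset>

<!-- A step must reference the parameterset with <use> before
     its parameters can be substituted in <do> commands. -->
<step name="download">
  <use>gromacs_pset</use>
  <do>wget https://ftp.gromacs.org/gromacs/${gromacs_archive}</do>
  <do>tar -xzf ${gromacs_archive}</do>
</step>
```

With this layout, switching to another GROMACS release means editing only the `gromacs_version` value.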

Breaking out of the workpackage sandbox


By default, JUBE will create a specific directory for each workpackage to ensure that independent workpackages do not interfere with each other. However, sometimes it is helpful to break out of this safety net.

Caution

The sandboxing of individual workflow runs and their workpackages is done for good reason: to limit potential interactions among independent workpackages and ensure consistency and reproducibility.

Only deviate from this when you have a very strong argument to do so.

In addition to the workflow-specific parameters defined by the workflow itself, JUBE also defines variables containing information about the current workflow run. These variables can be referenced just like any parameter defined as part of a parameterset. One of these variables is $jube_benchmark_home, which contains the absolute path to the location of the workflow specification.

Callout

Find a full list of internal variables set by JUBE in the glossary of JUBE’s documentation.

Using this variable, we can now define the installation path outside of the directory structure referenced by outpath. However, any path outside of JUBE’s run directory tree can be accessed (and potentially written to) by multiple workflow runs. Therefore, you will need to take precautions not to overwrite installations accidentally.
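As a sketch, the installation path could be anchored at the location of the workflow specification like this. The parameter names, the version value, and the `install` subdirectory are assumptions; `$jube_benchmark_home` is the JUBE variable described above.

```xml
<parameterset name="gromacs_pset">
  <parameter name="gromacs_version">2024.5</parameter>
  <!-- Absolute path next to the workflow specification, i.e. outside
       the run directory tree that JUBE creates below outpath. -->
  <parameter name="gromacs_install_dir">${jube_benchmark_home}/install/gromacs-${gromacs_version}</parameter>
</parameterset>
```

Keeping the version in the path gives each GROMACS release its own directory, which is one simple precaution against overwriting a different installation.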

GROMACS is now installed externally, but we have yet to automate the decision whether to build it or to use the installed version. This can be handled with the active attribute. Steps and do tags (and a few others) can carry an active attribute whose value is either true, false, or any parsable boolean Python expression. When it evaluates to false, the respective entity is disabled. When it evaluates to true, the respective entity remains enabled (just as if no active attribute had been given).
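The behaviour of the active attribute can be tried out in isolation with a small configuration like the following sketch. The benchmark, parameterset, and parameter names are made up for this demonstration and are not part of the GROMACS workflow.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<jube>
  <benchmark name="active_demo" outpath="bench_run">
    <parameterset name="demo_pset">
      <parameter name="run_extra">false</parameter>
      <parameter name="n_jobs" type="int">4</parameter>
    </parameterset>

    <step name="demo">
      <use>demo_pset</use>
      <!-- No active attribute: the action is always enabled -->
      <do>echo "always runs"</do>
      <!-- Parameter value is substituted first, giving active="false" -->
      <do active="$run_extra">echo "skipped while run_extra is false"</do>
      <!-- After substitution this is the Python expression 4 == 4 -->
      <do active="$n_jobs == 4">echo "runs because the expression is true"</do>
    </step>
  </benchmark>
</jube>
```

Parameters are substituted before the expression is evaluated, so the value of active can depend on any parameter available in the step.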

To build GROMACS only when no complete installation is available, we need

  • an indicator that a previous install was successful,
  • the evaluation of that indicator,
  • an expression to use as the value for the active attribute, and
  • an action that removes any preexisting installation (that may be incomplete).

For this purpose, we add a final do action to the build step that creates a file indicating that this step is complete and, by transitivity, that all prior do actions completed successfully. Furthermore, we create a parameter as part of the gromacs_pset that indicates whether this file exists in the target directory. We use a parameter here because it makes it easy to check for the existence of a file with a shell expression. This parameter can then be referenced in the appropriate do actions of the build step.
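Put together, the guarded build step might look like the sketch below, written as a fragment for the `<benchmark>` element. Apart from gromacs_pset, all names are assumptions: the marker file `.install_complete`, the parameters `gromacs_marker` and `gromacs_installed`, the version value, the CMake commands, and a preceding step called `download`. The sketch also assumes that a parameter with mode="shell" takes the command's standard output as its value; it prints `True` or `False` so that `not $gromacs_installed` is a valid Python expression after substitution.

```xml
<parameterset name="gromacs_pset">
  <parameter name="gromacs_version">2024.5</parameter>
  <parameter name="gromacs_install_dir">${jube_benchmark_home}/install/gromacs-${gromacs_version}</parameter>
  <!-- Marker file written as the very last action of a successful build -->
  <parameter name="gromacs_marker">${gromacs_install_dir}/.install_complete</parameter>
  <!-- Evaluated by the shell: True if the marker exists, otherwise False -->
  <parameter name="gromacs_installed" mode="shell">if [ -f ${gromacs_marker} ]; then echo True; else echo False; fi</parameter>
</parameterset>

<step name="build" depend="download">
  <use>gromacs_pset</use>
  <!-- Remove a possibly incomplete installation before rebuilding -->
  <do active="not $gromacs_installed">rm -rf ${gromacs_install_dir}</do>
  <do active="not $gromacs_installed">cmake -S download/gromacs-${gromacs_version} -B build -DCMAKE_INSTALL_PREFIX=${gromacs_install_dir}</do>
  <do active="not $gromacs_installed">cmake --build build --parallel 4</do>
  <do active="not $gromacs_installed">cmake --install build</do>
  <!-- Only reached if all previous actions succeeded -->
  <do active="not $gromacs_installed">touch ${gromacs_marker}</do>
</step>
```

Because the marker is created last, an interrupted or failed build leaves no marker behind, and the next run removes the partial installation and builds again.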

Key Points

  • Group actions that belong together and have a 1:1 relationship in a single step.
  • Basing parameter values on other parameter values can help reduce code duplication and increase the flexibility and maintainability of your workflow.
  • You can generate build files from templates using dynamic values from parameter sets.