Glossary#

API#

Application programming interface. hpcFlow’s API allows us to design and execute workflows from within a Python interpreter or Jupyter notebook.

Command files#

If you want to refer to any files that are used as inputs or outputs, they should be listed under command_files in the workflow file:

command_files:
  - label: new_inp_file
    name:
      name: friction_conductance.inp
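
The label (new_inp_file in this example) is what other parts of the workflow, such as a task schema, can then use to refer to the file, rather than its file name.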

CLI#

Command line interface. The CLI is typically how we interact with hpcFlow on HPC systems.

cluster#

See HPC

Environment/virtual environment#

An environment is an isolated set of installed software. Using environments allows you to have multiple copies of the same software installed in different environments so you can run different versions, or to run two pieces of software with competing dependencies on the same machine. Using and sharing environments helps make your work reproducible because someone can use the same environment on a different machine and be sure they have the same versions of everything.

HPC#

High-performance computer/computing

jobscript#

A job submission script that is used to queue a job on a batch scheduler system, such as SLURM or SGE. Jobscripts are generated by hpcFlow during workflow submission.

Tasks#

Tasks are concrete uses of task schemas, run with defined input values.
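
As an illustrative sketch (the schema name and input value below are invented), a task in a workflow file names the task schema it uses and supplies values for that schema's inputs:

tasks:
  - schema: simulate_friction
    inputs:
      conductance_value: 1.2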

Task schema#

This is a template for a task you want to run, with definitions of the inputs and outputs that are expected.

hpcFlow has many built-in task schemas, but you may want to write your own.
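
As a rough sketch (the names are invented, the exact fields may differ, and the actions that actually run commands are omitted), a task schema declares the inputs it expects and the outputs it produces:

task_schemas:
  - objective: simulate_friction
    inputs:
      - parameter: conductance_value
    outputs:
      - parameter: contact_pressure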

Workflow#

A pipeline that processes data in some way. A workflow is a list of tasks that run one after the other.

Workflow template#

A workflow template parameterises a workflow, providing the required input values for the task schemas of the workflow. However, it doesn’t actually run the workflow. A workflow template is usually just the list of tasks, but can optionally also include the hpcFlow environments, the task schemas, and the command files.
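
Putting the earlier pieces together, a workflow template file that bundles everything might look something like the following (a hedged sketch only; names are invented and section details may vary):

command_files:
  - label: new_inp_file
    name:
      name: friction_conductance.inp

task_schemas:
  - objective: simulate_friction
    inputs:
      - parameter: conductance_value
    outputs:
      - parameter: contact_pressure

tasks:
  - schema: simulate_friction
    inputs:
      conductance_value: 1.2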