seisflows.system.singularity
A Cluster-adjacent base class that provides core utilities for interactions with HPC systems running Singularity. Must be overloaded by subclasses defined for specific workload managers / clusters.
The Singularity class was written for clusters that run SeisFlows through Docker containers using Singularity. A separate class is needed because Docker containers do not have access to the workload manager (e.g., SLURM/sbatch), so job submission calls cannot be run directly from the Python environment. Instead, each time a job must be submitted to the Cluster, the User must submit it manually.
Note
Users looking to run SeisFlows directly via their Cluster Conda environment should look at the Cluster class and its workload manager-specific subclasses.
Classes
Singularity System
Module Contents
- class seisflows.system.singularity.Singularity(title=None, mpiexec='', ntask_max=None, tasktime=1, environs='', singularity_exec='singularity', path_container=None, **kwargs)
Bases: seisflows.system.workstation.Workstation
Singularity System
HPC interfacing through Docker/Singularity containers
Parameters
- type title:
str
- param title:
The name used to submit jobs to the system, defaults to the name of the current working directory
- type mpiexec:
str
- param mpiexec:
Function used to invoke executables on the system. For example ‘mpirun’, ‘mpiexec’, ‘srun’, ‘ibrun’
- type ntask_max:
int
- param ntask_max:
limit the number of concurrent tasks in a given array job
- type tasktime:
float
- param tasktime:
maximum job time in minutes for each job spawned by the SeisFlows master job during a workflow. These include, e.g., running the forward solver, adjoint solver, smoother, kernel combiner. All spawned tasks receive the same task time. Fractions of minutes acceptable.
- type environs:
str
- param environs:
Optional environment variables, provided in the format VAR1=var1,VAR2=var2… These are set using os.environ
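The `environs` format above can be illustrated with a minimal sketch. The helper names `parse_environs` and `set_environs` are hypothetical and are not part of SeisFlows; the sketch only assumes a comma-separated string of `KEY=value` pairs and the standard-library `os.environ` mapping.

```python
import os


def parse_environs(environs):
    """Split a 'VAR1=var1,VAR2=var2' string into a dict (hypothetical helper).

    An empty string yields an empty dict. Items without '=' raise ValueError.
    """
    pairs = {}
    for item in environs.split(","):
        item = item.strip()
        if not item:
            continue  # skip empty entries, e.g. from an empty input string
        key, sep, value = item.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected KEY=value, got {item!r}")
        pairs[key.strip()] = value.strip()
    return pairs


def set_environs(environs):
    """Parse the string and export each pair into the current process."""
    parsed = parse_environs(environs)
    os.environ.update(parsed)
    return parsed
```

With the default `environs=''`, `set_environs` returns an empty dict and leaves the environment unchanged.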
Paths
- type path_container:
str
- param path_container:
path to the Docker image that contains the adjTomo software package
- __doc__ = Multiline-String
"""
Workstation System [System Base]
--------------------------------
Defines foundational structure for System module. When used standalone, runs
solver tasks either in serial (if `nproc`==1; i.e., without MPI) or in parallel
(if `nproc`>1; i.e., with MPI). All other tasks are run in serial.

Parameters
----------
:type ntask: int
:param ntask: number of individual tasks/events to run during workflow.
    Must be <= the number of source files in `path_specfem_data`
:type nproc: int
:param nproc: number of processors to use for each simulation. Choose 1 for
    serial simulations, and `nproc`>1 for parallel simulations.
:type tasktime: float
:param tasktime: maximum job time in units minutes for each job spawned by
    the SeisFlows master job during a workflow. These include, e.g., running
    the forward solver, adjoint solver, smoother, kernel combiner. All spawned
    tasks receive the same task time. Fractions of minutes acceptable. If set
    as `None`, no tasktime will be enforced.
:type mpiexec: str
:param mpiexec: MPI executable on system. Defaults to 'mpirun -n ${NPROC}'
:type array: str
:param array: for `ntask` > 1, determine which tasks to submit to run. By
    default (NoneType) this submits all task IDs [0:ntask), or for single
    runs, submits only the first task ID, 0. However, for debugging or manual
    control purposes, Users may input a string of task IDs that they would
    like to run. Follows formatting of SLURM array directive
    (https://slurm.schedmd.com/job_array.html), which is, for example:
    1,2,3-8:2,10 -> 1,2,3,5,7,10
    where '-' denotes a range (inclusive), and ':' denotes an optional step.
    If ':' step is not given for a range, then step defaults to 1.
:type rerun: int
:param rerun: [EXPERIMENTAL FEATURE] attempt to re-run failed tasks or array
    tasks submitted with `run`. Collects information about failed jobs (or
    array jobs) after a failure, and re-submits with `run`. `rerun` is an
    integer defining how many times the User wants System to try and rerun
    before failing the entire job. If 0 (default), a single task failure will
    cause main job failure.
:type log_level: str
:param log_level: logger level to pass to logging module.
    Available: 'debug', 'info', 'warning', 'critical'
:type verbose: bool
:param verbose: if True, formats the log messages to include the file name
    and line number of the log message in the source code, as well as the
    message and message type. Useful for debugging but also very verbose so
    not recommended for production runs.

Paths
-----
:type path_output_log: str
:param path_output_log: path to a text file used to store the outputs of the
    package wide logger, which are also written to stdout
:type path_par_file: str
:param path_par_file: path to parameter file which is used to instantiate
    the package
:type path_log_files: str
:param path_log_files: path to a directory where individual log files are
    saved whenever a number of parallel tasks are run on the system.
***

A Cluster-adjacent base class that provides core utilities for interactions
with HPC systems running Singularity. Must be overloaded by subclasses defined
for specific workload managers / clusters.

The `Singularity` class was written for clusters running SeisFlows through
Docker containers using Singularity. The reason for writing a separate class
is because Docker containers do not have access to the workload manager
(e.,g SLURM/sbatch) and therefore we cannot run job submission calls directly
from the Python environment. Instead, each time a job must be submitted to the
Cluster, the User must manually submit.

.. note::
    To users looking to run SeisFlows directly via their Cluster Conda
    environment, look at the `Cluster` class and its workload
    manager-specific sub-classes
"""
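The `array` string format described in the docstring above (comma-separated IDs, inclusive `-` ranges, optional `:` step) can be expanded as in the following sketch. The function name `parse_array` is hypothetical; this is not necessarily how SeisFlows parses the value, only an illustration of the stated rules.

```python
def parse_array(array):
    """Expand a SLURM-style array string into a list of task IDs.

    Hypothetical helper: '-' denotes an inclusive range and ':' an optional
    step that defaults to 1 when omitted.
    """
    task_ids = []
    for part in array.split(","):
        part = part.strip()
        if "-" in part:
            # Split 'start-stop:step' into its range and optional step
            bounds, _, step = part.partition(":")
            start, _, stop = bounds.partition("-")
            step = int(step) if step else 1
            # +1 because the range is inclusive of the stop value
            task_ids.extend(range(int(start), int(stop) + 1, step))
        else:
            task_ids.append(int(part))
    return task_ids


# The example from the docstring
print(parse_array("1,2,3-8:2,10"))  # [1, 2, 3, 5, 7, 10]
```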
- mpiexec = ''
- ntask_max
- tasktime = 1
- environs = ''
- singularity_exec = 'singularity'
- setup()
Copies the ‘submit’ and ‘run’ .py scripts from the repository into the working directory so that the User can run them directly. This manual step lets Users run with a container without calling native environment commands (e.g., sbatch) from inside the container.
- property run_call_header
The run call defines the Singularity wrapper which executes run calls using the Docker image. It also binds the current working directory inside the container so that we can write back to the local filesystem.
Note
The generalized Cluster returns an empty string, but child system classes will need to overwrite the submit call.
- Return type:
str
- Returns:
the system-dependent portion of a run call
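As a rough illustration of what such a header might look like, the sketch below assembles a `singularity exec` prefix that binds a working directory into the container. The function `build_run_call_header` is hypothetical, and binding the directory to the same path inside the container is an assumption; the actual string returned by `run_call_header` may differ. Only the `exec` subcommand and the `--bind src:dest` option of the Singularity CLI are relied on.

```python
import os
import shlex


def build_run_call_header(singularity_exec, path_container, workdir=None):
    """Build the container-specific prefix of a run call (hypothetical).

    The working directory is bound to the same path inside the container
    (an assumption) so that outputs land on the local filesystem.
    """
    workdir = workdir or os.getcwd()
    bind = f"{shlex.quote(workdir)}:{shlex.quote(workdir)}"
    return " ".join([
        shlex.quote(singularity_exec),  # e.g. the default 'singularity'
        "exec",
        "--bind", bind,
        shlex.quote(path_container),    # path to the container image
    ])
```

A full run call would then be this header followed by the command to execute inside the container.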
- submit(workdir=None, parameter_file='parameters.yaml')
Submits the main workflow job as a serial job directly to the system that is running the master job.
- Parameters:
workdir (str) – path to the current working directory
parameter_file (str) – parameter file name used to instantiate the SeisFlows package
- run(funcs, single=False, **kwargs)
Runs tasks multiple times in parallel by submitting NTASK new jobs to the system. The list of functions and its kwargs are saved as pickle files, and then reloaded by each submitted process with specific environment variables. Each spawned process will run the list of functions.
- Parameters:
funcs (list of methods) – a list of functions that should be run in order. All kwargs passed to run() will be passed into the functions.
single (bool) – run a single-process, non-parallel task, such as smoothing the gradient, which only needs to be run once. This will change how the job array and the number of tasks are defined, such that the job is submitted as a single-core job to the system.
run_call (str) – the call used to submit the run script. If None, attempts default run call which should be suited for the given system. Can be overwritten by child classes to involve other arguments