seisflows.system.workstation

The Workstation class is the foundational System module in SeisFlows. It provides utilities for submitting jobs in serial on a small-scale machine, e.g., a workstation or a laptop. All other System classes build on this class.

Classes

Workstation

Workstation System [System Base]

Module Contents

class seisflows.system.workstation.Workstation(ntask=1, nproc=1, tasktime=1, mpiexec=None, array=None, rerun=0, log_level='DEBUG', verbose=False, workdir=os.getcwd(), path_output=None, path_system=None, path_par_file=None, path_output_log=None, path_log_files=None, **kwargs)

Workstation System [System Base]

Defines the foundational structure for the System module. When used standalone, runs solver tasks either in serial (if nproc==1; i.e., without MPI) or in parallel (if nproc>1; i.e., with MPI). All other tasks are run in serial.

Parameters

type ntask:

int

param ntask:

number of individual tasks/events to run during workflow. Must be <= the number of source files in path_specfem_data

type nproc:

int

param nproc:

number of processors to use for each simulation. Choose 1 for serial simulations, and nproc>1 for parallel simulations.

type tasktime:

float

param tasktime:

maximum job time, in minutes, for each job spawned by the SeisFlows master job during a workflow. These include, e.g., running the forward solver, adjoint solver, smoother, and kernel combiner. All spawned tasks receive the same task time. Fractions of minutes are acceptable. If set to None, no tasktime is enforced.

type mpiexec:

str

param mpiexec:

MPI executable on system. Defaults to ‘mpirun -n ${NPROC}’

type array:

str

param array:

for ntask > 1, determines which tasks to submit. By default (None) this submits all task IDs [0:ntask), or for single runs, only the first task ID, 0. For debugging or manual control, users may instead provide a string of task IDs to run. The string follows the format of the SLURM array directive (https://slurm.schedmd.com/job_array.html); for example, 1,2,3-8:2,10 -> 1,2,3,5,7,10, where ‘-’ denotes an inclusive range and ‘:’ denotes an optional step. If no ‘:’ step is given for a range, the step defaults to 1.

type rerun:

int

param rerun:

[EXPERIMENTAL FEATURE] attempt to re-run failed tasks or array tasks submitted with run. Collects information about failed jobs (or array jobs) after a failure and re-submits them with run. rerun is an integer defining how many times System should try to rerun failed tasks before failing the entire job. If 0 (default), a single task failure will cause the main job to fail.

type log_level:

str

param log_level:

logger level to pass to logging module. Available: ‘debug’, ‘info’, ‘warning’, ‘critical’

type verbose:

bool

param verbose:

if True, formats log messages to include the file name and line number of the logging call in the source code, as well as the message and message type. Useful for debugging, but very verbose, so not recommended for production runs.
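The array format described above can be illustrated with a short parser. This is a minimal sketch, not the SeisFlows implementation: the function name parse_array_string is hypothetical, and it handles only the comma, range, and step syntax described for the array parameter.

```python
def parse_array_string(array):
    """Expand a SLURM-style array string into a sorted list of task IDs.

    Hypothetical helper for illustration; not part of the SeisFlows API.
    """
    task_ids = set()
    for part in array.split(","):
        part = part.strip()
        if "-" in part:
            # Range with an optional step, e.g. "3-8:2"
            span, _, step = part.partition(":")
            start, stop = span.split("-")
            step = int(step) if step else 1  # step defaults to 1
            # The range is inclusive of `stop`
            task_ids.update(range(int(start), int(stop) + 1, step))
        else:
            # Single task ID, e.g. "10"
            task_ids.add(int(part))
    return sorted(task_ids)


print(parse_array_string("1,2,3-8:2,10"))  # [1, 2, 3, 5, 7, 10]
```

The range 3-8:2 expands to 3, 5, 7 because the step of 2 skips over the inclusive upper bound of 8.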

Paths

type path_output_log:

str

param path_output_log:

path to a text file used to store the outputs of the package wide logger, which are also written to stdout

type path_par_file:

str

param path_par_file:

path to parameter file which is used to instantiate the package

type path_log_files:

str

param path_log_files:

path to a directory where individual log files are saved whenever a number of parallel tasks are run on the system.
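The interaction between log_level, verbose, and path_output_log can be sketched with the standard logging module. This is an illustrative sketch only: the function name build_logger, the logger name, and the exact format strings are assumptions, not the formats SeisFlows uses.

```python
import logging
import sys


def build_logger(path_output_log, log_level="DEBUG", verbose=False):
    """Configure a logger that writes to both stdout and a text file.

    Hypothetical helper for illustration; format strings are assumed.
    """
    logger = logging.getLogger("workstation_sketch")
    logger.setLevel(log_level.upper())  # accepts 'debug', 'info', etc.
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls

    if verbose:
        # Include the source file name, line number, and message type
        fmt = "%(asctime)s | %(filename)s:%(lineno)d | %(levelname)s | %(message)s"
    else:
        fmt = "%(asctime)s | %(message)s"
    formatter = logging.Formatter(fmt)

    # Same messages go to stdout and to the output log file
    for handler in (logging.StreamHandler(sys.stdout),
                    logging.FileHandler(path_output_log)):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

Calling build_logger("sflog.txt", log_level="info", verbose=True).info("starting") would, under these assumptions, print one line to stdout and append the same line to sflog.txt.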

***

ntask = 1
nproc = 1
tasktime = 1
rerun = 0
mpiexec = None
array = None
log_level = ''
verbose = False
path
_acceptable_log_levels = ['CRITICAL', 'WARNING', 'INFO', 'DEBUG']
check()

Checks parameters and paths

setup()

Creates the SeisFlows directory structure in preparation for a SeisFlows workflow. Ensures that any config information left over from a previous workflow is not overwritten by the new workflow. Should be called by submit()

Note

This function is expected to create dirs: SCRATCH, SYSTEM, OUTPUT and the following log files: output, error

Note

Logger is configured here as all workflows, independent of system, will be calling setup()

Return type:

tuple of str

Returns:

(path to output log, path to error log)

finalize()

Tear down tasks for the end of an Inversion-based iteration

submit(workdir=None, parameter_file='parameters.yaml')

Submits the main workflow job as a serial job, run directly on the system that is running the master job

Parameters:
  • workdir (str) – path to the current working directory

  • parameter_file (str) – parameter file name used to instantiate the SeisFlows package

run(funcs, single=False, tasktime=None, **kwargs)

Executes task multiple times in serial.

Note

kwargs will be passed to the underlying method that is called

Parameters:
  • funcs (list of methods) – a list of functions that should be run in order. All kwargs passed to run() will be passed into the functions.

  • single (bool) – run a single-process, non-parallel task, such as smoothing the gradient, which only needs to be run once. This changes how the job array and the number of tasks are defined, such that the job is submitted to the system as a single-core job.

  • tasktime (float) – Custom tasktime, in minutes, for running the given functions funcs. If not given, defaults to the System variable tasktime. If System tasktime is also None, no tasktime is enforced (infinite time). If tasks exceed the given tasktime, the program will exit
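The serial execution pattern described for run() can be sketched as a nested loop over task IDs and functions. This is a simplified illustration under stated assumptions: run_serial, forward, and adjoint are hypothetical names, and passing the task ID as a task_id keyword argument is an assumption rather than a documented detail. Timeouts and rerun handling are omitted.

```python
def run_serial(funcs, task_ids, **kwargs):
    """Call each function in order for every task ID, one task at a time.

    Hypothetical stand-in for a serial run; not the SeisFlows method.
    """
    results = []
    for task_id in task_ids:
        for func in funcs:
            # All keyword arguments are forwarded to every function
            results.append(func(task_id=task_id, **kwargs))
    return results


def forward(task_id, tag=""):
    return f"forward {tag}{task_id}"


def adjoint(task_id, tag=""):
    return f"adjoint {tag}{task_id}"


calls = run_serial([forward, adjoint], task_ids=[0, 1], tag="src")
print(calls)
# ['forward src0', 'adjoint src0', 'forward src1', 'adjoint src1']
```

Each task completes its full list of functions before the next task starts, which is what distinguishes this from a cluster System that would dispatch the tasks concurrently.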

task_ids(single=False)

Returns a list of task IDs (each linked to an individual source) to supply to the ‘run’ function. By default this returns the range of available tasks [0:ntask). See the description of the array parameter in the class docstring for how to manually set the task IDs used by a run call.

Parameters:

single (bool) – If we only want to run a single process, this will default to task ID 0

Return type:

list

Returns:

a list of task IDs to be used by the run function
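The selection logic described above can be sketched as follows. This is a minimal sketch with assumptions: select_task_ids is a hypothetical free function, array is taken to be an already-parsed list of integers, and giving single precedence over array is an assumption not stated in the text.

```python
def select_task_ids(ntask, array=None, single=False):
    """Choose which task IDs to hand to a run call.

    Hypothetical helper for illustration; not the SeisFlows method.
    """
    if single:
        # Single-process tasks always use the first task ID
        return [0]
    if array is None:
        # Default: every task ID in [0:ntask)
        return list(range(ntask))
    # User-selected subset (assumed to be parsed into integers already)
    return list(array)


print(select_task_ids(3))               # [0, 1, 2]
print(select_task_ids(3, single=True))  # [0]
print(select_task_ids(10, array=[1, 5]))  # [1, 5]
```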

_get_log_file(task_id)

To mimic clusters, which assign job numbers to spawned processes, on-system runs also assign job numbers, simply by incrementing the number on the log files already on the system.
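One way to picture this incrementing scheme is shown below. The file naming pattern (a zero-padded job number, an underscore, then the task ID) and the function name next_log_file are assumptions made for illustration; the actual naming used by SeisFlows is not stated here.

```python
import os
import re
import tempfile


def next_log_file(log_dir, task_id):
    """Return a log path whose job number is one above the highest in log_dir.

    Hypothetical helper; the '<job>_<task>.log' pattern is assumed.
    """
    pattern = re.compile(r"^(\d+)_\d+\.log$")
    numbers = []
    for name in os.listdir(log_dir):
        match = pattern.match(name)
        if match:
            numbers.append(int(match.group(1)))
    # Start at 1 when no log files exist yet
    job_number = max(numbers, default=0) + 1
    return os.path.join(log_dir, f"{job_number:04d}_{task_id}.log")


with tempfile.TemporaryDirectory() as log_dir:
    first = next_log_file(log_dir, task_id=0)
    open(first, "w").close()  # simulate a finished task writing its log
    second = next_log_file(log_dir, task_id=0)
    names = (os.path.basename(first), os.path.basename(second))

print(names)  # ('0001_0.log', '0002_0.log')
```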