seisflows.system.workstation
============================

.. py:module:: seisflows.system.workstation

.. autoapi-nested-parse::

   The `workstation` class is the foundational `System` module in SeisFlows.
   It provides utilities for submitting jobs in SERIAL on a small-scale
   machine, e.g., a workstation or a laptop. All other `System` classes build
   on this class.


Classes
-------

.. autoapisummary::

   seisflows.system.workstation.Workstation


Module Contents
---------------

.. py:class:: Workstation(ntask=1, nproc=1, tasktime=1, mpiexec=None, array=None, rerun=0, log_level='DEBUG', verbose=False, workdir=os.getcwd(), path_output=None, path_system=None, path_par_file=None, path_output_log=None, path_log_files=None, **kwargs)

   Workstation System [System Base]
   --------------------------------
   Defines foundational structure for System module. When used standalone,
   runs solver tasks either in serial (if `nproc`==1; i.e., without MPI) or in
   parallel (if `nproc`>1; i.e., with MPI). All other tasks are run in serial.

   Parameters
   ----------
   :type ntask: int
   :param ntask: number of individual tasks/events to run during workflow.
       Must be <= the number of source files in `path_specfem_data`
   :type nproc: int
   :param nproc: number of processors to use for each simulation. Choose 1 for
       serial simulations, and `nproc`>1 for parallel simulations.
   :type tasktime: float
   :param tasktime: maximum job time in units of minutes for each job spawned
       by the SeisFlows master job during a workflow. These include, e.g.,
       running the forward solver, adjoint solver, smoother, kernel combiner.
       All spawned tasks receive the same task time. Fractions of minutes are
       acceptable. If set to `None`, no tasktime will be enforced.
   :type mpiexec: str
   :param mpiexec: MPI executable on system. Defaults to 'mpirun -n ${NPROC}'
   :type array: str
   :param array: for `ntask` > 1, determine which tasks to submit to run. By
       default (NoneType) this submits all task IDs [0:ntask), or for single
       runs, submits only the first task ID, 0. However, for debugging or
       manual control purposes, Users may input a string of task IDs that they
       would like to run. Follows formatting of SLURM array directive
       (https://slurm.schedmd.com/job_array.html), which is, for example:
       1,2,3-8:2,10 -> 1,2,3,5,7,10
       where '-' denotes a range (inclusive), and ':' denotes an optional step.
       If ':' step is not given for a range, then step defaults to 1.
   :type rerun: int
   :param rerun: [EXPERIMENTAL FEATURE] attempt to re-run failed tasks or
       array tasks submitted with `run`. Collects information about failed
       jobs (or array jobs) after a failure, and re-submits with `run`.
       `rerun` is an integer defining how many times the User wants System to
       try to rerun before failing the entire job. If 0 (default), a single
       task failure will cause main job failure.
   :type log_level: str
   :param log_level: logger level to pass to logging module.
       Available: 'debug', 'info', 'warning', 'critical'
   :type verbose: bool
   :param verbose: if True, formats the log messages to include the file
       name and line number of the log message in the source code, as well as
       the message and message type. Useful for debugging but also very
       verbose, so not recommended for production runs.

   Paths
   -----
   :type path_output_log: str
   :param path_output_log: path to a text file used to store the outputs of
       the package-wide logger, which are also written to stdout
   :type path_par_file: str
   :param path_par_file: path to parameter file which is used to instantiate
       the package
   :type path_log_files: str
   :param path_log_files: path to a directory where individual log files are
       saved whenever a number of parallel tasks are run on the system.
   ***


   .. py:attribute:: ntask
      :value: 1


   .. py:attribute:: nproc
      :value: 1


   .. py:attribute:: tasktime
      :value: 1


   .. py:attribute:: rerun
      :value: 0


   .. py:attribute:: mpiexec
      :value: None


   .. py:attribute:: array
      :value: None


   .. py:attribute:: log_level
      :value: ''


   .. py:attribute:: verbose
      :value: False


   .. py:attribute:: path


   .. py:attribute:: _acceptable_log_levels
      :value: ['CRITICAL', 'WARNING', 'INFO', 'DEBUG']


   .. py:method:: check()

      Checks parameters and paths



   .. py:method:: setup()

      Create the SeisFlows directory structure in preparation for a
      SeisFlows workflow. Ensure that if any config information is left over
      from a previous workflow, these files are not overwritten by
      the new workflow. Should be called by submit()

      .. note::
          This function is expected to create dirs: SCRATCH, SYSTEM, OUTPUT
          and the following log files: output, error

      .. note::
          Logger is configured here as all workflows, independent of system,
          will be calling setup()

      :rtype: tuple of str
      :return: (path to output log, path to error log)



   .. py:method:: finalize()

      Tear down tasks for the end of an Inversion-based iteration



   .. py:method:: submit(workdir=None, parameter_file='parameters.yaml')

      Submits the main workflow job as a serial job submitted directly to
      the system that is running the master job

      :type workdir: str
      :param workdir: path to the current working directory
      :type parameter_file: str
      :param parameter_file: parameter file name used to instantiate
          the SeisFlows package



   .. py:method:: run(funcs, single=False, tasktime=None, **kwargs)

      Executes task multiple times in serial.

      .. note::
          kwargs will be passed to the underlying `method` that is called

      :type funcs: list of methods
      :param funcs: a list of functions that should be run in order. All
          kwargs passed to run() will be passed into the functions.
      :type single: bool
      :param single: run a single-process, non-parallel task, such as
          smoothing the gradient, which only needs to be run once.
          This will change how the job array and the number of tasks is
          defined, such that the job is submitted as a single-core job to
          the system.
      :type tasktime: float
      :param tasktime: Custom tasktime in units of minutes for running the
          given functions `funcs`. If not given, defaults to the System
          variable `tasktime`. If System `tasktime` is also None, defaults to
          no tasktime (infinite time). If tasks exceed the given `tasktime`,
          the program will exit



   .. py:method:: task_ids(single=False)

      Return a list of Task IDs (linked to each individual source) to supply
      to the 'run' function. By default this returns a range of available
      tasks [0:ntask).

      See class docstring of parameter `array` for how to manually set
      task_ids to use for run call.

      :type single: bool
      :param single: If we only want to run a single process, this will
          default to TaskID == 0
      :rtype: list
      :return: a list of task IDs to be used by the `run` function



   .. py:method:: _get_log_file(task_id)

      To mimic clusters which assign job numbers to spawned processes, our
      on-system runs will also assign job numbers simply by incrementing the
      number on the log files on system.
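
The `array` string format and the default behavior of `task_ids` described above can be illustrated with a minimal sketch. This is not the SeisFlows implementation: the function names `expand_array` and `select_task_ids` are hypothetical, and the choice that `single=True` takes precedence over a user-supplied `array` is an assumption made for illustration only.

```python
def expand_array(array):
    """Expand a SLURM-style array string into a sorted list of task IDs.

    Hypothetical helper, written only to illustrate the documented format:
    ',' separates entries, '-' denotes an inclusive range, and ':' gives an
    optional step for a range (default step is 1).
    """
    task_ids = set()
    for part in array.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # Split "3-8:2" into the range "3-8" and the optional step "2"
            span, _, step = part.partition(":")
            start, _, stop = span.partition("-")
            step = int(step) if step else 1
            # The documented range is inclusive, so extend the stop by one
            task_ids.update(range(int(start), int(stop) + 1, step))
        else:
            task_ids.add(int(part))
    return sorted(task_ids)


def select_task_ids(ntask, array=None, single=False):
    """Sketch of the task ID selection described for `task_ids`.

    Assumption: `single` takes precedence over `array`.
    """
    if single:
        return [0]  # single runs use only the first task ID
    if array is None:
        return list(range(ntask))  # default: all task IDs in [0:ntask)
    return expand_array(array)


if __name__ == "__main__":
    print(expand_array("1,2,3-8:2,10"))
    print(select_task_ids(ntask=4))
    print(select_task_ids(ntask=4, single=True))
```

Applied to the example string from the `array` parameter description, `expand_array("1,2,3-8:2,10")` yields the task IDs 1, 2, 3, 5, 7, 10.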