seisflows.system.slurm
======================

.. py:module:: seisflows.system.slurm

.. autoapi-nested-parse::

   The Simple Linux Utility for Resource Management (SLURM) is a commonly used
   workload manager on many high-performance computers and clusters. The Slurm
   system class provides generalized utilities for interacting with Slurm systems.

   Useful commands for figuring out system-specific required parameters::

       $ sinfo --Node --long  # Determine the cores-per-node for partitions

   .. note::
       The main development system for SeisFlows used SLURM. Therefore the other
       system supers will not be up to date until access to those systems is
       granted. This Rosetta Stone for converting from SLURM to other workload
       management tools will be useful: https://slurm.schedmd.com/rosetta.pdf

   .. note::
      SLURM systems expect walltime/tasktime in the format "minutes",
      "minutes:seconds", or "hours:minutes:seconds". SeisFlows uses the last of
      these and converts task and walltimes from input minutes to a time string.
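
   The minutes-to-time-string conversion described in the note above can be
   sketched as follows. This is a minimal illustration, not the SeisFlows
   implementation; the function name ``minutes_to_time_string`` is hypothetical.

   ```python
   def minutes_to_time_string(minutes):
       """Convert a duration in minutes to an 'HH:MM:SS' string.

       Hypothetical helper for illustration only, not part of SeisFlows.
       """
       # Work in whole seconds so fractional minutes are handled cleanly
       total_seconds = int(round(float(minutes) * 60))
       hours, remainder = divmod(total_seconds, 3600)
       mins, secs = divmod(remainder, 60)
       return f"{hours:02d}:{mins:02d}:{secs:02d}"


   print(minutes_to_time_string(90))   # 90 minutes as hours:minutes:seconds
   ```

   Working in integer seconds avoids floating point residue when the input is
   a fractional number of minutes.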

   TODO
       Create 'slurm_singularity', a child class for singularity-based runs which
       loads and runs programs through singularity, OR add a parameter option
       which will change the run and/or submit calls


Classes
-------

.. autoapisummary::

   seisflows.system.slurm.Slurm


Module Contents
---------------

.. py:class:: Slurm(ntask_max=100, slurm_args='', **kwargs)

   Bases: :py:obj:`seisflows.system.cluster.Cluster`


   System Slurm
   ------------
   Interface for submitting and monitoring jobs on HPC systems running the 
   Simple Linux Utility for Resource Management (SLURM) workload manager.

   Parameters
   ----------
   :type slurm_args: str
   :param slurm_args: Any (optional) additional SLURM arguments that will
       be passed to the SBATCH scripts. Should be in the form:
       '--key1=value1 --key2=value2'

   Paths
   -----
   ***


   .. py:attribute:: __doc__


   .. py:attribute:: ntask_max
      :value: 100


   .. py:attribute:: slurm_args
      :value: ''


   .. py:attribute:: partition
      :value: None


   .. py:attribute:: submit_to
      :value: None


   .. py:attribute:: _partitions


   .. py:attribute:: _completed_states
      :value: ['COMPLETED']


   .. py:attribute:: _failed_states
      :value: ['TIMEOUT', 'FAILED', 'NODE_FAIL', 'OUT_OF_MEMORY', 'CANCELLED']


   .. py:attribute:: _pending_states
      :value: ['PENDING', 'RUNNING']
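
   The three state lists above partition the job states reported by the
   scheduler. A minimal sketch of how a raw state string might be mapped onto
   them is given below. The function ``classify_state`` and the
   normalization it applies (keeping only the first word and dropping a
   trailing ``+``) are assumptions made for illustration, not the SeisFlows
   logic.

   ```python
   COMPLETED_STATES = ["COMPLETED"]
   FAILED_STATES = ["TIMEOUT", "FAILED", "NODE_FAIL", "OUT_OF_MEMORY",
                    "CANCELLED"]
   PENDING_STATES = ["PENDING", "RUNNING"]


   def classify_state(state):
       """Map a raw state string to 'completed', 'failed', 'pending' or 'unknown'.

       Hypothetical helper for illustration only.
       """
       words = state.split()
       if not words:
           return "unknown"
       # Assumption: extra detail may follow the state name (for example
       # 'CANCELLED by 1001') or a '+' may mark a truncated column
       key = words[0].rstrip("+")
       if key in COMPLETED_STATES:
           return "completed"
       if key in FAILED_STATES:
           return "failed"
       if key in PENDING_STATES:
           return "pending"
       return "unknown"


   print(classify_state("CANCELLED by 1001"))
   ```

   Returning ``'unknown'`` for unlisted states leaves it to the caller to
   decide whether such a state should be treated as a failure.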


   .. py:method:: check()

      Checks parameters and paths


   .. py:property:: nodes

      Defines the number of nodes, which is derived from the system node size


   .. py:property:: node_size

      Defines the node size of a given cluster partition. This is a fixed
      number defined by the system architecture.


   .. py:property:: submit_call_header

      The submit call defines the SBATCH header which is used to submit a
      workflow task list to the system. It is usually dictated by the
      system's required parameters, such as account names and partitions.
      Submit calls are modified and called by the `submit` function.

      :rtype: str
      :return: the system-dependent portion of a submit call


   .. py:method:: run_call(executable='', single=False, array=None, tasktime=None)

      The run call defines the SBATCH call which is used to run tasks during
      an executing workflow. Like the submit call, its arguments are dictated
      by the given system. Run calls are modified and called by the `run`
      function.

      :type executable: str
      :param executable: the actual executable to run within the SBATCH
          directive. Something like './script.py'
      :type array: str
      :param array: overwrite the `array` variable to run specific jobs. If
          not provided, then we will run jobs 0-{ntask}%{ntask_max}. Jobs 
          should be submitted in the format of a SLURM array string, 
          something like: 0,1,3,5 or 2-4,8-22
      :type single: bool
      :param single: flag to get a run call that is meant to be run on the
          mainsolver (ntask==1), or run for all jobs (ntask times). Examples
          of single process runs include smoothing, and kernel combination
      :rtype: str
      :return: the system-dependent portion of a run call


   .. py:method:: _stdout_to_job_id(stdout)
      :staticmethod:


      The stdout message returned after an SBATCH job is submitted, from which
      we get the job number, differs between systems, so the parsing is allowed
      to vary.

      .. note:: Examples

          1) standard example: Submitted batch job 4738244
          2) (1) with '--parsable' flag: 4738244
          3) federated cluster: Submitted batch job 4738244; Maui
          4) (3) with '--parsable' flag: 4738244; Maui

      This function deals with cases (2) and (4). Other systems that have more
      complicated stdout messages will need to overwrite this function.

      :type stdout: str
      :param stdout: standard SBATCH response after submitting a job with the
          '--parsable' flag
      :rtype: str
      :return: a matching job ID. We convert str->int->str to ensure that
          the job id is an integer value (which it must be)
      :raises SystemExit: if the job id does not evaluate as an integer
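
      The parsing described above for cases (2) and (4) can be sketched as
      follows. This is a minimal illustration under the stated behavior
      (split off any cluster name after ``;``, then round-trip through
      ``int``). The standalone function name ``stdout_to_job_id`` and the
      error message are hypothetical.

      ```python
      import sys


      def stdout_to_job_id(stdout):
          """Extract the job ID from '--parsable' style sbatch output.

          Illustrative sketch only, handles '4738244' and '4738244; Maui'.
          """
          # Keep only the text before an optional ';<cluster name>' suffix
          candidate = stdout.strip().split(";")[0].strip()
          try:
              # str -> int -> str guarantees the ID is an integer value
              return str(int(candidate))
          except ValueError:
              sys.exit(f"could not parse job ID from sbatch output: {stdout!r}")


      print(stdout_to_job_id("4738244; Maui"))
      ```

      Output without the '--parsable' flag, such as
      ``Submitted batch job 4738244``, is not an integer after the split and
      therefore triggers the ``SystemExit`` branch in this sketch.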


   .. py:method:: run(funcs, single=False, tasktime=None, array=None, _attempts=0, **kwargs)

      Runs tasks multiple times in embarrassingly parallel fashion on a SLURM
      cluster. Executes the list of functions (`funcs`) NTASK times with each
      task occupying NPROC cores.

      .. note::
          
          Completely overwrites the `Cluster.run()` command

      :type funcs: list of methods
      :param funcs: a list of functions that should be run in order. All
          kwargs passed to run() will be passed into the functions.
      :type single: bool
      :param single: run a single-process, non-parallel task, such as
          smoothing the gradient, which only needs to be run once.
          This will change how the job array and the number of tasks is
          defined, such that the job is submitted as a single-core job to
          the system.
      :type tasktime: float
      :param tasktime: Custom tasktime in units of minutes for running the given
          functions `funcs`. If not given, defaults to the System variable
          `tasktime`. If tasks exceed the given `tasktime`, the program will
          exit.
      :type array: str
      :param array: overwrite the `array` variable to run specific jobs. If
          not provided, then we will run jobs 0-{ntask}%{ntask_max}. Jobs 
          should be submitted in the format of a SLURM array string, 
          something like: 0,1,3,5 or 2-4,8-22
      :type _attempts: int
      :param _attempts: a recursive counter for failed job runs that allows 
          the `run` function to re-attempt failed jobs up to `rerun` number
          of times


   .. py:method:: task_ids(single=False)

      Overwrite `system.workstation.task_ids` to get SLURM specific array
      configurations which are passed as strings for the --array={task_ids()}
      SLURM directive, rather than lists which is how `system.workstation`
      handles this

      Relevant format definition: https://slurm.schedmd.com/job_array.html

      :type single: bool
      :param single: If we only want to run a single process, this will
          default to TaskID == 0
      :rtype: str
      :return: formatted string of Task IDs to be used by the `run` function
          via the `run_call`
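
      A minimal sketch of building such an array string is shown below. The
      function names are hypothetical, and the sketch assumes zero-based task
      IDs, so the last index of the inclusive SLURM range is ``ntask - 1``.
      The ``%`` suffix is the SLURM syntax for limiting how many array tasks
      run at the same time.

      ```python
      def array_string(ntask, ntask_max, single=False):
          """Build a string suitable for the SLURM '--array=' directive.

          Hypothetical helper for illustration only.
          """
          if single:
              # A single process only ever uses task ID 0
              return "0"
          # Assumption: IDs are zero-based, so the inclusive range ends at
          # ntask - 1; '%ntask_max' caps the number of concurrent tasks
          return f"0-{ntask - 1}%{ntask_max}"


      def ids_to_array_string(task_ids):
          """Format an explicit list of task IDs, e.g. for re-running jobs."""
          return ",".join(str(i) for i in task_ids)


      print(array_string(ntask=10, ntask_max=4))
      print(ids_to_array_string([0, 1, 3, 5]))
      ```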


   .. py:method:: query_job_states(job_id, sort=False)

      Overwrites `system.cluster.Cluster.query_job_states`

      Queries completion status of an array job by running the SLURM `sacct`
      command.

      .. note::
          The actual command line call will look something like this::

              $ sacct -nLX -o jobid,state -j 441630
              441630_0    PENDING
              441630_1    COMPLETED

      .. note::
          SACCT flag options are described as follows:
          -L: queries all available clusters, not just the cluster that ran 
              the `sacct` call. Used for federated clusters
          -X: suppress the .batch and .extern jobnames that are normally
              returned but don't represent the actual running job

      :type job_id: str
      :param job_id: main job id to query, returned from the subprocess.run 
          that ran the jobs
      :type sort: bool
      :param sort: sort by job ids or job array ids. Defaults to False because
          currently running jobs may return job numbers that cannot be sorted
          e.g., 1_0, 1_1, 1_[2-5]. We only use sort when recovering from job 
          failure because then we are assured that all jobs have run.
      :rtype: (list, list)
      :return: (job ids, corresponding job states). Returns (None, None) if
          `sacct` does not return a useful stdout (e.g., jobs have not
          yet initialized on system)
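
      Parsing the two-column `sacct` output shown above into the documented
      return value can be sketched as follows. This is an illustration only:
      the function ``parse_sacct_stdout`` is hypothetical, it operates on an
      already captured stdout string rather than calling `sacct`, and the
      sort key (the integer after the last underscore) is an assumption.

      ```python
      def parse_sacct_stdout(stdout, sort=False):
          """Split 'jobid state' lines into (job_ids, job_states).

          Hypothetical helper for illustration only.
          """
          rows = []
          for line in stdout.strip().splitlines():
              fields = line.split()
              if len(fields) >= 2:
                  # Keep the job ID and the first word of the state column
                  rows.append((fields[0], fields[1]))
          if not rows:
              # Nothing useful returned, e.g. jobs not yet initialized
              return None, None
          if sort:
              # Assumption: sort on the array index after the underscore.
              # Unexpanded IDs such as '1_[2-5]' would fail here, which is
              # why sorting is only safe once all jobs have run
              rows.sort(key=lambda row: int(row[0].split("_")[-1]))
          job_ids = [row[0] for row in rows]
          job_states = [row[1] for row in rows]
          return job_ids, job_states


      example = "441630_1    COMPLETED\n441630_0    PENDING\n"
      print(parse_sacct_stdout(example, sort=True))
      ```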