Background

Learn how the SeisFlows source code is structured and the design choices that went into building the package.


Modularity

SeisFlows is built on five core modules which each take care of a separate functionality. Most modules are independent, only relying on their own set of paths and parameters.

The exception to this is the Workflow module, which calls and combines the other modules within its own functions to solve the task at hand.

Within each module, a base Class defines the general functionality. Additional ‘child’ classes inherit and modify the ‘base’ class. The SeisFlows Modules are:

  • Workflow: Controls the order and execution of tasks performed during a workflow, e.g., first create working directory, then run simulations. (Base: seisflows.workflow.forward.Forward)

  • System: Compute system interface used to run SeisFlows on different computer architectures. A consistent internal structure makes it relatively seamless to switch between workstation and HPC workload manager implementations. (Base: seisflows.system.workstation.Workstation)

  • Solver: Interface and wrapper for external numerical solvers used to generate models, synthetic waveforms, and gradients. (Base: seisflows.solver.specfem.Specfem)

  • Preprocessing: Signal processing operations performed on time series, as well as adjoint source generation and misfit windowing. (Base: seisflows.preprocess.default.Default)

  • Optimization: Nonlinear optimization algorithms used to find the minimum of a given objective function. (Base: seisflows.optimize.gradient.Gradient)


Inheritance in Python

SeisFlows is built upon the object-oriented programming concept of inheritance. This documentation page is a simple introduction to this concept to help new users and developers understand how SeisFlows3 is built and expected to operate.

Inheritance is the ability of one class to derive attributes from another class, improving code re-usability. Some terminology used in to talk about this inheritance is defined here:

  • Base (Baseclass): The foundational Base class which defines standard behavior. The Baseclass does not inherit any of its attributes or behavior.

  • Parent (Superclass): A class which is being inherited from. A Parent can be a Baseclass, but inheritance can also be daisy-chained.

  • Child (Subclass): A class that inherits some or all of its attributes from a parent.

Consider the following toy example where we define a Base class which has some internal attributes and functions.

class Base:
    """
    A Baseclass example. All SeisFlows3 modules contain a Base class which
    defines the foundational structure that all inherited classes will adopt
    """
    def __init__(self, example_integer=5, example_float=1.2):
        """
        The init function defines instance-variables and their
        default values

        :type
        """
        self.example_integer = example_integer
        self.example_float = example_float

    def check(self):
        """
        Check functions ensure that parameters are set correctly.
        This toy problem check function simply asserts types are
        set correctly.
        """
        assert(self.example_integer < 10), \
            "The example integer must be < 10"

        assert(self.example_float > 1.), \
            "The example float must be > 1."

    def manipulate(self):
        """
        Manipulate internal attributes

        Each module provides functions which serve a purpose in
        the larger workflow.

        :rtype: float
        :return: example integer added to example float
        """
        return self.example_integer + self.example_float

We can quickly look at the behavior of this class by creating an instance of it.

module = Base(example_integer=11, example_float=3.)
module.check()
---------------------------------------------------------------------------

AssertionError                            Traceback (most recent call last)

<ipython-input-48-0d3a45ed832e> in <module>
      1 module = Base(example_integer=11, example_float=3.)
----> 2 module.check()


<ipython-input-47-19eac124f82c> in check(self)
     21         """
     22         assert(self.example_integer < 10), \
---> 23             "The example integer must be < 10"
     24
     25         assert(self.example_float > 1.), \


AssertionError: The example integer must be < 10
module = Base(example_integer=6, example_float=3.)
module.check()
print(module.manipulate())
9.0

What problem does inheritance solve?

Say we want to use the structure of the Baseclass, but we need the manipulate function to return the subtraction of example_integer and example_float, instead of their addition. There are a few ways to approach this problem.

Plausible but cumbersome approaches

  • Copy-paste: One method of doing this would be to copy-paste the entire Baseclass (e.g., as BaseCopy) and re-define the manipulate function. This would isntantly double our code length, with a lot of the new code being completly redundant. Additionally, if we made any changes to the Baseclass, we would need to also make those changes to BaseCopy to keep functionality consistent.

  • Create a new function: Another method would be to define a completly new function, e.g., manipulate2. This is more acceptable, BUT if some other script, function or module calls Base.manipulate(), we will now need to make them call Base.manipulate2() instead. This involves a signficant amount of work. Similarly, consider the case where we want to go back to the original manipulate function.

The inheritance approach

Inheritance solves this problem but allowing us to overwrite the manipulate function by creating a Child class, which inherits the properties of its Parent. This results in the least amount of code writing, keeps behavior consistent, and allows flexibility in editing established code (e.g., the Baseclass). Let’s see how this is done:

class Super(Base):
    """
    This Superclass will now inherit all of the attributes of the Baseclass.
    It does nothing new.
    """
    pass
module = Super(example_integer=6, example_float=3.)
module.check()
print(module.manipulate())
9.0

Overwriting functions

To solve the problem stated above, we can totally overwrite the manipulate function to provide different behavior

class Super(Base):
    """
    This Superclass overwrites the manipulate function
    """
    def manipulate(self):
        """
        Manipulate internal attributes

        :rtype: float
        :return: example integer subtracted from example float
        """
        return self.example_integer - self.example_float
module = Super(example_integer=6, example_float=3.)
module.check()
print(module.manipulate())
3.0

super() functions

The super() function “returns a proxy object that delegates method calls to a parent or sibling class.” In other words, super() calls the Parent class.

We can use the Python super() function to directly incorporate functions from the parent class, allowing us to build upon previously written code. This is useful if you don’t want to completely overwrite a previously-defined function.

class Super(Base):
    """
    This Superclass overwrites the manipulate function
    """
    def manipulate(self):
        """
        Manipulate internal attributes

        :rtype: float
        :return: example integer subtracted from example float
        """
        added_values = super().manipulate()  # This calls Base.manipulate()
        print(f"added_values={added_values}")
        return added_values ** 2
module = Super(example_integer=6, example_float=3.)
module.check()
print(module.manipulate())
added_values=9.0
81.0

Multiple inheritance

Inheritance can be chained, meaning former Children can become Parents! Although chaining inheritance can quickly become messy and confusing, it is useful for extending existing capabilities without having to make direct edits to the Parent classes.

Let’s say you want to inherit all of the capabilities of the Super class, but you want to extend it further for your own specific workflow. Here we define a Superer class, which inherits and extends the Super class (which itself inherits from the Base class).

class Superer(Super):
    """
    This Superclass inherits from the Super class, which itself inherits from the Base class
    """
    def __init__(self, new_value=8, **kwargs):
        """
        We can extend the internal attributes in our Superclass.
        The **kwargs allow us to be lazy and assume that the User understands class values must be
        passed all the way to the Baseclass
        """
        super().__init__(**kwargs)
        self.new_value = new_value

    def check(self):
        """
        We would like to extend the check function to address our new value,
        while still checking the original values
        """
        super().check()
        assert(self.new_value != 0), "New value must be > 0"

    def manipulate(self):
        """
        We can further manipulate this function, which itself has been changed in
        the Superclass.

        :rtype: float
        :return: example integer subtracted from example float
        """
        squared_values = super().manipulate()
        print(f"squared_values={squared_values}")
        return squared_values / 2

    def manipulate_more(self):
        """
        We can also define completely new functions which are not present in any of the Parent classes.
        This is useful when your Superclass needs to fully extend the functionalities of its Parents.
        """
        manipulated_value = self.manipulate()
        return self.new_value + manipulated_value
module = Superer(example_integer=6, example_float=3., new_value=2)
module.check()
print(f"manipulate: {module.manipulate()}")
print(f"manipulate_more: {module.manipulate_more()}")
added_values=9.0
squared_values=81.0
manipulate: 40.5
added_values=9.0
squared_values=81.0
manipulate_more: 42.5

An Example of Inheritance in SeisFlows

Here we look at how we define the System module Chinook, which allows SeisFlows to interface with the University of Alaska supercomputer, Chinook.

  • System.Workstation: Workstation is the Base system class. It defines required paths and parameters as well as general functions for running SeisFlows on a workstation/laptop.

  • System.Cluster: The Cluster class inherits properties from Workstation and additionally modifies some of the call behavior so that SeisFlows can run on HPCs. For example, Cluster defines new parameters: WALLTIME (dictates the length of time a workflow is submitted for) and TASKTIME (defines the expected time required for any single simulation).

  • System.Slurm The Slurm system inherits from Cluster, defining even more specific functionality, such as checking of the job queue using the sacct command.

Note

Note that those using other workload managers (e.g., PBS, LSF), will need to branch off here and define a new class that inherits from Cluster.

  • System.Chinook: Individual Slurm systems differ in how Users interact with the workload manager. Chinook defines the calls structure for working on the Slurm-based University of Alaska Fairbanks HPC, named Chinook.

    Chinook inherits all of the attributes of Workstation, Cluster and Slurm, and adds it’s own specifics such as the available partitions on the cluster.

  • SeisFlows System: SeisFlows abstracts all of this behavior behind a generalized System module. Therefore, calling something like system.submit() will bring call on Chinook.submit(), which may inherit from any of the parent classes.

    All SeisFlows modules are similarly structured, with a Base classes defining standard behavior, and Child classes extending and modifying such behavior.