seisflows.workflow.forward

The simplest workflow you can run is a set of forward simulations that generate synthetics from a velocity model. The Forward class therefore represents the BASE workflow, and all other workflows build on the scaffolding it defines.

Classes

Forward

Forward Workflow [Workflow Base]

Module Contents

class seisflows.workflow.forward.Forward(modules=None, generate_data=False, stop_after=None, export_traces=False, export_residuals=False, workdir=os.getcwd(), path_output=None, path_data=None, path_state_file=None, path_model_init=None, path_model_true=None, path_eval_grad=None, **kwargs)

Forward Workflow [Workflow Base]

Defines the foundational structure for the Workflow module. When used standalone, it runs the forward solver in parallel and (optionally) calculates data-synthetic misfit and adjoint sources.

Parameters

type modules:

list of module

param modules:

instantiated SeisFlows modules which should have been generated by the function seisflows.config.import_seisflows with a parameter file generated by seisflows.configure

type generate_data:

bool

param generate_data:

How to address ‘data’ in the workflow:

  • False: real data needs to be provided by the User in path_data/{source_name}/* in the same format that the solver will produce synthetics (controlled by solver.format).

  • True: ‘data’ will be generated as synthetic seismograms using a target model provided in path_model_true.

type stop_after:

str

param stop_after:

optional name of a task in the task list (use seisflows print tasks to get the task list for a given workflow) after which to stop the workflow, allowing the user to stop a workflow early to explore intermediate results or debug.

type export_traces:

bool

param export_traces:

export all waveforms that are generated by the external solver to path_output. If False, solver traces stored in scratch may be discarded at any time in the workflow

type export_residuals:

bool

param export_residuals:

export all residuals (data-synthetic misfit) that are generated by the external solver to path_output. If False, residuals stored in scratch may be discarded at any time in the workflow

Paths

type workdir:

str

param workdir:

working directory in which to perform a SeisFlows workflow. SeisFlows internal directory structure will be created here. Default cwd

type path_output:

str

param path_output:

path to directory used for permanent storage on disk. Results and exported scratch files are saved here.

type path_data:

str

param path_data:

path to any externally stored data required by the solver

type path_state_file:

str

param path_state_file:

path to a text file used to track the current status of a workflow (i.e., what functions have already been completed), used for checkpointing and resuming workflows

type path_model_init:

str

param path_model_init:

path to the starting model used to calculate the initial misfit. Must match the expected solver_io format.

type path_model_true:

str

param path_model_true:

path to a target model, used if generate_data is True and a set of synthetic ‘observations’ is required for the workflow.

type path_eval_grad:

str

param path_eval_grad:

scratch path to store files for gradient evaluation, including models, kernels, gradient and residuals.

***

_modules = None
stop_after = None
generate_data = False
export_traces = False
export_residuals = False
path
_required_modules = ['system', 'solver']
_optional_modules = ['preprocess']
_states
property task_list

USER-DEFINED TASK LIST. This property defines a list of class methods that take NO INPUT and have NO RETURN STATEMENTS. This defines your linear workflow, i.e., these tasks are to be run in order from start to finish to complete a workflow.

This excludes ‘check’ (which is run during ‘import_seisflows’) and ‘setup’, which should be run separately.

Note

For workflows that require an iterative approach (e.g. inversion), this task list will be looped over, so ensure that any setup and teardown tasks (run once per workflow, not once per iteration) are not included.

Return type:

list

Returns:

list of methods to call in order during a workflow
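
As a rough illustration of the task list idea, the sketch below uses a hypothetical ToyWorkflow class (not SeisFlows code). The method names are borrowed from this page, but which tasks appear in the list and their order are assumptions made for the example.

```python
class ToyWorkflow:
    """Hypothetical stand-in for a workflow class; not the SeisFlows source."""

    def __init__(self):
        self.log = []

    @property
    def task_list(self):
        # Bound methods that take no input and return nothing,
        # listed in the order they should be run.
        return [self.evaluate_initial_misfit, self.finalize_iteration]

    def evaluate_initial_misfit(self):
        self.log.append("evaluate_initial_misfit")

    def finalize_iteration(self):
        self.log.append("finalize_iteration")


workflow = ToyWorkflow()
for task in workflow.task_list:
    task()  # tasks are called with no arguments

task_names = [task.__name__ for task in workflow.task_list]
```

Because each entry is a bound method, the method name is available through `__name__`, which is convenient for logging or for matching against a task name supplied by the user.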

check()

Checks that the workflow has its required modules and runs their respective checks.

setup()

Assigns modules as attributes of the workflow, e.g., self.solver to access the solver module (or workflow.solver from outside the class).

Makes required path structure for the workflow, runs setup functions for all the required modules of this workflow.

checkpoint()

Saves the active SeisFlows working state to disk as a text file so that the workflow can be resumed following a crash, pause or termination.
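
A minimal sketch of text-file checkpointing, assuming a simple "name: status" line format. The helper names write_state and read_state, the file name and the status string are hypothetical and are not taken from SeisFlows.

```python
import os
import tempfile


def write_state(path, states):
    """Write one 'task_name: status' line per task (assumed format)."""
    with open(path, "w") as f:
        for name, status in states.items():
            f.write(f"{name}: {status}\n")


def read_state(path):
    """Read the state file back into a dict; a missing file means no state."""
    states = {}
    if not os.path.exists(path):
        return states
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            name, status = line.split(":", 1)
            states[name.strip()] = status.strip()
    return states


with tempfile.TemporaryDirectory() as tmp:
    state_file = os.path.join(tmp, "state.txt")  # hypothetical file name
    write_state(state_file, {"evaluate_initial_misfit": "completed"})
    restored = read_state(state_file)
    missing = read_state(os.path.join(tmp, "does_not_exist.txt"))
```

A plain-text state file has the practical advantage that a user can inspect or edit it by hand before resuming.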

run()

Calls the Task List in order to ‘run’ the workflow. Contains logic to keep track of completed tasks and to avoid re-running tasks that have previously been completed (e.g., if you are restarting your workflow).
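
The run logic described above can be sketched as follows. This is an illustration under stated assumptions, not the SeisFlows implementation: run_tasks, the states dictionary and the "completed" status string are hypothetical.

```python
def run_tasks(tasks, states, stop_after=None):
    """Run callables in order, skipping any already marked as completed.

    `states` maps task names to a status string and is updated in place.
    """
    executed = []
    for task in tasks:
        name = task.__name__
        if states.get(name) != "completed":
            task()
            states[name] = "completed"
            executed.append(name)
        if name == stop_after:
            break  # stop early, leaving later tasks untouched
    return executed


calls = []


def step_a():
    calls.append("a")


def step_b():
    calls.append("b")


def step_c():
    calls.append("c")


# Pretend step_a finished during an earlier, interrupted run
states = {"step_a": "completed"}
executed = run_tasks([step_a, step_b, step_c], states, stop_after="step_b")
```

Here step_a is skipped because it is already recorded as completed, step_b runs, and the loop stops before step_c because of the stop_after argument.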

generate_synthetic_data(**kwargs)

For synthetic inversion cases, we can use the workflow machinery to generate ‘data’ by running simulations through a target/true model for each of our ntask sources. This only needs to be run once during a workflow.

_generate_synthetic_data_single(path_model=None, _copy_function=unix.ln, **kwargs)

Barebones forward simulation to create synthetic data and export and save the synthetics in the correct locations. Hijacks function run_forward_simulations but uses some different path exports.

Exports data to disk in path_data and then symlinks to solver directories for each source.

Note

Must be run by system.run() so that solvers are assigned individual task ids/ working directories.

Parameters:
  • _copy_function (function) – how to transfer data from path_data to scratch:

    • unix.ln (default): symlink data to avoid copying large amounts of data onto the scratch directory.

    • unix.cp: copy data to avoid burdening the filesystem that the actual data resides on, or to avoid touching the original data on disk.

evaluate_initial_misfit(path_model=None, save_residuals=False, save_forward_arrays=False, _preproc_only=False, **kwargs)

Evaluate the initial model misfit. This requires setting up ‘data’ before generating synthetics; the data are either copied from a user-supplied directory or generated by running forward simulations with a target model. Forward simulations are then run and preprocessing quantifies the data-synthetic misfit.

Note

These steps are run all together on system to save on queue wait time, because we are potentially running two simulations back to back.

Parameters:
  • path_model (str) – path to the model files that will be used to evaluate initial misfit. If not given, defaults to searching for model provided in path_model_init.

  • save_residuals (str) – Location to save ‘residuals_*.txt’ files, which are used to calculate total misfit (f_new). Requires a string formatter {src} so that the preprocessing module can generate a new file for each source. The remainder of the string is some combination of the iteration, step count etc. Allows inheriting workflows to override this path if more specific file naming is required.

  • save_forward_arrays (str) – relative path (relative to /scratch/solver/<source_name>/<model_database>) to move the forward arrays which are used for adjoint simulations. Mainly used for ambient noise adjoint tomography which requires multiple forward simulations prior to adjoint simulations, putting forward arrays at the risk of overwrite. Normal Users can leave this default.

  • _preproc_only (bool) – a debug tool to ONLY run the preprocessing contained in evaluate_objective_function, skipping over the forward simulation. You would want to do this, e.g., if your workflow already ran the forward simulation and you just want to re-pick windows or test out different filter bands. It is recommended that this be run in debug mode and that you change tasktime to reflect that no forward simulation will be run.
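
The {src} formatter in save_residuals can be illustrated with plain string formatting. The helper residual_path and the template below (directory layout, iteration and step-count suffix) are hypothetical and serve only to show how one path per source could be derived.

```python
def residual_path(save_residuals, source_name):
    """Fill the {src} placeholder with a source name (illustrative helper)."""
    return save_residuals.format(src=source_name)


# Hypothetical template: the text after {src} stands for iteration and step count
template = "scratch/eval_grad/residuals/residuals_{src}_1_0.txt"
paths = [residual_path(template, name) for name in ("001", "002")]
```

Each source gets its own residuals file, so parallel tasks do not write to the same path.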

prepare_data_for_solver(_src=None, _copy_function=unix.ln, **kwargs)

Determines how to provide data to each of the solvers, either by symlinking (or copying) data in from a user-provided path, or by generating synthetic ‘data’ by running forward simulations through the target model. This usually only needs to be run once per workflow, even for inversions.

Note

Must be run by system.run() so that solvers are assigned individual task ids and working directories

Parameters:
  • _src (str) – internal variable used by child classes which inherit from Forward, allowing other workflows to change the default path that is searched for data. Needs to be a wildcard. By default this function looks at the following wildcard path: ‘{path_data}/{source_name}/*’

  • _copy_function (function) – how to transfer data from path_data to scratch:

    • unix.ln (default): symlink data to avoid copying large amounts of data onto the scratch directory.

    • unix.cp: copy data to avoid burdening the filesystem that the actual data resides on, or to avoid touching the original data on disk.
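
The two transfer modes can be illustrated with the standard library. This is a sketch rather than SeisFlows code: os.symlink and shutil.copy stand in for unix.ln and unix.cp, and the helper stage_data, the directory names and the file name are hypothetical.

```python
import glob
import os
import shutil
import tempfile


def stage_data(path_data, source_name, dst_dir, copy_function=os.symlink):
    """Transfer every file matching {path_data}/{source_name}/* into dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)
    staged = []
    for src in sorted(glob.glob(os.path.join(path_data, source_name, "*"))):
        dst = os.path.join(dst_dir, os.path.basename(src))
        copy_function(src, dst)  # symlink or copy, depending on the caller
        staged.append(dst)
    return staged


with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny fake data directory for one source
    path_data = os.path.join(tmp, "data")
    os.makedirs(os.path.join(path_data, "001"))
    with open(os.path.join(path_data, "001", "AA.S0001.BXY.semd"), "w") as f:
        f.write("0.0 0.0\n")

    linked = stage_data(path_data, "001", os.path.join(tmp, "scratch_ln"))
    copied = stage_data(path_data, "001", os.path.join(tmp, "scratch_cp"),
                        copy_function=shutil.copy)

    linked_is_symlink = os.path.islink(linked[0])
    copied_is_symlink = os.path.islink(copied[0])
```

Passing the transfer function as an argument keeps the staging loop identical for both modes, so the caller decides whether scratch holds links or independent copies.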

run_forward_simulations(path_model, save_traces=None, export_traces=None, save_forward_arrays=False, flag_save_forward=None, **kwargs)

Performs forward simulation through model saved in path_model for a single event. Upon successful completion of forward simulation, synthetic waveforms are moved to location save_traces for processing, and/or exported permanently to location on disk export_traces.

Note

If PAR.PREPROCESS is None, misfit quantification will not be performed.

Note

Must be run by system.run() so that solvers are assigned individual task ids/ working directories.

Parameters:
  • path_model (str) – path to SPECFEM model files used to run the forward simulations. Files will be copied to each individual solver directory.

  • save_traces (str) – full path location to save synthetic traces after successful completion of forward simulations. By default, they are stored in ‘scratch/solver/<SOURCE_NAME>/traces/syn’. Overriding classes may re-direct synthetics by setting this variable

  • export_traces (str) – full path location to export (copy) synthetic traces after successful completion of forward simulations. Each forward simulation erases the synthetics of the previous forward simulation, so exporting to disk is important if the User wants to save waveform data. Set the parameter export_traces to True in the parameter file to access this option. Overriding classes may re-direct synthetics by setting this variable.

  • save_forward_arrays (str) – relative path (relative to solver.cwd) to move the forward arrays which are used for adjoint simulations. Mainly used for ambient noise adjoint tomography which requires multiple forward simulations prior to adjoint simulations, putting forward arrays at the risk of overwrite. Normal Users can leave this default.

  • flag_save_forward (bool) – whether to turn on the flag for saving the forward arrays which are used for adjoint simulations. Not required if only running forward simulations
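
To show why exporting matters, the sketch below copies synthetics from a scratch location to permanent storage and then removes the scratch copy, mimicking a later simulation overwriting it. The helper export_synthetics, the output directory layout and the file names are hypothetical; only the scratch path pattern follows the default described above.

```python
import glob
import os
import shutil
import tempfile


def export_synthetics(save_traces, export_traces):
    """Copy every file in save_traces into export_traces (illustrative helper)."""
    os.makedirs(export_traces, exist_ok=True)
    exported = []
    for src in sorted(glob.glob(os.path.join(save_traces, "*"))):
        shutil.copy(src, export_traces)
        exported.append(os.path.basename(src))
    return exported


with tempfile.TemporaryDirectory() as tmp:
    save_traces = os.path.join(tmp, "scratch", "solver", "001", "traces", "syn")
    export_dir = os.path.join(tmp, "output", "solver", "001", "syn")
    os.makedirs(save_traces)
    for name in ("AA.S0001.BXZ.semd", "AA.S0002.BXZ.semd"):
        with open(os.path.join(save_traces, name), "w") as f:
            f.write("0.0 0.0\n")

    exported = export_synthetics(save_traces, export_dir)

    # Mimic the next forward simulation erasing the scratch synthetics
    shutil.rmtree(save_traces)
    survivors = sorted(os.listdir(export_dir))
```

The exported copies survive the removal of the scratch directory, which is the behaviour the export option is meant to guarantee.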

evaluate_objective_function(save_residuals=False, components=None, **kwargs)

Uses the preprocess module to evaluate the misfit/objective function given synthetics generated during forward simulations

Note

Must be run by system.run() so that solvers are assigned individual task ids/ working directories.

Parameters:
  • save_residuals (str) – if not None, path to write misfit/residuals to

  • components (list) – optional list of components to consider, e.g., [‘Z’, ‘N’]. Traces whose components do not match the list are ignored during preprocessing, and the adjoint sources for those components will be 0. If None, all available components will be considered.
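
The effect of the components argument can be sketched as a simple filter. The helper select_by_component is hypothetical, and treating the last character of a channel code as the component is an assumption made for this example, not a statement about how the preprocess module identifies components.

```python
def select_by_component(channels, components=None):
    """Keep only channels whose component letter is in `components`.

    Assumes the component is the last character of the channel code.
    If components is None, every channel is kept.
    """
    if components is None:
        return list(channels)
    wanted = set(components)
    return [channel for channel in channels if channel[-1] in wanted]


channels = ["BXZ", "BXN", "BXE"]
kept = select_by_component(channels, components=["Z", "N"])
all_kept = select_by_component(channels)
```

In this toy case the east component is dropped when only Z and N are requested, which corresponds to a zero adjoint source for that component.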

finalize_iteration()

Solver finalization procedures for the end of each iteration