seisflows.workflow.inversion

A seismic inversion (a.k.a. full waveform inversion, adjoint tomography, full waveform tomography) perturbs seismic velocity models by minimizing objective functions that quantify the differences between observed and synthetic waveforms.

This seismic inversion workflow performs a linear set of tasks involving:

  1. Generating synthetic seismograms using an external numerical solver
  2. Calculating time-dependent misfit (adjoint sources) between data (or other synthetics) and synthetics
  3. Using adjoint sources to generate misfit kernels defining volumetric perturbations sensitive to data-synthetic misfit
  4. Smoothing and summing misfit kernels into a single gradient
  5. Perturbing the starting model with the gradient to reduce misfit defined by the objective function during a line search

The Inversion workflow runs the above tasks in a loop (iterations) while exporting updated models, kernels and/or gradients to disk.
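The loop described above can be illustrated with a toy problem. The following is a minimal sketch, not SeisFlows code: the "solver" is a hypothetical two-parameter linear operator, the misfit is a plain least-squares sum, and the line search is simple step halving. Every function name here is an assumption made for illustration.

```python
# Toy stand-in for the five tasks listed above. Nothing here is SeisFlows API.

def forward(model):
    """Task 1: generate 'synthetics' with a toy diagonal linear solver."""
    return [2.0 * model[0], 0.5 * model[1]]


def misfit_and_residuals(synthetics, data):
    """Task 2: least-squares misfit and the residuals that drive it."""
    residuals = [s - d for s, d in zip(synthetics, data)]
    return 0.5 * sum(r * r for r in residuals), residuals


def gradient(residuals):
    """Tasks 3 and 4: gradient of the misfit for the toy solver above."""
    return [2.0 * residuals[0], 0.5 * residuals[1]]


def invert(model, data, start=1, end=5, max_steps=10):
    """Task 5 inside an iteration loop: halve the step until misfit drops."""
    misfits = []
    for iteration in range(start, end + 1):
        f_new, residuals = misfit_and_residuals(forward(model), data)
        misfits.append(f_new)
        g = gradient(residuals)
        alpha = 1.0
        for _ in range(max_steps):
            trial = [m - alpha * gi for m, gi in zip(model, g)]
            f_try, _ = misfit_and_residuals(forward(trial), data)
            if f_try < f_new:
                model = trial  # accept the trial model
                break
            alpha *= 0.5  # step too long, shorten and retry
    return model, misfits


data = forward([1.0, 3.0])  # 'observations' from a known target model
final_model, misfits = invert([0.0, 0.0], data)
```

In the real workflow each of these steps is delegated to the solver, preprocess and optimize modules, and intermediate results are written to disk rather than held in memory.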

Classes

Inversion

Inversion Workflow

Module Contents

class seisflows.workflow.inversion.Inversion(start=1, end=1, thrifty=False, optimize='LBFGS', export_model=True, path_eval_func=None, **kwargs)

Bases: seisflows.workflow.migration.Migration

Inversion Workflow

Performs iterative nonlinear inversion using the machinery of the Forward and Migration workflows, as well as a built-in optimization library.

Parameters:
  • start (int) – start the inversion workflow at this iteration. 1 <= start <= inf

  • end (int) – end the inversion workflow at this iteration. start <= end <= inf

  • iteration (int) – the current iteration of the workflow. If None, takes the value of start (i.e., the first iteration of the workflow). The user can also set it to a value between start and end to resume a failed workflow.

  • thrifty (bool) – a thrifty inversion skips the costly initialization step (i.e., forward simulations and misfit quantification) if the final forward simulations from the previous iteration's line search can be used in the current one. Requires L-BFGS optimization.

  • export_model (bool) – export the best-fitting model from the line search to disk. If False, new models can be discarded from scratch at any time.

Paths:
  • path_eval_func (str) – scratch path to store files for line search objective function evaluations, including models, misfit and residuals

***

__doc__ = Multiline-String
"""
    Forward Workflow [Workflow Base]
    --------------------------------
    Defines foundational structure for Workflow module. When used standalone
    is in charge of running forward solver in parallel and (optionally)
    calculating data-synthetic misfit and adjoint sources.

    Parameters
    ----------
    :type modules: list of module
    :param modules: instantiated SeisFlows modules which should have been
        generated by the function `seisflows.config.import_seisflows` with a
        parameter file generated by seisflows.configure
    :type generate_data: bool
    :param generate_data: How to address 'data' in the workflow:
        - False: real data needs to be provided by the User in
        `path_data/{source_name}/*` in the same format that the solver will
        produce synthetics (controlled by `solver.format`) OR
        - True: 'data' will be generated as synthetic seismograms using
        a target model provided in `path_model_true`.
    :type stop_after: str
    :param stop_after: optional name of task in task list (use
        `seisflows print tasks` to get task list for given workflow) to stop
        workflow after, allowing user to prematurely stop a workflow to explore
        intermediate results or debug.
    :type export_traces: bool
    :param export_traces: export all waveforms that are generated by the
        external solver to `path_output`. If False, solver traces stored in
        scratch may be discarded at any time in the workflow
    :type export_residuals: bool
    :param export_residuals: export all residuals (data-synthetic misfit) that
        are generated by the external solver to `path_output`. If False,
        residuals stored in scratch may be discarded at any time in the
        workflow

    Paths
    -----
    :type workdir: str
    :param workdir: working directory in which to perform a SeisFlows workflow.
        SeisFlows internal directory structure will be created here. Default cwd
    :type path_output: str
    :param path_output: path to directory used for permanent storage on disk.
        Results and exported scratch files are saved here.
    :type path_data: str
    :param path_data: path to any externally stored data required by the solver
    :type path_state_file: str
    :param path_state_file: path to a text file used to track the current
        status of a workflow (i.e., what functions have already been completed),
        used for checkpointing and resuming workflows
    :type path_model_init: str
    :param path_model_init: path to the starting model used to calculate the
        initial misfit. Must match the expected `solver_io` format.
    :type path_model_true: str
    :param path_model_true: path to a target model if `case`=='synthetic' and
        a set of synthetic 'observations' are required for workflow.
    :type path_eval_grad: str
    :param path_eval_grad: scratch path to store files for gradient evaluation,
        including models, kernels, gradient and residuals.
    ***

Seismic migration performs a 'time-reverse migration', or backprojection.
In the terminology of seismic imaging, we are running a forward and adjoint
simulation to derive the gradient of the objective function. This workflow
sets up the machinery to derive a scaled, smoothed gradient from an initial
model

.. warning::
    Misfit kernels require large amounts of disk space for storage.
    Setting `export_kernel`==True when PAR.NTASK is large and model files
    are large may lead to large file overhead.

.. note::
    Migration workflow includes an option to mask the gradient. While both
    masking and preconditioning involve scaling the gradient, they are
    fundamentally different operations: masking is ad hoc, preconditioning
    is a change of variables; For more info, see Modrak & Tromp 2016 GJI

A seismic inversion (a.k.a full waveform inversion, adjoint tomography, full
waveform tomography) perturbs seismic velocity models by minimizing objective
functions defining differences between observed and synthetic waveforms.

This seismic inversion workflow performs a linear set of tasks involving:

1) Generating synthetic seismograms using an external numerical solver
2) Calculating time-dependent misfit (adjoint sources) between data
    (or other synthetics) and synthetics
3) Using adjoint sources to generate misfit kernels defining volumetric
    perturbations sensitive to data-synthetic misfit
4) Smoothing and summing misfit kernels into a single gradient
5) Perturbing the starting model with the gradient to reduce misfit defined by
    the objective function during a line search

The Inversion workflow runs the above tasks in a loop (iterations) while
exporting updated models, kernels and/or gradients to disk.
"""
start = 1
end = 1
export_model = True
thrifty = False
_optimize_name = 'LBFGS'
_required_modules = ['system', 'solver', 'preprocess', 'optimize']
_was_thrifty = False
property evaluation

Convenience string for log messages that gives the iteration and step count of the current evaluation as a formatted string, e.g., i01s00
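A tag of this shape can be built with ordinary string formatting. This is a minimal sketch that assumes two-digit zero padding, inferred only from the example `i01s00`; the function name is hypothetical and is not the property's actual implementation.

```python
def evaluation_tag(iteration, step_count):
    """Build a tag such as 'i01s00' (two-digit padding is an assumption)."""
    return f"i{iteration:02d}s{step_count:02d}"


tag = evaluation_tag(iteration=1, step_count=0)
```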

property task_list

USER-DEFINED TASK LIST. This property defines a list of class methods that take NO INPUT and have NO RETURN STATEMENTS. This defines your linear workflow, i.e., these tasks are to be run in order from start to finish to complete a workflow.

This excludes ‘check’ (which is run during ‘import_seisflows’) and ‘setup’, which should be run separately.

Note

For workflows that require an iterative approach (e.g. inversion), this task list will be looped over, so ensure that any setup and teardown tasks (run once per workflow, not once per iteration) are not included.

Return type:

list

Returns:

list of methods to call in order during a workflow
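The idea of a task list that is looped over once per iteration, with setup kept outside the loop, can be sketched as follows. The class below is a hypothetical toy, not the SeisFlows implementation; its method names only echo the documentation.

```python
class ToyWorkflow:
    """Hypothetical workflow used only to illustrate a looped task list."""

    def __init__(self, start=1, end=2):
        self.start = start
        self.end = end
        self.iteration = start
        self.log = []

    @property
    def task_list(self):
        # Bound methods: no inputs, no return values, run in this order.
        return [self.evaluate_initial_misfit,
                self.evaluate_gradient_from_kernels,
                self.finalize_iteration]

    def setup(self):
        # Run once per workflow, so it is NOT part of the task list.
        self.log.append("setup")

    def evaluate_initial_misfit(self):
        self.log.append(f"i{self.iteration:02d} misfit")

    def evaluate_gradient_from_kernels(self):
        self.log.append(f"i{self.iteration:02d} gradient")

    def finalize_iteration(self):
        self.log.append(f"i{self.iteration:02d} finalize")

    def run(self):
        # The whole task list is repeated for every iteration.
        for iteration in range(self.start, self.end + 1):
            self.iteration = iteration
            for task in self.task_list:
                task()


workflow = ToyWorkflow(start=1, end=2)
workflow.setup()
workflow.run()
```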

property is_thrifty

Thrifty inversions are a special case of inversion where the forward simulations and misfit quantification from the previous iteration’s line search can be re-used as the forward simulation of the current iteration.

This status check determines whether a thrifty iteration can be performed, which is dependent on where we are in the inversion, and whether the optimization module has been restarted.

Warning

Thrifty status from previous iteration is NOT saved, so if your workflow fails at evaluate_initial_misfit, the boolean check will fail and the workflow will re-evaluate the initial misfit.

Return type:

bool

Returns:

thrifty status: True if we can re-use the previous forward simulations, False if we must follow the normal inversion route

check()

Checks inversion-specific parameters
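The exact checks are not listed here. As a rough sketch, the constraints stated in the parameter descriptions (1 <= start <= end, iteration between start and end, thrifty requiring L-BFGS) could be validated as shown below. The function name, signature and error messages are assumptions, not the actual `check()` implementation.

```python
def check_inversion_parameters(start, end, iteration=None,
                               thrifty=False, optimize="LBFGS"):
    """Validate the documented bounds; returns the iteration to begin from."""
    if start < 1:
        raise ValueError("start must satisfy 1 <= start")
    if end < start:
        raise ValueError("end must satisfy start <= end")
    if iteration is None:
        iteration = start  # documented default: begin at `start`
    if not start <= iteration <= end:
        raise ValueError("iteration must lie between start and end")
    if thrifty and optimize != "LBFGS":
        raise ValueError("thrifty inversion requires L-BFGS optimization")
    return iteration


first_iteration = check_inversion_parameters(start=1, end=3)
```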

setup()

Assigns modules as attributes of the workflow, e.g., self.solver to access the solver module (or workflow.solver from outside the class).

Lays groundwork for inversion by running setup() functions for the involved sub-modules, generating True model synthetic data if necessary, and generating the pre-requisite database files.

run()

Call the forward.run() function iteratively, from start to end

checkpoint()

Add an additional line in the state file to keep track of iteration

generate_synthetic_data(**kwargs)

Function Override of workflow.forward.generate_synthetic_data

Adds an additional criterion (iteration > 1) that skips over this function

evaluate_objective_function(save_residuals=False, components=None, **kwargs)

Function Override of workflow.forward.evaluate_objective_function

Simple override to include iteration and step count parameters into preprocessing for file naming and tagging. Machinery remains the same.

Note

Must be run by system.run() so that solvers are assigned individual task ids/ working directories.

Parameters:
  • save_residuals (str) – if not None, path to write misfit/residuals to

  • components (list) – optional list of components to consider, e.g., [‘Z’, ‘N’]. Traces that do not have matching components are ignored during preprocessing, and the adjoint sources for those components will be 0. If None, all available components will be considered.

sum_residuals(residuals_files, save_to)

Convenience function to read in text files containing misfit residual information written by preprocess.quantify_misfit for each event, and sum the total misfit for the evaluation in a given optimization vector.

Follows Tape et al. 2010 equations 6 and 7

Parameters:
  • residuals_files (list of str) – pathnames to residuals files for each source, generated by the preprocessing module. Will be read in and summed to provide total misfit

  • save_to (str) – name of Optimization module vector to save the misfit value ‘f’, options are ‘f_new’ for misfit of current accepted model ‘m_new’, or ‘f_try’ for the misfit of the current line search trial model

Return type:

float

Returns:

sum of squares of residuals, total misfit
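As a rough illustration of the summation, the sketch below assumes one reading of Tape et al. (2010) equations 6 and 7: the misfit for each event is half the mean of its squared residuals, and the total misfit is the mean of the event misfits. The function name and the in-memory input (lists of residuals rather than text files) are assumptions, and the actual implementation may normalize differently.

```python
def total_misfit(residuals_per_event):
    """Average per-event misfits, each being half the mean squared residual.

    residuals_per_event: list of lists, one list of residuals per source.
    This normalization is an assumed reading of Tape et al. (2010).
    """
    event_misfits = []
    for residuals in residuals_per_event:
        n = len(residuals)
        # Assumed Eq. 6: F_s = sum(r_i^2) / (2 * N_s)
        event_misfits.append(sum(r ** 2 for r in residuals) / (2.0 * n))
    # Assumed Eq. 7: F = sum(F_s) / S
    return sum(event_misfits) / len(event_misfits)


f_new = total_misfit([[1.0, 1.0], [2.0]])
```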

evaluate_initial_misfit(path_model=None, save_residuals=None, **kwargs)

Overwrite workflow.forward to skip over initial misfit evaluation (using MODEL_INIT) if we are past iteration 1. Additionally, sum residuals output by preprocess module and save float to disk, to be discoverable by the optimization library

Parameters:
  • path_model (str) – path to the model files that will be used to evaluate initial misfit. If not given, defaults to searching for model provided in path_model_init.

  • save_residuals (str) – Location to save ‘residuals_*.txt’ files which are used to calculate total misfit (f_new).
    - Requires a string formatter ‘{src}’, e.g., ‘residual_{src}.txt’
    - String formatter used by preprocessing module to tag files for each source to avoid multiple processes writing to the same file.
    - Remainder of string may be some combination of the iteration, step count etc. Determined by calling workflow.

Keyword Arguments

bool sum_residuals:
    Bool to determine whether to sum all residuals files saved under
    `save_residuals` filenames. The default behavior should be True,
    that is, once we run preprocessing, we should calculate the
    misfit 'f'. This flag is an option because some workflows, like
    ambient noise inversion, require multiple forward runs before
    summing residuals, so it doesn't make sense to sum each time
    this function is called.
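The ‘{src}’ formatter in `save_residuals` can be understood through plain `str.format`. The sketch below is illustrative only: the helper name, the source names and the file name pattern are assumptions, and it shows nothing more than how one template expands to one file per source.

```python
def residual_paths(save_residuals, source_names):
    """Expand a '{src}' template into one residuals path per source."""
    if "{src}" not in save_residuals:
        raise ValueError("save_residuals requires a '{src}' string formatter")
    return [save_residuals.format(src=name) for name in source_names]


paths = residual_paths("residuals_{src}_i01s00.txt", ["001", "002"])
```

Because each source gets its own file name, parallel processes never write to the same file.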
run_forward_simulations(path_model, save_traces=None, export_traces=None, **kwargs)

Overrides ‘workflow.forward.run_forward_simulations’ to hijack the default path location for exporting traces to disk

_run_adjoint_simulation_single(save_kernels=None, export_kernels=None, **kwargs)

Overrides ‘workflow.migration._run_adjoint_simulation_single’ to hijack the default path location for exporting kernels to disk

evaluate_gradient_from_kernels()

Overwrite workflow.migration to convert the current model and the gradient calculated by migration from their native SPECFEM model format into optimization vectors that can be used for model updates.

Also includes search direction computation, which takes the gradient g_new and scales it to provide an appropriate search direction. In its simplest form (gradient descent), the search direction is simply -g.

Computes search direction using the optimization library and sets up line search machinery to ‘perform line search’ by placing correct files on disk for each of the modules to find.

Optimization module perturbs the current model (m_new) by the search direction (p_new) to recover the trial model (m_try). This model is then exposed on disk to the solver.
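For the simplest case mentioned above (gradient descent), the relationship between g_new, p_new, m_new and m_try can be written in a few lines. This is a minimal sketch using plain lists; the function names and the step length `alpha` argument are assumptions, and the real optimization module works on vectors stored on disk and may use L-BFGS rather than steepest descent.

```python
def steepest_descent_direction(g_new):
    """Simplest search direction: the negative gradient, p = -g."""
    return [-g for g in g_new]


def trial_model(m_new, p_new, alpha):
    """Perturb the current model along the search direction:
    m_try = m_new + alpha * p_new."""
    return [m + alpha * p for m, p in zip(m_new, p_new)]


g_new = [2.0, -4.0]
p_new = steepest_descent_direction(g_new)
m_try = trial_model(m_new=[1.0, 1.0], p_new=p_new, alpha=0.5)
```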

evaluate_line_search_misfit()

Evaluate line search misfit f_try by running forward simulations through the trial model m_try and comparing with observations. Acts like a stripped down version of evaluate_initial_misfit

TODO Add in export traces functionality, need to honor step count

Given the misfit f_try calculated in evaluate_line_search_misfit, use the Optimization module to determine if the line search has passed, failed, or needs to perform a subsequent step.

The line search state machine acts in the following way:
  - Pass: Run clean up and proceed with workflow
  - Try: Re-calculate step length (alpha) and re-evaluate misfit (f_try)
  - Fail: Try to restart optimization module and restart line search. If still failing, exit workflow
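The pass/try/fail logic can be sketched as a small state machine. The acceptance rule (f_try < f_new), the step halving and the step limit below are simplifying assumptions chosen for illustration; the actual Optimization module may use a different line search, and the restart-on-fail behavior is not modeled. All names are hypothetical.

```python
def line_search_status(f_new, f_try, step_count, step_count_max):
    """Return 'PASS', 'TRY' or 'FAIL' for one line search evaluation."""
    if f_try < f_new:
        return "PASS"  # misfit reduced: accept the trial model
    if step_count >= step_count_max:
        return "FAIL"  # out of attempts: caller may restart or exit
    return "TRY"       # shorten the step and evaluate again


def run_line_search(evaluate, f_new, alpha=1.0, step_count_max=5):
    """Drive the state machine; `evaluate(alpha)` returns the misfit f_try."""
    step_count = 1  # step 0 is the misfit of the starting model
    while True:
        f_try = evaluate(alpha)
        status = line_search_status(f_new, f_try, step_count, step_count_max)
        if status != "TRY":
            return status, alpha, step_count
        alpha *= 0.5  # assumed step-length update
        step_count += 1


def toy_misfit(alpha):
    """Hypothetical misfit curve with a minimum at alpha = 0.2."""
    return (alpha - 0.2) ** 2


result = run_line_search(toy_misfit, f_new=toy_misfit(0.0))
```

With this toy misfit the first two step lengths overshoot the minimum, so the search passes on the third step.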

Note

Line search starts on step_count == 1 because step_count == 0 is considered the misfit of the starting model

finalize_iteration()

Cleans directories in which function and gradient evaluations were carried out. Contains some logic to consider whether or not to continue with a thrifty inversion.