seisflows.workflow.inversion
A seismic inversion (a.k.a. full waveform inversion, adjoint tomography, or full waveform tomography) perturbs seismic velocity models by minimizing an objective function that quantifies the difference between observed and synthetic waveforms.
This seismic inversion workflow performs a linear set of tasks involving:
- Generating synthetic seismograms using an external numerical solver
- Calculating time-dependent misfit (adjoint sources) between data (or other synthetics) and synthetics
- Using adjoint sources to generate misfit kernels defining volumetric perturbations sensitive to data-synthetic misfit
- Smoothing and summing misfit kernels into a single gradient
- Perturbing the starting model with the gradient to reduce misfit defined by the objective function during a line search
The Inversion workflow runs the above tasks in a loop (iterations) while exporting updated models, kernels and/or gradients to disk.
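The five tasks above can be sketched on a toy problem. This is a minimal illustration, not the SeisFlows implementation: the "solver" is the identity (synthetics equal the model), the gradient stands in for the summed and smoothed kernels, and the line search is plain step halving. The names `misfit`, `gradient` and `invert` are hypothetical.

```python
def misfit(model, data):
    """Least-squares misfit between toy 'synthetics' (the model itself) and data."""
    return 0.5 * sum((m - d) ** 2 for m, d in zip(model, data))


def gradient(model, data):
    """Gradient of the toy misfit; stands in for the summed, smoothed kernels."""
    return [m - d for m, d in zip(model, data)]


def invert(model, data, start=1, end=3, alpha=0.5):
    """Loop over iterations start..end, updating the model with a line search."""
    history = [misfit(model, data)]
    for _iteration in range(start, end + 1):
        f_new = misfit(model, data)   # misfit of the current model
        g = gradient(model, data)     # gradient from 'kernels'
        step = alpha
        for _ in range(20):           # halve the step until misfit is reduced
            m_try = [m - step * gi for m, gi in zip(model, g)]
            if misfit(m_try, data) < f_new:
                model = m_try         # accept the trial model
                break
            step *= 0.5
        history.append(misfit(model, data))
    return model, history


final_model, misfit_history = invert([1.0, 2.0], [0.0, 0.0])
```

Each entry of `misfit_history` corresponds to one accepted model, so the list should decrease from one iteration to the next.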
Classes
Inversion | Inversion Workflow
Module Contents
- class seisflows.workflow.inversion.Inversion(start=1, end=1, thrifty=False, optimize='LBFGS', export_model=True, path_eval_func=None, **kwargs)
Bases:
seisflows.workflow.migration.Migration
Inversion Workflow
Performs iterative nonlinear inversion using the machinery of the Forward and Migration workflows, as well as a built-in optimization library.
Parameters
- type start:
int
- param start:
start inversion workflow at this iteration. 1 <= start <= inf
- type end:
int
- param end:
end inversion workflow at this iteration. start <= end <= inf
- type iteration:
int
- param iteration:
The current iteration of the workflow. If NoneType, takes the value of start (i.e., first iteration of the workflow). User can also set between start and end to resume a failed workflow.
- type thrifty:
bool
- param thrifty:
a thrifty inversion skips the costly initialization step (i.e., forward simulations and misfit quantification) if the final forward simulations from the previous iteration's line search can be used in the current one. Requires L-BFGS optimization.
- type export_model:
bool
- param export_model:
export best-fitting model from the line search to disk. If False, new models stored in scratch may be discarded at any time.
Paths
- type path_eval_func:
str
- param path_eval_func:
scratch path to store files for line search objective function evaluations, including models, misfit and residuals
- __doc__ = Multiline-String
"""
Forward Workflow [Workflow Base]
--------------------------------
Defines foundational structure for Workflow module. When used standalone is in charge of running forward solver in parallel and (optionally) calculating data-synthetic misfit and adjoint sources.

Parameters
----------
:type modules: list of module
:param modules: instantiated SeisFlows modules which should have been generated by the function `seisflows.config.import_seisflows` with a parameter file generated by seisflows.configure
:type generate_data: bool
:param generate_data: How to address 'data' in the workflow:
    - False: real data needs to be provided by the User in `path_data/{source_name}/*` in the same format that the solver will produce synthetics (controlled by `solver.format`) OR
    - True: 'data' will be generated as synthetic seismograms using a target model provided in `path_model_true`.
:type stop_after: str
:param stop_after: optional name of task in task list (use `seisflows print tasks` to get task list for given workflow) to stop workflow after, allowing user to prematurely stop a workflow to explore intermediate results or debug.
:type export_traces: bool
:param export_traces: export all waveforms that are generated by the external solver to `path_output`. If False, solver traces stored in scratch may be discarded at any time in the workflow
:type export_residuals: bool
:param export_residuals: export all residuals (data-synthetic misfit) that are generated by the external solver to `path_output`. If False, residuals stored in scratch may be discarded at any time in the workflow

Paths
-----
:type workdir: str
:param workdir: working directory in which to perform a SeisFlows workflow. SeisFlows internal directory structure will be created here. Default cwd
:type path_output: str
:param path_output: path to directory used for permanent storage on disk. Results and exported scratch files are saved here.
:type path_data: str
:param path_data: path to any externally stored data required by the solver
:type path_state_file: str
:param path_state_file: path to a text file used to track the current status of a workflow (i.e., what functions have already been completed), used for checkpointing and resuming workflows
:type path_model_init: str
:param path_model_init: path to the starting model used to calculate the initial misfit. Must match the expected `solver_io` format.
:type path_model_true: str
:param path_model_true: path to a target model if `case`=='synthetic' and a set of synthetic 'observations' are required for workflow.
:type path_eval_grad: str
:param path_eval_grad: scratch path to store files for gradient evaluation, including models, kernels, gradient and residuals.
***

Seismic migration performs a 'time-reverse migration', or backprojection. In the terminology of seismic imaging, we are running a forward and adjoint simulation to derive the gradient of the objective function. This workflow sets up the machinery to derive a scaled, smoothed gradient from an initial model

.. warning::
    Misfit kernels require large amounts of disk space for storage. Setting `export_kernel`==True when PAR.NTASK is large and model files are large may lead to large file overhead.

.. note::
    Migration workflow includes an option to mask the gradient. While both masking and preconditioning involve scaling the gradient, they are fundamentally different operations: masking is ad hoc, preconditioning is a change of variables; For more info, see Modrak & Tromp 2016 GJI

A seismic inversion (a.k.a full waveform inversion, adjoint tomography, full waveform tomography) perturbs seismic velocity models by minimizing objective functions defining differences between observed and synthetic waveforms.

This seismic inversion workflow performs a linear set of tasks involving:
1) Generating synthetic seismograms using an external numerical solver
2) Calculating time-dependent misfit (adjoint sources) between data (or other synthetics) and synthetics
3) Using adjoint sources to generate misfit kernels defining volumetric perturbations sensitive to data-synthetic misfit
4) Smoothing and summing misfit kernels into a single gradient
5) Perturbing the starting model with the gradient to reduce misfit defined by the objective function during a line search

The Inversion workflow runs the above tasks in a loop (iterations) while exporting updated models, kernels and/or gradients to disk.
"""
- start = 1
- end = 1
- export_model = True
- thrifty = False
- _optimize_name = 'LBFGS'
- _required_modules = ['system', 'solver', 'preprocess', 'optimize']
- _was_thrifty = False
- property evaluation
Convenience string return for log messages that gives the iteration and step count of the current evaluation as a formatted string e.g., i01s00
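A minimal sketch of how such a tag could be built, assuming two-digit zero padding as suggested by the `i01s00` example. The function name `format_evaluation` and its arguments are hypothetical, not part of the SeisFlows API.

```python
def format_evaluation(iteration, step_count):
    """Return a tag such as 'i01s00' from an iteration and a step count.

    Assumes two-digit, zero-padded fields, inferred from the documented example.
    """
    return f"i{iteration:02d}s{step_count:02d}"


tag = format_evaluation(1, 0)  # 'i01s00'
```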
- property task_list
USER-DEFINED TASK LIST. This property defines a list of class methods that take NO INPUT and have NO RETURN STATEMENTS. This defines your linear workflow, i.e., these tasks are to be run in order from start to finish to complete a workflow.
This excludes ‘check’ (which is run during ‘import_seisflows’) and ‘setup’, which should be run separately.
Note
For workflows that require an iterative approach (e.g. inversion), this task list will be looped over, so ensure that any setup and teardown tasks (run once per workflow, not once per iteration) are not included.
- Return type:
list
- Returns:
list of methods to call in order during a workflow
- property is_thrifty
Thrifty inversions are a special case of inversion where the forward simulations and misfit quantification from the previous iteration’s line search can be re-used as the forward simulation of the current iteration.
This status check determines whether a thrifty iteration can be performed, which is dependent on where we are in the inversion, and whether the optimization module has been restarted.
Warning
Thrifty status from previous iteration is NOT saved, so if your workflow fails at evaluate_initial_misfit, the boolean check will fail and the workflow will re-evaluate the initial misfit.
- Return type:
bool
- Returns:
thrifty status: True if we can re-use previous forward simulations, False if we must go the normal inversion route
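The status check described above might look like the sketch below. The exact conditions are assumptions drawn from the description (position in the inversion and whether the optimization module has been restarted). The function name and arguments are hypothetical.

```python
def can_be_thrifty(thrifty, iteration, start, optimize_restarted):
    """Decide whether the previous line search results can be re-used.

    All conditions here are illustrative assumptions, not the library's logic.
    """
    if not thrifty:
        return False  # user did not request a thrifty inversion
    if iteration == start:
        return False  # first iteration: no previous line search to re-use
    if optimize_restarted:
        return False  # a restart invalidates the previous line search
    return True


status = can_be_thrifty(thrifty=True, iteration=2, start=1,
                        optimize_restarted=False)
```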
- check()
Checks inversion-specific parameters
- setup()
Assigns modules as attributes of the workflow, i.e., self.solver to access the solver module (or workflow.solver from outside the class).
Lays the groundwork for inversion by running the setup() functions of the involved sub-modules, generating synthetic data from the true model if necessary, and generating the prerequisite database files.
- run()
Call the forward.run() function iteratively, from start to end
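As a rough sketch of this iteration loop, assuming the task list is simply a list of no-argument callables that is repeated once per iteration. The helper `run_iterations` and the task names are hypothetical, and checkpointing is omitted.

```python
def run_iterations(task_list, start, end):
    """Call every task in order for each iteration from start to end."""
    log = []
    for iteration in range(start, end + 1):
        for task in task_list:
            task()  # tasks take no input and return nothing
            log.append((iteration, task.__name__))
    return log


def evaluate_misfit():
    """Placeholder task."""


def update_model():
    """Placeholder task."""


log = run_iterations([evaluate_misfit, update_model], start=1, end=2)
```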
- checkpoint()
Add an additional line in the state file to keep track of iteration
- generate_synthetic_data(**kwargs)
Function Override of workflow.forward.generate_synthetic_data
Adds an additional criterion (iteration > 1) that skips over this function
- evaluate_objective_function(save_residuals=False, components=None, **kwargs)
Function Override of workflow.forward.evaluate_objective_function
Simple override to include iteration and step count parameters into preprocessing for file naming and tagging. Machinery remains the same.
Note
Must be run by system.run() so that solvers are assigned individual task ids/ working directories.
- Parameters:
save_residuals (str) – if not None, path to write misfit/residuals to
components (list) – optional list of components to process, e.g., ['Z', 'N']. Traces whose components are not in this list are skipped during preprocessing, and the adjoint sources for those components will be 0. If None, all available components will be considered.
- sum_residuals(residuals_files, save_to)
Convenience function to read in text files containing misfit residual information written by preprocess.quantify_misfit for each event, and sum the total misfit for the evaluation in a given optimization vector.
Follows Tape et al. 2010 equations 6 and 7
- Parameters:
residuals_files (list of str) – pathnames to residuals files for each source, generated by the preprocessing module. Will be read in and summed to provide total misfit
save_to (str) – name of Optimization module vector to save the misfit value ‘f’, options are ‘f_new’ for misfit of current accepted model ‘m_new’, or ‘f_try’ for the misfit of the current line search trial model
- Return type:
float
- Returns:
sum of squares of residuals, total misfit
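A minimal sketch of this summation, under stated assumptions: each event contributes a list of per-measurement misfit values, each event's sum is normalized by twice its number of measurements, and the total is averaged over events. That normalization is one reading of Tape et al. (2010) equations 6 and 7 and should be checked against the paper. The function takes in-memory lists rather than file paths, and its name is hypothetical.

```python
def sum_event_residuals(residuals_per_event):
    """Combine per-event residual lists into one total misfit value.

    Assumed normalization: sum / (2 * N) per event, then mean over events.
    """
    total = 0.0
    for residuals in residuals_per_event:
        n_measurements = len(residuals)
        total += sum(residuals) / (2.0 * n_measurements)  # per-event misfit
    return total / len(residuals_per_event)               # mean over events


total_misfit = sum_event_residuals([[2.0, 4.0], [6.0]])
```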
- evaluate_initial_misfit(path_model=None, save_residuals=None, **kwargs)
Overwrite workflow.forward to skip over initial misfit evaluation (using MODEL_INIT) if we are past iteration 1. Additionally, sum residuals output by preprocess module and save float to disk, to be discoverable by the optimization library
- Parameters:
path_model (str) – path to the model files that will be used to evaluate initial misfit. If not given, defaults to searching for model provided in path_model_init.
save_residuals (str) – Location to save ‘residuals_*.txt’ files, which are used to calculate total misfit (f_new).
  - Requires a string formatter ‘{src}’, e.g., ‘residual_{src}.txt’
  - String formatter used by preprocessing module to tag files for each source to avoid multiple processes writing to the same file.
  - Remainder of string may be some combination of the iteration, step count, etc. Determined by the calling workflow.
Keyword Arguments
bool sum_residuals: Bool to determine whether to sum all residuals files saved under `save_residuals` filenames. The default behavior should be True, that is, once we run preprocessing, we should calculate the misfit 'f'. This flag is an option because some workflows, like ambient noise inversion, require multiple forward runs before summing residuals, so it doesn't make sense to sum each time this function is called.
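To illustrate the ‘{src}’ formatter required by save_residuals, the sketch below expands one template into one file name per source. The template, the source names and the helper `residual_paths` are made up for the example.

```python
def residual_paths(template, source_names):
    """Fill the '{src}' placeholder once per source so each gets its own file."""
    return [template.format(src=name) for name in source_names]


paths = residual_paths("residuals_{src}_i01s00.txt", ["001", "002"])
```

Because every source writes to a distinct file, parallel preprocessing tasks do not write to the same path.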
- run_forward_simulations(path_model, save_traces=None, export_traces=None, **kwargs)
Overrides ‘workflow.forward.run_forward_simulations’ to hijack the default path location for exporting traces to disk
- _run_adjoint_simulation_single(save_kernels=None, export_kernels=None, **kwargs)
Overrides ‘workflow.migration._run_adjoint_simulation_single’ to hijack the default path location for exporting kernels to disk
- evaluate_gradient_from_kernels()
Overwrite workflow.migration to convert the current model and the gradient calculated by migration from their native SPECFEM model format into optimization vectors that can be used for model updates.
Also includes search direction computation, which takes the gradient g_new and scales it to provide an appropriate search direction. In its simplest form (gradient descent), the search direction is simply -g.
- initialize_line_search()
Computes search direction using the optimization library and sets up line search machinery to ‘perform line search’ by placing correct files on disk for each of the modules to find.
Optimization module perturbs the current model (m_new) by the search direction (p_new) to recover the trial model (m_try). This model is then exposed on disk to the solver.
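The model perturbation can be sketched as m_try = m_new + alpha * p_new. The example assumes plain gradient descent (p_new = -g_new) and models stored as flat lists of floats. The helper names and the step length alpha are illustrative only.

```python
def search_direction(g_new):
    """Gradient-descent search direction: the negative gradient."""
    return [-g for g in g_new]


def compute_trial_model(m_new, p_new, alpha):
    """Perturb the current model along the search direction by step length alpha."""
    return [m + alpha * p for m, p in zip(m_new, p_new)]


p_new = search_direction([0.5, -1.0])
m_try = compute_trial_model([1.0, 2.0], p_new, alpha=2.0)
```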
- evaluate_line_search_misfit()
Evaluate line search misfit f_try by running forward simulations through the trial model m_try and comparing with observations. Acts like a stripped down version of evaluate_initial_misfit
TODO Add in export traces functionality, need to honor step count
- update_line_search()
Given the misfit f_try calculated in evaluate_line_search_misfit, use the Optimization module to determine if the line search has passed, failed, or needs to perform a subsequent step.
The line search state machine acts in the following way:
- Pass: Run clean up and proceed with workflow
- Try: Re-calculate step length (alpha) and re-evaluate misfit (f_try)
- Fail: Try to restart optimization module and restart line search. If still failing, exit workflow
Note
Line search starts on step_count == 1 because step_count == 0 is considered the misfit of the starting model
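The Pass/Try/Fail logic can be sketched with a simple step-halving line search. This is an assumed stand-in, not the line search shipped with the optimization library: restart handling is left out, and `evaluate_misfit`, `step_count_max` and the halving rule are all hypothetical.

```python
def line_search(evaluate_misfit, f_new, alpha=1.0, step_count_max=5):
    """Return (status, alpha, step_count) for a toy backtracking line search."""
    step_count = 1  # step 0 is the misfit of the starting model
    while step_count <= step_count_max:
        f_try = evaluate_misfit(alpha)
        if f_try < f_new:
            return "PASS", alpha, step_count  # misfit reduced: accept step
        alpha *= 0.5                          # TRY: shrink step, re-evaluate
        step_count += 1
    return "FAIL", alpha, step_count_max      # no acceptable step found


# Toy misfit along the search direction with its minimum at alpha = 0.25
result = line_search(lambda a: (a - 0.25) ** 2, f_new=0.0625)
```

In this toy case the first two trial steps do not reduce the misfit, so the search passes on the third step.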
- finalize_iteration()
Cleans directories in which function and gradient evaluations were carried out. Contains some logic to consider whether or not to continue with a thrifty inversion.