seisflows.workflow.noise_inversion ================================== .. py:module:: seisflows.workflow.noise_inversion .. autoapi-nested-parse:: Ambient Noise Adjoint Tomography workflow based on Wang et al. (see ref. below) where Synthetic Greens functions (SGF) are generated by simulating point forces in the Z, N and E directions. SGFs are rotated to the R and T components to get ZZ, RR and TT SGFs that can be compared to data. .. note:: EGF Data Location Empirical Greens Function data (noise cross correlations) must be placed in specific directory structures, and are searched for under the following locations: ZZ kernel: `path_data`/{source_name}/ZZ/* RR kernel: `path_data`/{source_name}/RR/* TT kernel: `path_data`/{source_name}/TT/* .. note:: Kernel Naming Naming convention: AB (A=force source direction, B=waveform component) The kernel naming convention used in this workflow follows from Ref. 1. Two letter names (e.g., AB) where first letter (A) represents input force direction, and second letter (B) represents the component of the recorded wavefield. E.g., ZZ represents an upward (+Z) force recorded on Z component. In ambient noise, the common EGFs are ZZ, TT and RR. Cross-component EGFs (e.g., ZT) are also possible, but not currently supported. Please open a GitHub issue if you would like to see these supported. .. note:: TT/RR Workflow Steps: FORWARD SIMULATIONS 1. Run E component forward simulation, save ZNE traces & fwd arrays 2. Run N component forward simulations, save ZNE traces & fwd arrays 3. Rotate N and E component SGF to R and T components based on source-receiver azimuth 4. Calculate RR and TT adjoint sources (u_rr, u_tt) w.r.t EGF data GENERATE TT KERNELS 5a. Rotate u_tt to N and E (u_ee, u_en, u_ne, u_nn) 6a. Run ET adjoint simulation (injecting u_ee, u_en) for K_ET 7a. Run NT adjoint simulation (injecting u_ne, u_nn) for K_NT 8a. Sum T kernels, K_ET + K_NT = K_TT GENERATE RR KERNELS 5b. Rotate u_rr to N and E (u_ee, u_en, u_ne, u_nn) 6b. Run ER adjoint simulation (injecting u_ee, u_en) for K_ER 7b. Run NR adjoint simulation (injecting u_ne, u_nn) for K_NR 8b. Sum R kernels, K_ER + K_NR = K_RR 9. Sum kernels K = K_RR + K_TT .. note:: References 1. "Three‐dimensional sensitivity kernels for multicomponent empirical Green's functions from ambient noise: Methodology and application to Adjoint tomography." Journal of Geophysical Research: Solid Earth 124.6 (2019): 5794-5810. .. warning:: This workflow class makes a lot of assumptions about file naming and path structure defined in other modules that is verging on hard coding. May warrant a re-write in the future. I've tried to mark all the file/dir. naming assumptions with a '!!!' Classes ------- .. autoapisummary:: seisflows.workflow.noise_inversion.NoiseInversion Module Contents --------------- .. py:class:: NoiseInversion(kernels='ZZ', separate_rt_kernels=True, **kwargs) Bases: :py:obj:`seisflows.workflow.inversion.Inversion` Noise Inversion Workflow ------------------------ Run forward and adjoint solvers to produce Synthetic Greens Functions (SGF) based on unidirectional forces which are meant to represent virtual sources of uniform noise distributions. SGFs are compared to Empirical Greens Functions (EGF) for iterative model updates. .. note:: simulation requirements per source station - 'ZZ' kernel requires 1 forward (Z) and 1 adjoint (Z) simulation - 'TT or RR' kernel requires 2 forward (N + E) and 2 adjoint (N_? + E_?) simulations (where ? = R or T) - 'TT,RR' kernels can share their 2 forward simulations (N + E) but require 4 separate adjoint simulations (N_T + E_T + N_R + E_R) - 'ZZ,TT or ZZ,RR' requires 3 forward (Z + N + E) and 3 adjoint - 'ZZ,TT,RR' requires 3 forward (Z + N + E) and 5 adjoint (Z + N_T + E_T + N_R + E_R) Parameters ---------- :type kernels: str :param kernels: comma-separated list of kernels to generate w.r.t available EGF data. Corresponding data must be available. Available options are: - ZZ: Generates Synthetic Greens Functions (SGF) for the ZZ (vertical) component by running forward simulations for each master station using a +Z component force, and then running an adjoint simulation to generate kernels. - TT/RR: Generate Synthetic Greens Functions (SGF) for the TT (transvserse) and/or RR (radial) component(s) following processing steps laid out in Wang et al. (2019) and outlined below: Example inputs would be 'ZZ' or 'ZZ,TT' or 'ZZ,TT,RR'. Case insensitive :type separate_rt_kernels: bool :param separate_rt_kernels: >>> WORK IN PROGRESS, MUST BE SET TRUE <<< if True, generate separate kernels for RR and TT which requires 4 adjoint simulations (ER, ET, NR, NT). If False, mix RR and TT kernel generation for computional efficiency, requiring only 2 adjoint simulations (ER+NR, ET+NT), but losing the ability to look at RR and TT kernels separately. Default is False with the assumption that the User only cares about the sum and not the individual kernels. Paths ----- *** .. py:attribute:: __doc__ .. py:attribute:: kernels :value: '' .. py:attribute:: _force :value: None .. py:attribute:: _cmpnt :value: None .. py:attribute:: _srcrcv_stats :value: None .. py:attribute:: _fwd_arr_dir :value: '{force}_FWD_ARR' .. py:method:: check() Additional checks for the Noise Inversion Workflow to ensure the required modules and parameters are set .. py:method:: setup() Set up some required attributes for Noise Inversion .. py:property:: task_list USER-DEFINED TASK LIST. This property defines a list of class methods that take NO INPUT and have NO RETURN STATEMENTS. This defines your linear workflow, i.e., these tasks are to be run in order from start to finish to complete a workflow. This excludes 'check' (which is run during 'import_seisflows') and 'setup' which should be run separately .. note:: For workflows that require an iterative approach (e.g. inversion), this task list will be looped over, so ensure that any setup and teardown tasks (run once per workflow, not once per iteration) are not included. :rtype: list :return: list of methods to call in order during a workflow .. py:method:: trace_path(tag, comp=None) Convenience path function that returns the full path for storing intermediate waveform files for a given component. These generally adhere to how the `solver` module names directories. Required because this workflow will do a lot of pre-rotation waveform storage, so we use this function as the once-and-for-all definition for the paths .. note :: Must be run by system.run() so that solvers are assigned individual task ids and working directories :type tag: str or None :param tag: sub directory tag, e.g., 'syn' to store synthetic waveforms and 'adj' to store adjoint sources. :type comp: str or None :param comp: optional component used to tag the sub directory :rtype: str :return: full path to solver scratch traces directory to save waveforms .. py:property:: fwd_arr_path Forward arrays must be saved and loaded between subsequent fwd and adjoint simulations. This property removes ambiguity of file naming between multiple functions. Modified by the internal 'force' variable which is set by calling functions. .. py:method:: generate_synthetic_data(**kwargs) Function Overwrite of `workflow.forward.generate_synthetic_data()` To allow generation of both vertical and horizontal component SGF data For synthetic inversion cases, we can use the workflow machinery to generate 'data' by running simulations through a target/true model for each of our `ntask` sources. This only needs to be run once during a workflow. .. py:method:: _generate_synthetic_data_single(path_model=None, **kwargs) Function Overwrite of `workflow.forward.generate_synthetic_data()` To account for specific naming schema of the synthetic data and trace rotation of the horizontal components. Runs forward simulations for Z, N and E components and generates ZZ, RR and TT synthetic data (dependent on values for parameter `kernels`). Data are stored in `path_data/{source_name}/{kernel}/*` so that they are discoverable by the preprocessing module at a later stage. .. py:method:: evaluate_zz_misfit(**kwargs) Run the forward solver to generate ZZ SGFs using a +Z component FORCE, and evalaute the misfit using the preprocessing module. Save the residual files but do not sum them. .. note:: To rerun preprocessing only (e.g., you want to test out new window parameters), run the following commands: $ seisflows debug > workflow.setup() > workflow.evaluate_zz_misfit(_preproc_only=True) .. py:method:: run_zz_adjoint_simulations() Run the adjoint solver to generate kernels for ZZ using the ZZ adjoint sources and applying the saved forward arrays from the ZZ forward simulation. .. py:method:: evaluate_rt_misfit(**kwargs) Run the forward solver to generate E and N SGFs from E and N component FORCES. Preprocessing will wait until the N simulations have finished prior to rotating N and E SGF to RR and TT components. Preprocessing calculates misfit on the R and T components and then we rotate RR and TT adjoint sources into N and E components. .. warning:: IMPORTANT: The order of simulations matters here! E must be first .. py:method:: run_rt_adjoint_simulations() Run adjoint solver to generate horizontal kernels. Choice between separating kernels (ER, NR, ET, NT) requiring four adjoint simulations, or mixing kernels (ER + NR, ET + NT) requiring two adjoint simulations. See parameter `separate_rt_kernels` for choice. .. py:method:: _run_rt_adjoint_simulations_separate() Run adjoint solver for each kernel RR and TT (if requested) by running two adjoint simulations (E and N) per kernel with the appropriate saved E and N forward arrays. .. py:method:: _run_rt_adjoint_simulations_combined() :abstractmethod: Run adjoint solver for each kernel RR and TT (if requested) by running one adjoint simulations (E and N) per kernel with the appropriate saved E and N forward arrays. .. py:method:: _rename_preprocess_files(tag) Prevent preprocess module from overwriting its own log and figure files by simply renaming them with a tag. .. warning:: This feels kludge-y, try to not have to do this by allowing misfit evaluation functions to set a tag on the figure/log naming .. py:method:: sum_all_residuals() Misfit files are tagged with the intended kernel they are generated for, which does not match the original `workflow.inversion` formatting. This function makes summation of residuals files more broad to encompass the updated file naming .. py:method:: prepare_data_for_solver(**kwargs) Function Override of `workflow.forward.prepare_data_for_solver()` run from within the `evaluate_initial_misfit` function Modifies the location of expected observed data, and removes any data previously stored within the `solver/traces/obs/` directory to avoid data conflict during workflow Data are searched for under the following locations: ZZ kernel: `path_data`/{source_name}/ZZ/* RR kernel: `path_data`/{source_name}/RR/* TT kernel: `path_data`/{source_name}/TT/* .. note :: Must be run by system.run() so that solvers are assigned individual task ids and working directories .. py:method:: evaluate_objective_function(save_residuals=False, components=None, **kwargs) Function Override of `workflow.inversion.evaluate_objective_function` run from within the `evaluate_initial_misfit` function - ZZ kernel follows standard procedure as a normal inversion - RR and TT kernels require modification: - N/E synthetics rotated -> R/T, to match EGF data - Objective function evaluated for R/T, R/T adj. srcs generated - R/T adjoint sources rotated back -> N/E for adjoint simulations .. note:: Must be run by system.run() so that solvers are assigned individual task ids/ working directories. :type save_residuals: str :param save_residuals: if not None, path to write misfit/residuls to :type components: list :param components: optional list of components to ignore preprocessing traces that do not have matching components. The adjoint sources for these components will be 0. E.g., ['Z', 'N']. If None, all available components will be considered. .. py:method:: rotate_ne_traces_to_rt(tag='syn') Parallelized rotation function to get N and E component synthetic waveforms, generated by N and E forcesolution simulations, into R and T component synthetics, generated by R and T force sources. See Reference 1 in top docstring for detailed explanation. General idea is that you cannot have a radial or transverse source for N>1 stations with only one simulation, so instead we run N and E force simulations, and then rotate the orthogonal N and E component synthetics to get RR and TT Synthetic Greens Functions. .. note:: Must be run by system.run() so that solvers are assigned individual task ids/ working directories. :type tag: str :param tag: where to look for waveform data within the scratch dir. Example path is: scratch/solver//traces/_?/* Tag by default is 'syn' but can also be 'obs' for generating synthetic data .. py:method:: _rotate_ne_trace_to_rt_single(f_ee, f_ne, f_en, f_nn, tag='syn') Parallalizable private function to rotate NE synthetics to RR and TT .. note:: Must be run by system.run() so that solvers are assigned individual task ids/ working directories. :type f_ee: str :param f_ee: path to the EE synthetic waveform (E force, E component) :type f_ne: str :param f_ne: path to the NE synthetic waveform (N force, E component) :type f_en: str :param f_en: path to the EN synthetic waveform (E force, N component) :type f_nn: str :param f_nn: path to the NN synthetic waveform (N force, N component) :type tag: str :param tag: where to look for waveform data within the scratch dir. Example path is: scratch/solver//traces/_?/* Tag by default is 'syn' but can also be 'obs' for generating synthetic data .. py:method:: rotate_rt_adjsrcs_to_ne(choice) Rotates R and T adjoint sources generated by the preprocessing module back to N and E component synthetics. See `rotate_ne_traces_to_rt` and Ref. 1 for explanations on why this is necessary. Four potential sets of adjoint sources generated during workflow: - ER: RR synthetics that have been rotated so that they appear to originate from an E component force source. - ET: TT synthetics that have been rotated so that they appear to originate from an E component force source - NR: RR synthetics that have been rotated so that they appear to originate from an N component force source - NT: TT synthetics that have been rotated so that they appear to originate from an N component force source .. note:: Must be run by system.run() so that solvers are assigned individual task ids/ working directories. :type choice: str :param choice: define the input component, 'R' or 'T' for the incoming adjoint source as rotation matrix will be different for both. .. py:method:: _rotate_rt_adjsrc_to_ne_single(fid, choice=None) Parallellizable function to rotate R or T adjoint sources to N and E component synthetics for use in adjoint simulations. .. note:: Must be run by system.run() so that solvers are assigned individual task ids/ working directories. :type fid: str :param fid: path to the RR or TT adjoint source that will be read in and rotated into four separate adjoint sources :type choice: str :param choice: define the input component, 'R' or 'T' for the incoming adjoint source as rotation matrix will be different for both. .. py:method:: run_forward_simulations(path_model, save_traces=False, export_traces=False, **kwargs) Function Override of `workflow.forward.run_forward_simulation` - Edits FORCESOLUTION file to set the correct force direction (E, N, Z) - Redirect trace saving/export for more refined tagging based on force .. note:: Must be run by system.run() so that solvers are assigned individual task ids/ working directories. :type path_model: str :param path_model: path to SPECFEM model files used to run the forwarsd simulations. Files will be copied to each individual solver directory. :type save_traces: str :param save_traces: full path location to save synthetic traces after successful completion of forward simulations. By default, they are stored in 'scratch/solver//traces/syn'. Overriding classes may re-direct synthetics by setting this variable :type export_traces: str :param export_traces: full path location to export (copy) synthetic traces after successful completion of forward simulations. Each fwd simulation erases the synthetics of the previous forward simulation, so exporting to disk is important if the User wants to save waveform data. Set parameter `export_traces` True in the parameter file to access this option. Overriding classes may re-direct synthetics by setting this variable. :raises AssertionError: if internal variable `_force` is not set by the calling function .. py:method:: _run_adjoint_simulation_single(save_kernels=None, export_kernels=None, **kwargs) Function Override of `workflow.migration._run_adjoint_simulation_single` 1) Creates necessary empty adjoint sources, e.g., ZZ kernel only requires 'Z' component adjoint sources, and 'N' and 'E' MUST be 0 2) For N and E (TT and RR kernels) adjoint simulations, symlinks collect adjoint sources to be discoverable by SPECFEM. Note that for horizontal components, four sets of adjoint sources are available. .. note:: Must be run by system.run() so that solvers are assigned individual task ids/working directories. .. note:: see solver.specfem.adjoint_simulation() for full detailed list of input parameters :type save_kernels: str :param save_kernels: path to a directory where kernels created by the adjoint simulation are moved to for further use in the workflow Defaults to saving kernels in `scratch/eval_grad/kernels/` :type export kernels: str :param export_kernels: path to a directory where kernels are copied for more permanent storage, where they will not be wiped by `clean` or `restart`. User parameter `export_kernels` must be set `True`. .. py:method:: _generate_empty_adjsrcs(components) Internal NoiseInversion function used to generate empty (zero amplitude) adjoint sources for every station and given `component`. Uses the Solver and Preprocess modules to get after file naming and trace characteristics. This function is required because the original Inversion method for generating empty adjoint sources is insuffficient for the file structure that gets craeted here .. warning:: !!! This entire function makes assumptions about file naming structure for SPECFEM generated synthetics that may be too rigid/ hardcoded. .. note:: Must be run by system.run() so that solvers are assigned individual task ids/working directories. :type components: list of str :param components: components to generate empty adjoint sources for. e.g., ['E', 'N'] will generate E and N component adjoint sources. Note that any files matching the output adjoint source file name will be removed so ensure that there is no actual adjoint source data in this file. .. py:method:: postprocess_event_kernels() Function Override of `workflow.migration.postprocess_event_kernels` - Combines multiple individual kernels, K, for each source. - `K_event = K_ZZ + K_TT + K_RR`, where (K_TT = K_ET + K_NT and K_RR = K_ER + K_NR). - Uses the original function machinery to do the final kernel summation, K_misfit = sum(K_event) and postprocessing (smooth, mask, etc.) .. warning:: Assumes the subdirectory structure of kernels for path `eval_grad` .. py:method:: evaluate_line_search_misfit(_preproc_only=False) Function Overwrite of `workflow.inversion._evaluate_line_search_misfit` Called inside `workflow.inversion.perform_line_search`. Alters the original function by allowing for multiple forward simulations required for both vertical and horizontal SGF creation. .. warning:: Each call of this function will save residuals but these will be ignored and the final residual file will only be created once all forward simulations are run