seisflows.preprocess.default
The SeisFlows Preprocessing module is in charge of interacting with seismic data (observed and synthetic). It should contain functionality to read and write seismic data, apply preprocessing such as filtering, quantify misfit, and write adjoint sources that are expected by the solver.
Module Contents
Classes
Default Preprocess |
Functions
|
Read waveforms in two-column ASCII format. This is copied directly from |
- class seisflows.preprocess.default.Default(syn_data_format='ascii', obs_data_format='ascii', unit_output='VEL', misfit='waveform', adjoint='waveform', normalize=None, filter=None, min_period=None, max_period=None, min_freq=None, max_freq=None, mute=None, early_slope=None, early_const=None, late_slope=None, late_const=None, short_dist=None, long_dist=None, workdir=os.getcwd(), path_preprocess=None, path_solver=None, **kwargs)
Default Preprocess
Data processing for seismic traces, with options for data misfit, filtering, normalization and muting.
Parameters
- type obs_data_format
str
- param obs_data_format
data format for reading observed traces into memory. Available formats: ‘su’, ‘ascii’, ‘sac’
- type unit_output
str
- param unit_output
Data units. Must match the synthetic output of external solver. Available: [‘DISP’: displacement, ‘VEL’: velocity, ‘ACC’: acceleration]
- type misfit
str
- param misfit
misfit function for waveform comparisons. For available see seisflows.plugins.preprocess.misfit
- type backproject
str
- param backproject
backprojection function for migration, or the objective function in FWI. For available see seisflows.plugins.preprocess.adjoint
- type normalize
str
- param normalize
Data normalization parameters used to normalize the amplitudes of waveforms. Choose from two sets: ENORML1: normalize per event by L1 of traces; OR ENORML2: normalize per event by L2 of traces; & TNORML1: normalize per trace by L1 of itself; OR TNORML2: normalize per trace by L2 of itself
- type filter
str
- param filter
Data filtering type, available options are: BANDPASS (req. MIN/MAX PERIOD/FREQ); LOWPASS (req. MAX_FREQ or MIN_PERIOD); HIGHPASS (req. MIN_FREQ or MAX_PERIOD)
- type min_period
float
- param min_period
Minimum filter period applied to time series. See also MIN_FREQ, MAX_FREQ, if User defines FREQ parameters, they will overwrite PERIOD parameters.
- type max_period
float
- param max_period
Maximum filter period applied to time series. See also MIN_FREQ, MAX_FREQ, if User defines FREQ parameters, they will overwrite PERIOD parameters.
- type min_freq
float
- param min_freq
Maximum filter frequency applied to time series, See also MIN_PERIOD, MAX_PERIOD, if User defines FREQ parameters, they will overwrite PERIOD parameters.
- type max_freq
float
- param max_freq
Maximum filter frequency applied to time series, See also MIN_PERIOD, MAX_PERIOD, if User defines FREQ parameters, they will overwrite PERIOD parameters.
- type mute
list
- param mute
Data mute parameters used to zero out early / late arrivals or offsets. Choose any number of: EARLY: mute early arrivals; LATE: mute late arrivals; SHORT: mute short source-receiver distances; LONG: mute long source-receiver distances
Paths
- type path_preprocess
str
- param path_preprocess
scratch path for all preprocessing processes, including saving files
- check()
Checks parameters and paths
- setup()
Sets up data preprocessing machinery by dynamicalyl loading the misfit, adjoint source type, and specifying the expected file type for input and output seismic data.
- read(fid, data_format)
Waveform reading functionality. Imports waveforms as Obspy streams
- Parameters
fid (str) – path to file to read data from
data_format (str) – format of the file to read data from
- Return type
obspy.core.stream.Stream
- Returns
ObsPy stream containing data stored in fid
- write(st, fid)
Waveform writing functionality. Writes waveforms back to format that SPECFEM recognizes
- Parameters
st (obspy.core.stream.Stream) – stream to write
fid (str) – path to file to write stream to
- _calculate_misfit(**kwargs)
Wrapper for plugins.preprocess.misfit misfit/objective function
- _generate_adjsrc(**kwargs)
Wrapper for plugins.preprocess.adjoint source/backproject function
- initialize_adjoint_traces(data_filenames, output, data_format=None)
SPECFEM requires that adjoint traces be present for every matching synthetic seismogram. If an adjoint source does not exist, it is simply set as zeros. This function creates all adjoint traces as zeros, to be filled out later
Appends ‘.adj. to the solver filenames as expected by SPECFEM (if they don’t already have that extension)
TODO there are some sem2d and 3d specific tasks that are not carried TODO over here, were they required?
- Parameters
data_filenames (list of str) – existing solver waveforms (synthetic) to read. These will be copied, zerod out, and saved to path save. Should come from solver.data_filenames
output (str) – path to save the new adjoint traces to.
- _check_adjoint_traces(source_name, save_adjsrcs, synthetic)
Check that all adjoint traces required by SPECFEM exist
- _rename_as_adjoint_source(fid)
Rename synthetic waveforms into filenames consistent with how SPECFEM expects adjoint sources to be named. Usually this just means adding a ‘.adj’ to the end of the filename
- _setup_quantify_misfit(source_name)
Gather waveforms from the Solver scratch directory and
- quantify_misfit(source_name=None, save_residuals=None, save_adjsrcs=None, iteration=1, step_count=0, **kwargs)
Prepares solver for gradient evaluation by writing residuals and adjoint traces. Meant to be called by solver.eval_func().
Reads in observed and synthetic waveforms, applies optional preprocessing, assesses misfit, and writes out adjoint sources and STATIONS_ADJOINT file.
TODO use concurrent futures to parallelize this
- Parameters
source_name (str) – name of the event to quantify misfit for. If not given, will attempt to gather event id from the given task id which is assigned by system.run()
save_residuals (str) – if not None, path to write misfit/residuls to
save_adjsrcs (str) – if not None, path to write adjoint sources to
iteration (int) – current iteration of the workflow, information should be provided by workflow module if we are running an inversion. Defaults to 1 if not given (1st iteration)
step_count (int) – current step count of the line search. Information should be provided by the optimize module if we are running an inversion. Defaults to 0 if not given (1st evaluation)
- finalize()
Teardown procedures for the default preprocessing class
- static sum_residuals(residuals)
Returns the summed square of residuals for each event. Following Tape et al. 2007
- Parameters
residuals (np.array) – list of residuals from each NTASK event
- Return type
float
- Returns
sum of squares of residuals
- _apply_filter(st)
Apply a filter to waveform data using ObsPy
- Parameters
st (obspy.core.stream.Stream) – stream to be filtered
- Return type
obspy.core.stream.Stream
- Returns
filtered traces
- _apply_mute(st)
Apply mute on data based on early or late arrivals, and short or long source receiver distances
Note
The underlying mute functions have been refactored but not tested as I was not aware of the intended functionality. Not gauranteed to work, use at your own risk.
- Parameters
st (obspy.core.stream.Stream) – stream to mute
- Return type
obspy.core.stream.Stream
- Returns
muted stream object
- _apply_normalize(st)
Normalize the amplitudes of waveforms based on user choice
Note
The normalization function has been refactored but not tested as I was not aware of the intended functionality. Not gauranteed to work, use at your own risk.
- Parameters
st (obspy.core.stream.Stream) – All of the data streams to be normalized
- Return type
obspy.core.stream.Stream
- Returns
stream with normalized traces
- seisflows.preprocess.default.read_ascii(fid, origintime=None)
Read waveforms in two-column ASCII format. This is copied directly from pyatoa.utils.read.read_sem()