seisflows.preprocess.default

The SeisFlows Preprocessing module is in charge of interacting with seismic data (observed and synthetic). It should contain functionality to read and write seismic data, apply preprocessing such as filtering, quantify misfit, and write adjoint sources that are expected by the solver.

Module Contents

Classes

Default

Default Preprocess

Functions

read_ascii(fid[, origintime])

Read waveforms in two-column ASCII format. This is copied directly from

class seisflows.preprocess.default.Default(syn_data_format='ascii', obs_data_format='ascii', unit_output='VEL', misfit='waveform', adjoint='waveform', normalize=None, filter=None, min_period=None, max_period=None, min_freq=None, max_freq=None, mute=None, early_slope=None, early_const=None, late_slope=None, late_const=None, short_dist=None, long_dist=None, workdir=os.getcwd(), path_preprocess=None, path_solver=None, **kwargs)

Default Preprocess

Data processing for seismic traces, with options for data misfit, filtering, normalization and muting.

Parameters

type obs_data_format

str

param obs_data_format

data format for reading observed traces into memory. Available formats: ‘su’, ‘ascii’, ‘sac’

type unit_output

str

param unit_output

Data units. Must match the synthetic output of external solver. Available: [‘DISP’: displacement, ‘VEL’: velocity, ‘ACC’: acceleration]

type misfit

str

param misfit

misfit function for waveform comparisons. For available see seisflows.plugins.preprocess.misfit

type backproject

str

param backproject

backprojection function for migration, or the objective function in FWI. For available see seisflows.plugins.preprocess.adjoint

type normalize

str

param normalize

Data normalization parameters used to normalize the amplitudes of waveforms. Choose from two sets: ENORML1: normalize per event by L1 of traces; OR ENORML2: normalize per event by L2 of traces; & TNORML1: normalize per trace by L1 of itself; OR TNORML2: normalize per trace by L2 of itself

type filter

str

param filter

Data filtering type, available options are: BANDPASS (req. MIN/MAX PERIOD/FREQ); LOWPASS (req. MAX_FREQ or MIN_PERIOD); HIGHPASS (req. MIN_FREQ or MAX_PERIOD)

type min_period

float

param min_period

Minimum filter period applied to time series. See also MIN_FREQ, MAX_FREQ, if User defines FREQ parameters, they will overwrite PERIOD parameters.

type max_period

float

param max_period

Maximum filter period applied to time series. See also MIN_FREQ, MAX_FREQ, if User defines FREQ parameters, they will overwrite PERIOD parameters.

type min_freq

float

param min_freq

Maximum filter frequency applied to time series, See also MIN_PERIOD, MAX_PERIOD, if User defines FREQ parameters, they will overwrite PERIOD parameters.

type max_freq

float

param max_freq

Maximum filter frequency applied to time series, See also MIN_PERIOD, MAX_PERIOD, if User defines FREQ parameters, they will overwrite PERIOD parameters.

type mute

list

param mute

Data mute parameters used to zero out early / late arrivals or offsets. Choose any number of: EARLY: mute early arrivals; LATE: mute late arrivals; SHORT: mute short source-receiver distances; LONG: mute long source-receiver distances

Paths

type path_preprocess

str

param path_preprocess

scratch path for all preprocessing processes, including saving files

***

check()

Checks parameters and paths

setup()

Sets up data preprocessing machinery by dynamicalyl loading the misfit, adjoint source type, and specifying the expected file type for input and output seismic data.

read(fid, data_format)

Waveform reading functionality. Imports waveforms as Obspy streams

Parameters
  • fid (str) – path to file to read data from

  • data_format (str) – format of the file to read data from

Return type

obspy.core.stream.Stream

Returns

ObsPy stream containing data stored in fid

write(st, fid)

Waveform writing functionality. Writes waveforms back to format that SPECFEM recognizes

Parameters
  • st (obspy.core.stream.Stream) – stream to write

  • fid (str) – path to file to write stream to

_calculate_misfit(**kwargs)

Wrapper for plugins.preprocess.misfit misfit/objective function

_generate_adjsrc(**kwargs)

Wrapper for plugins.preprocess.adjoint source/backproject function

initialize_adjoint_traces(data_filenames, output, data_format=None)

SPECFEM requires that adjoint traces be present for every matching synthetic seismogram. If an adjoint source does not exist, it is simply set as zeros. This function creates all adjoint traces as zeros, to be filled out later

Appends ‘.adj. to the solver filenames as expected by SPECFEM (if they don’t already have that extension)

TODO there are some sem2d and 3d specific tasks that are not carried TODO over here, were they required?

Parameters
  • data_filenames (list of str) – existing solver waveforms (synthetic) to read. These will be copied, zerod out, and saved to path save. Should come from solver.data_filenames

  • output (str) – path to save the new adjoint traces to.

_check_adjoint_traces(source_name, save_adjsrcs, synthetic)

Check that all adjoint traces required by SPECFEM exist

_rename_as_adjoint_source(fid)

Rename synthetic waveforms into filenames consistent with how SPECFEM expects adjoint sources to be named. Usually this just means adding a ‘.adj’ to the end of the filename

_setup_quantify_misfit(source_name)

Gather waveforms from the Solver scratch directory and

quantify_misfit(source_name=None, save_residuals=None, save_adjsrcs=None, iteration=1, step_count=0, **kwargs)

Prepares solver for gradient evaluation by writing residuals and adjoint traces. Meant to be called by solver.eval_func().

Reads in observed and synthetic waveforms, applies optional preprocessing, assesses misfit, and writes out adjoint sources and STATIONS_ADJOINT file.

TODO use concurrent futures to parallelize this

Parameters
  • source_name (str) – name of the event to quantify misfit for. If not given, will attempt to gather event id from the given task id which is assigned by system.run()

  • save_residuals (str) – if not None, path to write misfit/residuls to

  • save_adjsrcs (str) – if not None, path to write adjoint sources to

  • iteration (int) – current iteration of the workflow, information should be provided by workflow module if we are running an inversion. Defaults to 1 if not given (1st iteration)

  • step_count (int) – current step count of the line search. Information should be provided by the optimize module if we are running an inversion. Defaults to 0 if not given (1st evaluation)

finalize()

Teardown procedures for the default preprocessing class

static sum_residuals(residuals)

Returns the summed square of residuals for each event. Following Tape et al. 2007

Parameters

residuals (np.array) – list of residuals from each NTASK event

Return type

float

Returns

sum of squares of residuals

_apply_filter(st)

Apply a filter to waveform data using ObsPy

Parameters

st (obspy.core.stream.Stream) – stream to be filtered

Return type

obspy.core.stream.Stream

Returns

filtered traces

_apply_mute(st)

Apply mute on data based on early or late arrivals, and short or long source receiver distances

Note

The underlying mute functions have been refactored but not tested as I was not aware of the intended functionality. Not gauranteed to work, use at your own risk.

Parameters

st (obspy.core.stream.Stream) – stream to mute

Return type

obspy.core.stream.Stream

Returns

muted stream object

_apply_normalize(st)

Normalize the amplitudes of waveforms based on user choice

Note

The normalization function has been refactored but not tested as I was not aware of the intended functionality. Not gauranteed to work, use at your own risk.

Parameters

st (obspy.core.stream.Stream) – All of the data streams to be normalized

Return type

obspy.core.stream.Stream

Returns

stream with normalized traces

seisflows.preprocess.default.read_ascii(fid, origintime=None)

Read waveforms in two-column ASCII format. This is copied directly from pyatoa.utils.read.read_sem()