seisflows.tools.specfem

Utilities to interact with, manipulate or call on the external solver, i.e., SPECFEM2D/3D/3D_GLOBE

Attributes

SOURCE_KEYS

Functions

get_src_rcv_lookup_table(path_to_data[, ...])

Generate a lookup table that gives relative distance, azimuth etc. for

return_matching_waveform_files(obs_path, syn_path[, ...])

Gather a list of filenames of matching waveform IDs which are sorted and

convert_stations_to_sources(stations_file, ...[, ...])

Used for ambient noise adjoint tomography inversions where each station

get_station_locations(stations_file)

Read the SPECFEM STATIONS file to get metadata information about stations.

get_source_locations(path_to_sources, source_prefix)

Read in SOURCE/CMTSOLUTION/FORCESOLUTION files to get lat/lon/depth or x/y/z

rename_as_adjoint_source(fid, fmt)

Rename SPECFEM synthetic waveform filenames consistent with how SPECFEM

check_source_names(path_specfem_data, source_prefix[, ...])

Determines names of sources by applying wildcard rule to user-supplied

getpar(key, file[, delim, match_partial, comment, ...])

Reads and returns parameters from a SPECFEM or SeisFlows parameter file

setpar(key, val, file[, delim])

Overwrites parameter value to a SPECFEM Par_file. Kwargs passed to getpar

_getidx_vel_model(lines)

Get the line indices of a velocity model, which can be used to retrieve

getpar_vel_model(file[, strip])

SPECFEM2D doesn't follow standard key = val formatting when defining its

setpar_vel_model(file, model)

Set velocity model values in a SPECFEM2D Par_file, see getpar_vel_model

read_fortran_binary(filename)

Reads Fortran-style unformatted binary data into numpy array.

write_fortran_binary(arr, filename)

Writes Fortran style binary files. Data are written as single precision

Module Contents

seisflows.tools.specfem.SOURCE_KEYS
seisflows.tools.specfem.get_src_rcv_lookup_table(path_to_data, source_prefix='CMTSOLUTION', stations_file='STATIONS')

Generate a lookup table that gives relative distance, azimuth etc. for each source and receiver used in the workflow. Source and station locations will be gathered from metadata files stored in the SPECFEM data directory.

Parameters:
  • path_to_data (str) – full path to the SPECFEM DATA/ directory, which should contain the source files (e.g., CMTSOLUTION, FORCESOLUTION) and the STATIONS file whose name is defined by stations_file.

  • source_prefix (str) – prefix of all the source files that will be wildcard searched for using the following wildcard: {source_prefix}_*. e.g., if all your files are CMTSOLUTIONS (CMTSOLUTION_001, CMTSOLUTION_002), then source_prefix should be ‘CMTSOLUTION’

  • stations_file (str) – the name of the STATIONS file in the SPECFEM DATA directory path_to_data. By default this is STATIONS

seisflows.tools.specfem.return_matching_waveform_files(obs_path, syn_path, obs_fmt='ASCII', syn_fmt='ASCII', components=None)

Gather a list of filenames of matching waveform IDs which are sorted and match for given paths to ‘observed’ and ‘synthetic’ waveform data. This is useful because often the available ‘observed’ data will not match all of the ‘synthetic’ data that is generated during a simulation, so this function helps determine which files/stations can be accessed for misfit quantification.

Note

Waveform files are expected to be in the format NN.SSS.CCc.* (N=network, S=station, C=channel, c=component; following SPECFEM ASCII formatting). They will be matched on NN.SSS.c (dropping channel naming because SEED convention may have different channel naming). For example, synthetic name ‘AA.S001.BXZ.semd’ will be converted to ‘AA.S001.Z’, and matching observation ‘AA.S001.HHZ.SAC’ will be converted to ‘AA.S001.Z’. These two will be matched.

Parameters:
  • obs_path (str) –

    the path to observed data waveforms for a given source. In SeisFlows this will typically point to somewhere like:

    ’scratch/solver/<SOURCE_NAME>/traces/obs/’

  • syn_path (str) –

    the path to synthetic data waveforms for a given source. In SeisFlows this will typically point to somewhere like:

    ’scratch/solver/<SOURCE_NAME>/traces/syn/’

  • obs_fmt (str) – expected file format of the observed waveforms. Used for safety checks that only one expected file format will be read. Defaults to ‘ASCII’

  • syn_fmt (str) – expected file format of the synthetic waveforms. Used for safety checks that only one expected file format will be read. Defaults to ‘ASCII’

  • components (list) – optional list of components to consider, e.g., [‘Z’, ‘N’]. Traces whose component is not in this list are ignored during preprocessing, and the adjoint sources for those components will be 0. If None, all available components will be considered.

Return type:

list of tuples

Returns:

[(observed filename, synthetic filename)]. Each tuple contains the observed and synthetic filenames for one matching station and component.
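The matching rule described in the note above can be illustrated with a minimal standalone sketch. It does not call SeisFlows and is not the library's implementation; `match_tag` and `match_waveforms` are hypothetical helper names, and the sketch assumes filenames already follow the NN.SSS.CCc.* pattern.

```python
import os


def match_tag(filename):
    """Reduce 'NN.SSS.CCc.ext' to 'NN.SSS.c' (hypothetical helper)."""
    net, sta, cha = os.path.basename(filename).split(".")[:3]
    # Keep only the last character of the channel code, i.e. the component
    return f"{net}.{sta}.{cha[-1]}"


def match_waveforms(obs_files, syn_files, components=None):
    """Pair observed and synthetic filenames that share the same tag."""
    obs = {match_tag(f): f for f in obs_files}
    syn = {match_tag(f): f for f in syn_files}
    pairs = []
    for tag in sorted(set(obs) & set(syn)):
        # Optionally skip components the caller did not ask for
        if components is not None and tag[-1] not in components:
            continue
        pairs.append((obs[tag], syn[tag]))
    return pairs


obs_files = ["AA.S001.HHZ.SAC", "AA.S002.HHZ.SAC"]
syn_files = ["AA.S001.BXZ.semd", "AA.S001.BXN.semd", "AA.S003.BXZ.semd"]
pairs = match_waveforms(obs_files, syn_files)
print(pairs)
```

Only station AA.S001, component Z, has both an observed and a synthetic file here, so it is the only pair returned.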

seisflows.tools.specfem.convert_stations_to_sources(stations_file, source_file, source_type, output_dir='./')

Used for ambient noise adjoint tomography inversions where each station is treated like a virtual source. This requires generating source files for each station in a station file.

Warning

In SPECFEM3D_GLOBE, the first line of the FORCESOLUTION file requires a specific format. I haven’t tested this thoroughly, but it either cannot exceed a character limit or cannot contain characters like spaces or hyphens. To be safe, something like FORCE{N} will work, where N should be the source number. This function enforces this just in case.

Parameters:
  • stations_file (str) – full path to SPECFEM STATIONS file which should be formatted ‘STATION NETWORK LATITUDE LONGITUDE ELEVATION BURIAL’; elevation and burial will not be used

  • source_file (str) – path to

  • source_type (str) –

    tells SeisFlows what type of file we are using, which in turn defines the specific keys and delimiters to use when editing the source file.

    • SOURCE: SPECFEM2D source file

    • FORCESOLUTION_3D: SPECFEM3D Cartesian FORCESOLUTION file

    • FORCESOLUTION_3DGLOBE: SPECFEM3D_GLOBE FORCESOLUTION file

seisflows.tools.specfem.get_station_locations(stations_file)

Read the SPECFEM STATIONS file to get metadata information about stations. This functionality is required in a few preprocessing or utility functions

Parameters:

stations_file (str) – full path to STATIONS file that will be read. We assume the structure of the STATIONS file to be a text file with the first 4 columns defining ‘station ID, network ID, latitude, longitude’

Return type:

Dict

Returns:

keys are network_station and values are dictionaries containing ‘lat’ and ‘lon’ for latitude and longitude values
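As a rough illustration of the structure described above, the following standalone sketch parses STATIONS-style text into a nested dictionary. It is not the SeisFlows implementation: `read_stations` is a hypothetical name, the underscore used to join network and station in the key is an assumption, and the station coordinates are made up for the example.

```python
def read_stations(text):
    """Parse STATIONS-style text into {'NN_SSS': {'lat': ..., 'lon': ...}}."""
    stations = {}
    for line in text.splitlines():
        cols = line.split()
        if len(cols) < 4:
            continue  # skip blank or malformed lines
        # First four columns: station ID, network ID, latitude, longitude
        sta, net, lat, lon = cols[:4]
        # Key format 'network_station' is an assumption for illustration
        stations[f"{net}_{sta}"] = {"lat": float(lat), "lon": float(lon)}
    return stations


example = """S001 AA -40.5 175.2 0.0 0.0
S002 AA -41.0 174.8 0.0 0.0
"""
locations = read_stations(example)
print(locations["AA_S001"])
```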

seisflows.tools.specfem.get_source_locations(path_to_sources, source_prefix)

Read in SOURCE/CMTSOLUTION/FORCESOLUTION files to get lat/lon/depth or x/y/z values which can be used to determine relative locations between sources and receivers. Used by the preprocessing module.

Parameters:
  • path_to_sources (str) – full path to all source files which should start with source_prefix. Will run a wildcard glob search on this path to look for ALL available source files

  • source_prefix (str) – source file name prefix used for wildcard searching. This function will look for glob(path_to_sources/`source_prefix`_*), and tag each of the sources with whatever tag comes within wildcard *

Return type:

Dict

Returns:

keys are source tags and values are dictionaries containing ‘lat’ and ‘lon’ for latitude and longitude values. If None is returned, the function could not determine which keys to use for reading the source.

seisflows.tools.specfem.rename_as_adjoint_source(fid, fmt)

Rename SPECFEM synthetic waveform filenames consistent with how SPECFEM expects adjoint sources to be named. Usually this just means adding a ‘.adj’ to the end of the filename.

Parameters:
  • fid (str) – file path of synthetic waveform to rename as adjoint source

  • fmt (str) – expected format of the input synthetic waveform, because different file formats have different filename structure. Available are ‘SU’ (seismic unix) and ‘ASCII’. Case-insensitive

Return type:

str

Returns:

renamed file that matches expected SPECFEM filename format for adjoint sources
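A minimal sketch of the renaming idea follows. It is an assumption about the conventions involved rather than the library's code: it assumes ASCII synthetics swap their final extension (e.g., ‘.semd’) for ‘.adj’, while SU files have ‘.adj’ appended. `as_adjoint_name` is a hypothetical name.

```python
import os


def as_adjoint_name(fid, fmt):
    """Return an adjoint-source style filename (illustrative only)."""
    fmt = fmt.upper()  # the format flag is treated as case-insensitive
    if fmt == "SU":
        # Assumed convention: append '.adj' to the full SU filename
        return f"{fid}.adj"
    if fmt == "ASCII":
        # Assumed convention: replace the last extension with '.adj'
        root, _ext = os.path.splitext(fid)
        return f"{root}.adj"
    raise ValueError(f"unsupported format: {fmt}")


print(as_adjoint_name("AA.S001.BXZ.semd", "ascii"))
print(as_adjoint_name("Ux_file_single.su", "SU"))
```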

seisflows.tools.specfem.check_source_names(path_specfem_data, source_prefix, ntask=None)

Determines names of sources by applying wildcard rule to user-supplied input files. Source names are only provided up to PAR.NTASK and are returned in alphabetical order.

Note

SeisFlows expects sources to be stored in the DATA/ directory with a prefix and a source name, e.g., {source_prefix}_{source_name} which would evaluate to something like CMTSOLUTION_001

Parameters:
  • path_specfem_data (str) – path to a

  • source_prefix (str) – type of SPECFEM input source, e.g., CMTSOLUTION

  • ntask – if provided, curtails the list of sources up to ntask. If None, returns all files found matching the wildcard

Return type:

list

Returns:

alphabetically ordered list of source names up to PAR.NTASK
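The wildcard rule can be sketched with the standard library alone. This is not the SeisFlows function itself: `list_source_names` is a hypothetical stand-in, and the temporary directory with empty CMTSOLUTION_* files exists only to make the example runnable.

```python
import glob
import os
import tempfile


def list_source_names(path_specfem_data, source_prefix, ntask=None):
    """Return sorted source names found as {source_prefix}_{source_name}."""
    pattern = os.path.join(path_specfem_data, f"{source_prefix}_*")
    # Strip the '{source_prefix}_' part to keep only the source name
    names = sorted(
        os.path.basename(fid)[len(source_prefix) + 1:]
        for fid in glob.glob(pattern)
    )
    return names if ntask is None else names[:ntask]


with tempfile.TemporaryDirectory() as data_dir:
    # Create empty placeholder source files in arbitrary order
    for name in ("003", "001", "002"):
        open(os.path.join(data_dir, f"CMTSOLUTION_{name}"), "w").close()
    all_names = list_source_names(data_dir, "CMTSOLUTION")
    first_two = list_source_names(data_dir, "CMTSOLUTION", ntask=2)

print(all_names, first_two)
```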

seisflows.tools.specfem.getpar(key, file, delim='=', match_partial=False, comment='#', _fmt_dbl=True)

Reads and returns parameters from a SPECFEM or SeisFlows parameter file. Assumes the parameter file is formatted in the following way:

# comment comment comment
{key} {delim} VAL

Parameters:
  • key (str) – case-insensitive key to match in par_file. must be EXACT match

  • file (str) – The SPECFEM Par_file to match against

  • delim (str) – delimiter between parameters and values within the file. default is ‘=’, which matches for SPECFEM2D and SPECFEM3D_Cartesian

  • match_partial (bool) – allow partial key matches, e.g., allow key=’tit’ to return value for ‘title’. Defaults to False as this can have unintended consequences

  • comment (str) – character used to delimit comments in the file. Defaults to ‘#’ for the SPECFEM Par_file, but things like the FORCESOLUTION use ‘!’

  • _fmt_dbl (bool) – the SPECFEM files use FORTRAN double precision notation to define floats (e.g., 2.5d0 == 2.5, or 1.2e2 == 1.2*10^2). Usually it is preferable to convert this notation directly to a Python float, so this is set True by default. However, in cases where we are doing string replacement, like when using setpar, we do not want to format the double precision values, so this should be set False.

Return type:

tuple (str, str, int)

Returns:

a tuple of the key, value and line number (indexed from 0). The key will match exactly how it looks in the Par_file. The value will be returned as a string, regardless of its expected type. If no matches are found, returns (None, None, None)
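The lookup behaviour described above can be approximated by the following standalone sketch, which works on a string instead of a file. It is a simplified assumption of how such a parser might work, not the SeisFlows source: `read_par` is a hypothetical name, partial matching is omitted, and the regular expression for Fortran double-precision notation is my own.

```python
import re

# Matches Fortran double notation such as 2.5d0, 2700.d0 or 1.d-3
FORTRAN_DBL = re.compile(r"[-+]?(\d+\.?\d*|\.\d+)[dD][-+]?\d+")


def read_par(key, text, delim="=", comment="#", fmt_dbl=True):
    """Return (key, value, line_index) for the first exact key match."""
    for idx, line in enumerate(text.splitlines()):
        body = line.split(comment)[0]  # drop comments
        if delim not in body:
            continue
        found, val = (part.strip() for part in body.split(delim, 1))
        if found.lower() != key.lower():
            continue  # case-insensitive, but otherwise exact, match
        if fmt_dbl and FORTRAN_DBL.fullmatch(val):
            # Convert e.g. '2.5d-3' to a Python-readable float string
            val = str(float(re.sub("[dD]", "e", val)))
        return found, val, idx
    return None, None, None


par_text = "\n".join([
    "# simulation title",
    "title = Example  # trailing comment",
    "DT    = 2.5d-3",
])
print(read_par("dt", par_text))
print(read_par("dt", par_text, fmt_dbl=False))
```

Leaving the value untouched when `fmt_dbl` is False mirrors the reason given for `_fmt_dbl`: string replacement needs the value exactly as it appears in the file.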

seisflows.tools.specfem.setpar(key, val, file, delim='=', **kwargs)

Overwrites parameter value to a SPECFEM Par_file. Kwargs passed to getpar

Parameters:
  • key (str) – case-insensitive key to match in par_file. must be EXACT match

  • val (str) – value to OVERWRITE to the given key

  • file (str) – The SPECFEM Par_file to match against

  • delim (str) – delimiter between parameters and values within the file. default is ‘=’, which matches for SPECFEM2D and SPECFEM3D_Cartesian

  • match_partial (bool) – allow partial key matches, e.g., allow key=’tit’ to return value for ‘title’. Defaults to False as this can have unintended consequences

seisflows.tools.specfem._getidx_vel_model(lines)

Get the line indices of a velocity model, which can be used to retrieve or replace the model values in a SPECFEM2D parameter file. Used by getpar_vel_model and setpar_vel_model.

Parameters:
  • lines (list) – list of strings read from the par_file

Return type:

list

Returns:

idxs, a list of integer indices of the velocity model lines

seisflows.tools.specfem.getpar_vel_model(file, strip=False)

SPECFEM2D doesn’t follow standard key = val formatting when defining its internal velocity models so we need a special function to address this specifically.

Velocity models are ASSUMED to be formatted in the following way

1 1 2700.d0 3000.d0 1732.051d0 0 0 9999 9999 0 0 0 0 0 0

That is, 15 entries separated by spaces. We use that to find all relevant lines of the model.

Parameters:
  • file (str) – The SPECFEM Par_file to match against

  • strip (bool) – strip newline ‘\n’ from each of the model lines

Return type:

list of str

Returns:

list of all the layers of the velocity model as strings
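The "15 space-separated entries" rule can be sketched as below. This is an illustration under stated assumptions, not the library's code: `model_line_indices` is a hypothetical name, and the extra check that the first two entries are integers is my own heuristic to avoid matching unrelated lines.

```python
def model_line_indices(lines):
    """Return indices of lines that look like SPECFEM2D velocity model rows."""
    idxs = []
    for i, line in enumerate(lines):
        entries = line.split()
        if line.lstrip().startswith("#") or len(entries) != 15:
            continue  # comments and lines without exactly 15 entries
        # Heuristic (assumption): model number and type are plain integers
        if entries[0].isdigit() and entries[1].isdigit():
            idxs.append(i)
    return idxs


lines = [
    "nbmodels = 2\n",
    "1 1 2700.d0 3000.d0 1732.051d0 0 0 9999 9999 0 0 0 0 0 0\n",
    "2 1 2500.d0 2700.d0 0 0 0 9999 9999 0 0 0 0 0 0\n",
    "# a comment line\n",
]
idxs = model_line_indices(lines)
# Equivalent of strip=True: drop the trailing newline from each model line
model = [lines[i].strip() for i in idxs]
print(idxs)
```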

seisflows.tools.specfem.setpar_vel_model(file, model)

Set velocity model values in a SPECFEM2D Par_file, see getpar_vel_model for more information.

Deletes the old model from the Par_file, writes the new model in the same place, and then changes the value of ‘nbmodels’

Parameters:
  • file (str) – The SPECFEM Par_file to match against

  • model (list of str) – input model

Return type:

list of str

Returns:

list of all the layers of the velocity model as strings, e.g.:

model = ["1 1 2700.d0 3000.d0 1732.051d0 0 0 9999 9999 0 0 0 0 0 0",
         "2 1 2500.d0 2700.d0 0 0 0 9999 9999 0 0 0 0 0 0"]

seisflows.tools.specfem.read_fortran_binary(filename)

Reads Fortran-style unformatted binary data into numpy array.

Note

The FORTRAN runtime system embeds the record boundaries in the data by inserting an INTEGER*4 byte count at the beginning and end of each unformatted sequential record during an unformatted sequential WRITE. see: https://docs.oracle.com/cd/E19957-01/805-4939/6j4m0vnc4/index.html

Parameters:

filename (str) – full path to the Fortran unformatted binary file to read

Return type:

np.array

Returns:

numpy array with data read in as type Float32

seisflows.tools.specfem.write_fortran_binary(arr, filename)

Writes Fortran style binary files. Data are written as single precision floating point numbers.

Note

FORTRAN unformatted binaries are bounded by an INT*4 byte count. This function mimics that behavior by tacking on the boundary data. https://docs.oracle.com/cd/E19957-01/805-4939/6j4m0vnc4/index.html

Parameters:
  • arr (np.array) – data array to write as Fortran binary

  • filename (str) – full path to the file to be written as unformatted Fortran binary
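The record layout described in the notes (a 4-byte integer byte count before and after the data) can be demonstrated with a small round trip. This sketch uses only the standard library `struct` module and plain lists, whereas the SeisFlows functions work with numpy arrays; `write_record` and `read_record` are hypothetical names, and native byte order with a single record per file is assumed.

```python
import os
import struct
import tempfile


def write_record(values, filename):
    """Write one unformatted sequential record of float32 values."""
    payload = struct.pack(f"={len(values)}f", *values)
    marker = struct.pack("=i", len(payload))  # 4-byte record length
    with open(filename, "wb") as f:
        f.write(marker + payload + marker)


def read_record(filename):
    """Read back a single float32 record written as above."""
    with open(filename, "rb") as f:
        raw = f.read()
    (nbytes,) = struct.unpack("=i", raw[:4])
    (trailer,) = struct.unpack("=i", raw[4 + nbytes:8 + nbytes])
    if nbytes != trailer:
        raise ValueError("leading and trailing record markers differ")
    return list(struct.unpack(f"={nbytes // 4}f", raw[4:4 + nbytes]))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.bin")
    write_record([1.5, 2.25, 3.0], path)
    size = os.path.getsize(path)  # 4 (marker) + 12 (data) + 4 (marker)
    values = read_record(path)

print(size, values)
```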