seisflows.tools.specfem ======================= .. py:module:: seisflows.tools.specfem .. autoapi-nested-parse:: Utilities to interact with, manipulate or call on the external solver, i.e., SPECFEM2D/3D/3D_GLOBE Attributes ---------- .. autoapisummary:: seisflows.tools.specfem.SOURCE_KEYS Functions --------- .. autoapisummary:: seisflows.tools.specfem.get_src_rcv_lookup_table seisflows.tools.specfem.return_matching_waveform_files seisflows.tools.specfem.convert_stations_to_sources seisflows.tools.specfem.get_station_locations seisflows.tools.specfem.get_source_locations seisflows.tools.specfem.rename_as_adjoint_source seisflows.tools.specfem.check_source_names seisflows.tools.specfem.getpar seisflows.tools.specfem.setpar seisflows.tools.specfem._getidx_vel_model seisflows.tools.specfem.getpar_vel_model seisflows.tools.specfem.setpar_vel_model seisflows.tools.specfem.read_fortran_binary seisflows.tools.specfem.write_fortran_binary Module Contents --------------- .. py:data:: SOURCE_KEYS .. py:function:: get_src_rcv_lookup_table(path_to_data, source_prefix='CMTSOLUTION', stations_file='STATIONS') Generate a lookup table that gives relative distance, azimuth etc. for each source and receiver used in the workflow. Source and station locations will be gathered from metadata files stored in the SPECFEM data directory. :type path_to_data: str :param path_to_data: full path to the SPECFEM DATA/ directory which should contain the source file (e.g., CMTSOLUTION, FORCESOLUTION), and the STATIONS file whose name is defined by `station_file`. :type source_prefix: str :param source_prefix: prefix of all the source files that will be wildcard searched for using the following wildcard: {source_prefix}_*. e.g., if all your files are CMTSOLUTIONS (CMTSOLUTION_001, CMTSOLUTION_002), then `source_prefix` should be 'CMTSOLUTION' :type stations_file: str :param stations_file: the name of the STATIONS file in the SPECFEM DATA directory `path_to_data`. By default this is STATIONS .. py:function:: return_matching_waveform_files(obs_path, syn_path, obs_fmt='ASCII', syn_fmt='ASCII', components=None) Gather a list of filenames of matching waveform IDs which are sorted and match for given paths to 'observed' and 'synthetic' waveform data. This is useful because often the available 'observed' data will not match all of the 'synthetic' data that is generated during a simulation, so this function helps determine which files/stations can be accessed for misfit quantification. .. note:: Waveform files are expected to be in the format NN.SSS.CCc.* (N=network, S=station, C=channel, c=component; following SPECFEM ASCII formatting). They will be matched on `NN.SSS.c` (dropping channel naming because SEED convention may have different channel naming). For example, synthetic name 'AA.S001.BXZ.semd' will be converted to 'AA.S001.Z', and matching observation 'AA.S001.HHZ.SAC' will be converted to 'AA.S001.Z'. These two will be matched. :type obs_path: str :param obs_path: the path to observed data waveforms for a given source. In SeisFlows this will typically point to somewhere like: 'scratch/solver//traces/obs/' :type syn_path: str :param syn_path: the path to synthetic data waveforms for a given source. In SeisFlows this will typically point to somewhere like: 'scratch/solver//traces/syn/' :type obs_fmt: str :param obs_fmt: expected file format of the observed waveforms. Used for safety checks that only one expected file format will be read. Defaults to 'ASCII' :type syn_fmt: str :param syn_fmt: expected file format of the synthetic waveforms. Used for safety checks that only one expected file format will be read. Defaults to 'ASCII' :type components: list :param components: optional list of components to ignore preprocessing traces that do not have matching components. The adjoint sources for these components will be 0. E.g., ['Z', 'N']. If None, all available components will be considered. :rtype: list of tuples :return: [(observed filename, synthetic filename)]. tuples will contain filenames for matching stations + component for obs and syn .. py:function:: convert_stations_to_sources(stations_file, source_file, source_type, output_dir='./') Used for ambient noise adjoint tomography inversions where each station is treated like a virtual source. This requires generating source files for each station in a station file. .. warning:: In SPECFEM3D_GLOBE, the FORCESOLUTION first line requires a specific format. I haven't tested this thoroughly but it either cannot exceed a char limit, or cannot have characters like spaces or hyphens. To be safe, something like FORCE{N} will work, where N should be the source number. This function enforces this just incase. :type stations_file: str :param stations_file: full path to SPECFEM STATIONS file which should be formatted 'STATION NETWORK LATITUDE LONGITUDE ELEVATION BURIAL', elevant and burial will not be used :type source_file: str :param source_file: path to :type source_type: str :param source_type: tells SeisFlows what type of file we are using, which in turn defines the specific keys and delimiters to use when editing the source file. - SOURCE: SPECFEM2D source file - FORCESOLUTION_3D: SPECFEM3D Cartesian FORCESOLUTION file - FORCESOLUTION_3DGLOBE: SPECFEM3D_GLOBE FORCESOLUTION file .. py:function:: get_station_locations(stations_file) Read the SPECFEM STATIONS file to get metadata information about stations. This functionality is required in a few preprocessing or utility functions :type stations_file: str :param stations_file: full path to STATIONS file that will be read. We assume the structure of the STATIONS file to be a text file with the first 4 columns defining 'station ID, network ID, latitude, longitude' :rtype: Dict :return: keys are `network_station` and values are dictionaries contianing 'lat' and 'lon' for latitude and longitude values .. py:function:: get_source_locations(path_to_sources, source_prefix) Read in SOURCE/CMTSOLUTION/FORCESOLUTION files to get lat/lon/depth or x/y/z values which can be used to determine relative locations between sources and receivers. Used by the preprocessing module. :type path_to_sources: str :param path_to_sources: full path to all source files which should start with `source_prefix`. Will run a wildcard glob search on this path to look for ALL available source files :type source_prefix: str :param source_prefix: source file name prefix used for wildcard searching. This function will look for glob(`path_to_sources`/`source_prefix`_*), and tag each of the sources with whatever tag comes within wildcard * :rtype: Dict :return: keys are `source tag` and values are dictionaries contianing 'lat' and 'lon' for latitude and longitude values. If None is returned, then function could not determine what keys to use for reading source. .. py:function:: rename_as_adjoint_source(fid, fmt) Rename SPECFEM synthetic waveform filenames consistent with how SPECFEM expects adjoint sources to be named. Usually this just means adding a '.adj' to the end of the filename. :type fid: str :param fid: file path of synthetic waveform to rename as adjoint source :type fmt: str :param fmt: expected format of the input synthetic waveform, because different file formats have different filename structure. Available are 'SU' (seismic unix) and 'ASCII'. Case-insensitive :rtype: str :return: renamed file that matches expected SPECFEM filename format for adjoint sources .. py:function:: check_source_names(path_specfem_data, source_prefix, ntask=None) Determines names of sources by applying wildcard rule to user-supplied input files. Source names are only provided up to PAR.NTASK and are returned in alphabetical order. .. note:: SeisFlows expects sources to be stored in the DATA/ directory with a prefix and a source name, e.g., {source_prefix}_{source_name} which would evaluate to something like CMTSOLUTION_001 :type path_specfem_data: str :param path_specfem_data: path to a :type source_prefix: str :param source_prefix: type of SPECFEM input source, e.g., CMTSOLUTION :type ntask: int :parma ntask: if provided, curtails the list of sources up to `ntask`. If None, returns all files found matching the wildcard :rtype: list :return: alphabetically ordered list of source names up to PAR.NTASK .. py:function:: getpar(key, file, delim='=', match_partial=False, comment='#', _fmt_dbl=True) Reads and returns parameters from a SPECFEM or SeisFlows parameter file Assumes the parameter file is formatted in the following way: # comment comment comment {key} {delim} VAL :type key: str :param key: case-insensitive key to match in par_file. must be EXACT match :type file: str :param file: The SPECFEM Par_file to match against :type delim: str :param delim: delimiter between parameters and values within the file. default is '=', which matches for SPECFEM2D and SPECFEM3D_Cartesian :type match_partial: bool :param match_partial: allow partial key matches, e.g., allow key='tit' to return value for 'title'. Defaults to False as this can have unintended consequences :type comment: str :param comment: character used to delimit comments in the file. Defaults to '#' for the SPECFEM Par_file, but things like the FORCESOLUTION use '!' :type _fmt_dbl: bool :param _fmt_dbl: the SPECFEM files use FORTRAN double precision notation to define floats (e.g., 2.5d0 == 2.5, or 1.2e2 == 1.2*10^2). Usually it is preferable to convert this notation directly to a Python float, so this is set True by default. However, in cases where we are doing string replacement, like when using `setpar`, we do not want to format the double precision values, so this should be set False. :rtype: tuple (str, str, int) :return: a tuple of the key, value and line number (indexed from 0). The key will match exactly how it looks in the Par_file The value will be returned as a string, regardless of its expected type IF no matches found, returns (None, None, None) .. py:function:: setpar(key, val, file, delim='=', **kwargs) Overwrites parameter value to a SPECFEM Par_file. Kwargs passed to `getpar` :type key: str :param key: case-insensitive key to match in par_file. must be EXACT match :type val: str :param val: value to OVERWRITE to the given key :type file: str :param file: The SPECFEM Par_file to match against :type delim: str :param delim: delimiter between parameters and values within the file. default is '=', which matches for SPECFEM2D and SPECFEM3D_Cartesian :type match_partial: bool :param match_partial: allow partial key matches, e.g., allow key='tit' to return value for 'title'. Defaults to False as this can have unintended consequences .. py:function:: _getidx_vel_model(lines) Get the line indices of a velocity model, which can be used to retrieve or replace the model values in a SPECFEM2D paramter file. Used by `getpar_vel_model` and `setpar_vel_model` :type lines: list :param lines: list of strings read from the par_file :rtype idxs: list :param idxs: list of integer indices of the velocity model lines .. py:function:: getpar_vel_model(file, strip=False) SPECFEM2D doesn't follow standard key = val formatting when defining its internal velocity models so we need a special function to address this specifically. Velocity models are ASSUMED to be formatted in the following way 1 1 2700.d0 3000.d0 1732.051d0 0 0 9999 9999 0 0 0 0 0 0 That is, 15 entries separated by spaces. We use that to find all relevant lines of the model. :type file: str :param file: The SPECFEM Par_file to match against :type strip: bool :param strip: strip newline ' ' from each of the model lines :rtype: list of str :return: list of all the layers of the velocity model as strings .. py:function:: setpar_vel_model(file, model) Set velocity model values in a SPECFEM2D Par_file, see getpar_vel_model for more information. Deletes the old model from the Par_file, writes the new model in the same place, and then changes the value of 'nbmodels' :type file: str :param file: The SPECFEM Par_file to match against :type model: list of str :param model: input model :rtype: list of str :return: list of all the layers of the velocity model as strings, e.g.: model = ["1 1 2700.d0 3000.d0 1732.051d0 0 0 9999 9999 0 0 0 0 0 0", "2 1 2500.d0 2700.d0 0 0 0 9999 9999 0 0 0 0 0 0"] .. py:function:: read_fortran_binary(filename) Reads Fortran-style unformatted binary data into numpy array. .. note:: The FORTRAN runtime system embeds the record boundaries in the data by inserting an INTEGER*4 byte count at the beginning and end of each unformatted sequential record during an unformatted sequential WRITE. see: https://docs.oracle.com/cd/E19957-01/805-4939/6j4m0vnc4/index.html :type filename: str :param filename: full path to the Fortran unformatted binary file to read :rtype: np.array :return: numpy array with data with data read in as type Float32 .. py:function:: write_fortran_binary(arr, filename) Writes Fortran style binary files. Data are written as single precision floating point numbers. .. note:: FORTRAN unformatted binaries are bounded by an INT*4 byte count. This function mimics that behavior by tacking on the boundary data. https://docs.oracle.com/cd/E19957-01/805-4939/6j4m0vnc4/index.html :type arr: np.array :param arr: data array to write as Fortran binary :type filename: str :param filename: full path to file that should be written in format unformatted Fortran binary