epsproc.util package
Submodules
Module contents
ePSproc utility functions.
Set of tools for assignment, sorting, normalisation and conversion.
- 16/03/20 Converted to submodule, mainly split out from old util.py, plus some new functions.
- Imports may be buggy…
- 14/10/19 Added string replacement function (generic)
- 11/08/19 Added matEleSelector
-
epsproc.util.ADMdimList(sType='stacked')
Return standard list of dimensions for frame definitions, from epsproc.sphCalc.setADMs().
Parameters: sType (string, optional, default = 'stacked') – Select 'stacked' or 'unstacked' dimensions. Set 'sDict' to return a dictionary of unstacked <> stacked dims mappings for use with xr.stack({dim mapping}).
Returns: list – set of dimension labels.
-
epsproc.util.BLMdimList(sType='stacked')
Return standard list of dimensions for calculated BLM.
Parameters: sType (string, optional, default = 'stacked') – Select 'stacked' or 'unstacked' dimensions. Set 'sDict' to return a dictionary of unstacked <> stacked dims mappings for use with xr.stack({dim mapping}).
Returns: list – set of dimension labels.
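The three sType modes shared by the *dimList() helpers can be pictured with a small stand-in function. This is only a sketch: dim_list is not part of epsproc, and the dimension names used here are hypothetical placeholders rather than the labels the package actually returns.

```python
def dim_list(s_type="stacked"):
    """Hypothetical stand-in for the *dimList() helpers (not epsproc code)."""
    # Assumed mapping: stacked dimension -> the unstacked dimensions it combines.
    s_dict = {"LM": ["l", "m"], "Euler": ["P", "T", "C"]}

    if s_type == "stacked":
        # Only the combined (stacked) dimension names.
        return list(s_dict.keys())
    if s_type == "unstacked":
        # All constituent dimension names, flattened in order.
        return [dim for dims in s_dict.values() for dim in dims]
    if s_type == "sDict":
        # The full mapping.
        return s_dict
    raise ValueError(f"Unknown sType: {s_type!r}")


stacked = dim_list("stacked")
unstacked = dim_list("unstacked")
mapping = dim_list("sDict")
```

A mapping of this shape (new dimension name to a list of existing dimensions) is the form that xarray's DataArray.stack() accepts.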
-
epsproc.util.arraySort2D(a, col)
Sort np.array a by specified column col.
From https://thispointer.com/sorting-2d-numpy-array-by-column-or-row-in-python/
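A common way to sort a 2D array by one column is to argsort that column and use the result to reorder the rows. The sketch below illustrates the idea with a hypothetical helper, sort_by_column; it is not necessarily how arraySort2D is implemented.

```python
import numpy as np


def sort_by_column(a, col):
    """Return the rows of 2D array `a` ordered by the values in column `col`."""
    # argsort gives the row order that sorts the chosen column;
    # indexing with it then reorders whole rows.
    return a[a[:, col].argsort()]


a = np.array([[3, 30], [1, 20], [2, 10]])
by_first = sort_by_column(a, 0)   # rows ordered by column 0
by_second = sort_by_column(a, 1)  # rows ordered by column 1
```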
-
epsproc.util.conv_ev_atm(data, to='ev')
Convert eV <> Hartree (atomic units)
Parameters: - data (int, float, np.array) – Values to convert.
- to (str, default = 'ev') –
- 'ev' to convert H > eV
- 'H' to convert eV > H
Returns: data in converted units.
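The conversion itself is a single multiplication or division by the Hartree energy expressed in eV. The sketch below assumes the CODATA 2018 value and uses a hypothetical function name, convert_energy; the constant and rounding used inside conv_ev_atm may differ.

```python
# Assumption: CODATA 2018 value of 1 Hartree in eV.
HARTREE_IN_EV = 27.211386245988


def convert_energy(data, to="ev"):
    """Convert between Hartree and eV (hypothetical sketch, not epsproc code)."""
    if to == "ev":
        # Hartree -> eV
        return data * HARTREE_IN_EV
    if to == "H":
        # eV -> Hartree
        return data / HARTREE_IN_EV
    raise ValueError(f"Unknown target unit: {to!r}")


one_hartree_in_ev = convert_energy(1.0, to="ev")
round_trip = convert_energy(convert_energy(0.5, to="ev"), to="H")
```

Because only arithmetic operators are used, the same function also works element-wise on NumPy arrays.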
-
epsproc.util.dataTypesList()
Return a dict of allowed dataTypes, corresponding to epsproc processed data.
Each dataType lists ‘source’, ‘desc’ and ‘recordType’ fields.
- ‘source’ fields correspond to ePS functions which get or generate the data.
- ‘desc’ brief description of the dataType.
- ‘recordType’ gives the required segment in ePS files (and associated parser). If the segment is not present in the source file, then the dataType will not be available.
- ‘def’ provides definition function handle (if applicable).
- ‘dims’ lists results from def(sType = ‘sDict’)
TODO: best choice of data structure here? Currently nested dictionary.
-
epsproc.util.eulerDimList(sType='stacked')
Return standard list of dimensions for frame definitions, from epsproc.sphCalc.setPolGeoms().
Parameters: sType (string, optional, default = 'stacked') – Select 'stacked' or 'unstacked' dimensions. Set 'sDict' to return a dictionary of unstacked <> stacked dims mappings for use with xr.stack({dim mapping}).
Returns: list – set of dimension labels.
-
epsproc.util.genLM(Lmax)
Return array of (L,M) up to supplied Lmax
TODO: add return type options, include conversion to SHtools types.
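For reference, the full set of (L, M) pairs up to a given Lmax has (Lmax + 1)^2 entries, since each L contributes M = -L, ..., +L. The sketch below generates them as a plain list of tuples with a hypothetical helper, gen_lm; the ordering and the return type of genLM itself are assumptions here.

```python
def gen_lm(l_max):
    """List all (L, M) pairs with 0 <= L <= l_max and -L <= M <= L."""
    return [(l, m) for l in range(l_max + 1) for m in range(-l, l + 1)]


lm = gen_lm(2)
```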
-
epsproc.util.jobSummary(jobInfo=None, molInfo=None, tolConv=0.01)
Print some jobInfo stuff & plot molecular structure. (Currently very basic.)
Parameters: - jobInfo (dict, default = None) – Dictionary of job data, as generated by epsproc.IO.headerFileParse() from source ePS output file.
- molInfo (dict, default = None) – Dictionary of molecule data, as generated by epsproc.IO.molInfoParse() from source ePS output file.
- tolConv (float, default = 1e-2) – Used to check for convergence in ExpOrb outputs, which define the single-center expansion of orbitals.
Returns: - JobInfo (list)
- orbInfo (dict) – Properties of ionizing orbital, as determined from (jobInfo, molInfo).
- 20/09/20 v2 Added orbInfo dict, and use this to hold all orbital related outputs for return. May break old codes (pre v1.2.6-dev).
- Moved orbInfo to a separate function.
-
epsproc.util.lmSymSummary(data)
Display summary info data tables.
Works nicely in a notebook cell, with Pandas formatted table… but not from function?
For a more sophisticated Pandas conversion, see epsproc.util.conversion.multiDimXrToPD().
-
epsproc.util.matEdimList(sType='stacked')
Return standard list of dimensions for matrix elements.
Parameters: sType (string, optional, default = 'stacked') – Select 'stacked' or 'unstacked' dimensions. Set 'sDict' to return a dictionary of unstacked <> stacked dims mappings for use with xr.stack({dim mapping}).
Returns: list – set of dimension labels.
-
epsproc.util.matEleSelector(da, thres=None, inds=None, dims=None, sq=False, drop=True)
Select & threshold raw matrix elements in an Xarray. Wraps Xarray.sel(), plus some additional options.
See Xarray docs for more: http://xarray.pydata.org/en/stable/user-guide/indexing.html
Parameters: - da (Xarray) – Set of matrix elements to sub-select
- thres (float, optional, default None) – Threshold value for abs(matElement), keep only elements > thres. This is element-wise.
- inds (dict, optional, default None) – Dictionary of additional selection criteria, in name:value format. These correspond to parameter dimensions in the Xarray structure. E.g. inds = {'Type':'L','Cont':'A2'}
- dims (str or list of strs, optional, default None) – Dimensions to check for max & threshold. Set for dimension-wise thresholding; if set, this is used instead of element-wise thresholding. The listed dimensions are checked against the threshold by max value, according to abs(dim.max) > threshold. This allows for consistent selection of continuous parameters over a dimension, by a threshold.
- sq (bool, optional, default False) – Squeeze output singleton dimensions.
- drop (bool, optional, default True) – Passed to da.where() for thresholding, drop coord labels for values below threshold.
Returns: daOut (Xarray) – Structure of selected matrix elements. Note that NaNs are dropped if possible.
Example
>>> daOut = matEleSelector(da, inds = {'Type':'L','Cont':'A2'})
Notes
xr.sel(inds) is used here. For single values, xr.sel({name:[value]}) and xr.sel({name:value}) behave differently: the latter automatically squeezes out the dim. (Tested on xr v0.15)
E.g., for selecting a single Eke value:
da.sel({'Eke':[1.1]}) # Keeps Eke dim
da.sel({'Eke':1.1}) # Drops Eke to non-dimension coord.
da.sel({'Eke':1.1}, drop=True) # Drops Eke completely
da.sel({'Eke':[1.1]}, drop=True) # Keeps Eke
da.sel({'Eke':[1.1]}, drop=True).squeeze() # Drops Eke to non-dim coord
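The difference between element-wise thresholding (thres alone) and dimension-wise thresholding (thres with dims) can be sketched with plain NumPy. This is an illustration of the idea only, using made-up data; matEleSelector itself operates on Xarray objects and its internals may differ.

```python
import numpy as np

# Toy 2D data (made up): rows stand for a discrete index (e.g. a channel),
# columns for a continuous parameter (e.g. energy).
data = np.array([
    [0.001, 0.002, 0.003],  # always below threshold
    [0.005, 0.500, 0.004],  # exceeds threshold at one point only
    [0.900, 0.800, 0.700],  # always above threshold
])
thres = 0.01

# Element-wise: each value is tested on its own; values that fail become NaN,
# so a row can end up with gaps along the continuous axis.
element_wise = np.where(np.abs(data) > thres, data, np.nan)

# Dimension-wise: take max |value| along the continuous axis, then keep or
# drop each row as a whole, so surviving rows stay contiguous.
keep_rows = np.abs(data).max(axis=1) > thres
dimension_wise = data[keep_rows]
```

In this toy case the second row keeps all three values under dimension-wise thresholding, but only its peak value under element-wise thresholding.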
-
epsproc.util.multiDimXrToPD(da, colDims=None, rowDims=None, thres=None, squeeze=True, dropna=True, fillna=False, colRound=2, verbose=False)
Convert multidim Xarray to stacked Pandas 2D array, (rowDims, colDims)
Parameters: - da (Xarray) – Array for conversion.
- colDims (list of dims for columns, default = None) –
- rowDims (list of dims for rows, default = None) –
NOTE: if xDim is a MultiIndex, pass as a dictionary mapping, otherwise it may be unstacked during data prep. For full control over dim stack ordering, specify both colDims and rowDims.
NOTE: e.g. for plotting stacked (L,M), set xDim = {'LM': …}
- thres (float, optional, default = None) – Threshold values in output (pd table only) TODO: generalise this and use matEleSelector() for input?
- squeeze (bool, optional, default = True) – Drop singleton dimensions.
- dropna (bool, optional, default = True) – Drop all-NaN dimensions from output pd data frame (column-wise and row-wise).
- fillna (bool, optional, default = False) – Fill any NaN values with 0.0. Useful for plotting/making data contiguous.
- colRound (int, optional, default = 2) – Round column values to colRound dp. Only applied for Eke, Ehv, Euler or t dimensions.
Returns: - daRestackpd (pandas data frame (2D) with sorted data.)
- daRestack (Xarray with restacked data.)
Restack Xarray by specified dims, including basic dims checking, then use da.to_pandas().
12/03/20 Function adapted from lmPlot() code.
Note
This might cause epsproc.lmPlot() to fail for singleton x-dimensions if squeeze = True. TODO: add work-around, see lines 114-122.
-
epsproc.util.orb3DCoordConv(fileIn, coordMaxLen=50)
Basic coord parse & conversion for volumetric wavefunction files from ePS.
Parameters: - fileIn (data from a single file) – List of values from a wavefunction file, as returned by epsproc.readOrb3D(). (Note this currently assumes a single file/set of values.)
- coordMaxLen (int, optional, default=50) – Max coord grid size, assumed to demarcate native Cartesian (<coordMaxLen) from spherical (>coordMaxLen) coords.
Returns: x, y, z – np.arrays of Cartesian coords (x,y,z)
-
epsproc.util.stringRepMap(string, replacements, ignore_case=False)
Given a string and a replacement map, it returns the replaced string.
Parameters: - string (str) – String to execute replacements on.
- replacements (dict) – Replacement dictionary {value to find: value to replace}.
- ignore_case (bool) – Whether the match should be case insensitive.
Return type: str
Code adapted, more or less verbatim, from:
https://gist.github.com/bgusach/a967e0587d6e01e889fd1d776c5f3729
https://stackoverflow.com/questions/6116978/how-to-replace-multiple-substrings-of-a-string
Thanks to bgusach for the Gist.
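A single-pass, map-based replacement of this kind is typically built on one compiled regular expression that alternates over all keys. The sketch below follows that general approach with a hypothetical function name, replace_map; it is written for illustration and is not copied from the Gist or from epsproc.

```python
import re


def replace_map(string, replacements, ignore_case=False):
    """Replace every key of `replacements` found in `string` by its value."""
    if not replacements:
        return string

    # With ignore_case, matched text is looked up by its lower-cased form.
    norm = (lambda s: s.lower()) if ignore_case else (lambda s: s)
    lookup = {norm(key): val for key, val in replacements.items()}

    # Longest keys first, so that e.g. "abc" wins over "ab" at the same position.
    keys = sorted(lookup, key=len, reverse=True)

    # Escape keys so they are matched literally, then join as alternatives.
    pattern = re.compile(
        "|".join(re.escape(key) for key in keys),
        re.IGNORECASE if ignore_case else 0,
    )

    # All substitutions happen in one pass, so replaced text is never re-matched.
    return pattern.sub(lambda match: lookup[norm(match.group(0))], string)


result = replace_map("ab abc", {"ab": "X", "abc": "Y"})
result_ci = replace_map("Hello WORLD", {"world": "there"}, ignore_case=True)
```

Doing all replacements in one pass avoids the ordering problems of chained str.replace() calls, where the output of one replacement could be picked up by the next.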