epsproc.util package

Submodules

Module contents

ePSproc utility functions.

Set of tools for assignment, sorting, normalisation and conversion.

16/03/20 Converted to submodule, mainly split out from old util.py, plus some new functions.

Imports may be buggy…

14/10/19 Added string replacement function (generic)

11/08/19 Added matEleSelector

epsproc.util.ADMdimList(sType='stacked')[source]

Return standard list of dimensions for frame definitions, from epsproc.sphCalc.setADMs().

Parameters

sType (string, optional, default = 'stacked') – Select ‘stacked’ or ‘unstacked’ dimensions. Set ‘sDict’ to return a dictionary of unstacked <> stacked dims mappings for use with xr.stack({dim mapping}).

Returns

Set of dimension labels.

Return type

list

epsproc.util.BLMdimList(sType='stacked')[source]

Return standard list of dimensions for calculated BLM.

Parameters

sType (string, optional, default = 'stacked') – Select ‘stacked’ or ‘unstacked’ dimensions. Set ‘sDict’ to return a dictionary of unstacked <> stacked dims mappings for use with xr.stack({dim mapping}).

Returns

Set of dimension labels.

Return type

list

epsproc.util.arraySort2D(a, col)[source]

Sort np.array a by specified column col. From https://thispointer.com/sorting-2d-numpy-array-by-column-or-row-in-python/
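A minimal sketch of the column-sort technique, assuming a 2D NumPy array. The helper name sort_by_column is hypothetical, and this is not necessarily the exact ePSproc implementation.

```python
import numpy as np

def sort_by_column(a, col):
    """Return the rows of 2D array `a`, ordered by the values in column `col`."""
    # argsort gives the row order that sorts the chosen column;
    # indexing with that order then rearranges whole rows.
    return a[a[:, col].argsort()]

a = np.array([[3, 10], [1, 30], [2, 20]])
sorted_a = sort_by_column(a, 0)  # rows ordered by the first column
```

Sorting on column 1 instead would order the same rows by their second entry.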

epsproc.util.conv_ev_atm(data, to='ev')[source]

Convert eV <> Hartree (atomic units)

Parameters
  • data (int, float, np.array) – Values to convert.

  • to (str, default = 'ev') –

    • ‘ev’ to convert H > eV

    • ‘H’ to convert eV > H

Returns

Data in converted units.
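The conversion above can be sketched as follows. This is an illustration only: the function name conv_ev_hartree and the constant HARTREE_TO_EV (a truncated CODATA value) are assumptions, not taken from the ePSproc source.

```python
# Assumed conversion factor: 1 Hartree in eV (CODATA value, truncated).
HARTREE_TO_EV = 27.211386

def conv_ev_hartree(data, to="ev"):
    """Convert between Hartree and eV, mirroring the `to` options described above."""
    if to == "ev":
        return data * HARTREE_TO_EV   # H > eV
    if to == "H":
        return data / HARTREE_TO_EV   # eV > H
    raise ValueError("to must be 'ev' or 'H'")

e_ev = conv_ev_hartree(0.5, to="ev")   # 0.5 Hartree expressed in eV
e_h = conv_ev_hartree(e_ev, to="H")    # and back to Hartree
```

Since only multiplication and division are used, the same sketch also works element-wise on NumPy arrays.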

epsproc.util.conv_ev_nm(data)[source]

Convert E(eV) <> lambda(nm).
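A minimal sketch of this conversion, assuming the usual relation E = hc/λ with hc ≈ 1239.841984 eV nm. The function name conv_ev_nm_sketch and the constant name are hypothetical. Because the relation is its own inverse, one function covers both directions.

```python
# Assumed value of h*c in eV nm (truncated).
HC_EV_NM = 1239.841984

def conv_ev_nm_sketch(data):
    """Convert photon energy in eV to wavelength in nm, or vice versa."""
    # E = hc / lambda and lambda = hc / E share the same form.
    return HC_EV_NM / data

energy_ev = conv_ev_nm_sketch(800.0)          # 800 nm photon energy in eV
wavelength_nm = conv_ev_nm_sketch(energy_ev)  # back to nm
```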

epsproc.util.dataGroupSel(data, dInd)[source]

epsproc.util.dataTypesList()[source]

Return a dict of allowed dataTypes, corresponding to epsproc processed data.

Each dataType lists ‘source’, ‘desc’ and ‘recordType’ fields, plus ‘def’ and ‘dims’ fields where applicable.

  • ‘source’ fields correspond to ePS functions which get or generate the data.

  • ‘desc’ gives a brief description of the dataType.

  • ‘recordType’ gives the required segment in ePS files (and associated parser). If the segment is not present in the source file, then the dataType will not be available.

  • ‘def’ provides the definition function handle (if applicable).

  • ‘dims’ lists the results from def(sType = ‘sDict’).

TODO: best choice of data structure here? Currently nested dictionary.
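The nested dictionary layout described above might look roughly like the sketch below. Every name and value here (exampleType, example_dim_list, the dimension labels, the field contents) is a hypothetical placeholder chosen for illustration, not an actual ePSproc dataType.

```python
def example_dim_list(sType="stacked"):
    """Hypothetical definition function, mimicking the *dimList() helpers."""
    stacked = ["Eke", "LM"]
    s_dict = {"LM": ["L", "M"]}  # stacked dim -> unstacked dims
    if sType == "stacked":
        return stacked
    if sType == "sDict":
        return s_dict
    # 'unstacked': expand each stacked dim into its component dims
    unstacked = []
    for dim in stacked:
        unstacked.extend(s_dict.get(dim, [dim]))
    return unstacked

# One entry per dataType, each holding the fields listed above.
data_types = {
    "exampleType": {
        "source": "hypothetical_source_function",
        "desc": "Placeholder description of the data type.",
        "recordType": "HypotheticalSegment",
        "def": example_dim_list,
        "dims": example_dim_list(sType="sDict"),
    }
}

entry = data_types["exampleType"]
unstacked_dims = entry["def"](sType="unstacked")
```

Storing the function handle under ‘def’ lets callers regenerate any of the dimension lists on demand, while ‘dims’ caches the mapping form.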

epsproc.util.eulerDimList(sType='stacked')[source]

Return standard list of dimensions for frame definitions, from epsproc.sphCalc.setPolGeoms().

Parameters

sType (string, optional, default = 'stacked') – Select ‘stacked’ or ‘unstacked’ dimensions. Set ‘sDict’ to return a dictionary of unstacked <> stacked dims mappings for use with xr.stack({dim mapping}).

Returns

Set of dimension labels.

Return type

list

epsproc.util.genLM(Lmax, allM=True)[source]

Return array of (L,M) up to supplied Lmax

If allM=False only M=0 terms will be set.

TODO: add return type options, include conversion to SHtools types.
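As an illustration of the (L,M) enumeration, here is a minimal pure-Python sketch. It assumes L runs from 0 to Lmax inclusive and M from -L to L in ascending order; the actual ordering and return type in ePSproc may differ, and the name gen_lm_sketch is hypothetical.

```python
def gen_lm_sketch(Lmax, allM=True):
    """Return a list of (L, M) tuples for L = 0..Lmax."""
    lm = []
    for L in range(Lmax + 1):
        # All M = -L..L, or only the M = 0 term when allM is False.
        m_values = range(-L, L + 1) if allM else [0]
        for M in m_values:
            lm.append((L, M))
    return lm

lm_all = gen_lm_sketch(1)               # all M terms up to L = 1
lm_m0 = gen_lm_sketch(2, allM=False)    # only M = 0 terms up to L = 2
```

With allM=True the list holds (Lmax + 1)² pairs, since each L contributes 2L + 1 values of M.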

epsproc.util.jobSummary(jobInfo=None, molInfo=None, tolConv=0.01)[source]

Print a summary of jobInfo & plot molecular structure. (Currently very basic.)

Parameters
  • jobInfo (dict, default = None) – Dictionary of job data, as generated by epsproc.IO.headerFileParse() from source ePS output file.

  • molInfo (dict, default = None) – Dictionary of molecule data, as generated by epsproc.IO.molInfoParse() from source ePS output file.

  • tolConv (float, default = 1e-2) – Used to check for convergence in ExpOrb outputs, which defines single-center expansion of orbitals.

Returns

  • JobInfo (list)

  • orbInfo (dict) – Properties of ionizing orbital, as determined from (jobInfo, molInfo).

History

20/09/20 v2 Added orbInfo dict, and use this to hold all orbital related outputs for return. May break old codes (pre v1.2.6-dev).

Moved orbInfo to a separate function.

epsproc.util.lmSymSummary(data)[source]

Display summary info data tables.

Works nicely when run directly in a notebook cell, with a Pandas formatted table, but apparently not when called from within a function.

For a more sophisticated Pandas conversion, see epsproc.util.conversion.multiDimXrToPD()

epsproc.util.matEdimList(sType='stacked')[source]

Return standard list of dimensions for matrix elements.

Parameters

sType (string, optional, default = 'stacked') – Select ‘stacked’ or ‘unstacked’ dimensions. Set ‘sDict’ to return a dictionary of unstacked <> stacked dims mappings for use with xr.stack({dim mapping}).

Returns

Set of dimension labels.

Return type

list

epsproc.util.matEleSelector(da, thres=None, inds=None, dims=None, sq=False, drop=True)[source]

Select & threshold raw matrix elements in an Xarray. Wraps Xarray.sel(), plus some additional options.

See Xarray docs for more: http://xarray.pydata.org/en/stable/user-guide/indexing.html

Parameters
  • da (Xarray) – Set of matrix elements to sub-select

  • thres (float, optional, default None) – Threshold value for abs(matElement), keep only elements > thres. This is element-wise.

  • inds (dict, optional, default None) – Dictionary of additional selection criteria, in name:value format. These correspond to parameter dimensions in the Xarray structure. E.g. inds = {'Type':'L','Cont':'A2'}. Slices are also acceptable, e.g. inds = {'Eke':slice(1,5,4)}.

  • dims (str or list of strs, optional, default None) – Dimensions to check for max & threshold. Set for dimension-wise thresholding; if set, this is used instead of element-wise thresholding. Each listed dimension is checked against the threshold by its max value, according to abs(dim.max) > threshold. This allows for consistent selection of continuous parameters over a dimension, by a threshold.

  • sq (bool, optional, default False) – Squeeze output singleton dimensions.

  • drop (bool, optional, default True) – Passed to da.where() for thresholding, drop coord labels for values below threshold.

Returns

daOut – Xarray structure of selected matrix elements. Note that NaNs are dropped if possible.

Return type

Xarray

Example

>>> daOut = matEleSelector(da, inds = {'Type':'L','Cont':'A2'})

Notes

xr.sel(inds) is used here. For single values, xr.sel({name:[value]}) and xr.sel({name:value}) behave differently: the latter automatically squeezes out the dimension. (Tested on xr v0.15.)

E.g., for selecting a single Eke value:

    da.sel({'Eke':[1.1]}) # Keeps Eke dim
    da.sel({'Eke':1.1}) # Drops Eke to non-dimension coord.
    da.sel({'Eke':1.1}, drop=True) # Drops Eke completely
    da.sel({'Eke':[1.1]}, drop=True) # Keeps Eke
    da.sel({'Eke':[1.1]}, drop=True).squeeze() # Drops Eke to non-dim coord
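The difference between element-wise thresholding (thres only) and dimension-wise thresholding (thres with dims), as described in the parameter list above, can be illustrated with a plain NumPy analogue. This is a sketch under assumptions, not the Xarray-based implementation: the array values are made up, rows stand in for one dimension (e.g. a set of matrix element labels) and columns for a continuous dimension (e.g. energy).

```python
import numpy as np

# Hypothetical data: 3 labelled terms (rows) x 3 energy points (columns).
data = np.array([[0.5,   0.005, 0.2],
                 [0.001, 0.002, 0.003],
                 [0.0,   0.9,   0.0]])
thres = 0.01

# Element-wise: every value with abs(value) <= thres is masked individually,
# so a row can end up with gaps along the continuous dimension.
elementwise = np.where(np.abs(data) > thres, data, np.nan)

# Dimension-wise: keep a whole row if its max abs value exceeds thres,
# so retained rows stay complete along the continuous dimension.
keep_rows = np.abs(data).max(axis=1) > thres
dimwise = data[keep_rows]
```

In the dimension-wise case the small value 0.005 in the first row survives, because the row as a whole passes the threshold.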

epsproc.util.multiDimXrToPD(da, colDims=None, rowDims=None, thres=None, squeeze=True, dropna=True, fillna=False, colRound=2, verbose=False)[source]

Convert multidim Xarray to stacked Pandas 2D array, (rowDims, colDims)

Parameters
  • da (Xarray) – Array for conversion.

  • colDims (list of dims for columns, default = None) –

  • rowDims (list of dims for rows, default = None) – For full control over dim stack ordering, specify both colDims and rowDims.

    NOTE: if xDim is a MultiIndex, pass as a dictionary mapping, otherwise it may be unstacked during data prep. E.g. for plotting stacked (L,M), set xDim = {'LM':['L','M']}.

  • thres (float, optional, default = None) – Threshold values in output (pd table only) TODO: generalise this and use matEleSelector() for input?

  • squeeze (bool, optional, default = True) – Drop singleton dimensions.

  • dropna (bool, optional, default = True) – Drop all NaN dimensions from output pd data frame (column-wise and row-wise).

  • fillna (bool, optional, default = False) – Fill any NaN values with 0.0. Useful for plotting/making data contiguous.

  • colRound (int, optional, default = 2) – Round column values to colRound dp. Only applied for Eke, Ehv, Euler or t dimensions.

Returns

  • daRestackpd (pandas data frame (2D) with sorted data.)

  • daRestack (Xarray with restacked data.)

Method

Restack Xarray by specified dims, including basic dims checking, then use da.to_pandas().

12/03/20 Function adapted from lmPlot() code.

Note

This might cause epsproc.lmPlot() to fail for singleton x-dimensions if squeeze = True. TODO: add work-around, see lines 114-122.

epsproc.util.orb3DCoordConv(fileIn, coordMaxLen=50)[source]

Basic coord parse & conversion for volumetric wavefunction files from ePS.

Parameters
  • fileIn (data from a single file) – List of values from a wavefunction file, as returned by epsproc.readOrb3D(). (Note this currently assumes a single file/set of values.)

  • coordMaxLen (int, optional, default=50) – Max coord grid size, assumed to demark native Cart (<coordMaxLen) from Spherical (>coordMaxLen) coords.

Returns

x,y,z

Return type

np.arrays of Cartesian coords (x,y,z)
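For grids supplied in spherical coordinates, the conversion to Cartesian arrays could be sketched as below. This is an assumption-laden illustration: it uses the physics convention (theta as polar angle from z, phi as azimuth, both in radians), which may not match the ePS file convention, and the function name sph_to_cart is hypothetical.

```python
import numpy as np

def sph_to_cart(r, theta, phi):
    """Convert spherical (r, theta, phi) to Cartesian (x, y, z)."""
    x = r * np.sin(theta) * np.cos(phi)
    y = r * np.sin(theta) * np.sin(phi)
    z = r * np.cos(theta)
    return x, y, z

# Hypothetical 1D grid axes, expanded to a full 3D grid before conversion.
r = np.array([1.0, 2.0])
theta = np.array([0.0, np.pi / 2])
phi = np.array([0.0, np.pi / 2])
R, T, P = np.meshgrid(r, theta, phi, indexing="ij")
x, y, z = sph_to_cart(R, T, P)
```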

epsproc.util.stringRepMap(string, replacements, ignore_case=False)[source]

Given a string and a replacement map, return the replaced string.

Parameters
  • string (str) – String to execute replacements on.

  • replacements (dict) – Replacement dictionary {value to find: value to replace}.

  • ignore_case (bool) – Whether the match should be case insensitive.

Return type

str

CODE from: https://gist.github.com/bgusach/a967e0587d6e01e889fd1d776c5f3729 https://stackoverflow.com/questions/6116978/how-to-replace-multiple-substrings-of-a-string … more or less verbatim.

Thanks to bgusach for the Gist.
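A sketch of a single-pass, regex-based multiple replacement in the spirit of the function described above. It is written from the documented behaviour rather than copied from the Gist, so details may differ; the name multi_replace is hypothetical.

```python
import re

def multi_replace(string, replacements, ignore_case=False):
    """Replace every key of `replacements` found in `string` with its value."""
    if not replacements:
        return string
    # Normalise keys so case-insensitive matches can be looked up again.
    normalize = (lambda s: s.lower()) if ignore_case else (lambda s: s)
    flags = re.IGNORECASE if ignore_case else 0
    lookup = {normalize(key): val for key, val in replacements.items()}
    # Try longer keys first, so "abc" takes priority over "ab".
    keys = sorted(lookup, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys), flags)
    # A single pass means replaced text is never itself re-replaced.
    return pattern.sub(lambda m: lookup[normalize(m.group(0))], string)

result = multi_replace("Hello World", {"hello": "Goodbye", "world": "Moon"},
                       ignore_case=True)
```

Because all keys are combined into one pattern, a map such as {"a": "b", "b": "a"} swaps the two characters rather than collapsing them.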