epsproc.classes.multiJob module

Core classes for ePSproc data to handle multiple job filesets (energy and/or orbitals etc.)

13/10/20 V2 Consolidating with ePSbase class, some functionality moved there.

23/09/20 Added ePSmultiJob class, currently very rough.

14/09/20 v1 Started class development. See ePSproc_multijob_class_dev_140920_bemo.ipynb

class epsproc.classes.multiJob.ePSmultiJob(fileBase=None, jobDirs=None, jobStructure=None, prefix=None, ext='.out', Edp=1, verbose=1)

Bases: epsproc.classes.base.ePSbase

Class for handling multiple ePS jobs/datasets.

  • Read datasets from a single dir, or set of dirs.
  • Sort to dictionaries and Xarray datasets (needs some work).
  • Comparative plotting (in development).
Parameters:
  • fileBase (str or Path object, default = None) – Base directory to scan for ePS files, subdirs will also be searched. This is required unless jobDirs is set instead.
  • jobDirs (list of str or Path objects, default= None) – List of dirs containing ePS output files to read, subdirs will NOT be searched.
  • jobStructure (str, optional, default = None) – This will be set automatically by self.scanDirs(), but can also be passed to override. Values “subDirs” or “rootDir”, in former case multiple files will be stacked by Eke.
  • prefix (str, optional, default = None) – Set prefix string for file checks (cf. wfPlot class). NOT YET USED.
  • ext (str, optional, default = '.out') – Set default file extension for dir scanning. This should match the file extension for ePolyScat output files.
  • Edp (int, optional, default = 1) – Set default number of decimal places (dp) for Ehv conversion. May want to set this elsewhere instead… maybe just for plotting? TODO: also consider axis reindex, lookups and interp functions here - useful for differences between datasets.
  • verbose (int, optional, default = 1) – Set verbosity level for printing/error checking. Not yet fully implemented, but, generally:
      - 0, no printed output.
      - 1, basic printed info.
      - 2, print all info, including subfunction outputs.
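
As a rough illustration of why a fixed number of decimal places matters for the Ehv conversion: photon energy is the photoelectron kinetic energy plus the ionization potential, and rounding the result keeps energy axes from different jobs on a comparable grid. The sketch below uses plain Python lists; the function name `eke_to_ehv`, the `ip` argument and the numerical values are hypothetical, and the rounding behaviour is an assumption about how Edp is applied rather than the ePSproc implementation.

```python
def eke_to_ehv(eke, ip, edp=1):
    """Convert kinetic energies (eV) to photon energies (eV), rounded to edp places.

    Hypothetical helper for illustration only; not part of ePSproc.
    """
    return [round(e + ip, edp) for e in eke]


# Two jobs with different (made-up) ionization potentials: after rounding,
# photon energies that nearly coincide land on the same axis value.
ehv_a = eke_to_ehv([0.1, 1.1, 2.1], ip=12.34, edp=1)
ehv_b = eke_to_ehv([0.5, 1.5], ip=12.93, edp=1)

# Energies present in both jobs, usable for direct comparison.
shared = sorted(set(ehv_a) & set(ehv_b))
```

Without the rounding step the two axes would share no values, so any difference between the datasets would first need a reindex or interpolation, as the TODO in the Edp entry suggests.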

TODO:

  • verbosity levels, subtract for subfunctions? Or use a dict to handle multiple levels?
    UPDATE: now handled in base class with verbose[‘main’] and verbose[‘sub’].
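
The `verbose['main']` / `verbose['sub']` arrangement mentioned above can be pictured roughly as follows. This is a minimal sketch of the idea, assuming the sub-function level sits one step below the main level; the helper name `split_verbosity` is hypothetical and the actual base-class logic may differ.

```python
def split_verbosity(verbose=1):
    """Expand an integer verbosity into separate main and sub-function levels.

    Hypothetical helper; assumes sub-functions print one level less than
    the main routines, never dropping below 0.
    """
    if isinstance(verbose, dict):
        # Already split: fill in any missing entry with a quiet default.
        return {'main': verbose.get('main', 0), 'sub': verbose.get('sub', 0)}
    return {'main': verbose, 'sub': max(verbose - 1, 0)}


levels = split_verbosity(2)
if levels['main'] > 0:
    print("Scanning job directories...")
```

With this mapping, level 1 prints basic info from the main routines only, while level 2 also lets sub-functions print, which matches the levels listed for the verbose parameter.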
jobLabel(key=None, lString=None, append=True)

Reset or append text to jobLabel. Very basic.

TODO: consistency over [m], jobLabel vs. orbLabel rationalisation.

jobsSummary()

Print some general info.

TODO: add some info!

scanDirs()

Scan dir structure for folders containing ePS files.

Compatibility… this assumes one of two dir structures:

  • Old structure, with multiple job output files per dir (but not per E range).
  • New structure, with multiple output files per E range, subdirs per job/orb.

This is set in jobStructure variable, as “rootDir” or “subDirs” respectively.
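
The distinction between the two layouts can be sketched with `pathlib`. This is an assumption-laden illustration, not the ePSproc implementation: the function name `detect_job_structure` and its return format are hypothetical, and the check used here (output files directly in the base dir means "rootDir") is only one plausible way to make the decision.

```python
from pathlib import Path
import tempfile


def detect_job_structure(file_base, ext='.out'):
    """Guess the directory layout and list the dirs that hold output files.

    Hypothetical helper. Returns ('rootDir', [base]) if files with the
    given extension sit directly in file_base, otherwise ('subDirs', dirs)
    for the subdirectories that contain such files.
    """
    base = Path(file_base)
    if any(base.glob(f'*{ext}')):
        return 'rootDir', [base]
    job_dirs = sorted({p.parent for p in base.rglob(f'*{ext}')})
    return 'subDirs', job_dirs


# Build a throwaway "new structure" tree: one subdir per orbital,
# one output file per energy range.
with tempfile.TemporaryDirectory() as tmp:
    for orb in ('orb1', 'orb2'):
        job_dir = Path(tmp) / orb
        job_dir.mkdir()
        for e_range in ('E1', 'E2'):
            (job_dir / f'job_{e_range}.out').touch()
    structure, job_dirs = detect_job_structure(tmp)
    dir_names = [d.name for d in job_dirs]
```

Here `structure` comes out as "subDirs" and each listed directory would then be treated as one job, with its files stacked by energy.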

scanFiles(keys=None, outputKeyType='dir')

Scan ePS output files from dir list.

Adapted from https://phockett.github.io/ePSdata/XeF2-preliminary/XeF2_multi-orb_comparisons_270320-dist.html

Currently outputting complicated dataSets dictionary/list - need to sort this out!

  • Entry per dir scanned.
  • Sub-entries per file, but collapsed in some cases.
  • Sending to Xarray dataset with orb labels should be cleaner.

Parameters:
  • keys (int or list, optional, defaults to all jobDirs) – Set keys (dirs) to use by index. Default = enumerate(self.jobs[‘jobDirs’])
  • outputKeyType (str, optional, default = ‘dir’) – Types as supported by super().scanFiles():
      - ‘orb’: Use orbital labels as dataset keys.
      - ‘int’: Use integer labels as dataset keys (will be ordered by file read).
      - ‘dir’: Use for cases where all files from a directory should be stacked, e.g. multiple files for various energies.

Any other setting will result in key = keyType, which can be used to explicitly pass a key (e.g. in multijob wrapper case). This should be tidied up. Note that setting ‘dir’ for cases with multiple different jobs in a dir will result in some outputs being overwritten.
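
The key options, and the overwrite caveat for ‘dir’, can be illustrated with a small stand-alone sketch. The helper `dataset_key` and the tuple layout below are hypothetical and only mirror the behaviour described above; they are not the scanFiles() code.

```python
def dataset_key(output_key_type, index, orb_label, dir_path):
    """Choose a dictionary key for one file according to output_key_type.

    Hypothetical helper mirroring the documented options.
    """
    if output_key_type == 'orb':
        return orb_label
    if output_key_type == 'int':
        return index
    if output_key_type == 'dir':
        return dir_path
    # Any other value is used directly as the key.
    return output_key_type


# Two different jobs that happen to live in the same directory:
# (file index, orbital label, directory).
files = [(0, 'orb1', 'jobs/mol'), (1, 'orb2', 'jobs/mol')]

by_orb = {dataset_key('orb', i, orb, d): f'data for {orb}' for i, orb, d in files}
by_dir = {dataset_key('dir', i, orb, d): f'data for {orb}' for i, orb, d in files}
```

In this toy case `by_orb` keeps both jobs, while `by_dir` ends up with a single entry because both files map to the same key, which is the kind of collision the note on ‘dir’ warns about.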

TODO:

  • Flatten data structure to remove unnecessary nesting.
  • Fix keys to apply to single dir case (above should also fix this).
  • convert outputs to Xarray dataset. Did this before, but currently missing file (on AntonJr)! CHECK BACKUPS - NOPE.
  • Confirm HV scaling - may be better to redo this, rather than correct existing values?
  • Fix xr.dataset: currently aligns data, so will set much to NaN if, e.g., different symmetries etc.

Change to structure as ds(‘XS’,’matE’) per orb, rather than ds(‘XS’) and ds(‘matE’) for all orbs? This should also be in line with hypothetical base dataclass, which will be per orb by defn. UPDATE 16/10/20: now handled in base class, use flat dict with entry per job or dir (for E-stacked case).

14/10/20 v2 Now using ePSbase() class for functionality.
DATASTRUCTURE IS NOW FLAT
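
To make the change from nested to flat concrete, the sketch below collapses a per-dir, per-file layout into one entry per job with files stacked along Eke. It is only an illustration under stated assumptions: `stack_by_eke` is a hypothetical helper, the numbers are made up, and plain lists of (Eke, value) tuples stand in for the Xarray objects used in practice.

```python
def stack_by_eke(file_results):
    """Merge per-file lists of (Eke, value) points into one list sorted by Eke.

    Hypothetical helper; plain tuples keep the sketch self-contained.
    """
    merged = [point for result in file_results for point in result]
    return sorted(merged, key=lambda point: point[0])


# Nested layout: one entry per dir, sub-entries per file.
nested = {
    'orb1': [[(1.0, 0.5), (2.0, 0.7)], [(3.0, 0.9)]],
    'orb2': [[(2.0, 0.2)], [(1.0, 0.1)]],
}

# Flat layout: one entry per job/dir, files stacked along Eke.
flat = {key: {'XS': stack_by_eke(files)} for key, files in nested.items()}
```

Each top-level key in `flat` then maps straight to that job's data, with no per-file level left to index through.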