Data structures - IO demo

28/06/22

This notebook outlines various IO methods for reading and writing data. For a general data structures overview, see the Data structures intro doc.

Note that this page details the functional forms; for class usage see the base class intro page. However, as of June 2022, file writers are not fully implemented for the data class.

Load libraries

[1]:
from pathlib import Path

import epsproc as ep

# Set data path
# Note this is set here from ep.__path__, but may not be correct in all cases - depends on where the Github repo is.
epDemoDataPath = Path(ep.__path__[0]).parent/'data'
OMP: Info #273: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
* sparse not found, sparse matrix forms not available.
* natsort not found, some sorting functions not available.
* Setting plotter defaults with epsproc.basicPlotters.setPlotters(). Run directly to modify, or change options in local env.
* Set Holoviews with bokeh.
* pyevtk not found, VTK export not available.

Loading ePolyScat data

To start with, load data from an ePolyScat output file. This will load matrix elements & cross-sections; for more details see the basic demo page and the ePolyScat basics page.

[2]:
# Load data from modPath\data
dataPath = Path(epDemoDataPath, 'photoionization')
dataFile = Path(dataPath, 'n2_3sg_0.1-50.1eV_A2.inp.out')  # Set for sample N2 data for testing

# Scan data file
dataSet = ep.readMatEle(fileIn = dataFile.as_posix())
data = dataSet[0]
# dataXS = ep.readMatEle(fileIn = dataFile.as_posix(), recordType = 'CrossSection')  # XS info currently not set in NO2 sample file.
*** ePSproc readMatEle(): scanning files for DumpIdy segments.

*** Scanning file(s)
['/home/jovyan/github/epsproc/data/photoionization/n2_3sg_0.1-50.1eV_A2.inp.out']

*** FileListSort
  Prefix: /home/jovyan/github/epsproc/data/photoionization/n2_3sg_0.1-50.1eV_A2.inp.out
  1 groups.

*** Reading ePS output file:  /home/jovyan/github/epsproc/data/photoionization/n2_3sg_0.1-50.1eV_A2.inp.out
*** IO.fileParse() found 1 segments with
        Start: ScatEng
        End: ['#'].
Expecting 51 energy points.
*** IO.fileParse() found 2 segments with
        Start: ScatSym
        End: ['FileName', '\n'].
Expecting 2 symmetries.
*** IO.fileParse() found 102 segments with
        Start: DumpIdy - dump
        End: ['+ Command', 'Time Now'].
Found 102 dumpIdy segments (sets of matrix elements).

Processing segments to Xarrays...
Processed 102 sets of DumpIdy file segments, (0 blank)
[3]:
# All data is pushed to Xarrays
data
[3]:
<xarray.DataArray 'n2_3sg_0.1-50.1eV_A2.inp.out' (LM: 18, Eke: 51, Sym: 2,
                                                  mu: 3, it: 1, Type: 2)>
array([[[[[[           nan          +nanj,
                       nan          +nanj]],

          [[           nan          +nanj,
                       nan          +nanj]],

          [[           nan          +nanj,
                       nan          +nanj]]],


         [[[           nan          +nanj,
                       nan          +nanj]],

          [[           nan          +nanj,
                       nan          +nanj]],

          [[-1.7757076e+00+6.3474768e-01j,
            -1.9403462e+00+6.9465999e-01j]]]],


...


        [[[[           nan          +nanj,
                       nan          +nanj]],

          [[           nan          +nanj,
                       nan          +nanj]],

          [[           nan          +nanj,
                       nan          +nanj]]],


         [[[ 8.9213389e-06-4.6971505e-06j,
                       nan          +nanj]],

          [[           nan          +nanj,
                       nan          +nanj]],

          [[           nan          +nanj,
                       nan          +nanj]]]]]])
Coordinates:
  * LM       (LM) MultiIndex
  - l        (LM) int64 1 1 1 3 3 3 5 5 5 7 7 7 9 9 9 11 11 11
  - m        (LM) int64 -1 0 1 -1 0 1 -1 0 1 -1 0 1 -1 0 1 -1 0 1
  * mu       (mu) int64 -1 0 1
  * Type     (Type) <U1 'L' 'V'
  * it       (it) int64 1
  * Sym      (Sym) MultiIndex
  - Cont     (Sym) object 'SU' 'PU'
  - Targ     (Sym) object 'SG' 'SG'
  - Total    (Sym) object 'SU' 'PU'
  * Eke      (Eke) float64 0.1 1.1 2.1 3.1 4.1 5.1 ... 46.1 47.1 48.1 49.1 50.1
    Ehv      (Eke) float64 15.68 16.68 17.68 18.68 ... 62.68 63.68 64.68 65.68
    SF       (Eke) complex128 (2.1560627+3.741674j) ... (4.4127053+1.8281945j)
Attributes:
    dataType:  matE
    file:      n2_3sg_0.1-50.1eV_A2.inp.out
    fileBase:  /home/jovyan/github/epsproc/data/photoionization
    fileList:  n2_3sg_0.1-50.1eV_A2.inp.out

Xarray methods

Xarrays can be written to disk and read back with ep.IO.writeXarray() and ep.IO.readXarray().

These wrap various backends, including Xarray’s netCDF writer, and implement some additional complex number and dim handling for ePSproc cases (depending on the backend). In general, the file writers aim to preserve all dimensions, including stacking, if possible.

For other low-level data IO options, see the Xarray documentation. Higher level routines are currently in development for ePSproc & PEMtk, see further notes at https://github.com/phockett/ePSproc/issues/8 and https://github.com/phockett/PEMtk/issues/6.

Complex number and attribute handling

Note that native Xarray methods have some limitations…

  1. Issues with complex data, which is not supported by netCDF - either convert to Re+Im format, or use the h5netcdf backend with forceComplex=True as a workaround (but note this must also be set at file read).

  2. Issues with nested dict attribs.

  3. Issues with MultiIndex coords, esp. in non-dim coords.

With h5netcdf and the invalid_netcdf option, issue (1) is OK (see the Xarray docs for details), although the data still needs to be unstacked, and ‘to_dataset’ may also be needed for more control, otherwise arbitrarily-named items can appear in the file (if the DataArray name is missing).
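As a minimal sketch of the Re+Im split workaround for point (1), using only native Xarray and numpy (generic values, not the demo data, and omitting any backend specifics):

```python
import numpy as np
import xarray as xr

# Complex-valued test array
da = xr.DataArray(np.array([1 + 2j, 3 - 4j]), dims='x', name='demo')

# Split to netCDF-safe float components before writing...
ds = xr.Dataset({'Re': da.real, 'Im': da.imag})
# ds.to_netcdf(...)  # ...write with any standard engine...

# ...and recombine after reading back
daRecon = ds['Re'] + 1j * ds['Im']
assert daRecon.equals(da)
```

This is essentially what the default writeXarray case (forceComplex = False) does on the backend, plus attribute and dim handling.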

For general attrib handling, ep.IO.sanitizeAttrsNetCDF() attempts to clean this up at file IO if an exception is raised, although this may be lossy in some cases.

For non-Xarray file types, e.g. HDF5, some additional custom handling is included, mainly via dictionary conversion routines, which are discussed below.

TODO:

NetCDF methods

For Xarray’s netCDF writer, valid engines (handlers/libraries/formats) are h5netcdf (preferred), netcdf4 and scipy. The default case uses engine = 'h5netcdf' and forceComplex = False; the output file will include separate real and imaginary components and a flat (unstacked) representation of the array.

TODO: test for multiple Xarray per file.

[4]:
dataPath = Path(epDemoDataPath, 'photoionization','fileIOtests')
dataFile = Path(dataPath, 'n2_3sg_0.1-50.1eV_A2')  # Set for sample N2 data for testing

ep.IO.writeXarray(data, fileName = dataFile.as_posix(), filePath = dataPath.as_posix())   # Default case set as: engine = 'h5netcdf', forceComplex = False
['Written to h5netcdf format', '/home/jovyan/github/epsproc/data/photoionization/fileIOtests/n2_3sg_0.1-50.1eV_A2.nc']
[4]:
['Written to h5netcdf format',
 '/home/jovyan/github/epsproc/data/photoionization/fileIOtests/n2_3sg_0.1-50.1eV_A2.nc']
[5]:
dataIn = ep.IO.readXarray(fileName = dataFile.as_posix() + '.nc', filePath = dataPath.as_posix())  #, forceComplex=forceComplex, forceArray=False)
[6]:
dataIn  # Note dims are restacked, although ordering may change
[6]:
<xarray.DataArray (Eke: 51, mu: 3, it: 1, Type: 2, Sym: 4, LM: 18)>
array([[[[[[           nan          +nanj,
                       nan          +nanj,
            -1.7757076e+00+6.3474768e-01j, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
                       nan          +nanj, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
                       nan          +nanj, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
...
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
                       nan          +nanj, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
                       nan          +nanj, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
                       nan          +nanj, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj]]]]]])
Coordinates:
    Ehv      (Eke) float64 15.68 16.68 17.68 18.68 ... 62.68 63.68 64.68 65.68
  * Eke      (Eke) float64 0.1 1.1 2.1 3.1 4.1 5.1 ... 46.1 47.1 48.1 49.1 50.1
  * Type     (Type) object 'L' 'V'
  * it       (it) int64 1
  * mu       (mu) int64 -1 0 1
    SF       (Eke) complex128 (2.1560627+3.741674j) ... (4.4127053+1.8281945j)
  * Sym      (Sym) MultiIndex
  - Cont     (Sym) object 'PU' 'PU' 'SU' 'SU'
  - Targ     (Sym) object 'SG' 'SG' 'SG' 'SG'
  - Total    (Sym) object 'PU' 'SU' 'PU' 'SU'
  * LM       (LM) MultiIndex
  - l        (LM) int64 1 1 1 3 3 3 5 5 5 7 7 7 9 9 9 11 11 11
  - m        (LM) int64 -1 0 1 -1 0 1 -1 0 1 -1 0 1 -1 0 1 -1 0 1
Attributes:
    dataType:  matE
    file:      n2_3sg_0.1-50.1eV_A2.inp.out
    fileBase:  /home/jovyan/github/epsproc/data/photoionization
    fileList:  n2_3sg_0.1-50.1eV_A2.inp.out
[7]:
# Testing for equality with the original currently returns false - dim ordering might be the issue here?
dataIn.equals(data)
[7]:
False
[8]:
# Subtraction indicates identical data however.
(dataIn - data).max()
[8]:
<xarray.DataArray ()>
array(0.+0.j)

Low-level Xarray routines

The Xarray native routines can be used directly, although they may require additional args and/or data reformatting.

Note that the netCDF writer always writes to Dataset format, although there are readers for both Datasets and DataArrays. Other available methods include Pickle, Zarr…

[10]:
# Read file with base open_dataset
import xarray as xr
dataIn = xr.open_dataset(dataFile.as_posix() + '.nc', engine = 'h5netcdf')
dataIn
[10]:
<xarray.Dataset>
Dimensions:  (Cont: 2, Eke: 51, mu: 3, it: 1, Type: 2, l: 6, m: 3, Targ: 1,
              Total: 2)
Coordinates:
  * Cont     (Cont) object 'PU' 'SU'
    Ehv      (Eke) float64 15.68 16.68 17.68 18.68 ... 62.68 63.68 64.68 65.68
  * Eke      (Eke) float64 0.1 1.1 2.1 3.1 4.1 5.1 ... 46.1 47.1 48.1 49.1 50.1
    SFi      (Eke) float64 3.742 3.628 3.524 3.428 ... 1.871 1.857 1.842 1.828
    SFr      (Eke) float64 2.156 2.224 2.289 2.353 ... 4.311 4.345 4.379 4.413
  * Targ     (Targ) object 'SG'
  * Total    (Total) object 'PU' 'SU'
  * Type     (Type) object 'L' 'V'
  * it       (it) int64 1
  * l        (l) int64 1 3 5 7 9 11
  * m        (m) int64 -1 0 1
  * mu       (mu) int64 -1 0 1
Data variables:
    Im       (Eke, mu, it, Type, l, m, Cont, Targ, Total) float64 nan ... nan
    Re       (Eke, mu, it, Type, l, m, Cont, Targ, Total) float64 nan ... nan
Attributes:
    complex:   split
    dataType:  matE
    file:      n2_3sg_0.1-50.1eV_A2.inp.out
    fileBase:  /home/jovyan/github/epsproc/data/photoionization
    fileList:  n2_3sg_0.1-50.1eV_A2.inp.out
[11]:
# Read file with base open_dataarray - this may fail for datasets
dataIn = xr.open_dataarray(dataFile.as_posix() + '.nc', engine = 'h5netcdf')
dataIn
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [11], in <cell line: 2>()
      1 # Read file with base open_dataarray - this may fail for datasets
----> 2 dataIn = xr.open_dataarray(dataFile.as_posix() + '.nc', engine = 'h5netcdf')
      3 dataIn

File /opt/conda/lib/python3.9/site-packages/xarray/backends/api.py:670, in open_dataarray(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, backend_kwargs, *args, **kwargs)
    652 dataset = open_dataset(
    653     filename_or_obj,
    654     decode_cf=decode_cf,
   (...)
    666     **kwargs,
    667 )
    669 if len(dataset.data_vars) != 1:
--> 670     raise ValueError(
    671         "Given file dataset contains more than one data "
    672         "variable. Please read with xarray.open_dataset and "
    673         "then select the variable you want."
    674     )
    675 else:
    676     (data_array,) = dataset.data_vars.values()

ValueError: Given file dataset contains more than one data variable. Please read with xarray.open_dataset and then select the variable you want.
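As the error message suggests, multi-variable files should be opened as a Dataset and the variable(s) selected manually. A minimal, self-contained sketch of the pattern (using a temporary file and the scipy netCDF3 backend with generic values, not the demo file above):

```python
import os
import tempfile

import numpy as np
import xarray as xr

# Write a split Re/Im dataset (as produced by the default writer case)
ds = xr.Dataset({'Re': ('x', np.array([1.0, 3.0])),
                 'Im': ('x', np.array([2.0, -4.0]))})
fOut = os.path.join(tempfile.mkdtemp(), 'split_demo.nc')
ds.to_netcdf(fOut, engine='scipy')

# Read back as a Dataset, then select & recombine the variables
dsIn = xr.open_dataset(fOut, engine='scipy')
dataC = dsIn['Re'] + 1j * dsIn['Im']
```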

HDF5

HDF5 support is available if the h5py library is present, and is wrapped for Xarray use with the routines in epsproc.ioBackends.hdf5IO. These handle most python native and numpy data types. The routines also include string wrappers for unsupported data types (and/or they can be purged for saving). For more details see the dictionary conversion options section below (the Xarray objects are serialized to dictionaries prior to file IO).

The main benefits of HDF5 (vs. netCDF) are broader type handling (esp. complex data) and cross-compatibility with other languages. String wrapping is currently applied to all fields except the numerical data, ensuring handling of nested dictionaries. (See source for writeXarrayToHDF5 and sanitizeAttrsNetCDF.)

NOTE: string unwrapping is done via ast.literal_eval(), which supports: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None. This should be safe in general (it cannot evaluate arbitrary code), but this behaviour can be controlled with the evalStrings option at file read if required.
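The scope of ast.literal_eval() can be illustrated directly (generic values, not the demo file contents):

```python
from ast import literal_eval

# String-wrapped attributes unwrap to native types...
wrapped = "{'dataType': 'matE', 'Eke': [0.1, 1.1, 2.1], 'it': 1}"
attrs = literal_eval(wrapped)
assert attrs['Eke'] == [0.1, 1.1, 2.1]

# ...but arbitrary code is rejected (raises ValueError)
try:
    literal_eval("__import__('os').getcwd()")
except ValueError:
    print('Unsafe string rejected')
```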

TODO:

  • more options here, e.g. nested dict unpacking to HDF5 as well as string wrap.

  • fix XR name, which currently appears as a byte string.

  • Test multiple items per file.

  • Test IO with, e.g., Matlab.

Other libs/tools for this to be investigated…

[12]:
dataPath = Path(epDemoDataPath, 'photoionization','fileIOtests')
dataFile = Path(dataPath, 'n2_3sg_0.1-50.1eV_A2.h5')  # Set for sample N2 data for testing

ep.IO.writeXarray(data, fileName = dataFile.as_posix(), filePath = dataPath.as_posix(), engine = 'hdf5')   # Default case set as: engine = 'h5netcdf', forceComplex = False
[12]:
PosixPath('/home/jovyan/github/epsproc/data/photoionization/fileIOtests/n2_3sg_0.1-50.1eV_A2.h5')
[13]:
dataPath = Path(epDemoDataPath, 'photoionization','fileIOtests')
dataFile = Path(dataPath, 'n2_3sg_0.1-50.1eV_A2.h5')  # Set for sample N2 data for testing

# HDF5 file reader, note this returns both dict and Xarray formats
dataInDict, dataInXR = ep.IO.readXarray(fileName = dataFile.as_posix(), filePath = dataPath.as_posix(), engine = 'hdf5')
*** Opened file /home/jovyan/github/epsproc/data/photoionization/fileIOtests/n2_3sg_0.1-50.1eV_A2.h5
Found top-level group(s) ['matE']
Reading group matE
[14]:
# Reconstructed array. Note this also now includes self.attrs['dimMaps'], which was added by the file writer to track dimensions.
dataInXR
[14]:
<xarray.DataArray 'n2_3sg_0.1-50.1eV_A2.inp.out' (Eke: 51, mu: 3, it: 1,
                                                  Type: 2, Sym: 2, LM: 18)>
array([[[[[[           nan          +nanj,
                       nan          +nanj,
            -1.7757076e+00+6.3474768e-01j, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
                       nan          +nanj, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj]],

          [[           nan          +nanj,
                       nan          +nanj,
            -1.9403462e+00+6.9465999e-01j, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
...
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
                       nan          +nanj, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj]],

          [[-4.1606509e-01-6.4490113e-01j,
                       nan          +nanj,
                       nan          +nanj, ...,
             5.3667894e-06-2.4842580e-06j,
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
                       nan          +nanj, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj]]]]]])
Coordinates:
  * mu       (mu) int64 -1 0 1
  * Type     (Type) <U1 'L' 'V'
  * it       (it) int64 1
  * Eke      (Eke) float64 0.1 1.1 2.1 3.1 4.1 5.1 ... 46.1 47.1 48.1 49.1 50.1
    Ehv      (Eke) float64 15.68 16.68 17.68 18.68 ... 62.68 63.68 64.68 65.68
    SF       (Eke) complex128 (2.1560627+3.741674j) ... (4.4127053+1.8281945j)
  * Sym      (Sym) MultiIndex
  - Cont     (Sym) object 'PU' 'SU'
  - Targ     (Sym) object 'SG' 'SG'
  - Total    (Sym) object 'PU' 'SU'
  * LM       (LM) MultiIndex
  - l        (LM) int64 1 1 1 3 3 3 5 5 5 7 7 7 9 9 9 11 11 11
  - m        (LM) int64 -1 0 1 -1 0 1 -1 0 1 -1 0 1 -1 0 1 -1 0 1
Attributes:
    dataType:  matE
    file:      n2_3sg_0.1-50.1eV_A2.inp.out
    fileBase:  /home/jovyan/github/epsproc/data/photoionization
    fileList:  n2_3sg_0.1-50.1eV_A2.inp.out
    dimMaps:   {'dataDims': ('LM', 'Eke', 'Sym', 'mu', 'it', 'Type'), 'dataDi...
[15]:
# data.attrs['dimMaps'] = dataInXR.attrs['dimMaps']
print(dataInXR.equals(data))  # False...
(dataInXR - data).max()   # But data seems OK
False
[15]:
<xarray.DataArray 'n2_3sg_0.1-50.1eV_A2.inp.out' ()>
array(0.+0.j)
[16]:
# Additional data can be appended to the file
# Note that the top-level group defaults to data.dataType, so a different name may be required for this
ep.IO.writeXarray(data, fileName = dataFile.as_posix(), filePath = dataPath.as_posix(), engine = 'hdf5', dataName = 'matE2')
[16]:
PosixPath('/home/jovyan/github/epsproc/data/photoionization/fileIOtests/n2_3sg_0.1-50.1eV_A2.h5')

Low-level HDF5 IO

Native HDF5 file IO can also be used (the files should be readable by any standard HDF5 tools); the file contains a dictionary-like representation of the data structure. For more details on h5py see the docs.

[17]:
import h5py

dataPath = Path(epDemoDataPath, 'photoionization','fileIOtests')
dataFile = Path(dataPath, 'n2_3sg_0.1-50.1eV_A2.h5')

# Open file with h5py
fIn = Path(dataPath,dataFile)
hf = h5py.File(fIn.as_posix(), 'r')
[18]:
# Top-level groups
hf.keys()
[18]:
<KeysViewHDF5 ['matE', 'matE2']>
[19]:
# Group items
hf['matE'].keys()
[19]:
<KeysViewHDF5 ['attrs', 'coords', 'data', 'dims', 'name']>
[20]:
# Get data from a subgroup
hf['matE']['name'][()]
[20]:
b'n2_3sg_0.1-50.1eV_A2.inp.out'
[21]:
# Get data from a subgroup
hf['matE']['dims'][()]
[21]:
b"('Eke', 'mu', 'it', 'Type', 'l', 'm', 'Cont', 'Targ', 'Total')"
[22]:
# Close file stream
hf.close()

Numpy and Pandas

Conversion to other classes can be used to access their native IO routines. Note, however, that this may be lossy, depending on the type of object converted.

Pandas

Convert to a Pandas DataFrame with ep.multiDimXrToPD(), then use any standard Pandas IO functionality.

In general this should work for all cases (including complex data), but may lose metadata/attributes.

For general data sharing, to_hdf and to_csv methods are quite useful. Note Pandas to_hdf uses PyTables on the backend.

TODO: wrapper for attributes & restacking? See PEMtk methods.

[23]:
# Convert to Pandas dataframe & save to HDF5
dataFile = Path(dataPath, 'n2_3sg_0.1-50.1eV_A2_pd.h5')
pdDF, *_ = ep.multiDimXrToPD(data, colDims = 'Eke')

# Pandas .to_hdf - note this also needs a key set, which will be used as the top-level node in the output file.
# Similarly to the above HDF5 case, multiple datasets can be written with different group names (keys).
pdDF.to_hdf(dataFile, key = 'N2')
pdDF.to_hdf(dataFile, key = 'N2v2')

# Pandas to_csv
pdDF.to_csv(dataFile.with_suffix('.csv'))
[24]:
# Read back
import pandas as pd

pdIn = pd.read_hdf(dataFile, key = 'N2')
[25]:
# Convert back to Xarray if desired
import xarray as xr

xrNew = xr.DataArray(pdIn)  # This will produce a flat array
xrRecon, *_ = ep.util.misc.restack(xrNew, refDims = 'matE')   # Restack to specified dataType
xrRecon
[25]:
<xarray.DataArray (Eke: 51, Type: 2, it: 1, mu: 3, Sym: 4, LM: 18)>
array([[[[[[           nan          +nanj,
                       nan          +nanj,
            -1.7757076e+00+6.3474768e-01j, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
                       nan          +nanj, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
                       nan          +nanj, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
...
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
                       nan          +nanj, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
                       nan          +nanj, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
                       nan          +nanj, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj]]]]]])
Coordinates:
  * Eke      (Eke) float64 0.1 1.1 2.1 3.1 4.1 5.1 ... 46.1 47.1 48.1 49.1 50.1
  * Type     (Type) object 'L' 'V'
  * it       (it) int64 1
  * mu       (mu) int64 -1 0 1
  * Sym      (Sym) MultiIndex
  - Cont     (Sym) object 'PU' 'PU' 'SU' 'SU'
  - Targ     (Sym) object 'SG' 'SG' 'SG' 'SG'
  - Total    (Sym) object 'PU' 'SU' 'PU' 'SU'
  * LM       (LM) MultiIndex
  - l        (LM) int64 1 1 1 3 3 3 5 5 5 7 7 7 9 9 9 11 11 11
  - m        (LM) int64 -1 0 1 -1 0 1 -1 0 1 -1 0 1 -1 0 1 -1 0 1
[26]:
print(xrRecon.equals(data))  # False...
(xrRecon - data).max()   # But data seems OK
False
[26]:
<xarray.DataArray ()>
array(0.+0.j)
[27]:
# Check file with h5py
hf = h5py.File(dataFile.as_posix(), 'r')
# Top-level groups
print(hf.keys())
hf.close()
<KeysViewHDF5 ['N2', 'N2v2']>

Numpy

Raw data can be saved directly as numpy arrays; see the numpy docs for options and methods. Note, however, that no metadata will be saved in this case, just the raw arrays.

TODO: wrappers for coords, and metadata/side-car files?
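One possible side-car pattern (hypothetical, not an ePSproc routine): save the raw array with numpy, plus a JSON file of dims/coords/attrs alongside, assuming JSON-serializable coordinate values.

```python
import json
import os
import tempfile

import numpy as np

# Raw array plus metadata to keep alongside it
arr = np.arange(6.0).reshape(2, 3)
meta = {'dims': ['x', 'y'],
        'coords': {'x': [0, 1], 'y': [0.1, 1.1, 2.1]},
        'attrs': {'dataType': 'demo'}}

tmpDir = tempfile.mkdtemp()
np.save(os.path.join(tmpDir, 'demo.npy'), arr)
with open(os.path.join(tmpDir, 'demo.json'), 'w') as f:
    json.dump(meta, f)

# Read back both parts
arrIn = np.load(os.path.join(tmpDir, 'demo.npy'))
with open(os.path.join(tmpDir, 'demo.json')) as f:
    metaIn = json.load(f)
```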

[28]:
import numpy as np

dataFile = Path(dataPath, 'n2_3sg_0.1-50.1eV_A2.npy')

# For a Numpy ND array from Xarray, use self.to_numpy() or self.data
# Note this is the main array contents only, coords and attrs are discarded
np.save(dataFile, data.to_numpy())
[29]:
# Load and check data
npIn = np.load(dataFile)
print(npIn.dtype, npIn.shape)
(data-npIn).max()
complex128 (18, 51, 2, 3, 1, 2)
[29]:
<xarray.DataArray 'n2_3sg_0.1-50.1eV_A2.inp.out' ()>
array(0.+0.j)

Dictionary methods

For more general/basic IO, Xarray objects can be serialized to native types (this is, in fact, already done on the backend for the netCDF and HDF5 file types).

Some general routines are provided for this, which wrap Xarray’s native .to_dict() method with some additional functionality. In particular full serialization of all coordinates is performed (including non-dimensional coords), and some additional wrappers/converters are implemented for problematic data types.

TODO: better wrapping/implementation of sanitizeAttrsNetCDF() routine, should be generalised.
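The native Xarray round trip these routines build on can be sketched as follows (generic values; note this does not handle MultiIndex coords, which is part of the extra functionality the ePSproc wrappers add):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(3.0), dims='x',
                  coords={'x': [0.1, 1.1, 2.1]},
                  attrs={'dataType': 'demo'}, name='demo')

d = da.to_dict()                     # Serialize to a nested dict of native types
daRecon = xr.DataArray.from_dict(d)  # Rebuild from the dict
assert daRecon.identical(da)
```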

[30]:
# Full dictionary conversion with deconstructDims
dataDict = ep.util.misc.deconstructDims(data, returnType='dict')

# Deconstruction + split complex data types to Re + Im components
dataDictSplit = ep.util.misc.deconstructDims(data, returnType='dict', splitComplex=True)

# Deconstruction + split complex + clean up attrs for safe types only
dataCleanAttrs, fullAttrs, log = ep.util.xrIO.sanitizeAttrsNetCDF(data)
dataCleanDictSplit = ep.util.misc.deconstructDims(dataCleanAttrs, returnType='dict', splitComplex=True)
[31]:
# Convert back to XR including restacking dimensions
xrRecon = ep.util.misc.reconstructDims(dataDict)
xrRecon
[31]:
<xarray.DataArray 'n2_3sg_0.1-50.1eV_A2.inp.out' (Eke: 51, mu: 3, it: 1,
                                                  Type: 2, Sym: 2, LM: 18)>
array([[[[[[           nan          +nanj,
                       nan          +nanj,
            -1.7757076e+00+6.3474768e-01j, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
                       nan          +nanj, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj]],

          [[           nan          +nanj,
                       nan          +nanj,
            -1.9403462e+00+6.9465999e-01j, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
...
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
                       nan          +nanj, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj]],

          [[-4.1606509e-01-6.4490113e-01j,
                       nan          +nanj,
                       nan          +nanj, ...,
             5.3667894e-06-2.4842580e-06j,
                       nan          +nanj,
                       nan          +nanj],
           [           nan          +nanj,
                       nan          +nanj,
                       nan          +nanj, ...,
                       nan          +nanj,
                       nan          +nanj,
                       nan          +nanj]]]]]])
Coordinates:
  * mu       (mu) int64 -1 0 1
  * Type     (Type) <U1 'L' 'V'
  * it       (it) int64 1
  * Eke      (Eke) float64 0.1 1.1 2.1 3.1 4.1 5.1 ... 46.1 47.1 48.1 49.1 50.1
    Ehv      (Eke) float64 15.68 16.68 17.68 18.68 ... 62.68 63.68 64.68 65.68
    SF       (Eke) complex128 (2.1560627+3.741674j) ... (4.4127053+1.8281945j)
  * Sym      (Sym) MultiIndex
  - Cont     (Sym) object 'PU' 'SU'
  - Targ     (Sym) object 'SG' 'SG'
  - Total    (Sym) object 'PU' 'SU'
  * LM       (LM) MultiIndex
  - l        (LM) int64 1 1 1 3 3 3 5 5 5 7 7 7 9 9 9 11 11 11
  - m        (LM) int64 -1 0 1 -1 0 1 -1 0 1 -1 0 1 -1 0 1 -1 0 1
Attributes:
    dataType:  matE
    file:      n2_3sg_0.1-50.1eV_A2.inp.out
    fileBase:  /home/jovyan/github/epsproc/data/photoionization
    fileList:  n2_3sg_0.1-50.1eV_A2.inp.out
    dimMaps:   {'dataDims': ('LM', 'Eke', 'Sym', 'mu', 'it', 'Type'), 'dataDi...
[32]:
# Dictionaries can be handled as usual, e.g. dump to Pickle, JSON

import json

# JSON dump to file example
with open(Path(dataPath, 'n2_3sg_0.1-50.1eV_A2_dict.json'), 'w') as f:

    # json.dumps(dataDict)  # Fails for complex - not supported by JSON

    # OK with split complex to re + im format
    # Note this may still fail in some cases if any unsupported data types are in data.attrs.
    try:
        json.dump(dataDictSplit, f)
        print('JSON OK')
    except:
        print('JSON failed, trying with clean attrs')
        json.dump(dataCleanDictSplit, f)   # Try with cleaned-up attrs
        print('JSON with cleaned attrs OK')

# General JSON complex handling, see notes elsewhere...?
# E.g. https://stackoverflow.com/questions/27909658/json-encoder-and-decoder-for-complex-numpy-arrays
JSON OK
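For reference, a minimal custom encoder/decoder pair for complex values, following one common approach from the discussion linked above (the 're'/'im' key names here are arbitrary):

```python
import json

class ComplexEncoder(json.JSONEncoder):
    """Encode complex values as {'re': ..., 'im': ...} dicts."""
    def default(self, obj):
        if isinstance(obj, complex):
            return {'re': obj.real, 'im': obj.imag}
        return super().default(obj)

def asComplex(d):
    """Object hook to rebuild complex values on decode."""
    if set(d) == {'re', 'im'}:
        return complex(d['re'], d['im'])
    return d

s = json.dumps({'SF': 2.156 + 3.742j}, cls=ComplexEncoder)
dataIn = json.loads(s, object_hook=asComplex)
assert dataIn['SF'] == 2.156 + 3.742j
```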
[33]:
# The sanitizeAttrsNetCDF log notes any changes made
log
[33]:
{}

Pickle

As usual, Pickle can be used as a general method, but the usual caveats apply: it requires the relevant data structures and classes to be available at read-in, so can break with library versions, and should not be regarded as archival.

Note, however, that Pickle is the fastest way to save many objects, including complex class objects containing multiple dataarrays. (Other methods to be implemented soon.)

[34]:
import pickle

# Save a dictionary
with open(Path(dataPath, 'n2_3sg_0.1-50.1eV_A2_dict.pickle'), 'wb') as handle:
    pickle.dump(dataDict, handle, protocol=pickle.HIGHEST_PROTOCOL)


# Save an Xarray
with open(Path(dataPath, 'n2_3sg_0.1-50.1eV_A2_XR.pickle'), 'wb') as handle:
    pickle.dump(data, handle, protocol=pickle.HIGHEST_PROTOCOL)
[35]:
# Read back in to test
with open(Path(dataPath, 'n2_3sg_0.1-50.1eV_A2_dict.pickle'), 'rb') as handle:
    dataDictPklIn = pickle.load(handle)

print(dataDict==dataDictPklIn)  # This is False...
data.equals(ep.util.misc.reconstructDims(dataDictPklIn))  # Also False
(ep.util.misc.reconstructDims(dataDictPklIn) - data).max()   # But data looks OK. Dim ordering and/or attrs are different?
False
[35]:
<xarray.DataArray 'n2_3sg_0.1-50.1eV_A2.inp.out' ()>
array(0.+0.j)
[36]:
# Read back in to test
with open(Path(dataPath, 'n2_3sg_0.1-50.1eV_A2_XR.pickle'), 'rb') as handle:
    dataDictPklIn = pickle.load(handle)

print(data.equals(dataDictPklIn))  # This is True
(dataDictPklIn - data).max()   # And data looks OK.
True
[36]:
<xarray.DataArray 'n2_3sg_0.1-50.1eV_A2.inp.out' ()>
array(0.+0.j)

Versions

[37]:
import scooby
scooby.Report(additional=['epsproc', 'holoviews', 'hvplot', 'xarray', 'matplotlib', 'bokeh'])
[37]:
Wed Jun 29 18:27:52 2022 UTC
OS Linux CPU(s) 32 Machine x86_64 Architecture 64bit
RAM 50.1 GiB Environment Jupyter
Python 3.9.10 | packaged by conda-forge | (main, Feb 1 2022, 21:24:11) [GCC 9.4.0]
epsproc 1.3.2-dev holoviews 1.14.8 hvplot 0.8.0 xarray 2022.3.0
matplotlib 3.5.1 bokeh 2.4.2 numpy 1.21.5 scipy 1.8.0
IPython 8.1.1 scooby 0.5.12
[38]:
# Check current Git commit for local ePSproc version
from pathlib import Path
!git -C {Path(ep.__file__).parent} branch
!git -C {Path(ep.__file__).parent} log --format="%H" -n 1
* dev
  master
  numba-tests
  pkgUpdates
ede98a5ad1ab5240d31f30e7a1eebab43b2da810
[39]:
# Check current remote commits
!git ls-remote --heads https://github.com/phockett/ePSproc
ede98a5ad1ab5240d31f30e7a1eebab43b2da810        refs/heads/dev
54b929025381f1c2cbf371180fa870f247e7f627        refs/heads/master
69cd89ce5bc0ad6d465a4bd8df6fba15d3fd1aee        refs/heads/numba-tests
ea30878c842f09d525fbf39fa269fa2302a13b57        refs/heads/revert-9-master