Numpy

Description

The NumPy file ADRIOs load data from .npy files (each holding a single NumPy array) or .npz files (a zipped archive of .npy files). You can optionally provide a slice to load only part of the stored array.

Numpy ADRIO

Simple NumPy Array Example

  • Import what is needed to work with NumPy arrays and the NumPy ADRIOs.
from pathlib import Path

import numpy as np

from epymorph.adrio.numpy import NPY, NPZ
from epymorph.error import DataResourceError
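  • Provide or create a .npy file to use. np.save() does not create missing directories, so make sure ./scratch exists first.
array = np.array([1, 2, 3])
Path("./scratch").mkdir(exist_ok=True)  # create the output directory if needed
np.save("./scratch/npy_test.npy", arr=array)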
  • Construct an NPY ADRIO with the path to the .npy file, then call evaluate() to load the data from that file.
adrio = NPY(Path("./scratch/npy_test.npy"))
adrio.evaluate()
array([1, 2, 3])

NPZ File Example

  • Provide or create an .npz file. np.savez() writes one or more arrays to a single archive, and each keyword argument becomes the name of the stored array.
array = np.array(
    [(7.7, 8.8, 9.9)],
    dtype=[("data", np.float64), ("data2", np.float64), ("data3", np.float64)],
)
array2 = np.array([4.4, 5.5, 6.6])
np.savez("./scratch/npz_test.npz", arr=array, arr2=array2)
  • Construct an NPZ ADRIO with the path to the .npz file and the name of the array you want to load.
NPZ(Path("./scratch/npz_test.npz"), "arr").evaluate()
array([(7.7, 8.8, 9.9)],
      dtype=[('data', '<f8'), ('data2', '<f8'), ('data3', '<f8')])
  • Pass a slice as the third argument to load only selected elements of the array.
NPZ(Path("./scratch/npz_test.npz"), "arr2", np.s_[1:3,]).evaluate()
array([5.5, 6.6])
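
To check which array names an .npz file contains before handing it to the NPZ ADRIO, the archive can be inspected with plain NumPy. The following is a minimal sketch that does not use epymorph at all; the file name example.npz and the array name "named" are made up for illustration.

```python
import tempfile
from pathlib import Path

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.npz"  # hypothetical file name
    # Keyword arguments become array names; arrays passed
    # positionally are named arr_0, arr_1, ... by np.savez.
    np.savez(path, np.array([1, 2, 3]), named=np.array([4.4, 5.5, 6.6]))

    # np.load on an .npz file returns a dict-like archive of arrays.
    with np.load(path) as archive:
        names = sorted(archive.files)
        sliced = archive["named"][1:3]

print(names)   # ['arr_0', 'named']
print(sliced)  # [5.5 6.6]
```

The names listed here are the values to pass as the second argument of NPZ, which is why the positionally saved array in the slicing examples is requested as "arr_0".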

Slicing Arrays

The following examples show different ways to slice arrays loaded from NPZ files.

Each Array Axis

# save a two-dimensional array; passed positionally, savez names it "arr_0"
array = np.array(
    [["This", "is", "an"], ["array", "of", "strings"], ["for", "testing", "."]]
)
np.savez("./scratch/npz_test2.npz", array)

# slice each axis of the array individually
NPZ(Path("./scratch/npz_test2.npz"), "arr_0", (slice(0, 2), slice(1, 3))).evaluate()
array([['is', 'an'],
       ['of', 'strings']], dtype='<U7')

Single Slice

NPZ(Path("./scratch/npz_test2.npz"), "arr_0", slice(0, None, 2)).evaluate()
array([['This', 'is', 'an'],
       ['for', 'testing', '.']], dtype='<U7')

Using np.s_ to Slice Arrays

NPZ(Path("./scratch/npz_test2.npz"), "arr_0", np.s_[0:2, ::2]).evaluate()
array([['This', 'an'],
       ['array', 'strings']], dtype='<U7')

Ellipsis Slice

NPZ(Path("./scratch/npz_test2.npz"), "arr_0", np.s_[...]).evaluate()
array([['This', 'is', 'an'],
       ['array', 'of', 'strings'],
       ['for', 'testing', '.']], dtype='<U7')

Using np.index_exp to Slice Arrays

NPZ(Path("./scratch/npz_test2.npz"), "arr_0", np.index_exp[...]).evaluate()
array([['This', 'is', 'an'],
       ['array', 'of', 'strings'],
       ['for', 'testing', '.']], dtype='<U7')
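
The slice arguments in the examples above are ordinary Python objects, so they can be built and inspected without any file. This sketch uses only NumPy to show what np.s_ and np.index_exp produce and that applying them matches regular bracket indexing. The variable names are illustrative.

```python
import numpy as np

array = np.array(
    [["This", "is", "an"], ["array", "of", "strings"], ["for", "testing", "."]]
)

# np.s_ returns the index expression exactly as written.
single = np.s_[0:None:2]  # a bare slice object: slice(0, None, 2)
multi = np.s_[0:2, ::2]   # a tuple holding one slice per axis

# np.index_exp always wraps the result in a tuple, even for one index.
wrapped = np.index_exp[0:None:2]  # (slice(0, None, 2),)

print(single == slice(0, None, 2))                    # True
print(isinstance(wrapped, tuple))                     # True
print(np.array_equal(array[multi], array[0:2, ::2]))  # True
print(np.array_equal(array[single], array[wrapped]))  # True
```

Either helper can be used to write a slice in the familiar bracket syntax; the only difference is whether a single index comes back bare or wrapped in a one-element tuple.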

Error Examples

Below are examples of inputs that cause the ADRIO to raise an error instead of returning data. This can happen when the slice has more indices than the array has dimensions, or when a file of the wrong type is given to the call.

# ERROR: too many indices
try:
    adrio = NPZ(
        Path("./scratch/npz_test2.npz"),
        "arr_0",
        np.index_exp[1:3, 1:3, 1:3],
    )
    adrio.evaluate()
except DataResourceError as e:
    print(repr(e))
DataResourceError('Specified array slice is invalid for the shape of this data.')
# ERROR: wrong file type
try:
    adrio = NPY(Path("./scratch/npz_test2.npz"))
except DataResourceError as e:
    print(repr(e))
DataResourceError('Incorrect file type. Only .npy files can be loaded through NPY ADRIOs.')
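
For comparison, the same kind of over-indexed slice can be tried on an in-memory array with plain NumPy, where it raises IndexError. The sketch below is only an illustration of that underlying behavior: try_slice is a hypothetical helper, not part of epymorph, and how the ADRIO turns such failures into DataResourceError is not shown here.

```python
import numpy as np

array = np.arange(9).reshape(3, 3)  # a two-dimensional array


def try_slice(data, index):
    """Hypothetical helper: apply an index, returning (result, error message)."""
    try:
        return data[index], None
    except IndexError as e:
        return None, str(e)


# Three indices on a two-dimensional array: NumPy raises IndexError.
result, error = try_slice(array, np.index_exp[1:3, 1:3, 1:3])
print(result is None)  # True
print(error)

# Two indices match the array's two dimensions, so this succeeds.
ok, ok_error = try_slice(array, np.s_[0:2, ::2])
print(ok.shape)  # (2, 2)
```

Checking a slice against a small test array like this can be a quick way to validate it before using it with a large data file.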