Numpy

Description

The NumpyFile ADRIO loads data from a .npy file (a single NumPy array) or a .npz file (a zipped archive of .npy files). Optionally, you can provide a slice to extract and load only part of the array in the file.

Examples

These examples create temporary numpy files using the create_temp_file function; in typical usage you would already have a suitable file on disk to load.

.npy file

.npy files contain a single numpy array (stored uncompressed), so we can simply load the file as is.

import numpy as np

from epymorph.kit import *
from epymorph.adrio.numpy import NumpyFile

npy_path = create_temp_file(
    np.array([1, 2, 3])
)

NumpyFile(
    file_path=npy_path,
    shape=Shapes.N,
    dtype=np.int64,
).with_context(
    scope=CustomScope(["A", "B", "C"]),
).evaluate()

array([1, 2, 3])
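
For reference, here is a minimal sketch of how such a .npy file could be produced and read back with plain NumPy, outside of epymorph. The file name population.npy and the temporary directory are assumptions made for illustration; in practice you would save to wherever your data lives.

```python
import tempfile
from pathlib import Path

import numpy as np

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, used only for this illustration.
    npy_file = Path(tmp_dir) / "population.npy"

    # np.save writes exactly one array per .npy file.
    np.save(npy_file, np.array([1, 2, 3], dtype=np.int64))

    # np.load reads the whole array back into memory.
    loaded = np.load(npy_file)

print(loaded)        # [1 2 3]
print(loaded.dtype)  # int64
```

A file written this way holds the same kind of data that the NumpyFile example above reads, with the dtype matching the one passed to the ADRIO.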

.npz file

.npz files are like a dictionary of named numpy arrays stored in a zip archive, so we have to specify by name which array we want to load. One ADRIO can only return one array. (You can of course use multiple ADRIOs on the same file if needed.)

# You can load any dtype numpy supports, like this structured array:
fancy_dtype = np.dtype([
    ("val1", np.float64),
    ("val2", np.float64),
    ("val3", np.str_, 3),
])

npz_path = create_temp_file({
    "some_values": np.array([4.4, 5.5, 6.6]),
    "other_values": np.array(
        [
            (1.1, 2.2, "abc"),
            (3.3, 4.4, "def"),
            (5.5, 6.6, "ghi"),
        ],
        dtype=fancy_dtype,
    ),
})

NumpyFile(
    file_path=npz_path,
    shape=Shapes.N,
    dtype=fancy_dtype,
    array_name="other_values",
).with_context(
    scope=CustomScope(["A", "B", "C"]),
).evaluate()

array([(1.1, 2.2, 'abc'), (3.3, 4.4, 'def'), (5.5, 6.6, 'ghi')],
      dtype=[('val1', '<f8'), ('val2', '<f8'), ('val3', '<U3')])
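
To see why an array name is needed, it can help to look at a .npz archive with plain NumPy. The sketch below is independent of epymorph; the file name values.npz and the array contents are assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path

import numpy as np

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, used only for this illustration.
    npz_file = Path(tmp_dir) / "values.npz"

    # Each keyword argument becomes a named array inside the archive.
    np.savez(
        npz_file,
        some_values=np.array([4.4, 5.5, 6.6]),
        other_values=np.array([1, 2, 3]),
    )

    # np.load returns a dictionary-like object for .npz files.
    with np.load(npz_file) as archive:
        names = sorted(archive.files)
        some_values = archive["some_values"]

print(names)        # ['other_values', 'some_values']
print(some_values)  # [4.4 5.5 6.6]
```

The names listed in archive.files are the values you would pass as array_name.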

Slicing

If you have a larger data file and only need part of it, you can use slicing to subset the array instead of pre-processing the file first.

# Say we want to simulate for nodes D,E,F
scope = CustomScope(["D", "E", "F"])

# But we have data for other nodes, too
npy_path = create_temp_file(
    # nodes:  A  B  C  D  E  F  G  H  I
    np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
)

NumpyFile(
    file_path=npy_path,
    shape=Shapes.N,
    dtype=np.int64,
    array_slice=np.s_[3:6], # this slice gets us the data we want
).with_context(
    scope=scope,
).evaluate()

array([4, 5, 6])

np.s_ is a handy utility for creating slices using array indexing syntax, but if you prefer Python slice objects those work too: slice(3, 6) in this case.
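
The equivalence between np.s_ and slice objects can be checked directly in plain NumPy. This is a small illustrative sketch; the two-dimensional grid array is an assumption added to show what a multi-axis slice looks like.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# np.s_ simply builds the slice object that indexing syntax would produce.
assert np.s_[3:6] == slice(3, 6)

# Both forms select the same elements.
subset = data[np.s_[3:6]]
assert np.array_equal(subset, data[slice(3, 6)])
print(subset)  # [4 5 6]

# For multi-dimensional data, np.s_ yields a tuple with one slice per axis.
grid = np.arange(12).reshape(3, 4)
assert np.s_[:, 1:3] == (slice(None), slice(1, 3))
print(grid[np.s_[:, 1:3]].shape)  # (3, 2)
```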

Common Errors

Here are some common errors you might run into.

# Error: file not found
try:
    NumpyFile(
        file_path="./this-is-not-a-real-file.npy", # <--
        shape=Shapes.N,
        dtype=np.int64,
    ).with_context(scope=scope).evaluate()
except Exception as e:
    print(f"{type(e).__name__}: {e}")

ADRIOProcessingError: Error processing epymorph.adrio.numpy.NumpyFile: Error loading file.

# Error: file suffix is not .npy or .npz
try:
    NumpyFile(
        file_path="./file.txt", # <--
        shape=Shapes.N,
        dtype=np.int64,
    ).with_context(scope=scope).evaluate()
except Exception as e:
    print(f"{type(e).__name__}: {e}")

ValueError: This ADRIO supports .npz or .npy files only.
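
If file paths come from user input or configuration, you may want to check the suffix before constructing the ADRIO. The helper below is a hypothetical sketch, not part of epymorph, and it assumes an exact, case-sensitive match on the two suffixes named in the error message.

```python
from pathlib import Path

# Hypothetical helper for illustration; not part of epymorph.
SUPPORTED_SUFFIXES = {".npy", ".npz"}


def has_numpy_suffix(file_path):
    """Return True if the path ends in .npy or .npz."""
    return Path(file_path).suffix in SUPPORTED_SUFFIXES


print(has_numpy_suffix("./file.txt"))  # False
print(has_numpy_suffix("./data.npz"))  # True
```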

# Error: .npz file but incorrect array_name
try:
    NumpyFile(
        file_path=npz_path, # the .npz file created in the example above
        shape=Shapes.N,
        dtype=np.int64,
        array_name="not_in_the_npz", # <--
    ).with_context(scope=scope).evaluate()
except Exception as e:
    print(f"{type(e).__name__}: {e}")

# Error: invalid array slice
try:
    NumpyFile(
        file_path=npy_path,
        shape=Shapes.N,
        dtype=np.int64,
        array_slice=np.s_[3:6, 2:7], # <-- too many axes!
    ).with_context(scope=scope).evaluate()
except Exception as e:
    print(f"{type(e).__name__}: {e}")

ADRIOProcessingError: Error processing epymorph.adrio.numpy.NumpyFile: Specified array slice is invalid for the shape of this data.
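
The underlying problem can be reproduced with plain NumPy: indexing a one-dimensional array with a two-axis slice fails. The sketch below shows that failure and a simple pre-check comparing the number of axes in the slice to the array's dimensions. How NumpyFile detects and reports this internally is not shown here, so treat the pre-check as an assumption for illustration only.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])  # one-dimensional
bad_slice = np.s_[3:6, 2:7]                   # two axes requested

# A multi-axis slice is a tuple; a single-axis slice is a lone slice object.
slice_axes = len(bad_slice) if isinstance(bad_slice, tuple) else 1
print(slice_axes, data.ndim)  # 2 1

try:
    data[bad_slice]
    raised = False
except IndexError:
    # NumPy rejects indexing with more axes than the array has.
    raised = True

print(raised)  # True
```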