epymorph.tools.data
General tools for processing epymorph data.
Output
munge
munge(
    output: Output,
    geo: GeoSelection | GeoAggregation,
    time: TimeSelection | TimeAggregation,
    quantity: QuantitySelection | QuantityAggregation,
) -> DataFrame
Apply select/group/aggregate operations to an output dataframe.
This function powers many of our more specialized output-processing tools, but we expose this general utility so that its logic can be reused in other use cases.
Parameters:
- output (Output) – The result data to process.
- geo (GeoSelection | GeoAggregation) – The geo-axis strategy.
- time (TimeSelection | TimeAggregation) – The time-axis strategy.
- quantity (QuantitySelection | QuantityAggregation) – The quantity-axis strategy.
Returns:
- DataFrame – The munged result.
The result is a dataframe with columns "time", "geo", and one column per selected quantity. The values in "time" and "geo" come from the chosen aggregation for those axes. Without any group or aggregation specified, "time" contains the simulation ticks and "geo" contains the node IDs.
Raises:
- ValueError – If the axis strategies don't refer to the same RUME used to produce the Output. Generally it's safest to create the axis strategies using the methods from an output's RUME object.
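The shape of the returned dataframe can be illustrated without epymorph at all. The following is a minimal sketch in plain Python, not the library's implementation: the `rows` data, the `select_group_sum` helper, and the `"*"` group label are all hypothetical names chosen for illustration. It shows how selecting a quantity and optionally grouping an axis changes the values that appear in the "time" and "geo" columns.

```python
from collections import defaultdict

# Hypothetical long-form results: one row per (tick, node), plus quantity columns.
rows = [
    {"time": 0, "geo": "04", "S": 90, "I": 10},
    {"time": 0, "geo": "35", "S": 80, "I": 20},
    {"time": 1, "geo": "04", "S": 85, "I": 15},
    {"time": 1, "geo": "35", "S": 70, "I": 30},
]


def select_group_sum(rows, quantities, time_group=None, geo_group=None):
    """Select quantity columns, relabel each axis into groups, and sum per group.

    This is an illustrative stand-in for select/group/aggregate, not epymorph code.
    """
    # With no grouping function, each axis value is its own group.
    time_group = time_group or (lambda t: t)
    geo_group = geo_group or (lambda g: g)
    totals = defaultdict(lambda: dict.fromkeys(quantities, 0))
    for row in rows:
        key = (time_group(row["time"]), geo_group(row["geo"]))
        for q in quantities:
            totals[key][q] += row[q]
    # Emit one row per (time, geo) group, with one column per selected quantity.
    return [{"time": t, "geo": g, **vals} for (t, g), vals in sorted(totals.items())]


# No grouping: "time" is the tick and "geo" is the node ID.
ungrouped = select_group_sum(rows, ["I"])

# Grouping all nodes together: "geo" now holds the group label instead.
by_all_nodes = select_group_sum(rows, ["I"], geo_group=lambda g: "*")
```

In the ungrouped case there is one row per tick and node; in the grouped case there is one row per tick, and the "I" column holds the sum over both nodes.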
memoize_rume
memoize_rume(
    path: str | Path,
    rume: RumeT,
    *,
    refresh: bool = False,
    rng: Generator | None = None,
) -> RumeT
Cache a RUME's parameter data using a local file.
If the file doesn't exist, the RUME's parameters are evaluated and saved. If the file does exist (and we're not forcing a refresh), values are loaded from the file and the RUME's parameters are overridden to use those values.
This is intended as a utility for working with RUMEs that use data from ADRIOs that may be expensive to fetch repeatedly.
For example, when working interactively in a notebook, you may have to rerun code cells many times, which could mean spending a long time re-fetching data and/or incurring API usage costs. This function helps you avoid those costs.
Parameters:
- path (str | Path) – The path to the cache file.
- rume (RumeT) – The RUME to use.
- refresh (bool, default: False) – True to force this logic to ignore any existing cache file. Parameters will be evaluated and the cache file will be overwritten.
- rng (Generator | None, default: None) – A random number generator instance to use. Otherwise we'll use numpy's default RNG.
Returns:
- RumeT – A clone of the RUME with all parameters replaced by the fully-evaluated numpy data arrays.
Warning
This is not a full serialization of the RUME, so if you change the RUME config you should not re-use the cache file; passing refresh=True is an easy way to make this function ignore any previously saved file.
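The load-or-evaluate behavior described for this function follows a common file-cache pattern. The sketch below shows that pattern using only the Python standard library. It is an illustration under stated assumptions, not epymorph's implementation: `memoize_params`, `expensive_fetch`, and the JSON file format are hypothetical, and the real function works on a RUME's parameters rather than a plain dictionary.

```python
import json
import tempfile
from pathlib import Path


def memoize_params(path, evaluate, *, refresh=False):
    """Load values from `path` if present; otherwise evaluate and save them.

    Hypothetical helper illustrating the caching pattern only.
    """
    path = Path(path)
    if path.exists() and not refresh:
        # Cache hit: skip the expensive evaluation entirely.
        return json.loads(path.read_text())
    # Cache miss or forced refresh: evaluate, then overwrite the cache file.
    values = evaluate()
    path.write_text(json.dumps(values))
    return values


calls = []  # records how many times the expensive step actually ran


def expensive_fetch():
    """Stand-in for a slow or costly data fetch."""
    calls.append(1)
    return {"population": [100, 200, 300]}


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "params.json"
    first = memoize_params(cache_file, expensive_fetch)  # evaluates and saves
    second = memoize_params(cache_file, expensive_fetch)  # loads from the file
    third = memoize_params(cache_file, expensive_fetch, refresh=True)  # re-evaluates
```

Here the expensive step runs only for the first call and the forced refresh. The same reasoning explains the warning: the cache file knows nothing about what produced it, so after changing the RUME config the saved values are stale until `refresh=True` is passed.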