epymorph.adrio.acs5
ACS5Year
module-attribute
ACS5Year = Literal[
2009,
2010,
2011,
2012,
2013,
2014,
2015,
2016,
2017,
2018,
2019,
2020,
2021,
2022,
2023,
]
A supported ACS5 data year.
ACS5_YEARS
module-attribute
ACS5_YEARS: Sequence[ACS5Year] = (
2009,
2010,
2011,
2012,
2013,
2014,
2015,
2016,
2017,
2018,
2019,
2020,
2021,
2022,
2023,
)
All supported ACS5 data years.
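A Literal type is only checked by static type checkers, so a year that arrives at runtime (from user input, for example) still needs an explicit membership check. The following is a minimal sketch of that pattern, assuming a shortened year list; `SketchYear`, `SKETCH_YEARS`, and `is_supported_year` are hypothetical names and are not part of this module.

```python
from typing import Literal, Sequence, get_args

# Illustrative stand-ins with a shortened year list; not the module's own names.
SketchYear = Literal[2021, 2022, 2023]

# get_args returns the literal's members as a tuple, in definition order.
SKETCH_YEARS: Sequence[int] = get_args(SketchYear)


def is_supported_year(year: int) -> bool:
    """Return True if `year` is one of the literal's members."""
    return year in SKETCH_YEARS
```

With the real module, the same check could be written against ACS5_YEARS directly.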
RaceCategory
module-attribute
RaceCategory = Literal[
"White",
"Black",
"Native",
"Asian",
"Pacific Islander",
"Other",
"Multiple",
]
A racial category defined by ACS5.
ACS5Client
Methods for interacting with the Census API for ACS5 data. Typical usage will not need to use this class, but it's provided for advanced cases.
url
staticmethod
get_vars
cached
staticmethod
Loads (and caches) ACS5 variable metadata. This metadata is published by the Census alongside the data for each year.
Parameters:
-
year
(int
) –The ACS5 data vintage year.
Returns:
get_group_vars
cached
staticmethod
Retrieves the variables metadata for a specific group of variables.
This is equivalent to calling get_vars
and then filtering to the
variables in the group.
Parameters:
Returns:
get_group_var_names
cached
staticmethod
make_queries
staticmethod
make_queries(scope: CensusScope) -> list[dict[str, str]]
Creates one or more Census API query predicates for the given scope. These may involve the "for" and "in" request parameters. Depending on your scope and the limitations of the API, multiple queries may be required, especially when your scope represents a disjoint spatial selection or one that otherwise can't be neatly expressed in a form like "all counties within state X".
Parameters:
-
scope
(CensusScope
) –The geo scope for which to make a query.
Returns:
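To illustrate the kind of predicates described above, here is a rough sketch of "all counties within state X" expressed with the Census API's "for" and "in" parameters, one query per state. The helper `sketch_county_queries` is hypothetical and the exact dictionaries returned by make_queries may differ; this only shows the general shape.

```python
def sketch_county_queries(state_fips: list[str]) -> list[dict[str, str]]:
    """Build one predicate per state, each selecting every county in that state.

    This is an illustration of the "for"/"in" form only; it is not the
    library's implementation.
    """
    return [{"for": "county:*", "in": f"state:{fips}"} for fips in state_fips]


# Two states cannot be expressed as a single "in" clause here, so two queries result.
queries = sketch_county_queries(["04", "35"])
```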
fetch
staticmethod
fetch(
scope: CensusScope,
variables: list[str],
value_dtype: type[generic],
report_progress: Callable[[float], None] | None = None,
) -> DataFrame
Requests variables from the Census API for the given scope.
Parameters:
-
scope
(CensusScope
) –The geo scope to query.
-
variables
(list[str]
) –The list of variables to query.
-
value_dtype
(type[generic]
) –The dtype of the result array.
-
report_progress
(Callable[[float], None] | None
, default:None
) –A callback for reporting query progress; especially useful when the scope necessitates multiple queries.
Returns:
-
DataFrame
–A dataframe in "long" format, with columns: geoid, variable, and value. Geoid and variable are strings and value will be converted to the given dtype.
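The "long" format places one (geoid, variable, value) triple per row. As a rough illustration of how such rows relate to a per-location view, the sketch below pivots made-up rows into a nested dictionary using only the standard library. The values are invented for the example and `to_wide` is a hypothetical helper, not part of this class.

```python
from collections import defaultdict

# Made-up rows in the long format described above: (geoid, variable, value).
rows = [
    ("04013", "B01001_001E", 100),
    ("04013", "B01001_002E", 49),
    ("04019", "B01001_001E", 60),
    ("04019", "B01001_002E", 31),
]


def to_wide(long_rows):
    """Group long-format rows into {geoid: {variable: value}}."""
    wide = defaultdict(dict)
    for geoid, variable, value in long_rows:
        wide[geoid][variable] = value
    return dict(wide)


wide = to_wide(rows)
```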
Population
Population(
*,
fix_insufficient_data: FixLikeInt = False,
fix_missing: FillLikeInt = False,
)
Bases: _ACS5FetchMixin
, FetchADRIO[int64, int64]
Loads population data from the US Census ACS 5-Year Data (variable B01001_001E). ACS5 data is compiled from surveys taken during a rolling five-year period, and as such the values are estimates.
Data is available using CensusScope
geographies, from StateScope
down to
BlockGroupScope
(aggregates are computed by the Census Bureau). Data is loaded
according to the scope's year, from 2009 to 2023.
The result is an N-shaped array of integers.
Parameters:
-
fix_insufficient_data
(FixLikeInt
, default:False
) –The method to use to replace values that could not be computed due to an insufficient number of sample observations (-666666666 in the data).
-
fix_missing
(FillLikeInt
, default:False
) –The method to use to fix missing values.
See Also
The ACS 5-Year documentation from the US Census.
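The sentinel value mentioned above marks entries the Census could not compute. The options accepted by FixLikeInt are not described in this section, so the following is only a sketch of the simplest possible strategy, constant replacement, written in plain Python; `replace_insufficient` is a hypothetical helper.

```python
INSUFFICIENT_DATA = -666666666  # sentinel value noted in the parameter description


def replace_insufficient(values: list[int], replacement: int) -> list[int]:
    """Return a copy of `values` with every sentinel entry replaced."""
    return [replacement if v == INSUFFICIENT_DATA else v for v in values]


cleaned = replace_insufficient([1200, INSUFFICIENT_DATA, 87], replacement=0)
```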
result_format
property
result_format: ResultFormat
Information about the expected format of the ADRIO's resulting data.
validate_result
Validates that the result of evaluating the ADRIO adheres to the expected result format.
Parameters:
-
context
(Context
) –The context in which the result has been evaluated.
-
result
(NDArray[ResultT]
) –The result produced by the ADRIO.
Raises:
-
ADRIOProcessingError
–If the result is invalid, indicating the processing logic has a bug.
PopulationByAgeTable
PopulationByAgeTable(
*,
fix_insufficient_data: FixLikeInt = False,
fix_missing: FillLikeInt = False,
)
Bases: _ACS5FetchMixin
, FetchADRIO[int64, int64]
Loads a table of population categorized by Census-defined age brackets from the
US Census ACS 5-Year Data (group B01001). This table is most useful as the source
data for one or more PopulationByAge
ADRIOs, which know how to select, group,
and aggregate the data for simulations. ACS5 data is compiled from surveys taken
during a rolling five-year period, and as such the values are estimates.
Data is available using CensusScope
geographies, from StateScope
down to
BlockGroupScope
(aggregates are computed by the Census Bureau). Data is loaded
according to the scope's year, from 2009 to 2023.
The result is an NxA-shaped array of integers where A is the number of variables included in the table. For example, in 2023 there are 49 variables: 23 age brackets for male, 23 age brackets for female, the male all-ages total, the female all-ages total, and a grand total.
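The column count in that example can be checked with a line of arithmetic. The sketch below simply restates the 2023 breakdown given above; `table_shape` is a hypothetical helper used to show the resulting NxA shape.

```python
AGE_BRACKETS_PER_SEX = 23
SEXES = ("male", "female")

bracket_columns = AGE_BRACKETS_PER_SEX * len(SEXES)  # one column per sex and bracket
per_sex_totals = len(SEXES)  # the male and female all-ages totals
grand_total = 1
num_variables = bracket_columns + per_sex_totals + grand_total


def table_shape(num_nodes: int) -> tuple[int, int]:
    """Shape of the NxA result for a scope with `num_nodes` locations."""
    return (num_nodes, num_variables)
```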
Parameters:
-
fix_insufficient_data
(FixLikeInt
, default:False
) –The method to use to replace values that could not be computed due to an insufficient number of sample observations (-666666666 in the data).
-
fix_missing
(FillLikeInt
, default:False
) –The method to use to fix missing values.
See Also
The ACS 5-Year documentation from the US Census, and an example of this table for 2023.
result_format
property
result_format: ResultFormat
Information about the expected format of the ADRIO's resulting data.
validate_result
Validates that the result of evaluating the ADRIO adheres to the expected result format.
Parameters:
-
context
(Context
) –The context in which the result has been evaluated.
-
result
(NDArray[ResultT]
) –The result produced by the ADRIO.
Raises:
-
ADRIOProcessingError
–If the result is invalid, indicating the processing logic has a bug.
AgeRange
Bases: NamedTuple
Models an age range for use with ACS age-categorized data.
Unlike Python integer ranges, the end
of this range is inclusive.
end
can also be None which models the "and over" part of ranges
like "85 years and over".
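The inclusive-end and open-ended semantics can be sketched with a small stand-in class. Note the assumptions: `SketchAgeRange`, its `start` field, and its `includes_age` method are hypothetical and written for illustration; they are not the attributes or methods of AgeRange itself.

```python
from typing import NamedTuple, Optional


class SketchAgeRange(NamedTuple):
    """Illustrative stand-in: `end` is inclusive, and None means "and over"."""

    start: int
    end: Optional[int]

    def includes_age(self, age: int) -> bool:
        """Return True if a single age falls within this range."""
        if age < self.start:
            return False
        return self.end is None or age <= self.end


twenties = SketchAgeRange(20, 24)
oldest = SketchAgeRange(85, None)
```

By contrast, Python's range(20, 24) stops at 23, which is the difference the description above points out.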
end
instance-attribute
end: int | None
The oldest age included in the range, or None to indicate an unbounded range.
contains
PopulationByAge
Bases: _ACS5Mixin
, ADRIO[int64, int64]
Processes a population-by-age table to extract the population of a specified age
bracket, as limited by the age brackets defined by the US Census ACS 5-Year Data
(group B01001). This ADRIO does not fetch data on its own, but requires you to
provide another attribute named "population_by_age_table" for it to parse.
Most often, this will be provided by a PopulationByAgeTable
instance.
This allows the table to be reused in case you need to calculate more than one
population bracket (as is common in a multi-strata model).
The result is an N-shaped array of integers.
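Conceptually, extracting one age bracket from the table means picking the columns whose brackets fall inside the requested range and summing them for each location. The sketch below shows that idea on a tiny made-up table; the column layout in `COLUMNS` and the helper `population_in_range` are hypothetical and far simpler than the real B01001 table.

```python
# Hypothetical column layout for a tiny table: (sex, youngest age, oldest age).
COLUMNS = [
    ("male", 25, 29),
    ("male", 30, 34),
    ("female", 25, 29),
    ("female", 30, 34),
]


def population_in_range(table: list[list[int]], start: int, end: int) -> list[int]:
    """Sum, per row, every column whose bracket lies within [start, end]."""
    keep = [i for i, (_, lo, hi) in enumerate(COLUMNS) if lo >= start and hi <= end]
    return [sum(row[i] for i in keep) for row in table]


# Two locations (rows), four columns matching COLUMNS; values are invented.
table = [[10, 20, 11, 21], [1, 2, 3, 4]]
```

Because the table is passed in rather than fetched, the same table can serve several calls with different ranges, which mirrors the reuse described above.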
Parameters:
-
age_range_start
(int
) –The youngest age to include in the age bracket.
-
age_range_end
(int | None
) –The oldest age to include in the age bracket, or None to indicate an unbounded range (include all ages greater than or equal to
age_range_start
).
Raises:
-
ValueError
–If the given age range does not line up with those ranges which are available in the source data. For instance, the Census defines an age bracket of 25-to-29 years. This makes it impossible for 26, 27, or 28 to be either the start or end of an age range. You can view the available age ranges on data.census.gov.
See Also
The ACS 5-Year documentation from the US Census, and an example of this table for 2023.
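The alignment rule behind this ValueError can be sketched as follows: a requested range is usable only if its start coincides with the start of some bracket and its end coincides with the end of some bracket. The bracket list here is a short illustrative subset and `check_alignment` is a hypothetical helper; the library's own validation may be stricter or phrased differently.

```python
from typing import Optional

# Illustrative subset of brackets with inclusive ends; not the full Census list.
BRACKETS = [(25, 29), (30, 34), (35, 39)]


def check_alignment(start: int, end: Optional[int]) -> None:
    """Raise ValueError if the range boundaries do not match bracket boundaries."""
    valid_starts = {lo for lo, _ in BRACKETS}
    valid_ends = {hi for _, hi in BRACKETS}
    if start not in valid_starts:
        raise ValueError(f"{start} is not the start of any available bracket")
    # An end of None means "unbounded", so there is no upper boundary to check.
    if end is not None and end not in valid_ends:
        raise ValueError(f"{end} is not the end of any available bracket")


def is_aligned(start: int, end: Optional[int]) -> bool:
    """Convenience wrapper that reports alignment as a boolean."""
    try:
        check_alignment(start, end)
    except ValueError:
        return False
    return True
```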
POP_BY_AGE_TABLE
class-attribute
instance-attribute
POP_BY_AGE_TABLE = AttributeDef(
"population_by_age_table", int, NxA
)
Defines the population-by-age-table requirement of this ADRIO.
requirements
class-attribute
instance-attribute
requirements = (POP_BY_AGE_TABLE,)
The attribute definitions describing the data requirements for this function.
For advanced use-cases, you may specify requirements as a property if you need it to be dynamically computed.
result_format
property
result_format: ResultFormat
Information about the expected format of the ADRIO's resulting data.
validate_result
Validates that the result of evaluating the ADRIO adheres to the expected result format.
Parameters:
-
context
(Context
) –The context in which the result has been evaluated.
-
result
(NDArray[ResultT]
) –The result produced by the ADRIO.
Raises:
-
ADRIOProcessingError
–If the result is invalid, indicating the processing logic has a bug.
inspect
inspect() -> InspectResult[int64, int64]
Produce an inspection of the ADRIO's data for the current context.
When implementing an ADRIO, override this method to provide data fetching and processing logic. Use self methods and properties to access the simulation context or defer processing to another function.
NOTE: if you are implementing this method, make sure to call validate_context
first and _validate_result
last.
Returns:
-
InspectResult[ResultT, ValueT]
–The data inspection results for the ADRIO's current context.
age_ranges
staticmethod
Lists the age ranges used by the ACS5 population by age table in definition order for the given year. Note that this does not correspond one-to-one with the values in the B01001 table -- this list omits "total" columns and duplicates.
Parameters:
-
year
(int
) –A supported ACS5 year.
Returns:
PopulationByRace
PopulationByRace(
race: RaceCategory,
*,
fix_insufficient_data: FixLikeInt = False,
fix_missing: FillLikeInt = False,
)
Bases: _ACS5FetchMixin
, FetchADRIO[int64, int64]
Loads population by race from the US Census ACS 5-Year Data (group B02001). ACS5 data is compiled from surveys taken during a rolling five-year period, and as such the values are estimates.
Data is available using CensusScope
geographies, from StateScope
down to
BlockGroupScope
(aggregates are computed by the Census Bureau). Data is loaded
according to the scope's year, from 2009 to 2023.
The result is an N-shaped array of integers.
Parameters:
-
race
(RaceCategory
) –The Census-defined race category to load.
-
fix_insufficient_data
(FixLikeInt
, default:False
) –The method to use to fix values for which there were insufficient data to report (sentinel value: -666666666).
-
fix_missing
(FillLikeInt
, default:False
) –The method to use to fix missing values.
See Also
The ACS 5-Year documentation from the US Census.
result_format
property
result_format: ResultFormat
Information about the expected format of the ADRIO's resulting data.
validate_result
Validates that the result of evaluating the ADRIO adheres to the expected result format.
Parameters:
-
context
(Context
) –The context in which the result has been evaluated.
-
result
(NDArray[ResultT]
) –The result produced by the ADRIO.
Raises:
-
ADRIOProcessingError
–If the result is invalid, indicating the processing logic has a bug.
AverageHouseholdSize
AverageHouseholdSize(
*,
fix_insufficient_data: FixLikeFloat = False,
fix_missing: FillLikeFloat = False,
)
Bases: _ACS5FetchMixin
, FetchADRIO[float64, float64]
Loads average household size data, based on the number of people living in a household, from the US Census ACS 5-Year Data (variable B25010_001E). ACS5 data is compiled from surveys taken during a rolling five-year period, and as such the values are estimates.
Data is available using CensusScope
geographies, from StateScope
down to
BlockGroupScope
(aggregates are computed by the Census Bureau). Data is loaded
according to the scope's year, from 2009 to 2023.
The result is an N-shaped array of floats.
Parameters:
-
fix_insufficient_data
(FixLikeFloat
, default:False
) –The method to use to fix values for which there were insufficient data to report (sentinel value: -666666666).
-
fix_missing
(FillLikeFloat
, default:False
) –The method to use to fix missing values.
See Also
The ACS 5-Year documentation from the US Census.
result_format
property
result_format: ResultFormat
Information about the expected format of the ADRIO's resulting data.
validate_result
Validates that the result of evaluating the ADRIO adheres to the expected result format.
Parameters:
-
context
(Context
) –The context in which the result has been evaluated.
-
result
(NDArray[ResultT]
) –The result produced by the ADRIO.
Raises:
-
ADRIOProcessingError
–If the result is invalid, indicating the processing logic has a bug.
MedianAge
MedianAge(
*,
fix_insufficient_data: FixLikeFloat = False,
fix_missing: FillLikeFloat = False,
)
Bases: _ACS5FetchMixin
, FetchADRIO[float64, float64]
Loads median age data from the US Census ACS 5-Year Data (variable B01002_001E). ACS5 data is compiled from surveys taken during a rolling five-year period, and as such the values are estimates.
Data is available using CensusScope
geographies, from StateScope
down to
BlockGroupScope
(aggregates are computed by the Census Bureau). Data is loaded
according to the scope's year, from 2009 to 2023.
The result is an N-shaped array of floats.
Parameters:
-
fix_insufficient_data
(FixLikeFloat
, default:False
) –The method to use to fix values for which there were insufficient data to report (sentinel value: -666666666).
-
fix_missing
(FillLikeFloat
, default:False
) –The method to use to fix missing values.
See Also
The ACS 5-Year documentation from the US Census.
result_format
property
result_format: ResultFormat
Information about the expected format of the ADRIO's resulting data.
validate_result
Validates that the result of evaluating the ADRIO adheres to the expected result format.
Parameters:
-
context
(Context
) –The context in which the result has been evaluated.
-
result
(NDArray[ResultT]
) –The result produced by the ADRIO.
Raises:
-
ADRIOProcessingError
–If the result is invalid, indicating the processing logic has a bug.
MedianIncome
MedianIncome(
*,
fix_insufficient_data: FixLikeInt = False,
fix_missing: FillLikeInt = False,
)
Bases: _ACS5FetchMixin
, FetchADRIO[int64, int64]
Loads median income data in whole dollars from the US Census ACS 5-Year Data (variable B19013_001E), which is adjusted for inflation to the year of the data. ACS5 data is compiled from surveys taken during a rolling five-year period, and as such the values are estimates.
Data is available using CensusScope
geographies, from StateScope
down to
BlockGroupScope
(aggregates are computed by the Census Bureau). Data is loaded
according to the scope's year, from 2009 to 2023.
The result is an N-shaped array of integers.
Parameters:
-
fix_insufficient_data
(FixLikeInt
, default:False
) –The method to use to fix values for which there were insufficient data to report (sentinel value: -666666666).
-
fix_missing
(FillLikeInt
, default:False
) –The method to use to fix missing values.
See Also
The ACS 5-Year documentation from the US Census.
result_format
property
result_format: ResultFormat
Information about the expected format of the ADRIO's resulting data.
validate_result
Validates that the result of evaluating the ADRIO adheres to the expected result format.
Parameters:
-
context
(Context
) –The context in which the result has been evaluated.
-
result
(NDArray[ResultT]
) –The result produced by the ADRIO.
Raises:
-
ADRIOProcessingError
–If the result is invalid, indicating the processing logic has a bug.
GiniIndex
GiniIndex(
*,
fix_insufficient_data: FixLikeFloat = False,
fix_missing: FillLikeFloat = False,
)
Bases: _ACS5FetchMixin
, FetchADRIO[float64, float64]
Loads Gini Index data from the US Census ACS 5-Year Data (variable B19083_001E). This is a measure of income inequality on a scale from 0 (perfect equality) to 1 (perfect inequality). ACS5 data is compiled from surveys taken during a rolling five-year period, and as such the values are estimates.
Data is available using CensusScope
geographies, from StateScope
down to
BlockGroupScope
(aggregates are computed by the Census Bureau). Data is loaded
according to the scope's year, from 2009 to 2023.
The result is an N-shaped array of floats.
Parameters:
-
fix_insufficient_data
(FixLikeFloat
, default:False
) –The method to use to fix values for which there were insufficient data to report (sentinel value: -666666666).
-
fix_missing
(FillLikeFloat
, default:False
) –The method to use to fix missing values.
See Also
The ACS 5-Year documentation from the US Census, and general info on the Gini index.
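To give a feel for the 0-to-1 scale, the sketch below computes a Gini index for a small list of incomes using the textbook mean-absolute-difference formula. This is only an illustration: the ADRIO loads values already computed by the Census Bureau, whose methodology is not described here, and `gini` is a hypothetical helper.

```python
def gini(incomes: list[float]) -> float:
    """Gini index as the sum of all pairwise absolute differences,
    divided by 2 * n * (sum of incomes)."""
    n = len(incomes)
    total = sum(incomes)
    if n == 0 or total == 0:
        raise ValueError("Gini index is undefined for empty or all-zero incomes")
    abs_diffs = sum(abs(a - b) for a in incomes for b in incomes)
    return abs_diffs / (2 * n * total)


equal = gini([5, 5, 5, 5])  # everyone earns the same
concentrated = gini([0, 0, 0, 10])  # one person earns everything
```

For a finite sample of n incomes this formula tops out at (n - 1) / n, so the concentrated case with four incomes yields 0.75 rather than exactly 1.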
result_format
property
result_format: ResultFormat
Information about the expected format of the ADRIO's resulting data.
validate_result
Validates that the result of evaluating the ADRIO adheres to the expected result format.
Parameters:
-
context
(Context
) –The context in which the result has been evaluated.
-
result
(NDArray[ResultT]
) –The result produced by the ADRIO.
Raises:
-
ADRIOProcessingError
–If the result is invalid, indicating the processing logic has a bug.
validate_context
validate_context(context: Context) -> None
Validates the context before ADRIO evaluation.
Parameters:
-
context
(Context
) –The context to validate.
Raises:
-
ADRIOContextError
–If this ADRIO cannot be evaluated in the given context.
DissimilarityIndex
DissimilarityIndex(
majority_pop: RaceCategory,
minority_pop: RaceCategory,
*,
fix_insufficient_population: FixLikeInt = False,
fix_missing_population: FillLikeInt = False,
fix_not_computable: FixLikeFloat = False,
)
Bases: _ACS5Mixin
, ADRIO[float64, float64]
Calculates the Dissimilarity Index using US Census ACS 5-Year Data (group B02001). The dissimilarity index is a measure of segregation comparing two races. Typically one compares a majority to a minority race and so the names of parameters reflect this, but this relationship between the races involved isn't strictly necessary. The numerical result can be interpreted as the percentage of "minority" individuals that would have to move in order for the geographic distribution of individuals within subdivisions of a location to match the distribution of individuals in the location as a whole. ACS5 data is compiled from surveys taken during a rolling five-year period, and as such the values are estimates.
Data is available using CensusScope
geographies, from StateScope
down to
TractScope
. Data is loaded according to the scope's year, from 2009 to 2023.
This ADRIO does not support BlockGroupScope
because the calculation of the index
requires loading data at a finer granularity than the target granularity, and
there is no ACS5 data below block groups.
The result is an N-shaped array of floats.
Parameters:
-
majority_pop
(RaceCategory
) –The race category representing the majority population within the segregation analysis.
-
minority_pop
(RaceCategory
) –The race category representing the minority population within the segregation analysis.
-
fix_insufficient_population
(FixLikeInt
, default:False
) –The method to use to fix values for which there were insufficient data to report (sentinel value: -666666666). The replacement is performed on the underlying population by race data.
-
fix_missing_population
(FillLikeInt
, default:False
) –The method to use to fix missing values. The replacement is performed on the underlying population by race data.
-
fix_not_computable
(FixLikeFloat
, default:False
) –The method to use to fix entries that cannot be computed because population numbers cannot be loaded for one or more of the populations involved.
See Also
The ACS 5-Year documentation from the US Census, and general information about the dissimilarity index.
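The commonly cited formula for the dissimilarity index sums, over the subdivisions of a location, the absolute difference between each group's share of its own total, then halves the result. The sketch below implements that textbook formula on plain lists, as an assumption about the general technique rather than a copy of this ADRIO's internals; `dissimilarity_index` is a hypothetical helper.

```python
def dissimilarity_index(majority: list[int], minority: list[int]) -> float:
    """0.5 * sum(|a_i / A - b_i / B|) over subdivisions i,
    where A and B are the two groups' totals for the whole location."""
    if len(majority) != len(minority):
        raise ValueError("both groups need one count per subdivision")
    maj_total = sum(majority)
    min_total = sum(minority)
    if maj_total == 0 or min_total == 0:
        # Shares are undefined when either group has no population.
        raise ValueError("index is not computable when a group total is zero")
    return 0.5 * sum(
        abs(a / maj_total - b / min_total) for a, b in zip(majority, minority)
    )


separated = dissimilarity_index([100, 0], [0, 50])  # groups never share a subdivision
even = dissimilarity_index([50, 50], [10, 10])  # identical distribution
partial = dissimilarity_index([75, 25], [25, 75])
```

The need for per-subdivision counts is why the index requires data one level finer than the target granularity, as noted in the description above.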
result_format
property
result_format: ResultFormat
Information about the expected format of the ADRIO's resulting data.
validate_context
validate_context(context: Context) -> None
Validates the context before ADRIO evaluation.
Parameters:
-
context
(Context
) –The context to validate.
Raises:
-
ADRIOContextError
–If this ADRIO cannot be evaluated in the given context.
validate_result
Validates that the result of evaluating the ADRIO adheres to the expected result format.
Parameters:
-
context
(Context
) –The context in which the result has been evaluated.
-
result
(NDArray[ResultT]
) –The result produced by the ADRIO.
Raises:
-
ADRIOProcessingError
–If the result is invalid, indicating the processing logic has a bug.
inspect
inspect() -> InspectResult[float64, float64]
Produce an inspection of the ADRIO's data for the current context.
When implementing an ADRIO, override this method to provide data fetching and processing logic. Use self methods and properties to access the simulation context or defer processing to another function.
NOTE: if you are implementing this method, make sure to call validate_context
first and _validate_result
last.
Returns:
-
InspectResult[ResultT, ValueT]
–The data inspection results for the ADRIO's current context.