from epymorph.kit import *
from epymorph.adrio import acs5
county_scope = CountyScope.in_states(["AZ"], year=2020)
ACS5
Description
The US Census Bureau’s American Community Survey (ACS) provides detailed estimates on a broad set of topics including the country’s population, economy, housing, and society. The goal of the ACS is to provide data more frequently than the decennial census, which is conducted only once every 10 years. To do this in a cost-efficient manner, it trades complete-count accuracy for statistical methods: ACS estimates are based on survey responses collected from a random sample of US residents. In contrast to the decennial census, ACS surveys are issued continuously throughout the year. The data include margins of error when appropriate.
The ACS 5-Year Data (ACS5) is published annually and incorporates the last five years of survey responses. Thus its estimates reflect a five-year moving average. Many of its estimates are available down to the census block group level. The Census Bureau also publishes 1-Year Data (using the past year of survey responses) and has published 3-Year Data (the past three years of survey responses). 1-Year and 3-Year data are not currently available in epymorph. Because they include fewer survey responses, they tend to redact more estimates due to privacy concerns or aggregate only to coarser geographies, so we decided they would be less useful.
Data Collection
The Census Bureau currently issues roughly 3.5 million surveys every year to a random selection of addresses. Participation is mandated by law, and respondents may respond online, on paper, or in person with a Census Bureau interviewer. Puerto Rico has a separate survey (the Puerto Rico Community Survey) which asks similar questions; its responses are also included in the ACS estimates.
Geographic and Temporal Coverage
ACS5 data is available for all US states, the District of Columbia, and Puerto Rico. Data is aggregated by the Census Bureau at state, county, tract, and census block group granularities. However, for 2012 and prior years, block group data is currently not available through the Census API, so epymorph will raise an error if you request it. (If you need this data, it can be downloaded as files from the Census website.)
The first ACS 5-year publication covered the 2005–2009 period. It has been released annually since then, with each release covering an overlapping 5-year period. The ACS data uses geographic boundaries as of January 1 of the last year of the estimate period. For example, the 2014–2018 ACS 5-year estimates use boundaries as of January 1, 2018, as determined by the US Census Bureau.
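The coverage rules in this section can be summarized in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical and are not part of epymorph, and it assumes each release is identified by the final year of its five-year period (so the 2014–2018 estimates are the "2018" release).

```python
from datetime import date

# Assumption: a release is identified by the last year of its period,
# so the first release (2005-2009) is the "2009" release.
FIRST_ACS5_YEAR = 2009
GRANULARITIES = ("state", "county", "tract", "block group")


def acs5_period(year: int) -> tuple[int, int]:
    """Return the (first, last) survey years pooled into the release for `year`."""
    if year < FIRST_ACS5_YEAR:
        raise ValueError(f"No ACS 5-year release exists for {year}.")
    return (year - 4, year)


def boundary_date(year: int) -> date:
    """Boundaries are those in effect on January 1 of the period's last year."""
    acs5_period(year)  # reuse the year validation
    return date(year, 1, 1)


def available_via_census_api(granularity: str, year: int) -> bool:
    """Hypothetical check mirroring the block group limitation described above."""
    if granularity not in GRANULARITIES:
        raise ValueError(f"Unknown granularity: {granularity!r}")
    acs5_period(year)  # reuse the year validation
    return not (granularity == "block group" and year <= 2012)


print(acs5_period(2018))  # (2014, 2018)
print(boundary_date(2018))  # 2018-01-01
print(available_via_census_api("block group", 2012))  # False
```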
Additional Resources
The ACS 5-Year info site and "Understanding and Using American Community Survey Data: What All Data Users Need to Know" provide much more detail. The data.census.gov interface can be used to explore Census data interactively, including ACS data.
Attributes
The ACS 5-year data includes more than 40,000 individual variables divided into subproducts, numbered by their table and row. epymorph ADRIOs exist to fetch data from a curated subset of these variables.
ADRIO | Description | Source Table |
---|---|---|
Population | Total population. | Sex by Age [B01001] |
PopulationByAgeTable | A full table of population by age. | Sex by Age [B01001] |
PopulationByAge | Population by selected age(s). | Sex by Age [B01001] |
PopulationByRace | Population by selected Census race. | Race [B02001] |
AverageHouseholdSize | Average number of occupants per household. | Average Household Size of Occupied Housing Units by Tenure [B25010] |
MedianAge | Median age of the resident population. | Median Age by Sex [B01002] |
MedianIncome | Median income of the resident population. | Median Household Income in the Past 12 Months (in {ACS_YEAR} Inflation-Adjusted Dollars) [B19013] |
GiniIndex | An index of income inequality in the resident population. | Gini Index of Income Inequality [B19083] |
DissimilarityIndex | An epymorph-computed index of spatial segregation between two races, combining multiple datapoints from ACS source data. | Race [B02001] |
Examples
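All examples use the same scope, county_scope, which is defined in the setup code at the top of this page and covers every county in Arizona for 2020.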
Population
(API) Retrieves an N-shaped array of integers representing the total population of each provided geo node.
acs5.Population().with_context(scope=county_scope).evaluate()
array([ 71714, 126442, 142254, 53846, 38304, 9465, 21035,
4412779, 210998, 110271, 1038476, 447559, 46594, 232396,
211931])
Population By Age Table
(API) Retrieves the full table of population data for each geo node aggregated by the age ranges defined by the Census. This ADRIO is not intended for direct use, but as the source of data for PopulationByAge ADRIOs: it’s simply more efficient to load this table once.
acs5.PopulationByAgeTable().with_context(scope=county_scope).evaluate()[:3]
array([[ 71714, 35388, 2388, 2522, 3001, 1641, 1089, 711,
491, 1130, 2581, 2316, 2156, 1830, 1912, 2113,
2061, 1080, 1326, 707, 1192, 1201, 986, 633,
321, 36326, 2335, 2652, 2918, 1778, 1268, 496,
539, 1148, 2263, 2206, 1844, 1889, 1974, 2179,
2429, 985, 1372, 850, 1239, 1494, 1090, 655,
723],
[126442, 64371, 3677, 4240, 3542, 2387, 1874, 1062,
623, 2615, 4477, 4160, 4414, 3217, 3315, 3407,
3169, 1518, 3113, 1796, 2418, 3603, 2289, 2138,
1317, 62071, 3637, 3704, 3865, 2107, 1484, 781,
559, 1914, 3623, 3442, 3351, 3233, 3128, 3584,
4060, 2106, 2953, 1635, 2596, 3944, 2728, 1906,
1731],
[142254, 70124, 3905, 3803, 4614, 2646, 4265, 2364,
2051, 4581, 5805, 4524, 4128, 3757, 3639, 3597,
3829, 1690, 2327, 1405, 1789, 2565, 1513, 871,
456, 72130, 3726, 3451, 4498, 2397, 7673, 2771,
1910, 3285, 5243, 4190, 4110, 3568, 3575, 3777,
4591, 1503, 2506, 1415, 2168, 2241, 1774, 878,
880]])
Population By Age
(API) Computes an N-shaped array of integers representing the total population within a specified age range for each geo node.
This ADRIO does not access the ACS5 data directly; rather, it processes the table of population data as loaded by the PopulationByAgeTable ADRIO. Make sure a “population_by_age_table” param is available in the context. (Although this ADRIO is a bit odd compared to the others, this design was chosen to efficiently load multiple separate population groups, which is useful in age-stratified models.)
(
    acs5.PopulationByAge(18, 49)
    .with_context(
        scope=county_scope,
        params={
            "population_by_age_table": acs5.PopulationByAgeTable(),
        },
    )
    .evaluate()
)
array([ 27843, 47272, 71439, 15972, 16957, 4139, 5644,
1912144, 64159, 40489, 430702, 177931, 17231, 67932,
86886])
The age range you specify must align with those defined by the Census Bureau (which could change from year to year) but can combine multiple contiguous ranges. You can query a list of available age ranges:
acs5.PopulationByAge.age_ranges(2020)
[AgeRange(start=0, end=4),
AgeRange(start=5, end=9),
AgeRange(start=10, end=14),
AgeRange(start=15, end=17),
AgeRange(start=18, end=19),
AgeRange(start=20, end=20),
AgeRange(start=21, end=21),
AgeRange(start=22, end=24),
AgeRange(start=25, end=29),
AgeRange(start=30, end=34),
AgeRange(start=35, end=39),
AgeRange(start=40, end=44),
AgeRange(start=45, end=49),
AgeRange(start=50, end=54),
AgeRange(start=55, end=59),
AgeRange(start=60, end=61),
AgeRange(start=62, end=64),
AgeRange(start=65, end=66),
AgeRange(start=67, end=69),
AgeRange(start=70, end=74),
AgeRange(start=75, end=79),
AgeRange(start=80, end=84),
AgeRange(start=85, end=None)]
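To make the alignment rule concrete, here is a minimal sketch of how a query such as (18, 49) can be answered by summing contiguous brackets. It is not epymorph's implementation: the helper name is hypothetical, and the column layout is an assumption based on Census table B01001 (total, male total, 23 male age brackets, female total, 23 female age brackets). The row is the first row of the PopulationByAgeTable output shown earlier.

```python
# Age brackets as listed above: (start, end), where None means "and older".
AGE_RANGES = [
    (0, 4), (5, 9), (10, 14), (15, 17), (18, 19), (20, 20), (21, 21),
    (22, 24), (25, 29), (30, 34), (35, 39), (40, 44), (45, 49), (50, 54),
    (55, 59), (60, 61), (62, 64), (65, 66), (67, 69), (70, 74), (75, 79),
    (80, 84), (85, None),
]

# First row of the PopulationByAgeTable example output.
ROW = [
    71714, 35388, 2388, 2522, 3001, 1641, 1089, 711,
    491, 1130, 2581, 2316, 2156, 1830, 1912, 2113,
    2061, 1080, 1326, 707, 1192, 1201, 986, 633,
    321, 36326, 2335, 2652, 2918, 1778, 1268, 496,
    539, 1148, 2263, 2206, 1844, 1889, 1974, 2179,
    2429, 985, 1372, 850, 1239, 1494, 1090, 655,
    723,
]


def population_in_range(row, start, end):
    """Sum the male and female brackets spanning ages `start` through `end`.

    Hypothetical helper: `start` must begin a bracket and `end` must finish
    one, otherwise the request does not align with the Census age ranges.
    """
    starts = [s for s, _ in AGE_RANGES]
    ends = [e for _, e in AGE_RANGES]
    if start not in starts or end not in ends:
        raise ValueError(f"({start}, {end}) does not align with the age ranges")
    first, last = starts.index(start), ends.index(end)
    if first > last:
        raise ValueError("start must not come after end")
    n = len(AGE_RANGES)
    male = row[2 : 2 + n]  # assumed: columns after total and male total
    female = row[3 + n : 3 + 2 * n]  # assumed: columns after female total
    return sum(male[first : last + 1]) + sum(female[first : last + 1])


print(population_in_range(ROW, 18, 49))  # 27843
```

The result matches the first value of the PopulationByAge(18, 49) example output above. A request such as (18, 48) would raise an error in this sketch because 48 does not end any bracket.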
Population by Race
(API) Retrieves an N-shaped array of integers representing the total population of a specified race group for each geo node.
acs5.PopulationByRace("Black").with_context(scope=county_scope).evaluate()
array([ 545, 5158, 1929, 391, 683, 175, 205, 249691,
1976, 1174, 36441, 20151, 398, 1577, 4611])
Average Household Size
(API) Retrieves an N-shaped array of floats representing the average number of people living in each household for every geo node.
acs5.AverageHouseholdSize().with_context(scope=county_scope).evaluate()
array([3.28, 2.34, 2.6 , 2.34, 3.06, 2.84, 2.1 , 2.73, 2.29, 2.95, 2.45,
2.84, 2.88, 2.24, 2.76])
Median Age
(API) Retrieves an N-shaped array of floats representing the median age in each geo node.
acs5.MedianAge().with_context(scope=county_scope).evaluate()
array([35.4, 41. , 31. , 50.4, 33.7, 35.7, 57.4, 36.6, 52.3, 38.2, 38.7,
39.8, 37.2, 54.1, 34.8])
Median Income
(API) Retrieves an N-shaped array of integers representing the median yearly household income in each geo node.
acs5.MedianIncome().with_context(scope=county_scope).evaluate()
array([33967, 51505, 59000, 46907, 55693, 66368, 34956, 67799, 47686,
43140, 55023, 60968, 41424, 53329, 48790])
Gini Index
(API) Retrieves an N-shaped array of floats representing the amount of income inequality on a scale of 0 (perfect equality) to 1 (perfect inequality) for each geo node.
acs5.GiniIndex().with_context(scope=county_scope).evaluate()
array([0.4941, 0.4353, 0.4641, 0.4449, 0.4057, 0.3774, 0.4983, 0.4629,
0.4476, 0.4581, 0.4658, 0.438 , 0.4927, 0.4504, 0.454 ])
Dissimilarity Index
(API) Calculates an N-shaped array of floats quantifying spatial segregation between specified racial majority and minority groups. This is not a value computed by the Census Bureau, but computed by epymorph using data from ACS5. Values are on a scale of 0 (complete integration) to 1 (complete segregation).
(
    acs5.DissimilarityIndex(
        majority_pop="White",
        minority_pop="Black",
    )
    .with_context(scope=county_scope)
    .evaluate()
)
array([0.55397177, 0.53833896, 0.50504738, 0.50758161, 0.46014043,
0.41818213, 0.70177825, 0.42899297, 0.58358261, 0.66416851,
0.39650261, 0.48425149, 0.68361853, 0.49817053, 0.52189792])
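For intuition about the 0-to-1 scale, the sketch below computes the standard index of dissimilarity, D = ½ Σ |aᵢ/A − bᵢ/B|, where aᵢ and bᵢ are the two groups' populations in sub-area i and A and B are their totals across the region. This page does not state which sub-areas epymorph uses or how it handles edge cases, so treat this as an illustration of the general formula rather than epymorph's implementation. The counts are hypothetical.

```python
def dissimilarity_index(majority, minority):
    """Standard index of dissimilarity over the sub-areas of one region.

    `majority` and `minority` are per-sub-area population counts for the
    two groups, listed in the same sub-area order.
    """
    if len(majority) != len(minority):
        raise ValueError("Both groups need one count per sub-area.")
    total_majority = sum(majority)
    total_minority = sum(minority)
    if total_majority == 0 or total_minority == 0:
        raise ValueError("Both groups need a nonzero total population.")
    return 0.5 * sum(
        abs(a / total_majority - b / total_minority)
        for a, b in zip(majority, minority)
    )


# Hypothetical counts for a region with two sub-areas.
print(dissimilarity_index([100, 0], [0, 50]))  # 1.0: the groups never share a sub-area
print(dissimilarity_index([100, 300], [10, 30]))  # 0.0: identical distribution
print(dissimilarity_index([300, 100], [20, 80]))  # roughly 0.55, partly segregated
```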