epymorph.geography.us_tiger

Functions for fetching information from TIGER files for common US Census geographic delineations. The module returns information for a common selection of US states and territories, and handles quirks and differences between the supported census years.

TigerYear module-attribute

TigerYear = Literal[
    2000,
    2009,
    2010,
    2011,
    2012,
    2013,
    2014,
    2015,
    2016,
    2017,
    2018,
    2019,
    2020,
    2021,
    2022,
    2023,
]

A supported TIGER file year. (2000 and 2009-2023)

TIGER_YEARS module-attribute

TIGER_YEARS: Sequence[TigerYear] = (
    2000,
    2009,
    2010,
    2011,
    2012,
    2013,
    2014,
    2015,
    2016,
    2017,
    2018,
    2019,
    2020,
    2021,
    2022,
    2023,
)

All supported TIGER file years. (2000 and 2009-2023)

GranularitySummary dataclass

GranularitySummary(geoid: list[str])

Bases: ABC

Contains information about the geography at the given level of granularity for a specific Census year. The information available may differ between the implementations for different granularities, but at a minimum each provides the full list of GEOIDs in that granularity.

Concrete child classes exist for the various Census granularity levels.

Parameters:

  • geoid (list[str]) –

    The GEOIDs (sometimes called FIPS codes) of all nodes in this granularity.

geoid instance-attribute

geoid: list[str]

The GEOIDs (sometimes called FIPS codes) of all nodes in this granularity.

interpret

interpret(identifiers: Sequence[str]) -> list[str]

Permissively interprets the given set of identifiers as describing nodes, and converts them to a sorted list of GEOIDs.

Parameters:

  • identifiers (Sequence[str]) –

    A list of identifiers. Which kind of identifiers are allowed depends on the granularity.

Returns:

  • list[str]

    The list of GEOIDs in canonical sort order.

Raises:

StatesSummary dataclass

StatesSummary(
    geoid: list[str], name: list[str], code: list[str]
)

Bases: GranularitySummary

Information about US states (and state equivalents). Typically you will use get_states to obtain an instance of this class for a particular year.

Parameters:

  • geoid (list[str]) –

    The GEOIDs (aka FIPS codes) of all states.

  • name (list[str]) –

    The typical names for the states.

  • code (list[str]) –

    The US postal codes for the states.

geoid instance-attribute

geoid: list[str]

The GEOIDs (aka FIPS codes) of all states.

name instance-attribute

name: list[str]

The typical names for the states.

code instance-attribute

code: list[str]

The US postal codes for the states.

state_code_to_fips cached property

state_code_to_fips: Mapping[str, str]

Mapping from state postal code to FIPS code.

state_fips_to_code cached property

state_fips_to_code: Mapping[str, str]

Mapping from state FIPS code to postal code.

state_fips_to_name cached property

state_fips_to_name: Mapping[str, str]

Mapping from state FIPS code to full name.
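
The three mappings above can be pictured as lookups built from the `geoid`, `name`, and `code` lists. The following is a minimal sketch, not the library's implementation: `StatesSketch` is a hypothetical stand-in class, and it assumes the three lists are parallel (the same index refers to the same state).

```python
from dataclasses import dataclass
from functools import cached_property
from typing import Mapping


@dataclass
class StatesSketch:
    """Hypothetical stand-in for StatesSummary with three parallel lists."""

    geoid: list[str]
    name: list[str]
    code: list[str]

    @cached_property
    def state_code_to_fips(self) -> Mapping[str, str]:
        # Zip the parallel lists; computed once, then stored on the instance.
        return dict(zip(self.code, self.geoid))

    @cached_property
    def state_fips_to_name(self) -> Mapping[str, str]:
        return dict(zip(self.geoid, self.name))


states = StatesSketch(
    geoid=["04", "35"],
    name=["Arizona", "New Mexico"],
    code=["AZ", "NM"],
)
print(states.state_code_to_fips["AZ"])
print(states.state_fips_to_name["35"])
```

Using `cached_property` means each dictionary is built on first access only, which suits data that does not change after the summary is loaded.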

interpret

interpret(identifiers: Sequence[str]) -> list[str]

Permissively interprets the given set of identifiers as describing nodes, and converts them to a sorted list of GEOIDs.

Parameters:

  • identifiers (Sequence[str]) –

    A list of identifiers. Identifiers can be given in any of the acceptable forms, but all of the identifiers must use the same form. Forms are: GEOID/FIPS code, full name, or postal code.

Returns:

  • list[str]

    The list of GEOIDs in canonical sort order.

Raises:
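
One way to picture the "same form for all identifiers" rule is to try each accepted form in turn and accept the first one that matches every identifier. This is a sketch under stated assumptions, not the library's code: `interpret_states` is a hypothetical helper, canonical order is assumed to be a plain string sort of the GEOIDs, and `ValueError` is only a placeholder for whatever error the library raises.

```python
from typing import Sequence


def interpret_states(
    identifiers: Sequence[str],
    geoid: list[str],
    name: list[str],
    code: list[str],
) -> list[str]:
    """Convert identifiers given in one consistent form to sorted GEOIDs."""
    # Try each form in turn: GEOID, then full name, then postal code.
    for known in (geoid, name, code):
        lookup = dict(zip(known, geoid))
        if all(x in lookup for x in identifiers):
            return sorted(lookup[x] for x in identifiers)
    # Placeholder error type; the library's actual exception is not stated here.
    raise ValueError("identifiers are invalid or mix several forms")


geoid = ["04", "35"]
name = ["Arizona", "New Mexico"]
code = ["AZ", "NM"]
print(interpret_states(["NM", "AZ"], geoid, name, code))
```

Mixing forms, such as a postal code alongside a full name, matches none of the three lookups and so falls through to the error.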

CountiesSummary dataclass

CountiesSummary(
    geoid: list[str],
    name: list[str],
    name_with_state: list[str],
)

Bases: GranularitySummary

Information about US counties (and county equivalents). Typically you will use get_counties to obtain an instance of this class for a particular year.

Parameters:

  • geoid (list[str]) –

    The GEOIDs (aka FIPS codes) of all counties.

  • name (list[str]) –

    The typical names of the counties (does not include state).

  • name_with_state (list[str]) –

    The typical names including county and state, e.g., Coconino, AZ.

geoid instance-attribute

geoid: list[str]

The GEOIDs (aka FIPS codes) of all counties.

name instance-attribute

name: list[str]

The typical names of the counties (does not include state). Note: county names are not unique across the whole US.

name_with_state instance-attribute

name_with_state: list[str]

The typical names including county and state, e.g., Coconino, AZ.

county_fips_to_name cached property

county_fips_to_name: Mapping[str, str]

Mapping from county FIPS code to name with state.

interpret

interpret(identifiers: Sequence[str]) -> list[str]

Permissively interprets the given set of identifiers as describing nodes, and converts them to a sorted list of GEOIDs.

Parameters:

  • identifiers (Sequence[str]) –

    A list of identifiers. Identifiers can be given in any of the acceptable forms, but all of the identifiers must use the same form. Forms are: GEOID/FIPS code, or the name of the county and its state postal code separated by a comma, e.g., Coconino, AZ.

Returns:

  • list[str]

    The list of GEOIDs in canonical sort order.

Raises:

TractsSummary dataclass

TractsSummary(geoid: list[str])

Bases: GranularitySummary

Information about US Census tracts. Typically you will use get_tracts to obtain an instance of this class for a particular year.

Parameters:

  • geoid (list[str]) –

    The GEOIDs (aka FIPS codes) of all tracts.

geoid instance-attribute

geoid: list[str]

The GEOIDs (aka FIPS codes) of all tracts.

BlockGroupsSummary dataclass

BlockGroupsSummary(geoid: list[str])

Bases: GranularitySummary

Information about US Census block groups. Typically you will use get_block_groups to obtain an instance of this class for a particular year.

Parameters:

  • geoid (list[str]) –

    The GEOIDs (aka FIPS codes) of all block groups.

geoid instance-attribute

geoid: list[str]

The GEOIDs (aka FIPS codes) of all block groups.

CacheEstimate

Bases: NamedTuple

Estimates related to data needed to fulfill TIGER requests.

total_cache_size instance-attribute

total_cache_size: int

An estimate of the size of the files that we need to have cached to fulfill a request.

missing_cache_size instance-attribute

missing_cache_size: int

An estimate of the size of the files that are not currently cached that we would need to fulfill a request. Zero if we have all of the files already.

is_tiger_year

is_tiger_year(year: int) -> TypeGuard[TigerYear]

A type-guard function to ensure a year is a supported TIGER year.

Parameters:

  • year (int) –

    The year to check.

Returns:

  • TypeGuard[TigerYear]

    True (as a type guard) if the year is in the set of supported TIGER years.

get_states_geo

get_states_geo(
    year: int, progress: ProgressCallback | None = None
) -> GeoDataFrame

Get all supported US states and territories for the given census year, with geography.

Parameters:

  • year (int) –

    The geography year.

  • progress (ProgressCallback | None, default: None ) –

    An optional callback for reporting the progress of downloading TIGER files.

Returns:

  • GeoDataFrame

    The TIGER file info with geography.

get_states_info

get_states_info(
    year: int, progress: ProgressCallback | None = None
) -> DataFrame

Get all US states and territories for the given census year, without geography.

Parameters:

  • year (int) –

    The geography year.

  • progress (ProgressCallback | None, default: None ) –

    An optional callback for reporting the progress of downloading TIGER files.

Returns:

  • DataFrame

    The TIGER file info without geography.

get_states

get_states(year: int) -> StatesSummary

Loads US States information (assumed to be invariant for all supported years).

Parameters:

  • year (int) –

    The geography year.

Returns:

  • StatesSummary

    The summary of US states (and state equivalents).

get_counties_geo

get_counties_geo(
    year: int, progress: ProgressCallback | None = None
) -> GeoDataFrame

Get all supported US counties and county-equivalents for the given census year, with geography.

Parameters:

  • year (int) –

    The geography year.

  • progress (ProgressCallback | None, default: None ) –

    An optional callback for reporting the progress of downloading TIGER files.

Returns:

  • GeoDataFrame

    The TIGER file info with geography.

get_counties_info

get_counties_info(
    year: int, progress: ProgressCallback | None = None
) -> DataFrame

Get all US counties and county-equivalents for the given census year, without geography.

Parameters:

  • year (int) –

    The geography year.

  • progress (ProgressCallback | None, default: None ) –

    An optional callback for reporting the progress of downloading TIGER files.

Returns:

  • DataFrame

    The TIGER file info without geography.

get_counties

get_counties(year: int) -> CountiesSummary

Loads US Counties information for the given year.

Parameters:

  • year (int) –

    The geography year.

Returns:

  • CountiesSummary

    The summary of US counties (and county equivalents) for the given year.

get_tracts_geo

get_tracts_geo(
    year: int,
    state_id: Sequence[str] | None = None,
    progress: ProgressCallback | None = None,
) -> GeoDataFrame

Get all supported US census tracts for the given census year, with geography.

Parameters:

  • year (int) –

    The geography year.

  • state_id (Sequence[str] | None, default: None ) –

    If provided, return only the tracts in the given list of states (by GEOID).

  • progress (ProgressCallback | None, default: None ) –

    An optional callback for reporting the progress of downloading TIGER files.

Returns:

  • GeoDataFrame

    The TIGER file info with geography.

get_tracts_info

get_tracts_info(
    year: int,
    state_id: Sequence[str] | None = None,
    progress: ProgressCallback | None = None,
) -> DataFrame

Get all US census tracts for the given census year, without geography.

Parameters:

  • year (int) –

    The geography year.

  • state_id (Sequence[str] | None, default: None ) –

    If provided, return only the tracts in the given list of states (by GEOID).

  • progress (ProgressCallback | None, default: None ) –

    An optional callback for reporting the progress of downloading TIGER files.

Returns:

  • DataFrame

    The TIGER file info without geography.
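
The `state_id` filter relies on the fact that Census GEOIDs are hierarchical: the first two digits of a tract GEOID are the state's FIPS code. The following is a rough sketch of that idea on a plain list of GEOIDs, not the library's implementation; `filter_by_state` is a hypothetical helper and the tract GEOIDs are illustrative values.

```python
from typing import Optional, Sequence


def filter_by_state(
    geoids: Sequence[str], state_id: Optional[Sequence[str]] = None
) -> list[str]:
    """Keep only the GEOIDs whose two-digit state prefix is in `state_id`."""
    if state_id is None:
        # No filter requested: return everything.
        return list(geoids)
    wanted = set(state_id)
    return [g for g in geoids if g[:2] in wanted]


# Illustrative 11-digit tract GEOIDs: state (2) + county (3) + tract (6).
tracts = ["04005000100", "04013010101", "35001000107"]
print(filter_by_state(tracts, ["04"]))
```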

get_tracts

get_tracts(year: int) -> TractsSummary

Loads US Census Tracts information for the given year.

Parameters:

  • year (int) –

    The geography year.

Returns:

  • TractsSummary

    The summary of US Census tracts for the given year.

get_block_groups_geo

get_block_groups_geo(
    year: int,
    state_id: Sequence[str] | None = None,
    progress: ProgressCallback | None = None,
) -> GeoDataFrame

Get all supported US census block groups for the given census year, with geography.

Parameters:

  • year (int) –

    The geography year.

  • state_id (Sequence[str] | None, default: None ) –

    If provided, return only the block groups in the given list of states (by GEOID).

  • progress (ProgressCallback | None, default: None ) –

    An optional callback for reporting the progress of downloading TIGER files.

Returns:

  • GeoDataFrame

    The TIGER file info with geography.

get_block_groups_info

get_block_groups_info(
    year: int,
    state_id: Sequence[str] | None = None,
    progress: ProgressCallback | None = None,
) -> DataFrame

Get all US census block groups for the given census year, without geography.

Parameters:

  • year (int) –

    The geography year.

  • state_id (Sequence[str] | None, default: None ) –

    If provided, return only the block groups in the given list of states (by GEOID).

  • progress (ProgressCallback | None, default: None ) –

    An optional callback for reporting the progress of downloading TIGER files.

Returns:

  • DataFrame

    The TIGER file info without geography.

get_block_groups

get_block_groups(year: int) -> BlockGroupsSummary

Loads US Census Block Group information for the given year.

Parameters:

  • year (int) –

    The geography year.

Returns:

  • BlockGroupsSummary

    The summary of US Census block groups for the given year.

get_summary_of

get_summary_of(
    granularity: CensusGranularityName, year: int
) -> GranularitySummary

Retrieve a GranularitySummary for the given granularity and year.

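Parameters:

  • granularity (CensusGranularityName) –

    The Census granularity.

  • year (int) –

    The geography year.

Returns:

  • GranularitySummary

    The summary for the given granularity and year.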

Raises:

check_cache

check_cache(
    granularity: CensusGranularityName,
    year: int,
    *,
    state_ids: Sequence[str] | None = None,
) -> CacheEstimate

Check the status of the cache for a specified TIGER granularity and year.

Parameters:

  • granularity (CensusGranularityName) –

    The Census granularity.

  • year (int) –

    The geography year.

  • state_ids (Sequence[str] | None, default: None ) –

    If specified, only consider places in this set of states. Must be in state GEOID (FIPS code) format.

Returns:

  • CacheEstimate

    The estimate of the total size of the cached files for the given granularity and year, as well as how much is not currently cached.
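
As a rough illustration of how the two fields of a cache estimate relate, the sketch below sums the sizes of all required files and, separately, of those not yet cached. It is not the library's implementation: `CacheEstimateSketch` and `estimate` are hypothetical, the file names and sizes are made up, and treating the sizes as bytes is an assumption.

```python
from typing import NamedTuple


class CacheEstimateSketch(NamedTuple):
    """Hypothetical mirror of CacheEstimate's two fields (bytes assumed)."""

    total_cache_size: int
    missing_cache_size: int


def estimate(file_sizes: dict[str, int], cached: set[str]) -> CacheEstimateSketch:
    """Sum all file sizes, and separately those not yet cached."""
    total = sum(file_sizes.values())
    missing = sum(size for f, size in file_sizes.items() if f not in cached)
    return CacheEstimateSketch(total, missing)


# Made-up file names and sizes, for illustration only.
sizes = {"states.zip": 3_000_000, "counties.zip": 80_000_000}
est = estimate(sizes, cached={"states.zip"})
print(est)
```

A caller could compare `missing_cache_size` against zero to decide whether a request can be served entirely from the cache.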