2. Racial or Ethnic Residential Segregation Indices

Ian D. Buller (GitHub: @idblr)

2024-09-02

Start with the necessary packages for the vignette.

loadedPackages <- c('dplyr', 'ggplot2', 'ndi', 'sf', 'tidycensus', 'tigris')
invisible(lapply(loadedPackages, library, character.only = TRUE))
options(tigris_use_cache = TRUE)

Set your U.S. Census Bureau access key. Follow this link to obtain one. Supply your access key in one of two ways: pass it to each function below through the key argument of the get_acs() function from the tidycensus package, which is called within each function, or set it once with the census_api_key() function from the tidycensus package before running the functions.

census_api_key('...') # INSERT YOUR OWN KEY FROM U.S. CENSUS API

Racial or Ethnic Residential Segregation Indices

Since version v0.1.1, the ndi package can use data from the ACS to compute racial or ethnic residential segregation indices (many discussed in Massey & Denton (1988)), including:

Evenness:

Exposure:

Concentration:

Centralization:

Clustering:

Racial or ethnic residential segregation indices

The ndi package can use data from the ACS to compute multiple racial or ethnic residential segregation indices for multiple racial or ethnic subgroups, including:

ACS table source | racial or ethnic subgroup | character for subgroup, subgroup_ixn, or subgroup_ref arguments
B03002_002 | not Hispanic or Latino | NHoL
B03002_003 | not Hispanic or Latino, white alone | NHoLW
B03002_004 | not Hispanic or Latino, Black or African American alone | NHoLB
B03002_005 | not Hispanic or Latino, American Indian and Alaska Native alone | NHoLAIAN
B03002_006 | not Hispanic or Latino, Asian alone | NHoLA
B03002_007 | not Hispanic or Latino, Native Hawaiian and Other Pacific Islander alone | NHoLNHOPI
B03002_008 | not Hispanic or Latino, some other race alone | NHoLSOR
B03002_009 | not Hispanic or Latino, two or more races | NHoLTOMR
B03002_010 | not Hispanic or Latino, two races including some other race | NHoLTRiSOR
B03002_011 | not Hispanic or Latino, two races excluding some other race, and three or more races | NHoLTReSOR
B03002_012 | Hispanic or Latino | HoL
B03002_013 | Hispanic or Latino, white alone | HoLW
B03002_014 | Hispanic or Latino, Black or African American alone | HoLB
B03002_015 | Hispanic or Latino, American Indian and Alaska Native alone | HoLAIAN
B03002_016 | Hispanic or Latino, Asian alone | HoLA
B03002_017 | Hispanic or Latino, Native Hawaiian and Other Pacific Islander alone | HoLNHOPI
B03002_018 | Hispanic or Latino, some other race alone | HoLSOR
B03002_019 | Hispanic or Latino, two or more races | HoLTOMR
B03002_020 | Hispanic or Latino, two races including some other race | HoLTRiSOR
B03002_021 | Hispanic or Latino, two races excluding some other race, and three or more races | HoLTReSOR

Compute Dissimilarity Index (D)

Compute the racial or ethnic D values (2006-2010 5-year ACS) for census tracts within counties of Pennsylvania. This metric is based on James & Taeuber (1985). D is a measure of the evenness of racial or ethnic residential segregation when comparing smaller geographical areas to larger ones within which the smaller geographical areas are located. D can range in value from 0 to 1 and represents the proportion of racial or ethnic subgroup members that would have to change their area of residence to achieve an even distribution within the larger geographical area under conditions of maximum segregation.

james_taeuber2010PA <- james_taeuber(
  geo_large = 'county',
  geo_small = 'tract',
  state = 'PA',
  year = 2010,
  subgroup = c('NHoLB', 'HoLB')
)

# Obtain the 2010 counties from the 'tigris' package
county2010PA <- counties(state = 'PA', year = 2010, cb = TRUE)
# Remove the first 9 characters of GEO_ID so that GEOID matches the GEOID returned with the index values
county2010PA$GEOID <- substring(county2010PA$GEO_ID, 10)

# Join the D values to the county geometries
PA2010james_taeuber <- county2010PA %>%
  left_join(james_taeuber2010PA$d, by = 'GEOID')
# Visualize the D values (2006-2010 5-year ACS) for census tracts within counties of Pennsylvania
ggplot() +
  geom_sf(
    data = PA2010james_taeuber,
    aes(fill = D),
    size = 0.05,
    color = 'white'
  ) +
  geom_sf(
    data = county2010PA,
    fill = 'transparent',
    color = 'white',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_viridis_c(limits = c(0, 1)) +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2006-2010 estimates') +
  ggtitle(
    'Dissimilarity Index (James & Taeuber)\nCensus tracts within counties of Pennsylvania',
    subtitle = 'Black population'
  )
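To make the definition concrete, the sketch below computes a one-group D by hand for a single hypothetical county with four tracts. The counts are invented for illustration, and the formula is the Massey & Denton (1988) form of the index; that james_taeuber() evaluates exactly this expression internally is an assumption.

```r
# Hypothetical counts for four tracts in one county (not ACS data)
subgroup_n <- c(10, 20, 30, 40)      # subgroup population per tract
total_n    <- c(100, 100, 100, 100)  # total population per tract

p_i <- subgroup_n / total_n            # subgroup proportion in each tract
P   <- sum(subgroup_n) / sum(total_n)  # subgroup proportion in the county

# D = sum(t_i * |p_i - P|) / (2 * T * P * (1 - P))
D_toy <- sum(total_n * abs(p_i - P)) / (2 * sum(total_n) * P * (1 - P))
round(D_toy, 4)  # 0.2667
```

If every tract had the county-wide proportion (25 of 100), every |p_i - P| term would be zero and D would be 0.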

Compute Gini Index (G)

Compute the racial or ethnic Gini Index (G) values (2006-2010 5-year ACS) for census tracts within counties of Massachusetts. This metric is based on Gini (1921). G is a measure of the evenness of racial or ethnic residential segregation when comparing smaller geographical areas to larger ones within which the smaller geographical areas are located. G can range in value from 0 to 1 and represents the mean absolute difference between the selected subgroup proportions, weighted across all pairs of smaller geographical units and expressed as a proportion of the maximum weighted mean difference.

gini2010MA <- gini(
  geo_large = 'county',
  geo_small = 'tract',
  state = 'MA',
  year = 2010,
  subgroup = c('NHoLB', 'HoLB')
)

# Obtain the 2010 counties from the 'tigris' package
county2010MA <- counties(state = 'MA', year = 2010, cb = TRUE)
# Remove the first 9 characters of GEO_ID so that GEOID matches the GEOID returned with the index values
county2010MA$GEOID <- substring(county2010MA$GEO_ID, 10)

# Join the G values to the county geometries
MA2010gini <- county2010MA %>%
  left_join(gini2010MA$g, by = 'GEOID')
# Visualize the G values (2006-2010 5-year ACS) for census tracts within counties of Massachusetts
ggplot() +
  geom_sf(
    data = MA2010gini,
    aes(fill = G_re),
    size = 0.05,
    color = 'transparent'
  ) +
  geom_sf(
    data = county2010MA,
    fill = 'transparent',
    color = 'white',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_viridis_c(limits = c(0, 1)) +
  labs(
    fill = 'Index (Continuous)',
    caption = 'Source: U.S. Census ACS 2006-2010 estimates') +
  ggtitle(
    'Gini Index (Gini)\nCensus tracts within counties of Massachusetts', 
    subtitle = 'Black population'
  )
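The pairwise definition of G can be checked by hand on a small case. The sketch below uses invented counts for four tracts in one hypothetical county and the Massey & Denton (1988) form of the index; that gini() uses this exact expression internally is an assumption.

```r
# Hypothetical counts for four tracts in one county (not ACS data)
subgroup_n <- c(10, 20, 30, 40)
total_n    <- c(100, 100, 100, 100)

p_i       <- subgroup_n / total_n
P         <- sum(subgroup_n) / sum(total_n)
total_all <- sum(total_n)

pair_weight <- outer(total_n, total_n)     # t_i * t_j for every pair of tracts
pair_diff   <- abs(outer(p_i, p_i, '-'))   # |p_i - p_j| for every pair of tracts

# G = sum_i sum_j t_i * t_j * |p_i - p_j| / (2 * T^2 * P * (1 - P))
G_toy <- sum(pair_weight * pair_diff) / (2 * total_all^2 * P * (1 - P))
round(G_toy, 4)  # 0.3333
```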

Compute Entropy (H)

Compute the racial or ethnic H values (2010-2014 5-year ACS) for census tracts within metropolitan divisions of Pennsylvania. This metric is based on Theil (1972; ISBN:978-0-444-10378-9) and Theil & Finizza (1971). H is a measure of the evenness of racial or ethnic residential segregation when comparing smaller geographical areas to larger ones within which the smaller geographical areas are located. H represents the (weighted) average deviation of each smaller geographical unit from the larger geographical unit’s “entropy” or racial and ethnic diversity, which is greatest when each group is equally represented in the larger geographical unit. H varies from 0, when all smaller geographical units have the same racial or ethnic composition as the larger geographical area (i.e., maximum integration), to 1, when all smaller geographical units contain one group only (i.e., maximum segregation).

theil2014PA <- theil(
  geo_large = 'metro',
  geo_small = 'tract',
  state = c('PA', 'NJ', 'DE', 'MD', 'OH', 'WV', 'NY', 'CT'),
  year = 2014,
  subgroup = c('NHoLB', 'HoLB')
)

# Obtain the 2014 metropolitan divisions from the 'tigris' package
metro2014 <- metro_divisions(year = 2014)
# Obtain the 2014 state from the 'tigris' package
state2014 <- states(year = 2014, cb = TRUE)

# Join the H values to the metropolitan divisions geometries and filter for Pennsylvania
PA2010theil <- metro2014 %>%
  left_join(theil2014PA$h, by = 'GEOID') %>%
  filter(!st_is_empty(.)) %>%
  filter(!is.na(H)) %>%
  st_filter(state2014 %>% filter(STUSPS == 'PA')) %>%
  st_make_valid()
# Visualize the H values (2010-2014 5-year ACS) for census tracts within metropolitan divisions of Pennsylvania
ggplot() +
  geom_sf(
    data = PA2010theil,
    aes(fill = H)
  ) +
  geom_sf(
    data = state2014 %>% filter(STUSPS == 'PA'),
    fill = 'transparent',
    color = 'black',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_viridis_c(limits = c(0, 1)) +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2010-2014 estimates') +
  ggtitle(
    'Entropy (Theil)\nCensus tracts within Metro Divisions of Pennsylvania',
    subtitle = 'Black population'
  )
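The two endpoints of H described above can be reproduced with a small hand computation. The sketch below uses the two-group form of the entropy index from Massey & Denton (1988) on invented counts; the helper names entropy2() and theil_toy() are hypothetical, and that theil() reduces to this form is an assumption.

```r
# Two-group entropy, using the convention 0 * log(0) = 0
entropy2 <- function(p) {
  q <- c(p, 1 - p)
  q <- q[q > 0]
  -sum(q * log(q))
}

# H = sum(t_i * (E - E_i)) / (E * T)
theil_toy <- function(subgroup_n, total_n) {
  p_i <- subgroup_n / total_n
  P   <- sum(subgroup_n) / sum(total_n)
  E   <- entropy2(P)                          # entropy of the larger unit
  E_i <- vapply(p_i, entropy2, numeric(1))    # entropy of each smaller unit
  sum(total_n * (E - E_i)) / (E * sum(total_n))
}

total_n <- c(100, 100, 100, 100)
H_even       <- theil_toy(c(25, 25, 25, 25), total_n)    # same composition everywhere
H_segregated <- theil_toy(c(100, 100, 0, 0), total_n)    # each tract holds one group only
H_mixed      <- theil_toy(c(10, 20, 30, 40), total_n)    # in between
c(H_even, H_segregated)  # 0 1
```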

Compute Atkinson Index (A)

Compute the racial or ethnic A values (2017-2021 5-year ACS) for census block groups within counties of Kentucky. This metric is based on Atkinson (1970), which assessed the distribution of income within 12 countries, and has since been adapted to study racial or ethnic segregation (see James & Taeuber 1985). A is a measure of inequality and, in the context of residential race or ethnicity, of segregation when comparing smaller geographical areas to larger ones within which the smaller geographical areas are located. A can range in value from 0 to 1, and smaller values of the index indicate lower levels of inequality (e.g., less segregation).

A is sensitive to the choice of the epsilon argument, a shape parameter that determines how to weight the increments to inequality (segregation) contributed by different portions of the Lorenz curve. A user must explicitly decide how heavily to weight smaller geographical units at different points on the Lorenz curve (i.e., whether the index should take greater account of differences among areas of over- or under-representation). The epsilon argument must have values between 0 and 1.0. For 0 <= epsilon < 0.5 (less ‘inequality-averse’), smaller geographical units with a subgroup proportion smaller than the subgroup proportion of the larger geographical unit (‘under-representation’) contribute more to inequality. For 0.5 < epsilon <= 1.0 (more ‘inequality-averse’), smaller geographical units with a subgroup proportion larger than the subgroup proportion of the larger geographical unit (‘over-representation’) contribute more to inequality. If epsilon = 0.5 (the default), units of over- and under-representation contribute equally to the index. See Section 2.3 of Saint-Jacques et al. (2020) for one method to select epsilon. We choose epsilon = 0.67 in the example below:

atkinson2021KY <- atkinson(
  geo_large = 'county',
  geo_small = 'block group',
  state = 'KY',
  year = 2021,
  subgroup = 'NHoLB',
  epsilon = 0.67
)

# Obtain the 2021 counties from the 'tigris' package
county2021KY <- counties(state = 'KY', year = 2021, cb = TRUE)

# Join the A values to the county geometries
KY2021atkinson <- county2021KY %>% 
  left_join(atkinson2021KY$a, by = 'GEOID')
# Visualize the A values (2017-2021 5-year ACS) for census block groups within counties of Kentucky
ggplot() +
  geom_sf(
    data = KY2021atkinson,
    aes(fill = A),
    size = 0.05,
    color = 'white'
  ) +
  geom_sf(
    data = county2021KY,
    fill = 'transparent',
    color = 'white',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_viridis_c(limits = c(0, 1)) +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2017-2021 estimates') +
  ggtitle(
    'Atkinson Index (Atkinson)\nCensus block groups within counties of Kentucky',
    subtitle = expression(paste('Black non-Hispanic (', epsilon, ' = 0.67)'))
  )
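The role of epsilon can be explored on a small case before running the full analysis. The sketch below uses the Massey & Denton (1988) form of the Atkinson Index on invented counts; the helper atkinson_toy() is hypothetical, it only handles epsilon < 1 (this form divides by 1 - epsilon), and that atkinson() uses this exact expression internally is an assumption.

```r
# A = 1 - (P / (1 - P)) * |sum((1 - p_i)^(1 - e) * p_i^e * t_i) / (P * T)|^(1 / (1 - e))
atkinson_toy <- function(subgroup_n, total_n, epsilon = 0.5) {
  stopifnot(epsilon >= 0, epsilon < 1)  # this sketch does not cover epsilon = 1
  p_i   <- subgroup_n / total_n
  P     <- sum(subgroup_n) / sum(total_n)
  inner <- sum((1 - p_i)^(1 - epsilon) * p_i^epsilon * total_n) / (P * sum(total_n))
  1 - (P / (1 - P)) * abs(inner)^(1 / (1 - epsilon))
}

total_n <- c(100, 100, 100, 100)
A_even       <- atkinson_toy(c(25, 25, 25, 25), total_n, epsilon = 0.67)  # about 0
A_segregated <- atkinson_toy(c(100, 0, 0, 0), total_n, epsilon = 0.67)    # 1
A_050        <- atkinson_toy(c(10, 20, 30, 40), total_n, epsilon = 0.5)
A_067        <- atkinson_toy(c(10, 20, 30, 40), total_n, epsilon = 0.67)
c(A_050, A_067)  # same counts, different weighting of the Lorenz curve
```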

Compute Dissimilarity Index (D)

Compute the racial or ethnic D values (2006-2010 5-year ACS) for census tracts within counties of Pennsylvania. This metric is based on Duncan & Duncan (1955a) that assessed the racial or ethnic isolation of students that identify as non-Hispanic or Latino, Black or African American alone compared to students that identify as non-Hispanic or Latino, white alone between schools and school districts. D is a measure of the evenness of racial or ethnic residential segregation when comparing smaller geographical areas to larger ones within which the smaller geographical areas are located. D can range in value from 0 to 1 and represents the proportion of racial or ethnic subgroup members that would have to change their area of residence to achieve an even distribution within the larger geographical area under conditions of maximum segregation.

duncan2010PA <- duncan(
  geo_large = 'county',
  geo_small = 'tract',
  state = 'PA',
  year = 2010,
  subgroup = 'NHoLB',
  subgroup_ref = 'NHoLW'
)

# Obtain the 2010 counties of Pennsylvania from the 'tigris' package
county2010PA <- counties(state = 'PA', year = 2010, cb = TRUE)
# Remove the first 9 characters of GEO_ID so that GEOID matches the GEOID returned with the index values
county2010PA$GEOID <- substring(county2010PA$GEO_ID, 10)

# Join the D values to the county geometries
PA2010duncan <- county2010PA %>%
  left_join(duncan2010PA$d, by = 'GEOID')
# Visualize the D values (2006-2010 5-year ACS) for census tracts within counties of Pennsylvania
ggplot() +
  geom_sf(
    data = PA2010duncan,
    aes(fill = D),
    size = 0.05,
    color = 'white'
  ) +
  geom_sf(
    data = county2010PA,
    fill = 'transparent',
    color = 'white',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_viridis_c(limits = c(0, 1)) +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2006-2010 estimates') +
  ggtitle(
    'Dissimilarity Index (Duncan & Duncan)\nCensus tracts within counties of Pennsylvania',
    subtitle = 'Black non-Hispanic vs. white non-Hispanic'
  )
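The two-group D compares how a subgroup and a referent subgroup are spread across the smaller units. The sketch below applies the standard Duncan & Duncan form to invented counts for four tracts in one hypothetical county; that duncan() uses this exact expression internally is an assumption.

```r
# Hypothetical counts for four tracts in one county (not ACS data)
subgroup_n <- c(10, 20, 30, 40)  # subgroup of interest (x_i)
ref_n      <- c(90, 60, 30, 20)  # referent subgroup (y_i)

share_subgroup <- subgroup_n / sum(subgroup_n)  # x_i / X
share_ref      <- ref_n / sum(ref_n)            # y_i / Y

# D = 0.5 * sum(|x_i / X - y_i / Y|)
D_duncan_toy <- 0.5 * sum(abs(share_subgroup - share_ref))
D_duncan_toy  # 0.45
```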

Compute Location Quotient (LQ)

Compute the racial or ethnic LQ values (2017-2021 5-year ACS) for counties within the state of Tennessee. This metric is based on Merton (1939) and adapted by Sudano et al. (2013). LQ is a measure of the relative racial homogeneity of each smaller geography within a larger geography. LQ can range in value from 0 to infinity because it is a ratio of two proportions in which the numerator is the proportion of subgroup population in a smaller geography and the denominator is the proportion of subgroup population in its larger geography. For example, a smaller geography with an LQ of 5 means that the proportion of the subgroup population living in the smaller geography is five times the proportion of the subgroup population in its larger geography. Unlike the previous metrics that aggregate to the larger geography, LQ computes values for each smaller geography relative to the larger geography.

sudano2021TN <- sudano(
  geo_large = 'state',
  geo_small = 'county',
  state = 'TN',
  year = 2021,
  subgroup = 'NHoLB'
)

# Obtain the 2021 counties of Tennessee from the 'tigris' package
county2021TN <- counties(state = 'TN', year = 2021, cb = TRUE)

# Join the LQ values to the county geometry
TN2021sudano <- county2021TN %>% 
  left_join(sudano2021TN$lq, by = 'GEOID')
# Visualize the LQ values (2017-2021 5-year ACS) for counties within state of Tennessee
ggplot() +
  geom_sf(
    data = TN2021sudano,
    aes(fill = LQ),
    size = 0.05,
    color = 'white'
  ) +
  geom_sf(
    data = county2021TN,
    fill = 'transparent',
    color = 'white',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_viridis_c() +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2017-2021 estimates') +
  ggtitle(
    'Location Quotient (Sudano)\nCounties within state of Tennessee',
    subtitle = 'Black non-Hispanic'
  )
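Because LQ is a ratio of two proportions, it is easy to verify by hand. The sketch below uses invented counts for three counties in one hypothetical state; it illustrates the ratio described above and is not the internals of sudano().

```r
# Hypothetical counts for three counties in one state (not ACS data)
subgroup_n <- c(5, 20, 50)
total_n    <- c(100, 200, 200)

p_small <- subgroup_n / total_n            # subgroup proportion in each county
p_large <- sum(subgroup_n) / sum(total_n)  # subgroup proportion in the state (0.15)

# One LQ value per smaller geography
LQ_toy <- p_small / p_large
round(LQ_toy, 2)  # 0.33 0.67 1.67
```

Values below 1 mark counties where the subgroup is a smaller share of the population than in the state as a whole, and values above 1 mark counties where it is a larger share.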

Compute Interaction Index (xPy*)

Compute the racial or ethnic xPy* values (2017-2021 5-year ACS) for census tracts within counties of Ohio. This metric is based on Shevky & Williams (1949; ISBN-13:978-0-837-15637-8) and adapted by Bell (1954). xPy* is a measure of the probability that a member of one subgroup(s) will meet or interact with a member of another subgroup(s), with higher values signifying a higher probability of interaction (less isolation), when comparing smaller geographical areas to larger ones within which the smaller geographical areas are located. xPy* can range in value from 0 to 1.

bell2021OH <- bell(
  geo_large = 'county',
  geo_small = 'tract',
  state = 'OH',
  year = 2021,
  subgroup = 'NHoLB',
  subgroup_ixn = 'NHoLW'
)

# Obtain the 2021 counties of Ohio from the 'tigris' package
county2021OH <- counties(state = 'OH', year = 2021, cb = TRUE)

# Join the xPy* values to the county geometries
OH2021bell <- county2021OH %>%
  left_join(bell2021OH$xpy_star, by = 'GEOID')
# Visualize the xPy* values (2017-2021 5-year ACS) for census tracts within counties of Ohio
ggplot() +
  geom_sf(
    data = OH2021bell,
    aes(fill = xPy_star),
    size = 0.05,
    color = 'white'
  ) +
  geom_sf(
    data = county2021OH,
    fill = 'transparent',
    color = 'white',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_viridis_c(limits = c(0, 1)) +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2017-2021 estimates') +
  ggtitle(
    'Interaction Index (Bell)\nCensus tracts within counties of Ohio',
    subtitle = 'Black non-Hispanic vs. white non-Hispanic'
  )
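As a hand check of the interaction probability, the sketch below applies the standard exposure formula to invented counts for four tracts in a hypothetical county that contains only two subgroups; that bell() uses this exact expression internally is an assumption.

```r
# Hypothetical counts for four tracts in one county (not ACS data)
subgroup_n <- c(10, 20, 30, 40)   # subgroup of interest (x_i)
ixn_n      <- c(90, 80, 70, 60)   # interaction subgroup (y_i)
total_n    <- subgroup_n + ixn_n  # two-group county, so t_i = x_i + y_i

# xPy* = sum((x_i / X) * (y_i / t_i))
xPy_star_toy <- sum((subgroup_n / sum(subgroup_n)) * (ixn_n / total_n))
xPy_star_toy  # 0.7
```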

Compute Isolation Index (xPx*)

Compute the racial or ethnic xPx* values (2015-2019 5-year ACS) for census block groups within census tracts of Delaware. This metric is based on Bell (1954) and adapted by Lieberson (1981; ISBN-13:978-1-032-53884-6). xPx* is a measure of the probability that a member of one subgroup(s) will meet or interact with another member of the same subgroup(s), with higher values signifying a higher probability of within-group contact (more isolation), when comparing smaller geographical areas to larger ones within which the smaller geographical areas are located. xPx* can range in value from 0 to 1.

lieberson2021DE <- lieberson(
  geo_large = 'tract',
  geo_small = 'block group',
  state = 'DE',
  year = 2019,
  subgroup = 'NHoLB'
)

# Obtain the 2019 census tracts of Delaware from the 'tigris' package
tract2021DE <- tracts(state = 'DE', year = 2019, cb = TRUE)

# Join the xPx* values to the census tract geometries
DE2021lieberson <- tract2021DE %>%
  left_join(lieberson2021DE$xpx_star, by = 'GEOID')
# Visualize the xPx* values (2015-2019 5-year ACS) for census block groups within census tracts of Delaware
ggplot() +
  geom_sf(
    data = DE2021lieberson,
    aes(fill = xPx_star),
    size = 0.05,
    color = 'white'
  ) +
  geom_sf(
    data = tract2021DE,
    fill = 'transparent',
    color = 'white',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_viridis_c(limits = c(0, 1)) +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2015-2019 estimates') +
  ggtitle(
    'Isolation Index (Lieberson)\nCensus block groups within census tracts of Delaware',
    subtitle = 'Black non-Hispanic'
  )
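The isolation index has the same form as the interaction index, with the subgroup's own proportion in place of the other subgroup's proportion. The sketch below applies the standard formula to invented counts for four block groups in one hypothetical tract; that lieberson() uses this exact expression internally is an assumption.

```r
# Hypothetical counts for four block groups in one tract (not ACS data)
subgroup_n <- c(10, 20, 30, 40)      # subgroup of interest (x_i)
total_n    <- c(100, 100, 100, 100)  # total population (t_i)

# xPx* = sum((x_i / X) * (x_i / t_i))
xPx_star_toy <- sum((subgroup_n / sum(subgroup_n)) * (subgroup_n / total_n))
xPx_star_toy  # 0.3
```

When only two subgroups are present, the isolation of one subgroup and its interaction with the other sum to 1.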

Compute Correlation Ratio (V)

Compute the racial or ethnic V values (2017-2021 5-year ACS) for census tracts within counties of South Carolina. This metric is based on Bell (1954) and adapted by White (1986). V removes the asymmetry from the Isolation Index by controlling for the effect of population composition when comparing smaller geographical areas to larger ones within which the smaller geographical areas are located. The Isolation Index is a measure of the probability that a member of one subgroup(s) will meet or interact with another member of the same subgroup(s), with higher values signifying more isolation. V can range in value from 0 to Inf.

white2021SC <- white(
  geo_large = 'county',
  geo_small = 'tract',
  state = 'SC',
  year = 2021,
  subgroup = 'NHoLB'
)

# Obtain the 2021 counties of South Carolina from the 'tigris' package
county2021SC <- counties(state = 'SC', year = 2021, cb = TRUE)

# Join the V values to the county geometries
SC2021white <- county2021SC %>%
  left_join(white2021SC$v, by = 'GEOID')
# Visualize the V values (2017-2021 5-year ACS) for census tracts within counties of South Carolina
ggplot() +
  geom_sf(
    data = SC2021white,
    aes(fill = V),
    size = 0.05,
    color = 'white'
  ) +
  geom_sf(
    data = county2021SC,
    fill = 'transparent',
    color = 'white',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_viridis_c() +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2017-2021 estimates') +
  ggtitle(
    'Correlation Ratio (White)\nCensus tracts within counties of South Carolina',
    subtitle = 'Black non-Hispanic'
  )
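The adjustment for population composition can be seen in a short hand computation. The sketch below uses the Massey & Denton (1988) form, V = (xPx* - P) / (1 - P), on invented counts for four tracts in one hypothetical county; that white() uses this exact expression internally is an assumption.

```r
# Hypothetical counts for four tracts in one county (not ACS data)
subgroup_n <- c(10, 20, 30, 40)
total_n    <- c(100, 100, 100, 100)

P        <- sum(subgroup_n) / sum(total_n)                                 # 0.25
xPx_star <- sum((subgroup_n / sum(subgroup_n)) * (subgroup_n / total_n))  # isolation index, 0.3

# Remove the part of isolation that is explained by the subgroup's overall share
V_toy <- (xPx_star - P) / (1 - P)
round(V_toy, 4)  # 0.0667
```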

Compute Local Exposure and Isolation (LEx/Is)

Compute the racial or ethnic Local Exposure and Isolation metric (2017-2021 5-year ACS) for counties within the state of Mississippi. This metric is based on Bemanian & Beyer (2017). LEx/Is is a measure of the probability that two individuals living within a specific smaller geography (e.g., census tract) of either different (i.e., exposure) or the same (i.e., isolation) racial or ethnic subgroup(s) will interact, assuming that individuals within a smaller geography are randomly mixed. LEx/Is is standardized with a logit transformation and centered against an expected case that all races or ethnicities are evenly distributed across a larger geography. LEx/Is can range from negative infinity to infinity. If LEx/Is is zero, then the estimated probability of the interaction between two people of the given subgroup(s) within a smaller geography is equal to the expected probability if the subgroup(s) were perfectly mixed in the larger geography. If LEx/Is is greater than zero, then the interaction is more likely to occur within the smaller geography than in the larger geography, and if LEx/Is is less than zero, then the interaction is less likely to occur within the smaller geography than in the larger geography. Note: the exponentiation of each LEx/Is metric results in the odds ratio of the specific exposure or isolation of interest in a smaller geography relative to the larger geography. Similar to LQ (Sudano), LEx/Is computes values for each smaller geography relative to the larger geography.

bemanian_beyer2021MS <- bemanian_beyer(
  geo_large = 'state',
  geo_small = 'county',
  state = 'MS',
  year = 2021,
  subgroup = 'NHoLB',
  subgroup_ixn = 'NHoLW'
)

# Obtain the 2021 counties of Mississippi from the 'tigris' package
county2021MS <- counties(state = 'MS', year = 2021, cb = TRUE)

# Join the LEx/Is values to the county geometries
MS2021bemanian_beyer <- county2021MS %>%
  left_join(bemanian_beyer2021MS$lexis, by = 'GEOID')
# Visualize the LEx/Is values (2017-2021 5-year ACS) for counties of state of Mississippi
ggplot() +
  geom_sf(
    data = MS2021bemanian_beyer,
    aes(fill = LExIs),
    size = 0.05,
    color = 'white'
  ) +
  geom_sf(
    data = county2021MS,
    fill = 'transparent',
    color = 'white',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_gradient2(
    low = '#998ec3',
    mid = '#f7f7f7',
    high = '#f1a340'
  ) +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2017-2021 estimates') +
  ggtitle(
    'Local Exposure and Isolation (Bemanian & Beyer)\nCounties within state of Mississippi',
    subtitle = 'Black non-Hispanic vs. white non-Hispanic'
  )

# Visualize the exponentiated LEx/Is values (2017-2021 5-year ACS) for Counties within state of Mississippi
ggplot() +
  geom_sf(
    data = MS2021bemanian_beyer,
    aes(fill = exp(LExIs)),
    size = 0.05,
    color = 'white'
  ) +
  geom_sf(
    data = county2021MS,
    fill = 'transparent',
    color = 'white',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_viridis_c() +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2017-2021 estimates') +
  ggtitle(
    'Odds ratio of Local Exposure and Isolation (Bemanian & Beyer)\nCounties within state of Mississippi',
    subtitle = 'Black non-Hispanic vs. white non-Hispanic'
  )
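The link between the logit-scale values and the odds ratio can be checked on a small case. The sketch below assumes the exposure form logit(p_x * p_y) - logit(P_x * P_y), where lowercase proportions belong to the smaller geography and uppercase proportions to the larger one; this form, the helper functions, and the invented counts are assumptions for illustration and are not taken from the internals of bemanian_beyer().

```r
logit <- function(p) log(p / (1 - p))
odds  <- function(p) p / (1 - p)

# Hypothetical counts for four counties in one state (not ACS data)
subgroup_n <- c(10, 40, 50, 60)      # subgroup of interest
ixn_n      <- c(80, 50, 50, 20)      # interaction subgroup
total_n    <- c(100, 100, 100, 100)

local_prob    <- (subgroup_n / total_n) * (ixn_n / total_n)  # 0.08 0.20 0.25 0.12
expected_prob <- (sum(subgroup_n) / sum(total_n)) *
  (sum(ixn_n) / sum(total_n))                                # 0.4 * 0.5 = 0.2

# Assumed form: local logit centered on the expected (evenly mixed) case
lexis_toy <- logit(local_prob) - logit(expected_prob)
sign(lexis_toy)  # -1 0 1 -1: less likely, as expected, more likely, less likely

# Exponentiating recovers the odds ratio of local versus expected interaction
odds_ratio_toy <- odds(local_prob) / odds(expected_prob)
all.equal(exp(lexis_toy), odds_ratio_toy)  # TRUE
```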

Compute Racial Isolation Index (RI)

Compute the spatial RI values (2006-2010 5-year ACS) for census tracts of North Carolina. This metric is based on Anthopolos et al. (2011) that assessed the racial isolation of the population that identifies as non-Hispanic or Latino, Black or African American alone. A census geography (and its neighbors) in which nearly all of the population identifies with the specified race or ethnicity subgroup(s) (e.g., not Hispanic or Latino, Black or African American alone) will have an RI value close to 1. In contrast, a census geography (and its neighbors) in which nearly none of the population identifies with the specified race or ethnicity subgroup(s) will have an RI value close to 0.

anthopolos2010NC <- anthopolos(
  state = 'NC', 
  year = 2010, 
  subgroup = 'NHoLB'
)

# Obtain the 2010 census tracts of North Carolina from the 'tigris' package
tract2010NC <- tracts(state = 'NC', year = 2010, cb = TRUE)
# Remove the first 9 characters of GEO_ID so that GEOID matches the GEOID returned with the index values
tract2010NC$GEOID <- substring(tract2010NC$GEO_ID, 10)

# Obtain the 2010 counties of North Carolina from the 'tigris' package
county2010NC <- counties(state = 'NC', year = 2010, cb = TRUE)

# Join the RI values to the census tract geometries
NC2010anthopolos <- tract2010NC %>%
  left_join(anthopolos2010NC$ri, by = 'GEOID')
# Visualize the RI values (2006-2010 5-year ACS) for census tracts of North Carolina
ggplot() +
  geom_sf(
    data = NC2010anthopolos,
    aes(fill = RI),
    size = 0.05,
    color = 'transparent'
  ) +
  geom_sf(
    data = county2010NC,
    fill = 'transparent',
    color = 'white',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_viridis_c() +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2006-2010 estimates') +
  ggtitle(
    'Racial Isolation Index (Anthopolos)\nCensus tracts of North Carolina',
    subtitle = 'Black Non-Hispanic populations (not corrected for edge effects)'
  )

The current version of the ndi package does not correct for edge effects. Census geographies along the border of the specified spatial extent, a coastline, or the U.S.-Mexico / U.S.-Canada border may have few neighboring census geographies, and RI values in these census geographies may be unstable. A stop-gap solution for the former source of edge effect is to compute the RI for the study area of interest together with its neighboring census geographies (i.e., the states bordering the study area of interest) and then use only the estimates for the study area of interest.

# Compute RI for all census tracts in neighboring states
anthopolos2010GNSTV <- anthopolos(
  state = c('GA', 'NC', 'SC', 'TN', 'VA'),
  year = 2010,
  subgroup = 'NHoLB'
)

# Crop to only census tracts of North Carolina
anthopolos2010NCe <- anthopolos2010GNSTV$ri[anthopolos2010GNSTV$ri$GEOID %in% 
                                              anthopolos2010NC$ri$GEOID, ]

# Obtain the 2010 census tracts of North Carolina from the 'tigris' package
tract2010NC <- tracts(state = 'NC', year = 2010, cb = TRUE)
# Remove the first 9 characters of GEO_ID so that GEOID matches the GEOID returned with the index values
tract2010NC$GEOID <- substring(tract2010NC$GEO_ID, 10)

# Obtain the 2010 counties of North Carolina from the 'tigris' package
county2010NC <- counties(state = 'NC', year = 2010, cb = TRUE)

# Join the RI values to the census tract geometries
edgeNC2010anthopolos <- tract2010NC %>% 
  left_join(anthopolos2010NCe, by = 'GEOID')
# Visualize the RI values (2006-2010 5-year ACS) for census tracts of North Carolina
ggplot() +
  geom_sf(
    data = edgeNC2010anthopolos,
    aes(fill = RI),
    size = 0.05,
    color = 'transparent'
  ) +
  geom_sf(
    data = county2010NC,
    fill = 'transparent',
    color = 'white',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_viridis_c() +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2006-2010 estimates') +
  ggtitle(
    'Racial Isolation Index (Anthopolos)\nCensus tracts of North Carolina',
    subtitle = 'Black Non-Hispanic populations (corrected for interstate edge effects)'
  )
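The interstate edge effect can be illustrated with a toy strip of four tracts, where the fourth tract lies across the study-area border. The sketch below treats RI as a neighbor-weighted proportion (the subgroup count summed over a tract and its adjacent tracts, divided by the total count over the same set). The helper ri_toy(), the invented counts, and the self weight of 1 are assumptions for illustration; the weights used inside anthopolos() may differ.

```r
# Hypothetical strip of tracts 1-2-3-4; tract 4 is outside the study area
subgroup_n <- c(80, 60, 20, 5)
total_n    <- c(100, 100, 100, 100)

adj <- matrix(0, nrow = 4, ncol = 4)
adj[cbind(1:3, 2:4)] <- 1  # each tract touches the next one along the strip
adj <- adj + t(adj)        # make the adjacency symmetric

ri_toy <- function(x, tot, adj, self_weight = 1) {
  w <- adj
  diag(w) <- self_weight   # assumed weight of a tract on itself
  as.vector(w %*% x) / as.vector(w %*% tot)
}

ri_inside <- ri_toy(subgroup_n[1:3], total_n[1:3], adj[1:3, 1:3])  # study area only
ri_all    <- ri_toy(subgroup_n, total_n, adj)[1:3]                 # neighbors included, then cropped

round(rbind(ri_inside, ri_all), 3)
# Tracts 1 and 2 are unchanged; border tract 3 drops from 0.4 to 0.283
```

Only the tract on the border changes, which mirrors the reason for computing RI over the bordering states and then cropping to North Carolina.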

Compute Delta (DEL)

Compute the racial or ethnic DEL values (2017-2021 5-year ACS) for census tracts within counties of Alabama. This metric is based on Hoover (1941) and Duncan et al. (1961; LC:60007089). DEL is a measure of the proportion of members of one subgroup(s) residing in geographic units with above-average density of members of the subgroup(s). The index provides the proportion of a subgroup population that would have to move across geographic units to achieve a uniform density. DEL can range in value from 0 to 1.

hoover2021AL <- hoover(
  geo_large = 'county',
  geo_small = 'tract',
  state = 'AL',
  year = 2021,
  subgroup = 'NHoLB'
)

# Obtain the 2021 counties of Alabama from the 'tigris' package
county2021AL <- counties(state = 'AL', year = 2021, cb = TRUE)

# Join the DEL values to the county geometries
AL2021hoover <- county2021AL %>%
  left_join(hoover2021AL$del, by = 'GEOID')
# Visualize the DEL values (2017-2021 5-year ACS) for census tracts within counties within Alabama
ggplot() +
  geom_sf(
    data = AL2021hoover,
    aes(fill = DEL),
    size = 0.05,
    color = 'white'
  ) +
  geom_sf(
    data = county2021AL,
    fill = 'transparent',
    color = 'white',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_viridis_c(limits = c(0, 1)) +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2017-2021 estimates') +
  ggtitle(
    'Delta (Hoover)\nCensus tracts within counties within Alabama',
    subtitle = 'Black Non-Hispanic population'
  )
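DEL compares each unit's share of the subgroup population with its share of land area. The sketch below uses the Massey & Denton (1988) form on invented counts and areas for four tracts in one hypothetical county; that hoover() uses this exact expression internally is an assumption.

```r
# Hypothetical values for four tracts in one county (not ACS data)
subgroup_n <- c(10, 20, 30, 40)  # subgroup population (x_i)
land_area  <- c(25, 25, 25, 25)  # land area in arbitrary units (a_i)

pop_share  <- subgroup_n / sum(subgroup_n)  # x_i / X
area_share <- land_area / sum(land_area)    # a_i / A

# DEL = 0.5 * sum(|x_i / X - a_i / A|)
DEL_toy <- 0.5 * sum(abs(pop_share - area_share))
DEL_toy  # 0.2
```

Here 20% of the subgroup would have to move between tracts for every tract to hold the same subgroup density.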

Compute Relative Concentration (RCO)

Compute the racial or ethnic RCO values (2015-2019 5-year ACS) for census tracts within core-based statistical areas of Wisconsin. This metric is based on Massey & Denton (1988) and Duncan, Cuzzort, & Duncan (1961; LC:60007089). RCO is a measure of concentration of racial or ethnic populations within smaller geographical units that are located within larger geographical units. RCO can range from -1 to 1 and represents the share of a larger geographical unit occupied by a racial or ethnic subgroup compared to a referent racial or ethnic subgroup. A value of 1 indicates that the concentration of the racial or ethnic subgroup exceeds the concentration of the referent racial or ethnic subgroup at the maximum extent possible. A value of -1 is the converse. Note: Computed as designed, but values smaller than -1 are possible if the racial or ethnic subgroup population is larger than the referent racial or ethnic subgroup population.

denton_cuzzort2019WI <- denton_cuzzort(
  geo_large = 'cbsa',
  geo_small = 'tract',
  state = c('WI', 'IL', 'MI', 'MN'),
  year = 2019,
  subgroup = 'NHoLB',
  subgroup_ref = 'NHoLW'
)

# Obtain the 2019 core-based statistical areas from the 'tigris' package
cbsa2019 <- core_based_statistical_areas(year = 2019, cb = TRUE)
# Obtain the 2019 state from the 'tigris' package
states2019 <- states(year = 2019, cb = TRUE)

# Join the RCO values to the core-based statistical area geometries and filter for Wisconsin
WI2019denton_cuzzort <- cbsa2019 %>%
  left_join(denton_cuzzort2019WI$rco, by = 'GEOID') %>%
  filter(!st_is_empty(.)) %>%
  filter(!is.na(RCO)) %>%
  st_filter(states2019 %>% filter(STUSPS == 'WI'), .predicate = st_within) %>%
  st_make_valid()
# Visualize the RCO values (2015-2019 5-year ACS) for census tracts within core-based statistical areas of Wisconsin.
ggplot() +
  geom_sf(
    data = WI2019denton_cuzzort,
    aes(fill = RCO)
  ) +
  geom_sf(
    data = states2019 %>% filter(STUSPS == 'WI'),
    fill = 'transparent',
    color = 'black',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_gradient2(
    low = '#998ec3', 
    mid = '#f7f7f7', 
    high = '#f1a340', 
    midpoint = 0
  ) +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2015-2019 estimates') +
  ggtitle(
    'Relative Concentration (Massey & Duncan)\nCensus tracts within core-based statistical areas of Wisconsin',
    subtitle = 'Black non-Hispanic vs. white non-Hispanic'
  )

Compute Absolute Concentration (ACO)

Compute the racial or ethnic ACO values (2015-2019 5-year ACS) for census tracts within core-based statistical areas of Wisconsin. This metric is based on Massey & Denton (1988) and Duncan, Cuzzort, & Duncan (1961; LC:60007089). ACO is a measure of concentration of racial or ethnic populations within smaller geographical units that are located within larger geographical units. ACO can range from 0 to 1 and represents the relative amount of physical space occupied by a racial or ethnic subgroup in a larger geographical unit. A value of 1 indicates that a racial or ethnic subgroup has achieved the maximum spatial concentration possible (all racial or ethnic subgroup members live in the smallest of the smaller geographical units). A value of 0 indicates the maximum deconcentration possible (all racial or ethnic subgroup members live in the largest of the smaller geographical units).

massey_duncan2019WI <- massey_duncan(
  geo_large = 'cbsa',
  geo_small = 'tract',
  state = c('WI', 'IL', 'MI', 'MN'),
  year = 2019,
  subgroup = c('NHoLB', 'HoLB')
)

# Obtain the 2019 core-based statistical areas from the 'tigris' package
cbsa2019 <- core_based_statistical_areas(year = 2019, cb = TRUE)
# Obtain the 2019 state from the 'tigris' package
states2019 <- states(year = 2019, cb = TRUE)

# Join the ACO values to the core-based statistical area geometries and filter for Wisconsin
WI2019massey_duncan <- cbsa2019 %>%
  left_join(massey_duncan2019WI$aco, by = 'GEOID') %>%
  filter(!st_is_empty(.)) %>%
  filter(!is.na(ACO)) %>%
  st_filter(states2019 %>% filter(STUSPS == 'WI'), .predicate = st_within) %>%
  st_make_valid()
# Visualize the ACO values (2015-2019 5-year ACS) for census tracts within core-based statistical areas of Wisconsin.
ggplot() +
  geom_sf(
    data = WI2019massey_duncan,
    aes(fill = ACO)
  ) +
  geom_sf(
    data = states2019 %>% filter(STUSPS == 'WI'),
    fill = 'transparent',
    color = 'black',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_viridis_c(limits = c(0, 1)) +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2015-2019 estimates') +
  ggtitle(
    'Absolute Concentration (Massey & Duncan)\nCensus tracts within core-based statistical areas of Wisconsin',
    subtitle = 'Black population'
  )

Compute Absolute Centralization (ACE)

Compute the racial or ethnic ACE values (2013-2017 5-year ACS) for census block groups within census-designated places of Connecticut. This metric is based on Duncan, Cuzzort, & Duncan (1961; LC:60007089) and Massey & Denton (1988). ACE is a measure of the degree to which racial or ethnic populations within smaller geographical units are located near the center of a larger geographical unit. ACE can range in value from -1 to 1 and represents the spatial distribution of a racial or ethnic population relative to the distribution of land area around the center of a larger geographical unit. Positive values indicate a tendency for the racial or ethnic population to reside closer to the center of a larger geographical unit, while negative values indicate the population is distributed farther from the center. A score of 0 means that the racial or ethnic population has a uniform distribution throughout a larger geographical unit. ACE gives the proportion of the racial or ethnic population required to change residence to achieve a uniform distribution around the center of a larger geographical unit.

Note: The original metric used the location of the central business district (CBD) to compute the metric, but the U.S. Census Bureau has not defined CBDs for U.S. cities since the 1982 Census of Retail Trade. Therefore, this function uses the centroid of each larger geographical unit as the ‘centre’, which may not represent the current CBD.
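
As a rough illustration of how an absolute centralization index responds to where a subgroup lives relative to the center, the sketch below applies the formulation given in Massey & Denton (1988) to made-up data in base R. The helper name `ace_sketch` and all input values are hypothetical, and this is not the `ndi` implementation, which derives the distances from the centroids of the larger geographical units.

```r
# Toy sketch of absolute centralization (Massey & Denton 1988); NOT the 'ndi' code
# x:    subgroup count in each smaller unit
# area: land area of each smaller unit
# dist: distance from each smaller unit to the center of the larger unit
ace_sketch <- function(x, area, dist) {
  o <- order(dist)                  # sort units from nearest to farthest
  X <- cumsum(x[o]) / sum(x)        # cumulative proportion of the subgroup
  A <- cumsum(area[o]) / sum(area)  # cumulative proportion of land area
  n <- length(X)
  # sum(X[i-1] * A[i]) - sum(X[i] * A[i-1]) over i = 2..n
  sum(X[-n] * A[-1]) - sum(X[-1] * A[-n])
}

area <- c(1, 1, 1)  # hypothetical equal-area units
dist <- c(1, 2, 3)  # hypothetical distances to the center

ace_central   <- ace_sketch(c(60, 0, 0), area, dist)    # all in the nearest unit
ace_uniform   <- ace_sketch(c(20, 20, 20), area, dist)  # spread evenly over area
ace_periphery <- ace_sketch(c(0, 0, 60), area, dist)    # all in the farthest unit
```

With these toy inputs the central case is positive, the uniform case is 0, and the peripheral case is negative.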

duncan_cuzzort2017CT <- duncan_cuzzort(
  geo_large = 'place',
  geo_small = 'cbg',
  state = 'CT',
  year = 2017,
  subgroup = c('NHoLB', 'HoLB')
)

# Obtain the 2017 census-designated places of Connecticut from the 'tigris' package
places2017 <- places(year = 2017, state = 'CT')
# Obtain the 2017 state from the 'tigris' package
states2017 <- states(year = 2017, cb = TRUE)

# Join the ACE values to the census-designated places geometries and filter for Connecticut
CT2010duncan_cuzzort <- places2017 %>%
  left_join(duncan_cuzzort2017CT$ace, by = 'GEOID') %>%
  filter(!st_is_empty(.)) %>%
  filter(!is.na(ACE)) %>%
  st_filter(states2017 %>% filter(STUSPS == 'CT')) %>%
  st_make_valid()
# Visualize the ACE values (2013-2017 5-year ACS) for census block groups within census-designated places of Connecticut
ggplot() +
  geom_sf(
    data = CT2010duncan_cuzzort,
    aes(fill = ACE)
  ) +
  geom_sf(
    data = states2017 %>% filter(STUSPS == 'CT'),
    fill = 'transparent',
    color = 'black',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_gradient2(
    low = '#998ec3', 
    mid = '#f7f7f7', 
    high = '#f1a340', 
    midpoint = 0,
    limits = c(-1, 1)
  ) +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2013-2017 estimates') +
  ggtitle(
    'Absolute Centralization (Duncan & Cuzzort)\nCensus block groups within census-designated places of Connecticut',
    subtitle = 'Black population'
  )

Compute Relative Centralization (RCE)

Compute the racial or ethnic RCE values (2013-2017 5-year ACS) for census block groups within census-designated places of Connecticut. This metric is based on Duncan & Duncan (1955b) and Massey & Denton (1988). RCE is a measure of the degree to which racial or ethnic populations within smaller geographical units are located near the center of a larger geographical unit. RCE can range in value from -1 to 1 and represents the spatial distribution of racial or ethnic populations within smaller geographical units compared to the distribution of the referent racial or ethnic population around the center of a larger geographical unit. Positive values indicate a tendency for racial or ethnic populations to reside closer to the center of a larger geographical unit than the referent racial or ethnic population, while negative values indicate the racial or ethnic population is distributed farther from the center of a larger geographical unit than the referent racial or ethnic population. A score of 0 means that the racial or ethnic population and the referent racial or ethnic population have the same spatial distribution around the center of a larger geographical unit. RCE gives the proportion of racial or ethnic populations required to change residence to match the degree of centralization of the referent racial or ethnic population.

Note: The original metric used the location of the central business district (CBD) to compute the metric, but the U.S. Census Bureau has not defined CBDs for U.S. cities since the 1982 Census of Retail Trade. Therefore, this function uses the centroid of each larger geographical unit as the ‘centre’, which may not represent the current CBD.
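
To make the sign convention concrete, here is a minimal base R sketch of a relative centralization index following the formulation in Massey & Denton (1988), run on made-up data. The helper name `rce_sketch`, the distances, and the counts are all hypothetical; this is not the `ndi` implementation.

```r
# Toy sketch of relative centralization (Massey & Denton 1988); NOT the 'ndi' code
# x:    subgroup count in each smaller unit
# y:    referent subgroup count in each smaller unit
# dist: distance from each smaller unit to the center of the larger unit
rce_sketch <- function(x, y, dist) {
  o <- order(dist)            # sort units from nearest to farthest
  X <- cumsum(x[o]) / sum(x)  # cumulative proportion of the subgroup
  Y <- cumsum(y[o]) / sum(y)  # cumulative proportion of the referent subgroup
  n <- length(X)
  # sum(X[i-1] * Y[i]) - sum(X[i] * Y[i-1]) over i = 2..n
  sum(X[-n] * Y[-1]) - sum(X[-1] * Y[-n])
}

dist <- c(2.5, 0.5, 1.5)  # hypothetical distances (unsorted on purpose)
y    <- c(100, 100, 100)  # referent subgroup spread evenly

rce_same    <- rce_sketch(c(10, 10, 10), y, dist)  # same distribution as referent
rce_central <- rce_sketch(c(0, 30, 0), y, dist)    # all in the nearest unit
rce_outer   <- rce_sketch(c(30, 0, 0), y, dist)    # all in the farthest unit
```

In this toy setup the subgroup that mirrors the referent scores 0, the centrally located subgroup scores above 0, and the peripheral subgroup scores below 0.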

duncan_duncan2017CT <- duncan_duncan(
  geo_large = 'place',
  geo_small = 'cbg',
  state = 'CT',
  year = 2017,
  subgroup = 'NHoLB',
  subgroup_ref = 'NHoLW'
)

# Obtain the 2017 census-designated places of Connecticut from the 'tigris' package
places2017 <- places(year = 2017, state = 'CT')
# Obtain the 2017 state from the 'tigris' package
states2017 <- states(year = 2017, cb = TRUE)

# Join the RCE values to the census-designated places geometries and filter for Connecticut
CT2010duncan_duncan <- places2017 %>%
  left_join(duncan_duncan2017CT$rce, by = 'GEOID') %>%
  filter(!st_is_empty(.)) %>%
  filter(!is.na(RCE)) %>%
  st_filter(states2017 %>% filter(STUSPS == 'CT')) %>%
  st_make_valid()
# Visualize the RCE values (2013-2017 5-year ACS) for census block groups within census-designated places of Connecticut
ggplot() +
  geom_sf(
    data = CT2010duncan_duncan,
    aes(fill = RCE)
  ) +
  geom_sf(
    data = states2017 %>% filter(STUSPS == 'CT'),
    fill = 'transparent',
    color = 'black',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_gradient2(
    low = '#998ec3', 
    mid = '#f7f7f7', 
    high = '#f1a340', 
    midpoint = 0,
    limits = c(-1, 1)
  ) +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2013-2017 estimates') +
  ggtitle(
    'Relative Centralization (Duncan & Duncan)\nCensus block groups within census-designated places of Connecticut',
    subtitle = 'Black non-Hispanic vs. white non-Hispanic'
  )

Compute Absolute Clustering (ACL)

Compute the racial or ethnic ACL values (2014-2018 5-year ACS) for census block groups within census tracts of Harris County, TX. This metric is based on Massey & Denton (1988). ACL is a measure of clustering of racial or ethnic populations within smaller geographical units that are located within larger geographical units. ACL can range in value from 0 to Inf and represents the degree to which an area is a racial or ethnic enclave. A value of 1 indicates there is no differential clustering of the racial or ethnic subgroup. A value greater than 1 indicates that members of the racial or ethnic subgroup live nearer to one another. A value less than 1 indicates that members of the racial or ethnic subgroup do not live near one another.

massey2018HTX <- massey(
  geo_large = 'tract',
  geo_small = 'cbg',
  state = 'TX',
  county = 'Harris County',
  year = 2018,
  subgroup = c('NHoLB', 'HoLB')
)

# Obtain the 2018 census tracts of Texas from the 'tigris' package
tract2018 <- tracts(year = 2018, state = 'TX')
# Obtain the 2018 Texas counties from the 'tigris' package
county2018 <- counties(state = 'TX', year = 2018, cb = TRUE)

# Join the ACL values to the census tract geometries and filter for Harris County, TX
HTX2010massey <- tract2018 %>%
  left_join(massey2018HTX$acl, by = 'GEOID') %>%
  filter(!st_is_empty(.)) %>%
  filter(!is.na(ACL)) %>%
  st_filter(county2018 %>% filter(NAME == 'Harris')) %>%
  st_make_valid()
# Visualize the ACL values (2014-2018 5-year ACS) for census block groups within census tracts of Harris County, TX
ggplot() +
  geom_sf(
    data = HTX2010massey,
    aes(fill = ACL)
  ) +
  geom_sf(
    data = county2018 %>% filter(NAME == 'Harris'), # NAME holds 'Harris'; 'Harris County' would match no rows
    fill = 'transparent',
    color = 'black',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_gradient2(
    low = '#998ec3', 
    mid = '#f7f7f7', 
    high = '#f1a340', 
    midpoint = 0
  ) +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2014-2018 estimates') +
  ggtitle(
    'Absolute Clustering (Massey & Denton)\nCensus block groups within census tracts of Harris County, TX',
    subtitle = 'Black population'
  )

Compute an index of spatial proximity (SP)

Compute an index of spatial proximity (2010-2014 5-year ACS) for census tracts within combined statistical areas of Georgia. This metric is based on White (1986) and Blau (1977; ISBN-13:978-0-029-03660-0), who designed the metric to identify racial or ethnic enclaves. SP is a measure of clustering of racial or ethnic populations within smaller geographical areas that are located within larger geographical areas. SP can range in value from 0 to Inf and represents the degree to which an area is a racial or ethnic enclave. A value of 1 indicates there is no differential clustering between subgroup and referent subgroup members. A value greater than 1 indicates subgroup members live nearer to one another than to referent subgroup members. A value less than 1 indicates subgroup members live nearer to referent subgroup members than to members of their own subgroup.
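
The interpretation around a value of 1 can be checked on toy data. The sketch below follows the form of the index given in Massey & Denton (1988), using a negative-exponential weight `exp(-d)` on pairwise distances. The helper name `sp_sketch`, the coordinates, and the counts are hypothetical, and two simplifications are assumed: the within-unit distance is set to 0 and the total population is taken as subgroup plus referent. It is not the `ndi` implementation.

```r
# Toy sketch of an index of spatial proximity; NOT the 'ndi' code
# x: subgroup counts, y: referent subgroup counts, d: pairwise distance matrix
sp_sketch <- function(x, y, d) {
  cij <- exp(-d)  # proximity weights (assumed negative-exponential kernel)
  tt  <- x + y    # assumption: total population = subgroup + referent
  # Average proximity between members of group a and members of group b
  prox <- function(a, b) sum(outer(a, b) * cij) / (sum(a) * sum(b))
  Pxx <- prox(x, x)
  Pyy <- prox(y, y)
  Ptt <- prox(tt, tt)
  (sum(x) * Pxx + sum(y) * Pyy) / (sum(tt) * Ptt)
}

# Hypothetical centroids of four smaller units arranged along a line
coords <- cbind(c(0, 1, 2, 3), c(0, 0, 0, 0))
d <- as.matrix(dist(coords))

# Same subgroup share in every unit: no differential clustering
sp_even <- sp_sketch(x = c(10, 10, 10, 10), y = c(30, 30, 30, 30), d = d)
# Subgroup confined to one unit, referent subgroup in the other three
sp_clustered <- sp_sketch(x = c(40, 0, 0, 0), y = c(0, 40, 40, 40), d = d)
```

Here `sp_even` equals 1 (up to floating-point error) and `sp_clustered` exceeds 1, matching the reading that values above 1 indicate members of each group live nearer to one another than to the other group.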

whiteblau2014GA <- white_blau(
  geo_large = 'csa',
  geo_small = 'tract',
  state = c('GA', 'AL', 'TN', 'FL'),
  year = 2014,
  subgroup = 'NHoLB',
  subgroup_ref = 'NHoLW'
)

# Obtain the 2014 Combined Statistical Areas from the 'tigris' package
csa2014 <- combined_statistical_areas(year = 2014, cb = TRUE)
# Obtain the 2014 state from the 'tigris' package
state2014 <- states(year = 2014, cb = TRUE)

# Join the SP values to the CSA geometries and filter for Georgia
GA2010whiteblau <- csa2014 %>%
  left_join(whiteblau2014GA$sp, by = 'GEOID') %>%
  filter(!st_is_empty(.)) %>%
  filter(!is.na(SP)) %>%
  st_filter(state2014 %>% filter(STUSPS == 'GA')) %>%
  st_make_valid()
# Visualize the SP values (2010-2014 5-year ACS) for census tracts within combined statistical areas of Georgia
ggplot() +
  geom_sf(
    data = GA2010whiteblau,
    aes(fill = SP)
  ) +
  geom_sf(
    data = state2014 %>% filter(STUSPS == 'GA'),
    fill = 'transparent',
    color = 'black',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_gradient2(
    low = '#998ec3', 
    mid = '#f7f7f7', 
    high = '#f1a340', 
    midpoint = 1
  ) +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2010-2014 estimates') +
  ggtitle(
    'An index of spatial proximity (White)\nCensus tracts within combined statistical areas of Georgia',
    subtitle = 'Black non-Hispanic vs. white non-Hispanic'
  )

Compute Relative Clustering (RCL)

Compute the racial or ethnic RCL values (2014-2018 5-year ACS) for census block groups within census tracts of Harris County, TX. This metric is based on Massey & Denton (1988). RCL equals 0 when the racial or ethnic subgroup population displays the same amount of clustering as the referent racial or ethnic subgroup population, and is positive whenever the racial or ethnic subgroup population members display greater clustering than is typical of the referent racial or ethnic subgroup population. If the racial or ethnic subgroup population members were less clustered than the referent racial or ethnic subgroup population, then RCL would be negative.
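
A small worked example may help with the sign of RCL. The sketch below uses the ratio form from Massey & Denton (1988), where the average proximity among subgroup members is divided by the average proximity among referent subgroup members and 1 is subtracted. The helper name `rcl_sketch`, the `exp(-d)` kernel, the zero within-unit distance, and all data are assumptions for illustration; this is not the `ndi` implementation.

```r
# Toy sketch of relative clustering; NOT the 'ndi' code
# x: subgroup counts, y: referent subgroup counts, d: pairwise distance matrix
rcl_sketch <- function(x, y, d) {
  cij <- exp(-d)  # proximity weights (assumed negative-exponential kernel)
  # Average proximity between members of the same group
  prox <- function(a) sum(outer(a, a) * cij) / sum(a)^2
  prox(x) / prox(y) - 1
}

# Hypothetical centroids of four smaller units arranged along a line
coords <- cbind(c(0, 1, 2, 3), c(0, 0, 0, 0))
d <- as.matrix(dist(coords))

# Subgroup and referent subgroup spread identically
rcl_same <- rcl_sketch(c(10, 10, 10, 10), c(30, 30, 30, 30), d)
# Subgroup confined to one unit, referent subgroup spread out
rcl_more <- rcl_sketch(c(40, 0, 0, 0), c(30, 30, 30, 30), d)
# Subgroup spread out, referent subgroup confined to one unit
rcl_less <- rcl_sketch(c(10, 10, 10, 10), c(0, 120, 0, 0), d)
```

With these inputs `rcl_same` is 0 (up to floating-point error), `rcl_more` is positive, and `rcl_less` is negative.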

denton2018HTX <- denton(
  geo_large = 'tract',
  geo_small = 'cbg',
  state = 'TX',
  county = 'Harris County',
  year = 2018,
  subgroup = 'NHoLB',
  subgroup_ref = 'NHoLW'
)

# Obtain the 2018 census tracts of Texas from the 'tigris' package
tract2018 <- tracts(year = 2018, state = 'TX')
# Obtain the 2018 Texas counties from the 'tigris' package
county2018 <- counties(state = 'TX', year = 2018, cb = TRUE)

# Join the RCL values to the census tract geometries and filter for Harris County, TX
HTX2010denton <- tract2018 %>%
  left_join(denton2018HTX$rcl, by = 'GEOID') %>%
  filter(!st_is_empty(.)) %>%
  filter(!is.na(RCL)) %>%
  st_filter(county2018 %>% filter(NAME == 'Harris')) %>%
  st_make_valid()
# Visualize the RCL values (2014-2018 5-year ACS) for census block groups within census tracts of Harris County, TX
ggplot() +
  geom_sf(
    data = HTX2010denton,
    aes(fill = RCL)
  ) +
  geom_sf(
    data = county2018 %>% filter(NAME == 'Harris'), # NAME holds 'Harris'; 'Harris County' would match no rows
    fill = 'transparent',
    color = 'black',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_gradient2(
    low = '#998ec3', 
    mid = '#f7f7f7', 
    high = '#f1a340', 
    midpoint = 0
  ) +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2014-2018 estimates') +
  ggtitle(
    'Relative Clustering (Massey & Denton)\nCensus block groups within census tracts of Harris County, TX',
    subtitle = 'Black non-Hispanic vs. white non-Hispanic'
  )

Compute Distance-Decay Interaction Index (DPxy*)

Compute the racial or ethnic DPxy* values (2017-2021 5-year ACS) for census tracts within core-based statistical areas of Louisiana. This metric is based on Morgan (1983) and Massey & Denton (1988). DPxy* is a measure of the probability that a member of one racial or ethnic subgroup will meet or interact with a member of another racial or ethnic subgroup, with potential contacts weighted by distance. DPxy* can range in value from 0 to 1 with higher values signifying higher probability of interaction (less isolation).
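
To show what the distance weighting does, the sketch below computes a distance-decay interaction index on toy data, following the form given in Massey & Denton (1988): each unit's contact probabilities are spread over all units in proportion to population and `exp(-d)`. The helper name `dpxy_star_sketch`, the coordinates, the zero within-unit distance, and the counts are assumptions for illustration; this is not the `ndi` implementation.

```r
# Toy sketch of a distance-decay interaction index; NOT the 'ndi' code
# x: subgroup counts, y: interaction subgroup counts,
# tt: total population per unit (all > 0), d: pairwise distance matrix
dpxy_star_sketch <- function(x, y, tt, d) {
  w <- exp(-d)               # distance-decay weights exp(-d_ij)
  K <- sweep(w, 2, tt, "*")  # weight column j by the population of unit j
  K <- K / rowSums(K)        # normalize so each row of K sums to 1
  # For a subgroup member in unit i: sum_j K[i, j] * y_j / t_j,
  # then average over where subgroup members live
  sum((x / sum(x)) * as.vector(K %*% (y / tt)))
}

# Hypothetical centroids of four smaller units arranged along a line
coords <- cbind(c(0, 1, 2, 3), c(0, 0, 0, 0))
d  <- as.matrix(dist(coords))
tt <- c(100, 100, 100, 100)

# Interaction subgroup is 30% of every unit
dp_constant <- dpxy_star_sketch(c(50, 20, 10, 10), c(30, 30, 30, 30), tt, d)
# Subgroup concentrated in the first unit, interaction subgroup elsewhere
dp_segregated <- dpxy_star_sketch(c(80, 10, 5, 5), c(20, 90, 95, 95), tt, d)
```

When the interaction subgroup makes up the same share of every unit, the index reduces to that share (0.3 here), regardless of where the subgroup lives.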

morgan_denton2021LA <- morgan_denton(
  geo_large = 'cbsa',
  geo_small = 'tract',
  state = 'LA',
  year = 2021,
  subgroup = 'NHoLB',
  subgroup_ixn = 'NHoLW'
)

# Obtain the 2021 core-based statistical areas from the 'tigris' package
cbsa2021 <- core_based_statistical_areas(year = 2021, cb = TRUE)
# Obtain the 2021 state from the 'tigris' package
states2021 <- states(year = 2021, cb = TRUE)

# Join the DPxy* values to the core-based statistical area geometries and filter for Louisiana
LA2021morgan_denton <- cbsa2021 %>%
  left_join(morgan_denton2021LA$dpxy_star, by = 'GEOID') %>%
  filter(!st_is_empty(.)) %>%
  filter(!is.na(DPxy_star)) %>%
  st_filter(states2021 %>% filter(STUSPS == 'LA'), .predicate = st_within) %>%
  st_make_valid()
# Visualize the DPxy* values (2017-2021 5-year ACS) for census tracts within core-based statistical areas of Louisiana
ggplot() +
  geom_sf(
    data = LA2021morgan_denton,
    aes(fill = DPxy_star),
    size = 0.05,
    color = 'white'
  ) +
  geom_sf(
    data = states2021 %>% filter(STUSPS == 'LA'),
    fill = 'transparent',
    color = 'black',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_viridis_c(limits = c(0, 1)) +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2017-2021 estimates') +
  ggtitle(
    'Distance-Decay Interaction Index (Morgan)\nCensus tracts within core-based statistical areas of Louisiana',
    subtitle = 'Black non-Hispanic vs. white non-Hispanic'
  )

Compute Distance-Decay Isolation Index (DPxx*)

Compute the racial or ethnic DPxx* values (2017-2021 5-year ACS) for census tracts within core-based statistical areas of Louisiana. This metric is based on Morgan (1983) and Massey & Denton (1988). DPxx* is a measure of the probability that a member of one racial or ethnic subgroup will meet or interact with a member of the same racial or ethnic subgroup, with potential contacts weighted by distance. DPxx* can range in value from 0 to 1 with higher values signifying higher probability of isolation (less interaction).
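
A toy comparison shows how the isolation value rises as a subgroup concentrates. The sketch below follows the distance-decay form in Massey & Denton (1988) with an `exp(-d)` kernel. The helper name `dpxx_star_sketch`, the coordinates, the zero within-unit distance, and the counts are hypothetical; this is not the `ndi` implementation.

```r
# Toy sketch of a distance-decay isolation index; NOT the 'ndi' code
# x: subgroup counts, tt: total population per unit (all > 0),
# d: pairwise distance matrix
dpxx_star_sketch <- function(x, tt, d) {
  w <- exp(-d)               # distance-decay weights exp(-d_ij)
  K <- sweep(w, 2, tt, "*")  # weight column j by the population of unit j
  K <- K / rowSums(K)        # normalize so each row of K sums to 1
  # Expected same-subgroup share encountered by a subgroup member
  sum((x / sum(x)) * as.vector(K %*% (x / tt)))
}

# Hypothetical centroids of four smaller units arranged along a line
coords <- cbind(c(0, 1, 2, 3), c(0, 0, 0, 0))
d  <- as.matrix(dist(coords))
tt <- c(100, 100, 100, 100)

dp_dispersed    <- dpxx_star_sketch(c(25, 25, 25, 25), tt, d)  # 25% of every unit
dp_concentrated <- dpxx_star_sketch(c(100, 0, 0, 0), tt, d)    # all in one unit
```

The dispersed case returns the subgroup's overall share (0.25), while the concentrated case returns a larger value, reflecting greater isolation.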

morgan_massey2021LA <- morgan_massey(
  geo_large = 'cbsa',
  geo_small = 'tract',
  state = 'LA',
  year = 2021,
  subgroup = c('NHoLB', 'HoLB')
)

# Obtain the 2021 core-based statistical areas from the 'tigris' package
cbsa2021 <- core_based_statistical_areas(year = 2021, cb = TRUE)
# Obtain the 2021 state from the 'tigris' package
states2021 <- states(year = 2021, cb = TRUE)

# Join the DPxx* values to the core-based statistical area geometries and filter for Louisiana
LA2021morgan_massey <- cbsa2021 %>%
  left_join(morgan_massey2021LA$dpxx_star, by = 'GEOID') %>%
  filter(!st_is_empty(.)) %>%
  filter(!is.na(DPxx_star)) %>%
  st_filter(states2021 %>% filter(STUSPS == 'LA'), .predicate = st_within) %>%
  st_make_valid()
# Visualize the DPxx* values (2017-2021 5-year ACS) for census tracts within core-based statistical areas of Louisiana
ggplot() +
  geom_sf(
    data = LA2021morgan_massey,
    aes(fill = DPxx_star),
    size = 0.05,
    color = 'white'
  ) +
  geom_sf(
    data = states2021 %>% filter(STUSPS == 'LA'),
    fill = 'transparent',
    color = 'black',
    size = 0.2
  ) +
  theme_minimal() +
  scale_fill_viridis_c(limits = c(0, 1)) +
  labs(fill = 'Index (Continuous)', caption = 'Source: U.S. Census ACS 2017-2021 estimates') +
  ggtitle(
    'Distance-Decay Isolation Index (Morgan)\nCensus tracts within core-based statistical areas of Louisiana',
    subtitle = 'Black population'
  )

sessionInfo()
## R version 4.4.1 (2024-06-14 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 10 x64 (build 19045)
## 
## Matrix products: default
## 
## 
## locale:
## [1] LC_COLLATE=English_United States.utf8 
## [2] LC_CTYPE=English_United States.utf8   
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.utf8    
## 
## time zone: America/New_York
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] tigris_2.1       tidycensus_1.6.5 sf_1.0-16        ndi_0.1.6.9014  
## [5] ggplot2_3.5.1    dplyr_1.1.4      knitr_1.48      
## 
## loaded via a namespace (and not attached):
##  [1] gtable_0.3.5       xfun_0.47          bslib_0.8.0        psych_2.4.6.26    
##  [5] lattice_0.22-6     tzdb_0.4.0         Cairo_1.6-2        vctrs_0.6.5       
##  [9] tools_4.4.1        generics_0.1.3     curl_5.2.2         parallel_4.4.1    
## [13] tibble_3.2.1       proxy_0.4-27       fansi_1.0.6        highr_0.11        
## [17] pkgconfig_2.0.3    Matrix_1.7-0       KernSmooth_2.23-24 uuid_1.2-1        
## [21] lifecycle_1.0.4    farver_2.1.2       compiler_4.4.1     stringr_1.5.1     
## [25] munsell_0.5.1      mnormt_2.1.1       carData_3.0-5      htmltools_0.5.8.1 
## [29] class_7.3-22       sass_0.4.9         yaml_2.3.10        pillar_1.9.0      
## [33] car_3.1-2          crayon_1.5.3       jquerylib_0.1.4    tidyr_1.3.1       
## [37] MASS_7.3-61        classInt_0.4-10    cachem_1.1.0       wk_0.9.2          
## [41] abind_1.4-5        nlme_3.1-166       tidyselect_1.2.1   rvest_1.0.4       
## [45] digest_0.6.36      stringi_1.8.4      purrr_1.0.2        labeling_0.4.3    
## [49] fastmap_1.2.0      grid_4.4.1         colorspace_2.1-1   cli_3.6.3         
## [53] magrittr_2.0.3     utf8_1.2.4         e1071_1.7-14       readr_2.1.5       
## [57] withr_3.0.1        scales_1.3.0       rappdirs_0.3.3     rmarkdown_2.28    
## [61] httr_1.4.7         hms_1.1.3          evaluate_0.24.0    viridisLite_0.4.2 
## [65] s2_1.1.7           rlang_1.1.4        Rcpp_1.0.13        glue_1.7.0        
## [69] DBI_1.2.3          xml2_1.3.6         rstudioapi_0.16.0  jsonlite_1.8.8    
## [73] R6_2.5.1           units_0.8-5