5.4 Accessibility

Travel routing algorithms can also be used to create “isochrones”, which are bands of equal distance (or duration) travel from a central point. Depending on the algorithm, these can be significantly more efficient to compute than individual travel routes, since as one propagates movement from a central point, many routes are shared; or, an isochrone function will divide up a region into cells of a certain granularity, route to only those cells, and then connect the dots between results. Usually isochrone functions only differ in their underlying travel network accuracy and the resolution chosen for analysis.

mapboxapi, which we used in the previous section, can also be used to generate isochrones. For our demonstration, let’s source COVID testing sites from a countrywide volunteer effort by GIS Corps. This ESRI-based API can be referenced using st_read() as we’ve done before:

library(mapboxapi)
library(tidyverse)
library(tigris)
library(sf)
library(leaflet)

mb_access_token("YOUR_TOKEN_HERE", install = T)

covid_testing <-
  st_read("https://opendata.arcgis.com/datasets/11fe8f374c344549815a716c8472832f_0.geojson")

## Reading layer `Testing_Locations' from data source `https://opendata.arcgis.com/datasets/11fe8f374c344549815a716c8472832f_0.geojson' using driver `GeoJSON'
## replacing null geometries with empty geometries
## Simple feature collection with 29705 features and 44 fields (with 1 geometry empty)
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: -176.6576 ymin: -14.32821 xmax: 144.8928 ymax: 71.29834
## geographic CRS: WGS 84

Let’s filter these testing sites down to just the Bay Area, and only those that are considered open at the moment:

bay_county_names <-
  c(
    "Alameda",
    "Contra Costa",
    "Marin",
    "Napa",
    "San Francisco",
    "San Mateo",
    "Santa Clara",
    "Solano",
    "Sonoma"
  )

bay_counties <-
  counties("CA", cb = T, progress_bar = F) %>%
  filter(NAME %in% bay_county_names) %>% 
  st_transform(st_crs(covid_testing))

bay_covid_testing <-
  covid_testing %>% 
  .[bay_counties, ] %>% 
  filter(status == "Open")

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>% 
  addCircleMarkers(
    data = bay_covid_testing,
    radius = 1,
    label = ~name
  )

And to simplify the demonstration, let’s further filter to just Santa Clara County:

scc_covid_testing <-
  bay_covid_testing %>% 
  .[bay_counties %>% filter(NAME == "Santa Clara"), ]

The function mb_isochrone() is straightforward to use on an sf object, as shown:

walk_10min <- mb_isochrone(
  scc_covid_testing,
  profile = "walking",
  time = 10
)

scc_covid_testing_walk_10min <-
  scc_covid_testing %>% 
  st_set_geometry(NULL) %>% 
  cbind(walk_10min$geometry) %>% 
  st_as_sf()

leaflet() %>% 
  addMapboxTiles(
    style_id = "streets-v11",
    username = "mapbox"
  ) %>%
  addPolygons(
    data = scc_covid_testing_walk_10min,
    label = ~name
  )

Now, similar to in Section 2.5, let’s intersect these isochrones with CBGs and compare the profile of the population that is within 10 minute walking distance of a COVID testing site to the population of Santa Clara County as a whole. Recall that the key assumption in this technique is that a population is homogeneously distributed within CBGs. Note the use of st_union() below to merge all the isochrone areas into one polygon, since we don’t need to retain the specific attribution of test sites in this analysis, and since we don’t want overlap in isochrones to create duplicate results.

scc_bgs <- 
  block_groups("CA","085", cb = T, progress_bar = F) %>% 
  st_transform(26910) %>% 
  mutate(original_area = st_area(.))

scc_bg_isochrone_intersect <-
  scc_bgs %>% 
  st_intersection(
    scc_covid_testing_walk_10min %>% 
      st_union() %>% 
      st_transform(26910)
  ) %>% 
  mutate(
    leftover_area = st_area(.),
    perc_area = leftover_area / original_area
  )

scc_bg_income <-
  getCensus(
    name = "acs/acs5",
    vintage = 2018,
    region = "block group:*", 
    regionin = "state:06+county:085",
    vars = "group(B19001)"
  ) %>% 
  mutate(cbg = paste0(state,county,tract,block_group)) %>% 
  select(!c(GEO_ID,state,county,tract,block_group,NAME) & !ends_with(c("EA","MA","M"))) %>%
  pivot_longer(
    ends_with("E"),
    names_to = "variable",
    values_to = "estimate"
  ) %>%
  left_join(
    acs_vars_2018_5yr %>% 
      select(name, label), 
    by = c("variable" = "name")
  ) %>% 
  select(-variable) %>% 
  separate(
    label,
    into = c(NA,NA,"income"),
    sep = "!!"
  ) %>% 
  filter(!is.na(income)) %>% 
  mutate(
    income = case_when(
      income %in% c("Less than $10,000","$10,000 to $14,999","$15,000 to $19,999","$20,000 to $24,999") ~ "Less than $25,000",
      income %in% c("$25,000 to $29,999","$30,000 to $34,999","$35,000 to $39,999","$40,000 to $44,999","$45,000 to $49,999") ~ "$25,000 to $49,999",
      income %in% c("$50,000 to $59,999","$60,000 to $74,999") ~ "$50,000 to $74,999",
      TRUE ~ income
    )
  )

scc_income <-
  scc_bg_income %>% 
  mutate(income = factor(income, levels = unique(scc_bg_income$income))) %>% 
  group_by(income) %>% 
  summarize(estimate = sum(estimate)) %>% 
  mutate(
    perc = estimate/sum(estimate),
    group = "Full Population"
  )
  
scc_covid_income <-
  scc_bg_income %>% 
  mutate(income = factor(income, levels = unique(scc_bg_income$income))) %>% 
  left_join(
    scc_bg_isochrone_intersect %>% 
      select(cbg = GEOID, perc_area) %>% 
      st_set_geometry(NULL)
  ) %>% 
  filter(!is.na(perc_area)) %>% 
  mutate(
    estimate = estimate * perc_area
  ) %>% 
  group_by(income) %>% 
  summarize(estimate = sum(estimate)) %>% 
  mutate(
    perc = estimate/sum(estimate),
    group = "Population within 10 min. walk of COVID-19 testing"
  )

sum(scc_covid_income$estimate)/
  sum(scc_income$estimate)

## 0.1687836 [1]

The output above is the estimated percent of population in Santa Clara County within 10 minutes walk of COVID-19 testing. This itself may be a metric worth increasing as a public goal. Now let’s use ggplot to produce a pie chart comparison of full population vs. test-accessible population, in terms of income distribution:

rbind(scc_income,scc_covid_income) %>% 
  ggplot(
    aes(
      x = "", 
      y = perc, 
      fill = reorder(income,desc(income))
    )
  ) + 
  geom_bar(
    stat = "identity", 
    position = position_fill()
  ) +
  geom_text(
    aes(label = paste0(round(perc*100),"%")), 
    position = position_fill(vjust = 0.5)
  ) +
  coord_polar(theta = "y") +
  facet_wrap(~ group)  +
  theme(
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    legend.position = 'bottom'
  ) + 
  guides(
    fill = guide_legend(nrow=3, byrow=TRUE)
  ) +
  labs(
    fill = "Household\nIncome"
  )

It does appear that on the whole, COVID-19 testing sites in Santa Clara County are roughly equitable in terms of access to households of different incomes, with a slight favor towards lower income households. It may be a further goal of the County to actually make this distribution more “inequitable” with respect to income, since lower-income households may benefit more from walking access to testing.