5.4 Risk estimation

The final step in the Stanford Urban Risk Framework involves combining different hazard scenarios with different “exceedance rates” (e.g., a 100-year flood has a 1% exceedance rate), and, in the case of our coastal flood analysis, also considering the different likelihoods of base sea level rise over some time period (which depends on climate change models). We’ll deal with these two levels of probabilistic math separately.
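
As a quick aside on the math: the exceedance rate of a storm event is just the reciprocal of its return period in years, which we can verify in one line (nothing here is new data, just arithmetic):

return_periods <- c(1, 20, 100)
1/return_periods # 1.00 0.05 0.01, i.e. 100%, 5%, and 1%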

First, from our flood hazard source, for each sea level rise scenario, we have three storm events: annual, 20-year, and 100-year floods. The exceedance rates of these three events are 100%, 5%, and 1%, respectively. If we lay these exceedance rates out along the same 0-100% band of possibility, then the ranges between the markers become “occurrence rates”:

  • The range between 100% and 5%, or a 95% window, can be represented by floods of steadily increasing intensity, ranging from the intensity of the annual flood to the intensity of the 20-year flood. The “area” underneath this line segment (a triangle if the building experiences no damage in the annual flood scenario, otherwise a trapezoid) represents the “expected damage” of all possible floods in this probability window, combined.
  • The range between 5% and 1%, or a 4% window, can be represented by floods ranging from the intensity of the 20-year flood to the intensity of the 100-year flood. The “area” underneath this line segment (a trapezoid that connects to the previous triangle/trapezoid, much skinnier but taller) represents the “expected damage” of all possible floods in this probability window, combined.
  • The range between 1% and 0%, or a 1% window, represents the most unlikely but most extreme events, which could include unimaginably extreme events that are < 0.00001% likely. But given how negligible this thin probability slice is relative to the previous two windows, we can choose to represent it using just the intensity of the 100-year flood. This will underestimate the actual “expected damage” of the window, but will likely not affect the rounded result overall.

We have three “windows” because we happen to have three events to consider; if we had additional storm events modeled, we’d have additional probability windows, which would look more and more like a smooth curve. Since these “areas under the curve” can be represented mathematically, we can perform the calculations on our tidy dataframe from the last section, epa_bldg_perc_damage.
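
To make the “area under the curve” arithmetic concrete before we apply it, here is a minimal sketch for a single hypothetical building (the percent damages are made up for illustration):

# hypothetical percent damages for one building in each storm event
damage_001 <- 0.02 # annual flood
damage_020 <- 0.10 # 20-year flood
damage_100 <- 0.25 # 100-year flood

# two trapezoid areas plus the thin 1% tail, as described above
0.95*(damage_001 + damage_020)/2 + 
  0.04*(damage_020 + damage_100)/2 + 
  0.01*damage_100
# ~0.0665, i.e. about 6.7% expected damage in any given year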

By the way: at this stage, we can also choose to convert our “percent damages” for each building into $ damages, which involves multiplying the percent damage by the total replacement cost of the home. As a simplifying assumption, we’ll take that to be the area of the building footprint multiplied by $200 per square foot (this conversion could be refined with more detailed information about each building’s use, construction type, and number of stories, and could incorporate more detailed construction cost estimates). Since all we’re doing from here on out is multiplication and addition, we could also do this conversion from percents to dollars after the probabilistic math, and it wouldn’t change the result. But since it’s more intuitive to imagine dollar damages at this point, we’ll go ahead and convert first:

library(tidyverse)
library(leaflet)
library(sf)
library(mapboxapi)
library(tigris)

# unload raster so its select() no longer masks dplyr's select()
detach("package:raster")

Note that if you want to unload a package you’ve loaded before, say because raster causes an ambiguity with select(), you can do it as shown above.
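
Alternatively, instead of detaching a package, you can disambiguate individual function calls with the :: operator (some_dataframe below is just a placeholder):

# explicitly request dplyr's select(), even while raster is attached
some_dataframe %>% dplyr::select(osm_id)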

# NAD83 / UTM Zone 10N, with units in feet (so areas will be in sqft)
projection <- "+proj=utm +zone=10 +ellps=GRS80 +datum=NAD83 +units=ft +no_defs"

epa_bldg_flooded_max <- 
  readRDS("epa_bldg_flooded_max.rds") %>% 
  st_transform(projection) %>% 
  mutate(
    # footprint area in square feet, since our projection uses feet
    area = st_area(.) %>% as.numeric()
  )

epa_bldg_perc_damage <- readRDS("epa_bldg_perc_damage.rds")

epa_bldg_damage <-
  epa_bldg_perc_damage %>% 
  left_join(
    epa_bldg_flooded_max %>%
      st_drop_geometry() %>% 
      select(osm_id, area)
  ) %>% 
  mutate(
    # replacement cost at $200 per square foot of footprint area
    damage = area * 200 * perc_damage
  ) %>% 
  select(osm_id, SLR, RP, damage)

head(epa_bldg_damage)

Now that we have “$ damage” for each building for each event, we can combine each trio of storm events together (for each of the three levels of sea level rise). The result will be an “average annualized loss” for each building for three different hypothetical years, each of which has a different base sea level rise.

epa_bldg_aal_by_slr <-
  epa_bldg_damage %>% 
  pivot_wider(
    names_from = RP,
    values_from = damage
  ) %>% 
  replace(is.na(.), 0) %>% 
  mutate(
    # two trapezoid areas plus the thin 1% tail, as described above
    damage = 
      0.95*(`001`+`020`)/2 + 
      0.04*(`020`+`100`)/2 + 
      0.01*(`100`)
  ) %>% 
  select(osm_id, SLR, damage)

head(epa_bldg_aal_by_slr)

Our dataframe has now been reduced to a third of its size. Next, we will consider, for any given year (we’ll bound our analysis to the 2020-2050 range), the likelihood of sea level rise being some amount or greater. Intuitively, for the current year, the current sea level rise is what it is, and the chances of any greater amount of sea level rise before the end of the year are effectively 0. Ten years from now, the probabilities will be spread across a range of possible sea levels, based on climate models (for our demonstration, we’ll choose Representative Concentration Pathway 4.5, which is an “intermediate” scenario for global climate action).

What we’re dealing with here are, once again, “exceedance rates”, which we can turn into “occurrence rates” and use to calculate average annualized losses, but this time the percentages are not self-evident as they were for the storm scenarios. We’ll need to retrieve exceedance rates for sea levels in specific years at our specific location in the Bay Area. Professor Bob Kopp from Rutgers University is the preeminent scientist producing these kinds of estimates, among a wide portfolio of climate change research, and for a 2014 paper he made MATLAB scripts available that create localized sea level rise projections. Unfortunately this code is not available in R, and since the details of that software are beyond the scope of this curriculum, we have generated the relevant “occurrence rates” for different decadal years and made them available in CSV form here:

rcp45 <- read_csv("https://raw.githubusercontent.com/stanfordfuturebay/stanfordfuturebay.github.io/master/advanced/rcp45_sanfrancisco.csv")

rcp45
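
By the way, these are already “occurrence rates”; if you instead had raw exceedance rates (the chance of sea level rise being some amount or greater), converting them would just mean taking differences between consecutive exceedance probabilities. A minimal sketch with made-up exceedance values:

# hypothetical exceedance rates for 0 cm, 25 cm, and 50 cm of sea level
# rise in some future year (made-up numbers for illustration)
tibble(
  SLR = c("000", "025", "050"),
  exceedance = c(1.00, 0.30, 0.05)
) %>% 
  mutate(
    # each band's occurrence rate is its exceedance rate minus the
    # exceedance rate of the next, more extreme level (0 past the last)
    occurrence = exceedance - lead(exceedance, default = 0)
  )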

Since the occurrence rates are only available for decadal years, we will produce outputs that look like “average annualized loss” for each building for four different years: 2020, 2030, 2040, and 2050. These average out the damages associated with different base sea levels, weighted by the probability of each of those sea levels occurring in each of the four decadal years.

The dplyr manipulation below gets more complicated than anything we’ve demonstrated before; I encourage you to break it down into the individual steps of the pipeline (the comments flag what each stage accomplishes) to understand what is happening.

epa_bldg_aal_by_year <- 
  epa_bldg_aal_by_slr %>% 
  left_join(
    rcp45 %>% 
      mutate(
        # pad to match the "000"/"025"/"050" format of our SLR field
        SLR = str_pad(SLR, 3, "left", "0")
      ) %>% 
      select(
        SLR,
        `2020`,
        `2030`,
        `2040`,
        `2050`
      )
  ) %>% 
  # lengthen so each row is one building, one SLR level, one year
  pivot_longer(
    `2020`:`2050`,
    names_to = "year",
    values_to = "occurrence"
  ) %>% 
  # stack damage and occurrence into a single value column...
  pivot_longer(
    c(damage, occurrence),
    names_to = "key",
    values_to = "value"
  ) %>% 
  # ...then spread back out so each SLR level gets its own
  # damage_XXX and occurrence_XXX column
  pivot_wider(
    names_from = c("key", "SLR"),
    values_from = value
  ) %>% 
  replace(is.na(.), 0) %>% 
  mutate(
    # trapezoidal integration again, this time across sea level bands
    damage = 
      occurrence_000 * (damage_000 + damage_025)/2 + 
      occurrence_025 * (damage_025 + damage_050)/2 + 
      occurrence_050 * (damage_050)
  ) %>% 
  select(osm_id, year, damage)

head(epa_bldg_aal_by_year)
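
Before mapping, a quick sanity check may be useful: summing AAL across all buildings gives a citywide total expected loss for each year (using the dataframe we just created):

epa_bldg_aal_by_year %>% 
  group_by(year) %>% 
  summarize(total_aal = sum(damage))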

Now, we have “average annualized loss” (AAL) for each building, centered on four different years between 2020 and 2050. This may already be useful to visualize. For example, below we’ll map AAL in 2020, AAL in 2050, and the change in AAL between 2020 and 2050 for each building:

epa_bldg_aal_by_year_map <-
  epa_bldg_aal_by_year %>% 
  pivot_wider(
    names_from = year,
    values_from = damage
  ) %>% 
  mutate(
    change = `2050` - `2020`
  ) %>% 
  left_join(
    epa_bldg_flooded_max %>%
      select(osm_id)
  ) %>% 
  st_as_sf() %>% 
  st_transform(4326)

# anchor the palette to the 2050 values (the largest) so all three
# layers can share one color scale
aal_pal <- colorNumeric(
  palette = "Reds",
  domain = c(0, epa_bldg_aal_by_year_map$`2050`)
)

epa_bldg_aal_by_year_map %>% 
  leaflet() %>% 
  addMapboxTiles(
    style_id = "light-v9",
    username = "mapbox"
  ) %>% 
  addPolygons(
    fillColor = ~aal_pal(`2020`),
    color = "gray",
    fillOpacity = 1,
    opacity = 1,
    weight = 0.25,
    highlightOptions = highlightOptions(
      color = "white",
      weight = 2
    ),
    label = ~paste0("$",prettyNum(signif(`2020`,2),",")," average annualized loss in 2020"),
    group = "2020"
  ) %>% 
  addPolygons(
    fillColor = ~aal_pal(`2050`),
    color = "gray",
    fillOpacity = 1,
    opacity = 1,
    weight = 0.25,
    highlightOptions = highlightOptions(
      color = "white",
      weight = 2
    ),
    label = ~paste0("$",prettyNum(signif(`2050`,2),",")," average annualized loss in 2050"),
    group = "2050"
  ) %>% 
  addPolygons(
    fillColor = ~aal_pal(change),
    color = "gray",
    fillOpacity = 1,
    opacity = 1,
    weight = 0.25,
    highlightOptions = highlightOptions(
      color = "white",
      weight = 2
    ),
    label = ~paste0("$",prettyNum(signif(change,2),",")," change in average annualized loss from 2020 to 2050"),
    group = "Change"
  ) %>% 
  addLegend(
    pal = aal_pal,
    values = ~`2050`,
    title = "AAL"
  ) %>% 
  addLayersControl(
    baseGroups = c("2020","2050","Change"),
    options = layersControlOptions(collapsed = FALSE)
  ) %>% 
  showGroup("2050")

These kinds of numbers should strike you as staggering risks for whoever may be occupying these homes in the next three decades. Finally, let’s average these out to a single measure of AAL across the entire 30 years (in a way, doing yet another integration under the curve, where each year represents the 5 or 10 years adjacent to it), and aggregate the results up to the block group level, where we can begin to overlay these risks onto population data and understand the socioeconomic consequences of this flood risk.

epa_boundary <- 
  places("CA", cb = T, progress_bar = F) %>% 
  filter(NAME == "East Palo Alto")

epa_bgs <- 
  block_groups("CA", "San Mateo", progress_bar = F) %>% 
  select(GEOID) %>% 
  .[epa_boundary, ]

epa_bg_aal <-
  epa_bldg_aal_by_year %>% 
  pivot_wider(
    names_from = year,
    values_from = damage
  ) %>% 
  mutate(
    # weight each decadal estimate by the years it stands in for:
    # 5 each for the endpoints (2020, 2050), 10 each for 2030 and 2040
    aal = (`2020`*5 + `2030`*10 + `2040`*10 + `2050`*5)/30
  ) %>% 
  left_join(
    epa_bldg_flooded_max %>%
      select(osm_id) %>% 
      st_centroid() # assign each building to a block group by its centroid
  ) %>% 
  st_as_sf() %>% 
  st_transform(4269) %>% 
  st_join(epa_bgs) %>% 
  st_set_geometry(NULL) %>% 
  group_by(GEOID) %>% 
  summarize(
    aal = sum(aal),
    count = n()
  ) %>% 
  left_join(epa_bgs) %>% 
  st_as_sf()

aal_pal <- colorNumeric(
  palette = "Reds",
  domain = epa_bg_aal$aal
)

epa_bg_aal %>% 
  leaflet() %>% 
  addMapboxTiles(
    style_id = "light-v9",
    username = "mapbox"
  ) %>% 
  addPolygons(
    fillColor = ~aal_pal(aal),
    color = "gray",
    fillOpacity = 0.5,
    opacity = 1,
    weight = 0.25,
    highlightOptions = highlightOptions(
      color = "white",
      weight = 2
    ),
    label = ~paste0("$",prettyNum(signif(aal,2),",")," average annualized loss across ", count, " buildings, 2020-2050")
  ) %>% 
  addLegend(
    pal = aal_pal,
    values = ~aal,
    title = "AAL, 2020-2050"
  )