Getting demographic data
Below, we provide instructions for downloading the data used to identify underserved communities in Tampa Bay. For instructions on cleaning the data and using the demographic indices to map underserved communities, see Mapping underserved communities.
UPDATE: In 2025, the U.S. Environmental Protection Agency removed all data we rely on for mapping underserved and overburdened communities in Tampa Bay, including EJScreen and CEJST. The instructions below have been updated to reflect this change, which requires the user to download archived copies of these datasets from third-party sources.
Load the required R packages (install first as needed).
library(sf)
library(leaflet)
library(dplyr)
To collect demographic data that will be used for identifying underserved communities, we will be downloading U.S. census data originally provided by the EPA’s 2022 Environmental Justice Screening Tool (EJScreen). Archived EJScreen data for 2015-2024 is available from the Harvard Dataverse. Here you will find different versions of EJScreen data that are summarized, calculated, and visualized in different ways to meet your particular needs (e.g., census blocks or tracts, state or national percentiles, tabular or spatial data).
In our case, we are interested in obtaining spatial data for the supplemental demographic indices, summarized at the census tract level, using national percentiles as our thresholds for identifying underserved communities. Census tracts represent aggregated block groups of 1,200-8,000 people. This level is advantageous because it is the highest resolution for which the federal government provides standardized demographic, socioeconomic, and environmental data. However, this data is not available in the archived 2023 or 2024 EJScreen data, so we use the 2022 data.
The appropriate file to download for our requirements at the tract level is named: EJSCREEN_2022_Supplemental_with_AS_CNMI_GU_VI_Tracts.gdb.zip.
Download the ZIP file (~260 MB), available from our public Dropbox repository, to a temporary file.
# url with zip gdb to download
urlin <- 'https://www.dropbox.com/scl/fi/1hshluhorw2fgy3shkvjo/EJSCREEN_2022_Supplemental_with_AS_CNMI_GU_VI_Tracts.gdb.zip?rlkey=c1jyluowtutk9vhhhff7wlnfq&st=w6gs4y2r&dl=1'
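# download file in binary mode so the zip is not corrupted on Windows;
# raise the timeout because R's default of 60 seconds can be too short for ~260 MB
options(timeout = max(600, getOption('timeout')))
tmp1 <- tempfile(fileext = ".zip")
download.file(url = urlin, destfile = tmp1, mode = 'wb')
Unzip the downloaded geodatabase into the session's temporary directory.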
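Before unzipping, it can be worth confirming that the download produced a readable archive, since an interrupted transfer leaves a truncated file. A minimal sketch, where zip_looks_ok() is a hypothetical helper (not part of any package) built on utils::unzip() with list = TRUE:

```r
# hypothetical helper: TRUE only if the file exists, is non-empty,
# and its contents can be listed as a zip archive
zip_looks_ok <- function(path) {
  if (!file.exists(path) || file.size(path) == 0) {
    return(FALSE)
  }
  # list = TRUE reads the archive index without extracting anything
  contents <- tryCatch(
    utils::unzip(path, list = TRUE),
    error = function(e) NULL,
    warning = function(w) NULL
  )
  !is.null(contents) && nrow(contents) > 0
}
```

With the objects on this page, the call would be zip_looks_ok(tmp1).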
# unzip file
tmp2 <- tempdir()
utils::unzip(tmp1, exdir = tmp2)
You can now work with the data in R. First, read the polygon layer from the geodatabase.
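A geodatabase can hold more than one layer, so it helps to inspect what st_layers() reports before reading. A minimal sketch on the small nc.gpkg sample that ships with sf, used here only as a stand-in for the geodatabase (src, lyrs, first_lyr, and samp are illustrative names):

```r
library(sf)

# nc.gpkg ships with the sf package and stands in for the unzipped geodatabase
src <- system.file("gpkg/nc.gpkg", package = "sf")

# st_layers() lists each layer with its geometry type, feature count, and field count
lyrs <- st_layers(src)
print(lyrs)

# pick one layer explicitly; st_read() expects a single layer name
first_lyr <- lyrs$name[1]
samp <- st_read(dsn = src, layer = first_lyr, quiet = TRUE)
```

If the listing shows several layers, selecting one by name or index avoids passing a vector of names to st_read().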
# get the layers from the gdb
gdbpth <- list.files(tmp2, pattern = '\\.gdb$', full.names = TRUE)
gdbpth <- gsub('\\\\', '/', gdbpth)
lyr <- st_layers(gdbpth)$name
# read the layer
dat <- st_read(dsn = gdbpth, layer = lyr)
To exclude census tracts outside of our watershed boundary, intersect the layer with the Tampa Bay watershed (available as an RData object in the source repository for this website). If working in a different area, replace the tbshed object with your own boundary.
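For a different study area, any polygon file that sf can read can stand in for tbshed. A minimal sketch, using the North Carolina counties shapefile that ships with sf as a placeholder for your own boundary file (bnd_path, bnd, and myshed are hypothetical names):

```r
library(sf)

# nc.shp ships with the sf package; substitute the path to your own boundary file
bnd_path <- system.file("shape/nc.shp", package = "sf")
bnd <- st_read(bnd_path, quiet = TRUE)

# dissolve all polygons into one outline and wrap it as a single-row sf object
myshed <- st_sf(geometry = st_union(st_make_valid(st_geometry(bnd))))
```

An object like myshed would then take the place of tbshed in the intersection step.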
load(file = 'data/tbshed.RData')
# intersect the layer with the tb watershed
tb_tract <- dat %>%
  st_transform(crs = st_crs(tbshed)) %>%
  st_make_valid() %>%
  st_intersection(tbshed)
The layer can be saved as an RData object if needed. The size should be minimal (~1 MB).
# save the layer as an RData object (~1 MB)
save(tb_tract, file = 'data/tb_tract.RData')
View the data using leaflet. You can see that we now have the desired spatial data just for our watershed.
load(file = 'data/tb_tract.RData')
# Convert to WGS84 for leaflet
tb_tract_wgs84 <- st_transform(tb_tract, crs = 4326)
tb_tract_wgs84 %>%
  select(ID) %>%
  leaflet() %>%
  addProviderTiles("CartoDB.Positron") %>%
  addPolygons(
    fillColor = "blue",
    weight = 0.5,
    opacity = 1,
    color = "black",
    fillOpacity = 0.7,
    popup = ~ID,
    group = "Census Tract"
  ) %>%
  addLayersControl(
    overlayGroups = "Census Tract",
    options = layersControlOptions(collapsed = FALSE)
  )
Unlink the temporary files to delete them when you are finished.
unlink(tmp1, recursive = TRUE)
unlink(gdbpth, recursive = TRUE)
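To confirm that the cleanup worked, file.exists() and dir.exists() can be checked afterwards. A minimal sketch on stand-in temporary paths so that it runs on its own (f and d are hypothetical names playing the roles of tmp1 and gdbpth):

```r
# create a stand-in file and directory in the session's temporary directory
f <- tempfile(fileext = ".zip")
d <- tempfile(fileext = ".gdb")
file.create(f)
dir.create(d)

# remove them the same way as above; recursive = TRUE is needed for directories
unlink(f)
unlink(d, recursive = TRUE)

# both checks should now report that nothing is left
file_left <- file.exists(f)
dir_left <- dir.exists(d)
print(c(file_left, dir_left))
```

The same two checks can be run on tmp1 and gdbpth after the unlink() calls.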