Getting demographic data
Below are instructions for downloading the data used to identify underserved communities in Tampa Bay. For instructions on cleaning the data and using the demographic indices to map underserved communities, see Mapping underserved communities.
Load the required R packages (install first as needed).
library(sf)
library(mapview)
library(dplyr)
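As a minimal sketch of the "install first as needed" step, the snippet below checks which of the three packages are missing and installs only those. The helper name missing_pkgs is hypothetical and not part of this workflow, and the install call is limited to interactive sessions so that a script does not trigger installs unexpectedly.

```r
# hypothetical helper: return the packages in pkgs that are not installed
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# install only what is missing, and only in an interactive session
to_install <- missing_pkgs(c('sf', 'mapview', 'dplyr'))
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}
```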
To identify underserved communities, we will download U.S. census data provided by the EPA’s 2022 Environmental Justice Screening Tool (EJScreen). The data are available from https://gaftp.epa.gov/EJSCREEN/2022/, which hosts several versions of the EJScreen data that are summarized, calculated, and visualized in different ways to meet particular needs (e.g., census block groups or tracts, state or national percentiles, tabular or spatial data).
In our case, we want spatial data for the supplemental demographic indices, summarized at the census tract level, using national percentiles as our thresholds for identifying underserved communities. Census tracts are aggregations of block groups, each tract containing roughly 1,200 to 8,000 people. This level is advantageous because it is the highest resolution for which the federal government provides standardized demographic, socioeconomic, and environmental data.
The appropriate file to download for our requirements at the tract level is EJSCREEN_2022_Supplemental_with_AS_CNMI_GU_VI_Tracts.gdb.zip.
Download the relevant file from EJScreen. The file (~260 MB) is saved as a temporary file.
# url with zip gdb to download
urlin <- 'https://gaftp.epa.gov/EJSCREEN/2022/EJSCREEN_2022_Supplemental_with_AS_CNMI_GU_VI_Tracts.gdb.zip'
# download file
tmp1 <- tempfile(fileext = ".zip")
download.file(url = urlin, destfile = tmp1)
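Because the archive is large, the download can exceed R's default timeout of 60 seconds on a slow connection. The sketch below shows one way to guard against this. The wrapper name download_large and the 600-second value are assumptions for illustration, not part of the original workflow. It also sets mode = 'wb' so the zip file is written as binary on Windows.

```r
# hypothetical wrapper around download.file for large binary files
download_large <- function(url, destfile, timeout_sec = 600) {
  # temporarily raise the timeout, restoring the previous value on exit
  old <- options(timeout = max(timeout_sec, getOption('timeout')))
  on.exit(options(old), add = TRUE)
  # mode = 'wb' writes the file as binary
  download.file(url = url, destfile = destfile, mode = 'wb')
  invisible(destfile)
}

# usage with the url and temporary file from this section (not run here)
# download_large(urlin, tmp1)
```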
Unzip the downloaded geodatabase into the session's temporary directory.
# unzip file
tmp2 <- tempdir()
utils::unzip(tmp1, exdir = tmp2)
Read the polygon layer from the geodatabase.
# get the layers from the gdb
gdbpth <- list.files(tmp2, pattern = '\\.gdb$', full.names = TRUE)
gdbpth <- gsub('\\\\', '/', gdbpth)
lyr <- st_layers(gdbpth)$name
# read the layer
dat <- st_read(dsn = gdbpth, layer = lyr)
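A file geodatabase is a directory rather than a single file, which is why list.files can locate it: non-recursive listings include directory names. The sketch below demonstrates this with a hypothetical, empty toy.gdb folder in a scratch directory. It also adds a check that exactly one geodatabase was found, which may be a useful guard before calling st_layers.

```r
# scratch directory containing a stand-in for an unzipped geodatabase
scratch <- file.path(tempdir(), 'gdb_demo')
dir.create(file.path(scratch, 'toy.gdb'), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(scratch, 'readme.txt'))

# non-recursive listings return directories as well as files
found <- list.files(scratch, pattern = '\\.gdb$', full.names = TRUE)
# use forward slashes in case the path came from Windows
found <- gsub('\\\\', '/', found)

# guard: stop if zero or several geodatabases match
stopifnot(length(found) == 1, dir.exists(found))
basename(found)
```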
To exclude census tracts outside of our watershed boundary, intersect the layer with the Tampa Bay watershed (available as an RData object in the source repository for this website). If working in a different area, replace the tbshed object with your own boundary file.
load(file = 'data/tbshed.RData')
# intersect the layer with the tb watershed
tb_tract <- dat %>%
  st_transform(crs = st_crs(tbshed)) %>%
  st_make_valid() %>%
  st_intersection(tbshed)
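Note that st_intersection clips geometries to the boundary rather than selecting whole tracts, so tracts that straddle the watershed edge are cut at the boundary. The toy example below illustrates this behavior with made-up squares that have no CRS. The object names tracts, boundary, and clipped and the helper square are hypothetical and unrelated to the EJScreen data.

```r
library(sf)

# hypothetical helper: square polygon from a lower-left corner and side length
square <- function(x, y, side) {
  st_polygon(list(rbind(
    c(x, y), c(x + side, y), c(x + side, y + side), c(x, y + side), c(x, y)
  )))
}

# two toy "tracts": A overlaps the boundary, B lies entirely outside it
tracts <- st_sf(
  id = c('A', 'B'),
  geometry = st_sfc(square(0, 0, 1), square(5, 5, 1))
)

# toy boundary covering only the right half of tract A
boundary <- st_sf(
  name = 'toy boundary',
  geometry = st_sfc(square(0.5, 0, 3))
)

# tract B is dropped and tract A is cut where it crosses the boundary
clipped <- st_intersection(tracts, boundary)
clipped$id
st_area(clipped)
```

If whole tracts are preferred over clipped ones, a spatial filter such as st_filter is an alternative, at the cost of including area outside the boundary.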
The layer can be saved as an RData object if needed. The file size should be small (~1 MB).
# save the layer as an RData object (~1mb)
save(tb_tract, file = 'data/tb_tract.RData')
View the data using mapview (only the spatial data are shown). You can see that we now have the desired spatial data just for our watershed.
load(file = 'data/tb_tract.RData')
tb_tract %>%
  select(-everything()) %>%
  mapview(layer.name = "Census tracts")
Unlink the temporary files to delete them when you are finished.
unlink(tmp1, recursive = TRUE)
unlink(gdbpth, recursive = TRUE)