Installation
Install the package from r-universe as follows. The source code is available on GitHub.
# Install tbeploads in R:
install.packages('tbeploads', repos = c('https://tbep-tech.r-universe.dev', 'https://cloud.r-project.org'))
Load the package in an R session after installation:
library(tbeploads)
Usage
Load estimates are broadly defined as domestic point source (DPS), industrial point source (IPS), material losses (ML), nonpoint source (NPS), atmospheric deposition (AD), and groundwater sources and springs (GW). The functions are built around these sources with unique inputs for each.
DPS
The domestic point source (DPS) functions are designed to work with
raw entity data provided by partners. The core function is
anlz_dps_facility()
that requires only a vector of file
paths as input, where each path points to a file with monthly parameter
concentration (mg/L) and flow data (million gallons per day). The data
also describe whether the observations are end of pipe (direct inflow to
the bay) or reuse (applied to the land), with each defined by outfall
Ids typically noted as D-001, D-002, etc. and R-001, R-002, etc,
respectively. Both are estimated as concentration times flow, whereas
reuse includes an attenuation factor for land application depending on
location. The file names must follow a specific convention, where
metadata for each entity is found in the facilities()
data
object using information in the file name.
For convenience, four example files are included with the package. These files represent actual entities and facilities, but the data have been randomized. The paths to these files are used as input to the function. Non-trivial data pre-processing and quality control is needed for each file and those included in the package are the correct format. The output is returned as tons per month for TN, TP, TSS, and BOD and million cubic meters per month for flow (hy).
dpsfls <- list.files(system.file('extdata/', package = 'tbeploads'),
pattern = 'ps_dom', full.names = TRUE)
anlz_dps_facility(dpsfls)
#> # A tibble: 144 × 11
#> Year Month entity facility coastco source tn_load tp_load tss_load bod_load
#> <int> <int> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 2021 1 Clearw… City of… 387 D-001 0.862 6.04 0.452 1.14
#> 2 2021 2 Clearw… City of… 387 D-001 1.37 1.15 0.553 2.87
#> 3 2021 3 Clearw… City of… 387 D-001 1.70 0.575 0.982 3.31
#> 4 2021 4 Clearw… City of… 387 D-001 0.344 0.0801 0.105 0.656
#> 5 2021 5 Clearw… City of… 387 D-001 1.09 0.460 0.603 2.23
#> 6 2021 6 Clearw… City of… 387 D-001 0.641 0.435 0.248 0.588
#> 7 2021 7 Clearw… City of… 387 D-001 0.667 0.183 0.360 1.33
#> 8 2021 8 Clearw… City of… 387 D-001 0.138 0.164 0.0601 0.327
#> 9 2021 9 Clearw… City of… 387 D-001 0.421 1.63 0.281 0.655
#> 10 2021 10 Clearw… City of… 387 D-001 1.15 0.217 0.563 0.666
#> # ℹ 134 more rows
#> # ℹ 1 more variable: hy_load <dbl>
The anlz_dps()
function uses
anlz_dps_facility()
to summarize the DPS results by
location as facility (combines outfall data), entity (combines facility
data), bay segment (combines entity data), and as all (combines bay
segment data). The results can also be temporally summarized as monthly
or annual totals. The location summary is defined by the
summ
argument and the temporal summary is defined by the
summtime
argument. The fls
argument used by
anlz_dps_facility()
is also used by
anlz_dps()
. The output is tons per month for TN, TP, TSS,
and BOD and as million cubic meters per month for flow (hy) if
summtime = 'month'
or tons per year for TN, TP, TSS, and
BOD and million cubic meters per year for flow (hy) if
summtime = 'year'
.
# combine by entity and month
anlz_dps(dpsfls, summ = 'entity', summtime = 'month')
#> # A tibble: 108 × 10
#> Year Month source entity segment tn_load tp_load tss_load bod_load hy_load
#> <int> <int> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2021 1 DPS - e… Clear… Old Ta… 0.862 6.04 0.452 1.14 0.663
#> 2 2021 1 DPS - r… Clear… Old Ta… 0.102 0.00852 0.00844 0.0473 0.165
#> 3 2021 2 DPS - e… Clear… Old Ta… 1.37 1.15 0.553 2.87 0.948
#> 4 2021 2 DPS - r… Clear… Old Ta… 0.210 0.0416 0.0169 0.0244 0.319
#> 5 2021 3 DPS - e… Clear… Old Ta… 1.70 0.575 0.982 3.31 1.83
#> 6 2021 3 DPS - r… Clear… Old Ta… 0.261 0.0439 0.0194 0.0554 0.417
#> 7 2021 4 DPS - e… Clear… Old Ta… 0.344 0.0801 0.105 0.656 0.194
#> 8 2021 4 DPS - r… Clear… Old Ta… 0.0472 0.00602 0.00245 0.00499 0.0563
#> 9 2021 5 DPS - e… Clear… Old Ta… 1.09 0.460 0.603 2.23 0.874
#> 10 2021 5 DPS - r… Clear… Old Ta… 0.0354 0.00281 0.00298 0.00885 0.0729
#> # ℹ 98 more rows
# combine by bay segment and year
anlz_dps(dpsfls, summ = "segment", summtime = "year")
#> # A tibble: 9 × 8
#> Year source segment tn_load tp_load tss_load bod_load hy_load
#> <int> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2017 DPS - end of pipe Hillsboroug… 0.891 1.20e-1 2.90e-1 6.92e-1 5.25e-1
#> 2 2017 DPS - reuse Hillsboroug… 464. 2.94e+1 5.10e+1 1.70e+2 1.10e+3
#> 3 2018 DPS - end of pipe Hillsboroug… 102. 1.49e+1 3.58e+1 8.24e+1 6.43e+1
#> 4 2018 DPS - reuse Hillsboroug… 28.6 8.80e-1 1.82e+0 4.88e+0 3.95e+1
#> 5 2019 DPS - end of pipe Hillsboroug… 14.4 1.63e+0 2.92e+0 9.17e+0 6.55e+0
#> 6 2019 DPS - reuse Hillsboroug… 4.21 8.47e-2 1.33e-1 5.20e-1 3.82e+0
#> 7 2021 DPS - reuse Hillsboroug… 2.90 1.61e-9 1.61e-9 1.61e-9 1.76e+0
#> 8 2021 DPS - end of pipe Old Tampa B… 8.62 1.11e+1 4.34e+0 1.43e+1 7.35e+0
#> 9 2021 DPS - reuse Old Tampa B… 1.48 2.35e-1 1.19e-1 7.17e-1 2.39e+0
IPS
The industrial point source (IPS) functions are designed to work with
raw entity data provided by partners and are similar in functionality to
the DPS functions. The core function is anlz_ips_facility()
that requires only a vector of file paths as input, where each path
points to a file with monthly parameter concentration (mg/L) and flow
data (million gallons per day). Loads are estimated as concentration
times flow. The file names must follow a specific convention, where
metadata for each entity is found in the facilities()
data
object using information in the file name.
For convenience, four example files are included with the package. These files represent actual entities and facilities, but the data have been randomized. The paths to these files are used as input to the function. As before, non-trivial data pre-processing and quality control is needed for each file and those included in the package are the correct format. The output is returned as tons per month for TN, TP, TSS, and BOD and million cubic meters per month for flow (hy).
ipsfls <- list.files(system.file('extdata/', package = 'tbeploads'),
pattern = 'ps_ind_', full.names = TRUE)
anlz_ips_facility(ipsfls)
#> # A tibble: 60 × 11
#> Year Month entity facility coastco source tn_load tp_load tss_load bod_load
#> <int> <int> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 2020 1 Busch … Busch G… 191a D-002 0.0195 0.00253 0.443 0.850
#> 2 2020 2 Busch … Busch G… 191a D-002 0.0306 0.00328 0.544 1.04
#> 3 2020 3 Busch … Busch G… 191a D-002 0.0972 0.00925 1.39 2.68
#> 4 2020 4 Busch … Busch G… 191a D-002 0.0398 0.0258 0.522 1.00
#> 5 2020 5 Busch … Busch G… 191a D-002 0.0820 0.00514 0.506 0.971
#> 6 2020 6 Busch … Busch G… 191a D-002 0.0112 0.00594 0.355 0.681
#> 7 2020 7 Busch … Busch G… 191a D-002 0.0430 0.00413 0.506 0.971
#> 8 2020 8 Busch … Busch G… 191a D-002 0.0167 0.00225 0.199 0.382
#> 9 2020 9 Busch … Busch G… 191a D-002 0.0226 0.00307 0.332 0.638
#> 10 2020 10 Busch … Busch G… 191a D-002 0.0187 0.0186 0.333 0.638
#> # ℹ 50 more rows
#> # ℹ 1 more variable: hy_load <dbl>
The anlz_ips()
function uses
anlz_ips_facility()
to summarize the IPS results by
location as facility (combines outfall data), entity (combines facility
data), bay segment (combines entity data), and as all (combines bay
segment data). The results can also be temporally summarized as monthly
or annual totals. The location summary is defined by the
summ
argument and the temporal summary is defined by the
summtime
argument. The fls
argument used by
anlz_ips_facility()
is also used by
anlz_ips()
. The output is tons per month for TN, TP, TSS,
and BOD and as million cubic meters per month for flow (hy) if
summtime = 'month'
or tons per year for TN, TP, TSS, and
BOD and million cubic meters per year for flow (hy) if
summtime = 'year'
.
# combine by entity and month
anlz_ips(ipsfls, summ = 'entity', summtime = 'month')
#> # A tibble: 60 × 10
#> Year Month source entity segment tn_load tp_load tss_load bod_load hy_load
#> <int> <int> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2020 1 IPS Busch G… Hillsb… 0.0195 0.00253 0.443 0.850 0.0804
#> 2 2020 2 IPS Busch G… Hillsb… 0.0306 0.00328 0.544 1.04 0.0987
#> 3 2020 3 IPS Busch G… Hillsb… 0.0972 0.00925 1.39 2.68 0.253
#> 4 2020 4 IPS Busch G… Hillsb… 0.0398 0.0258 0.522 1.00 0.0946
#> 5 2020 5 IPS Busch G… Hillsb… 0.0820 0.00514 0.506 0.971 0.0917
#> 6 2020 6 IPS Busch G… Hillsb… 0.0112 0.00594 0.355 0.681 0.0644
#> 7 2020 7 IPS Busch G… Hillsb… 0.0430 0.00413 0.506 0.971 0.0918
#> 8 2020 8 IPS Busch G… Hillsb… 0.0167 0.00225 0.199 0.382 0.0361
#> 9 2020 9 IPS Busch G… Hillsb… 0.0226 0.00307 0.332 0.638 0.0603
#> 10 2020 10 IPS Busch G… Hillsb… 0.0187 0.0186 0.333 0.638 0.0603
#> # ℹ 50 more rows
# combine by bay segment and year
anlz_ips(ipsfls, summ = "segment", summtime = "year")
#> # A tibble: 5 × 8
#> Year source segment tn_load tp_load tss_load bod_load hy_load
#> <int> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2017 IPS Hillsborough Bay 0.215 0.0612 0 0 0.188
#> 2 2018 IPS Hillsborough Bay 0.168 0.0456 0 0 0.140
#> 3 2019 IPS Hillsborough Bay 0.0950 0.0226 0 0 0.0763
#> 4 2020 IPS Hillsborough Bay 0.437 0.0858 6.11 11.7 1.11
#> 5 2021 IPS Hillsborough Bay 0.0305 0.0515 0.0662 0 0.0184
ML
Material losses (ML) are estimates of nutrient loads to the bay primarily from fertilizer shipping activities at ports. Historically, loadings from material losses were much higher than at present. Only a few entities report material losses, typically as a total for the year and only for total nitrogen. The material losses as tons/yr are estimated from the tons shipped using an agreed upon loss rate. Values reported in the example files represent the estimated loss as the total tons of N shipped each year multiplied by 0.0023 and divided by 2000. The total N shipped at a facility each year can be obtained using a simple back-calculation (multiply by 2000, divide by 0.0023).
The core function is anlz_ml_facility()
that requires
only a vector of file paths as input, where each file should be one row
per year per facility, where the row shows the total tons per year of
total nitrogen loss. The file names must follow a specific convention,
where metadata for each entity is found in the facilities()
data object using information in the file name.
For convenience, four example files are included with the package. These files represent actual entities and facilities, but the data have been randomized. The paths to these files are used as input to the function. The output is nearly identical to the input data since no load calculations are used, except results are shown as monthly load as the annual loss divided by 12. Additional empty columns (e.g., TP load, TSS load, etc.) are also returned for consistency of reporting with other loading sources.
mlfls <- list.files(system.file('extdata/', package = 'tbeploads'),
pattern = 'ps_indml', full.names = TRUE)
anlz_ml_facility(mlfls)
#> # A tibble: 60 × 11
#> Year Month entity facility coastco source tn_load tp_load tss_load bod_load
#> <int> <int> <chr> <chr> <chr> <lgl> <dbl> <lgl> <lgl> <lgl>
#> 1 2017 1 Kinder… Kinder … NA NA 0.0155 NA NA NA
#> 2 2017 2 Kinder… Kinder … NA NA 0.0155 NA NA NA
#> 3 2017 3 Kinder… Kinder … NA NA 0.0155 NA NA NA
#> 4 2017 4 Kinder… Kinder … NA NA 0.0155 NA NA NA
#> 5 2017 5 Kinder… Kinder … NA NA 0.0155 NA NA NA
#> 6 2017 6 Kinder… Kinder … NA NA 0.0155 NA NA NA
#> 7 2017 7 Kinder… Kinder … NA NA 0.0155 NA NA NA
#> 8 2017 8 Kinder… Kinder … NA NA 0.0155 NA NA NA
#> 9 2017 9 Kinder… Kinder … NA NA 0.0155 NA NA NA
#> 10 2017 10 Kinder… Kinder … NA NA 0.0155 NA NA NA
#> # ℹ 50 more rows
#> # ℹ 1 more variable: hy_load <lgl>
The anlz_ml()
function uses
anlz_ml_facility()
to summarize the IPS results by location
as facility, entity (combines facility data), bay segment (combines
entity data), and as all (combines bay segment data). The results can
also be temporally summarized as monthly or annual totals. The location
summary is defined by the summ
argument and the temporal
summary is defined by the summtime
argument. The
fls
argument used by anlz_ml_facility()
is
also used by anlz_ml()
. The output is tons per month of TN
if summtime = 'month'
or tons per year of TN if
summtime = 'year'
. Columns for TP, TSS, BOD, and hydrologic
load are also returned with zero load for consistency with other point
source load calculation functions. Material loss loads are often
combined with IPS loads for reporting.
# combine by entity and month
anlz_ml(mlfls, summ = 'entity', summtime = 'month')
#> # A tibble: 60 × 10
#> Year Month source entity segment tn_load tp_load tss_load bod_load hy_load
#> <int> <int> <chr> <chr> <chr> <dbl> <int> <int> <int> <int>
#> 1 2020 1 ML CSX Hillsbor… 0.0743 0 0 0 0
#> 2 2020 2 ML CSX Hillsbor… 0.0743 0 0 0 0
#> 3 2020 3 ML CSX Hillsbor… 0.0743 0 0 0 0
#> 4 2020 4 ML CSX Hillsbor… 0.0743 0 0 0 0
#> 5 2020 5 ML CSX Hillsbor… 0.0743 0 0 0 0
#> 6 2020 6 ML CSX Hillsbor… 0.0743 0 0 0 0
#> 7 2020 7 ML CSX Hillsbor… 0.0743 0 0 0 0
#> 8 2020 8 ML CSX Hillsbor… 0.0743 0 0 0 0
#> 9 2020 9 ML CSX Hillsbor… 0.0743 0 0 0 0
#> 10 2020 10 ML CSX Hillsbor… 0.0743 0 0 0 0
#> # ℹ 50 more rows
# combine by bay segment and year
anlz_ml(mlfls, summ = "segment", summtime = "year")
#> # A tibble: 5 × 8
#> Year source segment tn_load tp_load tss_load bod_load hy_load
#> <int> <chr> <chr> <dbl> <int> <int> <int> <int>
#> 1 2017 ML Hillsborough Bay 0.186 0 0 0 0
#> 2 2018 ML Hillsborough Bay 0.188 0 0 0 0
#> 3 2019 ML Hillsborough Bay 0.0224 0 0 0 0
#> 4 2020 ML Hillsborough Bay 0.892 0 0 0 0
#> 5 2021 ML Hillsborough Bay 0.989 0 0 0 0
AD
Loading from atmospheric deposition (AD) for bay segments in the
Tampa Bay watershed are calculated using rainfall data from weather
stations in the watershed and atmospheric concentration data from the
Verna Wellfield site. Rainfall data must be obtained using the
util_ad_getrain()
function before calculating loads. For
convenience, daily rainfall data from 2017 to 2023 at sites in the
watershed are included with the package in the ad_rain
object.
head(ad_rain)
#> # A tibble: 6 × 6
#> station date Year Month Day rainfall
#> <dbl> <date> <int> <dbl> <int> <dbl>
#> 1 228 2017-01-01 2017 1 1 0
#> 2 228 2017-01-02 2017 1 2 0
#> 3 228 2017-01-03 2017 1 3 0
#> 4 228 2017-01-04 2017 1 4 0
#> 5 228 2017-01-05 2017 1 5 0.02
#> 6 228 2017-01-06 2017 1 6 0
The Verna Wellfield data must also be obtained from https://nadp.slh.wisc.edu/sites/ntn-FL41/ as monthly
observations. This file is also included with the package and can be
found using system.file()
as follows:
vernafl <- system.file('extdata/verna-raw.csv', package = 'tbeploads')
vernafl
#> [1] "/home/runner/work/_temp/Library/tbeploads/extdata/verna-raw.csv"
During load calculation, the Verna data are converted to total
nitrogen and total phosphorus from ammonium and nitrate concentration
data using the util_ad_prepverna()
function. Total nitrogen
and phosphorus concentrations are estimated from ammonium and nitrate
concentrations (mg/L) using the following relationships:
The first equation corrects for the % of ions in ammonium and nitrate that are N, and the second is a regression relationship between TBADS TN and TP, applied to Verna.
AD loads are estimated using the anlz_ad()
function,
where total hydrologic load by bay segment is calculated from the rain
data and total nitrogen and phosphorus load is calculated by multiplying
hydrologic load by the atmospheric deposition concentrations from the
Verna data. Total hydrologic load for each bay segment is calculated
using daily estimates of rainfall at NWIS NCDC sites in the watershed.
This is done as a weighted mean of rainfall at the measured sites
relative to grid locations in each bay segment. The weights are based on
distance of the grid cells from the closest site as inverse distance
squared. Total hydrologic load for a sub-watershed is then estimated by
converting inches/month to m3/month using the area of each bay segment.
The distance data and bay segment areas are contained in the file
included with the package.
head(ad_distance)
#> segment seg_x seg_y matchsit distance invdist2 area
#> 1 1 344902.2 3080488 520 26000.117 1.479277e-09 23407.05
#> 2 1 344902.2 3080488 940 39711.285 6.341210e-10 23407.05
#> 3 1 344902.2 3080488 945 44691.735 5.006631e-10 23407.05
#> 4 1 344902.2 3080488 1632 22105.640 2.046416e-09 23407.05
#> 5 1 344902.2 3080488 2806 8649.245 1.336730e-08 23407.05
#> 6 1 344902.2 3080488 3986 47920.032 4.354776e-10 23407.05
The total nitrogen and phosphorus loads are then estimated for each bay segment by multiplying the total hydrologic load by the total nitrogen and phosphorus concentrations in the Verna data. The loading calculations also include a wet/dry deposition conversion factor to account for differences in loading during the rainy and dry seasons.
Using anlz_ad()
to estimate AD load is done as follows,
where ad_rain
is the rain data and vernafl
is
the path to the Verna Wellfield data.
anlz_ad(ad_rain, vernafl)
#> # A tibble: 672 × 7
#> Year Month source segment tn_load tp_load hy_load
#> <int> <dbl> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 2017 1 AD Boca Ciega Bay 1.13 0.0192 1.98
#> 2 2017 2 AD Boca Ciega Bay 0.945 0.0168 1.95
#> 3 2017 3 AD Boca Ciega Bay 0.950 0.0181 2.48
#> 4 2017 4 AD Boca Ciega Bay 3.23 0.0504 3.90
#> 5 2017 5 AD Boca Ciega Bay 9.06 0.131 6.89
#> 6 2017 6 AD Boca Ciega Bay 9.67 0.178 22.5
#> 7 2017 7 AD Boca Ciega Bay 10.8 0.189 26.2
#> 8 2017 8 AD Boca Ciega Bay 5.58 0.120 24.6
#> 9 2017 9 AD Boca Ciega Bay 2.14 0.0553 14.1
#> 10 2017 10 AD Boca Ciega Bay 0.493 0.0174 5.56
#> # ℹ 662 more rows
Results can be summarized by segment, baywide, monthly, or annually
using the summ
and summtime
arguments. By
default, loads are returned monthly for each segment. Note that Boca
Ciega Bay and Boca Ciega Bay South results are returned separately. Only
Boca Ciega Bay South is used when estimating total bay loads.