Analyze Fecal Indicator Bacteria categories over time by station or bay segment

Usage

anlz_fibmatrix(
  fibdata,
  yrrng = NULL,
  stas = NULL,
  bay_segment = NULL,
  indic,
  threshold = NULL,
  lagyr = 3,
  subset_wetdry = c("all", "wet", "dry"),
  precipdata = NULL,
  temporal_window = NULL,
  wet_threshold = NULL,
  warn = TRUE
)

Arguments

fibdata

input data frame as returned by read_importfib, read_importentero, or read_importwqp, see details

yrrng

numeric vector indicating min, max years to include, defaults to range of years in data, see details

stas

optional vector of stations to include, see details

bay_segment

optional vector of bay segment names to include, supersedes stas if provided, see details

indic

character for choice of fecal indicator. Allowable options are fcolif for fecal coliform, or entero for Enterococcus. A numeric column in the data frame must have this name.

threshold

optional numeric for threshold against which to calculate exceedances for the indicator bacteria of choice. If not provided, defaults to 400 for fcolif and 130 for entero.

lagyr

numeric for year lag to calculate categories, see details

subset_wetdry

character, subset data frame to only wet or dry samples as defined by wet_threshold and temporal_window? Defaults to "all", which will not subset. If "wet" or "dry" is specified, anlz_fibwetdry is called using the further specified parameters, and the data frame is subsetted accordingly.

precipdata

input data frame as returned by read_importrain. columns should be: station, date (yyyy-mm-dd), rain (in inches). The object catchprecip has this data from 1995-2023 for select Enterococcus stations. If NULL, defaults to catchprecip.

temporal_window

numeric; required if subset_wetdry is not "all". number of days precipitation should be summed over (1 = day of sample only; 2 = day of sample + day before; etc.)

wet_threshold

numeric; required if subset_wetdry is not "all". inches accumulated through the defined temporal window, above which a sample should be defined as being from a 'wet' time period

warn

logical to print warnings about stations with insufficient data, default TRUE
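
The temporal_window and wet_threshold arguments above can be illustrated with a small base R sketch. It is only an assumption about how such a classification could work, not the code used by anlz_fibwetdry: the rainfall values, sample dates, and column names rain_total and wet_sample are hypothetical, and the strict "greater than" comparison follows the wording "above which" in the argument description.

```r
# Hypothetical daily rainfall for one station (inches); not real data
rain <- data.frame(
  date = as.Date("2020-06-01") + 0:5,
  rain = c(0, 0.3, 0.4, 0, 0, 0.6)
)

# Hypothetical sample dates for the same station
samples <- data.frame(
  date = as.Date(c("2020-06-03", "2020-06-05", "2020-06-06"))
)

temporal_window <- 2  # day of sample + day before
wet_threshold <- 0.5  # inches

# Sum rainfall over the window that ends on each sample date
samples$rain_total <- vapply(seq_along(samples$date), function(i) {
  d <- samples$date[i]
  in_window <- rain$date <= d & rain$date > d - temporal_window
  sum(rain$rain[in_window])
}, numeric(1))

# Assumed rule: a sample is 'wet' if the accumulated rain is above the threshold
samples$wet_sample <- samples$rain_total > wet_threshold

samples
```

With these made-up values, the first and third samples are flagged as wet and the second as dry.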

Value

A tibble object with FIB summaries by year and station including columns for the estimated geometric mean of fecal coliform or Enterococcus concentrations (gmean), the proportion of samples exceeding the threshold (exced; by default 400 CFU / 100 mL for fecal coliform or 130 CFU / 100 mL for Enterococcus), the count of samples (cnt), and a category indicating a letter outcome based on the proportion of exceedances (cat). Results can be summarized by bay segment if bay_segment is not NULL and the input data is from read_importentero.
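
For readers unfamiliar with the geometric mean reported in gmean, the following base R sketch shows the usual definition (the exponential of the mean of the logged values). The helper name gmean and the concentrations are hypothetical, and the sketch assumes strictly positive values; how the package handles zeros or censored results is not covered here.

```r
# Geometric mean via the mean of logs; assumes all values are positive
gmean <- function(x) exp(mean(log(x), na.rm = TRUE))

# Hypothetical concentrations (CFU / 100 mL)
conc <- c(10, 100, 1000)

gmean(conc)  # about 100
mean(conc)   # 370, the arithmetic mean is pulled up by the largest value
```

The geometric mean is less sensitive to occasional very high counts, which is why it is commonly used to summarize bacteria concentrations.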

Details

This function is used to create output for plotting a matrix stoplight graphic for FIB categories by station. The output can also be summarized by bay segment if bay_segment is not NULL and the input data is from read_importentero. In that case, the stas argument is ignored and all stations within each subsegment watershed are used to evaluate the FIB categories.

Each station (or bay segment) and year combination is categorized based on the likelihood of fecal indicator bacteria concentrations exceeding a threshold in a given year. For fecal coliform (the fcolif column in fibdata), the default threshold is 400 CFU / 100 mL. For Enterococcus, the default threshold is 130 CFU / 100 mL. The proportions are categorized as A, B, C, D, or E (Microbial Water Quality Assessment or MWQA categories) with corresponding colors, where the breakpoints for each category are <10%, 10-30%, 30-50%, 50-75%, and >75% (right-closed).

By default, the results for each year are based on a right-aligned window that uses the previous two years and the current year to calculate probabilities using the monthly samples (lagyr = 3). See show_fibmatrix for additional details.
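
The lag window and the category breakpoints can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's internal method: it bins the raw exceedance proportion directly, whereas the function describes its categories in terms of likelihoods, and it assumes years without a full lagyr window are left as NA. The data frame fib and its values are hypothetical.

```r
# Hypothetical Enterococcus results for one station (CFU / 100 mL)
fib <- data.frame(
  yr = rep(2018:2020, each = 4),
  entero = c(20, 150, 40, 10, 200, 300, 50, 500, 10, 20, 30, 140)
)

threshold <- 130
lagyr <- 3

yrs <- sort(unique(fib$yr))
out <- data.frame(yr = yrs, exced = NA_real_, cnt = NA_integer_)

for (i in seq_along(yrs)) {
  # Window covers the current year and the previous (lagyr - 1) years
  in_window <- fib$yr > yrs[i] - lagyr & fib$yr <= yrs[i]
  # Assumption: only summarize when the full window is available
  if (yrs[i] - lagyr + 1 >= min(yrs)) {
    out$exced[i] <- mean(fib$entero[in_window] > threshold)
    out$cnt[i] <- sum(in_window)
  }
}

# Right-closed bins: <=10%, (10, 30], (30, 50], (50, 75], >75%
out$cat <- cut(
  out$exced,
  breaks = c(-Inf, 0.1, 0.3, 0.5, 0.75, Inf),
  labels = c("A", "B", "C", "D", "E"),
  right = TRUE
)

out
```

Here only 2020 has three years of data behind it; 5 of its 12 windowed samples exceed 130, which falls in the 30-50% bin (category C).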

yrrng can be specified several ways. If yrrng = NULL, the year range of the data for the selected stations is chosen. User-defined values for the minimum and maximum years can also be used, or only a minimum or maximum can be specified, e.g., yrrng = c(2000, 2010) or yrrng = c(2000, NA). In the latter case, the maximum year will be defined by the data.
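
The yrrng rules described above could be resolved along the following lines. The helper resolve_yrrng is hypothetical and is not part of the package; it only mirrors the behavior stated in the text (NULL uses the data range, and an NA in either position is filled from the data).

```r
# Hypothetical helper: fill a missing year range from the years in the data
resolve_yrrng <- function(yrrng, data_years) {
  data_rng <- range(data_years, na.rm = TRUE)
  if (is.null(yrrng)) {
    return(data_rng)
  }
  stopifnot(length(yrrng) == 2)
  # Replace NA in the min or max position with the data min or max
  ifelse(is.na(yrrng), data_rng, yrrng)
}

data_years <- 1995:2020  # hypothetical years present in the data

resolve_yrrng(NULL, data_years)           # 1995 2020
resolve_yrrng(c(2000, 2010), data_years)  # 2000 2010
resolve_yrrng(c(2000, NA), data_years)    # 2000 2020
resolve_yrrng(c(NA, 2010), data_years)    # 1995 2010
```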

The default stations for fecal coliform data are those used in TBEP report #05-13 (https://drive.google.com/file/d/1MZnK3cMzV7LRg6dTbCKX8AOZU0GNurJJ/view) for the Hillsborough River Basin Management Action Plan (BMAP) subbasins if bay_segment is NULL and the input data are from read_importfib. These include Blackwater Creek (WBID 1482, EPC stations 143, 108), Baker Creek (WBID 1522C, EPC station 107), Lake Thonotosassa (WBID 1522B, EPC stations 135, 118), Flint Creek (WBID 1522A, EPC station 148), and the Lower Hillsborough River (WBID 1443E, EPC stations 105, 152, 137). Other stations can be plotted using the stas argument.
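
For convenience, the default stations listed above can be written out as a named list in R. The object names default_stas and stas_all are hypothetical; the station numbers and WBIDs are copied from the paragraph above. Whether station identifiers should be passed to stas as numeric or character values depends on the input data and is not assumed here.

```r
# Default EPC stations by BMAP subbasin, as listed in the text above
default_stas <- list(
  "Blackwater Creek (WBID 1482)" = c(143, 108),
  "Baker Creek (WBID 1522C)" = c(107),
  "Lake Thonotosassa (WBID 1522B)" = c(135, 118),
  "Flint Creek (WBID 1522A)" = c(148),
  "Lower Hillsborough River (WBID 1443E)" = c(105, 152, 137)
)

# Flatten to a single vector of stations
stas_all <- unlist(default_stas, use.names = FALSE)
stas_all
```

A single subbasin can be pulled out with, for example, default_stas[["Lake Thonotosassa (WBID 1522B)"]].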

Input from read_importwqp for Manatee County (21FLMANA_WQX) FIB data can also be used. The function has not been tested for other organizations.

Examples

anlz_fibmatrix(fibdata, indic = 'fcolif')
#> # A tibble: 459 × 6
#>       yr grp   gmean Latitude Longitude cat  
#>    <dbl> <fct> <dbl>    <dbl>     <dbl> <chr>
#>  1  1974 143      NA       NA        NA NA   
#>  2  1974 108      NA       NA        NA NA   
#>  3  1974 107      NA       NA        NA NA   
#>  4  1974 135      NA       NA        NA NA   
#>  5  1974 118      NA       NA        NA NA   
#>  6  1974 148      NA       NA        NA NA   
#>  7  1974 105      NA       NA        NA NA   
#>  8  1974 152      NA       NA        NA NA   
#>  9  1974 137      NA       NA        NA NA   
#> 10  1975 143      NA       NA        NA NA   
#> # ℹ 449 more rows

# use different indicator
anlz_fibmatrix(fibdata, indic = 'entero')
#> # A tibble: 216 × 6
#>       yr grp   gmean Latitude Longitude cat  
#>    <dbl> <fct> <dbl>    <dbl>     <dbl> <chr>
#>  1  2001 143      NA       NA        NA NA   
#>  2  2001 108      NA       NA        NA NA   
#>  3  2001 107      NA       NA        NA NA   
#>  4  2001 135      NA       NA        NA NA   
#>  5  2001 118      NA       NA        NA NA   
#>  6  2001 148      NA       NA        NA NA   
#>  7  2001 105      NA       NA        NA NA   
#>  8  2001 152      NA       NA        NA NA   
#>  9  2001 137      NA       NA        NA NA   
#> 10  2002 143      NA       NA        NA NA   
#> # ℹ 206 more rows

# use different dataset
anlz_fibmatrix(enterodata, indic = 'entero', lagyr = 1)
#> Warning: Stations with insufficient data for lagyr: 21FLPDEM_WQX-05-06
#> # A tibble: 1,224 × 6
#>       yr grp                        gmean Latitude Longitude cat  
#>    <dbl> <fct>                      <dbl>    <dbl>     <dbl> <chr>
#>  1  2000 21FLCOSP_WQX-32-03          NA       NA        NA   NA   
#>  2  2000 21FLCOSP_WQX-44-02          NA       NA        NA   NA   
#>  3  2000 21FLCOSP_WQX-48-03          NA       NA        NA   NA   
#>  4  2000 21FLCOSP_WQX-CENTRAL CANAL  NA       NA        NA   NA   
#>  5  2000 21FLCOSP_WQX-COSP580        NA       NA        NA   NA   
#>  6  2000 21FLCOSP_WQX-NORTH CANAL    NA       NA        NA   NA   
#>  7  2000 21FLCOSP_WQX-SC-01          NA       NA        NA   NA   
#>  8  2000 21FLCOSP_WQX-SOUTH CANAL    NA       NA        NA   NA   
#>  9  2000 21FLDOH_WQX-MANATEE152      10.7     27.5     -82.7 A    
#> 10  2000 21FLHILL_WQX-101            NA       NA        NA   NA   
#> # ℹ 1,214 more rows

# same entero data; lower threshold - changes 'cat' scores
anlz_fibmatrix(enterodata, indic = 'entero', lagyr = 1, threshold = 30)
#> Warning: Stations with insufficient data for lagyr: 21FLPDEM_WQX-05-06
#> # A tibble: 1,224 × 6
#>       yr grp                        gmean Latitude Longitude cat  
#>    <dbl> <fct>                      <dbl>    <dbl>     <dbl> <chr>
#>  1  2000 21FLCOSP_WQX-32-03          NA       NA        NA   NA   
#>  2  2000 21FLCOSP_WQX-44-02          NA       NA        NA   NA   
#>  3  2000 21FLCOSP_WQX-48-03          NA       NA        NA   NA   
#>  4  2000 21FLCOSP_WQX-CENTRAL CANAL  NA       NA        NA   NA   
#>  5  2000 21FLCOSP_WQX-COSP580        NA       NA        NA   NA   
#>  6  2000 21FLCOSP_WQX-NORTH CANAL    NA       NA        NA   NA   
#>  7  2000 21FLCOSP_WQX-SC-01          NA       NA        NA   NA   
#>  8  2000 21FLCOSP_WQX-SOUTH CANAL    NA       NA        NA   NA   
#>  9  2000 21FLDOH_WQX-MANATEE152      10.7     27.5     -82.7 A    
#> 10  2000 21FLHILL_WQX-101            NA       NA        NA   NA   
#> # ℹ 1,214 more rows

# subset to only wet samples
anlz_fibmatrix(enterodata, indic = 'entero', lagyr = 1, subset_wetdry = "wet",
               temporal_window = 2, wet_threshold = 0.5)
#> # A tibble: 1,150 × 6
#>       yr grp                         gmean Latitude Longitude cat  
#>    <dbl> <fct>                       <dbl>    <dbl>     <dbl> <chr>
#>  1  2001 21FLCOSP_WQX-32-03           NA       NA        NA   NA   
#>  2  2001 21FLCOSP_WQX-44-02           NA       NA        NA   NA   
#>  3  2001 21FLCOSP_WQX-48-03           NA       NA        NA   NA   
#>  4  2001 21FLCOSP_WQX-CENTRAL CANAL   NA       NA        NA   NA   
#>  5  2001 21FLCOSP_WQX-COSP580         NA       NA        NA   NA   
#>  6  2001 21FLCOSP_WQX-NORTH CANAL     NA       NA        NA   NA   
#>  7  2001 21FLCOSP_WQX-SC-01           NA       NA        NA   NA   
#>  8  2001 21FLCOSP_WQX-SOUTH CANAL     NA       NA        NA   NA   
#>  9  2001 21FLDOH_WQX-MANATEE152       31.6     27.5     -82.7 A    
#> 10  2001 21FLHILL_WQX-101           2252.      28.0     -82.6 C    
#> # ℹ 1,140 more rows

# Manatee County data
anlz_fibmatrix(mancofibdata, indic = 'fcolif', lagyr = 1)
#> # A tibble: 1,350 × 6
#>       yr grp   gmean Latitude Longitude cat  
#>    <dbl> <fct> <dbl>    <dbl>     <dbl> <chr>
#>  1  1995 396    NA       NA        NA   NA   
#>  2  1995 BC1    NA       NA        NA   NA   
#>  3  1995 BC2    NA       NA        NA   NA   
#>  4  1995 BC41   NA       NA        NA   NA   
#>  5  1995 BL01   NA       NA        NA   NA   
#>  6  1995 BL201  NA       NA        NA   NA   
#>  7  1995 BR1    14.8     27.4     -82.5 A    
#>  8  1995 BR2    41.1     27.4     -82.5 A    
#>  9  1995 BR3    62.4     27.4     -82.5 A    
#> 10  1995 BU01A  NA       NA        NA   NA   
#> # ℹ 1,340 more rows