First, install the wqtrends package from the ssc branch on GitHub:
install.packages('devtools')
library(devtools)
install_github('tbep-tech/wqtrends', ref = 'ssc')
Load the package (and dplyr):
library(wqtrends)
library(dplyr)
The rawdat object now includes ssc (mg/L) from Dumbarton
Bridge as daily-averaged values.
head(rawdat)
## date station param value doy cont_year yr mo ssc
## 1 1993-01-05 30 chl 1.400000 5 1993.011 1993 Jan 53.84946
## 2 1993-01-05 24 chl 1.400000 5 1993.011 1993 Jan 53.84946
## 3 1993-01-05 27 chl 1.433333 5 1993.011 1993 Jan 53.84946
## 4 1993-01-05 21 chl 1.366667 5 1993.011 1993 Jan 53.84946
## 5 1993-01-05 32 chl 1.400000 5 1993.011 1993 Jan 53.84946
## 6 1993-01-05 22 chl 1.333333 5 1993.011 1993 Jan 53.84946
The sscdat object includes the full record of ssc data.
The dataset is used to create prediction matrices for the downstream
analysis and plotting functions. This allows for an assessment of
results from a fitted model at points in the time series when
chlorophyll-a was not monitored, but ssc is available. The ssc data are
available for most but not all days.
head(sscdat)
## # A tibble: 6 x 2
## date ssc
## <date> <dbl>
## 1 1992-10-21 36.3
## 2 1992-10-22 44.7
## 3 1992-10-23 45.7
## 4 1992-10-24 48.9
## 5 1992-10-25 59.4
## 6 1992-10-26 62.2
The anlz_gam function includes an additional smoother
for ssc. By default, the number of knots for the continuous year
variable (cont_year) is set to “high” as 12 times the
number of years in the input data. The default number of knots for ssc
is 24.
tomod <- rawdat %>%
filter(station %in% 34) %>%
filter(param %in% 'chl')
mod1 <- anlz_gam(tomod, trans = 'log10')
mod1
##
## Family: gaussian
## Link function: identity
##
## Formula:
## value ~ s(cont_year, k = 324) + s(ssc, k = 24)
##
## Estimated degrees of freedom:
## 160.20 2.16 total = 163.36
##
## GCV score: 0.08157124
Changing the knots for cont_year and ssc
can be done with the kts and ssckts
arguments.
mod2 <- anlz_gam(tomod, trans = 'log10', kts = 10, ssckts = 200)
mod2
##
## Family: gaussian
## Link function: identity
##
## Formula:
## value ~ s(cont_year, k = 10) + s(ssc, k = 200)
##
## Estimated degrees of freedom:
## 4.08 2.01 total = 7.09
##
## GCV score: 0.1169829
The original GAM formula without ssc can be used by setting the
usessc argument to FALSE.
mod3 <- anlz_gam(tomod, trans = 'log10', usessc = FALSE)
mod3
##
## Family: gaussian
## Link function: identity
##
## Formula:
## value ~ s(cont_year, k = 324)
##
## Estimated degrees of freedom:
## 169 total = 170.21
##
## GCV score: 0.09209277
Fits for each model can be assessed as before using the
anlz_fit function.
anlz_fit(mod1)
## AIC GCV R2
## GCV.Cp 15.80171 0.08157124 0.6891917
anlz_fit(mod2)
## AIC GCV R2
## GCV.Cp 253.8026 0.1169829 0.2071176
anlz_fit(mod3)
## AIC GCV R2
## GCV.Cp 48.37136 0.09209277 0.6610841
The plotting functions can also be used as before.
show_prdseries(mod1, ylab = 'Chlorophyll-a (ug/L)')
show_prdseries(mod2, ylab = 'Chlorophyll-a (ug/L)')
show_prdseries(mod3, ylab = 'Chlorophyll-a (ug/L)')
show_prddoy(mod1, ylab = 'Chlorophyll-a (ug/L)')
show_prddoy(mod2, ylab = 'Chlorophyll-a (ug/L)')
show_prddoy(mod3, ylab = 'Chlorophyll-a (ug/L)')