| Transect | Species count | Species |
|---|---|---|
| 10 | 1 | Halodule |
| 3 | 2 | Halodule, Thalassia |
| 6 | 2 | DR: Chondria, Thalassia |
| 7 | 5 | DR: Acanthophora, DR: Chondria, Halodule, Syringodium, Thalassia |
| 8 | 3 | DR: Acanthophora, Halodule, Thalassia |
| 9 | 4 | DR: Chondria, DR: Hypnea, Syringodium, Thalassia |
How Report Card Scores Are Calculated
Overview
Each year, multiple monitoring groups survey the same seagrass transects independently. Because there is no single external ground truth, scores are based on how consistently groups agree with each other. A group that reports values close to the cross-group average earns a high score, while one that deviates substantially earns a lower score.
Scores are calculated for three field measurements:
- Abundance — Braun-Blanquet cover category (0, 0.1, 0.5, 1, 2, 3, 4, 5)
- Blade Length — average blade length in cm
- Short Shoot Density — shoots per m²
These three metric scores are averaged into an overall Total score, which is then converted to a letter grade.
Step 1: The Consensus Species List
Not every species recorded across all groups counts toward scoring. A species at a given transect is considered truly present only if at least two distinct groups reported it with non-zero abundance. This filters out likely misidentifications while still being inclusive — a species does not need to be unanimous, just corroborated.
The consensus list is computed per transect, so a species may be on the list at one transect but not another.
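The corroboration rule can be sketched in a few lines of Python. This is a minimal illustration, not the report's actual code: the record layout `(group, transect, species, abundance)`, the function name `consensus_species`, the group names, and the species "Ruppia" are all hypothetical.

```python
from collections import defaultdict


def consensus_species(records, min_groups=2):
    """Return {transect: sorted species reported by >= min_groups distinct groups}.

    `records` is an iterable of (group, transect, species, abundance) tuples
    (an assumed layout). Only non-zero abundance counts as a report.
    """
    groups_seen = defaultdict(set)
    for group, transect, species, abundance in records:
        if abundance > 0:
            groups_seen[(transect, species)].add(group)

    consensus = defaultdict(list)
    for (transect, species), groups in groups_seen.items():
        if len(groups) >= min_groups:
            consensus[transect].append(species)
    return {transect: sorted(names) for transect, names in consensus.items()}


# Hypothetical records for two transects.
records = [
    ("Group A", 3, "Halodule", 4),
    ("Group B", 3, "Halodule", 3),
    ("Group A", 3, "Thalassia", 0.5),
    ("Group B", 3, "Thalassia", 0.1),
    ("Group A", 3, "Ruppia", 0.1),     # reported by one group only: excluded
    ("Group B", 10, "Halodule", 1),
    ("Group C", 10, "Halodule", 2),
    ("Group C", 10, "Thalassia", 0),   # zero abundance does not count as a report
    ("Group A", 10, "Thalassia", 0.5),
]

consensus = consensus_species(records)
print(consensus)
```

Because the grouping key is `(transect, species)`, Thalassia ends up on the list for transect 3 but not for transect 10 in this made-up data, which mirrors the per-transect behaviour described above.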
Step 2: True Values
For each consensus species at each transect, true values are calculated as the cross-group average:
- Abundance: group Braun-Blanquet categories are converted to their numeric level (1–8), averaged across groups, then converted back to the nearest BB value.
- Blade Length and Short Shoot Density: simple means across groups.
Only consensus species enter this calculation. This prevents a misidentified species (recorded by only one group) from distorting the true values for everyone else.
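A small Python sketch of the averaging step, under stated assumptions: the level mapping follows the eight Braun-Blanquet categories listed in the Overview, ties are rounded half up when converting back (the source only says "nearest"), and the input values are hypothetical.

```python
import math
from statistics import mean

# Braun-Blanquet values in order; list position + 1 is the numeric level (1-8).
BB_VALUES = [0, 0.1, 0.5, 1, 2, 3, 4, 5]


def bb_to_level(bb_value):
    """Convert a Braun-Blanquet value to its numeric level (1-8)."""
    return BB_VALUES.index(bb_value) + 1


def level_to_bb(level):
    """Convert a (possibly fractional) level back to the nearest BB value.

    Rounding half up and clamping to 1-8 are assumptions of this sketch.
    """
    nearest = min(max(math.floor(level + 0.5), 1), 8)
    return BB_VALUES[nearest - 1]


def true_abundance(group_bb_values):
    """Average the groups' levels, then map back to a BB value."""
    levels = [bb_to_level(value) for value in group_bb_values]
    return level_to_bb(mean(levels))


# Hypothetical reports from three groups for one species at one transect.
abundance_true = true_abundance([4, 3, 4])      # levels 7, 6, 7
blade_length_true = mean([23.0, 25.1, 25.7])    # simple mean, in cm

print(abundance_true, round(blade_length_true, 1))
```

Averaging on the level scale rather than on the raw BB values keeps the low categories (0, 0.1, 0.5) evenly spaced with the rest.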
| Transect | Species | Abundance | Blade Length | Short Shoot Density |
|---|---|---|---|---|
| 3 | Halodule | 4 | 24.6 cm | 9 per m² |
| 3 | Thalassia | 0.5 | 23.0 cm | 1 per m² |
A dash (—) means that measurement was not recorded for that species at that transect.
Step 3: Group Deviations and Species ID Penalties
For each group, reported values are compared to the true values across all consensus species and transects. This comparison uses a full join, which naturally reveals two types of species identification errors:
Missed species — a consensus species is present but the group did not record it. The group’s abundance for that species is treated as level 1 (“no coverage”) when computing the deviation, applying a slight penalty.
False positives — the group recorded a species that is not on the consensus list (i.e., no other group confirmed it). The true abundance is treated as level 1, again applying a slight penalty.
These penalties only affect the Abundance score. Blade Length and Short Shoot Density cannot be meaningfully penalised for species that were not found, so those metrics use na.rm = TRUE when averaging.
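The full-join logic can be illustrated as follows. This is a hedged sketch in Python, not the report's own code: the dictionary layout, the function names, and the species "Ruppia" are hypothetical, and `None` stands in for a missing measurement in the same way `NA` would in R.

```python
# Braun-Blanquet values in order; list position + 1 is the numeric level (1-8).
BB_VALUES = [0, 0.1, 0.5, 1, 2, 3, 4, 5]


def bb_level(bb_value):
    """Map a Braun-Blanquet value to its numeric level (1-8)."""
    return BB_VALUES.index(bb_value) + 1


def abundance_deviations(reported, true_values):
    """Full join of one group's records with the consensus true values.

    Both arguments map (transect, species) -> BB value. A key missing on
    either side is treated as BB 0, i.e. level 1 ("no coverage").
    Returns (transect, species) -> reported level minus true level.
    """
    keys = sorted(set(reported) | set(true_values))
    return {
        key: bb_level(reported.get(key, 0)) - bb_level(true_values.get(key, 0))
        for key in keys
    }


def mean_ignoring_missing(values):
    """Mean of the non-missing entries; None marks a missing measurement."""
    present = [value for value in values if value is not None]
    return sum(present) / len(present) if present else None


true_values = {(3, "Halodule"): 4, (3, "Thalassia"): 0.5}
reported = {(3, "Halodule"): 4, (3, "Ruppia"): 0.1}  # misses Thalassia, adds Ruppia

deviations = abundance_deviations(reported, true_values)
print(deviations)
print(mean_ignoring_missing([2.0, None, 4.0]))
```

In this made-up case the missed Thalassia yields a deviation of -2 levels and the unconfirmed Ruppia a deviation of +1, while the blade-length style average simply skips the missing entry.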
| Transect | Species | Abundance reported | Abundance true | Blade Length reported | Blade Length true |
|---|---|---|---|---|---|
| 3 | Halodule | 51-75% | 51-75% | 23.0 | 24.6 |
| 3 | Thalassia | solitary | few | 7.2 | 23.0 |
Step 4: Metric Scores
For each metric, deviations from the true value are summarised per species across transects, then combined into a single number per group per metric. The combination sums the absolute mean deviation of each species, with each species weighted by \(1 / (1 + \sigma_s)\):
\[ \text{metric score}_{\text{raw}} = \sum_{\text{species}} \frac{|\bar{d}_s|}{1 + \sigma_s} \]
where \(\bar{d}_s\) is the mean deviation (reported minus true) for species \(s\) across transects, taken in absolute value, and \(\sigma_s\) is the standard deviation of the true values for that species across transects. Species where the true value varies a lot across transects (high \(\sigma_s\)) contribute less to the final score, since agreement there is inherently harder to achieve.
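As a rough illustration, the formula as printed can be evaluated like this in Python. The input layout and numbers are hypothetical, and two details are assumptions of this sketch rather than facts from the source: the sample standard deviation is used for \(\sigma_s\), and a species seen at a single transect gets \(\sigma_s = 0\).

```python
from statistics import mean, stdev


def raw_metric_score(deviations, true_values):
    """Sum over species of |mean deviation| / (1 + sd of true values).

    Both arguments map species -> list of per-transect values (assumed layout).
    """
    total = 0.0
    for species, species_deviations in deviations.items():
        d_bar = mean(species_deviations)
        truths = true_values[species]
        # Sample SD is an assumption; a single transect gets sigma = 0.
        sigma = stdev(truths) if len(truths) > 1 else 0.0
        total += abs(d_bar) / (1 + sigma)
    return total


# Hypothetical abundance levels at two transects.
deviations = {"Halodule": [0, 0], "Thalassia": [-1, -1]}
true_values = {"Halodule": [6, 6], "Thalassia": [4, 6]}

score = raw_metric_score(deviations, true_values)
print(round(score, 3))
```

Here Halodule contributes nothing, and the one-level miss on Thalassia is damped because its true value differs between the two transects.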
| Species | Reported (avg) | True (avg) | Mean deviation |
|---|---|---|---|
| Halodule | 25-50% | 25-50% | 0 |
| Syringodium | <5% | <5% | 0 |
| Thalassia | <5% | 5-25% | -1 |
Step 5: Score Calibration
Raw metric scores are converted to a 0–100 scale. Without calibration, the best group in any year always maps to ~100 and the worst always maps to ~50, regardless of how closely groups agreed. This means a year where everyone performed very well would still produce a spread from A to D — an unfair outcome.
To address this, the score floor (the minimum possible score) is raised in years when all groups agree closely with each other, and kept at 50 in years when disagreement is typical or high.
How calibration works
For each year and metric, we compute the within-year standard deviation of group deviations — a measure of how spread out the groups were. Across all training years, these yearly spreads are standardised to z-scores. A negative z-score means groups agreed more than usual; a positive z-score means more disagreement than usual.
The score floor for a given year and metric is:
\[ \text{floor} = \max\!\left(50,\ 50 - z \times 15\right) \]
where \(z\) is the z-score of within-year spread for that metric and year, and 15 is a scaling constant (grade-points per standard deviation). A year that is 1 SD tighter than average (\(z = -1\)) gets a floor of 65; 2 SDs tighter (\(z = -2\)) gets a floor of 80. In loose years (positive \(z\)) the floor stays at 50, so there is no extra penalty beyond the standard range.
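The floor formula is simple enough to check by hand; the Python sketch below does so. The function names and the yearly spread values are hypothetical, and standardising with the sample standard deviation is an assumption, since the source does not say which variant is used.

```python
from statistics import mean, stdev


def spread_z_scores(yearly_spreads):
    """Standardise within-year spreads across years.

    `yearly_spreads` maps year -> spread (assumed layout); uses the sample SD.
    """
    values = list(yearly_spreads.values())
    mu, sd = mean(values), stdev(values)
    return {year: (spread - mu) / sd for year, spread in yearly_spreads.items()}


def score_floor(z, base=50.0, points_per_sd=15.0):
    """floor = max(50, 50 - z * 15): raised in tight years, held at 50 otherwise."""
    return max(base, base - z * points_per_sd)


# Hypothetical spreads: the first year is tighter than the others.
z_by_year = spread_z_scores({2023: 1.0, 2024: 2.0, 2025: 3.0})
floors = {year: score_floor(z) for year, z in z_by_year.items()}
print(floors)

# Z-scores quoted for 2025 in the worked example of this document.
print(round(score_floor(-1.60)), round(score_floor(0.67)), round(score_floor(-0.76)))
```

The last line gives floors of 74, 50 and 61 when rounded, matching the 2025 values shown for Abundance, Blade Length and Short Shoot Density.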
Effect on scores: a tight vs. loose year
The table below compares the calibrated score floor for each year and metric, illustrating how the floor shifts in tighter training years.
| Year | Abundance | Blade Length | Short Shoot Density |
|---|---|---|---|
| 2020 | 50 | 68 | 50 |
| 2021 | 55 | 50 | 55 |
| 2022 | 50 | 50 | 50 |
| 2023 | 50 | 61 | 55 |
| 2024 | 50 | 59 | 58 |
| 2025 | 74 | 50 | 61 |
Step 6: Letter Grades
After calibration, each group’s numeric score for each metric falls on a 0–100 scale. These are mapped to letter grades using fixed thresholds:
| Grade | Score range |
|---|---|
| A | 95 – 100 |
| A- | 90 – 94 |
| B+ | 85 – 89 |
| B | 80 – 84 |
| B- | 75 – 79 |
| C+ | 70 – 74 |
| C | 65 – 69 |
| C- | 60 – 64 |
| D+ | 55 – 59 |
| D | below 55 |
The Total score is the unweighted average of the Abundance, Blade Length, and Short Shoot Density numeric scores; it is then converted to a letter grade using the same thresholds.
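A minimal Python sketch of the grade lookup and the Total score follows. One detail is an assumption: the table lists whole-number ranges, so this sketch assigns a fractional score (for example 94.5) by the lower bound of each band, without rounding first. The input scores are the 2025 values from the worked example in this document.

```python
# Lower bound of each band, highest first; anything below 55 is a D.
GRADE_THRESHOLDS = [
    (95, "A"), (90, "A-"), (85, "B+"), (80, "B"), (75, "B-"),
    (70, "C+"), (65, "C"), (60, "C-"), (55, "D+"),
]


def letter_grade(score):
    """Map a 0-100 numeric score to a letter grade (lower-bound rule assumed)."""
    for lower_bound, grade in GRADE_THRESHOLDS:
        if score >= lower_bound:
            return grade
    return "D"


def total_score(abundance, blade_length, short_shoot_density):
    """Unweighted average of the three metric scores."""
    return (abundance + blade_length + short_shoot_density) / 3


metric_scores = {
    "Abundance": 91.4,
    "Blade Length": 82.3,
    "Short Shoot Density": 82.5,
}
grades = {metric: letter_grade(score) for metric, score in metric_scores.items()}
total = total_score(*metric_scores.values())

print(grades)
print(round(total, 1), letter_grade(total))
```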
Worked Example
The following walks through the full scoring pipeline for SWFWMD (T. Harter, C. Anastasiou, W. VanGelder, M. Walton, E. Walters) in 2025.
Raw deviations
| Species | Reported avg | True avg | Mean deviation |
|---|---|---|---|
| Abundance (BB level, 1–8) | | | |
| Halodule | 6.0 | 6.0 | 0.0 |
| Syringodium | 4.0 | 4.0 | 0.0 |
| Thalassia | 4.0 | 5.0 | -1.0 |
| Blade Length (cm) | | | |
| Halodule | 11.4 | 13.2 | -1.8 |
| Syringodium | 6.4 | 13.9 | -7.5 |
| Thalassia | 11.9 | 18.0 | -6.1 |
| Short Shoot Density (shoots per m²) | | | |
| Halodule | 6.3 | 7.2 | -0.9 |
| Syringodium | 1.0 | 0.7 | 0.3 |
| Thalassia | 1.9 | 1.6 | 0.3 |
Scores
| Metric | Numeric score | Letter grade |
|---|---|---|
| Abundance | 91.4 | A- |
| Blade Length | 82.3 | B |
| Short Shoot Density | 82.5 | B |
| Total | 85.4 | B+ |
How the calibration affected this group’s scores
| Metric | Score without calibration | Score with calibration | Z-score (within-year spread) | Score floor |
|---|---|---|---|---|
| Abundance | 83.3 | 91.4 | -1.60 | 74 |
| Blade Length | 82.3 | 82.3 | 0.67 | 50 |
| Short Shoot Density | 77.4 | 82.5 | -0.76 | 61 |
A negative Z-score (a tighter-than-average cohort) raises the floor for all groups, including this one. A floor of 50 means no calibration adjustment was applied.
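The source does not spell out how the floor enters the final score. One plausible reading, offered only as an assumption, is a linear rescale that maps the uncalibrated range 50–100 onto the range from the floor to 100. The Python sketch below (with a hypothetical `apply_floor` helper) shows that this reading reproduces the table above to within about 0.1 point, which is compatible with rounding of the displayed values; it should not be taken as the report's actual method.

```python
def score_floor(z, base=50.0, points_per_sd=15.0):
    """floor = max(50, 50 - z * 15), as given in Step 5."""
    return max(base, base - z * points_per_sd)


def apply_floor(uncalibrated, floor, base=50.0):
    """Assumed linear rescale of [base, 100] onto [floor, 100]."""
    return floor + (uncalibrated - base) * (100.0 - floor) / (100.0 - base)


# (uncalibrated score, z-score, calibrated score) as printed in the table.
table_rows = {
    "Abundance": (83.3, -1.60, 91.4),
    "Blade Length": (82.3, 0.67, 82.3),
    "Short Shoot Density": (77.4, -0.76, 82.5),
}

recomputed = {
    metric: apply_floor(uncalibrated, score_floor(z))
    for metric, (uncalibrated, z, _) in table_rows.items()
}
for metric, value in recomputed.items():
    print(metric, round(value, 2), "table:", table_rows[metric][2])
```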