Introduction

The use of the presence and abundance of organisms, and of other biological properties, to indicate an environmental property or condition has a long tradition in ecology and environmental science, particularly freshwater ecology. While indicator species were the earlier focus, using the evidence from all species (Carins, 1974; Cao et al., 1996; Lucke & Johnson, 2009; Friberg, 2014) is now the favoured paradigm in this field. How biological information is used as an indicator in freshwater varies considerably with country and biological group, perhaps due to tradition in the area. There are major differences in the approach/methodology used in Europe and the USA, the former now driven by the Water Framework Directive and the latter by the Clean Water Act. In the USA, the use of biological properties to indicate the condition of rivers and lakes is generally based on an Index of Biotic Integrity; it uses qualitative rules to characterize a site based on characteristics related to species composition and richness and to ecological factors, for example, number of species, species richness of functional groups and proportion of feeding groups (Karr, 1991). In Europe, by contrast, there is less consistency in approach: methods range from species-based or trait-based metrics, sensitivity scores and physiological characteristics to traditional richness metrics (Birk et al., 2012). Whichever approach or methodology forms the basis of a bioassessment method, it is important that its ability to indicate the environmental property or condition is independently assessed so that the confidence in applying the method is known.

A large number of bioassessment methods have been developed in Europe to support the implementation of the 2000/60/EC Water Framework Directive (WFD), at least three hundred covering rivers, lakes, transitional waters and coastal waters (Birk et al., 2012; Friberg, 2014). Annex V of the Directive classifies the ecological status of surface water bodies using assessments of biological elements, supported by hydromorphological, physical and chemical elements to form an Ecological Quality Ratio (EQR). This indicates deviation from a hypothetical ecological system with minimal human impact/disturbance (anthropogenic pressure) and classifies the site or water body into one of five states, High, Good, Moderate, Poor and Bad. The aim of the Directive is that all water bodies achieve Good Status.

Pressure-impact relationships

These bioassessment methods are also used in support of the WFD as a diagnostic tool to help establish why a water body does not reach Good Status, as was a focus of two large European research projects, STAR and WISER, in the publications by Johnson et al. (2006), Hering et al. (2006) and Marzin et al. (2012). When used for this purpose, it is important to provide independent evidence that a method developed to indicate a biological impact does at least correlate with the corresponding pressure, otherwise there is little confidence in using the results of the method. This is particularly challenging, as reviewed by Demars et al. (2012), whose opinion was that reliable indication of river conditions using macrophyte indices is difficult.

Birk et al. (2012), who completed an assessment of the pressure-impact relationships of a wide range of bioassessment methods, found that most methods were developed to respond to eutrophication/organic pollution, other water quality characteristics (e.g. acidification), hydrology/morphology or general degradation pressures. They also found that the number of methods that had empirically validated the pressure-impact relationship varied with water body category and biological element, and was particularly low for macrophyte methods in rivers, where only three had been checked. Establishing the performance of macrophyte based methods in rivers that indicate a pressure is, therefore, desirable.

Macrophytes as biological indicators in rivers

During the expansion of limnology in the 1970s, few methods that used macrophytes to indicate disturbance in rivers were developed. For example, they only get a short paragraph as indicators of pollution in Callow & Petts (1994) and are almost absent in Welch & Jacoby (2004), whereas macrophytes in lakes get a full chapter. There are exceptions; for example, Haslam (1982) developed a method to indicate river pollution and Holmes et al. (1999) the impact of a point source of nutrients in rivers. However, the inclusion of macrophytes in the WFD prompted the development of methods for rivers (Szoszkiewicz et al., 2006; Demars et al., 2012) and other water body categories (Birk et al., 2012).

The river macrophyte method used in this investigation is an update of CBAS (Canonical correspondence analysis Based Assessment System) that was developed by Dodkins et al. (2005). The underlying methodology could be called empirical scores, as field results were used to derive scores (optima) for macrophyte species along pressure variables and other variables (slope, width and alkalinity) that account for the natural variation of species. This methodology has been used to develop a bioassessment method for phytoplankton in lakes (Phillips et al., 2013). The update to the method is described in Materials and methods.

WFD status of rivers in the Republic of Ireland

The quality of surface waters in the Republic of Ireland remained broadly static between 2007–2009 and 2013–2015, but there were declines of 1 and 2.6% of High and Good Status/Potential, respectively, of monitored rivers and lakes (EPAI, 2017). Only 18% of monitored river sites had High Status in 2013–2015, compared to 30% in 1987–1990 (Department of Housing, Planning, Community and Local Government, 2017). Despite reducing seriously polluted rivers from 19 in 2007–2009 to 9 in 2015, the EPAI are actively working to make further improvements, particularly to prevent the loss of High Status ‘reference condition’ sites, which have decreased from 38 sites in 2007–2009 to 21 in 2013–2015.

Jordan et al. (2005), Mockler et al. (2017) and the EPAI (2017) found that diffuse pollution from farmyards and agriculture are the major pressures on Irish rivers, resulting in eutrophication. The EPAI also suggest that land sediment, domestic wastewater emissions, indirect impacts of forestry and extractive industries releasing ammonium and sediment are significant ecological challenges. Given this, identifying the pressures that are causing deterioration of the river is the first step in developing and implementing measures to improve a water body and bioassessment methods help to provide this information.

The need for bioassessment methods that indicate an environmental property or condition, particularly for river macrophytes, has been identified, so the aim of the investigation was to use independent results to assess the ability of river macrophytes to indicate their corresponding anthropogenic pressure. While the method used was calibrated using extensive field results, an independent assessment of it is necessary to establish the reliability of macrophytes as indicators.

There were two objectives, the first to establish the correlation between a macrophyte metric and a direct measure of its corresponding river property. Metrics for soluble reactive phosphorus (SRP), nitrate (NO3), ammonia (NH4), dissolved oxygen (DO), pH (PH) and siltation (SUBS) were compared with direct measures of the river properties using bivariate and rank correlation. The river physical and chemical properties were the annual mean concentrations of soluble reactive phosphorus, nitrate, dissolved oxygen % saturation and ammonia, mean pH and the Substrate Siltation Score.

The second objective was to establish the independent association between a macrophyte metric and its corresponding river physical and chemical property. There is always correlation between the metrics and between the properties that represent the pressures, so it is difficult to isolate the correlation between the metric and pressure that is independent of all the other correlations (Demars et al., 2012). This was achieved using a General Linear Model (Grafen & Hails, 2002), and in this way some confidence was produced that variations in the environmental variable explain variations in the metric.

Materials and methods

The results used in this work were collated during the EPA-funded DETECT Project (DisEnTangling the impacts of multiple stressors on the Ecology of waTerbodies) and the river macrophyte method used was CBAS (Canonical correspondence analysis Based Assessment System). The results consisted of macrophyte abundance, physical and chemical results and visual survey of up to 3135 river reaches throughout Ireland over the 2010 to 2012 period.

Macrophyte results

Macrophyte sampling was undertaken by EPA staff between 1st May and 30th September, and each sample site (station) was assessed once in a year, either 2010, 2011 or 2012, with surveying adhering to the Mean Trophic Rank method described in Holmes et al. (1999); the survey method is given in the Supplementary Material. In summary, a 100 m river reach was surveyed and the macrophyte taxa recorded using a cover scale of 1-5, with values being < 0.1, 0.1–1, 1–5, 5–10, and > 10, respectively. The width (m, mean of a minimum of 4 representative samples) of the reach was measured and the slope (m/km) estimated using a digital elevation model.

River physical and chemical properties

Monitoring data collected between 2010 and 2012 were collated and used for the direct measurement of the six river physical and chemical properties. The results were soluble reactive phosphorus (SRP-P), nitrate (NO3-N) and ammonia (NH4-N) concentrations, dissolved oxygen percentage (DO%) and pH in river water and the Substrate Siltation Score (SSS) at the site, the latter being estimated in the field as a score from 1 to 7; 1 clean, 2 clean to slight, 3 slight, 4 slight to moderate, 5 moderate, 6 moderate to heavy, 7 heavy. The number of results available at a site varied, with median, 10 and 90%-ile values of 14, 12 and 36 for SRP-P, 17, 12 and 57 for NO3-N, 15, 11 and 36 for NH4-N, 12, 4 and 31 for DO% and 14, 5 and 36 for pH. In addition, alkalinity (mg/l CaCO3, mean of a minimum of 4 representative samples) values were collated.

River macrophyte method

The river macrophyte method is described in the Supplementary Material. It is an update of the CBAS method developed by Dodkins et al. (2005), used to establish the status of a site and to diagnose pressures through the use of six metrics. The update is a recalibration of the macrophyte metric optima, use of site-specific reference conditions rather than a typology and expression of the final Metric Score using an EQR scale. In the recalibration, part of the North South Shared Aquatic Resource (NS SHARE) Project, an INTERREG IIIA project part funded by the European Union, the number of sites increased from 273 in Northern Ireland to 520 throughout Ecoregion 17 (Northern Ireland and the Republic of Ireland), 68 of which were least impacted and used as reference sites. Regression models are used to estimate site-specific reference conditions based on width, slope and alkalinity.

The six macrophyte Metric Scores are SRP, NO3, NH4, DO, PH and SUBS and they were developed to indicate deviation of SRP-P, NO3-N, NH4-N, DO%, pH and silt from the reference values at a site. For each metric, the Site Score is the average of the optima of the macrophyte taxa present, and the site-specific Reference Score is estimated using the width, slope and alkalinity at the site. The ratio of Site Score to the Reference Score is the Metric Score and is expressed using an EQR scale, with values generally between 1 (close to reference condition, unimpacted) and 0 (very different from reference condition, very impacted).
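The calculation described above can be sketched as follows. This is a minimal illustration, not the CBAS implementation: the taxon names, optima and Reference Score are hypothetical placeholders, and the constants that rescale the ratio to the EQR scale (given in the Supplementary Material) are not reproduced, so only the unscaled ratio is shown.

```python
from statistics import mean

# Hypothetical optima for one metric (illustrative values, not the CBAS calibration).
OPTIMA = {"taxon_a": 0.8, "taxon_b": 0.5, "taxon_c": 0.2}


def site_score(taxa_present, optima):
    """Average of the optima of the scored taxa recorded at the site."""
    scored = [optima[t] for t in taxa_present if t in optima]
    if not scored:
        raise ValueError("no scored taxa recorded at this site")
    return mean(scored)


def score_ratio(site, reference):
    """Ratio of the Site Score to the site-specific Reference Score (unscaled)."""
    return site / reference


# Taxa without an optimum are ignored; the Reference Score of 0.7 is assumed here,
# whereas the method estimates it from width, slope and alkalinity.
site = site_score(["taxon_a", "taxon_b", "unscored_taxon"], OPTIMA)
ratio = score_ratio(site, reference=0.7)
```

In the method itself the Reference Score comes from regression models, so the fixed value used here only stands in for that step.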

The assessment

The ability of each macrophyte metric to indicate its corresponding pressure was assessed using bivariate and rank correlation and General Linear Models. The results were prepared as follows. The number of macrophyte taxa was reduced to the 51 used in the macrophyte method and the number of river stations with all the physical, chemical and macrophyte results was 810. For each station, the Metric Scores for the six metrics were calculated along with mean values of the six physical and chemical pressure variables. Metric values less than zero were found at some stations and values greater than 1 at more. This indicates that the sites were poorer than the worst sites used to calibrate CBAS or of higher quality than the Reference Score, respectively, or that the values were errors. The method can be altered to accommodate these poorer and better sites by reassessing the reference sites and models or changing the constants in the Metric Score equation used to convert the result to an EQR scale, as the species optima remain unchanged, or both.

The correlation between the macrophyte Metric Score and the corresponding physical or chemical property (pressure) was described using bivariate (Pearson’s r) and rank (Spearman’s rs) correlation. The distribution of the river environmental properties was checked for normality (skewness, kurtosis and histograms) and logarithmically transformed when necessary; only DO% and SSS were not transformed.
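As a minimal sketch of the two coefficients, the following standard-library Python computes Pearson's r on a log-transformed variable and Spearman's rs as Pearson's r on average ranks. The nitrate concentrations and Metric Scores are made-up numbers for illustration only, and the function names are not taken from the original analysis.

```python
from math import log10, sqrt


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Made-up annual mean nitrate values and Metric Scores at five stations.
nitrate = [0.1, 0.4, 1.0, 2.5, 6.0]
metric_no3 = [0.95, 0.80, 0.70, 0.55, 0.30]

r = pearson_r([log10(v) for v in nitrate], metric_no3)
rs = spearman_rs(nitrate, metric_no3)
```

The log transformation changes Pearson's r but not Spearman's rs, because ranks are unaffected by a monotonic transformation.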

The correlations were completed on the full data set and on a reduced one, as employed by Johnson et al. (2006). The reduced set consisted of stations in the first and fourth quartile ranges of the river environmental properties (see Table 1) and produced two quality classes. The Best Available (BA) class was sites in the 0–25%-ile for NO3-N, NH4-N, SRP-P and pH, and the 75–100%-ile for DO%. The Perturbed (P) class was sites in the 75–100%-ile for NO3-N, NH4-N, SRP-P and pH and the 0–25%-ile for DO%. For SSS, BA was sites with an SSS of 1 (Clean) and P sites with an SSS of 7 (Heavy).
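The quartile-based classification can be sketched as below. The percentile interpolation (`method="inclusive"`) and the treatment of values lying exactly on a quartile boundary are assumptions, since the text does not specify them, and the input values are illustrative. SSS would be handled separately, using the scores 1 and 7 directly.

```python
from statistics import quantiles


def quality_classes(values, high_is_good=False):
    """Label each site 'BA', 'P' or None from the quartiles of one variable.

    high_is_good=False suits NO3-N, NH4-N, SRP-P and pH (low values are BA);
    high_is_good=True suits DO% (high values are BA).
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    labels = []
    for v in values:
        if v <= q1:
            labels.append("P" if high_is_good else "BA")
        elif v >= q3:
            labels.append("BA" if high_is_good else "P")
        else:
            labels.append(None)  # middle 50% of sites, left out of the reduced set
    return labels


# Illustrative values only.
nutrient_labels = quality_classes([1, 2, 3, 4, 5, 6, 7, 8, 9])
oxygen_labels = quality_classes([1, 2, 3, 4, 5, 6, 7, 8, 9], high_is_good=True)
```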

Table 1 Statistical summary of the six river physical and chemical variables. The properties are the annual mean concentration of SRP-P (mg PO4-P/l), NO3-N (mg NO3-N/l), NH4-N (mg NH4-N/L) and DO% (% sat), annual mean pH and the SSS at 810 river stations over the 2010 to 2012 period

To supplement the correlation, the ability of a metric to distinguish P from BA sites was established using the parametric t test and non-parametric Mann–Whitney U test. In addition, the Type II or false negative error for five metrics was estimated using the approach of Johnson et al. (2006); the percentage of sites classified as P using an environmental variable, but not detected by the metric, was considered to be the Type II error of the metric. Specifically, the Type II error was the percentage of sites classified as P using the corresponding physical and chemical variable but indicated by the macrophyte metric to be of good quality, that is, with a Metric Score greater than a critical threshold set at the 25%-ile of all the BA sites.
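A minimal sketch of this Type II error estimate follows. The Metric Scores are invented for illustration and the percentile interpolation is an assumption; only the rule itself (P sites scoring above the 25%-ile of the BA sites count as missed) comes from the text.

```python
from statistics import quantiles


def type_ii_error(metric_ba, metric_p):
    """Return (critical threshold, % of P sites the metric fails to detect).

    The threshold is the 25%-ile of the Metric Scores at the BA sites; a P site
    scoring above it is indicated as good quality, i.e. a false negative.
    """
    threshold = quantiles(metric_ba, n=4, method="inclusive")[0]
    missed = sum(1 for score in metric_p if score > threshold)
    return threshold, 100.0 * missed / len(metric_p)


# Invented Metric Scores for the two quality classes.
ba_scores = [0.60, 0.70, 0.80, 0.90, 1.00]
p_scores = [0.40, 0.50, 0.75, 0.85]
threshold, error_pct = type_ii_error(ba_scores, p_scores)
```

With these numbers, two of the four P sites score above the threshold, so half of the perturbed sites would go undetected.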

General Linear Models (GLMs) were used to provide evidence for the association between a macrophyte Metric Score and its corresponding environmental variable, when the associations with other variables have been taken into account through statistical elimination (Grafen & Hails, 2002), and were constructed as follows. The river environmental variables except SSS were logarithmically transformed to achieve normality. Only the four variables that represent pressures, logSRP-P, logNO3-N, logNH4-N and SSS, were included, as the other two variables, DO% and pH, are responses to pressures such as eutrophication or organic pollution. For example, elevated NO3-N concentrations would lead to increased photosynthesis and so to higher DO% and higher pH values during daylight. The first variable in a model was always the river environmental variable corresponding to the macrophyte metric and the order of variables was NO3-N, NH4-N, SRP-P and SSS. SSS, which has a seven point scale, was treated as a continuous variable (Grafen & Hails, 2002, pp. 104–106).

Using NO3 as an example, the GLM was as follows.

NO3 = logNO3-N + logNH4-N + logSRP-P + SSS, with NO3, logNO3-N, logNH4-N, logSRP-P and SSS as continuous variables.

The Sequential Sum of Squares (Seq SS, Type I) and Adjusted Sum of Squares (Adj SS, Type III) were used to investigate the contribution of each environmental variable to explaining variations in the macrophyte metric, with Adj SS (Type III) providing the sum of squares for a variable when the contributions of all the others have been accounted for. SPSS Statistics version 24 was used.
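To make the distinction between the two sums of squares concrete, the sketch below computes both for an ordinary least-squares model with main effects only, using plain Python rather than SPSS. It is an illustration under stated assumptions, not the analysis reported here: the data are invented, the function names are hypothetical, and treating the Adjusted (Type III) SS as the drop-one-term SS holds only because the model has no interaction terms.

```python
def residual_ss(columns, y):
    """Residual sum of squares of y regressed on an intercept plus columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
               for i in range(n))


def sequential_ss(columns, y):
    """Seq SS (Type I): drop in residual SS as each term is added in order."""
    out, previous = [], residual_ss([], y)
    for k in range(1, len(columns) + 1):
        current = residual_ss(columns[:k], y)
        out.append(previous - current)
        previous = current
    return out


def adjusted_ss(columns, y):
    """Adj SS (Type III here): rise in residual SS when one term is dropped."""
    full = residual_ss(columns, y)
    return [residual_ss(columns[:k] + columns[k + 1:], y) - full
            for k in range(len(columns))]


# Invented data: two correlated predictors and a response.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8]
seq = sequential_ss([x1, x2], y)
adj = adjusted_ss([x1, x2], y)
```

Because the predictors are correlated, the first variable's Seq SS includes variation it shares with the second and so exceeds its Adj SS, whereas the two values coincide for the last variable entered.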

Results

River physical and chemical variables

A statistical summary of the physical and chemical variables at the 810 river stations in Ireland over the 2010–2012 period, the full data set, is given in Table 1 and it indicates the characteristics of some of the main pressures on the rivers. The range of pH and DO% is much less compared to the other properties, with SSS and SRP-P larger and NH4-N and, especially, NO3-N the largest. This is shown by the difference between the 95 and 5%-ile values, expressed as a percentage of the median; those values are 17.7% for pH, 20.5 for DO%, 150 for SSS, 233 for SRP-P, 325 for NH4-N and 7710% for NO3-N.
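The range statistic used here, the difference between the 95 and 5%-ile values expressed as a percentage of the median, can be written as a short function. The percentile interpolation method is an assumption, as it is not stated in the text, and the input is a dummy series.

```python
from statistics import median, quantiles


def relative_range_pct(values):
    """(95%-ile minus 5%-ile) expressed as a percentage of the median."""
    cuts = quantiles(values, n=20, method="inclusive")  # 5%, 10%, ..., 95%
    p5, p95 = cuts[0], cuts[-1]
    return 100.0 * (p95 - p5) / median(values)


# Dummy series 1..21: 5%-ile = 2, 95%-ile = 20, median = 11.
example = relative_range_pct(list(range(1, 22)))
```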

There are only weak bivariate correlations between the physical and chemical properties, excluding SSS; the significant correlations are logNH4-N/logNO3-N, logNH4-N/logSRP-P, logpH/logNH4-N and logpH/logNH4-N, which have r values between 0.23 and 0.27.

The characteristics of the reduced data set, which consists of sites in the first and fourth quartile ranges of the river environmental properties, are shown in Table 1. Sites with values up to the 25%-ile represent BA and greater than the 75%-ile represent P, for NO3-N, NH4-N, SRP-P and pH. For DO%, BA is greater than the 75%-ile and P below the 25%-ile, while BA is 1 and P is 7 for SSS.

Correlation between macrophyte metric and its corresponding river physical and chemical variable

The variability of the six macrophyte metrics is summarized in Table 2 and, using the difference between the 95 and 5%-ile values expressed as a percentage of the median, PH and DO have the smallest ranges of 79.6 and 123%, as was the case with the corresponding environmental variables. NH4 (127%) and NO3 (141%) have only a slightly greater range, with SRP (197%) and SUBS (212%) the largest.

Table 2 Statistical summary of the Metric Score of six river macrophyte metrics at 810 river stations over the 2010 to 2012 period

It can be noted that 10% of the sites have a Metric Score greater than 1 (Table 2), indicating that they are of better quality than the reference value, as expressed by the macrophyte taxa present; 5% of the sites have an EQR greater than a value that lies between 1.09 and 1.33, depending on the metric. A reassessment of the reference sites or model or a change in the conversion of the Metric Score to an EQR scale would resolve these outliers.

The bivariate and rank correlation coefficients between each of the six macrophyte metrics and their corresponding river environmental variable for the full and the reduced data sets are shown in Table 3.

Table 3 Pearson’s (r) and Spearman’s (rs) coefficients for the correlation between a Metric Score and its corresponding river physical and chemical variable for the full and reduced data sets

The NO3 and DO macrophyte Metric Scores have the strongest correlation with their corresponding pressure, as represented by the annual mean logNO3-N and DO%, respectively, in both the full and reduced data set. While the coefficients are not large, varying between 0.22 and − 0.39, this is evidence that the two metrics do correlate with variations in the corresponding river property.

The PH and SUBS Metric Scores have less precise responses, as PH is only rank correlated with the annual mean logpH in both the data sets (− 0.32, − 0.28) and SUBS is only rank correlated with SSS in the reduced data set (− 0.28). Neither SRP nor NH4 has any correlation with its corresponding river property.

The NO3, DO, PH and SUBS Metric Scores are different in the two quality classes, best available (BA) and perturbed (P), as shown by both the t test and Mann–Whitney U test (P < 0.001; Table 4). While significant, the differences are not large. It is greatest for NO3, where the mean and median metric values at the BA sites are 0.72 and 0.76, compared to 0.55 and 0.52 at the P sites. The difference for SUBS is also considerable (0.54 and 0.53 compared to 0.31 and 0.36), but it should be noted that, as the metric values at the BA sites are a good deal less than 1, the quality of the best sites is not high, at least as indicated by this macrophyte metric. It could also be that the SUBS metric is not responding to changes in siltation in the river as represented by SSS, a possibility considered below. With DO and PH, the differences, while statistically significant, are small and even the P sites have high metric values that indicate they are not too degraded, as represented by these metrics; the mean/median Metric Score for DO are 0.90/1.00 for the BA sites and 0.70/0.72 for P sites, and 1.11/1.14 and 0.92/0.93 for PH.

Table 4 The mean and median Metric Score of six river macrophyte metrics in the best available (BA) and perturbed (P) groups of sites and the t test and Mann–Whitney U test (MW U) P values

There is almost no difference between the NH4 Metric Score in the two groups and the P sites have high metric values; only the Mann–Whitney U test is significant and the median value for the BA sites is 0.89 and 0.78 for P. There are no statistical differences between the SRP Metric Scores in the two quality classes.

The differences between the macrophyte Metric Scores in the two quality classes are also displayed as box plots in Fig. 1, which visually indicates that the median Metric Score for NO3, DO, PH and SUBS is different in the two groups of sites, whereas there is little difference with NH4 and SRP, further confirming the findings of the statistical tests (Table 4).

Fig. 1 Box plots of the six macrophyte Metric Scores (SRP, NO3, NH4, DO, PH and SUBS) at the best available (clear) and perturbed (light pattern) river stations

There is considerable overlap in the Metric Scores of the two groups in Fig. 1, so many sites that are classified by the river environmental variable as P have metric values above the 25%-ile of the BA class, indicating good quality and so producing a Type II error. Table 5 shows that the Type II errors are quite high, with a range from 37% for PH to 69% for NH4 and a mean of 50% for the five metrics. It can be noted that the critical threshold for the SUBS metric is very low, at 0.307.

Table 5 Type II error for five river macrophyte metrics

The evidence from the correlation coefficients (Table 3) and the difference between the P and BA sites (Table 4, Fig. 1) is that NO3 is the best performing macrophyte metric. It does indicate the NO3-N concentration in the river water, although the pressure-impact relationship is not very precise, with an absolute value of the correlation coefficient between NO3 and logNO3-N between 0.23 and 0.39.

The next best metric is DO. Its pressure-impact relationship is only a little less precise (0.22 to 0.31) than for NO3 and it is able to distinguish the two quality classes, even though the P sites are not very perturbed (Median Metric Score 0.72; Table 4).

The other metrics perform poorly (PH, SUBS) or do not correlate with the corresponding river environmental variable (SRP, NH4). The SUBS metric does discriminate the two quality classes, although the BA sites are indicated not to be of good quality, with a median Metric Score of 0.53 (Table 4).

General linear models

The strongest evidence from the GLMs is for the NO3 metric, where logNO3-N explains the largest Seq SS (Type I) and Adj SS (Type III) and, while the other three environmental variables are significant, they explain less variability (Table 6). This is evidence, particularly from the Adj SS, that the NO3 macrophyte metric has an association with the NO3-N concentration.

Table 6 The GLM for NO3 = logNO3-N + logNH4-N + logSRP-P + SSS, where NO3, logNO3-N, logNH4-N, logSRP-P and SSS are continuous variables

The other GLMs provide no support for an association between the metric and its corresponding physical and chemical variable. With NH4, the associated environmental variable, logNH4-N, is not significant; only SSS is and it is a poor model, with an Adjusted R2 of 0.025. With SRP, the associated environmental variable, logSRP-P, is not significant; logNO3-N and SSS are in another poor model (Adjusted R2 of 0.022).

Finally, all the variables are significant in the SUBS GLM (Table 7). However, as the SUBS metric was calibrated to respond to silt in the river substrate, it is surprising that SSS explains the smallest Adj SS (0.836) and logNH4-N the largest (4.010). Based on this GLM, the SUBS metric is an indicator of the NH4-N concentration in the river water and there is support for this from the bivariate correlation coefficient between SUBS and logNH4-N of − 0.267 and a rank correlation coefficient of − 0.286. The poor association between SUBS and SSS may also be influenced by the different ways substrate was described in the sites used to calibrate the metric and the sites used in this assessment. In CBAS, SUBS was calibrated to the proportion of silt in the river substrate, while SSS is a score of the degree of siltation at the site. While a relationship between the cover of silt and the degree of siltation at a site might be expected, SUBS hardly responded to SSS.

Table 7 The GLM for SUBS = SSS + logNO3-N + logNH4-N + logSRP-P, where SUBS, SSS, logNO3-N, logNH4-N and logSRP-P are continuous variables

The GLMs provide evidence for an association between the NO3 metric and logNO3-N that is additional to the associations with the other environmental variables (Table 6). While not strictly independent evidence (MacNally, 2000), it is additional to that from the correlation. There is none for the NH4, SRP and SUBS metrics and their corresponding pressure variable. Interestingly, the SUBS GLM is the best model (largest Adjusted R2) but it provides evidence for an association between SUBS and logNH4-N (Table 7). Even though the SUBS metric was developed to indicate the river substrate, the macrophyte optima are better at representing the gradient of ammonia in the river.

Discussion

River physical and chemical variables

The statistical summary of the six river physical and chemical variables (Table 1) describes the variability of the environmental properties that the six macrophyte metrics were developed to indicate and it shows that NO3-N has the greatest range, followed by NH4-N, with SRP-P and SSS intermediate and pH and DO% the smallest. Only weak correlations were found between some of these environmental variables at the 810 river stations, NO3-N, NH4-N, SRP-P and pH.

The SRP-P, NO3-N and NH4-N concentrations are not high, compared to many countries in Europe. Specifically, the mean SRP-P value (0.031 mg PO4-P/l; Table 1) is at the lower end of the range of values for countries compiled by Foy (2007) and the mean NO3-N (0.366 mg NO3-N/l) and NH4-N (0.059 mg NH4-N/l) are in the middle of their ranges (available from the European Environment Agency at https://www.eea.europa.eu/data-and-maps). The typical DO% (mean 96.5% saturation) and pH (7.78) values do not represent much disturbance of the rivers. In addition, the ranges of these properties are relatively low, the difference between the 95 and 5%-ile values expressed as a percentage of the median being less than 300%, except for NO3-N, at 7710%.

Pressure-impact relationships for the macrophyte metrics

The ability of the river macrophyte metrics to indicate a pressure was assessed using correlation between the Metric Score and its corresponding river environmental variable and GLMs. NO3 was found to be the best, having the largest correlation coefficient (− 0.23 to − 0.39; Table 3), being able to distinguish P from BA sites (Table 4) and NO3-N being the most important variable in the GLM (Table 6). Next best was DO, with a correlation coefficient between 0.22 and 0.31 and an ability to distinguish the two quality classes. The evidence for the other four metrics is either weak (PH and SUBS) or none or almost none (SRP and NH4). Although the evidence for a pressure-impact relationship for the SUBS metric and SSS is weak, the GLM shows that SUBS is associated with NH4-N in the river (Table 7) and it has a correlation coefficient (− 0.27 to − 0.29), just less than NO3’s, so it could be used as a macrophyte indicator for NH4-N.

Even though there is evidence that the NO3 metric does indicate the NO3-N concentration and SUBS the NH4-N, they are not very precise indicators, with absolute values of the correlation coefficients between 0.23 and 0.39. We suggest the reasons for this low precision by considering the degree of pressure at the river sites, what others have found, how the pressure-impact relationships are described, the conceptual basis of the macrophyte method and phenotypic plasticity.

One reason for the imprecise pressure-impact relationships could be the relatively low pressures from the NO3-N, NH4-N and SRP-P concentrations. The statistical summary and discussion show that these concentrations are low at the European scale, as is the range of values, and so a strong response of the macrophytes to the pressures may not be possible. It could be that NO3 is the best indicator because NO3-N has the highest concentration and greatest range of the river properties, and that the NH4 and SRP metrics are unable to discriminate between the P and BA sites because there is little difference between the chemical concentrations in the two quality classes.

Similarly imprecise pressure-impact relationships for macrophytes and other biological metrics in rivers have been found in other investigations. In an analysis of three hundred bioassessment methods using phytoplankton, macroscopic plants, benthic invertebrates, phytobenthos and fish in rivers, lakes, transitional waters and coastal waters, Birk et al. (2012) found that the uncertainty in the pressure-impact relationship was greatest with river methods; the median bivariate correlation coefficient was 0.55, compared to 0.75 for coastal waters, 0.70 for lakes and 0.60 for transitional waters. As their box and whisker values for the river methods are 0.20, 0.45, 0.55, 0.70 and 0.85, the precision of the NO3 and SUBS metrics is in their lowest quartile. It can be noted, however, that the properties used to represent the pressures were not always direct estimates of the property but included ordination axis scores (Birk et al., 2012).

Szoszkiewicz et al. (2006) evaluated four macrophyte metrics (Hemeroby index, IBMR, MTR and Ellenberg (N)) and other general ecological metrics in lowland rivers and mountain streams using rank correlation with a direct measure of the river property, including the ammonia, nitrate and orthophosphate concentrations. The only significant correlation in mountain streams was with orthophosphate (− 0.42 to − 0.47), while the absolute values in lowland rivers were 0.24–0.55 for ammonia, 0.24–0.36 for nitrate and 0.31–0.68 for orthophosphate.

Hering et al. (2006) and Johnson et al. (2006) used the same macrophyte results as Szoszkiewicz et al. (2006), along with benthic diatom, macroinvertebrate and fish data, to describe the pressure-impact relationships, with the pressures represented by ordination axis scores. They found a considerable variation in the precision of the responses to the environmental gradients, depending on biological group and river type. Most correlation coefficients were < 0.2, with only a few > 0.6, and so the precision of the pressure-impact relationships was low.

In addition, Johnson et al. (2006) estimated the Type II error rates for two macrophyte methods and obtained 21.1% for MTR and 26.3% for IBMR in mountain streams, and 56.3 and 31.3% in lowland rivers. Our range and average, 37 to 69% and 51%, are poorer than these values.

Demars & Edwards (2009) assessed the ability of the MTR macrophyte metric to indicate pressures, including ammonium, nitrate and SRP-P concentration, in rivers using bivariate correlation. They found correlations of 0.81 for nitrate and 0.69 for SRP-P, but intercorrelation between the environmental properties, a high correlation with conductivity (0.75), high unexplained species variance and ecological considerations led them to conclude that macrophytes are unreliable or unspecific indicators of nutrient concentrations.

Finally, Demars et al. (2012) evaluated two widely used methods developed to indicate river environmental properties. Using independent data, they assessed IBMR for SRP-P and NH4-N and LEAFPACS for SRP-P and silt. IBMR correlated with SRP-P (0.54), but, if the strong correlation with pH (0.75) was removed, the correlation was much smaller (0.28). Bicarbonate and pCO2 were better predictors of the IBMR than SRP-P and ammonia in another analysis. Variance partitioning in both analyses showed that the natural properties, pH, bicarbonate and pCO2, explained much more variance than SRP-P and ammonia. With the LEAFPACS metrics, there was no correlation between the nutrient index and SRP-P or between the hydraulic index and siltation. In both these evaluations, pH, bicarbonate and pCO2 were more strongly correlated with the macrophyte indices than the environmental variables that represented the pressure. Some methods use variables such as alkalinity, slope and distance from source to account for the natural variation of macrophyte composition and so to estimate the reference conditions, as a way to help isolate the compositional change due to the anthropogenic pressure. Reference conditions are not used in the IBMR, but alkalinity, altitude, slope and distance from source are used in LEAFPACS and alkalinity, slope and width in CBAS. Nevertheless, based on this evidence, a critique of macrophyte indices and a range of ecological considerations, Demars et al. (2012) concluded that composition-based indices are unreliable indicators of river environmental properties.

The low correlation between macrophyte metrics and direct measures of the pressures found in this and the other investigations may be influenced by the way the pressure-impact relationship is described, including limitations of the data. Correlation is the usual way of describing the strength of the pressure-impact relationship, but it only provides evidence for an association between the metric and the variable, an association that is almost always influenced by other variables; correlation does not imply causality. We did use GLMs to provide some evidence for the association between the metric and its corresponding environmental variable, and correlation is sufficient for an indicator, at least within the ranges and interrelationships of the environmental variables. Poikane et al. (2014) suggested that the high variability of pressure-response relationships could be due to the indirect influence of nutrients, and Friberg (2010) also noted the influence of unmeasured variables. The limitations of data apply to both the environmental variables and the macrophyte results. The characterization of the river properties is probably subject to error because there are too few measurements to account for temporal variability and because the spatial scale of the measurements is inappropriate. Also, the characterization of the pressure may not be adequate to capture significant events (e.g. a point source discharge missed during spot sampling). Friberg (2010) also suggested that stochastic events and uneven data quality contributed to the large amount of unexplained variability in the pressure-response relationships in streams. Finally, relatively small ranges of the pressure gradients would make it difficult for a biological metric to detect change using correlation, as is suggested for the NH4 and SRP macrophyte metrics and for their inability to discriminate perturbed from best available sites.

The conceptual basis of macrophyte methods varies, and this is true for all bioassessment methods (Birk et al., 2012). The basis of the CBAS method is the niche, represented by the optima of taxa along the pressure gradients and estimated using weighted averaging with field data. Juggins (2013) presents a critical evaluation of these widely used biological transfer functions based on the niche and recommends ways to improve their realism. While it could also be proposed that this method has a rational basis and is calibrated using field observations, the NO3 and SUBS metrics were found not to be very precise indicators; indeed, they are at the lower end of the range of performance of other macrophyte metrics. Plasticity, specifically physiological plasticity (Miner et al., 2005), could limit the effectiveness of applying the niche concept to a bioassessment method; Vestergaard and Sand-Jensen (2000) suggested it was a contributor to the imprecise relationship between aquatic macrophyte species and alkalinity in lakes. If physiological plasticity increases the ability of an aquatic macrophyte species to grow in wider ranges of soluble reactive phosphorus, nitrate and ammonia concentrations in the river water and in varying substrates than in its absence, then the species niche would be wider. The optimum should not change if it is derived using enough good quality results, although achieving this is a challenge. Even though the CBAS methodology uses species presence and theoretically should not be influenced by the niche size, this plasticity could lead to differences between the optima in the calibration and evaluation data sets and so contribute to the imprecise pressure-impact relationships. If the physiological plasticity applied to one or a few species and one or two traits, as suggested in the reviews by Wells and Pigliucci (2000) for aquatic species and Hodges (2004) for the roots of grasses and grassland species, then the effect would be less than if it applied to all the species and most traits.

Use of macrophyte metrics

As the evidence is that river macrophyte metrics are imprecise indicators of pressures such as nitrate, ammonia, soluble reactive phosphorus or siltation, what value are they as a diagnostic tool? As a single indication from one metric is not enough evidence for the impact of the pressure, we suggest that a macrophyte metric be used either in conjunction with other indicators or to characterize a group of river sites.

If metrics from other biological groups such as invertebrates and diatoms are available, then the indications from two or more of them can be combined with the macrophyte metric in a weight-of-evidence approach. To do this, we assume independence of the metrics and a probability of correct indication of the pressure of 1 in 3 (P = 0.33) to 1 in 2 (P = 0.50); the choice of probability is based on Type II errors of 37 to 69% (Table 5). Using the multiplicative law of probability, the probability of being correct with two indicators is 0.56 to 0.75, respectively, and 0.70 to 0.88 with three. Evidence from simple direct measures of the pressure could also be included to provide justification for more intensive investigation of the cause of deterioration in the water body; for example, spot measurements of nitrate, ammonia or soluble reactive phosphorus concentration in the river, or rapid assessment of siltation by visual assessment of the substrate or by the Shuffle method (Clapcott et al., 2011).
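The combined probabilities quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes that "being correct" with several indicators means that at least one of them indicates the pressure correctly, which is the reading that reproduces the quoted values, and it assumes that the indicators are independent and share the same single-indicator probability. The function name `prob_at_least_one_correct` is hypothetical and not part of any published method.

```python
def prob_at_least_one_correct(p_single: float, n_indicators: int) -> float:
    """Probability that at least one of n independent indicators is correct.

    Assumes every indicator has the same probability p_single of a correct
    indication and that the indicators are independent, so the probability
    that all of them fail is (1 - p_single) ** n_indicators.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must lie between 0 and 1")
    if n_indicators < 1:
        raise ValueError("n_indicators must be at least 1")
    return 1.0 - (1.0 - p_single) ** n_indicators


if __name__ == "__main__":
    # Single-indicator probabilities used in the text: 1 in 3 and 1 in 2.
    for p in (1 / 3, 1 / 2):
        for n in (1, 2, 3):
            combined = prob_at_least_one_correct(p, n)
            print(f"P = {p:.2f}, indicators = {n}: combined = {combined:.3f}")
```

Under these assumptions, two indicators give about 0.56 (P = 1/3) and 0.75 (P = 1/2), and three give about 0.70 and 0.88, matching the values in the text. If the metrics are positively correlated, as may be the case for sites or biological groups responding to the same unmeasured variables, the gain from adding indicators would be smaller.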

While a group of sites in a river sub-basin may not be independent, if three or more of them have macrophyte metric values that indicate elevated nitrate, ammonia, soluble reactive phosphorus or siltation, then this is more reliable evidence for the pressure; this is based on the same weight-of-evidence assumptions. Demars et al. (2012) also suggested this is the best application for macrophyte indices.

Conclusions

The correlation between macrophyte metrics developed to indicate soluble reactive phosphorus (SRP), nitrate (NO3) and ammonia (NH4) concentrations, dissolved oxygen saturation (DO), pH (PH) and siltation (SUBS) in rivers and direct measures of the corresponding environmental variables was established using a data set of 810 sites in the Republic of Ireland. This was supplemented with General Linear Models.

Only the NO3 and DO metrics had absolute values of the correlation coefficients greater than 0.21 and only the NO3 GLM provided support for an association between the metric and the nitrate concentration that was independent of other correlations. While the SUBS metric did not indicate siltation, it correlated with the ammonia concentration (−0.28) and had an independent association with ammonia in the GLM. The NO3 and SUBS metrics, therefore, provide some indication of the nitrate and ammonia concentrations in the river, although not very precisely.

A review of the precision of pressure-impact relationships for river macrophyte metrics in the literature showed that the NO3 and SUBS metrics perform at the lower end of the range.

Given the uncertainty of the indication, we suggest that macrophyte metrics could be used in two ways. First, at a single site, they could be used in combination with evidence from one or two other biological groups, probably diatoms and invertebrates, or with direct measures of pressures. Second, evidence from a macrophyte metric at three or more sites in a sub-basin may be sufficient evidence for an impact from that pressure.