Introduction

Phosphorus (P) is an essential element for food and biofuel production; it cannot be substituted by any other chemical element and is a non-renewable geological resource (Elser and Bennett 2011). The UK and the rest of Europe have no significant indigenous phosphate rock resources, and the import of P fertilisers has provided a vital foundation for modern European food production (Withers et al. 2015). However, as a result of the availability of cheap P fertilisers, P usage along food production, consumption and waste management chains has become increasingly inefficient and dissipative (Jarvie et al. 2013a, b). Inefficiencies in P use are not only of concern for long-term P security, but also have profound negative downstream impacts through eutrophication, which impairs water quality, undermines aquatic ecosystem function and threatens water security (Jarvie et al. 2015). Reliance on imports of vital P resources risks exposure to commodity market volatilities (Elser et al. 2014), and the UK’s national food and water security will become increasingly dependent on its ability to manage P more sustainably (Jarvie et al. 2015). Consequently, it is of strategic importance to understand the evolution of the UK P balance and to track P inputs and outputs. Furthermore, several authors have proposed that P is accumulating in the terrestrial biosphere. Surplus P has been noted in a range of environments (e.g., global croplands; MacDonald et al. 2008), and this surplus has led to increased eutrophication of surface waters (e.g., Bennett et al. 2001). It has also been proposed that this surplus has accumulated in soils, leading to prolonged P leaking into catchments; this has been referred to as ‘legacy’ P (Sharpley et al. 2013). Haygarth et al. (2014) proposed that a full P budget is the only way to understand whether legacy P has altered, and will alter, P concentrations and fluxes in surface waters. Indeed, Powers et al. (2016) demonstrated that, for two large river basins (Thames, UK, and Maumee, USA), net exports from the catchments exceeded inputs during the 1990s, suggesting that both basins were relying on legacy stores of P.

The potential for accumulation and for legacy releases means that all fluxes should be viewed in a wider budgetary context. This baseline information will be needed to assist in identifying opportunities for P recycling and to exploit stores of P within the landscape (Withers et al. 2015). The UK’s intensive and advanced agriculture, coupled with its high population density (the third largest within the EU at 262 people/km2), means that the UK can be expected to be an end member in terms of P processing; this has already been noted for nitrogen in the UK (Worrall et al. 2009) and in The Netherlands, another advanced agricultural economy with a population density even greater than that of the UK (Kroeze et al. 2003). Therefore, as a first step towards understanding the UK’s national P balance, this study explores (1) the changes in the national P flux from, and through, the rivers of the UK between 1974 and 2012; (2) the controls upon the distribution and processing of that fluvial flux of P across the UK; and (3) how these national river fluxes compare with P imports and exports across the UK national boundary in fertilisers, industrial products, food and feed, and direct (coastal) waste discharges.

Methodology

This study sought to estimate the fluvial flux of P from the UK in terms of both time and space. Calculating fluxes in time and space makes it possible to assess the controls upon the fluxes, and to assess in-stream processing and the loss from the terrestrial biosphere rather than just the loss at the tidal limit to the continental shelf. The approach used to calculate fluvial fluxes for individual catchments is based upon the approach developed in Worrall et al. (2014). Multiple flux estimation techniques are used to develop the best possible values given the information available for each catchment over time.

The study used data from the Harmonised Monitoring Scheme (HMS; Bellamy and Wilkinson 2001). There are 56 HMS sites in Scotland and 214 sites in England and Wales (Fig. 1; Table 1). Note that one Scottish river (the River Tweed) is included in the NE England dataset because, although most of its catchment is in Scotland, its tidal limit is in England. HMS monitoring sites were selected for inclusion in the original monitoring programme if they were immediately above the tidal limit of rivers with an average annual discharge greater than 2 m3/s; in addition, any tributaries with a mean annual discharge above 2 m3/s were also included. These criteria provided good spatial coverage of the coast of England and Wales, but river flow in most western Scottish rivers did not meet the average discharge criterion. No HMS data were available from Northern Ireland, so this analysis is necessarily restricted to Great Britain (GB) rather than the entire United Kingdom (UK). Within the database maintained as part of the HMS programme, three determinants were of particular interest to this study: total reactive phosphorus concentration (TRP; mg P/l); total phosphorus concentration (TP; mg P/l); and average and instantaneous river discharge (m3/s). From 1974 to 2012 there were 144,126 TRP concentration measurements reported above the detection limit and 45,564 measurements of TP concentration.

Fig. 1 The location of sites that could be included in this study and showing the regional divisions used for scaling up to give national TRP and TP fluxes

Table 1 The distribution and spatial coverage of catchments from which the UK TRP and TP flux could be calculated

Within the HMS monitoring programme (Simpson 1980; Dept. of Environment 1972) entries listed as orthophosphate concentration should be considered as TRP, since the analysis is based on colorimetric analysis of molybdate-reactive P on an unfiltered sample, and therefore includes orthophosphate, and other easily-hydrolysable P fractions in both dissolved and particulate phases (Jarvie et al. 2002). The total phosphorus measurements involved an additional acid-persulphate digestion step, before the colorimetric analysis (Dept. of Environment 1972). Monitoring for TP is far more restricted in time and space than for TRP.

Analysis of variance (ANOVA) was used to consider all data from all sites for which the frequency of sampling was more than 12 per year. Prior to ANOVA, a Box–Cox transformation was used to remove outliers and the distribution of the data was tested using the Anderson–Darling test; if the data failed this test, they were log-transformed. Three factors were considered in the ANOVA of the concentrations of TRP and TP and of the TRP:TP concentration ratio: (1) the difference between calendar years, with 39 factor levels, one for each year between 1974 and 2012 (henceforth the year factor); (2) the month of sampling, with 12 factor levels, one for each calendar month (henceforth the month factor); and (3) the differences between sampling sites (henceforth the site factor). The analysis was performed with and without the instantaneous river discharge at the time of sampling as a covariate. This covariate was log-transformed to ensure that the greatest proportion of the original variance in the dataset was explained. Results of the ANOVA are expressed as least squares means (also called marginal means), as these are the means controlled for the other factors and covariates.

Annual river loads

Water quality sampling frequencies (f) vary among the monitoring agencies, ranging from sub-weekly to monthly or even less frequently. Annual data were rejected for any site in any year with fewer than 12 samples taken in separate months (f < 12); in this way a range of flow conditions would be sampled. A range of methods was then applied.
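The screening rule above can be sketched in Python as follows. This is a minimal illustration under one reading of the rule, namely that a site-year is kept only when it has at least 12 samples and those samples cover 12 separate calendar months; the function name and its parameters are hypothetical and are not taken from the HMS programme or from the original analysis.

```python
from datetime import date


def site_year_is_usable(sample_dates, min_samples=12, min_months=12):
    """Return True if one site-year passes the sampling-frequency screen.

    Assumption: "12 samples in separate months" is read as at least
    `min_samples` samples spread over at least `min_months` distinct months.
    """
    distinct_months = {d.month for d in sample_dates}
    return len(sample_dates) >= min_samples and len(distinct_months) >= min_months


# Hypothetical site-year sampled once in the middle of every month.
monthly = [date(2005, month, 15) for month in range(1, 13)]
print(site_year_is_usable(monthly))
```

Under this reading, twelve samples that all fall in the same month would be rejected, because they would not capture a range of flow conditions.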

The approach to the calculation of annual river load used here was the same as that used by Worrall et al. (2014) for the particulate organic matter (POM) flux from the UK. The approach used a combination of three methods. The first method was an interpolation method based upon “method 2” of Littlewood and Marsh (2005), modified for irregular sampling: this approach assumes that each sample taken at a site is as likely as any other sample to be representative of an equal proportion of the year. Cassidy and Jordan (2011), using sub-daily measurements of P, considered both bias and precision in interpolation methods and showed increasing bias with decreasing sampling frequency, with bias of up to 60 % for monthly sampling, and uncertainty of a similarly large scale for all sampling frequencies except near-continuous monitoring. Therefore, common interpolation methods are inadequate because their precision and accuracy are poor and vary across the range of sampling frequencies.

Alternatively, Worrall et al. (2013a) showed, by considering both high-frequency data and the sources of variation (Goodman 1960), that the best method was a very simple one with a very low bias (8 % for f = 1 per month) and a high accuracy (2 % at f = 1 per month):

$$ F = KE\left( {C_{i} } \right)Q_{total} $$
(1)

where Qtotal is the total flow in a year (m3/year); E(Ci) is the expected value of the sampled concentrations (mg/l); and K is a unit conversion constant (0.000001 for flux in tonnes). The concentration data were best described by a gamma distribution, and the value of E(Ci) was taken as the expected value of a fitted gamma distribution. This method was applied to the available TRP and TP data at every site in the HMS dataset where the total flow per year could be calculated from daily flow measurements.
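As an illustration of Eq. (1), the sketch below computes an annual flux in tonnes from sampled concentrations and the total annual flow. The text does not say how the gamma distribution was fitted, so a method-of-moments fit is assumed here purely for illustration; with that fit the expected value (shape multiplied by scale) equals the sample mean. The function names and the example values are hypothetical.

```python
from statistics import fmean, variance


def fit_gamma_moments(concs_mg_per_l):
    """Method-of-moments gamma fit (an assumed fitting choice)."""
    mean = fmean(concs_mg_per_l)
    var = variance(concs_mg_per_l)
    if mean <= 0 or var <= 0:
        raise ValueError("need positive, non-identical concentrations")
    shape = mean * mean / var
    scale = var / mean
    return shape, scale


def annual_flux_tonnes(concs_mg_per_l, q_total_m3, k=1e-6):
    """Eq. (1): F = K * E(Ci) * Qtotal.

    mg/l is the same as g/m3, so E(Ci) * Qtotal is in g/year and
    K = 0.000001 converts grams to tonnes.
    """
    shape, scale = fit_gamma_moments(concs_mg_per_l)
    expected_conc = shape * scale  # mean of the fitted gamma distribution
    return k * expected_conc * q_total_m3


# Hypothetical site-year: three samples and 1e9 m3 of total annual flow.
flux = annual_flux_tonnes([0.1, 0.2, 0.3], 1.0e9)
print(round(flux, 3))
```

A maximum-likelihood fit would give a different shape and scale but the same expected value, since the fitted gamma mean equals the sample mean in both cases.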

However, the daily flow is not known at all sampling sites within the HMS. To tackle this issue, Worrall et al. (2013b) used analysis of covariance (ANCOVA) to establish and correct for sample frequency bias. The ANCOVA was used to derive a correction factor for each sampling frequency, and these correction factors were applied so that all flow-weighted fluxes were corrected to the average bias expected at a sampling frequency equivalent to better than weekly sampling. The bias-correction method was used at all those sites where Eq. (1) could not be used.

The flux for each HMS site in each year was then calculated using the methods described above. As with the concentration data, the TRP and TP flux values were analysed using ANOVA, as described above, with two factors (year and site), and then analysed again with the annual river discharge at each site included as a covariate.

From the flux estimate for each site-year combination, the export rate, i.e., the flux per unit catchment area per year, was then calculated. The national flux was then calculated using an area-weighted average of export rates by region. For each region of the UK for which P fluxes could be estimated, an average export was calculated for each year from 1974 to 2012 (Fig. 1; Table 1). The regions are based upon the UK Environment Agency’s administrative areas, which are bounded by watersheds. The fluxes from all the regions were summed to give the national flux. This area-weighted, regional approach better represents regional differences without biasing the national value due to the uneven spatial distribution of available records, while also using all site information to calculate the national-scale flux. The error due to upscaling from catchment export estimates to the regional and national scales was estimated as half the percentage difference between the values estimated from the 5th and 95th percentile exports for each region: this gives an estimated upscaling error of ±15 %. No HMS data were available for Northern Ireland. However, the land area of Northern Ireland is 13,843 km2, and so the results for Great Britain (the countries of England, Wales and Scotland) could be scaled up by area to give an estimate of the flux from the whole of the UK; no account was taken of the particular land use, soil types or climate of Northern Ireland.
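The regional upscaling described above can be sketched as follows. This is a simplified illustration, assuming that each region's average export is the unweighted mean of its site export rates and that the scaling from Great Britain to the UK is a simple area ratio. The function names, the region and site values, and the Great Britain area used in the example are hypothetical; only the land area of Northern Ireland is taken from the text.

```python
NI_AREA_KM2 = 13843.0  # land area of Northern Ireland quoted in the text


def mean_export(sites):
    """Mean export rate (tonnes P/km2/year) over (flux_t, area_km2) pairs."""
    rates = [flux_t / area_km2 for flux_t, area_km2 in sites]
    return sum(rates) / len(rates)


def national_flux(regions):
    """Sum over regions of region area times mean site export rate."""
    return sum(area_km2 * mean_export(sites) for area_km2, sites in regions)


def scale_gb_to_uk(gb_flux_t, gb_area_km2, ni_area_km2=NI_AREA_KM2):
    """Scale a Great Britain flux to the UK by land area only."""
    return gb_flux_t * (gb_area_km2 + ni_area_km2) / gb_area_km2


# Hypothetical regions: (region area in km2, [(site flux in t/year, site area in km2)]).
regions = [
    (1000.0, [(10.0, 100.0), (30.0, 200.0)]),
    (500.0, [(5.0, 50.0)]),
]
gb_flux = national_flux(regions)
uk_flux = scale_gb_to_uk(gb_flux, gb_area_km2=1500.0)
print(round(gb_flux, 1), round(uk_flux, 1))
```

Weighting by region area rather than by the number of sites means that a densely monitored region does not dominate the national value.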

Catchment characteristics

Where a catchment P flux (TP or TRP) could be calculated for the period 2003–2012, the average catchment flux over those years was compared to a range of catchment characteristics. By comparing to catchment characteristics it should be possible to map the P flux across the UK and to account for in-stream processing, such that the flux from the terrestrial biosphere can be estimated and mapped across the UK, and not just the national flux of P at the tidal limit. The period 2003–2012 was chosen to remove distortion due to any particularly wet or dry years and because the available land use data were collected for 2004. The land use and soil types of Great Britain (i.e., the UK minus Northern Ireland) were classified as per previous studies (Worrall et al. 2012a). Land use was classified into arable, grass and urban based upon the June Agricultural Census for 2004 (Defra 2005). A single measure for livestock, the equivalent livestock units per hectare, was calculated based on published nitrogen, rather than phosphorus, export values (Johnes et al. 1996). The dominant soil type of each 1 km2 grid square in Great Britain was classified into mineral, organo-mineral and organic soils based upon the classification system of Hodgson (1997). Note that, by this definition, peat soils are a subset of organic soils. The catchment area to each monitoring point was calculated from the CEH Wallingford digital terrain model, which has a 50 m grid interval and a 0.1 m altitude interval (Morris and Flavin 1994). For each of the catchments for which a P flux could be calculated, the following hydrological characteristics were used: the base flow index, the average actual evaporation, the average annual rainfall and the average annual river discharge. The hydrological characteristics for each catchment were available from the National River Flow Archive (www.ceh.ac.uk/data/nrfa/).

Multiple linear regression was used to compare the average annual flux and export for the period 2003–2012 to catchment characteristics (number of variables: 3 soil types; 3 land uses; 1 livestock and 4 hydrological characteristics). Regression models were developed first with both the explanatory variables and the response variable untransformed and then, if necessary, log-transformed. Normality of transformed and untransformed variables was tested using the Anderson–Darling test, and variables were only included in the model if they were statistically significant (different from zero at p < 0.05). Stepwise regression was used for variable selection, with both forward and backward selection and with the threshold for inclusion set at a 95 % probability of the coefficient not being zero. The variance inflation factor (VIF) was used to check for collinearity, with values of VIF > 5 taken as being the level of concern. Models were chosen on the basis of both model fit, as assessed by the coefficient of determination (r2), and, given the potential for collinearity (as assessed by VIF), the physical interpretation of the model. Of particular interest were models which only included those soil, hydroclimatic or land-use characteristics that could be mapped across Great Britain, and models that identified a relationship between P flux and catchment area. The latter were used because significant net losses should be discernible from the relationship between total P flux and catchment area. The best-fit significant model was obtained to judge this relationship. If the best-fit model included catchment area, the model was then recalculated excluding catchment area and the residuals of that model were compared to the catchment area. In using regression to filter the data for effects other than that of catchment area, care was taken to consider information that was a proxy for, or collinear with, catchment area, e.g., area of arable land. An analysis of residuals was performed for statistically significant models, where a standardised residual (residual divided by its standard deviation) greater than 2 was considered an outlier and worthy of further investigation. As a further analysis of the fit of preferred models, the residuals after model fitting were tested for normality using the Anderson–Darling test.
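The two diagnostic thresholds above (VIF > 5 and a standardised residual greater than 2) can be illustrated with the short sketch below. It is a simplified stand-in rather than the procedure used in the study: the VIF is computed only for a pair of predictors, where it reduces to 1/(1 − r2) with r the Pearson correlation between them, and the outlier rule is applied to the absolute value of the standardised residual, which is an assumption. The function names and data are hypothetical.

```python
from statistics import stdev


def pairwise_vif(x, y):
    """VIF for two predictors: 1 / (1 - r^2), r = Pearson correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r_squared = sxy * sxy / (sxx * syy)
    return float("inf") if r_squared >= 1.0 else 1.0 / (1.0 - r_squared)


def flag_outliers(residuals, threshold=2.0):
    """Indices where |residual / sd of residuals| exceeds the threshold."""
    sd = stdev(residuals)
    return [i for i, r in enumerate(residuals) if abs(r / sd) > threshold]


# Hypothetical predictors: catchment area and a nearly proportional arable area.
area = [1.0, 2.0, 3.0, 4.0, 5.0]
arable = [2.1, 3.9, 6.2, 7.8, 10.1]
print(pairwise_vif(area, arable) > 5)

# Hypothetical residuals with one large value at index 9.
print(flag_outliers([0.0] * 9 + [10.0]))
```

With more than two predictors, the VIF of each predictor would instead come from regressing it on all of the others.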

National-scale estimates of P imports and exports

The fluvial flux of TP was compared to the P fluxes to and from Great Britain, with the national border taken as the boundary of the budget assessment. The inputs considered were: industrial usage; synthetic inorganic fertilisers; and food and feed transfers. The outputs considered were: fluvial flux of TP; and direct discharges of TP to the continental shelf. The food and feed transfers could be either a net input or a net output and so, for initial analysis, they are assessed as an input, but this does not prejudice the ultimate value or status of that flux pathway. The fluvial flux of TP, as conceived above, already incorporates leaching and soil erosion and the discharge of sewage and waste to rivers, but is the flux at the tidal limit and so does not include direct sewage and waste output to the sea. However, the OSPAR Commission has reported fluxes of TP for direct discharge to the UK continental shelf of industrial and sewage wastes since 1990 (OSPAR Commission 2013). It has been assumed that the input from wet and dry atmospheric deposition is negligible (Neal et al. 2004).

Phosphorus is redistributed in the landscape and across national boundaries with food and feed transfers as well as plant and seed transfers. For the first time, this study uses commodity trade data to estimate the P export or import in the form of food and feed for the UK. The tonnage of imports versus exports was compared for the range of food and feed commodities detailed in Table 2. The balance of trade, i.e., the difference between imports and exports, of food and feedstuff commodities was available for every year from 1990 (DEFRA 1990 to 2013). The balance of trade in each chosen commodity was then converted to a balance of P trade using typical P contents of common foodstuffs based on reports of the COFID database (McCance and Widdowson 2002; Table 2). In each case the composition of raw uncooked food was used. For freshwater fish, the composition of salmon was assumed; for sea fish, the composition of haddock was used (haddock was the most commonly landed sea fish); and for shellfish, the composition of whole crab was assumed (crab was the most commonly landed shellfish). For meat it was assumed that the dressed carcass was being transferred. Dressed carcass to live weight ratios were taken from Lord et al. (2002); the large P content of meats arises because the dressed carcass includes a proportion of the bones of the original animal.
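The conversion from a balance of trade to a balance of P trade can be sketched as below. The commodity names, tonnages and P contents are invented for illustration and are not the values in Table 2 or in the COFID database; the P content is expressed here as a mass fraction (tonnes P per tonne of commodity), which is an assumed unit.

```python
def p_trade_balance(trade_tonnes, p_fraction):
    """Net P import per commodity: (imports - exports) * P mass fraction.

    A positive value is a net import of P across the national boundary;
    a negative value is a net export.
    """
    balance = {}
    for commodity, (imports_t, exports_t) in trade_tonnes.items():
        balance[commodity] = (imports_t - exports_t) * p_fraction[commodity]
    return balance


# Hypothetical trade figures (tonnes/year) and P mass fractions.
trade = {"wheat": (1000.0, 400.0), "milk": (100.0, 300.0)}
p_content = {"wheat": 0.003, "milk": 0.001}

balance = p_trade_balance(trade, p_content)
net_p_import = sum(balance.values())
print({name: round(value, 3) for name, value in balance.items()}, round(net_p_import, 3))
```

Summing the per-commodity balances gives the net food and feed transfer, which can be either an input to or an output from the national budget.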

Table 2 The P content of food items, and fluxes used in this study to estimate UK boundary food and feed transfers

Figures for the use of synthetic inorganic fertiliser in the UK were derived for the period 1974–2012 from surveys published by the Fertiliser Manufacturers Association and the Environment Agency of England and Wales (British Survey of Fertiliser Practice 2013). The vast majority of animal wastes in the UK are returned to the land on the same farm on which they are produced and therefore represent an internal transfer with no loss from the system (less than 3 % of cattle manure; Smith et al. 2001). Only one river in the UK crosses an international border (the River Blackwater, between Northern Ireland and the Republic of Ireland); its catchment is only 1507 km2 and it lies in Northern Ireland, for which we have no data.

Industrial usage can be considered as an input because the UK has no actively worked reserves of mineral phosphate and thus relies on imports of P for inorganic fertilisers and for industrial raw materials containing P, for example, for the manufacture of, or inclusion in, detergents. Villalba et al. (2008) estimate that 75 % of mined P is consumed in the production of fertiliser, which was accounted for as described above, but they do not give values broken down such that they could be used for the UK. Comber et al. (2013) studied the domestic consumption of P in the UK and showed that the average per capita discharge to sewage treatment works was 2.05 g P/day/ca, of which 0.8 g P/day/ca was food and the rest was non-food items. Therefore, the import of P in industrial goods represents 1.25 g P/day/ca, and annual records of the UK population from 1973 to 2012 were used to calculate the import of P as industrial goods. This approach assumes that all industrial usage of P is for consumables (e.g., detergents) and that none is disposed of directly to landfill (e.g., P used in plasticizers); it also assumes that no industrial goods are re-exported, e.g., detergent manufactured in the UK from imported P and then immediately exported from the UK. However, P used in iron and steel manufacture is normally recycled to land in crushed slag products. Therefore, the use of the study of Comber et al. (2013) probably gives an underestimate of industrial usage of P and its processing in the UK. The results of Comber et al. (2013) cover the period 2000–2010 and so we have assumed that the results can be extrapolated back to 1990.
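The per capita calculation above can be written out as a short sketch. The non-food rate is derived from the two per capita figures quoted from Comber et al. (2013) rather than hard-coded; the population used in the example, the 365-day year and the function name are assumptions made for illustration only.

```python
GRAMS_PER_KTONNE = 1.0e9


def industrial_p_import_ktonnes(population, total_g_per_day=2.05,
                                food_g_per_day=0.8, days_per_year=365):
    """Annual import of P in industrial goods (ktonnes P/year).

    The non-food share is the total per capita discharge to sewage
    treatment works minus the food share, both in g P/day per person.
    """
    non_food_g_per_day = total_g_per_day - food_g_per_day
    grams_per_year = population * non_food_g_per_day * days_per_year
    return grams_per_year / GRAMS_PER_KTONNE


# Hypothetical population of 60 million people.
industrial_import = industrial_p_import_ktonnes(60.0e6)
print(round(industrial_import, 3))
```

Repeating the calculation with each year's population gives a time series of industrial P import that can be set against the fertiliser and food and feed inputs.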

Since Northern Ireland is not included in the fluvial flux data, this was corrected for by upscaling from the area of Great Britain to that of the UK. Further, most of the comparative P fluxes listed above are from Government and other published data sources and, as such, were not calculated within this study. The original data were therefore not available, and this study has had to accept the error estimate from each individual source. In some cases, no error or uncertainty estimate is given; in other cases the error estimate is not credible. For example, the OSPAR Commission (2013) reports the lower and upper limits of the direct flux of TP from the UK in 2011 as 5.81 and 5.88 ktonnes P/year, an error of only 0.6 %. In other cases, although the reported flux error is given as a range, it is not always clear what this range represents (e.g., min–max, inter-quartile range or confidence intervals); such ranges could not be compared here because a comparison between different types of uncertainty (e.g., range vs. standard error) would not be fair. Therefore, for some fluxes considered in this study, it was necessary to accept the uncertainty as reported in the original source.

Results

TRP concentrations

Within the HMS database the median TRP concentration was 0.16 mg P/l with a 5th to 95th percentile range of 0.009 to 2.25 mg P/l. Values below the reported detection limit were not considered further. The least sampled year was 1974 with only 2158 samples and the most sampled year was 1992 with 5289 samples. Box–Cox transformation removed 171 data points, leaving a consistent detection limit of 0.002 mg P/l, and no further transformation was required. ANOVA explained 83 % of the variation in the data and showed that there was a significant seasonal cycle and a significant downward trend across the 39 years of the dataset. When flow was not included as a covariate, the peak TRP concentration was in 1989 (Fig. 2), i.e., prior to the inception of the Urban Waste Water Treatment Directive (European Commission 1991). By 2012, the least squares mean value was less than 50 % of the value in 1974. When flow was included as a covariate, the peak TRP concentration was in 1984, but there was no consistent decline in TRP concentration until after 1990.

Fig. 2 The time series of the TRP concentrations for the UK expressed as the least squares means of the year factor for TRP concentrations

TP concentrations

Within the HMS database the median TP concentration was 0.114 mg P/l with a 5th to 95th percentile range of 0.012 to 1.53 mg P/l. The least sampled year was 1983 with 147 samples and the most sampled year was 2012 with 2928 samples. Transformation did not improve the dataset. ANOVA explained 82 % of the variation in the data, with all factors being significant (greater than 95 % probability), and showed a decline in the annual least squares mean from 1983, with values in 2012 again less than 50 % of the value in 1974 (Fig. 3). Inclusion of flow as a covariate did not change the observed pattern but removed a large peak predicted for 1999.

Fig. 3 The time series of the TP concentrations for the UK expressed as the least squares means of the year factor for TP concentrations

TP versus TRP

It was possible to assess the ratio of TRP to TP in 40,880 cases; the median value of this ratio was 0.73 with a 5th to 95th percentile range of 0.17 to 1, i.e., a median of 73 % of total phosphorus was reactive phosphorus. The ANOVA of the TRP:TP ratio, using the same three factors as previously (year, month and site, with and without flow as a covariate), explained only 25 % of the variance in the original dataset, though all factors were found to be significant. There were significant differences between years, but the least squares means show that a consistent trend across time is difficult to discern even when flow is included as a covariate (Fig. 4). When TRP and TP concentrations are directly compared, it would appear that, perhaps not surprisingly, all samples could be described as a mixture of a form of TP made up almost entirely of TRP and a form of TP made up of non-reactive P (Fig. 5).

Fig. 4 The time series of the TRP:TP concentrations for the UK expressed as the least squares means of the year factor for the TRP:TP ratio with and without the inclusion of river flow as a covariate

Fig. 5 Comparison of TRP and TP for all available samples

Annual fluvial fluxes of TRP

Within the dataset there were 5892 site-year combinations for which a flux could be calculated. Over the monitored period the number of sites that could be included in the calculation of any one year’s flux was between 64 in 1974 and 211 in 2009. The average sampling frequency peaked at 27 samples per site in each of 1975, 1976 and 1977 and decreased to only 13 samples/site/year in 2012 (note that sites with f < 12 had already been removed). Applying ANOVA to the annual site-year flux estimates shows that, when both site and year factors were included, the ANOVA explained 39 % of the variation in the original dataset; when annual water yield was included as a covariate, this rose to 70 %, and the sharp peak in fluxes between 1994 and 1996 was removed (Fig. 6). Once water yield is allowed for, catchment fluxes peaked in 1995.

Fig. 6 The time series of the TRP fluxes for the UK expressed as the least squares means of the year factor for TRP fluxes

When using only method 2 with correction for the unsampled area of the UK, then the flux from the UK was between 22.6 ktonnes P/year in 1986 and 8.3 ktonnes P/year in 2011. The preferred method combination gave results that were between 8.3 ktonnes P/year in 2011 and 41.7 ktonnes P/year in 2000. The pattern is clearly dominated by two peaks in 1984 and 2000, which were both very wet years in the UK; the fluxes decline dramatically after 2000 (Fig. 7).

Fig. 7 The TRP flux from the UK over the period 1974–2012

The annual flux from each catchment was averaged over the last 10 years and compared to the available catchment characteristics (land use, soil type and hydroclimatic data). The best fit equation was:

(2)

where TRP = decadal average annual flux of TRP (tonnes P/year); Urban = area of urban land use (km2); Grass = area of grass land (km2); OrgMin = area of organo-mineral soils (km2); and Org = area of organic soils (km2). All terms in Eq. (2) were significantly different from zero at a probability of at least 95 %, and the values in brackets beneath the terms are the standard errors of the coefficients. No other variables (land use, soil type or hydrology) were found to be significant and there was no significant constant term. There was no significant role for catchment area or for the area of arable land use, and the variance inflation factors for the terms in Grass and OrgMin suggest a high degree of collinearity between soil and land-use terms. Equation (2) was not suitable for mapping across the UK. The absence of a significant term in catchment area was unexpected, because such a term has previously been used to estimate in-stream losses (e.g., Worrall et al. 2014).

Principal component analysis was used to assess the multivariate structure of the dataset; 3 components were found with an eigenvalue >1. The loadings on the first 3 components (Table 3) show that the first component has similarly high loadings for the terms in area, mineral soils, organo-mineral soils, and arable, grass and urban land uses, together with a high positive loading for TRP flux, but not for TRP export, and a negative loading for annual flow. Component 1 shows a correlation between most of the catchment characteristics, which would lead to issues of collinearity in isolating a single predictive parameter set. Component 2 has only low loadings for the TRP flux and export, while component 3 has high positive loadings for both; for component 3 the third most important variable was a negative loading for arable land, followed by a positive loading for urban land. Therefore, component 3 represents catchments with high flux and export that are dominated by urban but not arable land.

Table 3 The principal component analysis of the TRP fluxes and exports

The fit of Eq. (2) can be understood when the relationship between TRP flux and urban area is examined (Fig. 8). There is a clear linear relationship (in log–log space) between the most rural catchments in the dataset (marked A in Fig. 8) and the most urbanised catchment in the dataset (marked B in Fig. 8). However, it is also clear from Fig. 8 that there is an influence of urbanised catchments with low TRP flux, with the extreme case being the site marked C. These latter catchments may be ones that, although urbanised, export their major sewage or waste outfalls to a different catchment, i.e., the sewage from the population of these catchments is actually processed in another catchment. Alternatively, the type of behaviour typified by site C may occur where there has been extensive clean-up and P stripping from sewage effluent. When a linear relationship is explored as an alternative to Eq. (2), 0.98 tonnes P would be expected from each 1 km2 of urban land use.

Fig. 8 The average TRP flux (tonnes P/year) for each study catchment in comparison with the proportion of urban land use within that catchment. The points distinguished by letters are discussed in the text

Annual fluvial fluxes of TP

Within the dataset there were 3606 site-year combinations for TP flux. The median flux was 31 tonnes P/year with a 5th to 95th percentile range of 2.7 to 510 tonnes P/year. Over the monitored period the number of sites that could be included in the calculation of any one year’s flux was between 64 in 1974 and 226 in 2012, with a minimum of 17 sites in 1984. The average sampling frequency peaked in 1977 at 14.8 samples per year. Normality testing and the Box–Cox transformation removed only 4 data points. When ANOVA was performed, the factors of site and year explained 63 % of the original variance, but this increased to 73 % when annual water yield was considered as a covariate (Fig. 9). The main effects plot shows a decline in TP fluxes from the study sites since 1992, the year of the inception of the Urban Waste Water Treatment Directive (European Commission 1991).

Fig. 9

The time series of the TP fluxes for the UK expressed as the least squares means of the year factor for TP fluxes

Using the preferred method, the flux of TP from the UK varied from 120 ktonnes P/year in 1976 to 10.8 ktonnes P/year in 1986 (0.04–0.49 tonnes P/km2; Fig. 10). However, only after 2002 did the number of sites sampled in any 1 year rise above and stay above 200, and 2003–2012 was the only period of the entire record when all regions were represented within the weighted sum used to scale up to the UK. From 2003 onwards, the flux fell from a peak of 53.8 ktonnes P/year in 2003 to a low of 15.8 ktonnes P/year in 2011 (0.06 to 0.22 tonnes P/km2). Earl et al. (2014) estimated that the average annual flux from the UK between 1993 and 2003 was 34.5 ktonnes P/year (0.14 tonnes P/km2) compared to a value from this study of 30.3 ktonnes P/year (0.12 tonnes P/km2). Smith et al. (2005) considered the TP flux from Northern Ireland and found a TP flux to coastal waters of 2174 tonnes P/year (0.15 tonnes P/km2).
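The paired national and areal fluxes quoted above can be cross-checked with a simple unit conversion. The sketch below is illustrative only: the land area of 244,000 km2 is an assumption inferred from the paired values in the text (e.g., 120 ktonnes P/year against 0.49 tonnes P/km2), not a figure stated in this study, and the function name is hypothetical.

```python
# Convert a national TP flux (ktonnes P/year) to an areal flux (tonnes P/km2/year).
# ASSUMPTION: the area below is inferred from the paired values quoted in the
# text; it is not a value reported by the study.
ASSUMED_UK_AREA_KM2 = 244_000

def areal_flux(flux_ktonnes_per_year, area_km2=ASSUMED_UK_AREA_KM2):
    """Return the flux per unit area in tonnes P/km2/year."""
    return flux_ktonnes_per_year * 1000.0 / area_km2

# National fluxes quoted in the text (ktonnes P/year).
quoted = {
    "1976 peak": 120.0,
    "1986 minimum": 10.8,
    "2003 peak": 53.8,
    "2011 low": 15.8,
    "Earl et al. 1993-2003": 34.5,
    "this study 1993-2003": 30.3,
}

for label, flux in quoted.items():
    print(f"{label}: {areal_flux(flux):.2f} tonnes P/km2/year")
```

Under this assumed area, the rounded results match the areal values given in the text, which suggests the quoted pairs are internally consistent.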

Fig. 10

The TP flux from the UK over the period 1974–2012

When compared to catchment characteristics:

(3)

where OrgMin is the area of organo-mineral soils within the catchment (km2), Org is the area of organic soils within the catchment (km2), Grass is the area of grassland within the catchment (km2), Urban is the area of urban land within the catchment (km2), and Area is the area of the catchment (km2). Only parameters found to be significant at least at the 0.95 probability level were included and values in brackets beneath Eq. (3) are the standard errors in the coefficients. There was no significant constant term in Eq. (3), which could imply that there was no background TP flux from the study catchments; however, it should be pointed out that there is no single square kilometre in the UK for which the areas of organo-mineral soils, organic soils, grassland and urban land are all zero, and so Eq. (3) would predict a TP flux from any location in the UK. Equation (3) has no significant term for arable land. When arable land was retained within Eq. (3), it had a negative coefficient, implying the physically illogical conclusion that arable land acted as a sink of TP. The variance inflation factors showed that the area and arable terms were, as also demonstrated for TRP above, highly collinear. A similar collinearity between the arable and area terms was observed for nitrate (Worrall et al. 2012a) and can be ascribed to the fact that in the UK the lowland areas most suitable for arable farmland are closest to the sea and, as catchment area increases, the land being added to the catchment is more likely to be arable. It is physically reasonable that a negative term in area represents in-stream removal, and this was also observed for both fluvial nitrogen and carbon (Worrall et al. 2012a, b). Equation (3) can now be considered an export coefficient model for TP and, as such, Eq. (3) predicts that, for example, grassland on organo-mineral soils would export 0.33 tonnes P/km2/year. White and Hammond (2007) summarise export coefficients from McGuckin et al. (1999) and Smith et al. (2005) for UK land uses (but not soil types) and they range from 0.01 to 0.48 tonnes P/km2/year. Unlike Eq. (2), Eq. (3) implies a significant in-stream loss at a rate of 0.14 tonnes P/km2/year.
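The structure of an export coefficient model of this kind can be sketched as a linear sum of area terms with no constant. The code below is a minimal illustration and not the fitted model: only the combined grassland plus organo-mineral export (0.33 tonnes P/km2/year) and the in-stream loss rate (0.14 tonnes P/km2/year) come from the text, and every other coefficient, including the split of 0.33 between the two terms, is a hypothetical placeholder. The names `COEFFS` and `tp_flux` are likewise assumptions.

```python
# Sketch of an export-coefficient model with the structure described for Eq. (3).
# Coefficients are in tonnes P/km2/year. Only two values are taken from the text:
#   - orgmin + grass sums to 0.33 (grassland on organo-mineral soil)
#   - area is -0.14 (in-stream loss)
# Every other number is a HYPOTHETICAL placeholder, not a fitted coefficient.
COEFFS = {
    "orgmin": 0.13,  # placeholder; only the sum orgmin + grass = 0.33 is from the text
    "org": 0.10,     # placeholder
    "grass": 0.20,   # placeholder; see note on orgmin
    "urban": 1.00,   # placeholder
    "area": -0.14,   # in-stream loss rate quoted in the text
}

def tp_flux(orgmin, org, grass, urban, area, coeffs=COEFFS):
    """Return the TP flux (tonnes P/year) for areas given in km2.

    There is no constant term, mirroring the description of Eq. (3).
    """
    areas = {"orgmin": orgmin, "org": org, "grass": grass,
             "urban": urban, "area": area}
    return sum(coeffs[name] * areas[name] for name in coeffs)

# 1 km2 of grassland on organo-mineral soil: source export, then export net of
# the in-stream loss associated with that same 1 km2 of catchment area.
source_export = tp_flux(orgmin=1, org=0, grass=1, urban=0, area=0)
net_export = tp_flux(orgmin=1, org=0, grass=1, urban=0, area=1)
print(f"source export: {source_export:.2f} tonnes P/year")
print(f"net of in-stream loss: {net_export:.2f} tonnes P/year")
```

Because soil and land-use terms are additive, a single square kilometre contributes once for its soil type and once for its land use, which is how the 0.33 tonnes P/km2/year figure for grassland on organo-mineral soils arises.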

Equation (3) can be mapped across Great Britain at a scale of 1 km2 (Fig. 11) and contributions to national TP fluvial fluxes can be estimated (Table 4)—this includes an estimated in-stream loss of 34.2 ktonnes P/year. The map of TP fluxes across Great Britain (Fig. 11) shows hotspots in fluvial P fluxes (>1 tonnes P/km2/year) in the major cities. Large areas of western and northern England, which correspond with areas of highest precipitation and runoff with grassland/livestock production, had P fluxes of 0.2–0.5 tonnes P/km2/year. In contrast, the areas of arable production in the south and east of England, which receive high fertiliser inputs, but very low net annual precipitation, have fluvial total P fluxes of less than 0.1 tonnes P/km2/year, similar to TP fluxes in the highlands of Scotland.
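The national in-stream loss quoted above is roughly what the area term of Eq. (3) gives when applied to the whole land area. The sketch below shows that arithmetic under an assumed area of 244,000 km2; that area is inferred from the areal fluxes quoted elsewhere in the text and is not reported here, so the agreement should be read as a consistency check rather than a reproduction of the authors' calculation.

```python
# In-stream loss implied by the area term of Eq. (3), scaled to a national area.
IN_STREAM_LOSS_RATE = 0.14   # tonnes P/km2/year, quoted in the text
ASSUMED_AREA_KM2 = 244_000   # ASSUMPTION: inferred, not stated in the source

def in_stream_loss_ktonnes(area_km2=ASSUMED_AREA_KM2, rate=IN_STREAM_LOSS_RATE):
    """Return the in-stream loss in ktonnes P/year."""
    return rate * area_km2 / 1000.0

loss = in_stream_loss_ktonnes()
print(f"estimated in-stream loss: {loss:.1f} ktonnes P/year")
```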

Fig. 11

The TP export from the terrestrial biosphere to the river network at the 1 km2 scale

Table 4 The components of Eq. (3) for the UK

Comparing national river P fluxes with trade inputs and outputs

The import of synthetic fertiliser peaked in 1984 at 2175 ktonnes P and has declined ever since to 81.5 ktonnes P by the end of 2013 (Fig. 12; Table 5). Note that Fig. 12 and Table 5 run from 1990 to 2013 as this is the common period for which all flux pathways could be estimated. The export of food and feed has declined since 1990, with net export peaking in 1994 at 7.9 ktonnes P and becoming a net import of P in 1998; by 2012 the UK was importing 28.7 ktonnes P/year in food and feedstuffs. Imports of industrial P have risen exactly in line with population and by 2012 the imports were 28.6 ktonnes P/year, but this is to be expected given that the estimation method was based on per capita data. Villalba et al. (2008) considered the global industrial production and consumption of P and found that 75 % of P production goes to fertiliser and the remainder to other products. Villalba et al. (2008) did not give values for the UK but did for Europe (1464 Mtonnes P in 2007, or 1.97 tonnes P/ca/year). Rescaling the values of Villalba et al. (2008) for the UK population, this would be 124 ktonnes P/year for industrial use, which would divide as 93 ktonnes P/year as fertiliser and 31 ktonnes P/year as other industrial uses. In this study, for the UK, in 2012 the total industrial consumption was 110.3 ktonnes P of which 74 % was for fertiliser, i.e., the estimates based upon Comber et al. (2013) may only be a slight underestimate. The direct discharges of TP to the sea have declined from 22.0 ktonnes P/year in 1990 to 5.5 ktonnes P/year in 2012. This decline can be ascribed to improved waste water treatment provision, implemented in part as a response to the Urban Waste Water Treatment Directive (European Commission 1991). The total UK P budget changed from a net input of 101 ktonnes P/year in 1990 to a net accumulation of 112 ktonnes P/year in 2012 (Fig. 12).
It is important to note that the UK is now accumulating P and that the rate of accumulation is not declining because, although fertiliser inputs are diminishing, the amount being exported via rivers and direct discharges has declined more rapidly. Indeed, in 1990 the fluvial flux of TP at the tidal limit represented 40.8 % of the imports of P to the UK, but by 2012 it represented only 15.4 %.
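The comparison with Villalba et al. (2008) rests on a simple split of total industrial P between fertiliser and other uses. A minimal sketch of that arithmetic is given below, using only the totals and fractions quoted in the text; the function name is hypothetical.

```python
# Split a total industrial P consumption into fertiliser and other uses.
def split_industrial_p(total_ktonnes, fertiliser_fraction):
    """Return (fertiliser, other) in ktonnes P/year."""
    fertiliser = total_ktonnes * fertiliser_fraction
    return fertiliser, total_ktonnes - fertiliser

# Rescaled Villalba et al. (2008) estimate: 124 ktonnes P/year, 75 % fertiliser.
villalba_fert, villalba_other = split_industrial_p(124.0, 0.75)

# This study, 2012: 110.3 ktonnes P, 74 % fertiliser.
study_fert, study_other = split_industrial_p(110.3, 0.74)

print(f"Villalba rescaled: {villalba_fert:.0f} fertiliser, {villalba_other:.0f} other")
print(f"This study: {study_fert:.1f} fertiliser, {study_other:.1f} other")
```

The two sets of estimates differ by about 14 ktonnes P/year in total, which is the basis for describing the values from this study as only a slight underestimate.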

Fig. 12

The total P budget with respect to the UK border

Table 5 The components of the TP budget at the UK border between 1990 and 2013

Discussion

It is difficult to find comparable studies that consider total P budgets at a regional scale. There have been farm-scale budgets (e.g., Haygarth et al. 1998; Ruane et al. 2014) and these note the P accumulation in agricultural systems that would lead to over-fertilised soils (Sharpley and Smith 1989). Smith et al. (2005) considered the TP flux from Northern Ireland based on an estimate of the country running an agricultural surplus of P of 1.65 tonnes P/km2 (Foy et al. 2002). For some sectors there have been large-scale studies; for example, Bouwman et al. (2013) explored the global impact of changing livestock production on P cycles and showed increasing P surpluses, with those surpluses being lost to fluvial networks. There have been studies at the regional scale of fluvial fluxes of P. Similarly, examples of P budgets exist for urban land use at a global scale (Moree et al. 2013), for global soils (Van Vuuren et al. 2010) and for the projection of global P demand (Bouwman et al. 2009). Kronvang et al. (2007) modelled the TP flux from European countries draining into the Danube and calculated fluxes between 0.07 and 0.12 tonnes P/km2. Seitzinger et al. (2005) estimated that the global flux of TP to the oceans was 11 Mtonnes P/year (0.074 tonnes P/km2/year) compared to 9 Mtonnes P/year (Meybeck 1982) and 20 Mtonnes P/year (Mackenzie et al. 1993). Meybeck (1982) predicted that only 7.5 % of the TP was as SRP. So although these studies have considered fluvial fluxes and agricultural surpluses, they do not provide a national budget and cannot give an estimate of the fluvial fluxes as a proportion of the total P exchanges as described here. They have focused upon the agricultural sector rather than all the fluxes, although more inclusive approaches have become common in the study of nitrogen (e.g., Sutton et al. 2011).

The P budget presented here was quantified relative to the UK national border and so does not cover the re-distribution of P within landscape compartments. However, internal re-distribution is an important consideration where the projected P accumulation is concerned. The increase in P accumulation has been brought about by declines in fluvial and direct coastal waste P disposal fluxes, as a direct result of improvements to sewage treatment. This has resulted in a major shift in the environmental pathways of sewage P, which was previously discharged via effluent to rivers and the coastal zone. Accordingly, an increasing proportion of this effluent P flux is now removed in sewage sludge and disposed of to the terrestrial environment. In 2000, 52 % of sewage sludge was disposed of on agricultural land in England and Wales, and 17 % was sent to landfill (DEFRA 2002). Thus, a high proportion of this accumulation of P from sewage sludge is likely to be within soil. Groundwater represents another potential sink for dissolved P mobilised from sewage sludge. Holman et al. (2008) collated the SRP concentration data for English and Welsh groundwater and found a median value of 30 μg P/l but did not assess any trend in those data, making it impossible to judge whether groundwater is acting as an increasing sink of SRP. Furthermore, the use of P fertiliser and feedstuffs to enhance livestock production will act to transfer P to landfill through the supply of meat on the bone to consumers; the majority of P consumed by livestock goes to develop the animal’s bone structure rather than its protein content (McCance and Widdowson 2002). The bones of butchered animals are often rendered and now incinerated to be returned to land, but for meat sold to customers on the bone, the bone component will be sent to landfill. Furthermore, if 52 % of sewage sludge went to agricultural land and 17 % to landfill in 2000, then at most the remaining 31 % was incinerated.
Incinerator ash, rich in P, would be disposed of to landfill or used in cement manufacture, i.e., locked into long-term stores. The P accumulating within terrestrial or groundwater P stores provides a long-term legacy P source to surface waters (Jarvie et al. 2013a; Sharpley et al. 2013) but accumulation in landfill will not represent a long-term source to surface waters. These terrestrial and groundwater P pools have vastly different residence times, with major implications for timescales of legacy P storage and recycling. For example, landfill will likely be a much longer-term P store than soils, where P will likely be recycled over much shorter timescales, e.g., years to a few decades (Jarvie et al. 2013a). The longevity of P storage in groundwater is more variable and dependent on aquifer type. For example, in Chalk, co-precipitation of phosphate with calcium carbonate is very efficient at locking up P in a highly insoluble form within the Chalk matrix, resulting in an effectively permanent P sink (Neal 2001). In contrast, other aquifer systems can exhibit much shorter timescales of P storage and remobilisation, for example in karst groundwaters, where P storage and remobilisation can occur within the order of a decade (Jarvie et al. 2014). Sattari et al. (2016) have proposed that at the global scale, grasslands have a negative P budget because the return of manure and sewage sludge is not sufficient compared to the offtake of food from grassland, implying that grassland would not be the site of accumulation. However, the analysis of Sattari et al. (2016) did not allow for the increased diversion of human sewage (as is noted in this study); the fact that a proportion of the P in sewage sludge that is returned to land must come from industrial products and not from human sewage; and the return of P via rendered animal carcasses (even after incineration as is required in the UK).
Between 1992 and 2012, the total amount of sewage sludge either incinerated or supplied directly to landfill rose from 219.5 ktonnes dry solids to 268.3 ktonnes dry solids (i.e., an increase of 2.57 ktonnes dry solids/year; DEFRA 2012). Bending et al. (1999) give the P content of dry sewage sludge in the UK as between 1.5 and 1.9 %, which means that the increase in sewage sludge incineration and landfilling represents between 0.04 and 0.05 ktonnes P/year, or that about 8 % of the increasing accumulation is represented by changing sludge handling alone.
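The sludge estimate above follows from multiplying the rate of increase in dry solids by the P content of the sludge and comparing the result with the national rate of increase in accumulation. The sketch below reproduces that arithmetic from values quoted in this paper (the 0.6 ktonnes P/year2 accumulation rate is the figure given in the Conclusions); the variable and function names are hypothetical.

```python
# P carried by the annual increase in incinerated or landfilled sewage sludge.
SLUDGE_INCREASE = 2.57            # ktonnes dry solids/year (DEFRA 2012, as quoted)
P_CONTENT_RANGE = (0.015, 0.019)  # P fraction of dry sludge (Bending et al. 1999)
ACCUMULATION_RATE = 0.6           # ktonnes P/year2, national rate quoted in the Conclusions

def sludge_p_increase(dry_solids_rate, p_fraction):
    """Return the P associated with the sludge increase (ktonnes P/year)."""
    return dry_solids_rate * p_fraction

low, high = (sludge_p_increase(SLUDGE_INCREASE, f) for f in P_CONTENT_RANGE)

# Share of the increasing national accumulation explained by sludge handling,
# using the upper P content.
share_high = 100.0 * high / ACCUMULATION_RATE

print(f"P increase: {low:.2f} to {high:.2f} ktonnes P/year")
print(f"share of increasing accumulation: about {share_high:.0f} %")
```

The 8 % figure corresponds to the upper end of the P content range; at the lower end the share would be closer to 6 %.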

Consideration of TRP and TP fluxes has shown that they are dominated by urban land use. Given the export coefficients estimated in Eq. (3), it is possible to assess the comparative importance of soils and land uses in the UK (Table 4). Although the area of grassland is almost three times larger than the UK’s urban area, the far larger urban export coefficient means that urban areas are the source of 60 % of the TP flux. Defra (2004) estimated that the urban contribution to TP loads was 50 % but this was revised upwards to 70 % by White and Hammond (2007). Table 4 also shows that Eq. (3) predicts that 34.2 ktonnes P/year are lost in the stream network. As already noted above, previous studies have interpreted such a loss term either as loss of carbon or nitrogen to the atmosphere or, in the case of particulates (e.g., POM), as storage within channels or on floodplains (Worrall et al. 2014). In the case of storage of particulates, Worrall et al. (2014) predicted that 3 % of POM goes to storage, and therefore in the case of TP this would give a median loss to storage of 1.1 ktonnes P/year. Unlike carbon and nitrogen, P cannot be lost to the atmosphere, but loss could be ascribed to fixation into biomass or binding into sediment.

Conclusions

For the first time, a P balance for Great Britain has been calculated, by bringing together data on P fluxes from rivers and direct coastal discharges to the national coastal boundary, along with new detailed information on export and import of P within a wide range of trade commodities, including fertiliser, animal feed, food and drink and industrial products. Time series of fluvial P fluxes to the national coastal boundary of the UK were calculated for 38 years (1974–2012). The results showed that: (1) the total national flux of TP peaked in 1976 at 120 ktonnes P/year (0.49 tonnes P/km2) and declined to a minimum of 15.8 ktonnes P/year (0.06 tonnes P/km2) by 2011, while TRP fluxes peaked in 2000 at 70.9 ktonnes P/year (0.29 tonnes P/km2) and declined to 9.3 ktonnes P/year (0.05 tonnes P/km2) by 2011; and (2) at the national scale, fluvial TP fluxes are dominated by P inputs from urban areas, even after the introduction of the EU Urban Waste Water Treatment Directive, which enhanced P stripping from effluent. By comparing all P imports to Great Britain with exports, including P loss via fluvial fluxes and waste disposal direct to the coastal zone, a national P budget was estimated. The national P budget shows an accumulation of P within Great Britain, which has continued to increase over the last 15 years at an average rate of 0.6 ktonnes P/year2 across the whole of the UK, equivalent to 24 kg P/km2/year2. This accumulation corresponds with upgrades to sewage treatment, which have diverted effluent discharges of P to sewage sludge, thereby changing the disposal of effluent P from an aquatic to a terrestrial environmental pathway. The P budget approach used here does not elucidate any internal re-distribution or flows of P.
However, ~77 % of sewage sludge in England and Wales is now applied to agricultural land, so storage in agricultural soils likely represents an important P sink, while this study also points to a role for accumulation in landfill from food waste. Further research is now needed to explore the spatial distribution, accumulation, residence times and flows of P within, and between, environmental pools, including soils, landfill, groundwater and surface waters, to better understand the long-term environmental and agronomic implications of the changing P balance within Great Britain.