# Stochastic modelling of rainfall for the island of Ireland

## Abstract

This paper analyses a recently created continuous 305-year (1711–2016) monthly rainfall series for the island of Ireland. The findings are as follows. The excess skewness in the monthly series may be eradicated by using a Box-Cox transformation with parameter equal to 0.6: a value very similar to that found for the U.K. and its regions. There is no evidence of either an overall stochastic trend or of evolving monthly seasonal patterns, but positive linear trends are found for January, March, and December and a negative linear trend is found for July. Analysis of the seasonal and annual series (which require no transformation) confirms the implication from the monthly data that winters have become progressively wetter and summers progressively drier, with the positive linear trend for winter being twice the size of the negative summer trend. Since there is no trend in either spring or autumn rainfall, annual rainfall shows a positive linear trend. Given that the rainfall series exists for over three centuries, breaks and structural shifts in the model were investigated. Five breaks were identified, three of which occurred in the early portion of the series during the eighteenth century. However, trends were found to be much more stable from the middle of the nineteenth century. For the seasonal series, only a single break, at 1790 for the winter series, was found: it was only after this break that winters became wetter; before then, winter rainfall had a negative trend. In terms of predictability, predictions from the model were found to be more volatile during the second half of the eighteenth century and again from 1976 onwards.

## 1 Introduction

Murphy et al. (2018) have recently created a continuous 305-year (1711–2016) monthly rainfall series for the island of Ireland, known as IoI_1711. They have also provided a detailed descriptive statistical analysis of the series but have not attempted any stochastic time series modelling of the type undertaken by Mills (2005, 2015, 2017) for the U.K. and its regions. The purpose of this paper is to undertake such an analysis on IoI_1711 to enable a wider perspective on the evolution of the series to be obtained.

To this end, Sect. 2 outlines the model used for the analysis of the monthly IoI_1711 series and discusses how deterministic and stochastic trends and seasonal patterns may be identified, along with model estimation and testing. Section 3 provides a complementary analysis of the seasonal and annual series obtained from the monthly data. Given the long span of the data, Sect. 4 investigates the possibility of breaks and shifts in the model and in its predictability. Section 5 completes the paper by providing a summary, conclusions, and a comparison with the findings of Murphy et al. (2018).

## 2 A model for monthly rainfall

### 2.1 The basic model

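Following Mills (2005, 2015, 2017), a basic model for a monthly rainfall series *x*_{t} observed from time *t* = 1 to time *t* = *T* has been found to be

\[ x_t^{(\lambda)} = \sum_{i=1}^{12} \alpha_i s_{i,t} + \sum_{i=1}^{12} \beta_i s_{i,t}\, t + u_t \tag{1} \]

Here *x*_{t} is transformed by the Box and Cox (1964) power transformation defined as

\[ x_t^{(\lambda)} = \begin{cases} \left(x_t^{\lambda} - 1\right)/\lambda, & \lambda \ne 0 \\ \log x_t, & \lambda = 0 \end{cases} \]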

This has been applied to ameliorate the skewness found in the raw data, a consequence of *x*_{t} being bounded below at zero and possibly having a long right (positive) tail. Through its scaling property, the transformation helps to induce normality, linearity, and constancy of variance into the model. It is used here in its simplest form, as it may be generalised in a variety of directions to deal, for example, with negative values and heteroskedasticity, neither of which is needed here. The *s*_{i, t}, *t* = 1, 2, ⋯, *T*, are “dummy” variables defined to take the value 1 in month *i* and 0 elsewhere (where *i* = 1 signifies January, etc.). Their inclusion allows a deterministic monthly pattern to be modelled. The presence of the *s*_{i, t}*t* “interaction” variables allows for the possibility of different monthly linear time trends. The *α*_{i} and *β*_{i} parameters measure the intercept and slope of these trends, so that if *β*_{i} ≠ 0 then the seasonal pattern for month *i* evolves linearly over time.
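As a minimal sketch of the transformation described above, the following Python functions apply the Box-Cox power transform and its inverse. The function names and the rainfall values are illustrative assumptions and are not taken from the IoI_1711 data.

```python
import numpy as np

def box_cox(x, lam):
    """Box-Cox power transform: (x**lam - 1) / lam, or log(x) when lam == 0."""
    x = np.asarray(x, dtype=float)
    if np.any(x < 0) or (lam <= 0 and np.any(x == 0)):
        raise ValueError("data must be non-negative (strictly positive if lam <= 0)")
    if lam == 0:
        return np.log(x)
    return (x**lam - 1.0) / lam

def inv_box_cox(z, lam):
    """Inverse transform, mapping transformed values back to the original scale."""
    z = np.asarray(z, dtype=float)
    if lam == 0:
        return np.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)

# Made-up monthly rainfall totals in mm, for illustration only.
rain_mm = np.array([20.0, 64.0, 81.0, 118.0, 250.0])
transformed = box_cox(rain_mm, 0.6)
recovered = inv_box_cox(transformed, 0.6)
print(transformed)
print(recovered)
```

With a parameter below one, large totals are compressed more than small ones, which is how the transform shortens a long right tail.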

The error *u*_{t} can, in general, follow a seasonal autoregressive-moving average (ARMA) process (see, for example, Mills (2019, chapter 8) for technical details and Mills (2014) for a discussion of such models in a meteorological context):

\[ \Phi(B^{12})\,\phi(B)\,u_t = \Theta(B^{12})\,\theta(B)\,a_t \tag{2} \]

Here *ϕ*(*B*) and *θ*(*B*) are “non-seasonal” polynomials of orders *p* and *q* in the lag operator *B*, defined such that *B*^{j}*a*_{t} ≡ *a*_{t − j}, *a*_{t} being zero mean white noise (*E*(*a*_{t}) = 0, *E*(*a*_{t}*a*_{t − j}) = 0 for all *j* ≠ 0) with variance \( E\left({a}_{\mathrm{t}}^2\right)={\sigma}_{\mathrm{a}}^2 \). The “seasonal” polynomials Φ(*B*^{12}) and Θ(*B*^{12}) are of orders *P* and *Q*, their presence allowing the error to be autocorrelated at seasonal lags, such as 12, 24, ⋯, as well as being autocorrelated at non-seasonal lags.

### 2.2 Deterministic and stochastic trends and seasonality

More general models result if unit roots are allowed in the *ϕ*(*B*) and Φ(*B*^{12}) polynomials. If the non-seasonal autoregressive polynomial contains a unit root, i.e., the characteristic equation associated with *ϕ*(*B*) contains a root of unity, then *ϕ*(*B*) can be factorised as

\[ \phi(B) = (1 - B)\,\phi^{\ast}(B) \]

where *ϕ*^{∗}(*B*) is a polynomial of order *p* − 1. Equation (1) then becomes, with ∇ = 1 − *B* signifying the first-difference operator and \( {u}_{\mathrm{t}}^{\ast }=\nabla {u}_{\mathrm{t}} \),

\[ \nabla x_t^{(\lambda)} = \sum_{i=1}^{12} \alpha_i \nabla s_{i,t} + \sum_{i=1}^{12} \beta_i \nabla s_{i,t}\, t + u_t^{\ast} \tag{3} \]

Noting that ∇*s*_{i, t} = *s*_{i, t} − *s*_{i + 1, t} and ∇*s*_{i, t}*t* = (12 + *i*)(*s*_{i, t} − *s*_{i + 1, t}), where it is taken that ∇*s*_{12, t} = *s*_{12, t} − *s*_{1, t + 1}, Eq. (3) in turn becomes

Since \( \nabla {x}_{\mathrm{t}}^{\left(\uplambda \right)}=\alpha +{u}_{\mathrm{t}}^{\ast } \) would depict a random walk with drift *α*, Eq. (4) may be interpreted as implying that \( {x}_{\mathrm{t}}^{\left(\uplambda \right)} \) contains a stochastic, random walk, trend with differing seasonal drifts, i.e. each month evolves as a random walk with its own drift.

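Alternatively, suppose that the seasonal autoregressive polynomial contains a (seasonal) unit root, so that Φ(*B*^{12}) contains the factor (1 − *B*^{12}). With ∇_{12} = (1 − *B*^{12}) and \( {u}_{\mathrm{t}}^{\dagger }={\nabla}_{12}{u}_{\mathrm{t}} \), Eq. (1) becomes

\[ \nabla_{12} x_t^{(\lambda)} = \sum_{i=1}^{12} \alpha_i \nabla_{12} s_{i,t} + \sum_{i=1}^{12} \beta_i \nabla_{12} s_{i,t}\, t + u_t^{\dagger} \tag{5} \]

Since ∇_{12}*s*_{i, t} = 0 and ∇_{12}*s*_{i, t}*t* = 12*s*_{i, t}, Eq. (5) becomes

\[ \nabla_{12} x_t^{(\lambda)} = 12 \sum_{i=1}^{12} \beta_i s_{i,t} + u_t^{\dagger} \tag{6} \]

and \( {x}_{\mathrm{t}}^{\left(\uplambda \right)} \) now contains a stochastic seasonal random walk with differing seasonal drifts.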

If Φ(*B*^{12}) = Θ(*B*^{12}), then there is only deterministic seasonality (see Pierce 1978, and Mills and Mills 1992, for similar set-ups and additional analysis).
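The differencing identities for the seasonal dummies used in this section can be checked numerically. The sketch below builds the dummy and interaction variables for four illustrative years (the sample length and variable names are assumptions) and confirms that ∇_{12}*s*_{i, t} = 0, that ∇_{12}*s*_{i, t}*t* = 12*s*_{i, t}, and that ∇*s*_{i, t} = *s*_{i, t} − *s*_{i + 1, t}.

```python
import numpy as np

T = 48                        # four illustrative years of monthly data
t = np.arange(1, T + 1)       # time index t = 1, ..., T
month = (t - 1) % 12 + 1      # 1 = January, ..., 12 = December

# Column i-1 of S holds the dummy s_{i,t}: 1 in month i and 0 elsewhere.
S = (month[:, None] == np.arange(1, 13)[None, :]).astype(float)
St = S * t[:, None]           # interaction variables s_{i,t} * t

# Seasonal (lag-12) differences of the dummies and of the interactions.
d12_S = S[12:] - S[:-12]
d12_St = St[12:] - St[:-12]

# First differences of the dummies; column i of `shifted` holds s_{i+1,t},
# with December wrapping round to January.
d1_S = np.diff(S, axis=0)
shifted = np.roll(S, -1, axis=1)

print(np.all(d12_S == 0))                        # seasonal difference of dummies vanishes
print(np.allclose(d12_St, 12 * S[12:]))          # seasonal difference of trends is 12 * dummy
print(np.allclose(d1_S, S[1:] - shifted[1:]))    # first difference of dummies
```

The second identity is the reason a seasonal unit root turns each monthly slope into a constant seasonal drift of 12 times that slope.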

### 2.3 Estimating and testing the model

There are *T* = 3672 monthly observations on rainfall for the island of Ireland for the 305 years from 1711 to 2016, known as the IoI_1711 series. Figure 1 displays the histogram and empirical kernel density of the series, superimposed on which is a normal distribution with the same mean and standard deviation as IoI_1711. Although the distribution is not excessively kurtotic (the kurtosis measure is only 3.12), it is highly skewed to the right (the skewness measure is 0.52), as might be expected. Figure 2 shows the plot of the log-likelihood function for the Box-Cox transformation parameter *λ* in Eq. (1). The maximum likelihood (ML) estimate is \( \widehat{\lambda}=0.58 \) with a 95% confidence interval running from 0.53 to 0.63. (ML estimation of, and the construction of a confidence interval for, the Box-Cox transformation parameter in models such as (1) is conveniently discussed in Mills 2019, chapter 2.) For convenience, *λ* was thus set at the value of 0.6, and the histogram and empirical kernel density of the transformed series are shown in Fig. 3. Skewness has been eradicated (it is now just 0.03), and the distribution is close to the superimposed normal distribution.
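A rough sketch of how a profile log-likelihood for the Box-Cox parameter can be traced out is given below. It rests on several assumptions that are not part of the paper: the data are synthetic (generated with a true parameter of 0.6), the errors are treated as independent normal rather than ARMA, the parameter is found by grid search, and all function names are hypothetical. The 95% interval uses the standard likelihood-ratio cut-off of 1.92, half the 5% critical value of a chi-squared variable with one degree of freedom.

```python
import numpy as np

def seasonal_design(T):
    """Regressors [s_{i,t}, s_{i,t} * t] for i = 1..12 and t = 1..T."""
    t = np.arange(1, T + 1)
    S = np.eye(12)[(t - 1) % 12]
    return np.hstack([S, S * t[:, None]])

def box_cox(x, lam):
    return np.log(x) if lam == 0 else (x**lam - 1.0) / lam

def profile_loglik(x, X, lam):
    """Concentrated Gaussian log-likelihood (up to a constant) at a given lam."""
    z = box_cox(x, lam)
    resid = z - X @ np.linalg.lstsq(X, z, rcond=None)[0]
    n = len(x)
    return -0.5 * n * np.log(resid @ resid / n) + (lam - 1.0) * np.log(x).sum()

# Synthetic data: normal on the transformed scale with lam = 0.6 (assumed values).
rng = np.random.default_rng(0)
T = 3672
X = seasonal_design(T)
coef = np.concatenate([np.linspace(20.0, 26.0, 12), np.zeros(12)])
z_true = np.maximum(X @ coef + rng.normal(0.0, 5.0, T), 0.0)  # guard keeps x positive
x = (0.6 * z_true + 1.0) ** (1.0 / 0.6)

grid = np.round(np.arange(0.0, 1.21, 0.01), 2)
loglik = np.array([profile_loglik(x, X, lam) for lam in grid])
lam_hat = grid[np.argmax(loglik)]
inside = grid[loglik >= loglik.max() - 1.92]   # likelihood-ratio 95% interval
print(lam_hat, inside.min(), inside.max())
```

The term (lam − 1) times the sum of log *x* is the Jacobian of the transformation; omitting it would make likelihoods at different parameter values incomparable.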

To determine the most appropriate form of the combined model given by Eqs. (1) and (2), initial analysis using the information from the sample autocorrelation and partial autocorrelation functions, along with residual diagnostic checks from fitted models, established that the polynomial orders could be set at *p* = 3, *q* = 0, and *P* = *Q* = 1 (at most), leading to the model

\[ x_t^{(\lambda)} = \sum_{i=1}^{12} \left(\alpha_i + \beta_i t\right) s_{i,t} + u_t, \qquad \left(1 - \phi_1 B - \phi_2 B^2 - \phi_3 B^3\right) \Phi(B^{12})\, u_t = \Theta(B^{12})\, a_t \tag{7} \]

in which Φ(*B*^{12}) and Θ(*B*^{12}) are first-order polynomials in *B*^{12} with coefficients Φ and Θ. For a stochastic trend to be present, *ϕ*_{1} + *ϕ*_{2} + *ϕ*_{3} would have to be unity (the unit root condition). The estimates show clearly the absence of such a (stochastic) trend since \( {\widehat{\phi}}_1+{\widehat{\phi}}_2+{\widehat{\phi}}_3=-0.0194 \), with a standard error of 0.0285. There is, though, evidence of non-seasonal autocorrelation as both \( {\widehat{\phi}}_1 \) and \( {\widehat{\phi}}_3 \) are significantly different from zero. The estimates of both Φ and Θ are insignificantly different from zero, so that there is, in fact, no evidence of stochastic seasonality, and the seasonal pattern does not evolve stochastically over time. However, the hypothesis of no deterministic monthly trends (*β*_{1} = *β*_{2} = ⋯ = *β*_{12} = 0) may be conclusively rejected, although only four of the months have trends that are individually significant.

Table 1 Estimates of Eq. (7) (standard errors in parentheses)

Parameter | Eq. (7) | | Restricted Eq. (7) | |
---|---|---|---|---|
\( {\widehat{\alpha}}_1 \) | 21.539 | (0.640) | 21.560 | (0.622) |
\( {\widehat{\alpha}}_2 \) | 20.397 | (0.773) | 21.123 | (0.337) |
\( {\widehat{\alpha}}_3 \) | 18.477 | (0.824) | 18.522 | (0.805) |
\( {\widehat{\alpha}}_4 \) | 18.812 | (0.778) | 18.853 | (0.389) |
\( {\widehat{\alpha}}_5 \) | 19.761 | (0.697) | 19.532 | (0.358) |
\( {\widehat{\alpha}}_6 \) | 19.992 | (0.699) | 19.674 | (0.350) |
\( {\widehat{\alpha}}_7 \) | 24.734 | (0.687) | 24.651 | (0.668) |
\( {\widehat{\alpha}}_8 \) | 24.919 | (0.748) | 24.034 | (0.346) |
\( {\widehat{\alpha}}_9 \) | 23.896 | (0.684) | 23.048 | (0.327) |
\( {\widehat{\alpha}}_{10} \) | 25.332 | (0.699) | 25.838 | (0.339) |
\( {\widehat{\alpha}}_{11} \) | 25.068 | (0.740) | 25.552 | (0.350) |
\( {\widehat{\alpha}}_{12} \) | 23.260 | (0.672) | 23.323 | (0.654) |
\( {\widehat{\beta}}_1 \) | 0.00163 | (0.00032) | 0.00162 | (0.00031) |
\( {\widehat{\beta}}_2 \) | 0.00040 | (0.00033) | – | |
\( {\widehat{\beta}}_3 \) | 0.00085 | (0.00038) | 0.00083 | (0.00037) |
\( {\widehat{\beta}}_4 \) | 0.00002 | (0.00038) | – | |
\( {\widehat{\beta}}_5 \) | −0.00012 | (0.00033) | – | |
\( {\widehat{\beta}}_6 \) | −0.00017 | (0.00034) | – | |
\( {\widehat{\beta}}_7 \) | −0.00119 | (0.00034) | −0.00114 | (0.00034) |
\( {\widehat{\beta}}_8 \) | −0.00048 | (0.00034) | – | |
\( {\widehat{\beta}}_9 \) | −0.00046 | (0.00031) | – | |
\( {\widehat{\beta}}_{10} \) | 0.00027 | (0.00032) | – | |
\( {\widehat{\beta}}_{11} \) | 0.00026 | (0.00034) | – | |
\( {\widehat{\beta}}_{12} \) | 0.00131 | (0.00031) | 0.00128 | (0.00031) |
\( {\widehat{\phi}}_1 \) | 0.0477 | (0.0168) | 0.0481 | (0.0167) |
\( {\widehat{\phi}}_2 \) | −0.0212 | (0.0169) | – | |
\( {\widehat{\phi}}_3 \) | −0.0459 | (0.0169) | −0.0464 | (0.0168) |
\( \widehat{\Phi} \) | 0.0914 | (0.7056) | – | |
\( \widehat{\Theta} \) | −0.0676 | (0.7071) | – | |
\( {\widehat{\sigma}}_a \) | 6.076 | | 6.076 | |
\( {\overline{R}}^2 \) | 0.157 | | 0.157 | |

The residuals from this model exhibit no autocorrelation. The question of whether the deterministic seasonal model might contain a non-linear component was addressed by including additional quadratic and cubic trends, taking the form *s*_{i, t}*t*^{2} and *s*_{i, t}*t*^{3}, but these were found to be insignificant (an *F* test for their inclusion has a marginal significance level of just 0.65 when just quadratic trends are included and 0.60 when both quadratic and cubic trends are included).

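If the predicted value of the transformed series for month *i* in year *y*, where *y* = 1 corresponds to 1711, etc., is given by

\[ \widehat{x}_{i,y}^{(0.6)} = \widehat{\alpha}_i + \widehat{\beta}_i \left(12(y-1) + i\right) \]

then the predicted rainfall itself is given by the inverted value

\[ \widehat{x}_{i,y} = \left(0.6\,\widehat{x}_{i,y}^{(0.6)} + 1\right)^{1/0.6} \]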

January, March, and December exhibit positive trends, so that rainfall in these months has increased over the three centuries, while the trend for July is negative, indicating that this month has become progressively drier. The slopes of these (non-linear) trends are rather small; however, January rainfall is predicted to have increased from 81 to 118 mm between 1711 and 2016, March rainfall from 64 to 81 mm, and December rainfall from 91 to 121 mm. July rainfall is predicted to have declined from 99 to 74 mm over the three centuries. The remaining 8 months show constant seasonal factors.
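The predicted values quoted above can be approximately reproduced from the restricted estimates in the table. The sketch below assumes that the fitted trend on the transformed scale is the intercept plus the slope times the monthly time index *t* = 12(*y* − 1) + *i*, that the autoregressive terms are ignored, and that the result is mapped back with the inverse Box-Cox transform at *λ* = 0.6. The function and dictionary names are hypothetical.

```python
LAMBDA = 0.6

# Restricted Eq. (7) estimates (alpha_i, beta_i) for the four months with
# individually significant trends, copied from the table of estimates.
RESTRICTED = {
    1: (21.560, 0.00162),    # January
    3: (18.522, 0.00083),    # March
    7: (24.651, -0.00114),   # July
    12: (23.323, 0.00128),   # December
}

def predicted_rainfall(month, year, lam=LAMBDA):
    """Trend prediction in mm for a given month and calendar year."""
    alpha, beta = RESTRICTED[month]
    y = year - 1710                  # y = 1 corresponds to 1711
    t = 12 * (y - 1) + month         # monthly time index
    z = alpha + beta * t             # fitted trend on the transformed scale
    return (lam * z + 1.0) ** (1.0 / lam)

for month, name in [(1, "January"), (3, "March"), (7, "July"), (12, "December")]:
    start = predicted_rainfall(month, 1711)
    end = predicted_rainfall(month, 2016)
    print(f"{name}: {start:.1f} mm in 1711, {end:.1f} mm in 2016")
```

Under these assumptions the computed values agree with the figures quoted in the text to within about 1 mm. Because the inverse transform is a power function, a linear trend on the transformed scale becomes a mildly non-linear trend in millimetres.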

## 3 Modelling the seasonal and annual rainfall data

Seasonal rainfall series may be constructed from the monthly data for winter (December of year *y* − 1, January of year *y*, and February of year *y*, i.e. *win*_{y} = *x*_{12, y − 1} + *x*_{1, y} + *x*_{2, y}), spring (*spr*_{y} = *x*_{3, y} + *x*_{4, y} + *x*_{5, y}), summer (*sum*_{y} = *x*_{6, y} + *x*_{7, y} + *x*_{8, y}) and autumn (*aut*_{y} = *x*_{9, y} + *x*_{10, y} + *x*_{11, y}). An annual series may then be defined as *ann*_{y} = *x*_{1, y} + *x*_{2, y} + ⋯ + *x*_{12, y}. These series are displayed in Fig. 5 and, being annual, obviously display no seasonality. Interestingly, these series do not require transformation, since at this level of aggregation, no significant departures from normality are found in any of them, presumably because aggregation “averages out” many of the more extreme rainfall fluctuations observed at monthly frequencies.^{1}

Fitted trend lines, obtained from models estimated for each of these series, are also shown in Fig. 5.

These models are consistent with the findings from the monthly IoI_1711 series. Only the summer series exhibits any autocorrelation, and this is of just 1-year duration. Winter exhibits a positive trend in rainfall, which is approximately twice the size of the negative trend for summer. Both spring and autumn have no trends in rainfall, the positive March trend being dissipated in significance by the lack of trends in April and May rainfall. Consequently, annual rainfall has a positive trend, being approximately the average of the (absolute) winter and summer trends.
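The construction of the seasonal and annual series from monthly totals can be sketched as follows. The function name and the toy input are assumptions; the only subtlety is that winter of year *y* uses December of year *y* − 1, so the first winter value is undefined.

```python
import numpy as np

def seasonal_totals(monthly, first_year):
    """Aggregate a monthly series, starting in January and covering whole
    years, into seasonal and annual totals indexed by year."""
    x = np.asarray(monthly, dtype=float).reshape(-1, 12)  # rows: years, cols: Jan..Dec
    winter = np.full(x.shape[0], np.nan)                  # no December before year 1
    winter[1:] = x[:-1, 11] + x[1:, 0] + x[1:, 1]         # Dec(y-1) + Jan(y) + Feb(y)
    return {
        "year": first_year + np.arange(x.shape[0]),
        "winter": winter,
        "spring": x[:, 2:5].sum(axis=1),                  # Mar + Apr + May
        "summer": x[:, 5:8].sum(axis=1),                  # Jun + Jul + Aug
        "autumn": x[:, 8:11].sum(axis=1),                 # Sep + Oct + Nov
        "annual": x.sum(axis=1),                          # Jan + ... + Dec
    }

# Toy input: three years of made-up monthly values 1, 2, ..., 36.
demo = seasonal_totals(np.arange(1, 37), first_year=1711)
for name, values in demo.items():
    print(name, values)
```

Note that the annual total is a calendar-year sum, so it is not the sum of the four seasonal totals for the same year: the winter series borrows the previous December.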

## 4 Breaks and changing predictability

Over such a long sample period, somewhat in excess of three centuries, it is quite conceivable that the model may have undergone one or more shifts over time. To investigate this possibility, a model closely related to Eq. (7), referred to as Eq. (8), in which lagged dependent variables replace the ARMA error, was subjected to a sequential procedure that tests for multiple breaks at unknown dates.^{2} The statistics from this test, shown in Table 2, identify five breaks, at July 1739, December 1765, December 1786, February 1843, and August 1976. Interestingly, three of these breaks occur during the eighteenth century, a period for which Murphy et al. (2018) have “low confidence” in the reliability of the data. The seasonal trends estimated from this five-break model are shown in Fig. 6. The trends are quite volatile across the three breaks during the eighteenth century, but from the mid-1800s, the seasonal trends are rather stable within subsamples, with only one significantly negative linear trend (for September) during the fifth subsample from 1843 to 1976 and one significantly positive linear trend (for July) during the last subsample from 1976.

Table 2 Break test statistics

Break test | Test statistic | 5% critical value |
---|---|---|
0 vs. 1 | 71.24 | 28.49 |
1 vs. 2 | 52.38 | 30.65 |
2 vs. 3 | 52.07 | 31.90 |
3 vs. 4 | 37.51 | 32.83 |
4 vs. 5 | 37.16 | 33.57 |
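The first two steps of the sequential procedure described in footnote 2 amount to computing a Chow-type *F* statistic at every admissible break date and taking the largest. The sketch below illustrates this for a single break only, under stated assumptions: the data are synthetic (an intercept shift at observation 150), the regressors are just an intercept and a trend, the function names are hypothetical, and no critical values are computed, since those for the largest *F* statistic are non-standard and must be taken from published tables.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from a least squares fit of y on X."""
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return float(resid @ resid)

def sup_f_break(y, X, trim=0.05):
    """Largest Chow F statistic over candidate break dates, with trimming.

    Returns (F, tb), where the second regime starts at observation index tb."""
    T, k = X.shape
    lo = max(int(np.ceil(trim * T)), k + 1)   # smallest admissible subsample size
    rss_full = rss(y, X)
    best_f, best_tb = -np.inf, None
    for tb in range(lo, T - lo + 1):
        rss_split = rss(y[:tb], X[:tb]) + rss(y[tb:], X[tb:])
        f = ((rss_full - rss_split) / k) / (rss_split / (T - 2 * k))
        if f > best_f:
            best_f, best_tb = f, tb
    return best_f, best_tb

# Synthetic example: intercept shift of 4 units at observation 150.
rng = np.random.default_rng(1)
T = 300
t = np.arange(T)
X = np.column_stack([np.ones(T), t / T])
y = 1.0 + 0.5 * t / T + 4.0 * (t >= 150) + rng.normal(0.0, 1.0, T)

f_stat, break_at = sup_f_break(y, X)
print(f_stat, break_at)
```

To search for further breaks, the same function would be applied to each of the two subsamples on either side of the estimated break date, which is the repartitioning step of the procedure.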

Break tests were also performed on the seasonal and annual series. The only break that could be identified was for the winter series, at 1790.

There is thus a trend towards drier winters in the years up to 1790, with the trend then being reversed towards wetter winters after this break point.

The potential for changes in predictability was also investigated.^{3} To assess whether the pattern of rainfall has altered in predictability over time relative to the fitted models, moving residual standard deviations were computed both for the model fitted assuming no breaks and for the model with five breaks. The *n*-period moving residual standard deviation at time *t*, \( {\widehat{\sigma}}_{\mathrm{a},\mathrm{t},\mathrm{n}} \), is computed from the residuals in a window of *n* observations, so that the conventional full-sample residual standard deviation is obtained by setting *n* equal to the sample size *T*.

The moving standard deviations were computed for *n* = 120, i.e. for a 10-year (decadal) moving window. Variation is much more pronounced during the eighteenth century, the first half of which exhibits less unpredictability than the second half, during which unpredictability was at its greatest. The period since 1976 has also exhibited a tendency towards greater unpredictability.
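A moving residual standard deviation of this kind might be computed as in the sketch below. The exact definition is an assumption: a trailing window of *n* residuals, a divisor of *n*, and residuals treated as having zero mean. The function name and the toy residuals are hypothetical.

```python
import numpy as np

def moving_residual_sd(resid, n):
    """Trailing n-period root mean square of residuals.

    Entries before the first complete window are NaN."""
    a2 = np.asarray(resid, dtype=float) ** 2
    csum = np.concatenate([[0.0], np.cumsum(a2)])     # running sum of squares
    out = np.full(a2.shape[0], np.nan)
    out[n - 1:] = np.sqrt((csum[n:] - csum[:-n]) / n)
    return out

# Toy residuals; a decadal window on monthly data would use n = 120.
toy = np.array([1.0, -1.0, 2.0, -2.0])
print(moving_residual_sd(toy, 2))
print(moving_residual_sd(toy, len(toy)))   # n = T gives the full-sample value
```

Setting the window equal to the sample length leaves a single value, the full-sample residual standard deviation, which is the benchmark against which the moving values can be judged.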

## 5 Summary and concluding comments

Using a stochastic model that has already been successfully employed to model monthly rainfall series for the U.K. and its regions, this paper has demonstrated that this model can also be successfully fitted to the IoI_1711 series for the island of Ireland. The excess skewness in the monthly IoI_1711 data may be eradicated by using a Box-Cox transformation with parameter equal to 0.6, a value very similar to that found for the U.K. and its regions. There is no evidence of either an overall stochastic trend or of evolving monthly seasonal patterns, but positive linear trends are found for January, March, and December and a negative linear trend found for July. Analysis of the seasonal and annual series (which require no transformation) confirms the implication from the monthly data that winters have become progressively wetter and summers progressively drier, with the positive linear trend for winter being approximately twice the size of the negative summer trend. Since there is no trend in either spring or autumn rainfall, annual rainfall shows an overall positive linear trend.

Given that the IoI_1711 series exists for over three centuries, breaks in trend and structural shifts in the model were investigated. Five breaks were identified, three of which occurred in the early portion of the series during the eighteenth century. However, trends were found to be much more stable from the middle of the nineteenth century. For the seasonal series, only a single break, at 1790 for winter, was found: it was only after this break that winters became wetter; before then, winter rainfall had a negative trend. In terms of predictability, predictions from the model were found to be more volatile during the second half of the eighteenth century and again from 1976 onwards.

The formal modelling results presented in this paper may also be compared with the essentially descriptive findings of Murphy et al. (2018). The overall finding of increasingly wetter winters and drier summers complements their conclusions, as does the finding that most of the eighteenth century was characterised by drier winters, which Murphy et al. suggest may be a consequence of the under-catch of snowfall. They also point out that before 1790, confidence in the data is low, and this may have led to the finding of multiple breaks and more volatile predictions in the models during the eighteenth century. The break analysis also complements their conclusions that trends were less significant from 1850 onwards and that trends computed from recent data are not necessarily representative of long-term trends. Of course, a perennial problem with all trend fitting techniques is their projection into the future. Given the results of the break analysis, one should be wary of projecting current trends too far!

Comparisons with the findings for the U.K. regions given in Mills (2017) are necessarily limited by the much shorter sample periods available for the U.K. and the rather different focus of that paper. Perhaps the most noticeable difference is that the annual IoI_1711 series, which contains a positive trend, stands in contrast to all the U.K. regions, for which trends in rainfall are conspicuously absent.

The above findings, which complement and enhance the descriptive analysis of Murphy et al. (2018), thus conclusively demonstrate the importance and usefulness of the formal modelling of rainfall series undertaken in this paper.

## Footnotes

1. An alternative approach to aggregation would be to average the monthly fitted trends either across seasons or annually. This would, however, lose any information gained from the estimated seasonal models.

2. The testing procedure involves the following steps: (i) begin with the full sample and perform a test of parameter constancy with unknown break using a standard Chow (1960) *F* test; (ii) if the test rejects the null hypothesis of constancy, determine the break date using an Andrews (1993) modified *F* test given by the largest *F* statistic over all possible break dates; (iii) the sample is then divided at this break date into two subsamples, and single unknown breakpoint tests in each subsample are performed. Each of the tests may be viewed as a test of the alternative of *l* + 1 = 2 breaks versus the null of *l* = 1 break. A breakpoint is then added whenever a subsample null is rejected; (iv) the procedure is then repeated until no subsample rejects the null hypothesis or until the maximum number of breakpoints, here five, is reached; and (v) the break dates are then refined by re-estimation if they are obtained from a subsample containing more than one break. A “trimming percentage” is required to ensure that individual subsamples are not too small. Given the length of the series, a trim of 5% was chosen, with the tests using 5% critical values. The procedure is based on least squares estimation, which precludes ARMA errors: hence the use of lagged dependent variables in Eq. (8) to model potential autocorrelation.

3. The term “changes in predictability” refers to whether the goodness of fit of the estimated models alters, for better or worse, over the sample period.

## References

- Andrews DWK (1993) Tests for parameter instability and structural change with unknown change point. Econometrica 61:821–856
- Bai J (1997) Estimating multiple breaks one at a time. Econometric Theory 13:315–352
- Bai J, Perron P (1998) Estimating and testing linear models with multiple structural changes. Econometrica 66:47–78
- Bai J, Perron P (2003a) Comparison and analysis of multiple structural change models. J Appl Econ 18:1–22
- Bai J, Perron P (2003b) Critical values for multiple structural change tests. Econ J 6:72–78
- Box GEP, Cox DR (1964) An analysis of transformations. J R Stat Soc Ser B 26:211–246
- Chow GC (1960) Tests of equality between sets of coefficients in two linear regressions. Econometrica 28:591–605
- Mills TC (2005) Modelling precipitation trends in England and Wales. Meteorol Appl 12:169–176
- Mills TC (2014) Time series modelling of temperatures: an example from Kefalonia. Meteorol Appl 21:578–584
- Mills TC (2015) Modelling rainfall trends in England and Wales. Cogent GeoScience 1:1133218
- Mills TC (2017) Stochastic modelling of rainfall patterns across the United Kingdom. Meteorol Appl 24:580–595
- Mills TC (2019) Applied time series analysis: a practical guide to modelling and forecasting. Academic Press, Elsevier, North-Holland
- Mills TC, Mills AG (1992) Modelling the seasonal patterns in UK macroeconomic time series. J R Stat Soc Ser A 155:61–75
- Murphy C, Broderick C, Burt TP, Curley M, Duffy C, Hall J, Harrigan S, Matthews TKR, Macdonald N, McCarthy G, McCarthy MP, Mullan D, Noone S, Osborn TJ, Ryan C, Sweeney J, Thorne PW, Walsh S, Wilby RL (2018) A 305-year continuous monthly rainfall series for the island of Ireland (1711–2016). Clim Past 14:413–440
- Pierce DA (1978) Seasonal adjustment when both deterministic and stochastic seasonality are present. In: Zellner A (ed) Seasonal analysis of economic time series. US Department of Commerce, Washington, DC, pp 242–269

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.