Generalised Pareto distribution: impact of rounding on parameter estimation
Analysis of a real dry spell duration series reveals problems that arise when common methods for fitting a generalised Pareto (GP) distribution (e.g. maximum likelihood and L-moments) are applied to discrete (rounded) data sets. The analysis is then repeated on GP time series obtained by systematic Monte Carlo (MC) simulations. The solution depends on two factors: (1) the actual amount of rounding, determined by the actual data range (measured by the scale parameter, σ) relative to the rounding increment (Δx); and (2) whether a (sufficiently high) threshold is applied, so that the series of excesses is considered instead of the original series. For a moderate amount of rounding (e.g. σ/Δx ≥ 4), which is commonly met in practice (at least for dry spell data), and where no threshold is applied, the classical methods work reasonably well. If thresholding is applied to rounded data, which is actually essential when dealing with a GP distribution, then classical methods applied in a standard way can lead to erroneous estimates, even if the rounding itself is moderate. In this case, it is necessary to adjust the theoretical location parameter for the series of excesses. An alternative solution is to add appropriate uniform noise to the rounded data (so-called jittering). This, in a sense, reverses the process of rounding, after which the common methods can be applied directly. Finally, if the rounding is too coarse (e.g. σ/Δx ≈ 1), none of the above recipes works, and methods designed specifically for rounded data should be applied.
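The effect described above can be illustrated with a small Monte Carlo sketch. It is not the authors' procedure: the parameter values (ξ = 0.1, σ = 4, Δx = 1, threshold u = 5), all function names, and the use of L-moment estimators with location fixed at zero are assumptions made for illustration. In particular, the abstract does not state the form of the location adjustment; the sketch assumes that, for a threshold u lying on the rounding grid, rounded values above u correspond to true values above u + Δx/2, so the excesses are measured from u + Δx/2 rather than from u.

```python
import numpy as np


def sample_gp(n, xi, sigma, rng):
    """Draw n GP(xi, sigma) variates with location 0 by inverse transform (xi != 0)."""
    u = rng.random(n)
    return sigma / xi * ((1.0 - u) ** (-xi) - 1.0)


def round_to_grid(x, dx):
    """Round each value to the nearest multiple of the increment dx."""
    return dx * np.round(x / dx)


def jitter(x_rounded, dx, rng):
    """Add uniform noise on (-dx/2, dx/2) to rounded values."""
    return x_rounded + rng.uniform(-dx / 2, dx / 2, size=x_rounded.shape)


def fit_gp_lmom(excesses):
    """L-moment estimates (xi, sigma) of a GP distribution with location fixed at 0."""
    x = np.sort(excesses)
    n = x.size
    b0 = x.mean()                                # first probability-weighted moment
    b1 = np.sum(np.arange(n) / (n - 1) * x) / n  # second probability-weighted moment
    lam1, lam2 = b0, 2.0 * b1 - b0               # first two sample L-moments
    xi = 2.0 - lam1 / lam2
    sigma = lam1 * (1.0 - xi)
    return xi, sigma


def compare_fits(n=200_000, xi=0.1, sigma=4.0, dx=1.0, u=5.0, seed=0):
    """Fit excesses over u (assumed to lie on the rounding grid) in three ways."""
    rng = np.random.default_rng(seed)
    x = sample_gp(n, xi, sigma, rng)
    r = round_to_grid(x, dx)
    above = r[r > u]
    xj = jitter(r, dx, rng)
    return {
        # Standard approach: excesses of rounded data measured from u.
        "naive": fit_gp_lmom(above - u),
        # Assumed adjustment: measure excesses from the effective threshold u + dx/2.
        "adjusted": fit_gp_lmom(above - (u + dx / 2)),
        # Jittering first, then thresholding at u as for continuous data.
        "jittered": fit_gp_lmom(xj[xj > u] - u),
    }


for name, (xi_hat, sigma_hat) in compare_fits().items():
    print(f"{name:9s} xi = {xi_hat:6.3f}  sigma = {sigma_hat:6.3f}")
```

For a GP distribution the excesses over a threshold t are again GP with the same ξ and scale σ + ξt, so the reference scale is about 4.5 for the jittered fit and about 4.55 for the adjusted fit. With these settings the naive fit should deviate markedly from those values even though σ/Δx = 4 counts as moderate rounding.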
The constructive comments from two anonymous reviewers are gratefully acknowledged.
This work has been supported in part by the Croatian Science Foundation under the project 2831. K. Cindrić received funding from the European Union’s Horizon 2020 research and innovation program under the grant agreement no. 653824/EU-CIRCLE.