In drought frequency analysis, important drought characteristics (e.g., duration and average severity) are generally derived from hydro-meteorological datasets using a drought index. Because drought characteristics are typically interrelated, bivariate or multivariate models are better suited to their frequency analysis than univariate ones. However, drought characteristics usually follow different types of univariate distributions, and traditional bivariate distributions, which generally require the marginals to belong to the same family, cannot accommodate them. In this study, copula functions are used to overcome this difficulty and provide reliable estimates of multivariate drought frequencies (Hangshing and Dabral 2018; Tosunoglu and Can 2016).
Copula functions
Based on Sklar’s theorem, copula functions link univariate distribution functions to form multivariate distribution functions. Their advantage is that they model the joint dependence among variables independently of the marginal distributions of those variables (Afshar and Yilmaz 2017). Given two dependent random variables U and V (e.g., drought duration and severity), their bivariate joint cumulative distribution function (CDF) can be obtained using Eq. (1):
$$ \mathrm{F}\left(\mathrm{u},\mathrm{v}\right)=\mathrm{C}\left(\mathrm{F}\left(\mathrm{u}\right),\mathrm{F}\left(\mathrm{v}\right)\right) $$
(1)
where C(.) is the copula function; F(u, v) is the joint CDF of the random variables U and V; and F(u) and F(v) are the marginal CDFs of U and V, respectively. Here, F(u) and F(v) are used as inputs to the copula function to obtain F(u, v) and are defined as follows:
$$ \mathrm{F}\left(\mathrm{u}\right)=\mathrm{P}\left(\mathrm{U}\le \mathrm{u}\right) $$
(2)
$$ \mathrm{F}\left(\mathrm{v}\right)=\mathrm{P}\left(\mathrm{V}\le \mathrm{v}\right) $$
(3)
where P(U ≤ u) and P(V ≤ v) are the probabilities that the random variables U and V take values less than or equal to u and v, respectively. Through Eq. (1), the copula function links the univariate CDFs of Eqs. (2) and (3) to the bivariate joint CDF F(u, v). Using this joint CDF, the bivariate conditional CDF can be found by applying the multiplication rule:
$$ \mathrm{F}\left(\mathrm{u}|\mathrm{v}\right)=\frac{\mathrm{F}\left(\mathrm{u},\mathrm{v}\right)}{\mathrm{F}\left(\mathrm{v}\right)} $$
(4)
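As an illustration of Eqs. (1) and (4), the following minimal Python sketch evaluates a joint and a conditional CDF with a Clayton copula. It is not the implementation used in this study (which relied on R); the function names, the standard Clayton form C(u, v) = (u^−θ + v^−θ − 1)^(−1/θ) with θ > 0, and the numerical inputs are assumptions made only for the example.

```python
def clayton_copula(u: float, v: float, theta: float) -> float:
    """Standard bivariate Clayton copula, valid here for theta > 0."""
    if theta <= 0:
        raise ValueError("theta must be positive for this form of the Clayton copula")
    if u <= 0.0 or v <= 0.0:
        return 0.0  # a copula is zero whenever one argument is zero
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def joint_cdf(f_u: float, f_v: float, theta: float) -> float:
    """Eq. (1): F(u, v) = C(F(u), F(v)), with the marginal CDF values as inputs."""
    return clayton_copula(f_u, f_v, theta)


def conditional_cdf(f_u: float, f_v: float, theta: float) -> float:
    """Eq. (4): F(u | v) = F(u, v) / F(v)."""
    return joint_cdf(f_u, f_v, theta) / f_v


# Hypothetical marginal CDF values and dependence parameter.
example_joint = joint_cdf(0.5, 0.5, theta=1.0)
example_conditional = conditional_cdf(0.5, 0.5, theta=1.0)
```

Only the marginal CDF values enter the copula, so the same function can be reused whatever univariate distributions are fitted to duration and average severity.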
After the marginal distribution of each drought characteristic is defined and its CDF is computed, copula functions can be applied for joint modeling of the drought characteristics. Many copula functions are available for modeling the joint behavior of dependent univariate variables; in this study, bivariate Archimedean copulas (i.e., Clayton, Frank, Gumbel-Hougaard, and Joe) and elliptical copulas (i.e., normal and Student t) are considered (Table 1). Among these theoretical copula functions, the best function for each scenario is selected using the root mean square error (RMSE) between the fitted copula and the empirical copula as a goodness-of-fit statistic:
$$ \mathrm{RMSE}=\sqrt{\frac{\sum_{\mathrm{i}=1}^{\mathrm{n}}{\left(\mathrm{C}{\left(\mathrm{u},\mathrm{v}\right)}_{\mathrm{t}}-\mathrm{C}{\left(\mathrm{u},\mathrm{v}\right)}_{\mathrm{e}}\right)}^2}{\mathrm{n}}} $$
(5)
where C(u, v)t and C(u, v)e are the theoretical and empirical copula values, respectively, evaluated for each of the n drought events whose characteristics are modeled jointly. In this study, the empirical copula is calculated and the theoretical copula functions are validated using the copula package (Yan 2007) in the R environment (R Core Team 2018).
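The selection criterion in Eq. (5) can be sketched as follows. This is an illustrative Python version, not the R copula package workflow used in the study: the rank/(n + 1) scaling of the observations, the helper names, and the sample values are assumptions, and ties are not treated specially.

```python
import math
from typing import Callable, Sequence


def pseudo_observations(values: Sequence[float]) -> list:
    """Scale ranks into (0, 1) as rank / (n + 1) (assumed convention)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return [r / (n + 1) for r in ranks]


def empirical_copula(us: Sequence[float], vs: Sequence[float], u: float, v: float) -> float:
    """Fraction of events whose scaled values are both at or below (u, v)."""
    return sum(1 for a, b in zip(us, vs) if a <= u and b <= v) / len(us)


def copula_rmse(us: Sequence[float], vs: Sequence[float],
                copula: Callable[[float, float], float]) -> float:
    """Eq. (5): RMSE between a theoretical copula and the empirical copula,
    evaluated at the n observed (u, v) pairs."""
    squared_error = sum(
        (copula(u, v) - empirical_copula(us, vs, u, v)) ** 2
        for u, v in zip(us, vs)
    )
    return math.sqrt(squared_error / len(us))


# Hypothetical drought durations (months) and average severities.
durations = [3, 5, 4, 8, 6]
avg_severities = [1.1, 1.6, 1.3, 2.2, 1.8]
us = pseudo_observations(durations)
vs = pseudo_observations(avg_severities)

# Two simple candidate copulas; the smaller RMSE indicates the better fit.
candidates = {
    "independence": lambda u, v: u * v,
    "comonotone": lambda u, v: min(u, v),
}
scores = {name: copula_rmse(us, vs, c) for name, c in candidates.items()}
best = min(scores, key=scores.get)
```

In practice the candidates would be the fitted copulas of Table 1 rather than the two parameter-free copulas used here for brevity.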
Table 1 Equations of copula functions. Here, u and ν are two dependent univariate variables, df is the degrees of freedom, θ and ρ are copula dependence parameters, ∅ is the CDF of the standard univariate Gaussian distribution, and tdf is the Student t distribution function
Return periods
The return period of a given drought event is associated with a specified exceedance probability. According to Shiau and Shen (2001), the return periods of drought events defined by a single drought characteristic (e.g., duration or average severity) can be estimated as a function of the drought inter-arrival time:
$$ {\mathrm{T}}_{\mathrm{U}\ge \mathrm{u}}=\frac{\mathrm{E}\left(\mathrm{DIT}\right)}{\mathrm{P}\left(\mathrm{U}\ge \mathrm{u}\right)} $$
(6)
where E(DIT) is the expected value of the drought inter-arrival time estimated from past observations, and TU ≥ u is the return period of a drought event whose characteristic U is greater than or equal to u. However, since a drought event is mostly regarded as a bivariate event characterized by duration and severity, estimating the joint return period of these characteristics is more helpful for the assessment and management of droughts. Therefore, in this study, the joint return periods are estimated using the methodology proposed by Shiau (2006):
$$ {\mathrm{T}}_{\left(\mathrm{U}\ge \mathrm{u},\kern0.5em \mathrm{V}\ge \mathrm{v}\right)}=\frac{\mathrm{E}\left(\mathrm{DIT}\right)}{\mathrm{P}\left(\mathrm{U}\ge \mathrm{u},\mathrm{V}\ge \mathrm{v}\right)} $$
(7)
$$ {\mathrm{T}}_{\left(\mathrm{U}\ge \mathrm{u}\ \mathrm{or}\ \mathrm{V}\ge \mathrm{v}\right)}=\frac{\mathrm{E}\left(\mathrm{DIT}\right)}{\mathrm{P}\left(\mathrm{U}\ge \mathrm{u}\ \mathrm{or}\ \mathrm{V}\ge \mathrm{v}\right)} $$
(8)
$$ {\mathrm{T}}_{\left(\mathrm{U}\ge \mathrm{u}\ |\ \mathrm{V}\ge \mathrm{v}\right)}=\frac{\mathrm{E}\left(\mathrm{DIT}\right)}{\mathrm{P}\left(\mathrm{U}\ge \mathrm{u}\ |\ \mathrm{V}\ge \mathrm{v}\right)} $$
(9)
where the P(.) terms in the denominators of Eqs. (7–9) are the bivariate joint probabilities of drought events with characteristics U and V (here, duration and average severity, defined below in Section 2.4) for the “and”, “or”, and conditional cases, respectively. These probabilities can be obtained from the CDFs of the drought characteristics involved. Equation (10) gives the univariate exceedance probability and Eq. (11) a bivariate probability expressed through the joint CDF FUV(u, v); once FUV(u, v) is computed with a copula function, the joint probabilities for the “and”, “or”, and conditional cases follow from Eqs. (12–14).
$$ \mathrm{P}\left(\mathrm{U}\ge \mathrm{u}\right)=1-{\mathrm{F}}_{\mathrm{U}}\left(\mathrm{u}\right) $$
(10)
$$ \mathrm{P}\left(\mathrm{U}\ge \mathrm{u}\ \mathrm{or}\ \mathrm{V}\ge \mathrm{v}\right)=1-{\mathrm{F}}_{\mathrm{UV}}\left(\mathrm{u},\mathrm{v}\right) $$
(11)
$$ \mathrm{P}\left(\mathrm{U}\ge \mathrm{u},\mathrm{V}\ge \mathrm{v}\right)=1-{\mathrm{F}}_{\mathrm{U}}\left(\mathrm{u}\right)-{\mathrm{F}}_{\mathrm{V}}\left(\mathrm{v}\right)+\mathrm{C}\left({\mathrm{F}}_{\mathrm{U}}\left(\mathrm{u}\right),{\mathrm{F}}_{\mathrm{V}}\left(\mathrm{v}\right)\right) $$
(12)
$$ \mathrm{P}\left(\mathrm{U}\ge \mathrm{u}\ \mathrm{or}\ \mathrm{V}\ge \mathrm{v}\right)=1-\mathrm{C}\left({\mathrm{F}}_{\mathrm{U}}\left(\mathrm{u}\right),{\mathrm{F}}_{\mathrm{V}}\left(\mathrm{v}\right)\right) $$
(13)
$$ \mathrm{P}\left(\mathrm{U}\ge \mathrm{u}|\mathrm{V}\ge \mathrm{v}\right)=\frac{\mathrm{P}\left(\mathrm{U}\ge \mathrm{u},\mathrm{V}\ge \mathrm{v}\right)}{\mathrm{P}\left(\mathrm{V}\ge \mathrm{v}\right)}=\frac{1-{\mathrm{F}}_{\mathrm{U}}\left(\mathrm{u}\right)-{\mathrm{F}}_{\mathrm{V}}\left(\mathrm{v}\right)+\mathrm{C}\left({\mathrm{F}}_{\mathrm{U}}\left(\mathrm{u}\right),{\mathrm{F}}_{\mathrm{V}}\left(\mathrm{v}\right)\right)}{1-{\mathrm{F}}_{\mathrm{V}}\left(\mathrm{v}\right)} $$
(14)
The equations above are the main drivers of the joint probability analysis and hence of the joint return period analysis. With these formulations, the joint probability of any drought event (e.g., a drought event with a duration of more than 7 months and an average severity of 1.25) can be calculated. Once the joint probability has been calculated, inserting it into Eqs. (7–9) gives the return period of that event for the three scenarios (“and”, “or”, and conditional).
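The chain from Eqs. (12–14) to Eqs. (7–9) can be written compactly. The Python sketch below is illustrative only; the function name and all input values (expected inter-arrival time, marginal CDF values, and copula value) are hypothetical, and the return periods carry the time unit of E(DIT).

```python
def joint_return_periods(e_dit: float, f_u: float, f_v: float, c_uv: float) -> dict:
    """Return periods for the "and", "or", and conditional cases.

    e_dit : expected drought inter-arrival time, E(DIT)
    f_u   : marginal CDF value F_U(u)
    f_v   : marginal CDF value F_V(v)
    c_uv  : copula value C(F_U(u), F_V(v))
    """
    p_and = 1.0 - f_u - f_v + c_uv   # Eq. (12): P(U >= u, V >= v)
    p_or = 1.0 - c_uv                # Eq. (13): P(U >= u or V >= v)
    p_cond = p_and / (1.0 - f_v)     # Eq. (14): P(U >= u | V >= v)
    return {
        "and": e_dit / p_and,          # Eq. (7)
        "or": e_dit / p_or,            # Eq. (8)
        "conditional": e_dit / p_cond  # Eq. (9)
    }


# Hypothetical inputs: E(DIT) = 2 time units, F_U = 0.80, F_V = 0.75, C = 0.65.
periods = joint_return_periods(e_dit=2.0, f_u=0.80, f_v=0.75, c_uv=0.65)
```

With these inputs the “and” return period is about 20 time units, the “or” return period about 5.7, and the conditional return period about 5, which reflects that the “and” case is always at least as rare as the “or” case.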
Standardized precipitation index
The standardized precipitation index (SPI) is one of the most commonly used drought indices and is based on the normalization of precipitation probabilities. Although the SPI is generally calculated from monthly precipitation data (for a range of timescales), it can also be produced from daily or weekly precipitation data (WMO, 2006). Owing to its simplicity and utility, the World Meteorological Organization (WMO) recommends the SPI as the primary meteorological drought index that all countries should use in monitoring drought conditions (Hayes et al. 2011).
The SPI calculation method is the same regardless of the time interval at which precipitation values are reported. Following the definition of McKee et al. (1993), the accumulated precipitation amounts are first calculated for the chosen time step (in this study, a 12-month time step is selected for the drought analysis). The accumulated precipitation time series is then fitted to the desired distribution function, and finally the probabilities of the accumulated precipitation observations are normalized using the inverse CDF of the standard normal distribution. In this study, the SPI calculations are done using the SPEI package (Santiago and Vicente-Serrano 2017) in the R environment. For more details about the SPI and other drought indices and how they are calculated, see WMO (2006).
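The three steps described above (accumulation, probability estimation, and normal transformation) can be sketched in Python as follows. This is a simplified illustration and not the SPEI package procedure: in place of a fitted parametric distribution it uses empirical plotting-position probabilities, rank/(n + 1), as a loudly labeled stand-in, and it does not treat calendar months separately. The function names and the input series are hypothetical.

```python
from statistics import NormalDist
from typing import Sequence


def rolling_sum(precip: Sequence[float], window: int) -> list:
    """Step 1: accumulated precipitation over a moving window of `window` steps."""
    return [sum(precip[i - window + 1: i + 1]) for i in range(window - 1, len(precip))]


def spi_empirical(precip: Sequence[float], window: int = 12) -> list:
    """SPI-like index using empirical probabilities (assumption: a stand-in for
    the fitted distribution function described in the text)."""
    accumulated = rolling_sum(precip, window)
    n = len(accumulated)

    # Step 2 (stand-in): non-exceedance probability from the rank of each value.
    order = sorted(range(n), key=lambda i: accumulated[i])
    probs = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        probs[idx] = rank / (n + 1)

    # Step 3: map probabilities to standard normal quantiles.
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(p) for p in probs]


# Hypothetical monthly precipitation totals (mm), 15 months.
monthly_precip = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0,
                  9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
spi_12 = spi_empirical(monthly_precip, window=12)
```

Replacing step 2 with the CDF of a fitted distribution (for example a gamma distribution) would recover the parametric procedure described in the text.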
Drought analysis
Drought events generally have multiple characteristics, including drought duration (the length of the dry period), drought severity (the sum of SPI values during the dry period), drought average severity (the average of SPI values during the dry period), and drought intensity (the minimum SPI value during the dry period). Drought severity and intensity are closely associated with drought duration and average severity: severity is the product of duration and average severity, and intensity is highly correlated with average severity (Afshar et al. 2016). Hence, this study is based on two drought characteristics, duration and average severity. A dry period is defined as a period with continuously negative SPI values and at least one SPI value below −1. Although many periods in the SPI time series satisfy this condition, dry periods shorter than 3 months are not considered drought events, which avoids an excessive number of droughts and gives reliable information about the total number of drought events and their distribution over different time periods. The two drought characteristics investigated in this study and the overall procedure of the drought analysis are presented schematically in Fig. 1.
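The event definition above translates directly into a run-based scan of the SPI series. The following Python sketch is an illustration under stated assumptions rather than the code used in this study: the function name, the dictionary keys, and the sample series are hypothetical, a monthly time step is assumed, and severities are kept as signed (negative) SPI sums.

```python
from typing import Sequence


def extract_drought_events(spi: Sequence[float],
                           min_duration: int = 3,
                           threshold: float = -1.0) -> list:
    """Identify drought events as runs of negative SPI values that contain at
    least one value below `threshold` and last at least `min_duration` steps."""
    events = []
    run = []
    # A non-negative sentinel is appended so that a run reaching the end of
    # the series is also closed and evaluated.
    for value in list(spi) + [0.0]:
        if value < 0:
            run.append(value)
            continue
        if len(run) >= min_duration and min(run) < threshold:
            severity = sum(run)  # signed sum of SPI values over the dry period
            events.append({
                "duration": len(run),
                "severity": severity,
                "average_severity": severity / len(run),
                "intensity": min(run),
            })
        run = []
    return events


# Hypothetical monthly SPI series.
spi_series = [0.5, -0.2, -1.2, -0.6, 0.1, -0.3, -0.4, 0.2,
              -1.5, -1.1, 0.3, -0.5, -1.4, -0.9, -0.2]
events = extract_drought_events(spi_series)
```

In this sample series, the two-month runs are discarded by the duration rule even when they fall below −1, and only the three-month and four-month dry periods are retained as drought events.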
Given that frequency and return period analyses require long time series of drought characteristics, in this study the drought characteristics of events that occurred at different stations are pooled to generate longer drought characteristic series and hence a more robust areal average analysis. Moreover, to visualize the univariate and joint cumulative probabilities, the pooled drought characteristics are fitted to different univariate probability distributions to generate synthetic continuous data for each drought characteristic. Among the univariate distribution functions considered (i.e., gamma, log-normal, logistic, normal, and Weibull), the best distribution is selected based on the chi-squared statistic between the theoretical and empirical cumulative distribution function values, for each joint dependence and scenario separately.