
Is intraday data useful for forecasting VaR? The evidence from EUR/PLN exchange rate

  • Barbara Będowska-Sójka
Original Article

Abstract

In this paper, we evaluate alternative volatility forecasting methods within the Value at Risk (VaR) framework by calculating one-step-ahead forecasts of daily VaR for the EUR/PLN foreign exchange rate over a 4-year period. Using several risk models, including GARCH specifications, realized volatility models, and hybrids of the two, we examine whether incorporating intraday data produces better one-step-ahead volatility forecasts at the daily horizon than using daily data alone. The volatility forecasts are compared within the VaR framework in a two-stage procedure: statistical accuracy tests are conducted first, and loss functions are then computed. We find that GARCH models produce better backtesting results than models for realized volatility. When the loss functions of the models that passed the first-stage filtering procedure are compared, there is no distinct winner of the race. We also find no evidence that the skewed Student t distribution assumption within GARCH models provides better VaR forecasts than the symmetric Student t.

Keywords

VaR · Intraday data · Realized volatility · GARCH · ARFIMA · HAR-RV · Jumps

Introduction

Since Basel II, banks have been obliged to calculate Value at Risk (VaR) forecasts of expected losses. There is therefore strong pressure to improve the existing estimation techniques and backtesting procedures. If returns are assumed to belong to a location-scale family, as is often the case, VaR is a linear function of the estimated volatility. A variety of approaches to estimating unobservable volatility are available, based on different assumptions and different information sets. The availability of intraday prices in recent years has contributed to the development of a new class of volatility measures using data sampled at different frequencies (Andersen and Bollerslev 1998; Barndorff-Nielsen and Shephard 2004). It remains an open question which approach, based on daily or intradaily data, generates better volatility forecasts and thus better VaR forecasts. On the one hand, high-frequency data contain more information on the actual changes in prices within a day, but using intraday data both raises the cost of data gathering and restricts the analysis to assets for which such data are available. On the other hand, the application of daily data requires less computing power, but the data are not as informative.

Although extensive research has been carried out on the comparison of the usefulness of intraday and daily data, the conclusions are not consistent. Giot and Laurent (2004) indicate that, in the case of stock indices and exchange rates, an adequate ARCH-type model that accounts for the asymmetry of returns delivers VaR forecasts as good as those from models based on realized variance. Hansen and Lunde (2005), in an analysis of exchange rates, show similar results: they find no evidence that GARCH(1,1) is outperformed by more sophisticated models based on intraday data. A contrary view is presented by McMillan et al. (2008), who show that using intraday data improves daily volatility and VaR forecasts relative to daily data. Fuertes and Olmo (2012) also show that ARFIMA models produce better backtesting results than GARCH models; however, they emphasize that the GARCH models prevail in terms of the independence of the hit sequence. Clements et al. (2008) compare HAR and MIDAS models to a simple AR(5) and show that the latter yields satisfactory VaR forecasts. Louzis et al. (2014) find that both the realized variance and the augmented GARCH models combined with filtered historical simulation or extreme value theory quantile estimation produce equally good VaR forecasts. Brownlees and Gallo (2009) propose the daily range for VaR prediction and show that this simple volatility measure performs as well as computationally complicated and costly measures based on ultra-high-frequency data, and better than baseline GARCH models. Będowska-Sójka (2015) finds that in the stock market VaR estimates based on daily and intradaily returns give comparable results; however, when loss functions are considered, the models based on daily data minimize the regulatory loss function, whereas the models based on realized volatility minimize the opportunity cost of capital. Kambouroudis et al. (2016) find that the preferred model for forecasting volatility is one that combines an asymmetric GARCH model with implied and realized volatility through an (asymmetric) ARMA model. Wong et al. (2016) show that although RV models outperform GARCH models in volatility forecasting, within the VaR framework the EGARCH model outperforms other approaches.

The main purpose of the paper is to compare the accuracy of volatility forecasts for EUR/PLN exchange rate returns within the VaR framework.1 The forecasts are based on different information sets consisting of daily and intradaily returns. In the comparison of the VaR forecasts, a two-stage procedure is undertaken (Sarma et al. 2003; Wong et al. 2016): first, the unconditional and conditional coverage tests are conducted, and then the models that passed the first-stage filtering procedure are compared on the basis of the loss functions.

This paper contributes to the literature in several ways. First, for volatility estimates calculated on the basis of daily data, we consider several GARCH models with different distributional assumptions: Gaussian, symmetric Student t, and skewed Student t. Second, for volatility estimates calculated from equally sampled intraday returns, we consider two types of stochastic models with specifications fitted to model-free measures of volatility: ARFIMA models (Andersen et al. 2003; Fuertes and Olmo 2012; Ahoniemi et al. 2016) and the heterogeneous autoregressive realized variance (HAR-RV) models with and without jumps (Andersen et al. 2007; Corsi 2009; Patton and Sheppard 2015). Third, we obtain hybrid VaR forecasts that combine GARCH estimates with those obtained from realized volatility models and examine whether there are potential advantages of mixing forecasts. Our analysis extends the existing research by focusing on a relatively less liquid FX rate and broadening the class of models considered.

We find that VaR forecasts from GARCH models based on daily data have better backtesting results than those based on ARFIMA or HAR models for realized volatility. However, neither the well-fitted GARCH models nor the realized volatility models prevail across all loss functions. The EWMA specification for daily data passes all backtesting procedures and obtains the lowest firm loss function, but simultaneously achieves the highest values of the binomial and regulatory loss functions. Also, contrary to several works (Giot and Laurent 2003; Louzis et al. 2014; Wong et al. 2016), we find no evidence that the skewed Student t distribution assumption in GARCH models provides better VaR forecasts than other distributions, namely the Gaussian or the symmetric Student t. In line with Clements et al. (2008), we find that a simple AR(5) model for realized variance performs as well as an ARFIMA model, showing that a long-memory specification is not obligatory. We also show that combining both approaches, based on daily and intradaily returns, into hybrid models offers the possibility of obtaining lower regulatory loss functions.

The rest of the paper is organized as follows: the “Models for volatility forecasting” section presents the competing volatility models used in the study, the “Data” section describes the data, and the “Empirical results” section presents the methods of VaR evaluation and the empirical results. The last section concludes.

Models for volatility forecasting

Our VaR modeling approach builds on the contributions of Giot and Laurent (2004), Fuertes and Olmo (2012), and Wong et al. (2016). Let \(r_{t}\) be the daily return at time t. The models are specified as follows:
$$\begin {aligned} & r_{t} = \mu_{t} + \varepsilon_{t} \\ & \varepsilon_{t} = \sigma_{t} z_{t} \end {aligned}$$
(1)
where \(\mu_{t}\) is the conditional mean of returns and \(\sigma_{t}\) is the conditional standard deviation at time t of the innovations \(\varepsilon_{t}\), based on the information set \(\varOmega_{t - 1}\), whereas \(z_{t}\) is an independently and identically distributed (i.i.d.) unit variance random variable that follows one of three density functions: Gaussian, symmetric Student t, or skewed Student t. The form of the conditional volatility \(\sigma_{t}^{2}\) differs among specifications: \(\sigma_{t}^{2}\) is either a GARCH-type conditional variance of the daily return, or the conditional expectation of realized volatility forecasted from ARFIMA or HAR models.
Value at Risk quantifies the exposure of a portfolio to future market fluctuations (Sarma et al. 2003). The VaR of \(r_{t}\) is the \(\alpha\)-quantile of the conditional distribution \(F_{t} ( \cdot )\) of the daily financial return process \(r_{t}\), given the agent’s information set \(\varOmega_{t - 1}\). We forecast the \(\alpha\)-quantile of \(F_{t} ( \cdot )\) defined as \(VaR_{t,\alpha } \equiv F_{t}^{ - 1} (\alpha )\) with \(P(r_{t} \le VaR_{t,\alpha } \left| {\varOmega_{t - 1} } \right.) = \alpha\) (Ahoniemi et al. 2016). The one-day-ahead VaR is defined as
$${\text{VaR}}_{t} (\alpha ) = \mu_{{t\left| {t - 1} \right.}} + \sigma_{{t\left| {t - 1} \right.}} F_{z}^{ - 1} (\alpha ) ,$$
(2)
where \(F_{z}^{ - 1} (\alpha )\) is the \(\alpha\)-quantile estimate from the given distribution. The accuracy of VaR predictions depends on the method used to generate volatility forecasts and the assumed conditional distribution. In the paper, we consider conditional approaches that deliver forecasts for both tails of the distribution in order to measure risk for the long and the short position at the \(\alpha\) quantile.
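As an illustration (not the Ox code used in the study), a minimal Python sketch of Eq. (2): the one-day-ahead VaR is the forecasted conditional mean plus the forecasted conditional standard deviation scaled by the α-quantile of the assumed innovation distribution. The degrees-of-freedom value below is illustrative.

```python
import numpy as np
from scipy import stats

def var_forecast(mu, sigma, alpha=0.05, dist="norm", nu=8.0):
    """One-day-ahead VaR as in Eq. (2): VaR_t(alpha) = mu_{t|t-1} + sigma_{t|t-1} * F_z^{-1}(alpha).

    mu, sigma : forecasts of the conditional mean and standard deviation,
    alpha     : tail probability (e.g. 0.05/0.01 for the long, 0.95/0.99 for the short position),
    dist      : 'norm' or 't'; nu is an illustrative degrees-of-freedom value for the Student t.
    """
    if dist == "norm":
        q = stats.norm.ppf(alpha)
    else:
        # quantile of a unit-variance (standardized) Student t
        q = stats.t.ppf(alpha, df=nu) * np.sqrt((nu - 2.0) / nu)
    return mu + sigma * q

# example: 5% long-position and 95% short-position VaR for a zero-mean return, sigma = 0.4% per day
print(var_forecast(0.0, 0.004, alpha=0.05), var_forecast(0.0, 0.004, alpha=0.95))
```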

ARCH-type models

We present the GARCH-type specifications for the conditional variance of the innovations \(\varepsilon_{t}\) used in the study in Table 1. The exponentially weighted moving average (EWMA) model is an application of the well-known RiskMetrics methodology popularized by JP Morgan in 1996. Under some restrictions, EWMA is also equivalent to the Integrated GARCH(1,1) model. In our specification, the autoregressive parameter is set at 0.94 following the JP Morgan recommendations. The symmetric GARCH(p, q) model introduced by Bollerslev (1986) is a plain vanilla model, often serving as a benchmark volatility specification (Hansen and Lunde 2005). We also consider the exponential GARCH (EGARCH) model that allows for asymmetric effects between positive and negative returns, the so-called leverage effect (Nelson 1991), where \(\gamma_{1}\) and \(\gamma_{2}\) are real constants. Both \(\varepsilon_{t}\) and \(\left| {\varepsilon_{t - i} } \right| - E\left| {\varepsilon_{t - i} } \right|\) are zero-mean i.i.d. sequences with continuous distributions. Next, the IGARCH(p, q) model is included as an extension of EWMA. As long memory in volatility is one of the stylized facts known in the literature, the fifth model in this group is FIGARCH(p, d, q), which is capable of accommodating the persistence in volatility (Baillie et al. 1996), where L is the backshift operator, d is the fractional integration parameter, \(0 \le d \le 1\), \(\varPhi (L) = 1 - \sum\nolimits_{i = 1}^{q} {\phi_{i} L^{i} }\) and \(B(L) = 1 - \sum\nolimits_{i = 1}^{p} {\beta_{i} L^{i} }\).
Table 1  GARCH specifications of conditional volatility

EWMA (Eq. 3): \(\sigma_{t}^{2} = 0.06r_{t - 1}^{2} + 0.94\sigma_{t - 1}^{2}\)

GARCH(p, q) (Eq. 4): \(\sigma_{t}^{2} = \omega + \sum\limits_{i = 1}^{q} {\alpha_{i} r_{t - i}^{2} } + \sum\limits_{j = 1}^{p} {\beta_{j} \sigma_{t - j}^{2} }\)

EGARCH(p, q) (Eq. 5): \(\log \sigma_{t}^{2} = \alpha_{0} + \sum\limits_{i = 1}^{q} {\alpha_{i} \left\{ {\gamma_{1} \varepsilon_{t - i} + \gamma_{2} \left[ {\left| {\varepsilon_{t - i} } \right| - E\left( {\left| {\varepsilon_{t - i} } \right|} \right)} \right]} \right\}} + \sum\limits_{j = 1}^{p} {\beta_{j} \log \sigma_{t - j}^{2} }\)

IGARCH(p, q) (Eq. 6): \(\sigma_{t}^{2} = \omega + \sum\limits_{i = 1}^{p} {(1 - \beta_{i} )r_{t - i}^{2} } + \sum\limits_{j = 1}^{q} {\beta_{j} \sigma_{t - j}^{2} }\)

FIGARCH(p, d, q) (Eq. 7): \((1 - L)^{d} \varPhi (L)r_{t}^{2} = \omega + B(L)(r_{t}^{2} - \sigma_{t}^{2} )\)

In the estimation, we use a rolling sample of size T (T equals 537 observations) and obtain one-step-ahead conditional standard deviation forecasts from the underlying specifications (Eqs. (3)–(7)) together with VaR forecasts following Eq. (2). On the basis of information criteria, for all GARCH specifications we assume \(p = 1\) and \(q = 1\). In all cases, we test the specifications of the models with respect to autocorrelation of the residuals and squared residuals (Ljung–Box test).
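A minimal sketch of this rolling forecasting scheme for the simplest specification, the EWMA of Eq. (3), combined with Eq. (2); it assumes a zero conditional mean and a Gaussian quantile, and initializes the recursion with the in-window sample variance (the study instead estimates the full set of GARCH models in G@RCH).

```python
import numpy as np
from scipy import stats

def ewma_var_forecasts(returns, window=537, lam=0.94, alpha=0.05):
    """Rolling one-step-ahead EWMA volatility (Eq. 3) and the implied VaR (Eq. 2).

    returns : 1-d array of daily returns; window : length of the rolling estimation sample;
    lam     : smoothing parameter fixed at 0.94 as in the RiskMetrics recommendation.
    With 1037 daily observations and window = 537 this yields 500 out-of-sample forecasts.
    """
    q = stats.norm.ppf(alpha)                # Gaussian quantile; other distributions analogous
    sigmas, var_fcasts = [], []
    for t in range(window, len(returns)):
        r = returns[t - window:t]
        s2 = np.var(r)                       # initialize the recursion with the in-window variance
        for x in r:
            s2 = (1.0 - lam) * x ** 2 + lam * s2
        sigma = np.sqrt(s2)                  # one-step-ahead conditional standard deviation
        sigmas.append(sigma)
        var_fcasts.append(sigma * q)         # zero conditional mean assumed in this sketch
    return np.array(sigmas), np.array(var_fcasts)
```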

Models for realized variance and realized bipower variation

The second group of models uses conditional volatility estimates based on intraday data. The theory underlying realized variance and realized bipower variation is based on continuous-time processes (Barndorff-Nielsen and Shephard 2004; Andersen and Benzoni 2009). The realized variance \(RV_{t} (\Delta )\) is calculated by summing the squared intradaily returns observed within a day at a given frequency (\(\Delta\)):
$${\text{RV}}_{t} (\Delta ) = \sum\limits_{n = 1}^{1/\Delta } {r_{t,n}^{2} } ,$$
(8)
where \(r_{t,n}\) is the n-th intraday return on day t. This volatility estimator converges in probability (as the sampling frequency of the return series increases, \(\Delta \to 0\)) to the quadratic variation process that characterizes the latent true variance, \(RV_{t} (\Delta ) \to \int_{t - 1}^{t} {\sigma^{2} (s)\,{\text{d}}s} + \sum\nolimits_{t - 1 < s \le t} {\kappa^{2} (s)}\), where the first term, \(\int_{t - 1}^{t} {\sigma^{2} (s)\,{\text{d}}s}\), is called the integrated variance, whereas \(\sum\nolimits_{t - 1 < s \le t} {\kappa^{2} (s)}\) describes the jump process at times s, \(t - 1 < s \le t\). In the absence of jumps, realized variance is a consistent estimator of the integrated variance. This result is fundamental for modeling and forecasting realized variance (Andersen et al. 2003). However, as jumps are quite common in financial return series, Barndorff-Nielsen and Shephard (2004) introduced another measure, called realized bipower variation, which is an estimate of the integrated variance that is robust to jumps:
$${\text{RBV}}_{t} (\Delta ) = \frac{\pi }{2}\sum\limits_{n = 1}^{1/\Delta - 1} {\left| {r_{t,n} } \right|\left| {r_{t,n + 1} } \right|} .$$
(9)
These two, realized variance, \({\text{RV}}_{t} (\Delta )\), and realized bipower variation, \({\text{RBV}}_{t} (\Delta )\), are used in the estimation of the jump component in the following way:
$${\text{RV}}_{t} (\Delta ) - {\text{RBV}}_{t} (\Delta ) \to \sum\limits_{t - 1 < s < t} {\kappa^{2} (s)}$$
In order to prevent the jump estimates from being negative, Barndorff-Nielsen and Shephard (2004) truncate the measurement of the jumps J at zero:
$$J_{t} = \hbox{max} [{\text{RV}}_{t} (\Delta ) - {\text{RBV}}_{t} (\Delta ),0] .$$
(10)
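A minimal sketch of Eqs. (8)–(10) for one trading day of 5-min returns; the simulated returns at the end are purely illustrative.

```python
import numpy as np

def realized_measures(intraday_returns):
    """Daily realized variance (Eq. 8), realized bipower variation (Eq. 9)
    and the truncated jump component (Eq. 10) from one day of 5-min returns."""
    r = np.asarray(intraday_returns)
    rv = np.sum(r ** 2)
    rbv = (np.pi / 2.0) * np.sum(np.abs(r[1:]) * np.abs(r[:-1]))   # adjacent absolute returns
    jump = max(rv - rbv, 0.0)
    return rv, rbv, jump

# example with 288 simulated 5-min returns standing in for one trading day
rng = np.random.default_rng(0)
rv, rbv, j = realized_measures(rng.normal(scale=3e-4, size=288))
```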
Two stylized facts should be mentioned here. The first is that the unconditional distribution of the logarithms of realized volatilities (e.g., RV and RBV) is nearly Gaussian (Giot and Laurent 2004); therefore, for the purpose of modeling and forecasting, we use the logarithms of RV and RBV. The second is that realized volatility is characterized by long memory (Andersen et al. 2001). In financial markets, either traders are heterogeneous in the sense of having different investment horizons (Müller et al. 1995), or information arrival is heterogeneous in nature (Andersen and Bollerslev 1998). This heterogeneity induces long memory in the series. There are two commonly used classes of models that take the long memory feature into account: ARFIMA models and HAR-RV models. In this paper, we use both. Their specifications are given in Table 2.
Table 2  Models for realized volatility and realized bipower variation

ARFIMA-RV (Eq. 11): \(\varPhi (L)(1 - L)^{\delta } \ln ({\text{RV}}_{t} ) = \varTheta (L)e_{t}\); in the estimated ARFIMA(1, δ, 0) form, \((1 - \varphi L)(1 - L)^{\delta } (s_{t} - \omega ) = e_{t}\), \(e_{t} \sim {\text{NID}}(0,\sigma_{e}^{2} )\)

ARFIMA-RBV (Eq. 12): \(\varPhi (L)(1 - L)^{\delta } \ln ({\text{RBV}}_{t} ) = \varTheta (L)e_{t}\); in the estimated ARFIMA(1, δ, 0) form, \((1 - \varphi L)(1 - L)^{\delta } (s_{t} - \omega ) = e_{t}\), \(e_{t} \sim {\text{NID}}(0,\sigma_{e}^{2} )\)

AR5RV (Eq. 13): \(\ln ({\text{RV}}_{t} ) = \varphi_{0} + \sum\nolimits_{i = 1}^{5} {\varphi_{i} \ln ({\text{RV}}_{t - i} )} + e_{t}\), \(e_{t} \sim {\text{NID}}(0,\sigma_{e}^{2} )\)

HAR-RV (Eq. 14): \(\ln ({\text{RV}}_{t} ) = \beta_{0} + \beta_{d} \ln ({\text{RV}}_{t - 1} ) + \beta_{w} \ln ({\text{RV}}_{t - 1}^{w} ) + \beta_{m} \ln ({\text{RV}}_{t - 1}^{m} ) + e_{t}\), \(e_{t} \sim {\text{NID}}(0,\sigma_{e}^{2} )\)

HAR-RV-J (Eq. 15): \(\ln ({\text{RV}}_{t} ) = \beta_{0} + \beta_{d} \ln ({\text{RV}}_{t - 1} ) + \beta_{w} \ln ({\text{RV}}_{t - 1}^{w} ) + \beta_{m} \ln ({\text{RV}}_{t - 1}^{m} ) + \beta_{j} \ln (J_{t - 1} + 1) + e_{t}\), \(e_{t} \sim {\text{NID}}(0,\sigma_{e}^{2} )\)

First, in the ARFIMA(m, δ, s) process, L is the backshift operator, \((1 - L)^{\delta }\) is the fractional integration operator, \(\delta\) is the fractional integration parameter, and \(e_{t}\) is a stationary process. Within the study, we use ARFIMA(m, δ, s) models for realized volatility and realized bipower variation (hereafter ARFIMA-RV and ARFIMA-RBV, respectively). This allows us to check whether capturing the long memory feature of RV or RBV improves the VaR forecasts. Based on information criteria, we end up with the ARFIMA(1, δ, 0) specification for both RV and RBV. As Clements et al. (2008) indicate that simple autoregressive models for RV provide accurate volatility forecasts for the majority of currencies, we also consider an autoregressive process with 5 lags for the logs of realized volatility (hereafter AR5RV).

The second class of models that takes the long memory feature into account, used to forecast RV, is the heterogeneous autoregressive model of realized variance (hereafter HAR-RV) (Corsi 2009). This regression model mixes information on volatility components from different frequencies. In the HAR-RV specification, \({\text{RV}}_{t}\) is the daily realized volatility (for simplicity we omit \(\Delta\)); \({\text{RV}}_{t}^{w}\) and \({\text{RV}}_{t}^{m}\) are the weekly and monthly volatility components, respectively, defined as simple averages of the daily measures, that is \({\text{RV}}_{t}^{w} = \frac{1}{5}({\text{RV}}_{t} + {\text{RV}}_{t - 1d} + \cdots + {\text{RV}}_{t - 4d} )\) and \({\text{RV}}_{t}^{m} = \frac{1}{22}({\text{RV}}_{t} + {\text{RV}}_{t - 1d} + \cdots + {\text{RV}}_{t - 21d} )\). The HAR-RV models are additionally extended, as in Andersen et al. (2007), by including the jump component. The heterogeneous autoregressive model of realized variance with jumps, hereafter HAR-RV-J, is specified in Eq. (15), where Jt stands for the jumps calculated as in Eq. (10). It is assumed that all RV models fully capture the conditional volatility through linear functions. For all realized variance models, we estimate the ARFIMA or HAR specification, obtain the volatility forecasts, and compute the VaR forecasts (Eq. 2).
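A minimal sketch of the HAR-RV regression of Eq. (14): the weekly and monthly components are 5- and 22-day simple averages of the daily measure, logged before entering the regression, and the coefficients are estimated here by OLS.

```python
import numpy as np

def har_rv_forecast(rv):
    """HAR-RV regression of Eq. (14) on logarithmic realized variance, estimated by OLS.

    rv : 1-d array of daily realized variances (levels). Returns the OLS coefficients
    (beta_0, beta_d, beta_w, beta_m) and a one-step-ahead forecast of ln(RV).
    """
    rv = np.asarray(rv, dtype=float)
    lrv = np.log(rv)
    X, y = [], []
    for t in range(22, len(rv)):
        X.append([1.0,
                  lrv[t - 1],                      # daily component  ln(RV_{t-1})
                  np.log(np.mean(rv[t - 5:t])),    # weekly component ln(RV_{t-1}^w)
                  np.log(np.mean(rv[t - 22:t]))])  # monthly component ln(RV_{t-1}^m)
        y.append(lrv[t])
    beta, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)
    x_next = np.array([1.0, lrv[-1], np.log(np.mean(rv[-5:])), np.log(np.mean(rv[-22:]))])
    return beta, float(x_next @ beta)
```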

Data

The raw database consists of 5-min equally sampled bid and ask prices of the EUR/PLN exchange rate. Our sample period starts on 2010-10-01 and ends on 2014-09-30 (1037 days). The primary data come from the Swiss forex broker Dukascopy. We calculate the logarithmic middle price p as the logarithm of the geometric average of the bid (pbid) and ask (pask) prices. As such, it is considered a good approximation of the true price (Dacorogna et al. 2001). At time t, the middle price is defined as
$$p_{t(i)} = \frac{{\log p_{{{\text{bid}},t(i)}} + \log p_{{{\text{ask}},t(i)}} }}{2} = \log \sqrt {p_{{{\text{bid}},t(i)}} p_{{{\text{ask}},t(i)}} },$$
(16)
where t(i) is a homogeneous sequence of times regularly spaced by intervals of size \(\Delta_{t}\). We use the logarithmic middle price; thus, the logarithmic return at time t(i) is defined as
$$r_{t(i)} = p_{t(i)} - p_{{t(i) - \Delta_{t} }}.$$
We use business time instead of physical time and omit weekends and holidays. For each day, we obtain 288 5-min returns. The choice of delta is widely discussed in the literature; we follow much of the existing research and use a 5-min sampling frequency (Andersen et al. 2001, 2007; Corsi 2009). This choice balances the bias coming from microstructure noise, which arises if delta is chosen too small, against sufficient power to detect jumps. Daily returns are calculated as the sum of the intradaily logarithmic returns. The latter are also used to calculate realized volatility and realized bipower variation, the non-parametric estimates of daily volatility.
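A minimal sketch of this data construction: the logarithmic middle price of Eq. (16) and the 5-min logarithmic returns, whose within-day sum gives the daily return.

```python
import numpy as np

def log_mid_prices(bid, ask):
    """Logarithmic middle price of Eq. (16): the log of the geometric mean of bid and ask."""
    return 0.5 * (np.log(np.asarray(bid)) + np.log(np.asarray(ask)))

def intraday_log_returns(log_mid):
    """5-min logarithmic returns r_{t(i)} = p_{t(i)} - p_{t(i)-Delta_t} for one day of prices."""
    return np.diff(np.asarray(log_mid))

# the daily return is the sum of that day's intraday returns:
# daily_return = intraday_log_returns(log_mid_prices(bid, ask)).sum()
```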

Empirical application

The risk models are estimated using the G@RCH 6.2 package of Ox (GARCH and ARFIMA models) as well as PcGive (Doornik and Hendry 2005; Laurent 2010). The VaR forecasts are then computed in Ox. We estimate GARCH-type models for daily returns, ARFIMA models for realized variance and realized bipower variation, and HAR models for realized volatility. All models are tested for possible misspecification. At the end, we choose one model that is well specified and has the best fit according to information criteria. We then generate one-step-ahead forecasts of volatility from 2012-10-24 to 2014-09-30, which gives 500 forecasts from each approach. Every model presented in the “Models for volatility forecasting” section is re-estimated in a moving window consisting of 537 observations.

VaR evaluation

As in Sarma et al. (2003), we use a two-stage model selection and evaluation procedure. In the first stage, the models are tested for statistical accuracy, whereas in the second, three subjective loss functions are used. In the evaluation of the VaR forecasts, we consider the test for unconditional coverage (Kupiec 1995), the test for conditional coverage (Christoffersen 1998), and the Dynamic Quantile test, DQ (Engle and Manganelli 2004). The tests and loss functions used in the study are described below.

VaR specification tests

VaR is considered to be adequate if the sequence of hits (also called failures or exceedances) exhibits correct unconditional coverage and serial independence. The Kupiec (1995) test focuses on unconditional coverage (henceforth KUC) and requires computing the empirical failure rate, which is the fraction of returns falling below (for the long position) or above (for the short position) the forecasted one-day-ahead VaR. In the case of a correctly specified VaR, the fraction of failures, \(f\), is equal to the assumed failure probability \(\alpha\):
$$\begin{aligned} H_{0} :f = \alpha \hfill \\ H_{1} :f \ne \alpha \hfill \\ \end{aligned}$$
The test statistic is
$${\text{LR}} = 2\left( {\ln \left( {\left( {\frac{N}{T}} \right)^{N} \left( {1 - \frac{N}{T}} \right)^{T - N} } \right) - \ln (\alpha^{N} (1 - \alpha )^{T - N} )} \right)\sim \chi^{2}_1,$$
where N is the number of exceedances (hits) of the forecasted VaR, and T is the number of observations.
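A minimal sketch of the KUC test, assuming a boolean series of exceedances has already been produced from the VaR forecasts.

```python
import numpy as np
from scipy import stats

def kupiec_test(hits, alpha):
    """Kupiec (1995) unconditional coverage test (KUC).

    hits  : boolean array with True on days when the VaR forecast is exceeded,
    alpha : nominal failure probability. Returns the LR statistic and its chi^2(1) p-value.
    """
    T, N = len(hits), int(np.sum(hits))
    if 0 < N < T:
        f = N / T
        ll_emp = N * np.log(f) + (T - N) * np.log(1.0 - f)   # log-likelihood at the empirical rate
    else:
        ll_emp = 0.0                                          # all or no hits: the term vanishes
    ll_nom = N * np.log(alpha) + (T - N) * np.log(1.0 - alpha)
    lr = 2.0 * (ll_emp - ll_nom)
    return lr, 1.0 - stats.chi2.cdf(lr, df=1)
```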

Christoffersen’s (1998) exceedance independence test (henceforth CEI) is a likelihood ratio test that looks for unusually frequent consecutive exceedances. Defining the indicator function \(I_{t + 1} (\alpha )\) for the exceedances associated with the VaR forecasts, \(I_{t + 1} (\alpha ) \equiv 1(r_{t + 1} \le VaR_{t + 1} \left| {\varOmega_{t - 1} } \right.)\), we have \(q_{0}^{*} = \Pr (I_{t} = 0\left| {I_{t - 1} = 0)} \right.\) and \(q_{1}^{*} = \Pr (I_{t} = 0\left| {I_{t - 1} = 1)} \right.\). These are the VaR measure’s conditional coverages—its actual probabilities of not experiencing an exceedance, given that it did not (in the case of \(q_{0}^{*}\)) or did (in the case of \(q_{1}^{*}\)) experience an exceedance in the previous period. Our null hypothesis H0 is that \(q_{0}^{*} = q_{1}^{*} = q^{*}\). If a VaR measure is observed for \(\pi + 1\) periods, there will be \(\pi\) pairs of consecutive observations (\(I_{t - 1}\),\(I_{t}\)). Disaggregate these as follows: \(\pi_{00} + \pi_{01} + \pi_{10} + \pi_{11} = \pi\), where \(\pi_{00}\) is the number of pairs (\(I_{t - 1}\),\(I_{t}\)) of the form (0, 0); \(\pi_{01}\) is the number of the form (0, 1); etc. To test if \(\pi_{00} /(\pi_{00} + \pi_{01} ) \approx \pi_{10} /(\pi_{10} + \pi_{11} )\), we estimate the following: \(\hat{q}_{0}^{*} = \frac{{\pi_{00} }}{{\pi_{00} + \pi_{01} }}\), \(\hat{q}_{1}^{*} = \frac{{\pi_{10} }}{{\pi_{10} + \pi_{11} }}\), and \(\hat{q}_{{}}^{*} = \frac{{\pi_{00} + \pi_{10} }}{{\pi_{00} + \pi_{01} + \pi_{10} + \pi_{11} }}\).

The likelihood ratio is
$$\begin{aligned} \varLambda = \,& \frac{{(1 - \hat{q}^{*} )^{{\pi_{01} + \pi_{11} }} (\hat{q}^{*} )^{{\pi_{00} + \pi_{10} }} }}{{(1 - \hat{q}_{0}^{*} )^{{\pi_{01} }} (\hat{q}_{0}^{*} )^{{\pi_{00} }} (1 - \hat{q}_{1}^{*} )^{{\pi_{11} }} (\hat{q}_{1}^{*} )^{{\pi_{10} }} }} \\ = \, & \left( {\frac{{\hat{q}^{*} }}{{\hat{q}_{0}^{*} }}} \right)^{{\pi_{00} }} \left( {\frac{{1 - \hat{q}^{*} }}{{1 - \hat{q}_{0}^{*} }}} \right)^{{\pi_{01} }} \left( {\frac{{\hat{q}^{*} }}{{\hat{q}_{1}^{*} }}} \right)^{{\pi_{10} }} \left( {\frac{{1 - \hat{q}^{*} }}{{1 - \hat{q}_{1}^{*} }}} \right)^{{\pi_{11} }} \\ \end{aligned}$$
and under \(H_{0}\) the test statistic \(-2\log (\varLambda )\) follows the \(\chi_{1}^{2}\) distribution.
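A minimal sketch of the CEI test built directly from the transition counts defined above.

```python
import numpy as np
from scipy import stats

def christoffersen_test(hits):
    """Christoffersen (1998) exceedance independence test (CEI) from the transition counts.

    hits : boolean array of VaR exceedances. Returns -2 log(Lambda) and its chi^2(1) p-value.
    """
    h = np.asarray(hits, dtype=int).tolist()
    pairs = list(zip(h[:-1], h[1:]))
    n00, n01 = pairs.count((0, 0)), pairs.count((0, 1))
    n10, n11 = pairs.count((1, 0)), pairs.count((1, 1))
    # probabilities of *no* exceedance, conditional and unconditional, as defined in the text
    q0 = n00 / (n00 + n01) if (n00 + n01) else 0.0
    q1 = n10 / (n10 + n11) if (n10 + n11) else 0.0
    q = (n00 + n10) / (n00 + n01 + n10 + n11)

    def loglik(p, zeros, ones):
        # `zeros` transitions end in no-exceedance (prob. p), `ones` end in an exceedance (1 - p)
        out = 0.0
        if zeros:
            out += zeros * np.log(p)
        if ones:
            out += ones * np.log(1.0 - p)
        return out

    lr = -2.0 * (loglik(q, n00 + n10, n01 + n11)
                 - loglik(q0, n00, n01) - loglik(q1, n10, n11))
    return lr, 1.0 - stats.chi2.cdf(lr, df=1)
```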

The above CEI test is based on a first-order Markov chain and therefore captures only dependence between consecutive exceedances. Hence, we also use the Dynamic Quantile test, DQ (Engle and Manganelli 2004), which allows us to examine whether the present exceedances of the VaR measure are correlated with past ones. Using the previously defined indicator function \(I_{t} (\alpha )\), the hit variables are obtained as follows: \({\text{Hit}}_{t} (\alpha ) = I_{t} (r_{t} \le {\text{VaR}}_{t} (\alpha )) - \alpha\). Two hypotheses are tested jointly: \({\text{H}}_{01} :E({\text{Hit}}_{t} (\alpha )) = 0\) and \({\text{H}}_{02} :{\text{Hit}}_{t} (\alpha )\) is uncorrelated with the variables included in the information set. Both hypotheses can be tested jointly within an artificial regression: \({\text{Hit}}_{t} (\alpha ) = {\mathbf{Z}}\lambda + \varepsilon_{t}\), where Z is a \(T \times k\) matrix whose first column consists of ones and whose next \(p\) columns consist of the past exceedances \({\text{Hit}}_{t - 1} , \ldots ,{\text{Hit}}_{t - p}\). The \(k - p - 1\) remaining columns contain additional explanatory variables (e.g., past returns, squared past returns, the VaR forecast itself). Engle and Manganelli (2004) show that the DQ test statistic satisfies \(DQ = \frac{\hat{\lambda }^{T} {\mathbf{Z}}^{T} {\mathbf{Z}}\hat{\lambda }}{\alpha (1 - \alpha )}\sim \chi^{2} (k)\), where \(\hat{\lambda }\) is the OLS estimate of \(\lambda\).
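A simplified sketch of the DQ test for the long position, using a constant, a few lagged hit variables, and the contemporaneous VaR forecast as regressors in the artificial regression; the number of lags is an illustrative choice.

```python
import numpy as np
from scipy import stats

def dq_test(returns, var_forecasts, alpha, lags=4):
    """Simplified Engle and Manganelli (2004) dynamic quantile (DQ) test for the long position.

    Regresses Hit_t = I(r_t <= VaR_t) - alpha on a constant, `lags` lagged hits and the
    contemporaneous VaR forecast, then tests whether all coefficients are jointly zero.
    """
    r, v = np.asarray(returns), np.asarray(var_forecasts)
    hit = (r <= v).astype(float) - alpha
    X, y = [], []
    for t in range(lags, len(hit)):
        X.append(np.r_[1.0, hit[t - lags:t][::-1], v[t]])   # constant, Hit_{t-1..t-lags}, VaR_t
        y.append(hit[t])
    X, y = np.array(X), np.array(y)
    lam, *_ = np.linalg.lstsq(X, y, rcond=None)             # OLS estimate of lambda
    dq = float(lam @ (X.T @ X) @ lam) / (alpha * (1.0 - alpha))
    k = X.shape[1]
    return dq, 1.0 - stats.chi2.cdf(dq, df=k)
```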

The loss functions

In the second stage of the VaR forecast evaluation process, we use the so-called ‘loss functions’ (Lopez 1999; Sarma et al. 2003). These functions are defined on the basis of risk managers’ utility functions. Each loss function reflects a different approach of the risk manager, but all are defined with a negative orientation: they assign higher scores when a failure occurs. Consequently, the VaR model that minimizes the value of the loss function is considered attractive. The loss functions should be compared under the condition that the VaR specification passes all statistical accuracy tests.

In the study, we use three loss functions. The binomial loss function (BL) proposed by Lopez (1999) penalizes all failures equally:
$$f_{t}^{BL} = \begin{cases} 0, & \text{if } r_{t} > {\text{VaR}}_{t} (\alpha ) \\ 1, & \text{if } r_{t} \le {\text{VaR}}_{t} (\alpha ) \end{cases}$$
(17)

Assuming that VaR is correctly specified, the fewer failures observed, the lower the binomial loss, and thus the better the model.

The regulatory loss function (RL) proposed by Sarma et al. (2003) reflects the regulator’s utility function by paying attention to the magnitude of failures. It is constructed in the following way:
$$f_{t}^{RL} = \begin{cases} 0, & \text{if } r_{t} > {\text{VaR}}_{t} (\alpha ) \\ 1 + (r_{t} + {\text{VaR}}_{t} (\alpha ))^{2} , & \text{if } r_{t} \le {\text{VaR}}_{t} (\alpha ) \end{cases}$$
(18)

The squared term, computed when a failure occurs, penalizes large failures more than small ones. The larger the exceedance, the higher the loss function value. For two models with the same BL, RL might be decisive.

The third loss function in our study is the firm’s loss function (FL), which not only penalizes failures but also takes into account the opportunity cost of capital, measured by the constant c (Sarma et al. 2003):
$$f_{t}^{FL} = \begin{cases} c \cdot {\text{VaR}}_{t} (\alpha ), & \text{if } r_{t} > {\text{VaR}}_{t} (\alpha ) \\ 1 + (r_{t} + {\text{VaR}}_{t} (\alpha ))^{2} , & \text{if } r_{t} \le {\text{VaR}}_{t} (\alpha ) \end{cases}$$
(19)

In the study, we assume that c = 1. From the risk manager’s point of view, it is important not only how large the failures are, but also how conservative the VaR measure is. In addition to what is captured by the RL function, the FL also includes a penalty for being overly cautious and ‘freezing’ too much capital. BL and RL usually point to the same models, as they focus on minimizing the number of exceedances and the size of the failures, respectively. The third loss function, FL, aims to find a balance between safety and profit maximization and may therefore attain its lowest values for different models.
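A minimal sketch of the three loss functions for the long position, summed over the evaluation sample; the squared term is read here as the squared distance between the return and the VaR forecast, and the opportunity cost on no-failure days as c times the absolute value of the VaR forecast (so that a negative long-position VaR yields a positive cost).

```python
import numpy as np

def loss_functions(returns, var_forecasts, c=1.0):
    """Binomial (BL), regulatory (RL) and firm (FL) losses in the spirit of Eqs. (17)-(19),
    written for the long position: a failure is a return at or below the VaR forecast."""
    r, v = np.asarray(returns), np.asarray(var_forecasts)
    fail = r <= v
    penalty = 1.0 + (r - v) ** 2                 # squared distance between return and VaR (sketch's reading)
    bl = fail.astype(float)                      # Eq. (17): one point per failure
    rl = np.where(fail, penalty, 0.0)            # Eq. (18)
    fl = np.where(fail, penalty, c * np.abs(v))  # Eq. (19): |VaR| proxies the capital set aside
    return bl.sum(), rl.sum(), fl.sum()
```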

Empirical results

VaR forecasts from singular models

The one-day-ahead, out-of-sample VaR forecasts are evaluated within the two-stage model selection and evaluation procedure described previously. Table 3 presents the results of the three statistical accuracy tests for the short position, specifically the estimated p-values of the following tests: the Kupiec (1995) unconditional coverage test (KUC), the Christoffersen (1998) exceedance independence test (CEI), and the Engle and Manganelli (2004) dynamic quantile test (DQ) for two VaR levels. As shown in Table 3, the statistical accuracy tests are passed by all approaches at the 0.95 α-quantile and by eight out of fifteen at the 0.99 α-quantile. The poor performance of several models in the latter case results from the clustering of exceedances detected by the DQ test. In those cases, the VaR models are not sufficiently responsive to changing market circumstances.
Table 3  Statistical accuracy tests for the short position

Model | KUC (0.95) | CEI (0.95) | DQ (0.95) | KUC (0.99) | CEI (0.99) | DQ (0.99)
EWMA (N) | 0.84 | 0.86 | 0.97 | 0.05 | 0.38 | 0.05
GARCH (N) | 0.05 | 0.67 | 0.25 | 0.64 | 0.87 | 0.00
GARCH (St) | 0.20 | 0.81 | 0.85 | 0.64 | 0.87 | 0.00
EGARCH (St) | 0.84 | 0.95 | 0.91 | 1.00 | 0.83 | 1.00
IGARCH (St) | 0.84 | 0.95 | 0.96 | 0.64 | 0.87 | 0.00
FIGARCH (St) | 0.13 | 0.77 | 0.61 | 0.64 | 0.87 | 0.00
GARCH (skSt) | 0.05 | 0.67 | 0.25 | 0.33 | 0.90 | 0.96
EGARCH (skSt) | 0.20 | 0.43 | 0.48 | 0.33 | 0.90 | 0.96
IGARCH (skSt) | 0.29 | 0.86 | 0.95 | 0.64 | 0.87 | 0.00
FIGARCH (skSt) | 0.08 | 0.72 | 0.44 | 0.33 | 0.90 | 0.96
HAR-RV | 0.53 | 0.34 | 0.70 | 0.13 | 0.93 | 0.60
HAR-J-RV | 0.40 | 0.91 | 0.57 | 0.13 | 0.93 | 0.60
ARFIMA-RV | 0.13 | 0.77 | 0.14 | 0.64 | 0.87 | 0.00
ARFIMA-RBV | 0.40 | 0.67 | 0.38 | 0.64 | 0.74 | 0.00
AR5RV | 0.08 | 0.72 | 0.44 | 0.13 | 0.93 | 0.60

Note This table summarizes the statistical accuracy tests by presenting p-values for the Kupiec (1995) unconditional coverage test (KUC), the Christoffersen (1998) exceedance independence test (CEI), and the Engle and Manganelli (2004) dynamic quantile test (DQ). The tests are applied to the one-day-ahead out-of-sample forecasts obtained from rolling windows of 537 observations, in two quantiles, 0.95 and 0.99. The full description of the models used in the study is given in the “Models for volatility forecasting” section. N, St, and skSt in the first column indicate the distribution of zt in Eq. (1): Gaussian, symmetric Student t, and skewed Student t, respectively. The values in italics indicate that a given model does not pass at least one of the statistical accuracy tests

Table 4 shows the values of the three loss functions, binomial, regulatory, and firm, again for both quantiles, 0.95 and 0.99. The best three models within a given criterion are marked with their rank in brackets (only the first 3 ranks are presented). Among the models that passed the accuracy tests in the first stage at the 0.95 α-quantile, the lowest values of the binomial and regulatory loss functions are observed for the GARCH model with the Gaussian and skewed Student t distributions, whereas the lowest firm loss function is observed for EWMA, which at the same time generates the highest values of the binomial and regulatory loss functions among all methods. Within the RV and RBV model class, the simple autoregressive model for RV (AR5RV) ranks third in both the binomial and regulatory loss functions. When the 0.99 α-quantile is considered, the lowest values of the binomial and regulatory loss functions are observed for three RV models: HAR-RV, HAR-J-RV, and AR5RV. As far as the firm loss function is concerned, EWMA obtains the lowest value, followed by EGARCH with the Student t distribution and AR5RV. Summing up, for the short position there is no distinct winner of the race, as both GARCH-class models and RV models are among those with the lowest loss function values.
Table 4  The values of loss functions for the short position

Model | BL (0.95) | RL (0.95) | FL (0.95) | BL (0.99) | RL (0.99) | FL (0.99)
EWMA (N) | 26 | 28.47 | 310.85 (1) | 10 | 10.78 | 421.56 (1)
GARCH (N) | 16 (1) | 17.66 (1) | 342.25 | 4 | 4.42 | 473.99
GARCH (St) | 19 | 20.89 | 330.16 | 4 | 4.33 | 500.56
EGARCH (St) | 19 | 26.05 | 332.29 | 6 | 6.57 | 466.44 (2)
IGARCH (St) | 24 | 26.06 | 323.00 (2) | 4 | 4.35 | 490.58
FIGARCH (St) | 18 | 19.93 | 327.80 | 3 | 3.28 | 505.37
GARCH (skSt) | 16 (1) | 17.70 (2) | 340.88 | 3 | 3.26 | 525.39
EGARCH (skSt) | 19 | 20.83 | 343.75 | 3 | 3.39 | 519.74
IGARCH (skSt) | 20 | 21.84 | 333.54 | 4 | 4.27 | 516.79
FIGARCH (skSt) | 17 (3) | 18.73 | 339.48 | 3 | 3.26 | 521.30
HAR-RV | 22 | 23.34 | 363.87 | 2 (1) | 2.49 (3) | 501.84
HAR-J-RV | 21 | 22.26 | 365.05 | 2 (1) | 2.43 (2) | 504.05
ARFIMA-RV | 18 | 19.28 | 342.14 | 4 | 4.23 | 472.26
ARFIMA-RBV | 21 | 22.65 | 327.70 (3) | 4 | 4.36 | 448.63
AR5RV | 17 (3) | 18.16 (3) | 346.55 | 2 (1) | 2.25 (1) | 480.14 (3)

Note The full description of the models used in the study is given in the “Models for volatility forecasting” section. BL, RL, and FL show the values of the binomial, regulatory, and firm loss functions, respectively, in two quantiles, 0.95 and 0.99. The values in italics indicate that a given model does not pass the statistical accuracy tests. The values in brackets show the rank of the model within a given criterion (only the first 3 ranks are presented)

Table 5 presents the results of the statistical accuracy tests for the long position at the 0.05 and 0.01 α-quantiles. At the former quantile, two GARCH-class models and four out of five RV or RBV models are rejected at the first stage. In most cases, two tests are violated: neither is the number of exceedances proper, nor are the VaR models responsive enough to market dynamics. For the 0.01 α-quantile, all considered models are correctly specified.
Table 5  Statistical accuracy tests for the long position

Model | KUC (0.05) | CEI (0.05) | DQ (0.05) | KUC (0.01) | CEI (0.01) | DQ (0.01)
EWMA (N) | 0.20 | 0.81 | 0.08 | 0.11 | 0.71 | 0.31
GARCH (N) | 0.01 | 0.59 | 0.01 | 0.64 | 0.87 | 1.00
GARCH (St) | 0.05 | 0.50 | 0.06 | 0.33 | 0.90 | 0.96
EGARCH (St) | 0.05 | 0.50 | 0.06 | 0.64 | 0.87 | 1.00
IGARCH (St) | 0.13 | 0.77 | 0.24 | 0.33 | 0.90 | 0.96
FIGARCH (St) | 0.13 | 0.45 | 0.23 | 0.13 | 0.93 | 0.60
GARCH (skSt) | 0.08 | 0.48 | 0.13 | 0.64 | 0.87 | 1.00
EGARCH (skSt) | 0.20 | 0.81 | 0.04 | 0.64 | 0.87 | 1.00
IGARCH (skSt) | 0.13 | 0.77 | 0.24 | 0.64 | 0.87 | 1.00
FIGARCH (skSt) | 0.20 | 0.43 | 0.33 | 0.64 | 0.87 | 1.00
HAR-RV | 0.01 | 0.52 | 0.01 | 0.33 | 0.90 | 0.96
HAR-J-RV | 0.00 | 0.47 | 0.00 | 0.33 | 0.90 | 0.96
ARFIMA-RV | 0.00 | 0.62 | 0.00 | 0.13 | 0.93 | 0.60
ARFIMA-RBV | 0.08 | 0.59 | 0.15 | 0.64 | 0.80 | 1.00
AR5RV | 0.00 | 0.62 | 0.00 | 0.13 | 0.93 | 0.60

Note This table summarizes the statistical accuracy tests by presenting p-values for the Kupiec (1995) unconditional coverage test (KUC), the Christoffersen (1998) exceedance independence test (CEI), and the Engle and Manganelli (2004) dynamic quantile test (DQ). The tests are applied to the one-day-ahead out-of-sample forecasts obtained from rolling windows of 537 observations, in two quantiles, 0.05 and 0.01. The full description of the models used in the study is given in the “Models for volatility forecasting” section. N, St, and skSt in the first column indicate the distribution of zt in Eq. (1): Gaussian, symmetric Student t, and skewed Student t, respectively. The values in italics indicate that a given model does not pass at least one of the statistical accuracy tests

Table 6 presents the values of the loss functions for the long position. At the 0.05 α-quantile, EGARCH and GARCH with the Student t distribution have the lowest values of the BL and RL functions, while the lowest firm loss function is observed for the IGARCH model with the skewed Student t distribution. At the 0.01 α-quantile, the VaR from the FIGARCH model obtains the lowest value of the regulatory loss, whereas EWMA again attains the lowest firm loss function, followed by EGARCH with the skewed Student t distribution and the ARFIMA-RBV model.
Table 6  The values of loss functions for the long position

Model | BL (0.05) | RL (0.05) | FL (0.05) | BL (0.01) | RL (0.01) | FL (0.01)
EWMA (N) | 19 | 20.25 | 306.15 (2) | 9 | 9.21 | 421.62 (1)
GARCH (N) | 13 | 13.68 | 339.67 | 4 | 4.05 | 473.31
GARCH (St) | 16 (2) | 16.84 (1) | 328.67 | 3 (3) | 3.03 | 500.90
EGARCH (St) | 12 (1) | 16.99 (2) | 316.60 | 4 | 4.10 | 465.60
IGARCH (St) | 18 | 18.94 | 319.80 | 3 (3) | 3.03 | 490.82
FIGARCH (St) | 16 (2) | 18.74 | 327.13 | 2 (1) | 2.01 (1) | 506.51
GARCH (skSt) | 17 | 18.02 | 315.27 | 4 | 4.06 | 471.53
EGARCH (skSt) | 19 | 20.22 | 301.78 | 4 | 4.15 | 447.48 (2)
IGARCH (skSt) | 18 | 19.16 | 305.74 (1) | 4 | 4.07 | 460.03
FIGARCH (skSt) | 19 | 19.93 | 313.46 (3) | 4 | 4.03 | 466.09
HAR-RV | 13 | 13.36 | 358.83 | 3 (3) | 3.01 | 501.39
HAR-J-RV | 12 | 12.37 | 360.30 | 3 (3) | 3.02 | 503.75
ARFIMA-RV | 12 | 12.50 | 338.65 | 2 (1) | 2.06 (3) | 472.91
ARFIMA-RBV | 17 | 17.62 (3) | 324.28 | 4 | 4.07 | 449.85 (3)
AR5RV | 12 | 12.43 | 343.99 | 2 | 2.05 (2) | 480.29

Note The full description of the models used in the study is given in the “Models for volatility forecasting” section. BL, RL, and FL show the values of the binomial, regulatory, and firm loss functions, respectively, in two quantiles, 0.05 and 0.01. The values in italics indicate that a given model does not pass the statistical accuracy tests. The values in brackets show the rank of the model within a given criterion (only the first 3 ranks are presented)

Summarizing, within the GARCH group three models have always qualified after the first stage: GARCH and FIGARCH with the skewed Student t distribution and EGARCH with the symmetric Student t distribution. Among the VaR forecasts based on RV and RBV models, each has been rejected at least once. Thus, we confirm the result of Fuertes and Olmo (2012) that GARCH models prevail in terms of the independence of the hit sequence. When the loss functions are taken into account, the scoring obtained for the GARCH models is ambiguous, while for the realized volatility models the best results are clearly obtained for AR5RV, followed by the HAR-RV and HAR-J-RV models. Taking into account both categories, statistical accuracy tests and loss functions, for both the short and the long position there is no distinct winner in the single-model race. Thus, we move toward a hybrid approach.

Forecast combinations: hybrid models

Mixing forecasts in our study allows us to use different information sets and different methodological approaches: one coming from daily volatility obtained from non-linear ARCH-type models, and the other based on intraday information from realized volatility models. The combination of forecasts might improve forecast accuracy if the advantages of each single approach are properly exploited. We examine fifteen hybrid models that mix the two broad classes. The selection is based on the results obtained so far and takes into account the best models among those using daily data: from the GARCH class we choose the three models that pass all tests for both the long and the short position (GARCH with the skewed Student t distribution, EGARCH with the Student t distribution, and FIGARCH with the skewed Student t distribution). As no single model using intradaily data passes all tests, we consider all possible specifications. Thus, the hybrid models combine ARFIMA-RV with GARCH (hereafter RV-GARCH), EGARCH (RV-EGARCH) or FIGARCH (RV-FIGARCH), and, by analogy, the following models are combined with these three GARCH specifications: AR5RV (AR5RV-GARCH, AR5RV-EGARCH, AR5RV-FIGARCH), HAR-RV (HAR-GARCH, HAR-EGARCH, HAR-FIGARCH), HAR-J-RV (HAR-J-GARCH, HAR-J-EGARCH, HAR-J-FIGARCH), and ARFIMA-RBV (RBV-GARCH, RBV-EGARCH, RBV-FIGARCH).
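The hybrid forecast is simply a weighted combination of the two VaR quantile forecast series; a minimal sketch, assuming the equal weights discussed in the next paragraph:

```python
import numpy as np

def hybrid_var(var_garch, var_rv, w=0.5):
    """Combine two VaR quantile forecast series (e.g. GARCH-based and RV-based) with weight w."""
    return w * np.asarray(var_garch) + (1.0 - w) * np.asarray(var_rv)
```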

As Elliott and Timmermann (2004) argue, there is theoretically always a gain from mixing forecasts, unless one of them is better than the other. Following the suggestions in the literature, we use equal weights of 0.5 to obtain the hybrid VaR quantile forecasts (Elliott and Timmermann 2004). Again, we evaluate and compare the models using the two-stage VaR backtesting procedure. Table 7 presents the results of the tests for the short position and both quantiles, 0.95 and 0.99, for all models. For the former quantile, only four hybrids pass the statistical accuracy tests, while for the latter, ten hybrids perform well. However, only two hybrids, AR5RV-FIGARCH and HAR-EGARCH, pass the tests in both quantiles.
Table 7  Statistical accuracy tests for the hybrid models: short position

Model | KUC (0.95) | CEI (0.95) | DQ (0.95) | KUC (0.99) | CEI (0.99) | DQ (0.99)
RV-GARCH | 0.01 | 0.57 | 0.05 | 0.13 | 0.93 | 0.60
RV-EGARCH | 0.00 | 0.62 | 0.00 | 0.33 | 0.90 | 0.96
RV-FIGARCH | 0.01 | 0.56 | 0.06 | 0.03 | 0.97 | 0.01
AR5RV-GARCH | 0.05 | 0.67 | 0.25 | 0.03 | 0.97 | 0.01
AR5RV-EGARCH | 0.08 | 0.72 | 0.50 | 0.64 | 0.87 | 0.00
AR5RV-FIGARCH | 0.08 | 0.72 | 0.44 | 0.33 | 0.90 | 0.96
HAR-GARCH | 0.01 | 0.57 | 0.04 | 0.13 | 0.93 | 0.60
HAR-EGARCH | 0.05 | 0.50 | 0.26 | 0.33 | 0.90 | 0.96
HAR-FIGARCH | 0.03 | 0.62 | 0.12 | 0.13 | 0.93 | 0.60
HAR-J-GARCH | 0.01 | 0.57 | 0.04 | 0.33 | 0.90 | 0.96
HAR-J-EGARCH | 0.01 | 0.56 | 0.06 | 0.33 | 0.90 | 0.96
HAR-J-FIGARCH | 0.03 | 0.62 | 0.12 | 0.13 | 0.93 | 0.60
RBV-GARCH | 0.03 | 0.53 | 0.14 | 0.33 | 0.62 | 0.96
RBV-EGARCH | 0.05 | 0.50 | 0.26 | 1.00 | 0.83 | 1.00
RBV-FIGARCH | 0.03 | 0.53 | 0.14 | 0.33 | 0.90 | 0.96

Note This table summarizes the statistical accuracy tests for the hybrid models by presenting p-values for the following test statistics: the Kupiec (1995) unconditional coverage test (KUC), the Christoffersen (1998) exceedance independence test (CEI), and the Engle and Manganelli (2004) dynamic quantile test (DQ) in two quantiles, 0.95 and 0.99. The full description of the models used in the study is given in the “Forecast combinations: hybrid models” section. The values in italics indicate that a given model does not pass at least one of the statistical accuracy tests

Table 8 complements the analysis with the values of the loss functions. For the 0.95 quantile, the lowest values of the binomial and regulatory loss functions are observed for the AR5RV-GARCH model, while AR5RV-EGARCH attains the lowest firm loss function value. For the 0.99 quantile, RV-GARCH has the lowest binomial and regulatory loss function values, while HAR-J-GARCH obtains the lowest firm loss function value.
Table 8  The values of loss functions for the hybrid models: short position

Model | BL (0.95) | RL (0.95) | FL (0.95) | BL (0.99) | RL (0.99) | FL (0.99)
RV-GARCH | 14 | 14.92 | 378.81 | 2 (1) | 2.14 (1) | 535.39
RV-EGARCH | 12 | 13.03 | 371.64 | 3 | 3.20 | 517.80
RV-FIGARCH | 14 | 14.93 | 378.07 | 1 | 1.15 | 533.47
AR5RV-GARCH | 16 (1) | 17.02 (1) | 362.79 (3) | 1 | 1.19 | 507.82
AR5RV-EGARCH | 17 (2) | 18.52 | 337.48 (1) | 4 | 4.31 | 485.07
AR5RV-FIGARCH | 17 (2) | 18.41 (3) | 343.13 (2) | 3 | 3.24 | 500.62
HAR-GARCH | 14 | 15.00 | 383.18 | 2 (1) | 2.25 (2) | 541.34
HAR-EGARCH | 16 (1) | 17.10 (2) | 377.42 | 3 | 3.32 | 523.72
HAR-FIGARCH | 15 | 16.03 | 382.48 | 2 (1) | 2.27 | 539.24
HAR-J-GARCH | 14 | 14.97 | 384.09 | 3 | 3.40 | 475.89 (1)
HAR-J-EGARCH | 14 | 15.06 | 377.85 | 3 | 3.30 | 524.97
HAR-J-FIGARCH | 15 | 15.99 | 383.39 | 2 (1) | 2.25 (2) | 540.47
RBV-GARCH | 15 | 16.24 | 355.60 | 3 | 3.19 | 502.18
RBV-EGARCH | 16 | 17.38 | 349.67 | 5 | 5.26 | 484.73 (2)
RBV-FIGARCH | 15 | 16.27 | 354.65 | 3 | 3.19 | 500.02 (3)

Note The full description of the models used in the study is given in the “Forecast combinations: hybrid models” section. BL, RL, and FL show the values of the binomial, regulatory, and firm loss functions, respectively, in two quantiles, 0.95 and 0.99. The values in italics indicate that a given model does not pass the statistical accuracy tests. The values in brackets show the rank of the model within a given criterion (only the first 3 ranks are presented)

Table 9 presents the results of the statistical accuracy tests for the hybrid models in the long position. While in the quantile 0.01 all models are well specified, in the quantile 0.05 only one model, AR5RV-FIGARCH, passes the statistical accuracy tests.
Table 9  Statistical accuracy tests for the hybrid models: long position

Model | KUC (0.05) | CEI (0.05) | DQ (0.05) | KUC (0.01) | CEI (0.01) | DQ (0.01)
RV-GARCH | 0.00 | 0.71 | 0.00 | 0.13 | 0.93 | 0.60
RV-EGARCH | 0.00 | 0.74 | 0.00 | 0.13 | 0.93 | 0.60
RV-FIGARCH | 0.00 | 0.71 | 0.00 | 0.13 | 0.93 | 0.60
AR5RV-GARCH | 0.03 | 0.53 | 0.13 | 0.13 | 0.93 | 0.60
AR5RV-EGARCH | 0.01 | 0.56 | 0.06 | 0.13 | 0.93 | 0.60
AR5RV-FIGARCH | 0.05 | 0.50 | 0.24 | 0.13 | 0.93 | 0.60
HAR-GARCH | 0.00 | 0.65 | 0.00 | 0.33 | 0.90 | 0.96
HAR-EGARCH | 0.00 | 0.71 | 0.00 | 0.33 | 0.90 | 0.96
HAR-FIGARCH | 0.00 | 0.65 | 0.00 | 0.33 | 0.90 | 0.96
HAR-J-GARCH | 0.00 | 0.68 | 0.00 | 0.33 | 0.90 | 0.96
HAR-J-EGARCH | 0.00 | 0.71 | 0.00 | 0.33 | 0.90 | 0.96
HAR-J-FIGARCH | 0.00 | 0.68 | 0.00 | 0.33 | 0.90 | 0.96
RBV-GARCH | 0.00 | 0.90 | 0.00 | 0.13 | 0.93 | 0.60
RBV-EGARCH | 0.00 | 0.68 | 0.00 | 0.13 | 0.93 | 0.60
RBV-FIGARCH | 0.00 | 0.65 | 0.00 | 0.13 | 0.93 | 0.60

Note This table summarizes the statistical accuracy tests for the hybrid models by presenting p-values for the following test statistics: the Kupiec (1995) unconditional coverage test (KUC), the Christoffersen (1998) exceedance independence test (CEI), and the Engle and Manganelli (2004) dynamic quantile test (DQ) in two quantiles, 0.05 and 0.01. The full description of the models used in the study is given in the “Forecast combinations: hybrid models” section. The values in italics indicate that a given model does not pass at least one of the statistical accuracy tests

Table 10 reports the values of the loss functions for the hybrid models. Several of them have the same or very similar values of the binomial and regulatory loss functions, and thus the indication of a winner is not obvious. The lowest firm loss function is observed for the AR5RV-FIGARCH model. For both the short and the long position, the hybrids attain lower regulatory loss function values, while the single models win with lower binomial and firm loss functions. In general, a lower regulatory loss function is accompanied by an increase in the firm loss function. The same conclusion applies to both the short and the long position: the decrease in the regulatory loss function value obtained by combining approaches based on different information sets comes at the cost of an increase in the firm loss function.
Table 10  The values of loss functions for the hybrid models: long position

Model | BL (0.05) | RL (0.05) | FL (0.05) | BL (0.01) | RL (0.01) | FL (0.01)
RV-GARCH | 9 | 9.34 | 364.04 | 2 (1) | 2.01 (1) | 508.85
RV-EGARCH | 8 | 8.34 | 365.08 | 2 (1) | 2.02 (2) | 513.07
RV-FIGARCH | 9 | 9.32 | 362.78 | 2 (1) | 2.02 (2) | 506.21
AR5RV-GARCH | 15 | 15.60 | 329.06 | 2 (1) | 2.04 | 475.86 (2)
AR5RV-EGARCH | 14 | 14.60 | 330.13 | 2 (1) | 2.05 | 480.08 (3)
AR5RV-FIGARCH | 16 (1) | 16.58 (1) | 328.07 (1) | 2 (1) | 2.04 | 473.22 (1)
HAR-GARCH | 11 | 11.31 | 368.34 | 3 | 3.00 | 514.30
HAR-EGARCH | 9 | 9.33 | 369.35 | 3 | 3.01 | 518.59
HAR-FIGARCH | 11 | 11.30 | 367.06 | 3 | 3.00 | 511.66
HAR-J-GARCH | 10 | 10.33 | 369.36 | 3 | 3.01 | 515.61
HAR-J-EGARCH | 9 | 9.35 | 370.39 | 3 | 3.01 | 519.90
HAR-J-FIGARCH | 10 | 10.32 | 368.09 | 3 | 3.00 | 512.97
RBV-GARCH | 12 | 12.38 | 354.58 | 2 (1) | 2.02 (2) | 495.56
RBV-EGARCH | 10 | 10.38 | 355.58 | 2 (1) | 2.02 (2) | 499.78
RBV-FIGARCH | 11 | 11.36 | 353.28 | 2 (1) | 2.02 (2) | 492.91

Note The full description of the models used in the study is given in the “Forecast combinations: hybrid models” section. BL, RL, and FL show the values of the binomial, regulatory, and firm loss functions, respectively, in two quantiles, 0.05 and 0.01. The values in italics indicate that a given model does not pass the statistical accuracy tests. The values in brackets show the rank of the model within a given criterion (only the first 3 ranks are presented)

In order to examine whether the employment of hybrid models is reasonable, we compare the loss function values of the single and hybrid models. For this comparison, we consider only those models that passed the statistical verification stage for both the short and the long position in both quantiles. Only four models meet these rigorous criteria: three GARCH specifications, GARCH, EGARCH, and FIGARCH, and the hybrid constructed from the latter model and AR5RV. In order to test the superiority of one model over another, we use the approach proposed by Sarma et al. (2003, p. 344), that is, the one-sided sign test, which compares the values of the loss functions of two competing models. We examine the potential superiority of model j over model k and vice versa. In order to save space, we only present the values of the standardized sign statistics for each pair of competitors. According to the sign tests, there is no statistically significant difference between the models with respect to the binomial or regulatory loss functions. Table 11 presents the results of the comparison of pairs of models.
Table 11  The comparison of the competing VaR models

Model j versus k | Short position, quantile 0.95 | Short position, quantile 0.99 | Long position, quantile 0.05 | Long position, quantile 0.01
GARCH-EGARCH | 7.33 | 12.25 | − 1.16 | − 3.40
EGARCH-GARCH | − 7.33 | − 12.25 | 1.16 | 3.40
GARCH-FIGARCH | 2.24 | 3.58 | 3.58 | 5.01
FIGARCH-GARCH | − 2.24 | − 3.58 | − 3.58 | − 5.01
EGARCH-FIGARCH | − 2.06 | − 5.99 | 3.67 | 5.10
FIGARCH-EGARCH | 2.06 | 5.99 | − 3.67 | − 5.10
AR5RV-GARCH | 0.89 | 9.66 | − 5.46 | 1.52
GARCH-AR5RV | − 0.89 | − 9.66 | 5.46 | − 1.52
EGARCH-AR5RV | − 2.77 | 0.18 | − 2.24 | 2.86
AR5RV-EGARCH | 2.77 | 0.18 | 2.24 | − 2.86
FIGARCH-AR5RV | − 0.18 | − 11.72 | 8.85 | 0.98
AR5RV-FIGARCH | 0.18 | 11.72 | − 8.85 | − 0.98

Note The values in the table are the sign test statistics. In the model pairs, AR5RV denotes the AR5RV-FIGARCH hybrid. Rejection of H0 implies that model j is significantly better than model k in terms of the firm loss function, FL. H0 is rejected at the 5% level of significance if the statistic is lower than − 1.66; in these cases the values of the statistics are set in bold

When the firm loss function is considered, for the short position the EGARCH model is the winner, as it is better than the other three specifications in all but one of the comparisons considered. The FIGARCH model is better than GARCH in both quantiles and better than AR5RV-FIGARCH in the 0.99 quantile, whereas AR5RV-FIGARCH itself is better than GARCH in the 0.99 quantile only. For the long position, we find that the FIGARCH model is superior to GARCH and EGARCH in both quantiles, while the GARCH model is better than AR5RV-FIGARCH and EGARCH (0.01 quantile only). Finally, the AR5RV-FIGARCH model is superior to EGARCH (0.01 quantile). Although the sign test does not indicate one distinct winner of the race, it seems that a single GARCH-class model performs better than our hybrid.
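For reference, a minimal sketch of the standardized sign statistic reported in Table 11, computed from the daily loss series of two competing models:

```python
import numpy as np

def sign_test_statistic(loss_j, loss_k):
    """Standardized one-sided sign statistic of Sarma et al. (2003) for model j versus model k.

    loss_j, loss_k : per-day loss values (e.g. the firm loss) of the two competing models.
    Large negative values indicate that model j incurs the lower loss on significantly
    more days than model k (ties, z = 0, are counted as non-negative in this sketch).
    """
    z = np.asarray(loss_j) - np.asarray(loss_k)
    T = len(z)
    s = np.sum(z >= 0)                       # days on which model j is not better than model k
    return float((s - 0.5 * T) / np.sqrt(0.25 * T))
```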

Conclusions

In the paper, we evaluate alternative volatility forecasting methods within the risk management framework, namely under the Value at Risk approach. We compare one-step-ahead VaR forecasts in which volatility is estimated either from GARCH models based on daily data or from models for realized variance or realized bipower variation calculated from intradaily returns. The alternative measures obtained from differently sampled data give different VaR forecasts. We find that only three GARCH models out of all specifications considered in the study pass the statistical accuracy tests for both positions and both VaR quantiles. VaR forecasts based on ARFIMA or HAR models for RV have worse backtesting results. When the loss functions are compared at the second stage, there is no distinct winner of the race. Surprisingly, in the case of the firm loss function only, the EWMA method seems to perform best, but it is simultaneously characterized by the highest values of the binomial and regulatory loss functions. Within the GARCH-class models, we find no evidence that the skewed Student t distribution assumption provides better VaR forecasts than the other distributions, the Gaussian and the symmetric Student t.

This study shows that although the informational content of daily data seems to be sufficient to predict VaR properly, models based on intradaily data sometimes offer better results in the search for the best VaR forecasts. However, the gains must be carefully weighed against the effort required to gather and manage intradaily datasets. We also consider hybrid models that balance the different speeds of reaction to market changes observed in models based on daily and intradaily sampling frequencies. Although the combination of these two sets of information seems to open promising fields of research, in our study the hybrids do not offer more efficient capital allocation. Only one out of fifteen hybrid models passes the statistical verification stage. In comparison to the single models, the hybrid obtains a lower value of the regulatory loss function, but loses on the firm loss function.

The results presented here should be of interest to firms searching for an optimal risk management model. If the statistical accuracy tests could be disregarded, the choice would probably be directed to ARFIMA or HAR models, as they offer lower regulatory and firm loss functions and thus allow for better capital allocation. However, if the statistical accuracy tests are to be taken into account, the GARCH approach is more advisable. The results should also be of interest to regulators, as they could help to recommend models that achieve a better regulatory profile and mitigate overall systemic risk. Realized volatility models could be recommended, especially if one controls for the validity of the VaR estimates, as these models often allow regulatory costs to be reduced. The introduction of realized volatility, however, can be recommended only when intraday data are sufficiently available.

Footnotes

  1. The Polish zloty (PLN) is not one of the most liquid currencies in the FX market, and the question of entering the Eurozone remains open. About 80 per cent of Polish international trade is denominated in euro.


Acknowledgements

I gratefully acknowledge the comments by anonymous referees as well as conference participants at the International Risk Management Conference 2016 organized by University of Florence, NYU Stern Salomon Center and Hebrew University of Jerusalem. All remaining errors are mine.

References

  1. Ahoniemi, K., A.-M. Fuertes, and J. Olmo. 2016. Overnight news and daily equity trading risk limits. Journal of Financial Econometrics 14 (3): 525–551. https://doi.org/10.2139/ssrn.2065017.
  2. Andersen, T.G., and L. Benzoni. 2009. Realized volatility. In Handbook of financial time series, ed. T.G. Andersen, R.A. Davis, J.-P. Kreiss, and T. Mikosch, 555–575. New York: Springer.
  3. Andersen, T.G., and T. Bollerslev. 1998. Answering the skeptics: Yes, standard volatility models do provide accurate forecasts. International Economic Review 39 (4): 885–905.
  4. Andersen, T.G., T. Bollerslev, and F.X. Diebold. 2007. Roughing it up: Including jump components in the measurement, modeling, and forecasting of return volatility. Review of Economics and Statistics 89 (11): 701–720. https://doi.org/10.1162/rest.89.4.701.
  5. Andersen, T.G., T. Bollerslev, F.X. Diebold, and P. Labys. 2001. The distribution of realized exchange rate volatility. Journal of the American Statistical Association 96 (8): 42–55. https://doi.org/10.1198/016214501750332965.
  6. Andersen, T.G., T. Bollerslev, F.X. Diebold, and P. Labys. 2003. Modeling and forecasting realized volatility. Econometrica 71: 529–626.
  7. Baillie, R.T., T. Bollerslev, and H.O. Mikkelsen. 1996. Fractionally integrated generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 74 (1): 3–30. https://doi.org/10.1016/S0304-4076(95)01749-6.
  8. Barndorff-Nielsen, O.E., and N. Shephard. 2004. Power and bipower variation with stochastic volatility and jumps. Journal of Financial Econometrics 2: 1–48. https://doi.org/10.1093/jjfinec/nbh001.
  9. Będowska-Sójka, B. 2015. Daily VaR forecasts with realized volatility and GARCH models. Argumenta Oeconomica 34 (1): 157–173.
  10. Bollerslev, T. 1986. Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31 (3): 307–327. https://doi.org/10.1016/0304-4076(86)90063-1.
  11. Brownlees, C.T., and G.M. Gallo. 2009. Comparison of volatility measures: A risk management perspective. Journal of Financial Econometrics 8 (1): 29–56. https://doi.org/10.1093/jjfinec/nbp009.
  12. Christoffersen, P.F. 1998. Evaluating interval forecasts. International Economic Review 39 (4): 841–862. https://doi.org/10.2307/2527341.
  13. Clements, M.P., A.B. Galvão, and J.H. Kim. 2008. Quantile forecasts of daily exchange rate returns from forecasts of realized volatility. Journal of Empirical Finance 15 (4): 729–750. https://doi.org/10.1016/j.jempfin.2007.12.001.
  14. Corsi, F. 2009. A simple approximate long-memory model of realized volatility. Journal of Financial Econometrics 7 (2): 174–196. https://doi.org/10.1093/jjfinec/nbp001.
  15. Dacorogna, M.M., R. Gençay, U. Müller, R. Olsen, and O. Pictet. 2001. An introduction to high frequency finance. London: Academic Press.
  16. Doornik, J.A., and D.F. Hendry. 2005. Empirical econometric modelling. PcGive 11. London: Timberlake Consultants.
  17. Elliott, G., and A. Timmermann. 2004. Optimal forecast combinations under general loss functions and forecast error distributions. Journal of Econometrics 122 (1): 47–79. https://doi.org/10.1016/j.jeconom.2003.10.019.
  18. Engle, R.F., and S. Manganelli. 2004. CAViaR: Conditional autoregressive value at risk by regression quantiles. Journal of Business & Economic Statistics 22 (4): 367–381. https://doi.org/10.1198/073500104000000370.
  19. Fuertes, A.-M., and J. Olmo. 2012. Exploiting intraday and overnight price variation for daily VaR prediction. Frontiers in Finance and Economics 9 (2): 1–31.
  20. Giot, P., and S. Laurent. 2003. Value-at-risk for long and short trading positions. Journal of Applied Econometrics 18 (6): 641–664. https://doi.org/10.1002/jae.710.
  21. Giot, P., and S. Laurent. 2004. Modelling daily value-at-risk using realized volatility and ARCH type models. Journal of Empirical Finance 11 (3): 379–398. https://doi.org/10.1016/j.jempfin.2003.04.003.
  22. Hansen, P.R., and A. Lunde. 2005. A forecast comparison of volatility models: Does anything beat a GARCH(1,1)? Journal of Applied Econometrics 20 (7): 873–889. https://doi.org/10.1002/jae.800.
  23. Kambouroudis, D.S., D.G. McMillan, and K. Tsakou. 2016. Forecasting stock return volatility: A comparison of GARCH, implied volatility, and realized volatility models. Journal of Futures Markets 36 (12): 1127–1163. https://doi.org/10.1002/fut.21783.
  24. Kupiec, P.H. 1995. Techniques for verifying the accuracy of risk measurement models. The Journal of Derivatives 3 (2): 73–84. https://doi.org/10.3905/jod.1995.407942.
  25. Laurent, S. 2010. G@RCH 6.0 help. London: Timberlake Consultants.
  26. Lopez, J.A. 1999. Methods for evaluating value-at-risk estimates. FRBSF Economic Review 2: 3–17.
  27. Louzis, D.P., S. Xanthopoulos-Sisinis, and A.P. Refenes. 2014. Realized volatility models and alternative Value-at-Risk prediction strategies. Economic Modelling 40: 101–116. https://doi.org/10.1016/j.econmod.2014.03.025.
  28. McMillan, D.G., A.E.H. Speight, and K.P. Evans. 2008. How useful is intraday data for evaluating daily Value-at-Risk? Evidence from three Euro rates. Journal of Multinational Financial Management 18 (5): 488–503. https://doi.org/10.1016/j.mulfin.2007.12.003.
  29. Müller, U.A., M.M. Dacorogna, R.D. Dave, O.V. Pictet, R.B. Olsen, and J.R. Ward. 1995. Fractals and intrinsic time: A challenge to econometricians. Olsen & Associates Research Group. http://finance.martinsewell.com/stylized-facts/scaling/Muller-etal1993.pdf.
  30. Nelson, D.B. 1991. Conditional heteroskedasticity in asset returns: A new approach. Econometrica 59 (2): 347–370.
  31. Patton, A.J., and K. Sheppard. 2015. Good volatility, bad volatility: Signed jumps and the persistence of volatility. Review of Economics and Statistics 97 (3): 683–697. https://doi.org/10.1162/REST_a_00503.
  32. Sarma, M., S. Thomas, and A. Shah. 2003. Selection of value-at-risk models. Journal of Forecasting 22: 337–358. https://doi.org/10.1002/for.868.
  33. Wong, Z.Y., W.C. Chin, and S.H. Tan. 2016. Daily value-at-risk modeling and forecast evaluation: The realized volatility approach. The Journal of Finance and Data Science 2 (3): 171–187. https://doi.org/10.1016/j.jfds.2016.12.001.

Copyright information

© Macmillan Publishers Ltd., part of Springer Nature 2018

Authors and Affiliations

  1. Department of Econometrics, Poznań University of Economics and Business, Poznań, Poland
