Digital Finance, Volume 1, Issue 1–4, pp 67–89

Blockchain analytics for intraday financial risk modeling

  • Matthew F. Dixon
  • Cuneyt Gurcan Akcora
  • Yulia R. Gel
  • Murat Kantarcioglu
Original Article

Abstract

Blockchain offers the opportunity to use the transaction graph for financial governance, yet properties of this graph are understudied. One key question in this direction is the extent to which the transaction graph can serve as an early-warning indicator for large financial losses. In this article, we demonstrate the impact of extreme transaction graph activity on the intraday volatility of the Bitcoin price series. Specifically, we identify certain sub-graphs (‘chainlets’) that exhibit predictive influence on Bitcoin price and volatility and characterize the types of chainlets that signify extreme losses. Using bars ranging from 15 min up to a day, we fit GARCH models with and without the extreme chainlets and show that the models with extreme chainlets exhibit superior value-at-risk backtesting performance.

Keywords

Blockchain Cryptocurrencies Graph analysis GARCH Intraday financial risk 

JEL Classification

C58 C63 G18 

1 Introduction

The global financial system collapsed in 2008, creating the most severe recession in the history of the United States since the Great Depression of the 1930s. The “Flash Crash” of May 6th, 2010, in which the Dow Jones Industrial Average plunged approximately 5% only to recover within minutes, highlighted not only the continued fragility of our financial markets but also the inability of regulators either to preempt this crash or to promptly release the results of post-crisis investigative studies aimed at preventing similar events from recurring. Not only is the double auction system deployed by financial markets largely inaccessible to researchers, but the identity of the market participants is also hidden. It is, therefore, impossible to reliably trace the movement of money through a financial network and to attribute financial market characteristics to the actions of specific agents.

Since the seminal Bitcoin paper (Nakamoto 2008), cryptocurrencies (Tschorsch and Scheuermann 2016) have been the most prominent Blockchain application. Designed to facilitate a secure distributed platform without central regulation, Blockchain is heralded as a paradigm that will be as powerful as big data, cloud computing, and machine learning.

Blockchain provides the capability to use the transaction graph for financial governance, but properties of this graph are understudied. One key question in this direction is the extent to which the transaction graph can serve as an early-warning indicator for large financial losses.

The goal of this article is to develop a representation of the transaction graph which augments classical financial risk modeling and can be used to address such a question. This novel data representation permits a big data version of financial econometrics—with the emphasis on the topological network structures in addition to the covariance of historical time series of prices. By processing all financial interactions, we model the network with a high-fidelity graph so that it is possible to characterize how the flow of information in the network evolves over time. In this graph, each node (also called a vertex) is an address that is created from the cryptographic keys of a Bitcoin account. The graph contains edges that represent inputs (i.e., received Bitcoins) or outputs (i.e., sent Bitcoins) to this address. The owner of the account can be uniquely identified by its corresponding address, but address creation is free and cheap. Furthermore, the community discourages using the same address to receive/send money multiple times; for change coins a user usually creates a new address. As a result, there can be multiple addresses belonging to a single user. On this graph, we employ a novel graph data model, referred to as a ‘chainlet’ (Akcora et al. 2018a) that captures local transaction information patterns, to store the graph for a given time period (e.g., one day). In particular, we model the impact of ‘extreme chainlets’ on intra-day prices and volatility in a GARCH framework.

1.1 Related research

Several studies have already partially addressed the capacity and limitations for cryptocurrencies to provide a robust and transparent economic system for all economic participants (Caporale et al. 2018; Shah and Zhang 2014; Corbet et al. 2017; Dyhrberg 2016a; Gomber et al. 2017; Sovbetov 2018). In contrast to existing financial networks, Blockchain-based crypto-currencies expose the entire transaction graph to the public, where payment sender, receiver and amount are visible to the public. Although any node can join the network without identifying itself, Bitcoin transactions are listed for all participants and the most significant agents can be immediately located on the network.

Graph analysis Analyzing the relationship between transactions and addresses and Bitcoin price has emerged as a pivotal research direction  (Tasca et al. 2018). In particular, there is a growing focus on building statistical models which can predict and attribute price movements to transactions and transaction graph properties. While simple Blockchain transactional features, such as average transaction amount, are shown to exhibit mixed performance for cryptocurrency price forecasting (Greaves and Au 2015), a number of recent studies complement this article in demonstrating the utility of global graph features to predict the price (Akcora et al. 2018b; Madan et al. 2015; Kondor et al. 2014; Greaves and Au 2015). For instance, Sorgente and Cibils (2014) analyzed the predictive effects of average balance, clustering coefficient, and number of new edges on the Bitcoin price and Akcora et al. (2018a) use Blockchain chainlets as predictors. Two network flow measures were recently proposed by Yang and Kim (2015) to quantify the dynamics of the Bitcoin transaction network and to assess the relationship between flow complexity and Bitcoin market variables.

Intraday volatility All of the aforementioned studies are performed on daily Bitcoin market data. However, not only are data available on a much more granular timescale, but many practical financial applications also rely on intraday data to make trading and risk management decisions. Recently, Guo and Antulov-Fantulin (2018) applied various time series and machine learning approaches to the short-term (i.e. intraday) prediction of Bitcoin exchange price fluctuations. The novelty in their study is the use of historical intraday market data combined with limit order information. Their results showed that limit order book information holds predictive power at short time horizons, with lags of up to 30 min, but they found little empirical evidence to support longer term price impact.

GARCH Generalized autoregressive conditional heteroskedasticity (GARCH) models are popular in the financial econometrics literature for their capacity to model volatility with empirically supported properties. The GARCH model, for example, can capture volatility clustering, correlation between the returns and the volatility, and for certain classes of GARCH models, asymmetric effects between positive and negative returns. Asymmetric GARCH is particularly useful in risk management and ideal for risk averse investors in anticipation of negative shocks to the market (Dyhrberg 2016b).

GARCH modeling has emerged as the primary approach for cryptocurrency volatility modeling from daily price history (Cermak 2017; Dyhrberg 2016b; Chu et al. 2017; Ardia et al. 2018). Yet the application of GARCH to intraday cryptocurrency volatility modeling has garnered little attention. Building on this literature, we demonstrate the effect of including extreme chainlet activity, \(x_{t}\), in an intraday eGARCH model. Note that our focus is primarily on longer intraday risk horizons (hours) than would be relevant to high-frequency (sub-minute) GARCH modeling (Shephard and Sheppard 2010) and, for this reason, we do not include the exchange’s limit order book in our study (see Guo and Antulov-Fantulin 2018), instead choosing to use the Blockchain graph.

Market sentiment indices Finding alternative data sources which are strongly linked with prices and volatility is a well-established practice in the conventional financial markets. The ability to exploit such data sources often requires more advanced methodology, as such data are irregularly spaced and may depart from stylized econometric data modeling assumptions. For example, Borovkova and Mahakena (2015) study the impact of news on returns, prices and price jumps in natural gas futures, and deploy a local level state space model to construct a news sentiment time series from irregularly spaced news announcements. Other data sources include limit order books, fundamentals, SEC filings and equity analyst ratings. In this regard, our proposed use of the transaction graph to predict price and risk is a form of alternative data analysis for trading and investment management. The compelling aspect of the transaction graph is its objectivity, instantaneous access and transparency. On the other hand, news sentiment analysis can be subjective, and fundamental ratios are reported infrequently and with some degree of opacity and potential for manipulation. Thus, in principle, the transaction graph has the potential to deliver a robust and transparent cryptocurrency market sentiment indicator.

Contribution Bitcoin requires new data science research and methodology to demonstrate how full disclosure of an agent’s actions in a crypto-currency market informs price discovery and ultimately serves as an early-warning indicator for excess market volatility or even a crash.

This article contributes to the growing body of research on the role of users, entities and their interactions in formation and dynamics of crypto-currency risk investment, financial predictive analytics and, more generally, in re-shaping the modern financial world.

Specifically, we model the impact of extreme chainlets on intraday prices and volatility in a GARCH framework. We mention in passing that the application of GARCH models to forecast daily Bitcoin prices has already been extensively investigated in Chu et al. (2017) and Cermak (2017). We supplement these findings by (i) demonstrating the importance of including extreme chainlet activity and (ii) modeling intraday price and volatility, extending our previous study on the effect of chainlets on daily volatility (Akcora et al. 2018b).

2 Chainlets and extreme chainlets

As shown in Fig. 1, a Bitcoin graph consists of three main components: addresses, transactions and blocks (see Akcora et al. 2017 for a primer on Blockchain graphs). In the rest of this paper, we will use address and account interchangeably.

The Bitcoin protocol allows multiple accounts to participate in a transaction by each party signing its own part of the transaction. Both inputs and outputs of a transaction can have multiple accounts. For instance, in Fig. 1 transaction \(t_2\) receives Bitcoins from addresses \(a_2\) and \(a_3\) and deposits the amount to addresses \(a_5\) and \(a_6\). A real-life analogy is a person using multiple bank accounts, merging funds in a single transaction and sending the amount to multiple accounts. However, there are coin mixing (Maxwell 2013) services that allow unacquainted people to create a transaction together, where the coins are mixed to hide their origins. As such, addresses that appear in the inputs or outputs of a transaction may not belong to the same person/entity.

Inputs and outputs can provide vital information about transaction purposes. For example, transactions that involve hundreds of inputs and very few outputs may imply large amounts of Bitcoin investments.

One approach to understanding how transactions relate to market price is to introduce the novel concept of k-chainlets (Akcora et al. 2018a). A k-chainlet is a Bitcoin sub-graph of \(k \ge 1\) transactions and their corresponding input and output addresses corresponding to different accounts, not necessarily unique to a user. In the simplest case, a single transaction creates a 1-chainlet with one or more inputs and a single output. For example, in Fig. 1, transaction \(t_1\) results in the transfer of Bitcoin from address \(a_1\) to addresses \(a_4\) and \(a_5\). Such a transaction creates a 1-chainlet that has one input and two outputs. We denote this subgraph as a chainlet of type \({\mathbb {C}}_{1 \rightarrow 2}\), where 1 and 2 are the number of input and output addresses, respectively.

A 1-chainlet is the smallest building block of the Bitcoin graph; inputs and outputs of the chainlet are determined at once, and the transaction is digitally signed. This signed information cannot be modified, but multiple 1-chainlets can be combined to extend the graph. For simplicity, in the rest of this work, we use the term chainlet to refer to 1-chainlets.
Fig. 1

A transaction-address graph representation of three blocks in the Bitcoin network. Addresses are represented by circles, transactions with rectangles and directed edges indicate a transfer of coins. Blocks order transactions in time, whereas each transaction with its input and output nodes represents an immutable decision that is encoded as a subgraph on the Bitcoin network. Some addresses, such as \(a_7\), contain unspent Bitcoins. The balance between inputs and outputs of a transaction is collected by the miner as the transaction fee. The transaction fee can be zero, but this leaves the miner no incentive to mine the transaction. Below the graph, we show the occurrence (e.g., \(O_1\)) and amount (e.g., \(A_1\)) matrices for each block

2.1 Extreme chainlets

Graph analysis allows us to evaluate the local topological structure of the Bitcoin graph over time and assess the role of chainlets on Bitcoin price formation and dynamics. Figure 2 illustrates how the activity of the network can be represented by a chainlet occurrence matrix. For a given time period, we may count the occurrences of each \({\mathbb {C}}_{i \rightarrow j}\) and store it in a matrix. The maximum number of inputs or outputs of a chainlet can be large, however, sometimes exceeding 1000. When the number of inputs and/or outputs equals or exceeds a threshold N, we refer to these chainlets as “extreme chainlets”. In our historical analysis of daily snapshots, we choose \(N=20\), which corresponds to the 97.5 percentile of all chainlet occurrences. In other words, if there are chainlets falling beyond the Nth row or column, their information is stored in the last row and/or column of the matrix. It is instructive to distinguish between ‘left extreme chainlets’ and ‘right extreme chainlets’:

Definition: extreme chainlets
  • Left extreme chainlets are the subset \({\mathcal {C}}^\mathrm{l}:=\{{\mathbb {C}}_{i \rightarrow j}\,|\,i=N,\,j\in \{1,\ldots ,N\}\}\) highlighted in the bottom row in the figure. They represent transactions from a large number of accounts to fewer addresses, that is, Bitcoin investment: the transfer of Bitcoin from a large number of wallets to relatively few wallets represents the supply of liquidity at an exchange.

  • Right extreme chainlets are the subset \({\mathcal {C}}^\mathrm{r}:=\{{\mathbb {C}}_{i \rightarrow j}\,|\,i\in \{1,\ldots ,N-1\},\,j=N\}\) highlighted in the far right column in the figure. They represent the sale of a large sum of Bitcoins across the market: the seller divides the balance and sends it to potentially hundreds of Bitcoin addresses.
    Fig. 2

    This figure illustrates how the 400 chainlets (i.e., \({N}=20\)) are related to each other in terms of their occurrence (left) and amount (right). The occurrence and amount matrices are formed by taking daily snapshots of the Bitcoin graph and counting the occurrences and summing amounts of \({\mathbb {C}}_{i \rightarrow j}, \,\forall i, j\), respectively (over the period 2015–2018). The color scale denotes cluster (group) membership. The left and right extreme chainlets are shown by the bottom row and far right columns, respectively. The overall figure shows that chainlet occurrences are more similar to each other, whereas amounts are not. For clarity, we use agglomerative hierarchical clustering with cosine similarity (Steinbach et al. 2000) and terminate the process when the within cluster similarity is less than 0.5. We only color clusters that contain more than three chainlets

For a chosen time granularity, extreme chainlets are intrinsically informative by quantifying the size and frequency of extreme chainlets relative to the non-extreme chainlets. This information is illustrated in Fig. 3, using, for simplicity a \(6\times 6\) occurrence matrix (although we use a \(20\times 20\) matrix) by counting all chainlets over a time period. Row and column indices correspond to number of inputs and outputs of transactions, respectively. The percentage of all chainlet counts which are left and right extreme chainlets are shown by the bottom row and right column, respectively.
Fig. 3

Percentages of Bitcoin chainlets (all times) with an \(N=6\) chainlet occurrence matrix, where the last row and column hold the extreme chainlet counts. Row and column indices correspond to number of inputs and outputs of transactions, respectively. The matrix shows that one-to-one coin (i.e., \({\mathbb {C}}_{1 \rightarrow 1}\)) payments are 8.45% of all transactions, whereas 57.04% of all chainlets are \({\mathbb {C}}_{1 \rightarrow 2}\). Color scales are 10%, 1%, 0.1% and 0% with decreasing density. Dashed edges indicate aggregation of \(N\ge 6\) inputs and outputs

We denote the amount of Satoshis (one BTC is \(10^{8}\) Satoshis) transferred between dates \(t-1\) and t by left and right extreme chainlets as \(A^\mathrm{l}_{t}\) and \(A^\mathrm{r}_{t}\), and the total occurrences as \(O^\mathrm{l}_{t}\) and \(O^\mathrm{r}_{t}\), respectively. In Fig. 1, the three blocks and their occurrence and amount matrices are given. As all transactions have 2 or fewer inputs/outputs, we use \(N=2\) in this toy example. The sum of occurrence values equals the number of transactions (e.g., 2 in Block 1 and 1 in Block 3), whereas the sum of amount matrix cells gives the total transaction volume (e.g., 6.8 BTC in Block 2).
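The bookkeeping behind the occurrence and amount matrices can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the `(num_inputs, num_outputs, amount)` record format and the toy transactions are hypothetical and not taken from the article. The one rule borrowed from the text is that inputs or outputs at or beyond the threshold N are aggregated into the last row or column.

```python
def chainlet_matrices(transactions, n):
    """Build N x N occurrence and amount matrices for one time period.

    transactions: iterable of (num_inputs, num_outputs, amount) tuples
    (a hypothetical, simplified record format).
    """
    occ = [[0] * n for _ in range(n)]
    amt = [[0.0] * n for _ in range(n)]
    for n_in, n_out, amount in transactions:
        # Inputs/outputs at or beyond the threshold N share the last row/column.
        i = min(n_in, n) - 1
        j = min(n_out, n) - 1
        occ[i][j] += 1
        amt[i][j] += amount
    return occ, amt


def extreme_totals(matrix):
    """Sum the left-extreme (last row) and right-extreme (last column) cells.

    Following the set definitions in the text, the corner cell (N, N) is
    counted with the left extreme chainlets only.
    """
    n = len(matrix)
    left = sum(matrix[n - 1])
    right = sum(matrix[i][n - 1] for i in range(n - 1))
    return left, right


# Toy period with three made-up transactions and threshold N = 3.
txs = [(1, 2, 0.5), (4, 1, 2.0), (2, 7, 1.25)]
occ, amt = chainlet_matrices(txs, 3)
o_left, o_right = extreme_totals(occ)
a_left, a_right = extreme_totals(amt)
```

In this toy period the four-input transaction lands in the last row and the seven-output transaction in the last column, so each extreme class records one occurrence.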

Figure 4 illustrates how extreme chainlets vary intraday by choosing to construct the chainlet matrices at hourly periods. The extreme chainlet amounts (scaled) and Bitcoin hourly returns are shown over a 60-h period spanning February 5th–8th, 2015. Over this horizon, the extreme chainlet amounts show peaked activity prior to two large losses and more volatile periods. In the following section, we shall apply statistical techniques for quantifying the relationship between extreme chainlets and returns and volatility.
Fig. 4

Left extreme chainlet amounts, \(A_{t}^\mathrm{l}\), right extreme chainlet amounts, \(A_{t}^\mathrm{r}\), and total extreme chainlet percentage amounts, \(A_{t}^{x}\) (scaled) and Bitcoin hourly returns, \(r_{t}\), over a 60-h period spanning February 5th–8th, 2015. Over this horizon, the extreme chainlet amounts show peaked activity prior to two large losses

Discussion The economic rationale for using graph analysis for risk management and price forecasting is its ability to directly capture supply and demand dynamics. People negotiate trades to buy or sell goods, services, and fiat currencies in exchange for cryptocurrency and pay for these trades by transferring cryptocurrency between virtual wallets. Market participants, and particularly investors, move cryptocurrency between virtual wallets to buy, sell, and hold cryptocurrency as they attempt to generate capital gains or reduce exposure to cryptocurrencies. Transfer of Bitcoins from a large number of wallets to relatively few wallets represents the supply of liquidity at an exchange. For example, Bitcoin holders seeking to reduce exposure would convert their Bitcoin to fiat currency. Conversely, Bitcoin transfer from a few wallets to a large number of wallets represents demand: the sale of Bitcoin to a large number of Bitcoin investors and hence increased exposure. The magnitude and direction of Bitcoin movement is a measure of price sentiment in Bitcoin, and the frequency of large transactions is an additional source of price volatility.

3 Intraday forecasting Bitcoin prices and volatility

The extent to which we can build predictive models from the chainlets has already led to some promising results (see Akcora et al. 2018a for specification of the types and groups of chainlets that exhibit predictive influence on Bitcoin price and volatility).

Data There are numerous sources of historical Bitcoin prices which are collected by various exchanges. In this work, we used the Bitcoin USD price information that is sourced from the Blockchain exchange Coinbase.com over the period from February 2015 to December 2018 (1,499,040 observations) and the corresponding chainlet occurrence and amount matrices.1 This price is estimated as an average over multiple exchanges worldwide.

On average, the Bitcoin network attempts to generate a block every 10 min, and the mining difficulty is adjusted to achieve this goal. However, in practice, there is a high degree of uncertainty in the time taken by a miner to return a valid block hash; a new block may be found after only 2 s or after as long as 20 min. Bitcoin limits the number of transactions in a block by limiting the block size to 1 MB. Between 2015 and 2018, a Bitcoin block contained 1432 transactions on average, with a minimum of 1 and a maximum of 12,239. Figure 3 shows how these chainlets are distributed for various inputs and outputs.

Every 2016 blocks, Bitcoin computes the time that it actually took to mine these blocks. If it took less than 14 days, the difficulty is deemed to be too easy and increased. If it took more than 2 weeks, difficulty is decreased. The difficulty decreases very rarely though.
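The retargeting rule can be illustrated with a short Python sketch. It is a simplified floating-point illustration, not the consensus code: the actual client adjusts an integer target rather than a difficulty value. The function name and the example difficulty of 1000 are hypothetical, and the factor-of-four bound on each adjustment is a property of the Bitcoin protocol that the text above does not mention.

```python
TARGET_SECONDS = 14 * 24 * 60 * 60  # two weeks: 2016 blocks at 10 min each


def retarget(difficulty, actual_seconds):
    """Return the adjusted difficulty after a 2016-block period."""
    # Bitcoin bounds each adjustment to a factor of four in either direction.
    clamped = min(max(actual_seconds, TARGET_SECONDS // 4), TARGET_SECONDS * 4)
    return difficulty * TARGET_SECONDS / clamped


faster = retarget(1000.0, 12 * 24 * 60 * 60)  # mined in 12 days: harder
slower = retarget(1000.0, 16 * 24 * 60 * 60)  # mined in 16 days: easier
```

A period mined faster than two weeks raises the difficulty proportionally, and a slower one lowers it.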

3.1 Risk and time series modeling

We characterize the uncertainty of a ‘loss’ and, in particular, estimate the probability of extreme losses occurring over a future horizon. The loss is defined as the negative of the discrete returns, \(L_{t}=-r_{t}\), where \(r_t:={(P_{t+1} - P_t)}/{P_{t}}\) and \(P_{t}\) is the Bitcoin price at time t.

Granger causality Figure 5 shows the results from applying a Granger causality test of extreme chainlets as predictors of the square of the demeaned price returns \((r_{t}-\mu )^2\), a proxy for volatility. Each plot shows the p value for the null hypothesis that there is no causal effect of lagged extreme chainlets on squared price returns. The x-axis shows the maximum lag chosen in the experiment. Each row in the figure shows the extreme chainlet amounts (left) and extreme chainlet occurrences (right) for time intervals of 15, 30, 60, 120 and 240 min. In general, we observe that the left extreme chainlet amounts, \(A_{t}^\mathrm{l}\), have the strongest causal effect on volatility; both the amounts and occurrences yield statistically significant test results for the optimal maximum lag. In fact, at almost all timescales chosen, the Lag 1 left extreme chainlet amount is sufficient. In contrast, the right extreme chainlet amounts, \(A^\mathrm{r}_{t}\), are not causal. At most timescales, the extreme chainlet amount ratios, \(O^{x}_{t}\), exhibit causality, although the test is more sensitive to the number of lags. The left and right extreme chainlet occurrences, \(O^\mathrm{l}_{t}\) and \(O^\mathrm{r}_{t}\), are selectively causal with optimal maximum lag.
Fig. 5

This figure shows the results from applying a Granger causality test of extreme chainlets as predictors of volatility. Each plot shows the p value of accepting the null hypothesis that there is no causal effect of lagged extreme chainlets. The x-axis shows the maximum lag chosen in each test. Each row in the figure shows the extreme chainlet amounts (left) and extreme chainlet occurrences (right) for time intervals at 15, 30, 60, 120 and 240 min
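To make the idea behind the test concrete, the following Python sketch computes the textbook lag-1 Granger F statistic by comparing a restricted autoregression with one that adds the lagged candidate predictor. It is an assumption-laden illustration, not the authors' procedure: the article considers a range of maximum lags, whereas this sketch fixes the lag at one, and the function names and synthetic series are hypothetical. In the article's setting, `y` would be the squared demeaned returns and `x` an extreme chainlet series.

```python
import random


def ols_rss(rows, y):
    """Residual sum of squares of a least-squares fit (normal equations)."""
    k = len(rows[0])
    # Build the augmented system [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * v for r, v in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [u - f * w for u, w in zip(a[r], a[c])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((v - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, v in zip(rows, y))


def granger_f_lag1(x, y):
    """F statistic for H0: x_{t-1} does not help predict y_t."""
    target = y[1:]
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    full = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r, rss_u = ols_rss(restricted, target), ols_rss(full, target)
    dof = len(target) - 3  # observations minus parameters in the full model
    return (rss_r - rss_u) / (rss_u / dof)


# Synthetic example in which x leads y by one step by construction.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * random.gauss(0.0, 1.0)
             for t in range(1, 200)]
f_xy = granger_f_lag1(x, y)  # x as predictor of y
f_yx = granger_f_lag1(y, x)  # y as predictor of x (no feedback built in)
```

A large F statistic (equivalently, a small p value) leads to rejection of the null hypothesis of no causal effect, which is how the p values plotted in Fig. 5 are read.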

Table 1

This table compares the empirical densities of the standardized daily losses conditioned on the lower and upper \(\alpha =0.05\) percentiles of extreme chainlet activity by amount (\(A^{x}\)) and occurrences (\(O^{x}\))

| Interval (min) | Pdf | Mean | Std.dev. | Skewness | Kurtosis |
|---|---|---|---|---|---|
| 15 | \(\phi (L_t)\) | 0 | 1 | 9.154 | 1056.313 |
| 15 | \(\phi \left( L_t \mid A^x_t < \Phi ^{-1}_{A^x_t}(0.05)\right)\) | −0.013 | 0.907 | −14.93 | 918.836 |
| 15 | \(\phi \left( L_t \mid A^x_t > \Phi ^{-1}_{A^x_t}(0.95)\right)\) | 0.008 | 1.181 | 9.706 | 839.024 |
| 15 | \(\phi \left( L_t \mid O^x_t < \Phi ^{-1}_{O^x_t}(0.05)\right)\) | −0.01 | 0.931 | −11.531 | 764.131 |
| 15 | \(\phi \left( L_t \mid O^x_t > \Phi ^{-1}_{O^x_t}(0.95)\right)\) | 0.041 | 1.641 | 22.525 | 1034.844 |
| 30 | \(\phi (L_t)\) | 0 | 1 | 7.181 | 498.262 |
| 30 | \(\phi \left( L_t \mid A^x_t < \Phi ^{-1}_{A^x_t}(0.05)\right)\) | −0.039 | 0.971 | −15.781 | 519.805 |
| 30 | \(\phi \left( L_t \mid A^x_t > \Phi ^{-1}_{A^x_t}(0.95)\right)\) | 0.018 | 1.124 | 13.496 | 526.045 |
| 30 | \(\phi \left( L_t \mid O^x_t < \Phi ^{-1}_{O^x_t}(0.05)\right)\) | −0.014 | 0.787 | −6.176 | 209.604 |
| 30 | \(\phi \left( L_t \mid O^x_t > \Phi ^{-1}_{O^x_t}(0.95)\right)\) | 0.028 | 1.127 | 14.669 | 528.621 |
| 60 | \(\phi (L_t)\) | 0 | 1 | 4.723 | 227.521 |
| 60 | \(\phi \left( L_t \mid A^x_t < \Phi ^{-1}_{A^x_t}(0.05)\right)\) | −0.034 | 0.761 | −6.031 | 107.801 |
| 60 | \(\phi \left( L_t \mid A^x_t > \Phi ^{-1}_{A^x_t}(0.95)\right)\) | 0.029 | 1.362 | 11.197 | 255.048 |
| 60 | \(\phi \left( L_t \mid O^x_t < \Phi ^{-1}_{O^x_t}(0.05)\right)\) | −0.024 | 0.77 | −4.014 | 99.398 |
| 60 | \(\phi \left( L_t \mid O^x_t > \Phi ^{-1}_{O^x_t}(0.95)\right)\) | −0.006 | 1.182 | 11.231 | 297.34 |
| 120 | \(\phi (L_t)\) | 0 | 1 | 4.615 | 228.85 |
| 120 | \(\phi \left( L_t \mid A^x_t < \Phi ^{-1}_{A^x_t}(0.05)\right)\) | −0.025 | 0.749 | −5.148 | 112.134 |
| 120 | \(\phi \left( L_t \mid A^x_t > \Phi ^{-1}_{A^x_t}(0.95)\right)\) | 0.027 | 1.37 | 11.155 | 253.848 |
| 120 | \(\phi \left( L_t \mid O^x_t < \Phi ^{-1}_{O^x_t}(0.05)\right)\) | −0.022 | 0.781 | −3.575 | 98.738 |
| 120 | \(\phi \left( L_t \mid O^x_t > \Phi ^{-1}_{O^x_t}(0.95)\right)\) | −0.019 | 0.906 | 3.781 | 148.533 |
| 240 | \(\phi (L_t)\) | 0 | 1 | 4.642 | 229.758 |
| 240 | \(\phi \left( L_t \mid A^x_t < \Phi ^{-1}_{A^x_t}(0.05)\right)\) | −0.021 | 0.802 | −2.709 | 106.828 |
| 240 | \(\phi \left( L_t \mid A^x_t > \Phi ^{-1}_{A^x_t}(0.95)\right)\) | 0.025 | 1.712 | 11.622 | 351.914 |
| 240 | \(\phi \left( L_t \mid O^x_t < \Phi ^{-1}_{O^x_t}(0.05)\right)\) | −0.023 | 0.768 | −3.164 | 98.804 |
| 240 | \(\phi \left( L_t \mid O^x_t > \Phi ^{-1}_{O^x_t}(0.95)\right)\) | −0.016 | 0.957 | 4.071 | 130.399 |
| 1440 | \(\phi (L_t)\) | 0 | 1 | 1.38 | 11.512 |
| Daily | \(\phi \left( L_t \mid A^x_t < \Phi ^{-1}_{A^x_t}(0.05)\right)\) | −0.102 | 0.712 | −0.693 | 5.881 |
| Daily | \(\phi \left( L_t \mid A^x_t > \Phi ^{-1}_{A^x_t}(0.95)\right)\) | 0.186 | 1.356 | 1.309 | 6.623 |
| Daily | \(\phi \left( L_t \mid O^x_t < \Phi ^{-1}_{O^x_t}(0.05)\right)\) | −0.026 | 0.919 | 2.056 | 13.222 |
| Daily | \(\phi \left( L_t \mid O^x_t > \Phi ^{-1}_{O^x_t}(0.95)\right)\) | −0.119 | 1.047 | −1.072 | 5.619 |

Each row represents different time intervals of 15, 30, 60, 120, 240 min and days. The standardized unconditional loss densities for these time intervals are also given

Conditional loss distributions Table 1 shows the unconditional loss densities, \(\phi (L_t)\) (black), and the conditional densities of the standardized Bitcoin losses over the considered time period. At each time interval, the skewness and kurtosis of the conditional loss distributions are observed to differ significantly from those of the unconditional loss distribution. Since we seek early warning indicators for extreme losses, we highlight (i.e., use a bold font) each result where the conditional loss distribution exhibits larger skewness and kurtosis.

This accentuated right skewness, combined with the higher kurtosis, indicates more extreme losses following abnormal extreme chainlet activity in the previous time period. The trend across most of the time scales is that large extreme chainlet amounts are followed by larger losses. The role of occurrences is also important—at time intervals of 30 and 60 min, high occurrences are followed by larger losses.

Note that at timescales of 15 min, conditioning on the extreme chainlets does not result in a more right skewed or fatter tailed distribution. Additional results, not shown in Table 1, show similar effects at shorter intervals.
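The conditioning used in Table 1 can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the simple empirical quantile rule and the synthetic price and activity series are hypothetical, and the kurtosis computed here is the raw (non-excess) fourth standardized moment, which may differ from the convention behind Table 1. The loss definition \(L_t=-r_t\) follows Sect. 3.1.

```python
import random
import statistics


def moments(sample):
    """Mean, standard deviation, skewness and (non-excess) kurtosis."""
    m = statistics.fmean(sample)
    s = statistics.pstdev(sample)
    skew = statistics.fmean([((v - m) / s) ** 3 for v in sample])
    kurt = statistics.fmean([((v - m) / s) ** 4 for v in sample])
    return m, s, skew, kurt


def conditional_losses(prices, activity, alpha=0.05):
    """Standardized losses, split by low and high chainlet activity.

    activity[t] is the activity observed up to time t, and the loss
    L_t = -(P_{t+1} - P_t) / P_t covers the following period.
    """
    losses = [-(prices[t + 1] - prices[t]) / prices[t]
              for t in range(len(prices) - 1)]
    m, s = statistics.fmean(losses), statistics.pstdev(losses)
    z = [(v - m) / s for v in losses]
    ranked = sorted(activity[:len(z)])
    k = max(1, round(alpha * len(ranked)))
    lo, hi = ranked[k], ranked[-k - 1]  # empirical alpha and 1 - alpha cutoffs
    low = [z[t] for t in range(len(z)) if activity[t] < lo]
    high = [z[t] for t in range(len(z)) if activity[t] > hi]
    return z, low, high


# Synthetic data: busier periods are followed by larger price moves.
random.seed(1)
activity = [random.random() for _ in range(400)]
prices = [100.0]
for a in activity:
    prices.append(prices[-1] * (1.0 + 0.01 * a * random.gauss(0.0, 1.0)))
z, low, high = conditional_losses(prices, activity)
```

Calling `moments(low)` and `moments(high)` then gives the analogue of one block of rows in Table 1 for the synthetic series.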

3.2 GARCH

Generalized autoregressive conditional heteroskedasticity (GARCH) models are popular in the financial econometrics literature for their capacity to model volatility with empirically supported properties.

Building on related studies applying GARCH to the volatility modeling of Bitcoin (Cermak 2017; Dyhrberg 2016b; Chu et al. 2017), we demonstrate the effect of including extreme chainlet activity, \(\mathbf{x}_\mathbf{t}:=[A_t^\mathrm{l}, A_t^x, A_t^\mathrm{r}, O_t^\mathrm{l}, O_t^x, O_t^\mathrm{r} ]\), in an intraday GARCHX model. To capture asymmetry in the volatility dynamics and to preserve its positivity, we choose the eGARCHX(p, q) model of Nelson (1991):
$$\begin{aligned} \ln (\sigma _t^2) & = \omega + \sum _{j=1}^q\left[ \alpha _j\epsilon _{t-j} + \gamma _j\left( |\epsilon _{t-j}| - {\mathbb {E}} |\epsilon _{t-j}|\right) \right] \\& \quad + \sum _{j=1}^p \beta _j \ln (\sigma _{t-j}^2) + \sum _{j=1}^{|x_t|}\beta ^x_{j} x_{t,j}, \end{aligned}$$
(1)
where the coefficient \(\alpha _j\) captures the sign effect, \(\gamma _j\) the size effect, and \(\beta ^\mathbf{x}\) is the coefficient vector for the exogenous chainlet activity, \(\mathbf{x}_\mathbf{t}\), of length \(|x_t|\). We additionally use an ARMA\((p',q')\) model for the mean equation:
$$\begin{aligned} y_t & = \mu + \sum _{i=1}^{p'} \phi _i y_{t-i} + \sum _{i=1}^{q'} \psi _i \epsilon _{t-i} + \sigma _t\epsilon _t, \end{aligned}$$
(2)
where \(y_t\) are the observed daily returns, \(\epsilon _t\) are standard skewed Student’s t-innovations and \(\sigma _t\) is the volatility. We refer to our model with exogenous regressors as an ARMA-eGARCHX model. The model is fitted to price time series at intervals of 15, 30, 60, 120 and 240 min. For completeness, we also fit to daily intervals. The fitted coefficients are given in the Appendix.
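As an illustration of how Eq. (1) generates the conditional variance, the following Python sketch runs the recursion for the one-lag case \(p=q=1\) with a single exogenous regressor. It is a sketch under assumptions rather than the fitted model: the function name and all parameter values are hypothetical, and \({\mathbb {E}}|\epsilon |\) is taken to be \(\sqrt{2/\pi }\), which holds for Gaussian innovations but not for the skewed Student's t innovations used in the article.

```python
import math


def egarchx_variance(z, x, omega, alpha, gamma, beta, beta_x, log_var0=0.0):
    """One-lag version of Eq. (1): returns the conditional variances.

    z: standardized innovations, x: one exogenous regressor per period.
    Assumes Gaussian innovations, so E|z| = sqrt(2 / pi).
    """
    mean_abs = math.sqrt(2.0 / math.pi)
    log_var = [log_var0]
    for t in range(1, len(z)):
        sign_effect = alpha * z[t - 1]
        size_effect = gamma * (abs(z[t - 1]) - mean_abs)
        log_var.append(omega + sign_effect + size_effect
                       + beta * log_var[t - 1] + beta_x * x[t])
    # Exponentiating keeps every variance positive without constraints.
    return [math.exp(v) for v in log_var]


# Made-up parameters: a negative alpha makes negative shocks raise variance.
params = dict(omega=0.0, alpha=-0.1, gamma=0.2, beta=0.9)
after_loss = egarchx_variance([-1.0, 0.0], [0.0, 0.0], beta_x=0.0, **params)[1]
after_gain = egarchx_variance([1.0, 0.0], [0.0, 0.0], beta_x=0.0, **params)[1]
with_chainlets = egarchx_variance([1.0, 0.0], [0.0, 1.0], beta_x=0.5, **params)[1]
```

Because the model is specified for the logarithm of the variance, the exogenous chainlet term can take either sign without violating positivity, which is the property motivating the choice of eGARCHX in the text.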

Model selection and diagnostics Additional diagnostics, provided in the Appendix, show that the returns time series are always stationary, irrespective of timescale (see Table 7). All ARMA models test positive for an ARCH effect in the residual. Table 6 shows that all models pass a Box–Ljung and Lagrange multiplier test at the 99% confidence level for the residuals and square of the residuals. Table 2 shows the specification of each optimal model, as determined by the AIC, for each timescale. Note that the optimal orders of p and q are determined separately for the mean equation and GARCH model using the returns and squared residuals, respectively.

Table 8 shows the results of various sign-bias tests. Asymmetry is observed to be significant across all the timescales. The positive sign bias and joint effect are observed to be significant at the 95% and 90% confidence levels, respectively. All fitted coefficients of the ARMA–eGARCH model are given in Tables 9, 10, 11, 12, 13 and 14.
Table 2

This table shows the fitted ARMA–eGARCH(X) models together with the training set size for each time interval

| Interval (mins) | ARMA | eGARCH | # observations |
|---|---|---|---|
| 15 | (4, 5) | (3, 5) | 99,795 |
| 30 | (5, 0) | (2, 2) | 49,923 |
| 60 | (4, 3) | (1, 0) | 24,967 |
| 120 | (5, 5) | (1, 0) | 12,483 |
| 240 | (4, 4) | (2, 5) | 5985 |
| Daily | (5, 4) | (4, 4) | 1040 |

Table 3 compares the 99% value-at-risk (VaR) backtest for each ARMA–eGARCH and ARMA–eGARCHX model, using the models specified in Table 2. The backtesting period covers the remainder of the dataset after the first 250 observations, which are used for training. The backtest is configured to use a rolling window of 250 observations and a day ahead forecast. The model is refitted approximately every 10% of the backtesting period. At each time interval, the ARMA–eGARCH model consistently and substantially underestimates the VaR, resulting in an excessive number of breaches (a.k.a. exceedances). In contrast, the number of breaches of the ARMA–eGARCHX model is closer to the expected number, with a slight underestimation bias.
Table 3

This table compares the VaR backtesting performance with and without the chainlet regressors

| Interval (mins) | Backtest length | Expected breaches | Actual VaR breaches (w/o chainlets) | Actual VaR breaches (with chainlets) |
| --- | --- | --- | --- | --- |
| 15 | 99,545 | 995.5 | 1023 | 1014 |
| 30 | 49,673 | 496.7 | 512 | 501 |
| 60 | 24,717 | 247.2 | 261 | 249 |
| 120 | 12,233 | 122.3 | 141 | 126 |
| 240 | 5735 | 57.4 | 74 | 59 |
| Daily | 790 | 7.9 | 19 | 12 |

Each row represents a different time interval: 15, 30, 60, 120 or 240 min, or daily
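The expected-breach column follows directly from the VaR tail probability: at the 99% level, about 1% of the forecasts in the backtest should be breached. The short sketch below recomputes that column from the backtest lengths in Table 3 and reports how far each model's actual breach count lies above it. The variable and function names are illustrative only.

```python
ALPHA = 0.01  # tail probability of the 99% VaR

# (interval, backtest length, breaches w/o chainlets, breaches with chainlets),
# copied from Table 3.
TABLE_3 = [
    ("15", 99545, 1023, 1014),
    ("30", 49673, 512, 501),
    ("60", 24717, 261, 249),
    ("120", 12233, 141, 126),
    ("240", 5735, 74, 59),
    ("Daily", 790, 19, 12),
]


def expected_breaches(n_forecasts: int, alpha: float = ALPHA) -> float:
    """Number of breaches expected if the VaR forecasts are well calibrated."""
    return alpha * n_forecasts


def excess_breaches(actual: int, n_forecasts: int, alpha: float = ALPHA) -> float:
    """Actual minus expected breaches; positive means the VaR was underestimated."""
    return actual - expected_breaches(n_forecasts, alpha)


for interval, length, without, with_chainlets in TABLE_3:
    print(
        interval,
        "expected:", expected_breaches(length),
        "excess w/o chainlets:", excess_breaches(without, length),
        "excess with chainlets:", excess_breaches(with_chainlets, length),
    )
```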

Table 4

This table compares Kupiec’s unconditional coverage test results for the VaR breaches, with and without the chainlet regressors

| Interval (mins) | Unconditional coverage null-hypothesis | Kupiec: correct breaches (w/o chainlets) | Kupiec: correct breaches (with chainlets) |
| --- | --- | --- | --- |
| 15 | LR.uc statistic | 11.306 | 7.291 |
| | LR.uc critical | 3.841 | 3.841 |
| | LR.uc p value | 0.001 | 0.005 |
| | Reject null | Yes | Yes |
| 30 | LR.uc statistic | 8.060 | 1.854 |
| | LR.uc critical | 3.841 | 3.841 |
| | LR.uc p value | 0.003 | 0.210 |
| | Reject null | Yes | No |
| 60 | LR.uc statistic | 4.671 | 1.755 |
| | LR.uc critical | 3.841 | 3.841 |
| | LR.uc p value | 0.035 | 0.223 |
| | Reject null | Yes | No |
| 120 | LR.uc statistic | 3.771 | 1.317 |
| | LR.uc critical | 3.841 | 3.841 |
| | LR.uc p value | 0.056 | 0.259 |
| | Reject null | No | No |
| 240 | LR.uc statistic | 28.515 | 1.516 |
| | LR.uc critical | 3.841 | 3.841 |
| | LR.uc p value | 0 | 0.242 |
| | Reject null | Yes | No |
| Daily | LR.uc statistic | 11.306 | 1.955 |
| | LR.uc critical | 3.841 | 3.841 |
| | LR.uc p value | 0.001 | 0.195 |
| | Reject null | Yes | No |

Each block of rows represents a different time interval: 15, 30, 60, 120 or 240 min, or daily

Table 5

This table compares Christoffersen’s conditional coverage test results for the VaR breaches, with and without the chainlet regressors

| Interval (mins) | Unconditional coverage null-hypothesis | Christoffersen: correct breaches and independence of failures (w/o chainlets) | Christoffersen: correct breaches and independence of failures (with chainlets) |
| --- | --- | --- | --- |
| 15 | LR.uc statistic | 9.124 | 6.899 |
| | LR.uc critical | 5.991 | 5.991 |
| | LR.uc p value | 0.007 | 0.029 |
| | Reject null | Yes | Yes |
| 30 | LR.uc statistic | 14.025 | 2.105 |
| | LR.uc critical | 5.991 | 5.991 |
| | LR.uc p value | 0.001 | 0.157 |
| | Reject null | Yes | No |
| 60 | LR.uc statistic | 6.982 | 1.592 |
| | LR.uc critical | 5.991 | 5.991 |
| | LR.uc p value | 0.03 | 0.213 |
| | Reject null | Yes | No |
| 120 | LR.uc statistic | 9.773 | 2.547 |
| | LR.uc critical | 5.991 | 5.991 |
| | LR.uc p value | 0.008 | 0.141 |
| | Reject null | Yes | No |
| 240 | LR.uc statistic | 32.487 | 1.855 |
| | LR.uc critical | 5.991 | 5.991 |
| | LR.uc p value | 0 | 0.173 |
| | Reject null | Yes | No |
| Daily | LR.uc statistic | 14.389 | 2.738 |
| | LR.uc critical | 5.991 | 5.991 |
| | LR.uc p value | 0.001 | 0.154 |
| | Reject null | Yes | No |

Each block of rows represents a different time interval: 15, 30, 60, 120 or 240 min, or daily

Table 4 compares the results of Kupiec’s unconditional coverage test applied to the ARMA–eGARCH and ARMA–eGARCHX models. This test assesses whether the number of actual breaches is consistent with the number expected given the tail probability of the VaR. The results show that at almost all time intervals, the ARMA–eGARCH model fails the backtest (rejection of \(H_0\) at the 95% confidence level), whereas the ARMA–eGARCHX model always passes the backtest at timescales of 30 min or more.
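As a rough illustration of the unconditional coverage test, the sketch below computes Kupiec’s likelihood-ratio statistic from a breach count and compares it with the chi-square critical value for one degree of freedom. The function names are illustrative and are not taken from the authors’ code. The inputs are the daily row of Table 3 without chainlets (19 breaches in 790 forecasts), for which the statistic comes out at about 11.31, in line with the daily value reported in Table 4.

```python
import math


def kupiec_lr_uc(n_forecasts: int, n_breaches: int, alpha: float) -> float:
    """Kupiec likelihood-ratio statistic for unconditional coverage.

    Compares the binomial likelihood under the nominal breach probability
    alpha with the likelihood under the observed breach frequency.
    """
    x, n = n_breaches, n_forecasts

    def loglik(p: float) -> float:
        # Binomial log-likelihood; a zero count contributes nothing.
        ll = 0.0
        if x > 0:
            ll += x * math.log(p)
        if n - x > 0:
            ll += (n - x) * math.log(1.0 - p)
        return ll

    return -2.0 * (loglik(alpha) - loglik(x / n))


def chi2_sf_df1(stat: float) -> float:
    """Survival function of the chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


CRITICAL_95_DF1 = 3.841  # 95% quantile, chi-square with 1 degree of freedom

# Daily interval, model without chainlets (Table 3): 19 breaches in 790 forecasts.
lr = kupiec_lr_uc(n_forecasts=790, n_breaches=19, alpha=0.01)
print("LR.uc statistic:", round(lr, 3))
print("p value:", round(chi2_sf_df1(lr), 4))
print("Reject null:", lr > CRITICAL_95_DF1)
```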

Finally, Table 5 shows the results of Christoffersen’s conditional coverage test. This is a joint test of the unconditional coverage and the independence of the breaches. Both the joint test and the separate unconditional test are reported, since the joint test can pass even when either the independence or the unconditional coverage test fails. The results again show that the ARMA–eGARCH model always fails the backtest (rejection of \(H_0\) at the 95% confidence level), whereas the ARMA–eGARCHX model always passes the backtest at timescales of 30 min or more.
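The role of the independence component can be seen in a small sketch of the conditional coverage statistic, written as the sum of the unconditional coverage statistic and a first-order Markov independence statistic. The two breach sequences below are hypothetical (not the paper’s data) and the function names are illustrative. Both sequences contain exactly the expected number of breaches, so only the clustering of breaches separates them.

```python
import math
from typing import Sequence, Tuple


def _xlogy(count: float, prob: float) -> float:
    """count * log(prob), with the convention that a zero count contributes 0."""
    return count * math.log(prob) if count > 0 else 0.0


def christoffersen_lr(hits: Sequence[int], alpha: float) -> Tuple[float, float, float]:
    """Return (LR_uc, LR_ind, LR_cc) for a 0/1 breach indicator sequence."""
    n, x = len(hits), sum(hits)
    pi_hat = x / n
    lr_uc = -2.0 * (
        _xlogy(x, alpha) + _xlogy(n - x, 1.0 - alpha)
        - _xlogy(x, pi_hat) - _xlogy(n - x, 1.0 - pi_hat)
    )

    # Transition counts: state i at time t-1 followed by state j at time t.
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for prev, cur in zip(hits, hits[1:]):
        counts[(prev, cur)] += 1
    n00, n01 = counts[(0, 0)], counts[(0, 1)]
    n10, n11 = counts[(1, 0)], counts[(1, 1)]

    pi01 = n01 / (n00 + n01) if n00 + n01 else 0.0
    pi11 = n11 / (n10 + n11) if n10 + n11 else 0.0
    pi = (n01 + n11) / (n00 + n01 + n10 + n11)

    ll_independent = _xlogy(n00 + n10, 1.0 - pi) + _xlogy(n01 + n11, pi)
    ll_markov = (
        _xlogy(n00, 1.0 - pi01) + _xlogy(n01, pi01)
        + _xlogy(n10, 1.0 - pi11) + _xlogy(n11, pi11)
    )
    lr_ind = -2.0 * (ll_independent - ll_markov)
    return lr_uc, lr_ind, lr_uc + lr_ind


CRITICAL_95_DF2 = 5.991  # 95% quantile, chi-square with 2 degrees of freedom

# Hypothetical sequences with 10 breaches in 1000 forecasts each.
spread = [1 if t % 100 == 50 else 0 for t in range(1000)]
clustered = [1 if 500 <= t < 510 else 0 for t in range(1000)]

for name, hits in (("spread", spread), ("clustered", clustered)):
    lr_uc, lr_ind, lr_cc = christoffersen_lr(hits, alpha=0.01)
    p_value = math.exp(-lr_cc / 2.0)  # chi-square survival function, 2 d.o.f.
    print(name, round(lr_uc, 3), round(lr_ind, 3), round(lr_cc, 3),
          "reject" if lr_cc > CRITICAL_95_DF2 else "do not reject",
          round(p_value, 4))
```

In this sketch the clustered sequence is rejected by the joint test even though its breach count is exactly as expected, which is the situation the independence component is designed to detect.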

In additional results under a quadratic loss function, not shown here, we reject \(H_0\) in the Diebold–Mariano test and conclude that the differences between the ARMA–eGARCH and ARMA–eGARCHX model residuals are always significant at the 95% level.
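For reference, a minimal sketch of the Diebold–Mariano statistic under quadratic loss is given below. It assumes one-step-ahead forecasts, so that the loss differentials can be treated as serially uncorrelated and no autocovariance correction is applied. The forecast errors are simulated and hypothetical, and the function name is illustrative.

```python
import math
import random
from typing import Sequence, Tuple


def diebold_mariano(errors_a: Sequence[float],
                    errors_b: Sequence[float]) -> Tuple[float, float]:
    """DM statistic and two-sided normal p value under quadratic loss.

    The loss differential is d_t = e_a,t^2 - e_b,t^2. Under the null of
    equal predictive accuracy, the statistic is asymptotically N(0, 1).
    """
    d = [ea * ea - eb * eb for ea, eb in zip(errors_a, errors_b)]
    t = len(d)
    mean_d = sum(d) / t
    # Variance of d with the 1/T normalisation (one-step-ahead case only).
    var_d = sum((v - mean_d) ** 2 for v in d) / t
    dm = mean_d / math.sqrt(var_d / t)
    p_value = math.erfc(abs(dm) / math.sqrt(2.0))
    return dm, p_value


# Hypothetical forecast errors: model A is noisier than model B.
rng = random.Random(1)
errors_a = [rng.gauss(0.0, 1.5) for _ in range(500)]
errors_b = [rng.gauss(0.0, 1.0) for _ in range(500)]

dm, p_value = diebold_mariano(errors_a, errors_b)
print("DM statistic:", round(dm, 3))
print("p value:", p_value)
print("Reject equal accuracy at the 95% level:", p_value < 0.05)
```

A positive statistic indicates larger quadratic losses for the first model.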

4 Summary and outlook

In this article, we model the Blockchain transaction history of Bitcoin with high-fidelity graphs. Extreme chainlet activity, characterized by transaction amounts and occurrences, is shown empirically to result in significant changes in the intraday volatility. With the inclusion of these chainlet activities as external regressors in the conditional variance equation, we show a significant improvement in the GARCH model for predicting next-period extreme losses at scales of 15, 30, 60, 120 and 240 min, and also daily losses. Across all timescales at 30-min resolution or larger, the inclusion of extreme chainlet regressors reduces the number of false next-period 99% VaR breaches or under-breaches by between 10% and 90% over an approximately 2-year backtesting horizon.

Our experiments show that extreme chainlets are well suited as a tool for risk-averse Bitcoin investors anticipating large market exposures. Broadly, extreme chainlets provide a more granular representation of the market than Bitcoin price information, enabling investors to make more informed portfolio allocation and hedging decisions. The ability to link extreme chainlets to price movement also supports their usage for speculation. For example, Bitcoin ‘whale’ activity might be linked to the extreme chainlets to trace the impact of whale wallets on prices and risk.

In future research, we seek to characterize the temporal evolution of volatility predictability (see Antulov-Fantulin et al. 2019 for details) using extreme chainlets. Such a study would provide further insight into the reliability of extreme chainlets as a short-term predictive indicator of risk. Furthermore, there is scope for building on the study by Akcora et al. (2018a), who find that combining transactional volume with extreme chainlets can yield stronger predictive performance over longer time horizons.

Additionally, a future direction for research would be to link the extreme chainlets with relevant news events, SEC announcements, social media accounts and other relevant macro-economic data. Aggregating data across multiple sources, we shall evaluate the extent to which the extreme chainlets support geo-location-sensitive querying of events. For example, one approach could be to develop a time-based anomaly index for Bitcoin which measures geo-sensitive sentiment through abnormal extreme chainlet patterns.

Footnotes

  1. Chainlet matrices are available from https://github.com/cakcora/CoinWorks.

Notes

Acknowledgements

The authors are grateful for useful comments by the reviewers. The work of Dixon is partially supported by NSF EEC 1840433 and Intel Corporation. The work of Gel is partially supported by NSF IIS 1633331, NSF DMS 1736368 and NSF ECCS 1824710. The work of Kantarcioglu was supported in part by NIH award 1R01HG006844, NSF awards CICI-1547324, IIS-1633331, CNS-1837627, OAC-1828467 and ARO award W911NF-17-1-0356.

References

  1. Akcora, C. G., Gel, Y. R., & Kantarcioglu, M. (2017). Blockchain: A graph primer. arXiv:1708.08749.
  2. Akcora, C. G., et al. (2018a). Forecasting Bitcoin price with graph chainlets. In The 22nd pacific-asia conference on knowledge discovery and data mining, PaKDD.
  3. Akcora, C. G., et al. (2018b). Bitcoin risk modeling with blockchain graphs. Economics Letters, 173, 138–142.
  4. Antulov-Fantulin, N., et al. (2019). Inferring short-term volatility indicators from the bitcoin blockchain. In L. M. Aiello, et al. (Eds.), Complex networks and their applications VII (pp. 508–520). Cham: Springer International Publishing. (ISBN: 978-3-030-05414-4).
  5. Ardia, D., Bluteau, K., & Rüede, M. (2018). Regime changes in bitcoin GARCH volatility dynamics. Finance Research Letters, 29, 266–271. https://doi.org/10.1016/j.frl.2018.08.009. (ISSN: 1544-6123).
  6. Borovkova, S. A., & Mahakena, D. (2015). News, volatility and jumps: The case of natural gas futures. Quantitative Finance, 15(7), 1217–1242. https://doi.org/10.1080/14697688.2014.986513. (ISSN: 1469-7688).
  7. Caporale, G. M., Gil-Alana, L., & Plastun, A. (2018). Persistence in the cryptocurrency market. Research in International Business and Finance, 46, 141–148. https://doi.org/10.1016/j.ribaf.2018.01.002.
  8. Cermak, V. (2017). Can bitcoin become a viable alternative to fiat currencies? An empirical analysis of bitcoin’s volatility based on a GARCH model (pp. 1–53).
  9. Chu, J., et al. (2017). GARCH modelling of cryptocurrencies. Journal of Risk and Financial Management, 10, 17. https://doi.org/10.3390/jrfm10040017. http://www.mdpi.com/1911-8074/10/4/17 (ISSN: 1911-8074).
  10. Corbet, S., et al. (2017). Exploring the dynamic relationships between cryptocurrencies and other financial assets. Economics Letters, 165, 28–34.
  11. Dyhrberg, A. H. (2016a). Bitcoin, gold and the dollar – A GARCH volatility analysis. Finance Research Letters, 16, 85–92.
  12. Dyhrberg, A. H. (2016b). Bitcoin, gold and the dollar – A GARCH volatility analysis. Finance Research Letters, 16, 85–92. https://doi.org/10.1016/j.frl.2015.10.008. (ISSN: 1544-6123).
  13. Gomber, P., Koch, J.-A., & Siering, M. (2017). Digital Finance and FinTech: Current research and future research directions. Journal of Business Economics, 87(5), 537–580.
  14. Greaves, A., & Au, B. (2015). Using the bitcoin transaction graph to predict the price of bitcoin. No data.
  15. Guo, T., & Antulov-Fantulin, N. (2018). An experimental study of bitcoin fluctuation using machine learning methods. arXiv:1802.04065 [stat.ML].
  16. Kondor, D., et al. (2014). Inferring the interplay between network structure and market effects in bitcoin. New Journal of Physics, 16(12), 125003.
  17. Madan, S., Saluja, I., & Zhao, A. (2015). Automated bitcoin trading via machine learning algorithms. Technical report, Department of Computer Science, Stanford University.
  18. Maxwell, G. (2013). CoinJoin: Bitcoin privacy for the real world. In Post on bitcoin Forum. https://bitcointalk.org/index.php?topic=279249.0.
  19. Nakamoto, S. (2008). Bitcoin: A peer-to-peer electronic cash system. https://bitcoin.org/bitcoin.pdf.
  20. Nelson, D. B. (1991). Conditional heteroskedasticity in asset returns: A new approach. Econometrica, 59(2), 347–370. http://www.jstor.org/stable/2938260 (ISSN: 00129682, 14680262).
  21. Shah, D., & Zhang, K. (2014). Bayesian regression and bitcoin. In Communication, control, and computing (Allerton), 2014 52nd annual Allerton conference. IEEE (pp. 409–414).
  22. Shephard, N., & Sheppard, K. (2010). Realising the future: Forecasting with high-frequency-based volatility (HEAVY) models. Journal of Applied Econometrics, 25(2), 197–231. https://doi.org/10.1002/jae.1158.
  23. Sorgente, M., & Cibils, C. (2014). The reaction of a network: Exploring the relationship between the bitcoin network structure and the bitcoin price. Technical report, Department of Computer Science, Stanford University.
  24. Sovbetov, Y. (2018). Factors influencing cryptocurrency prices: Evidence from bitcoin, Ethereum, Dash, Litcoin, and Monero. Journal of Economics and Financial Analysis, 2(2), 1–27.
  25. Steinbach, M., Karypis, G., & Kumar, V., et al. (2000). A comparison of document clustering techniques. In KDD workshop on text mining, Boston (Vol. 400, no. 1, pp. 525–526).
  26. Tasca, P., Hayes, A., & Liu, S. (2018). The evolution of the bitcoin economy: Extracting and analyzing the network of payment relationships. The Journal of Risk Finance, 19(2), 94–126.
  27. Tschorsch, F., & Scheuermann, B. (2016). Bitcoin and beyond: A technical survey on decentralized digital currencies. IEEE Communications Surveys & Tutorials, 18(3), 2084–2123.
  28. Yang, S. Y., & Kim, J. (2015). Bitcoin market return and volatility forecasting using transaction network flow properties. In IEEE SSCI (pp. 1778–1785).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Matthew F. Dixon (1), Email author
  • Cuneyt Gurcan Akcora (4)
  • Yulia R. Gel (3)
  • Murat Kantarcioglu (2)
  1. Department of Applied Mathematics, Illinois Institute of Technology, Chicago, USA
  2. Data Security and Privacy Lab, University of Texas at Dallas, Richardson, USA
  3. Department of Mathematical Sciences, University of Texas at Dallas, Richardson, USA
  4. Department of Computer Science and Statistics, University of Manitoba, Winnipeg, Canada
