1 Introduction

The number of catastrophic events has increased over recent decades, which creates a need to extend insurance capacity in terms of risk capital [25]. Economic losses caused by natural disasters amounted to USD 123 billion in 2015, but only 28 percent of these global economic losses (USD 35 billion) were covered by insurance [4]. Since the early 1990s, the market for cat bonds has grown steadily, providing an additional source of risk capital to insurance and reinsurance companies. Cat bonds are alternative risk transfer instruments used to lay off natural disaster risk in the capital markets and to meet funding demands following natural catastrophes. Cat bonds offer the investor regular coupons and repayment of the principal at maturity unless the predefined catastrophic event occurs, in which case the principal is fully or partially lost. For insurers and reinsurers, cat bonds are hedging instruments that offer protection without credit risk, because the principal serves as full collateral to cover the losses if the trigger condition is fulfilled. The complexity of the factors that determine the structure and terms of cat bonds, together with the scarcity of information and historical data about past extreme events, makes loss probabilities hard to predict, especially for catastrophic events with low probability and high impact. Although the body of literature on cat bonds is growing continuously, there appears to be a lack of research on how to implement data-intensive analytics to improve loss prediction and the calculation of adequate risk premiums. Data-intensive analytics should also be used for pricing cat bonds, since the sustainable growth of the cat bond market depends to some extent on cat bond pricing techniques. When a variety of cat bonds is available in the capital market, data-intensive analytics could additionally provide decision support for investors.

In this paper we first present a comprehensive literature review of the data-intensive analytics relevant to optimizing cat bond policy values. We focus on how these methods handle the growing volume of additional information in order to make predictions more precise. Data-intensive analytics also reduces the dependence on historical data, which is not sufficiently available. Our purpose is to outline a data-intensive analytics platform that uses advanced modeling, computing, and database techniques to provide decision support for potential policyholders, investors, the insurer and the reinsurer. With more precise prediction, the insurability of low-probability, high-impact risks will be enhanced. To the best of our knowledge, this is the first approach to combine data-intensive analytics and alternative risk transfer. It is also the first paper to focus on the information flows of an extended cat bond structure. As such, the paper is a first step towards a new research direction in alternative risk transfer and related fields.

Our paper is organized as follows. Section 2 introduces a typical cat bond structure. Section 3 reviews the literature on data-intensive analytics in order to classify and compare the methods. In Sect. 4, we classify policyholders of catastrophe risks, present an extended cat bond structure, and analyze the information flows within it. We then propose a data-intensive analytics platform to provide decision support for all participants of the extended cat bond structure. Finally, Sect. 5 concludes with a summary of our contribution and directions for future research.

Fig. 1. Financial flows for a cat bond (see Sect. 2 for detailed notation). Source: [9]

2 Cat Bond Structure

Catastrophic risks (e.g. earthquake or flood) carry an extraordinary inherent loss potential, and the high correlation of the losses constitutes a high risk for the insurer and reinsurer [6]. Up to the 1990s there was no alternative to classic reinsurance. However, the high losses from Hurricane Andrew in 1992, which exceeded many times over the loss sizes from a natural disaster that had been considered plausible up to that moment, drove the development of financial instruments for hedging catastrophic risks [25]. Cat bonds cover the high layers of reinsurance protection, which are difficult to insure. These high layers often go uninsured for two reasons. On the one hand, primary insurance companies are concerned about the credit risk of the reinsurance company in the case of a catastrophic event; on the other hand, the high pricing spreads and reinsurance margins that reinsurance companies put on high layers result in high costs for the primary insurer [7]. Because cat bonds are fully collateralized, they eliminate the concerns about credit risk and offer insurance for a lower spread [8].

The covered territory is the geographic area in which catastrophes need to occur to be relevant under the bond contract; it is usually defined in terms of countries, regions or states. The type of catastrophe covered by a cat bond, normally earthquakes, windstorms or multiple perils, is called the reference peril. The most common combinations of territory and peril are as follows [14]:

(a) US Wind: Bonds that reference severe windstorms in the United States, in particular Florida and Gulf Coast hurricane risk

(b) US Earthquake: Seismic events, mostly focused on California

(c) Europe Wind: Covers extratropical cyclones that affect Northern and Western European countries

(d) Japan Earthquake: Earthquakes that occur due to the rifts of the tectonic plates and that could also lead to damage by a subsequent tsunami

A typical cat bond structure is shown in Fig. 1 [9, 14]. The transaction starts with a single purpose reinsurer (SPR), which issues bonds to investors. The SPR places the principal received from the investors in highly rated short-term investments. A call option embedded in the cat bonds is triggered by a predefined catastrophic event. Once this event occurs, principal is released from the SPR to help the insurer pay claims arising from the event [8], so the investors lose the principal entirely or partly. If no predefined catastrophic event occurs during the term of the cat bonds, the principal is returned to the investors when the bonds expire. In return for the risk that the investors take, the insurer pays a premium to the investors. The fixed returns on the securities held in the trust are usually swapped for floating returns based on LIBOR or another widely accepted index. The swap is intended to immunize the insurer and the investors against interest rate risk and default risk [9].
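As a rough illustration of these financial flows, the following Python sketch computes an investor's cash flows for a single cat bond. The function name, the quarterly coupon convention, the constant rates, and the rule that the coupon of the trigger period is already paid on the reduced principal are all hypothetical assumptions, not terms taken from [9] or [14]. The sketch only mirrors the qualitative structure: a floating return plus a premium, and a principal that is reduced if the predefined event occurs.

```python
from typing import List, Optional


def investor_cash_flows(
    principal: float,
    floating_rate: float,      # annual floating return on the collateral (assumed constant)
    premium_rate: float,       # annual risk premium paid by the insurer (assumed)
    periods: int,              # number of coupon periods until maturity
    periods_per_year: int = 4,
    trigger_period: Optional[int] = None,  # period in which the event occurs, if any
    loss_fraction: float = 0.0,            # share of the principal released to the insurer
) -> List[float]:
    """Return the investor's cash flow for each period (illustrative only)."""
    outstanding = principal
    flows = []
    for t in range(1, periods + 1):
        if trigger_period is not None and t == trigger_period:
            # Trigger: part or all of the principal is released to pay claims.
            outstanding -= principal * loss_fraction
        # Assumption: coupons are paid on the principal still outstanding.
        coupon = outstanding * (floating_rate + premium_rate) / periods_per_year
        if t == periods:
            # At maturity the remaining principal is returned to the investor.
            flows.append(coupon + outstanding)
        else:
            flows.append(coupon)
    return flows
```

Calling the function without a trigger period gives the no-event case, in which the full principal is returned with the last coupon; setting `loss_fraction=1.0` models a total loss of principal.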

According to [9], cat bonds have been structured to pay off on the following basic types of triggers:

(a) indemnity triggers, where the bond payoffs are determined by the event losses of the issuing insurer

(b) industry-index triggers, where the bond payoffs are triggered by the value of an industry loss index

(c) modeled loss triggers, where the payoff is determined by simulated losses generated by inputting specific event parameters into the catastrophe model maintained by one of the catastrophe modeling firms

(d) parametric triggers, which pay off if the covered event exceeds a specified physical severity level

(e) hybrid triggers, which blend more than one trigger in a single bond [8]

The choice of trigger type has an important impact on the cat bond structure; for example, an indemnity trigger can induce information asymmetry.
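The difference between the trigger types can be made concrete with a small Python sketch. All names, the `attachment` and `limit` parameters, and the choice to let the hybrid trigger require both conditions are illustrative assumptions; actual bond terms vary, and the modeled loss trigger is omitted because it would require a catastrophe model.

```python
from dataclasses import dataclass


@dataclass
class Event:
    issuer_loss: float    # losses of the issuing insurer
    industry_loss: float  # value of an industry loss index
    magnitude: float      # physical severity, e.g. an earthquake magnitude


def indemnity_payoff(event: Event, attachment: float, limit: float) -> float:
    """Payoff driven by the issuer's own losses above an assumed attachment point."""
    return min(max(event.issuer_loss - attachment, 0.0), limit)


def industry_index_triggered(event: Event, index_threshold: float) -> bool:
    """Trigger based on the value of an industry loss index."""
    return event.industry_loss > index_threshold


def parametric_triggered(event: Event, severity_threshold: float) -> bool:
    """Trigger based on a physical severity level being exceeded."""
    return event.magnitude > severity_threshold


def hybrid_triggered(event: Event, index_threshold: float,
                     severity_threshold: float) -> bool:
    """One possible blend (assumed here): both conditions must hold."""
    return (industry_index_triggered(event, index_threshold)
            and parametric_triggered(event, severity_threshold))
```

Only `indemnity_payoff` reads the issuer's own loss figure. This is why that trigger type depends on information that the issuer controls, whereas the index and parametric triggers rely on externally observable quantities.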

3 Literature Review

Numerical analysis methods have been used for pricing cat bonds in [15, 18, 26, 32]. In [18], a model is proposed to price default-free and default-risky cat bonds with a simple payoff function, taking into account the default risk, basis risk, and moral hazard associated with cat bonds. Default risk is the risk that the insurer becomes insolvent and defaults. Basis risk is the risk that the losses incurred by the insurer do not correlate as anticipated with the underlying loss index of the cat bond. Moral hazard needs to be taken into account because economic losses are determined by the firm issuing the cat bond when the losses incurred approach the trigger level. Moral hazard behavior occurs when the insurer’s cost of loss control efforts exceeds the benefits from debt forgiveness. Cat bonds with stepwise and piecewise payoff functions are developed in [26]. An arbitrage approach is applied to cat bond pricing in [32]. In a case study of cat bonds for earthquakes sponsored by the Mexican government, cat bonds proved to be a good choice for providing coverage at a lower cost and with lower exposure at default than reinsurance itself [15]. Hybrid cat bonds, which combine the transfer of cat risk with protection against a stock market crash, are proposed in [3] to complete the market. According to the authors of [3], replacing simple cat bonds with hybrid cat bonds would lead to an increase in market volume.

Natural catastrophes include hurricanes, earthquakes, severe thunderstorms, floods, extratropical cyclones, wildfires, winter storms, etc. Predicting a catastrophe is important for reducing the economic losses of local residents and companies. For instance, long-term (one year or longer) catastrophe prediction helps firms and insurers to hedge catastrophe risks with low loss frequency and high loss severity. Short-term catastrophe prediction provides decision support for firms to prepare before a catastrophe, evacuate efficiently once it happens, and recover quickly afterwards. The economic losses of a firm after a catastrophe depend on the accuracy of catastrophe prediction. Natural disaster prediction is a typical data-intensive scientific application. Predictive modeling is at the intersection of machine learning, statistical modeling, and database technology [2]. Dedicated prediction tools and methods have been developed for different kinds of catastrophes.

According to [1], hurricane prediction models should consider temperature and wind observations near the center of the storm, as well as specific humidity observations. The record of net hurricane power dissipation, which is related to the severity of the hurricane, is found to be highly correlated with tropical sea surface temperature, including multi-decadal oscillations in the North Atlantic and North Pacific, and global warming [12]. A variety of premonitory effects have been observed before various earthquakes, such as crustal movements and anomalous changes in tilt, fluid pressure, electrical and magnetic fields, radon emission, the frequency of occurrence of small local earthquakes, and the ratio of the number of small to large shocks [28]. According to [22], a large amount of seismological data is required for earthquake hazard assessment and earthquake prediction. Data mining techniques used for the prediction of earthquakes are reviewed in [27]. Machine learning techniques are frequently applied in the field of forecasting, for example in [19, 21]. In [21], the Support Vector Machine (SVM) concept, which is based on statistical learning theory, is explored for flood prediction. The Geospatial Stream Flow Model (GeoSFM) is applied in [11] for flood forecasting and stream flow simulation with remotely acquired data. New techniques continue to emerge for data-intensive scientific discovery. In [20], two kinds of typhoon rainfall forecasting models, SVM-based and BPN (backpropagation network)-based models, are investigated. Compared with BPNs, which are the most frequently used conventional neural networks (NNs), SVMs have advantages in their generalization ability. In the investigation of [20], the proposed SVM-based models are more accurate, robust and efficient than existing BPN-based models. Furthermore, the proposed modeling technique is also expected to be helpful in supporting flood, landslide, debris flow and other disaster warning systems and should therefore be considered among other methods for prediction in our case.

The increasing volume of additional information available from internal and external sources (e.g. weather forecasts, seismic monitors, and satellite data) improves the predictability of loss probabilities and thus helps to set up beneficial cat bond structures. In recent years, advanced data processing methods for handling data-intensive applications have become available. These applications are usually associated with the execution of computations on large data sets or large data structures [29]. Such computations are beyond the capability of any individual machine and require clusters, which means that large-data problems are fundamentally about organizing computations on dozens, hundreds, or even thousands of machines.

Since the prediction of loss probabilities in catastrophic environments involves large-scale data, predictive modeling requires high performance computing platforms. Three classes of platforms are mainly used to deal with large-scale data, namely batch processing tools, stream processing tools, and interactive analysis tools [5]. With batch processing tools, multiple jobs can be processed simultaneously. Most batch processing tools, such as Mahout, an Apache project for building scalable machine learning libraries [30], are based on the Apache Hadoop infrastructure. MapReduce is a programming model and an associated implementation for processing and generating large datasets that is amenable to a broad variety of real-world tasks [10]. The MapReduce programming model has been used at Google for many different purposes. Stream processing platforms are necessary for real-time analytics in stream data applications. S4 (Simple Scalable Streaming System) is a distributed stream processing engine inspired by the MapReduce model [24]. S4 is designed for solving real-world problems in the context of search applications that use data mining and machine learning algorithms. For example, in order to provide personalized search advertising for millions of unique users, thousands of queries per second must be processed in real time, which creates a clear need for highly scalable stream computing solutions such as S4. Another stream processing platform is Twitter Storm, a distributed, fault-tolerant, real-time platform for large-scale streaming data analytics [31]. Interactive analysis processes the data in an interactive environment, allowing users to undertake their own analysis of information [5]. For instance, Dremel is a distributed system that supports interactive analysis of very large datasets over clusters of machines [23]. Apache Drill is designed to handle up to petabytes of data spread across thousands of servers; the goal of Drill is to respond to ad-hoc queries in a low-latency manner [16].
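To make the MapReduce programming model more tangible, the following single-process Python sketch imitates its three phases (map, group by key, reduce) on a handful of loss records. It is a toy under stated assumptions: the records are invented for illustration, the helper names are hypothetical, and nothing here uses the Hadoop API or distributes work across machines as a real framework would.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Record = Tuple[str, float]  # (peril, loss amount); made-up illustrative data


def map_reduce(
    records: Iterable[Record],
    mapper: Callable[[Record], Iterable[Tuple[str, float]]],
    reducer: Callable[[str, List[float]], float],
) -> Dict[str, float]:
    """Run mapper over all records, group the values by key, then reduce."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):   # map phase
            grouped[key].append(value)      # shuffle phase: group by key
    # reduce phase: one reducer call per key
    return {key: reducer(key, values) for key, values in grouped.items()}


def loss_mapper(record: Record) -> Iterable[Tuple[str, float]]:
    """Emit the loss amount keyed by peril."""
    peril, loss = record
    yield peril, loss


def total_reducer(peril: str, losses: List[float]) -> float:
    """Sum all losses recorded for one peril."""
    return sum(losses)


records = [("earthquake", 12.0), ("wind", 3.5), ("earthquake", 8.0), ("flood", 1.5)]
totals = map_reduce(records, loss_mapper, total_reducer)
```

Because each mapper call sees only one record and each reducer call sees only one key, both phases can in principle run in parallel on different machines, which is the property that makes the model suitable for clusters.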

Big data analytics requires innovations in both hardware and software. Future hardware innovations will continue to drive software innovation [17]. The authors of [17] propose that the main focus will be on minimizing the time spent moving data from storage to the processor or between storage/compute nodes in a distributed setting.

4 Data-Intensive Analytics

4.1 Classification of Policyholders of Catastrophe Risk

Economic losses after a catastrophe depend on the severity of the catastrophe and on how well it was predicted in advance. To some extent, the role of a firm in a supply chain also affects the economic losses of the firm after a catastrophic event. In addition, the flexibility of the policyholder and the policy values of cat bonds affect the economic losses of the policyholder in case of a catastrophe. According to the role of a firm in a supply chain, policyholders are classified into raw material suppliers, manufacturers, carriers, distributors, retailers, etc. (see Fig. 2).

Fig. 2. Information flow for a cat bond (see Sect. 4.2 for detailed introduction)

From a farmer’s point of view, the main catastrophe concern is extreme weather. The economic losses mainly depend on the scale of the insured farm and the severity of the catastrophe. The economic losses of a manufacturer in case of a catastrophe come from two sources: direct losses from rebuilding or repairing destroyed facilities, and indirect losses related to business interruptions and to the temporary relocation and/or rerouting of materiel. In this case, the amount of economic loss, especially of indirect loss, depends on the prediction available to the manufacturer and on the manufacturer’s flexibility. The economic losses of a carrier company in case of a catastrophe depend on the accuracy of catastrophe prediction. For instance, with detailed and accurate weather prediction, the carrier company will be able to modify its transport planning to avoid economic losses from extreme weather. Once a distributor is hit by a catastrophe, facilities and inventories at the distribution center may be ruined. Compared with the manufacturer, however, the function of a distributor in the supply chain is more replaceable. The indirect economic losses of a retailer due to lost customers and unsatisfied demand are crucial to the whole supply chain. Fast replenishment of final products will effectively reduce the negative impact on the supply chain, but the replenishment speed after a catastrophe depends on the flexibility of the supply chain.

4.2 Information Flow for an Extended Cat Bond Structure

Cat bond structures usually display the cash flows and the risk transfer. In Fig. 2, an extended cat bond structure with information flows is provided.

If all participants in the extended cat bond structure are seen as an interconnected system, the external risk derives from the potential economic losses of policyholders caused by natural catastrophes. According to the analysis in Sect. 4.1, the economic losses of policyholders depend on the severity and probability of catastrophes, the prediction techniques, the role of the policyholder in a supply chain, and the flexibility of the supply chain. In the process of transferring catastrophic risk from policyholders to investors via capital markets, the credit and behavior of the insurer and the reinsurer influence the risks that investors take.

The price of a cat bond is influenced by a number of factors, e.g.:

(a) the probability and severity of the catastrophe or catastrophes,

(b) policyholder’s role in a supply chain,

(c) policyholder’s response speed once the catastrophe occurs,

(d) catastrophe prediction and the accuracy of the prediction that the policyholder can receive in advance,

(e) the credit and behavior of the insurer and the reinsurer involved in the cat bond structure.

Policyholders possess all the information about themselves and their potential economic losses from catastrophes. The insurer, however, cannot obtain all of this information, especially the policyholders’ confidential commercial information. Conversely, policyholders hold only part of the insurer’s knowledge about evaluating catastrophic risk. Information asymmetry thus occurs at every interface of this extended cat bond structure. Owing to their business model, the insurance companies have an information advantage over the investors regarding the policyholders and the potential risk.

As the ultimate bearers of risk, investors have the least information about catastrophes and policyholders.

Due to information asymmetry among policyholders, the insurer, the reinsurer and investors, the price of cat bonds may not reflect the real value of the risk. Without a proper price for cat bonds, basis risk and moral hazard cannot be reduced. Basis risk might reduce the hedging effect of cat bonds and increase the default probability of the issuing firm [18]. Enhancing the information flow with more appropriate methods can lead to less information asymmetry and therefore to better pricing and more efficient risk sharing [13].

4.3 Data-Intensive Analytics for Cat Bonds

Because all participants in Fig. 2 take catastrophe prediction into account, data-intensive analytics is required by all of them.

For a company in an environment with possible catastrophic disruptions, data-intensive analytics is required to provide strategic decision support. Strategies such as purchasing catastrophe insurance and/or keeping the company highly flexible can be chosen to reduce economic losses once a catastrophe occurs. The company should take into account factors including the moment of occurrence and the severity of the catastrophe. However, our literature review shows that these factors cannot yet be predicted accurately and precisely. The best information that can be obtained from current catastrophe prediction techniques is the probability distribution of the severity and occurrence time of a catastrophe. In this case, stochastic programming will help the company to make a beneficial decision. Stochastic programming has been applied in a broad range of areas, from finance to production and transportation planning. To obtain a high-quality solution to a stochastic programming problem, a high performance computing (HPC) platform is required.
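The kind of decision described here can be sketched as a deliberately tiny scenario-based problem in the spirit of stochastic programming. In the Python sketch below, the scenario probabilities, loss amounts, strategy names and cost figures are all invented for illustration, and the expected-cost criterion is an assumed, risk-neutral objective. A realistic model would contain far more decision variables and scenarios, which is where an HPC platform becomes necessary.

```python
from typing import Dict, List, Tuple

Scenario = Tuple[float, float]   # (probability, loss if the company is unprotected)
Strategy = Tuple[float, float]   # (upfront cost, share of the loss still retained)


def expected_cost(upfront_cost: float, retained_share: float,
                  scenarios: List[Scenario]) -> float:
    """Upfront cost plus the probability-weighted retained loss."""
    return upfront_cost + sum(p * retained_share * loss for p, loss in scenarios)


def best_strategy(strategies: Dict[str, Strategy],
                  scenarios: List[Scenario]) -> Tuple[str, Dict[str, float]]:
    """Pick the strategy with the lowest expected cost over all scenarios."""
    if abs(sum(p for p, _ in scenarios) - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    costs = {name: expected_cost(cost, share, scenarios)
             for name, (cost, share) in strategies.items()}
    return min(costs, key=costs.get), costs


# Hypothetical severity distribution: no event, moderate event, severe event.
scenarios = [(0.90, 0.0), (0.08, 100.0), (0.02, 1000.0)]
# Hypothetical strategies: do nothing, buy insurance, invest in flexibility.
strategies = {
    "no_action": (0.0, 1.0),
    "insurance": (20.0, 0.1),
    "flexibility": (5.0, 0.7),
}
best, costs = best_strategy(strategies, scenarios)
```

With these made-up numbers the insurance strategy has the lowest expected cost, but the ranking changes as soon as the severity distribution or the cost parameters change, which illustrates why the quality of the prediction matters for the decision.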

For an insurer, data-intensive analytics is also required before deciding to transfer catastrophic risk to the capital market. The insurer needs to answer two crucial questions: (1) of all the risks that the insurer holds, which catastrophic risk should be transferred to the capital market through cat bonds, and (2) how much is the insurer willing to pay for transferring the risk? Answering these questions requires the following information: the most precise and accurate prediction available of all catastrophe risks that the insurer holds, the economic loss of each related policyholder once the corresponding catastrophe occurs, and the interdependency of the risks that the insurer holds. This leads to another stochastic programming problem, because accurate and precise catastrophe prediction techniques are lacking.

For a reinsurer, constructing the cat bond structure and pricing cat bonds also requires data-intensive analytics, because the prediction information about the related catastrophe and the possible economic losses of the policyholder need to be taken into account in designing the cat bond structure.

Before deciding to invest in a specific cat bond, an investor should check the dependencies between the risk behind the cat bond, the investor’s own risk, and the risks of the capital products that the investor already holds. Cat bonds should be used to provide investment diversification and to balance catastrophe risks; they should not aggravate the investor’s losses when a catastrophe occurs. To avoid such a double loss from purchasing the wrong cat bonds, data-intensive analytics is necessary for investors.
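One simple way to quantify such a dependency is the Pearson correlation between historical losses on the cat bond's underlying risk and losses on the investor's existing holdings. The Python sketch below assumes that correlation is an acceptable first measure; this is our illustrative choice and not a method prescribed by the literature reviewed above. Linear correlation captures only linear dependence and may understate joint behavior in extreme events.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

A value close to +1 would indicate that the cat bond tends to lose value in the same periods as the existing portfolio, so it would add little diversification; a value near zero would support the diversification argument.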

Since the analysis above shows that catastrophe prediction information is required by all participants, we propose to design a data-intensive analytics platform that makes catastrophe prediction information as well as analytic tools available to all users (Fig. 3). The idea of the data-intensive analytics platform is to provide users with a common application for catastrophe-related decision support. The catastrophe prediction database is provided by professional catastrophe prediction organizations, which can be a government department or a third-party data analytics company. The database should always incorporate the latest information.

Users obtain decision support by providing the required parameters to the platform. For instance, a company finds out whether it should buy catastrophe insurance by providing its location, its supply chain partners (for calculating indirect economic losses) and its cost parameters. According to the location, the catastrophe predictions for this area are selected from the catastrophe prediction database. With the information on costs and supply chain partners, the economic losses in case of a catastrophe are calculated. By further incorporating the probability and severity of possible catastrophes, a better strategy is selected for the company. An insurer, which may hold a variety of catastrophe risks from policyholders, obtains decision support on which catastrophe risk should be transferred to the capital market. A reinsurer can get a suggested cat bond price based on the evaluation of the policyholder’s catastrophe risk. The platform helps an investor to decide which cat bond is better suited to achieving investment diversification.
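The lookup step for a company can be sketched as follows. The in-memory dictionary stands in for the catastrophe prediction database, and the region names, probabilities, damage ratios and the flat indirect loss per event are all hypothetical placeholders. A real platform would query an external, continuously updated database and use a far richer loss model.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Prediction:
    peril: str
    annual_probability: float  # predicted probability of the event within a year
    damage_ratio: float        # assumed share of asset value destroyed if it occurs


# Hypothetical stand-in for the catastrophe prediction database, keyed by region.
PREDICTION_DB: Dict[str, List[Prediction]] = {
    "region_a": [Prediction("earthquake", 0.01, 0.6), Prediction("flood", 0.05, 0.2)],
    "region_b": [Prediction("wind", 0.10, 0.1)],
}


def expected_annual_loss(region: str, asset_value: float,
                         indirect_loss_per_event: float) -> float:
    """Probability-weighted direct plus indirect loss for one company location."""
    total = 0.0
    for prediction in PREDICTION_DB.get(region, []):
        direct = prediction.damage_ratio * asset_value
        # Indirect losses (e.g. from supply chain partners) are simplified
        # here to one flat amount per event.
        total += prediction.annual_probability * (direct + indirect_loss_per_event)
    return total
```

The resulting figure could then be compared with an insurance premium or fed into a strategy comparison, with the caveat that an expected value alone ignores the company's risk aversion.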

Based on this idea, the latest catastrophe prediction information is saved in a database that provides valuable real-time and historical information to all users. On the one hand, with a common database individual users do not need to keep their own copies, which saves a large amount of data storage space. On the other hand, catastrophe prediction information that is updated in real time helps to reduce the economic losses of platform users in case of a catastrophe. In addition, because the insurer and reinsurer have their own advanced data analytics techniques, they should immediately inform the relevant policyholders once they obtain the latest prediction or a real-time warning about catastrophes. With the latest, more precise and accurate catastrophe prediction, policyholders will be able to reduce economic losses effectively once a catastrophe occurs.

Fig. 3. Data-intensive analytics structure (see Sect. 4.3 for detailed introduction)

5 Conclusion

Cat bonds are attractive to investors for diversification purposes, and catastrophe insurance has a large potential future market. The insurer and reinsurer are able to meet the demand of both policyholders and investors by issuing cat bonds and pricing them at the proper levels. Unlike existing articles about cat bonds and catastrophe insurance, this paper provides an extended structure of cat bonds and analyzes risk from a systematic perspective. Since catastrophe prediction information and data-intensive analytics are required by policyholders, investors, the insurer and the reinsurer, we propose the idea of a common data-intensive analytics platform. The extended cat bond structure and the data-intensive analytics platform are the two main contributions of this paper. A further contribution is that we identify additional factors that affect the pricing of cat bonds and catastrophe insurance. Further research should focus on how to realize such a common data-intensive analytics platform.