1 Introduction

Since the introduction of China’s “Reform and Opening Up” policy, the Chinese government has made great efforts to pursue market-oriented reform and the “go out and bring in” strategy [1]. Inward foreign direct investment (IFDI) has grown explosively, and China is now the second largest IFDI host country in the world (behind the United States). Meanwhile, with rapid economic development, China is gradually transitioning from an FDI recipient to an FDI investor. The total amount of outward foreign direct investment (OFDI) has increased continuously, especially since the Belt and Road Initiative was proposed. There is no doubt that bilateral FDI is an engine and a catalyst for China’s economic development. However, emerging economies like China often use lax environmental regulations, cheap labor, and low energy costs as incentives to attract IFDI. While meeting the urgent need for economic development, IFDI therefore brings a series of environmental problems to the host country, such as CO2 emissions. On the other hand, China’s OFDI also affects the home country’s CO2 emissions, although findings on the relationship between OFDI and CO2 emissions remain inconclusive. Since CO2 emissions must be taken into account for sustainable development in China, an important question is how to accurately predict CO2 emissions while considering bilateral FDI.

Many methods have been applied to CO2 emissions forecasting, including time series analysis, spatial econometrics, and artificial intelligence [2,3,4,5]. These methods usually require a large number of samples to reduce the random interference caused by uncertain factors. In addition, econometric methods require the data to conform to statistical assumptions, such as a normal distribution [6]. However, CO2 emissions data often do not satisfy these assumptions, which limits forecasting capability. Therefore, the main purpose of this paper is to construct a prediction model that works well with small samples and makes no statistical assumptions. One of the grey prediction models, the grey multivariable model GM(1,N), has drawn our attention for CO2 emissions forecasting. Compared with the widely used grey prediction model GM(1,1), the GM(1,N) model takes into account the influence of N-1 relevant factors on the system to improve prediction accuracy.

The GM(1,N) model is a first-order grey multivariable prediction model containing one system behavior variable and N-1 relevant variables. This model can analyze the effect of multiple relevant variables on the system behavior variable. To improve the simulation and prediction performance of the traditional GM(1,N) model, several improved versions have been proposed, such as a grey prediction model with convolution integral (GMC(1,N)) and improved GM(1,N) models with optimal background values [7,8,9]. The existing GM(1,N) models and their improved versions are linear in structure [10]. However, most real systems are non-linear in structure. Therefore, using the linear structure of the GM(1,N) model to describe or predict the behavior of a non-linear system can cause unacceptable modeling error [10].

The traditional Verhulst model is mainly used to describe processes with saturation and is commonly used in population prediction, biological growth forecasting, product economic life prediction, and so on [11]. Such a process grows slowly in the initial stage, then the growth rate increases quickly, and finally growth slows until it ceases, so the data follow an S-shaped curve [12]. A grey prediction model is built on the weakened randomness of the accumulated generating sequence and on the rules by which the system changes. Because the model is constructed on the basis of a qualitative analysis, it generally achieves high simulation accuracy and prediction precision and gives good results in application. The grey Verhulst model, GVM(1,1), is built on the accumulated generation of the raw data, which extends the applicable range of the traditional Verhulst model to data of approximately single-peak type. Therefore, the GVM(1,1) model has recently been widely applied in different fields [13,14,15].

Building on the GVM(1,1) model, this paper proposes a grey multivariable Verhulst model, GVM(1,N), with a view to providing an effective quantitative method for forecasting Chinese CO2 emissions that accounts for the non-linear effect of bilateral FDI. In addition, a residual modification model built on the residuals obtained from the GVM(1,N) model can improve prediction accuracy. The proposed improved prediction model is a two-stage procedure: the first stage uses the GVM(1,N) model to generate the predicted values, and the second stage uses either the traditional GM(1,1) model or a Fourier series to modify the residual error. To verify the validity of the proposed prediction model, the hybrid prediction models are compared with the original GVM(1,N) model in terms of prediction performance using the mean absolute percentage error (MAPE).

The rest of this paper is organized as follows. Section 2 reviews the literature related to this research. Section 3 introduces the GVM(1,N) model and the proposed grey residual modification model, which combines GVM(1,N) with residual modification using GM(1,1) and a Fourier series. The empirical analysis of CO2 emissions in China is presented in Sect. 4. Finally, Sect. 5 discusses the results and presents the conclusion.

2 Literature Review

2.1 Forecasting China’s CO2 Emissions

As China is a developing country, its energy consumption, especially fossil energy consumption, is constantly growing with accelerating industrialization and urbanization. The future trend in CO2 emissions is therefore a concern, and many scholars have forecast Chinese CO2 emissions from different perspectives.

For the prediction of China’s future CO2 emissions, the most widely used model is the IPAT model, which is also known as the Kaya model. Du, Wang [16] improved the IPAT model and used it to predict and analyze China’s per capita carbon emissions under three assumed scenarios up to 2050.

In addition, many scholars have used other methods to forecast Chinese carbon emissions. Zhou, Fridley [17] evaluated the efficiency of Chinese energy consumption and concluded that Chinese carbon emissions would reach a peak in 2030. Gambhir, Schulz [18] used a hybrid modeling method to forecast Chinese carbon emissions in 2050. Liu, Mao [19] forecast gross carbon dioxide emissions and their intensity in China from 2013 to 2020 using a system dynamics simulation.

The studies above show that the traditional EKC method and other forecasting methods have been widely used. Because economic growth has a prominent non-linear impact on carbon emissions, a prediction based on the non-linear relationship between carbon emissions and economic growth would not only have theoretical support but could also yield more direct policy suggestions for economic growth and environmental quality. In fact, few scholars have made such an attempt to date.

2.2 Grey Forecasting Method

To solve the problems of analysis, modeling, prediction, and control of uncertain systems, Deng [20] proposed grey system theory. Because the theory has performed well in practice, it has been recognized by many scholars at home and abroad in recent years, and its application fields have been extended from control science to many others, such as industry, agriculture, energy, economy, and management [21, 22].

Deng [20] first proposed the multivariable grey model GM(1,N), which was used in planning the coordinated development of the economy, technology, and society of a city in Hubei Province. The GM(1,N) is a first-order multivariable grey model that contains one system behavior variable and N-1 influencing factor variables; it can analyze the effect of multiple influencing factor variables on the behavior of the system. When the trends of the influencing factor variables are known, the behavior variable of the system can also be predicted. Liu and Lin (2006) gave approximate whitening time response functions of the GM(1,N). Tien [23] showed that the approximate whitening time response function of the GM(1,N) can at times lead to unacceptable experimental error, so the time response function is not always precise and the accuracy of the model is limited. Tien [24] added a control parameter to the grey differential equation of the traditional GM(1,N) and used the convolution integral technique to solve the whitening differential equation; the improved model was named GMC(1,N). Hsu [25] used a genetic algorithm to optimize the interpolation coefficient of the background value of the GM(1,N) and applied the optimized model to predict the output value of the integrated circuit industry in Taiwan, obtaining better forecasting performance. Pei, Chen [8] applied the GA-based GM(1,N) model to forecast the input-output system of the Chinese high-tech industry.

3 Methodologies

In this section, the algorithm of the grey multivariable Verhulst model is introduced and the meanings of the main parameters are briefly analyzed to gain a better understanding of the relationship between the system behavior variable and the relevant variables. Then, the improved prediction model is proposed by combining the grey multivariable Verhulst model with a residual modification model, and its modeling procedure is demonstrated step by step.

3.1 Grey Multivariable Verhulst Model

Because the existing grey multivariable model cannot describe the non-linear relationship between CO2 emissions and bilateral FDI, this paper introduces the Verhulst model into the most widely used grey multivariable model, GM(1,N), to describe the non-linear effect that the relevant variables exert on the system behavior variable, and thereby constructs the grey multivariable Verhulst model (GVM(1,N)).

Assume that \( X_{1}^{(0)} = (x_{1}^{(0)} (1),\;x_{1}^{(0)} (2), \ldots ,\;x_{1}^{(0)} (n)) \) is the original data sequence of a system characteristic (or dependent variable), and that \( X_{i}^{(0)} = (x_{i}^{(0)} (1),\;x_{i}^{(0)} (2), \ldots ,\;x_{i}^{(0)} (n)) \), where \( i = 2,\;3, \ldots ,\;N \), are the relevant variable sequences (or independent variable sequences), which have a certain relationship with the sequence \( X_{1}^{(0)} \).

Then the new sequence \( X_{i}^{(1)} = (x_{i}^{(1)} (1),\;x_{i}^{(1)} (2), \ldots ,\;x_{i}^{(1)} (n)) \) can be generated from \( X_{i}^{(0)} \) by the first-order accumulated generating operation (1-AGO) as follows:

$$ x_{i}^{(1)} (k) = \sum\limits_{j = 1}^{k} {x_{i}^{(0)} (j)} ,\;k = 1,\;2, \ldots ,\;n $$
(1)

The background value, \( z_{1}^{(1)} (k) \), is the adjacent neighbor mean generated sequence of \( X_{1}^{(1)} \):

$$ z_{1}^{(1)} (k) = 0.5 \times \left( {x_{1}^{(1)} (k) + x_{1}^{(1)} (k - 1)} \right),\;k = 2,\;3, \ldots ,\;n $$
(2)
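As a minimal sketch of Eqs. (1) and (2), the following Python snippet computes the 1-AGO sequence and the background values for a short made-up series. The function names `ago` and `background_values` and the sample numbers are illustrative assumptions and are not taken from the study's data.

```python
from itertools import accumulate


def ago(x0):
    """First-order accumulated generating operation (1-AGO), Eq. (1)."""
    return list(accumulate(x0))


def background_values(x1):
    """Adjacent-neighbor means of the 1-AGO sequence for k = 2..n, Eq. (2)."""
    return [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]


# Made-up illustrative series (not the CO2 data used in the paper).
x0 = [2.0, 3.0, 5.0, 4.0]
x1 = ago(x0)                # [2.0, 5.0, 10.0, 14.0]
z1 = background_values(x1)  # [3.5, 7.5, 12.0]
print(x1, z1)
```

Note that the background sequence has one element fewer than the raw series, because Eq. (2) needs both \( x_{1}^{(1)} (k) \) and \( x_{1}^{(1)} (k - 1) \).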

Then,

$$ x_{1}^{(0)} (k) + az_{1}^{(1)} (k) = \sum\limits_{i = 2}^{N} {b_{i} \left( {x_{i}^{(1)} (k)} \right)^{2} } $$
(3)

is called a grey multivariable Verhulst model, abbreviated as GVM(1,N), where \( a \) is called the development coefficient of the system, \( b_{i} \left( {x_{i}^{(1)} (k)} \right)^{2} \) the driving term, and \( b_{i} \) the driving coefficient. The corresponding whitening differential equation is written as follows:

$$ \frac{{dx_{1}^{(1)} }}{dt} + ax_{1}^{(1)} = \sum\limits_{i = 2}^{N} {b_{i} \left( {x_{i}^{(1)} (k)} \right)^{2} } $$
(4)

In turn, \( a \) and \( b_{i} \) can be obtained by using the ordinary least squares (OLS) method:

$$ \left[ {a,\;b_{2} ,\; \ldots ,\;b_{N} } \right]^{T} = (B^{T} B)^{ - 1} B^{T} y $$
(5)
$$ B = \left[ {\begin{array}{*{20}c} { - z_{1}^{(1)} (2)} & {\left( {x_{2}^{(1)} (2)} \right)^{2} } & \ldots & {\left( {x_{N}^{(1)} (2)} \right)^{2} } \\ { - z_{1}^{(1)} (3)} & {\left( {x_{2}^{(1)} (3)} \right)^{2} } & \ldots & {\left( {x_{N}^{(1)} (3)} \right)^{2} } \\ \vdots & \vdots & \ddots & \vdots \\ { - z_{1}^{(1)} (n)} & {\left( {x_{2}^{(1)} (n)} \right)^{2} } & \ldots & {\left( {x_{N}^{(1)} (n)} \right)^{2} } \\ \end{array} } \right] $$
(6)

and

$$ y = \left[ {x_{1}^{(0)} (2),\;x_{1}^{(0)} (3),\; \ldots ,\;x_{1}^{(0)} (n)} \right]^{T} $$
(7)
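The estimation in Eqs. (5) to (7) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `estimate_gvm_parameters` is hypothetical, the data are synthetic and generated so that Eq. (3) holds exactly, and `numpy.linalg.lstsq` is used in place of the explicit inverse \( (B^{T} B)^{ - 1} B^{T} y \), which gives the same solution when \( B \) has full column rank.

```python
import numpy as np


def estimate_gvm_parameters(x1_0, relevant_0):
    """Least-squares estimate of a and b_2..b_N in Eq. (3).

    x1_0       : raw system behavior sequence, length n
    relevant_0 : raw relevant sequences, shape (N-1, n)
    """
    x1_0 = np.asarray(x1_0, dtype=float)
    relevant_0 = np.asarray(relevant_0, dtype=float)
    x1_1 = np.cumsum(x1_0)                  # 1-AGO of the behavior sequence
    rel_1 = np.cumsum(relevant_0, axis=1)   # 1-AGO of each relevant sequence
    z1 = 0.5 * (x1_1[1:] + x1_1[:-1])       # background values, k = 2..n
    B = np.column_stack([-z1, (rel_1[:, 1:] ** 2).T])  # Eq. (6)
    y = x1_0[1:]                                        # Eq. (7)
    params, *_ = np.linalg.lstsq(B, y, rcond=None)      # solves Eq. (5)
    return params[0], params[1:]


# Synthetic check: build a series that satisfies Eq. (3) exactly for
# assumed parameters, then confirm that the estimator recovers them.
a_true, b_true = 0.3, np.array([0.02, -0.01])
rel = np.array([[1.0, 2.0, 1.5, 2.5, 2.0, 3.0],
                [0.5, 1.0, 2.0, 1.0, 1.5, 0.5]])
rel_1 = np.cumsum(rel, axis=1)
x0, x1_prev = [1.0], 1.0
for k in range(1, rel.shape[1]):
    rhs = float(b_true @ rel_1[:, k] ** 2)
    # Solve x0(k) + a * 0.5 * (x1(k-1) + x0(k) + x1(k-1)) = rhs for x0(k).
    xk = (rhs - a_true * x1_prev) / (1 + 0.5 * a_true)
    x0.append(xk)
    x1_prev += xk

a_hat, b_hat = estimate_gvm_parameters(x0, rel)
print(a_hat, b_hat)
```

With real data the fit is not exact, and the squared accumulated terms can differ by many orders of magnitude across variables, so rescaling the inputs before estimation may be advisable.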

The approximate time response sequence of the GVM(1,N) model is given by

$$ \hat{x}_{1}^{(1)} (k) = \left[ {x_{1}^{(0)} (1) - \frac{1}{a}\sum\limits_{i = 2}^{N} {b_{i} } \left( {x_{i}^{(1)} (k)} \right)^{2} } \right]e^{ - a(k - 1)} + \frac{1}{a}\sum\limits_{i = 2}^{N} {b_{i} } \left( {x_{i}^{(1)} (k)} \right)^{2} $$
(8)

Finally, using the inverse accumulated generating operation (IAGO), the predicted value \( \hat{x}_{1}^{(0)} (k) \) is

$$ \hat{x}_{1}^{(0)} (k) = \hat{x}_{1}^{(1)} (k) - \hat{x}_{1}^{(1)} (k - 1),\;k = 2,\;3, \ldots ,\;n $$
(9)

Note that \( \hat{x}_{1}^{(1)} (1) = x_{1}^{(0)} (1) \) holds.
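The time response in Eq. (8) and the IAGO step in Eq. (9) can be sketched as below. The function name `gvm_predict` and the numbers in the usage lines are assumptions chosen only to make the arithmetic easy to follow. List indices are 0-based, so index `k` in the code corresponds to \( k + 1 \) in the equations.

```python
import math
from itertools import accumulate


def gvm_predict(x1_first, relevant_0, a, b):
    """Apply Eq. (8) and then Eq. (9).

    x1_first   : first raw value of the behavior sequence, x_1^(0)(1)
    relevant_0 : list of N-1 raw relevant sequences of equal length
    a          : development coefficient (assumed non-zero)
    b          : list of driving coefficients b_2..b_N
    """
    rel_1 = [list(accumulate(seq)) for seq in relevant_0]  # 1-AGO
    n = len(rel_1[0])
    x1_hat = []
    for k in range(n):
        # Driving term divided by a, evaluated at the current step.
        drive = sum(bi * seq[k] ** 2 for bi, seq in zip(b, rel_1)) / a
        x1_hat.append((x1_first - drive) * math.exp(-a * k) + drive)
    x1_hat[0] = x1_first  # the first accumulated value equals the first raw value
    # IAGO: difference consecutive accumulated predictions, Eq. (9).
    return [x1_hat[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, n)]


# Made-up example with one relevant variable, a = 1 and b_2 = 1.
out = gvm_predict(1.0, [[1.0, 1.0]], 1.0, [1.0])
print(out)  # second value equals 3 - 3/e
```

In this example the accumulated relevant sequence is (1, 2), so the driving term at the second step is 4 and the accumulated prediction is \( 4 - 3e^{ - 1} \); subtracting the first value gives \( 3 - 3e^{ - 1} \).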

3.2 The Improved Grey Multivariable Verhulst Model

The improved grey multivariable Verhulst model uses the GVM(1,N) model to generate the predicted values, after which either the GM(1,1) model or a Fourier series is used to correct the residuals generated by GVM(1,N); the resulting models are abbreviated as RGVM(1,N) and FGVM(1,N), respectively. The construction of the improved model can be described as follows:

Step 1: Establish a GVM(1,N) model for \( X_{1}^{(0)} \) and generate the predicted values of GVM(1,N), \( \hat{x}_{1}^{(0)} (k) \).

Step 2: Generate the sequence of residual values \( \varepsilon^{(0)} = (\varepsilon_{2}^{(0)} ,\;\varepsilon_{3}^{(0)} , \ldots ,\;\varepsilon_{n}^{(0)} ) \) based on the following equation.

$$ \varepsilon_{k}^{(0)} = x_{1}^{(0)} (k) - \hat{x}_{1}^{(0)} (k),\;k = 2,\;3, \ldots ,\;n $$
(10)

Step 3: Generate the predicted residuals \( \hat{\varepsilon }_{k}^{(0)} \) from a residual model fitted to \( \varepsilon_{k}^{(0)} \), using either the GM(1,1) model (see Appendix A) or a Fourier series (see Appendix B).

Step 4: The predicted value of RGVM(1,N) or FGVM(1,N), \( x_{1}^{'(0)} (k) \), is calculated as

$$ x_{1}^{'(0)} (k) = \hat{x}_{1}^{(0)} (k) + \hat{\varepsilon }_{k}^{(0)} ,\;k = 2,\;3, \ldots ,\;n $$
(11)
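Steps 2 to 4 can be sketched for the GM(1,1) variant as follows. Appendix A is not reproduced here, so the sketch assumes the textbook GM(1,1) formulation (grey equation \( x^{(0)} (k) + az^{(1)} (k) = b \) fitted by least squares), which may differ in detail from the authors' residual model. The names `gm11_fit` and `residual_corrected` and all numbers are hypothetical, and the residuals are assumed to share one sign and to give a non-zero development coefficient; handling of mixed-sign residuals is not shown.

```python
import math


def gm11_fit(seq, horizon=0):
    """Fit a textbook GM(1,1) to seq; return fitted values plus forecasts."""
    seq = list(seq)
    n = len(seq)
    x1 = [sum(seq[:k + 1]) for k in range(n)]               # 1-AGO
    u = [-0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]   # negated background values
    y = seq[1:]
    m = len(u)
    su, sy = sum(u), sum(y)
    suu = sum(v * v for v in u)
    suy = sum(v * w for v, w in zip(u, y))
    det = m * suu - su * su
    a = (m * suy - su * sy) / det      # development coefficient (assumed non-zero)
    b = (suu * sy - su * suy) / det    # grey input
    x1_hat = [(seq[0] - b / a) * math.exp(-a * k) + b / a
              for k in range(n + horizon)]
    return [seq[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, n + horizon)]


def residual_corrected(actual, gvm_pred):
    """Steps 2-4: model the residuals for k = 2..n and add them back."""
    eps = [x - p for x, p in zip(actual[1:], gvm_pred[1:])]    # Eq. (10)
    eps_hat = gm11_fit(eps)                                    # Step 3
    return [gvm_pred[0]] + [p + e for p, e in zip(gvm_pred[1:], eps_hat)]  # Eq. (11)


# Made-up numbers: the residuals grow by about 10% per step.
gvm_pred = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
actual = [5.0, 7.0, 8.1, 9.21, 10.331, 11.4641]
corrected = residual_corrected(actual, gvm_pred)
print(corrected)
```

Because the made-up residuals are nearly geometric, GM(1,1) tracks them closely and the corrected values lie close to the actual ones; real residual series are typically far less regular.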

Figure 1 shows the procedure of the proposed residual modification model.

Fig. 1. Procedure of the improved GVM(1,N) model

3.3 Evaluating Prediction Accuracy

To compare the forecasting ability of the proposed models with that of other models, the mean absolute percentage error (MAPE) was employed to measure prediction performance. MAPE with respect to \( x_{1}^{(0)} \) is

$$ MAPE = \frac{1}{n}\sum\limits_{k = 1}^{n} {\frac{{\left| {x_{1}^{(0)} (k) - x_{1}^{'(0)} (k)} \right|}}{{x_{1}^{(0)} (k)}}} \times 100\% $$
(17)

Lewis [26] proposed MAPE criteria for evaluating a forecasting model, where MAPE ≤ 10%, 10% < MAPE ≤ 20%, 20% < MAPE ≤ 50%, and MAPE > 50% correspond to high, good, reasonable, and weak forecasting models, respectively.
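A small sketch of the MAPE calculation and the grading thresholds quoted above is given below. The function names `mape` and `lewis_grade` are illustrative, the input numbers are made up, and the absolute value in the denominator is an added safeguard that makes no difference for positive series such as CO2 emissions.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    terms = [abs(x - p) / abs(x) for x, p in zip(actual, predicted)]
    return 100.0 * sum(terms) / len(terms)


def lewis_grade(mape_percent):
    """Map a MAPE value (in percent) to the grades attributed to Lewis [26]."""
    if mape_percent <= 10:
        return "high"
    if mape_percent <= 20:
        return "good"
    if mape_percent <= 50:
        return "reasonable"
    return "weak"


# Made-up values: each prediction is off by 10% of the actual value.
score = mape([100.0, 200.0], [90.0, 220.0])
print(score, lewis_grade(score))
```

The grading function takes the value in percent, so a result such as 18.07 falls into the "good" band and 3.21 into the "high" band.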

4 Empirical Study

4.1 Variables and Data Collection

To predict CO2 emissions in China considering bilateral FDI using the proposed RGVM(1,N) model, China’s IFDI and OFDI are taken as relevant variables. Additionally, in accordance with previous studies, GDP is selected as another relevant variable [1]. Table 1 shows the proxies used for the relevant variables and their units.

Table 1. Relevant variables of CO2 emissions in China

All data were collected from the World Bank Development Indicators for 2001 to 2014 and are shown in Table 2. The data from 2001 to 2012 were used for model-fitting, and the data for 2013–2014 were reserved for the ex post test.

Table 2. Data of CO2 and relevant variables from 2001 to 2014

4.2 Empirical Results

Following the calculation procedures, the estimated values of the development and driving coefficients are \( a = - 0.28 \), \( b_{2} = 2.47 \times 10^{ - 8} \), \( b_{3} = - 1.76 \times 10^{ - 5} \), and \( b_{4} = - 1.51 \times 10^{ - 4} \). As explained in Ding, Dang [27], each driving coefficient \( b_{i} \) has its own practical meaning. Because \( b_{2} > 0 > b_{3} > b_{4} \), it can be inferred that GDP contributed to the increase in CO2 emissions in China. Parameters \( b_{3} \) and \( b_{4} \) indicate that increasing bilateral FDI helps to reduce CO2 emissions. To demonstrate the efficacy and practicability of the proposed models, the original GVM(1,N), RGVM(1,N), and FGVM(1,N) models are compared, and the predicted results and MAPE for CO2 emissions in China are summarized in Table 3.

Table 3. Prediction precision obtained by different forecasting models for CO2 emissions

Table 3 shows that the MAPE values of the GVM(1,N), RGVM(1,N), and FGVM(1,N) models for model-fitting were 18.07%, 21.82%, and 4.09%, respectively. For ex post testing, the MAPE values were 10.03%, 3.21%, and 23.05%, respectively. On these figures, the FGVM(1,N) model gives the best fit in the model-fitting period, while the RGVM(1,N) model is superior to the GVM(1,N) and FGVM(1,N) models in ex post testing.

5 Discussions and Conclusion

This paper explores the relationship between CO2 emissions in China and GDP, IFDI, and OFDI based on the proposed grey multivariable Verhulst model combined with a residual modification model using GM(1,1) and a Fourier series.

The comparison of prediction results across models shows that the proposed RGVM(1,N) and FGVM(1,N) models perform well in predicting China’s CO2 emissions, which supports the applicability of the improved GVM(1,N) model. Regarding the relationship between CO2 emissions and GDP, IFDI, and OFDI, the estimated driving coefficients indicate that rapid economic growth results in larger CO2 emissions, whereas enhancing environmental protection technology via the spillover effect of IFDI and transferring highly polluting industries abroad via OFDI could reduce CO2 emissions.

In summary, the improved grey multivariable Verhulst model is an effective and practical model for analyzing and predicting CO2 emissions in Mainland China. The limitations of the proposed model deserve further attention in future research; for example, large errors may appear when the data fluctuate. Moreover, with regard to the residual modification model, our future research will concentrate on other approaches, such as Markov chains and neural networks.