# Nonparametric estimation for compound Poisson process via variational analysis on measures


## Abstract

The paper develops new methods of nonparametric estimation of a compound Poisson process. Our key estimator for the compounding (jump) measure is based on a series decomposition of functionals of a measure and relies on the steepest descent technique. Our simulation studies for various examples of such measures demonstrate the flexibility of our methods. They are particularly suited for discrete jump distributions, not necessarily concentrated on a grid nor on the positive or negative semi-axis. Our estimators are also applicable to continuous jump distributions with an additional smoothing step.

## Keywords

Compound Poisson distribution · Decompounding · Measure optimisation · Gradient methods · Steepest descent algorithms

## Mathematics Subject Classification

Primary: 62G05; Secondary: 62M05, 65C60

## 1 Introduction

*compounding* measure \(\varLambda \) defined on the real line \(\mathbb {R}=(-\infty ,+\infty )\) such that

The estimation of \(\varLambda \) from observations of the process is known as *decompounding*, which is the main object of study in this paper.

We propose a combination of two nonparametric methods which we call characteristic function fitting (ChF) and convolution fitting (CoF). ChF may deal with a more general class of Lévy processes, while CoF explicitly targets the compound Poisson processes.

Truncating the series at *k* terms, we build a loss function \(L_{\mathrm {CoF}}^{(k)}\) by comparing two estimates of \(F^{*2}\): the one based on the truncated series and the other being the empirical convolution \(F^{*2}_{n}\). CoF is able to produce nearly optimal estimates \(\hat{\varLambda } _{k}\) when large values of *k* are taken, but at the expense of a drastically increased computation time.

A practical combination of these methods recommended by this paper is to find \(\hat{\varLambda } _{k}\) using CoF with a low value of *k* and then apply ChF with \(\hat{\varLambda } _{k}\) as the starting value. The estimate for such a two-step procedure will be denoted by \(\tilde{\varLambda }_{k}\) in the sequel.

Previously developed methods include a discrete decompounding approach based on the inversion of Panjer recursions, as proposed in Buchmann and Grübel (2003). van Es et al. (2007) and, more recently, Duval (2013) and Comte et al. (2014) studied the continuous decompounding problem when the measure \(\varLambda \) is assumed to have a density. They apply Fourier inversion in combination with kernel smoothing techniques for estimating the unknown density of the measure \(\varLambda \). In contrast, we do not distinguish between discrete and continuous \(\varLambda \) in that our algorithms, based on direct optimisation of functionals of a measure, work for both situations on a discretised phase space of \(\varLambda \). However, if one sees many small atoms appearing in the solution, filling a thin grid, this may indicate that the true measure is absolutely continuous and some kind of smoothing should yield its density.

*F* and the empirical distribution function \(\hat{F}_{n}(x) = \frac{1}{n} \sum _{k = 1}^{n} {{\mathrm{{1I}}}}_ {\{X_{k} \le x\}}\), where the dependence on \(\varLambda \) comes through *F* via the inversion formula of the characteristic function:

*Parametric* inference procedures based on the empirical characteristic function have been known for some time, see Feuerverger and McDunnough (1981a) and Sueishi and Nishiyama (2005), and the references therein. Algorithms based on the inversion of the empirical characteristic function and on the relation between its derivatives were proposed in Watteel and Kulperger (2003). Note that the inversion of the empirical characteristic function, in contrast to the inversion of its theoretical counterpart, generally leads to a complex valued measure which needs to be dealt with.

One of the reviewers has drawn our attention to the recent preprint Coca (2015) which promises to be useful for testing the presence of discrete and/or continuous jump distribution components as well as for obtaining approximation accuracy bounds based on the central limit theorem. Practical implementations of these theoretical results are yet to be explored.

The rest of the paper has the following structure. Section 2 introduces the theoretical basis of our approach: a constrained optimisation technique in the space of measures. Section 3 provides an algorithmic implementation of the corresponding steepest descent method in the R language. Section 4 develops the necessary ingredients for the CoF method based on the main analytical result of the paper, Theorem 2. Section 5 contains a broad range of simulation results illustrating the performance of our algorithms on simulated data with various compounding measures, both discrete and continuous. Section 6 presents an application of our approach to real currency exchange data. Section 7 summarises our approach and gives some practical recommendations. We conclude with an Appendix containing the proofs and explicit formulas for the gradients of the two loss functions used in our steepest descent algorithm.

## 2 Optimisation of functionals of a measure

In this section, we briefly present the main ingredients of the constrained optimisation of functionals of a measure. Theorem 1 gives necessary conditions for a local minimum of a strongly differentiable functional. This theorem justifies a major step in our optimisation algorithm described in Sect. 3. Further details of the underlying theory can be found in Molchanov and Zuyev (2000a, b).

Denote by \(\mathbb {M}\) and \(\mathbb {M}_+\) the class of signed and, respectively, non-negative measures with a finite total variation. The set \(\mathbb {M}\) then becomes a Banach space with sum and multiplication by real numbers defined set-wise: \((\eta _1+\eta _2)(B):=\eta _1(B)+\eta _2(B)\) and \((t\eta )(B):=t\eta (B)\) for any Borel set *B* and any real *t*. The set \(\mathbb {M}_+\) is a *pointed cone* in \(\mathbb {M}\) meaning that the zero measure is in \(\mathbb {M}_+\), \(\mu _1+\mu _2\in \mathbb {M}_+\), and \(t\mu \in \mathbb {M}_+\) as long as \(\mu _1,\mu _2,\mu \in \mathbb {M}_+\) and \(t\ge 0\).
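These setwise operations translate directly to code once a measure is discretised to a finite grid of atoms. The following sketch is an illustration only (the paper's implementation is in R, and the grid and atom positions here are made-up choices): an element of \(\mathbb {M}\) is represented by a vector of signed weights, so that addition, scaling and the total variation norm become vector operations.

```python
import numpy as np

# A (signed) measure discretised to a fixed grid: eta(B) = sum of weights
# sitting at the grid points inside B.
grid = np.arange(-2.0, 5.25, 0.25)           # atom positions x_1, ..., x_l

def measure(atoms):
    """Weight vector on `grid` built from a dict {position: mass}."""
    w = np.zeros_like(grid)
    for x, m in atoms.items():
        w[np.argmin(np.abs(grid - x))] += m
    return w

eta1 = measure({1.0: 0.5, 2.0: 0.5})
eta2 = measure({-1.0: 0.2})

# Setwise sum and scalar multiple correspond to vector operations:
sum_measure = eta1 + eta2                    # (eta1 + eta2)(B)
scaled = 3.0 * eta1                          # (t * eta1)(B) with t = 3

# Total variation norm of a signed measure = sum of absolute atom weights.
tv_norm = np.abs(sum_measure).sum()
print(tv_norm)                               # 0.5 + 0.5 + 0.2 = 1.2
```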

A functional \(G:\ \mathbb {M}\mapsto \mathbb {R}\) is called *Fréchet* (or *strongly*) *differentiable* at \(\eta \in \mathbb {M}\) if there exists a bounded linear operator (a *differential*) \(DG(\eta )[\cdot ]:\ \mathbb {M}\mapsto \mathbb {R}\) such that \(G(\eta +\nu )=G(\eta )+DG(\eta )[\nu ]+o(\Vert \nu \Vert )\) as \(\Vert \nu \Vert \rightarrow 0\).

When the differential admits an integral representation \(DG(\eta )[\nu ]=\int \nabla G(x;\eta )\,\nu (dx)\), the function \(\nabla G(x;\eta )\) is called a *gradient function* for *G* at \(\eta \). Typically in applications, and it is indeed the case for the functionals of a measure considered in this paper, the gradient functions exist, so that the differentials indeed have an integral form.

The functional *G* in this example is strongly differentiable if both functions \(u'\) and *f* are bounded.

### Theorem 1

Let \(\varLambda \) be a point of local minimum of a functional *L* over the cone \(\mathbb {L}=\{\eta \in \mathbb {M}_+:\ \eta (\{0\})=0\}\), and suppose *L* possesses a gradient function \(\nabla L(x;\varLambda )\) at this \(\varLambda \). Then \(\nabla L(x;\varLambda )\ge 0\) for all \(x\ne 0\), and \(\nabla L(x;\varLambda )=0\) for \(\varLambda \)-almost all *x*.

## 3 Steepest descent algorithm on the cone of positive measures

There are many algorithms realising *parametric* optimisation over a finite number of continuous variables, but optimisation algorithms over the cone of measures were proposed only recently in Molchanov and Zuyev (2002) for the case of measures with a fixed total mass. The variational analysis of functionals of a measure outlined in the previous section allows us to develop a steepest descent type algorithm for the minimisation of functionals of a compounding measure, which we describe next. This algorithm was used to obtain the simulation results presented in Sect. 5.

The larger is *l* and the finer is the grid \(\{x_{1}, \ldots , x_{l}\}\), the better is the approximation, however at a higher computational cost.
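A minimal sketch of a steepest descent of this kind on the cone of non-negative grid weights, using a hypothetical quadratic loss in place of the paper's loss functions (the matrix `A`, the vector `b`, the "true" weights and the step rule are illustrative assumptions, not the authors' algorithm):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical quadratic loss L(w) = ||A w - b||^2 over non-negative atom
# weights w on a grid of l points (a stand-in sharing the structure of the
# paper's loss functions).
l = 20
A = rng.standard_normal((30, l))
w_true = np.zeros(l)
w_true[[3, 7]] = [0.6, 0.4]              # "true" measure: two atoms on the grid
b = A @ w_true

def grad(w):
    """Gradient function of L evaluated at every grid atom."""
    return 2.0 * A.T @ (A @ w - b)

# Safe step size: reciprocal of the Lipschitz constant of the gradient.
step = 1.0 / (2.0 * np.linalg.eigvalsh(A.T @ A)[-1])

w = np.full(l, 1.0 / l)                  # start from the uniform measure
for _ in range(50_000):
    # Move mass against the gradient, then project back onto the cone of
    # non-negative measures.
    w = np.maximum(w - step * grad(w), 0.0)

print(np.abs(w - w_true).max())          # the two-atom measure is recovered
```

The end state mirrors the optimality conditions of Theorem 1: the gradient function (nearly) vanishes on the support of the minimising measure and is non-negative elsewhere.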


## 4 Description of the CoF method

As alluded to in the Introduction, the CoF method uses a representation of the convolution as a function of the compounding measure \(\varLambda \). We now formulate the main theoretical result of the paper, on which the CoF method is based. The proof is given in the Appendix.

For a distribution function *F*, denote \(U_{x}F(y)=F(y-x)-F(x)\) and

where the sum is over all subsets *J* of \(\{1, 2, \ldots , n\}\), including the empty set. Denote \(\Gamma _0(F,\varLambda ,y)=F(y)\), and

### Theorem 2

*h*. Then, for each real *y*, one has

In practice, one retains only the first *k* terms in (12) for computational reasons. The error introduced by the truncation can be accurately estimated by bounding the remainder term in the finite expansion formula (16) in the proof. Alternatively, turning to (10) and using \(0\le F(y)\le 1\), we obtain

The truncation error can therefore be reduced by decreasing *h* and/or increasing *k*. For instance, for the horse kick data considered in the Introduction, we have \(h=1\) and the estimated value of \(\Vert \varLambda \Vert \) is 0.61, giving the values \(R_k(0.61)=0.58, 0.21, 0.06\) for \(k=1,2,3\). This indicates that \(k=3\) is a rather adequate cut-off for these data.

If the expected number of jumps, \(h\Vert \varLambda \Vert \), in the time interval [0, *h*] is large, the sample values \(X_i\) would, in the case of finite variance, have an approximately normal distribution. Since the normal distribution is determined by the first two moments only and not by the entire compounding distribution, an effective estimation of \(\varLambda /\Vert \varLambda \Vert \) is hardly possible, see Duval (2014) for a related discussion. Indeed, to get the upper bound close to 0.2 given \(h\Vert \varLambda \Vert =8\), one would need to take \(k=41\), which is hardly computationally feasible.
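The normal-approximation effect can be made quantitative through cumulants: the *k*-th cumulant of a compound Poisson increment is \(h\int x^k\,\varLambda (dx)\), so the skewness decays like \((h\Vert \varLambda \Vert )^{-1/2}\). A quick check using the standard cumulant formulas (the helper function is ours, not the paper's):

```python
# Cumulants of a compound Poisson increment over [0, h] with Lambda = lam * P:
# kappa_k = h * lam * m_k, where m_k is the k-th moment of the jump law P.
# Hence skewness = kappa_3 / kappa_2^{3/2} shrinks like (h * lam)^{-1/2},
# which is why large h * ||Lambda|| pushes the increments toward normality.
def skewness(h, lam, m2, m3):
    """Skewness of an increment; m2, m3 are jump-size moments."""
    kappa2 = h * lam * m2
    kappa3 = h * lam * m3
    return kappa3 / kappa2 ** 1.5

# Unit jumps (P = delta_1, so m2 = m3 = 1): standard Poisson increments.
print(skewness(1.0, 1.0, 1.0, 1.0))   # 1.0
print(skewness(1.0, 8.0, 1.0, 1.0))   # 8^{-1/2} ~ 0.354: much closer to normal
```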

To summarise, if one has control over the choice of *h*, it should be taken so that the estimated value of \(h\Vert \varLambda \Vert \) is close to 1. For large values of this parameter, the central limit theorem prevents an effective estimation of \(\varLambda \), while small values would result in almost exclusively single jumps, with the optimisation procedure returning essentially the sample measure as a solution. Similarly to the choice of a kernel estimator or of the histogram's bin width, a compromise should be sought. A practical approach is to try various values of *h*, as we demonstrate below in Sect. 6 on the real FX data.

## 5 Simulation results

To illustrate the performance of our estimation methods, we generated samples of size \(n=1000\) for compound Poisson processes driven by various kinds of measure \(\varLambda \). In Sects. 5.1, 5.2 and 5.3, we consider examples of discrete jump size distributions. Note that lattice distributions with both positive and negative jumps are particularly challenging because of possible cancellations of jumps, a case barely considered in the literature so far.
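The sampling step behind these experiments is the standard construction of compound Poisson increments; a sketch (the function name and interface are ours, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(1)

def compound_poisson_sample(n, h, atoms, weights):
    """n i.i.d. increments X_i of a compound Poisson process observed at lag h,
    with compounding measure Lambda = sum_j weights[j] * delta_{atoms[j]}."""
    lam = float(np.sum(weights))              # total rate ||Lambda||
    counts = rng.poisson(lam * h, size=n)     # number of jumps in each increment
    probs = np.asarray(weights) / lam         # jump size law Lambda / ||Lambda||
    return np.array([rng.choice(atoms, size=c, p=probs).sum() for c in counts])

# The example of Sect. 5.2: Lambda = 0.2 delta_{-1} + 0.2 delta_1 + 0.6 delta_2
X = compound_poisson_sample(1000, 1.0, [-1.0, 1.0, 2.0], [0.2, 0.2, 0.6])

# Mean of one increment is h * int x Lambda(dx) = -0.2 + 0.2 + 1.2 = 1.2
print(X.mean())
```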

In Sects. 5.4 and 5.5, we present simulation results for two cases of continuously distributed jumps: non-negative and general. The continuous measures are replaced in the simulations by their discretised versions given by (8). The grid size in these examples was \(\varDelta =0.25\). Note that no special account is taken of the fact that the measure is continuous: the algorithms work in the same way as with genuinely discrete measures. However, the presence of atoms filling consecutive grid ranges should indicate that the true compounding measure is probably continuous. A separate analysis could be carried out to formally check this hypothesis, for instance, by the methods proposed in Coca (2015). If confirmed, some kind of kernel smoothing could be used to produce an estimated density curve, or specific estimation methods for continuously distributed jumps could be employed, such as the ones mentioned in the Introduction.

For all the considered examples, we applied three versions of CoF with \(h=1\), \(k=1,2,3\) and \(\omega \equiv 1\). We also applied ChF using the estimate of CoF with \(k=1\) as the starting point. Observe that CoF with \(k=1\) can be made particularly fast because it reduces to a non-negative least squares optimisation problem. If computation time is of no concern, one can also implement CoF with higher values of *k* and use the resulting measure as a starting point for ChF. Given the complicated nature of the loss function, this may or may not lead to a better fit. In all these examples \(\Vert \varLambda \Vert =1\), which explains our particular choice of \(h = 1\); see the discussion above in connection with the error estimate (13).

### 5.1 Degenerate jump measure

Consider first the simplest measure \(\varLambda (dx)=\delta _1(dx)\) corresponding to a standard Poisson process with rate 1. Since all the jumps are integer valued and non-negative, it would be logical to take the non-negative integer grid for the possible atom positions of the discretised \(\varLambda \). This is how it was done for the horse kick data analysis in the Introduction. However, to test the robustness of our methods, we took the grid \(\{0,\pm 1/4,\pm 2/4,\ldots \}\). As a result, the estimated measures might place some mass on non-integer points or even on negative values of *x* to compensate for inaccurately fitted positive jumps. We have chosen to show in the graphs the discrepancies between the estimated and the true measure. An important indicator of the effectiveness of an estimation is the closeness of the total masses \(\Vert \hat{\varLambda }\Vert \) and \(\Vert \varLambda \Vert \). For \(\varLambda =\delta _1\), the probability of having more than 3 jumps is approximately 0.02; therefore, with the CoF method we expect that \(k=3\) would give an adequate estimate for these data. Indeed, the top panel of Fig. 3 demonstrates that CoF with \(k=3\) is much more effective in detecting the jumps of the Poisson process compared to \(k=2\) and, especially, to \(k=1\). The latter methods generate large discrepancies both in the atom sizes and in the total mass of the obtained measure. Observe also the presence of artifactual small atoms at large *x* and even at some non-integer locations.
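The 0.02 figure quoted above can be checked directly from the Poisson tail:

```python
from math import exp, factorial

# P(N > 3) for N ~ Poisson(1): the chance of more than 3 jumps in a unit
# interval, which motivates the cut-off k = 3 for Lambda = delta_1.
p_more_than_3 = 1.0 - sum(exp(-1.0) / factorial(j) for j in range(4))
print(round(p_more_than_3, 3))   # 0.019
```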

### 5.2 Discrete positive and negative jumps

Consider now a jump measure \(\varLambda =0.2\delta _{-1}+0.2\delta _1+0.6\delta _2\). This gives rise to a compound Poisson process with rate \(\Vert \varLambda \Vert =1\) and jumps of sizes \(-1,1,2\) having respective probabilities 0.2, 0.2 and 0.6. Figure 4 presents the results of our simulations. The presence of negative jumps cancelling positive jumps creates an additional difficulty for the estimation task. This phenomenon explains why the approximation obtained with \(k=2\) is worse than with \(k=1\) and \(k=3\): two jumps of sizes \(+1\) and \(-1\) sometimes cancel each other, which is indistinguishable from the case of no jumps, see the top panel of Fig. 4. Moreover, jumps of \(-1\) and 2 added together are the same as a single jump of size 1. The phenomenon persisted when we increased the sample size: \(k=1\) and \(k=3\) still performed better. Notice that increasing *k* to 3 improves the performance of CoF, although the computing time increases dramatically. The corresponding total variation distances of \(\hat{\varLambda }_k\) to the theoretical distribution are 0.3669, 0.6268 and 0.1558. The combined method gives the distance 0.0975 and, according to the bottom plot, is again a clear winner in this case; it is also much faster.
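The total variation distances quoted throughout compare atomic measures; for measures given as atom/weight pairs the computation is elementary. A sketch (the estimated measure below is made up for illustration; note that some authors include an extra factor 1/2 in the definition):

```python
def d_tv(measure_a, measure_b):
    """Total variation distance between two atomic measures,
    each given as a dict {atom position: mass}."""
    support = set(measure_a) | set(measure_b)
    return sum(abs(measure_a.get(x, 0.0) - measure_b.get(x, 0.0))
               for x in support)

true_m = {-1.0: 0.2, 1.0: 0.2, 2.0: 0.6}
est_m = {-1.0: 0.15, 1.0: 0.25, 2.0: 0.55, 3.0: 0.05}  # hypothetical estimate
print(round(d_tv(true_m, est_m), 10))                  # 0.2
```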

### 5.3 Unbounded compounding distribution

As an example of a measure \(\varLambda \) with unbounded support, we take a shifted Poisson distribution with parameter 1. Figure 5 presents our simulation results for this case; for computation purposes, we took the interval \(x\in [-2,5]\) as the support range for the estimated measure. In practice, the support range should be enlarged if atoms start appearing on the boundaries of the chosen interval, indicating a wider support of the estimated measure, see also Buchmann and Grübel (2003) for a related discussion. As the top panel reveals, in this case too the CoF method with \(k=3\) gives a better approximation than those with \(k=1\) or \(k=2\) (the total variation distance to the theoretical distribution is 0.1150 compared to 0.3256 and 0.9235, respectively), and the combined (faster) method gives an even better estimate with \(d_{\mathrm {TV}}(\tilde{\varLambda }_1,\varLambda )=0.0386\). Interestingly, the case of \(k=2\) was the worst in terms of the total variation distance to the original measure. We suspect that the ‘pairing effect’ may be responsible: the jumps are better fitted with a single integer-valued variable than with the sum of two. The algorithm may also have got stuck in a local minimum, producing small atoms at non-integer positions.

### 5.4 Continuous non-negative compounding distribution

Consider a compound Poisson process of rate 1 with the compounding distribution being exponential with parameter 1. The top plot of Fig. 6 shows that, as expected, the approximation accuracy increases with *k*. Observe that the total variation distance \(d_{\mathrm {TV}}(\hat{\varLambda }_3,\pmb {\varLambda })= 0.0985\) is comparable with the discretisation error: \(d_{\mathrm {TV}}(\varLambda ,\pmb {\varLambda })=0.075\). A Gaussian kernel smoothed version of \(\hat{\varLambda }_3\) is presented in the bottom plot of Fig. 6. The visible discrepancy for small values of *x* is explained by the fact that there were not sufficiently many small jumps in the simulated sample for the algorithm to put more mass around 0.
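The smoothing step can be sketched as convolving the atomic estimate with a Gaussian kernel; the bandwidth and the stand-in atomic estimate below are illustrative choices, not those behind Fig. 6:

```python
import numpy as np

def smooth_density(atoms, weights, xs, bandwidth=0.3):
    """Gaussian-kernel smoothed density of an atomic measure, evaluated at xs."""
    atoms, weights = np.asarray(atoms), np.asarray(weights)
    kernels = np.exp(-0.5 * ((xs[:, None] - atoms[None, :]) / bandwidth) ** 2)
    kernels /= bandwidth * np.sqrt(2.0 * np.pi)
    return kernels @ weights

# Hypothetical discretised estimate with roughly Exp(1)-shaped masses on a
# grid of step 0.25, echoing Sect. 5.4.
atoms = np.arange(0.25, 5.0, 0.25)
weights = 0.25 * np.exp(-atoms)

xs = np.linspace(-2.0, 7.0, 361)
dens = smooth_density(atoms, weights, xs)

# The smoothed curve carries (approximately) the total mass of the measure.
integral = dens.sum() * (xs[1] - xs[0])
print(integral, weights.sum())
```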

### 5.5 Continuous compounding distribution

Finally, Fig. 7 takes up the important example of compound Poisson processes with normally distributed jumps taking both positive and negative values. Once again, the estimates \(\hat{\varLambda }_k\) improve as *k* increases, and the combined method CoF–ChF gives an estimate similar to \(\hat{\varLambda }_3\). Notice an inflection around 0 caused by the constraint on the estimated measure which forces the origin to carry zero mass. This shows up as a dip in the curve produced by the kernel smoother.

## 6 Currency exchange data application

A standard model for asset prices is based on the *log-returns*, which for a commodity with price \(S_t\) at time *t* are defined as \(W_t=\log (S_t/S_0)\). For this model, the increments \(W_h-W_0\), \(W_{2h}-W_h,\ldots \) are independent and have a common infinitely divisible distribution. For example, many authors argue that the log-returns of currency exchange rates in a stable market indeed have i.i.d. increments, see e.g. Cont (2001). We took FX data of the Great Britain Pound (GBP) against a few popular currencies and chose to work with GBP to EUR exchange rates over a period of 200 consecutive days of a relatively stable market from 2014-01-02 to 2014-10-10, see the top plot of Fig. 8. We fitted various distributions popular among financial analysts to the daily increments of the log-returns: Gaussian, GEV, Weibull and stable distributions. The best fit was obtained by the stable distribution. In order to have a consistent comparison with our methods, we used the loss function (3) to estimate the parameters of the stable distribution, an estimation method which goes back at least to Paulson et al. (1975). The fitted stable \(S(1.882,-1,0.002,0;0)\) distribution (in the S0 parametrisation) is presented on the bottom plot of Fig. 8. A formal chi-square test, however, rejected the null hypothesis that the data come from the fitted stable distribution due to the large discrepancies in the tails. The distance between the empirical characteristic function and the fitted stable distribution's characteristic function measured in terms of the score function \(L_{\mathrm {ChF}}\) was \(6.12\times 10^{-3}\). We then ran our CoF algorithm with \(k=1\) and obtained a distribution within the distance \(5.53\times 10^{-6}\) from the empirical characteristic function. Taking the resulting jump measure as a starting point for our ChF method, we arrived at a distribution within the distance \(8.71\times 10^{-7}\). The observed improvement is due to more accurate estimates of the large jumps of the exchange rates (which are not related to global economic or political events).
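The reported distances are values of a loss comparing the empirical characteristic function with the model characteristic function \(\exp \{h\int (e^{iux}-1)\varLambda (dx)\}\). A sketch of the two ingredients, with a plain mean-squared discrepancy standing in for the paper's exact \(L_{\mathrm {ChF}}\) (the weight \(\omega \) and the integration rule are not reproduced here, and the frequency grid is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(2)

def empirical_cf(x, us):
    """Empirical characteristic function of a sample x on frequencies us."""
    return np.exp(1j * us[:, None] * x[None, :]).mean(axis=1)

def compound_poisson_cf(us, h, atoms, weights):
    """CF of an increment: exp(h * int (e^{iux} - 1) Lambda(dx)), atomic Lambda."""
    integrand = (np.exp(1j * us[:, None] * np.asarray(atoms)[None, :]) - 1.0) @ weights
    return np.exp(h * integrand)

def chf_loss(x, us, h, atoms, weights):
    """Stand-in for L_ChF: mean squared modulus of the CF discrepancy."""
    return np.mean(np.abs(empirical_cf(x, us)
                          - compound_poisson_cf(us, h, atoms, weights)) ** 2)

# Simulated increments from Lambda = delta_1 (a standard Poisson process), h = 1:
x = rng.poisson(1.0, size=2000).astype(float)
us = np.linspace(-5.0, 5.0, 101)

# The true measure fits the empirical CF better than a misspecified one:
print(chf_loss(x, us, 1.0, [1.0], [1.0]),
      chf_loss(x, us, 1.0, [1.0], [2.0]))
```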

It may be expected that, as in the case of a linear regression, the agreement of the estimated model with the data could be “too good”. To verify the stability of our estimates, we ran our algorithms on the data with different time lags: every 2, 4 and 8 days' records. It is interesting to note that even at the 8-day lag our algorithms attained a distribution at the distance \(2.19\times 10^{-4}\), an order of magnitude closer to the empirical characteristic function than the fitted stable distribution, despite the fact that 8 times less data were used, see Fig. 9.

The estimates of the measure \(\varLambda \) obtained for the various lags do not differ much, apart from the 8-day lag when only 25 observations are available, which reassures us that our estimation methods give consistent results. These findings are illustrated on the bottom panel of Fig. 9.

## 7 Discussion

This paper deals with nonparametric inference for compound Poisson processes. We proposed and analysed new algorithms based on characteristic function fitting (ChF) and convoluted cumulative distribution function fitting (CoF). The algorithms are based on the recently developed variational analysis of functionals of measures and the corresponding steepest descent methods for constrained optimisation on the cone of measures. CoF methods are capable of producing very accurate estimates, but at the expense of growing computational complexity. The ChF method critically depends on the initial approximation measure due to the highly irregular behaviour of the objective function. We have observed that the problems of convergence of the ChF algorithms can often be effectively overcome by choosing the sample measure (discretised to the grid) as the initial approximation measure. However, a better alternative, as we demonstrated in the paper, is to use the measure obtained by the simplest (\(k=1\)) CoF algorithm. This combined CoF–ChF algorithm is fast and in the majority of cases produces a measure which is closest in total variation to the measure under estimation; thus it is our method of choice.

The practical experience we gained during various tests allows us to conclude that the suggested methods are especially well suited for the estimation of discrete jump size distributions. They work well even with jumps that take both positive and negative values, not necessarily belonging to a regular lattice, demonstrating a clear advantage over the existing methods, see Buchmann and Grübel (2003, 2004). The use of our algorithms for continuous compounding distributions requires more trial and error in choosing the right discretisation grid and smoothing procedures. In order to properly take into account the continuity of the compounding measure, one may apply direct methods of density estimation as proposed by van Es et al. (2007) and Watteel and Kulperger (2003). Alternatively, one can try to develop an optimisation algorithm for the class of absolutely continuous measures by characterising their tangent cones. Additional conditions on the density, such as Lipschitz-type conditions, may also be imposed to make the feasible set closed in the corresponding measure topology.

## 8 Appendix

### 8.1 Proof of Theorem 1

The *tangent cone* to \(\mathbb {A}\) at \(\eta \) is the following subset of \(\mathbb {M}\): the closure of \(\{\nu \in \mathbb {M}:\ \nu =n(\mu -\eta )\ \text {for some}\ \mu \in \mathbb {A}\ \text {and some positive integer}\ n\}\). Equivalently, \(\mathbb {T}_{\mathbb {A}}(\eta )\) is the closure of the set of such \(\nu \in \mathbb {M}\) for which there exists an \(\varepsilon =\varepsilon (\nu )>0\) such that \(\eta + t\nu \in \mathbb {A}\) for all \(0\le t\le \varepsilon \).

If \(\eta \) provides a local minimum of *L* over a set \(\mathbb {A}\), then one must have \(DL(\eta )[\nu ]\ge 0\) for all \(\nu \in \mathbb {T}_{\mathbb {A}}(\eta )\), since otherwise moving from \(\eta \) in the direction of a \(\nu \) with \(DL(\eta )[\nu ]<0\) would decrease the value of *L* over \(\mathbb {A}\). This finishes the proof of (14).

In our case, the constraint set \(\mathbb {A}\) is the set \(\mathbb {L}=\{\eta \in \mathbb {M}_+:\ \eta (\{0\})=0\}\). The next step is to find a sufficiently rich class of measures belonging to the tangent cone \(\mathbb {T}_{\mathbb {L}}(\varLambda )\) for a given \(\varLambda \in \mathbb {L}\). For this, notice that for any such \(\varLambda \), the Dirac measure \(\delta _x\) belongs to \(\mathbb {T}_{\mathbb {L}}(\varLambda )\) since \(\varLambda +t\delta _x\in \mathbb {L}\) for any \(t\ge 0\) as soon as \(x\ne 0\). Similarly, given any Borel \(B\subset \mathbb {R}\), the negative measure \(-\varLambda |_B:=-\varLambda (\,\cdot \,\cap B)\), which is the restriction of \(-\varLambda \) onto *B*, is also in the tangent cone \(\mathbb {T}_\mathbb {L}(\varLambda )\), because for any \(0\le t\le 1\) we have \(\varLambda -t\varLambda |_B\in \mathbb {L}\).

Since this holds for an arbitrary Borel set *B*, we conclude that \(\nabla L(x;\varLambda )\le 0\) for \(\varLambda \)-almost all *x*, which, combined with the previous inequality, gives the second relation in (7).

### 8.2 Proof of Theorem 2

Let \(\pmb {N}\) be the space of locally finite counting measures \(\varphi \) on \(\mathbb {R}\). Let \(\mathcal {N}\) be the smallest \(\sigma \)-algebra which makes measurable all the mappings \(\varphi \mapsto \varphi (B)\in \mathbb {Z}_+\) for \(\varphi \in \pmb {N}\) and compact sets *B*. A Poisson point process with the *intensity measure* \(\mu \) is a measurable mapping \(\varPi \) from some probability space into \([\pmb {N},\mathcal {N}]\) such that for any finite family of disjoint compact sets \(B_1,\cdots ,B_k\), the random variables \(\varPi (B_1),\cdots ,\varPi (B_k)\) are independent and each \(\varPi (B_i)\) has a Poisson distribution with parameter \(\mu (B_i)\). Clearly \(\mu (B)=\mathbf{E}\varPi (B)\) for any *B*. To emphasise the dependence of the distribution on \(\mu \), we write the expectation as \(\mathbf{E}_\mu \) in the sequel.
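The defining properties are easy to verify by simulation; the sketch below samples a Poisson process on [0, 1] with intensity measure \(\mu (dx)=5\,dx\) (an illustrative choice) and checks that the count in a set *B* has mean and variance \(\mu (B)\):

```python
import numpy as np

rng = np.random.default_rng(3)

# Poisson point process on [0, 1] with intensity measure mu(dx) = 5 dx:
# sample a Poisson(mu([0,1])) number of points, placed i.i.d. ~ mu/||mu||.
def counts_in_interval(b_lo, b_hi, n_sims, total_mass=5.0):
    """Counts Pi(B) for B = [b_lo, b_hi) over n_sims process realisations."""
    counts = np.empty(n_sims)
    for i in range(n_sims):
        points = rng.uniform(0.0, 1.0, rng.poisson(total_mass))
        counts[i] = np.count_nonzero((points >= b_lo) & (points < b_hi))
    return counts

c = counts_in_interval(0.0, 0.4, 20_000)

# E Pi(B) = mu(B) = 5 * 0.4 = 2, and Var Pi(B) = mu(B) for a Poisson count.
print(c.mean(), c.var())
```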

where |*J*| stands for the cardinality of *J*, so that \(|J| = 0\) if *J* is the empty set. Define

Assume that the function *G* is such that there exists a constant \(c > 0\) satisfying

### 8.3 Gradient of ChF loss function

### 8.4 Gradient of CoF loss function



### Acknowledgements

This work was supported by Swedish Research Council Grant No. 11254331. The authors are grateful to Ilya Molchanov for fruitful discussions and to anonymous referees for valuable suggestions and stimulating criticism.

## References

- Asmussen, S.: Applied Probability and Queues. Stochastic Modelling and Applied Probability. Springer, New York (2008)
- Asmussen, S., Rosińsky, J.: Approximations of small jumps of Lévy process with a view towards simulation. J. Appl. Prob. **38**, 482–493 (2001)
- Błaszczyszyn, B., Merzbach, E., Schmidt, V.: A note on expansion for functionals of spatial marked point processes. Stat. Probab. Lett. **36**(3), 299–306 (1997)
- Bortkiewicz, L.: Das Gesetz der kleinen Zahlen. Monatshefte für Mathematik und Physik **9**(1), A39–A41 (1898)
- Buchmann, B., Grübel, R.: Decompounding: an estimation problem for Poisson random sums. Ann. Stat. **31**(4), 1054–1074 (2003)
- Buchmann, B., Grübel, R.: Decompounding Poisson random sums: recursively truncated estimates in the discrete case. Ann. Inst. Stat. Math. **56**(4), 743–756 (2004)
- Carrasco, M., Florens, J.P.: Generalization of GMM to a continuum of moment conditions. Econ. Theory **16**, 797–834 (2000)
- Coca, A.: Efficient nonparametric inference for discretely observed compound Poisson processes. Technical Report 1512.08472, ArXiv, December (2015)
- Comte, F., Duval, C., Genon-Catalot, V.: Nonparametric density estimation in compound Poisson processes using convolution power estimators. Metrika **77**(1), 163–183 (2014)
- Cont, R.: Empirical properties of asset returns: stylized facts and statistical issues. Quant. Financ. **1**, 223–236 (2001)
- Cont, R., Tankov, P.: Financial Modelling with Jump Processes. Chapman & Hall/CRC, London (2003)
- Duval, C.: Density estimation for compound Poisson processes from discrete data. Stoch. Process. Appl. **123**(11), 3963–3986 (2013)
- Duval, C.: When is it no longer possible to estimate a compound Poisson process? Electron. J. Stat. **8**(1), 274–301 (2014)
- Feuerverger, A., McDunnough, P.: On the efficiency of empirical characteristic function procedures. J. R. Stat. Soc. Ser. B **43**, 20–27 (1981a)
- Feuerverger, A., McDunnough, P.: On some Fourier methods for inference. J. Am. Stat. Assoc. **76**, 379–387 (1981b)
- Frees, E.W.: Nonparametric renewal function estimation. Ann. Stat. **14**(4), 1366–1378 (1986)
- Last, G.: Perturbation analysis of Poisson processes. Bernoulli **20**(2), 486–513 (2014)
- Mikosch, T.: Non-life Insurance Mathematics: An Introduction with the Poisson Process. Springer, New York (2009)
- Molchanov, I., Zuyev, S.: Variational analysis of functionals of Poisson processes. Math. Oper. Res. **25**(3), 485–508 (2000a)
- Molchanov, I., Zuyev, S.: Tangent sets in the space of measures: with applications to variational analysis. J. Math. Anal. Appl. **249**(2), 539–552 (2000b)
- Molchanov, I., Zuyev, S.: Steepest descent algorithms in a space of measures. Stat. Comput. **12**, 115–123 (2002)
- Neumann, M.H., Reiss, M.: Nonparametric estimation for Lévy process from low-frequency observations. Bernoulli **15**(1), 223–248 (2009)
- Paulson, A.S., Holcomb, E.W., Leitch, R.A.: The estimation of the parameters of the stable laws. Biometrika **62**(1), 163–170 (1975)
- Quin, J., Lawless, J.: Empirical likelihood and general estimating equations. Ann. Stat. **22**, 300–325 (1994)
- R Core Team: R: a language and environment for statistical computing. R Foundation for Statistical Computing (2015)
- Sueishi, N., Nishiyama, Y.: Estimation of Lévy processes in mathematical finance: a comparative study. In: Zerger, A., Argent, R.M. (eds.) MODSIM 2005 International Congress on Modelling and Simulation, pp. 953–959 (2005)
- van Es, B., Gugushvili, S., Spreij, P.: A kernel type nonparametric density estimator for decompounding. Bernoulli **13**(3), 672–694 (2007)
- Watteel, R.N., Kulperger, R.J.: Nonparametric estimation of the canonical measure for infinitely divisible distributions. J. Stat. Comput. Simul. **73**(7), 525–542 (2003)

## Copyright information

**Open Access**This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.