Approximate maximum likelihood estimation for stochastic differential equations with random effects in the drift and the diffusion

Abstract

Consider N independent stochastic processes \((X_i(t), t\in [0,T])\), \(i=1,\ldots , N\), defined by a stochastic differential equation with random effects, where the drift term depends linearly on a random vector \(\Phi _i\) and the diffusion coefficient depends on another linear random effect \(\Psi _i\). For these effects, we consider a joint parametric distribution. We propose and study two approximate likelihoods for estimating the parameters of this joint distribution based on discrete observations of the processes on a fixed time interval. Consistent and \(\sqrt{N}\)-asymptotically Gaussian estimators are obtained when both the number of individuals and the number of observations per individual tend to infinity. The estimation methods are investigated on simulated data and show good performance.

References

  • Berglund M, Sunnaker M, Adiels M, Jirstrand M, Wennberg B (2011) Investigations of a compartmental model for leucine kinetics using non-linear mixed effects models with ordinary and stochastic differential equations. Math Med Biol. https://doi.org/10.1093/imammb/dqr021

  • Delattre M, Dion C (2017) MsdeParEst: Parametric estimation in mixed-effects stochastic differential equations. https://CRAN.R-project.org/package=MsdeParEst. Accessed 16 Sept 2017

  • Delattre M, Genon-Catalot V, Larédo C (2017) Parametric inference for discrete observations of diffusion with mixed effects in the drift or in the diffusion coefficient. Stoch Process Appl. https://doi.org/10.1016/j.spa.2017.08.016

  • Delattre M, Genon-Catalot V, Samson A (2013) Maximum likelihood estimation for stochastic differential equations with random effects. Scand J Stat 40:322–343

  • Delattre M, Genon-Catalot V, Samson A (2015) Estimation of population parameters in stochastic differential equations with random effects in the diffusion coefficient. ESAIM Probab Stat 19:671–688

  • Delattre M, Lavielle M (2013) Coupling the SAEM algorithm and the extended Kalman filter for maximum likelihood estimation in mixed-effects diffusion models. Stat Interface 6(4):519–532

  • Dion C (2016) Nonparametric estimation in a mixed-effect Ornstein–Uhlenbeck model. Metrika 79:919–951

  • Ditlevsen S, De Gaetano A (2005) Mixed effects in stochastic differential equation models. REVSTAT Stat J 3:137–153

  • Donnet S, Samson A (2008) Parametric inference for mixed models defined by stochastic differential equations. ESAIM P&S 12:196–218

  • Forman J, Picchini U (2016) Stochastic differential equation mixed effects models for tumor growth and response to treatment. Preprint arXiv:1607.02633

  • Grosse Ruse M, Samson A, Ditlevsen S (2017) Multivariate inhomogeneous diffusion models with covariates and mixed effects. Preprint arXiv:1701.08284

  • Jacod J, Protter P (2012) Discretization of processes, vol 67. Stochastic modelling and applied probability. Springer, Berlin

  • Kessler M, Lindner A, Sørensen ME (2012) Statistical methods for stochastic differential equations, vol 124. Monographs on statistics and applied probability. Chapman & Hall, London

  • Leander J, Almquist J, Ahlstrom C, Gabrielsson J, Jirstrand M (2015) Mixed effects modeling using stochastic differential equations: Illustrated by pharmacokinetic data of nicotinic acid in obese Zucker rats. AAPS J. https://doi.org/10.1208/s12248-015-9718-8

  • Nie L (2006) Strong consistency of the maximum likelihood estimator in generalized linear and nonlinear mixed-effects models. Metrika 63:123–143

  • Nie L (2007) Convergence rate of the MLE in generalized linear and nonlinear mixed-effects models: theory and applications. J Stat Plan Inference 137:1787–1804

  • Nie L, Yang M (2005) Strong consistency of the MLE in nonlinear mixed-effects models with large cluster size. Sankhya Indian J Stat 67:736–763

  • Picchini U, De Gaetano A, Ditlevsen S (2010) Stochastic differential mixed-effects models. Scand J Stat 37:67–90

  • Picchini U, Ditlevsen S (2011) Practical estimation of high dimensional stochastic differential mixed-effects models. Comput Stat Data Anal 55:1426–1444

  • Whitaker GA, Golightly A, Boys RJ, Sherlock C (2017) Bayesian inference for diffusion driven mixed-effects models. Bayesian Anal 12:435–463

Author information

Corresponding author

Correspondence to Maud Delattre.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 231 KB)

Appendices

Appendix

1.1 Proof of Proposition 1

For the likelihood of the i-th vector of observations, we first compute the conditional likelihood given \(\Phi _i=\varphi , \Psi _i=\psi \), and then integrate the result with respect to the joint distribution of \((\Phi _i, \Psi _i)\). We replace the exact likelihood given fixed values \((\varphi , \psi )\), i.e. the likelihood of \((X_i^{\varphi , \psi }(t_j), j=1, \ldots ,n)\), by the Euler scheme likelihood of (8). Setting \(\psi = \gamma ^{-1/2}\), it is given, up to constants, by:

$$\begin{aligned} L_n(X_{i,n},\gamma , \varphi )= L_n(X_{i},\gamma ,\varphi )= \gamma ^{n/2} \exp {\left[ -\frac{\gamma }{2}(S_{i,n} +\varphi ' V_{i,n} \varphi -2\varphi ' U_{i,n})\right] }. \end{aligned}$$
(33)

The unconditional approximate likelihood is obtained by integrating with respect to the joint distribution \(\nu _{\vartheta }(d\gamma ,d\varphi )\) of the random effects \((\Gamma _i=\Psi _i^{-2},\Phi _i)\). For this, we first integrate \(L_n(X_i, \gamma ,\varphi )\) with respect to the Gaussian distribution \({{\mathcal {N}}}_d({\varvec{\mu }},\gamma ^{-1}{\varvec{\Omega }})\). Then, we integrate the result w.r.t. the distribution of \(\Gamma _i\). This second integration is only possible on the subset \(E_{i,n}(\vartheta )\) defined in (12).

Assume first that \({\varvec{\Omega }}\) is invertible. Integrating (33) with respect to the distribution \({{\mathcal {N}}}({\varvec{\mu }}, \gamma ^{-1}{\varvec{\Omega }})\) yields the expression:

$$\begin{aligned} \Lambda _n(X_{i,n},\gamma , {\varvec{\mu }}, {\varvec{\Omega }})= & {} \gamma ^{n/2}\exp {\left( -\frac{\gamma }{2}S_{i,n}\right) }\frac{\gamma ^{d/2}}{(2\pi )^{d/2} (det({\varvec{\Omega }}))^{1/2}} \\&\times \int _{{{\mathbb {R}}}^d} \exp {\left( \gamma \left( \varphi 'U_{i,n} -\frac{1}{2}\varphi 'V_{i,n}\varphi \right) \right) }\\&\times \exp {\left( -\frac{\gamma }{2} (\varphi -{\varvec{\mu }})'{\varvec{\Omega }}^{-1}(\varphi -{\varvec{\mu }})\right) } d\varphi \\= & {} \gamma ^{n/2}\exp {\left( -\frac{\gamma }{2}S_{i,n}\right) } \left( \frac{det({{\varvec{\Sigma }}}_{i,n})}{ det({\varvec{\Omega }})} \right) ^{1/2} \exp {\left( -\frac{\gamma }{2} T_{i,n} ({{\varvec{\mu }}}, {{\varvec{\Omega }}})\right) } \end{aligned}$$

where

$$\begin{aligned} T_{i,n}({{\varvec{\mu }}},{{\varvec{\Omega }}})= {\varvec{\mu }}'{\varvec{\Omega }}^{-1}{\varvec{\mu }}- \mathbf{m}_{i,n}'{{\varvec{\Sigma }}}_{i,n}^{-1}{} \mathbf{m}_{i,n}, \end{aligned}$$
(34)

and

$$\begin{aligned} {{\varvec{\Sigma }}}_{i,n}={\varvec{\Omega }}(I_d+V_{i,n}{\varvec{\Omega }})^{-1}, \quad \quad \mathbf{m}_{i,n}= {{\varvec{\Sigma }}}_{i,n} (U_{i,n}+{\varvec{\Omega }}^{-1}{\varvec{\mu }}). \end{aligned}$$

Computations using matrix identities and (5), (6), (10) show that \(T_{i,n}({{\varvec{\mu }}},{{\varvec{\Omega }}})\) is equal to the expression given in (11), i.e.:

$$\begin{aligned} T_{i,n}({\varvec{\mu }}, {\varvec{\Omega }})= ({{\varvec{\mu }}}- V_{i,n}^{-1}U_{i,n})'R_{i,n}^{-1}({{\varvec{\mu }}}- V_{i,n}^{-1}U_{i,n})- U_{i,n}'V_{i,n}^{-1}U_{i,n}. \end{aligned}$$

Noting that \( \frac{det({{\varvec{\Sigma }}}_{i,n})}{ det({\varvec{\Omega }})}= (det(I_d+V_{i,n}{\varvec{\Omega }}))^{-1} \), we get

$$\begin{aligned} \Lambda _n(X_{i,n}, \gamma , {\varvec{\mu }}, {\varvec{\Omega }})= & {} \gamma ^{n/2} (det(I_d + V_{i,n} {{\varvec{\Omega }}}))^{-1/2} \exp {\left[ -\frac{\gamma }{2}\left( S_{i,n}+T_{i,n}({\varvec{\mu }}, {\varvec{\Omega }})\right) \right] }. \nonumber \\ \end{aligned}$$
(35)

Then, we multiply \(\Lambda _n(X_{i,n},\gamma , {\varvec{\mu }}, {\varvec{\Omega }})\) by the Gamma density \((\lambda ^a /\Gamma (a)) \gamma ^{a-1}\exp {(-\lambda \gamma )}\), and on the set \(E_{i,n}(\vartheta )\) (see 12), we can integrate w.r.t. \(\gamma \) on \((0,+\infty )\). This gives \({{\mathcal {L}}}_n(X_{i,n}, \vartheta )\).

At this point, we observe that the formula (14) and the set \(E_{i,n}(\vartheta )\) are still well defined for non-invertible \({\varvec{\Omega }}\). Consequently, we can consider \({{\mathcal {L}}}_n(X_{i,n}, \vartheta )\) as an approximate likelihood for non-invertible \({\varvec{\Omega }}\). \(\square \)
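As a numerical illustration of the integration step leading to (35), the sketch below evaluates \(\log \Lambda _n(X_{i,n},\gamma , {\varvec{\mu }}, {\varvec{\Omega }})\) for one path using \({{\varvec{\Sigma }}}_{i,n}\), \(\mathbf{m}_{i,n}\) and (34); it assumes that the statistics \(S_{i,n}, U_{i,n}, V_{i,n}\) have already been computed from the discrete data and that \({\varvec{\Omega }}\) is invertible (function and variable names are illustrative).

```python
import numpy as np

def log_Lambda_n(gamma, mu, Omega, S_in, U_in, V_in, n):
    """Log of the conditional approximate likelihood (35) for one path, obtained by
    integrating the Euler likelihood (33) against N_d(mu, gamma^{-1} Omega).

    gamma: positive scalar, mu: (d,) array, Omega: (d, d) SPD matrix,
    S_in: scalar, U_in: (d,) array, V_in: (d, d) array (statistics of the i-th path)."""
    d = len(mu)
    I_d = np.eye(d)
    # Sigma_{i,n} = Omega (I_d + V_{i,n} Omega)^{-1},  m_{i,n} = Sigma_{i,n} (U_{i,n} + Omega^{-1} mu)
    Sigma = Omega @ np.linalg.inv(I_d + V_in @ Omega)
    m = Sigma @ (U_in + np.linalg.solve(Omega, mu))
    # T_{i,n}(mu, Omega) = mu' Omega^{-1} mu - m' Sigma^{-1} m, see (34)
    T = mu @ np.linalg.solve(Omega, mu) - m @ np.linalg.solve(Sigma, m)
    # log of (35): (n/2) log(gamma) - (1/2) log det(I_d + V_{i,n} Omega) - (gamma/2)(S_{i,n} + T_{i,n})
    _, logdet = np.linalg.slogdet(I_d + V_in @ Omega)
    return 0.5 * n * np.log(gamma) - 0.5 * logdet - 0.5 * gamma * (S_in + T)
```

The last step, multiplication by the \(G(a,\lambda )\) density and integration in \(\gamma \) over \((0,+\infty )\) on the set \(E_{i,n}(\vartheta )\), is then explicit, as in the proof above.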

1.2 Proof of Theorem 1

For simplicity of notation, we consider the case \(d=1\) (univariate random effect in the drift, i.e. \(\mu = {\varvec{\mu }}\), \(\omega ^2={\varvec{\Omega }}\)). The case \(d>1\) does not present additional difficulties.

Proof of Lemma 2

We omit the index n but keep the index i for the i-th sample path. We have (see 20):

$$\begin{aligned} Z_i -\frac{S_{i}^{(1)}}{n}= \frac{n}{2a+n} \left( \frac{S_{i}}{n}-\frac{S_{i}^{(1)}}{n}\right) -\frac{2a}{2a+n} \frac{S_{i}^{(1)}}{n}+ \frac{2\lambda + T_i(\mu , \omega ^2)}{2a+n}. \end{aligned}$$

Therefore, using definition (22) and the notations introduced in (H4) (bounds on the parameter set),

$$\begin{aligned} \left| Z_i -\frac{S_{i}^{(1)}}{n}\right| \le \left| \frac{S_{i}}{n}-\frac{S_{i}^{(1)}}{n}\right| + \frac{2\alpha _1}{n}\Psi _i^2\frac{C_{i}^{(1)}}{n} + \frac{2\ell _1}{n}+ \frac{|T_i( \mu , \omega ^2)|}{n} \end{aligned}$$
(36)

Note that, using (11):

$$\begin{aligned} T_i( \mu , \omega ^2)= & {} \frac{V_i}{(1+\omega ^2 V_i)}\left( \mu - \frac{U_i}{V_i}\right) ^2 -\frac{U_i^2}{ V_i}=\frac{\mu ^2 V_i}{(1+\omega ^2 V_i)}\nonumber \\&-\frac{\omega ^2U_i^2}{(1+\omega ^2 V_i)}-2\mu \frac{U_i }{(1+\omega ^2 V_i)}. \end{aligned}$$
(37)

Thus, using (H4) and the fact that \(x+x^2\le 2(1+x^2)\) for all \(x\ge 0\),

$$\begin{aligned} |T_i( \mu , \omega ^2)|\le & {} \frac{\mu ^2}{\omega ^2}+ \omega ^2 U_i^2+ 2 |\mu |\, |U_i| \le \frac{m^2}{c_0}+ c_1 U_i^2 +2 m | U_i|\\\le & {} \frac{m^2}{c_0}+ 2\max \{c_1, 2m\} (1+U_i^2)\lesssim 1+U_i^2. \end{aligned}$$

By the Hölder inequality and using (6), (36) and (37), we obtain:

$$\begin{aligned} \left| Z_i -\frac{S_{i}^{(1)}}{n}\right| ^p \lesssim \left| \frac{S_{i}}{n}-\frac{S_{i}^{(1)}}{n}\right| ^p + n^{-p} \left[ 1+ \Psi _i^{2p} \left( \frac{C_{i}^{(1)}}{n}\right) ^p + U_i^{2p}\right] . \end{aligned}$$

Now, we apply Lemmas 7 and 5 of Sect. 7.2 and (48) of Sect. 7.3; this yields inequality (23) of Lemma 2.

For the second inequality, we write:

$$\begin{aligned} Z_i^{-1}- \frac{n}{S_{i}^{(1)}}= & {} Z_i^{-1} \frac{n}{S_{i}^{(1)}}\left( \frac{S_{i}^{(1)}}{n}- Z_i\right) \\= & {} \left( Z_i^{-1}-\frac{n}{S_{i}^{(1)}}+\frac{n}{S_{i}^{(1)}}\right) \frac{n}{S_{i}^{(1)}}\left( \frac{S_{i}^{(1)}}{n}- Z_i\right) \\= & {} \left( \frac{n}{S_{i}^{(1)}}\right) ^2 Z_i^{-1} \left( \frac{S_{i}^{(1)}}{n}- Z_i\right) ^2 +\left( \frac{n}{S_{i}^{(1)}}\right) ^2\left( \frac{S_{i}^{(1)}}{n} -Z_i\right) . \end{aligned}$$

On \(F_{i}\), \(Z_i\ge c/\sqrt{n}\), so:

$$\begin{aligned} \left| Z_i^{-1}- \frac{n}{S_{i}^{(1)}}\right| 1_{F_{i}}\le \left( \frac{n}{C_{i}^{(1)}}\right) ^2 \Psi _i^{-4}\left( \frac{\sqrt{n}}{c}\left( \frac{S_{i}^{(1)}}{n} -Z_i\right) ^2+ \left| \frac{S_{i}^{(1)}}{n}- Z_i\right| \right) . \end{aligned}$$

Consequently, by the Hölder inequality,

$$\begin{aligned} \left| Z_i^{-1}- \frac{n}{S_{i}^{(1)}}\right| ^p 1_{F_{i}}\lesssim \left( \frac{n}{C_{i}^{(1)}}\right) ^{2p} \Psi _i^{-4p}\left( n^{p/2}\left( \frac{S_{i}^{(1)}}{n}- Z_i\right) ^{2p}+ \left| \frac{S_{i}^{(1)}}{n}- Z_i\right| ^p \right) . \end{aligned}$$
(38)

Now, we take the conditional expectation given \(\Psi _i=\psi , \Phi _i=\varphi \) and apply the Cauchy–Schwarz inequality. We use that, for n large enough (see (48) of Sect. 7.3):

$$\begin{aligned} {{\mathbb {E}}}_{\vartheta }\left( \left( \frac{n}{C_{i}^{(1)}}\right) ^{4p} |\Psi _i=\psi , \Phi _i=\varphi \right) = {{\mathbb {E}}} \left( \frac{n}{C_{i}^{(1)}}\right) ^{4p}= O(1). \end{aligned}$$

Then, we apply inequality (23) to get (24). \(\square \)

Now, we start proving Theorem 1. Omitting the index n in \(A_{i,n}, B_{i,n}\), we have [see (21), (25), (26)]

$$\begin{aligned} N^{-1/2}\frac{\partial \mathbf U _{N,n}}{\partial \lambda }(\vartheta )= & {} N^{-1/2}\sum _{i=1}^N \left( \frac{a}{\lambda } -\Gamma _i\right) + R_1,\\ N^{-1/2} \frac{\partial \mathbf U _{N,n}}{\partial a}(\vartheta )= & {} N^{-1/2} \sum _{i=1}^N \left( -\psi (a)+\log {\lambda }+\log {\Gamma _i}\right) +R_2+ R'_2, \\ N^{-1/2}\frac{\partial \mathbf U _{N,n}}{\partial \mu }(\vartheta )= & {} N^{-1/2} \sum _{i=1}^N \Gamma _i A_i+ R_3= N^{-1/2} \sum _{i=1}^N \Gamma _i A_i(T;\mu , \omega ^2)+ R'_3+ R_3\\ N^{-1/2}\frac{\partial \mathbf U _{N,n}}{\partial \omega ^2} (\vartheta )= & {} N^{-1/2}\frac{1}{2}\sum _{i=1}^N \left( \Gamma _i A_i^2- B_i\right) +R_4\\= & {} N^{-1/2}\frac{1}{2}\sum _{i=1}^N ( \Gamma _i A_i^2(T;\mu ,\omega ^2)-B_i(T, \omega ^2))+ R'_4+R_4. \end{aligned}$$

The remainder terms are:

$$\begin{aligned} R_1= & {} N^{-1/2}\sum _{i=1}^N (\Gamma _i -Z_i^{-1}1_{F_{i}}),\quad R_2 = N^{-1/2}\sum _{i=1}^N ( \log {Z_i^{-1}}\;1_{F_{i}}-\log {\Gamma _i}),\\ R'_2= & {} N^{1/2}(\psi (a+(n/2)) - \log {(a+(n/2))}) + N^{-1/2}\sum _{i=1}^N 1_{F_{i}^c},\\ R_3= & {} N^{-1/2}\sum _{i=1}^N A_i(Z_i^{-1}1_{F_{i}}-\Gamma _i),\;\;R_4 = N^{-1/2}\frac{1}{2}\sum _{i=1}^N A_i^2(Z_i^{-1}1_{F_{i}}-\Gamma _i),\\ R'_3= & {} N^{-1/2} \sum _{i=1}^N \Gamma _i (A_{i} -A_i(T;\mu , \omega ^2)),\\ R'_4= & {} N^{-1/2}\frac{1}{2}\sum _{i=1}^N \left( (B_{i}-B_i(T, \omega ^2)) - \Gamma _i( A_{i}^2 -A_i^2(T;\mu ,\omega ^2))\right) . \end{aligned}$$

The most difficult remainder terms are \(R_1, R_2,R_3,R_4\). They are treated in Lemma 3 below. For the term \(R'_2\), we use that \((\psi (a+(n/2)) - \log {(a+(n/2))})= O(n^{-1})\) (see 47) and that, for \(a>4\), \({{\mathbb {P}}}_{\vartheta }(F_{i}^c) \lesssim n^{-2}\) (see 17) to get that \(R'_2= O_P(\sqrt{N}/n)\).

Using Lemma 5 in Sect. 7.2, it is easy to check that \(R'_3\) and \(R'_4\) are \(O_P(\sqrt{N/n})\).

Therefore, it remains to find the limiting distribution of:

$$\begin{aligned} \left( \begin{array}{l} N^{-1/2}\sum _{i=1}^N \left( \frac{a}{\lambda } -\Gamma _i\right) ,\\ N^{-1/2} \sum _{i=1}^N \left( -\psi (a)+\log {\lambda }+\log {\Gamma _i}\right) ,\\ N^{-1/2} \sum _{i=1}^N \Gamma _i A_i(T;\mu , \omega ^2),\\ N^{-1/2}\frac{1}{2}\sum _{i=1}^N (B_i(T, \omega ^2) - \Gamma _i A_i^2(T;\mu ,\omega ^2)) \end{array}\right) . \end{aligned}$$

The first two components are exactly the score function corresponding to the exact observation of \((\Gamma _i, i=1, \ldots , N)\) (see Sect. 7.4). Hence, the first part of Theorem 1 is proved.

The whole vector obeys the standard central limit theorem. To compute the limiting distribution, we use results from Delattre et al. (2013), which deal with the case of \(\Gamma _i=\gamma \) fixed and \(\Phi _i \sim {{\mathcal {N}}}(\mu , \gamma ^{-1}\omega ^2)\). It is proved in that paper that

$$\begin{aligned} {{\mathbb {E}}}_{\vartheta }(A_i(T;\mu , \omega ^2)|\Gamma _i=\gamma )=0,\;\;{{\mathbb {E}}}_{\vartheta }(B_i(T, \omega ^2) - \Gamma _i A_i^2(T;\mu ,\omega ^2)|\Gamma _i=\gamma )=0. \end{aligned}$$

This result is stated for \(\gamma =1\) in Proposition 5, p. 328 of Delattre et al. (2013) with (unfortunately) different notation: \(A_i(T;\mu , \omega ^2)\) is denoted \(\gamma _i(\theta )\) and \(B_i(T, \omega ^2) \) is denoted \(I_i(\omega ^2)\) (formula (11) of this paper). It can be extended to any value of \(\gamma \) using formula (35) and the regularity properties of the statistical model. Hence, the third and fourth components are centered, and the covariances between the first two components and the last two ones are null. Moreover, it is also proved (in the same Proposition 5) that

$$\begin{aligned}&\gamma {{\mathbb {E}}}_{\vartheta }(A_i^2(T;\mu , \omega ^2)|\Gamma _i=\gamma )= {{\mathbb {E}}}_{\vartheta }(B_i(T; \omega ^2)|\Gamma _i=\gamma ),\\&{{\mathbb {E}}}_{\vartheta }\left( \frac{1}{4} \left( \gamma A_i^2(T; \mu , \omega ^2)-B_i(T;\omega ^2)\right) ^2 |\Gamma _i=\gamma \right) \\&\quad = \frac{1}{2}{{\mathbb {E}}}_{\vartheta }\left( 2\gamma A_i^2 (T; \mu , \omega ^2)B_i(T;\omega ^2)-B_i^2(T;\omega ^2)|\Gamma _i =\gamma \right) \\&{{\mathbb {E}}}_{\vartheta }\left( \frac{1}{2}(\gamma A_i^2 (T; \mu , \omega ^2) -B_i(T;\omega ^2))A_i(T;\mu , \omega ^2)| \Gamma _i=\gamma \right) \\&\quad ={{\mathbb {E}}}_{\vartheta }\left( A_i(T;\mu , \omega ^2) B_i(T, \omega ^2)|\Gamma _i=\gamma \right) . \end{aligned}$$

Hence, the covariance matrix of the last two components is equal to \(J(\vartheta )\) defined in (29).

The proof of the last item (second order derivatives) relies on the same tools with more cumbersome computations but no additional difficulty. It is detailed in the electronic supplementary material. Note that this part only requires that N, n both tend to infinity, without further constraint. So the proof is complete. \(\square \)

Lemma 3

Recall (20) and see (16) for the definition of \(F_{i,n}\). Then, for \(a>4\), \(R_1, R_2\) are \(O_P(\max \{1/\sqrt{n},\frac{\sqrt{N}}{n}\})\) and \(R_3, R_4\) are \(O_P(\sqrt{\frac{N}{n}})\).

Proof

The proof goes in several steps. We have introduced

$$\begin{aligned} S_{i}^{(1)}= \Psi _i^{2} \frac{1}{\Delta }\sum _{j=1}^{n}\left( W_i(t_{j})-W_i(t_{j-1})\right) ^2= \Gamma _i^{-1} C_{i}^{(1)}. \end{aligned}$$

We know the exact distribution of \(S_{i}^{(1)}\): \(C_{i}^{(1)}\) is independent of \(\Gamma _i\) and has distribution \(\chi ^2(n)=G(n/2, 1/2)\). By exact computations, using Gamma distributions (see Sect. 7.3), we obtain:

$$\begin{aligned} {{\mathbb {E}}}_{\vartheta }\left( \frac{n}{C_{i}^{(1)}}-1\right)= & {} \frac{2}{n-2},\quad {{\mathbb {E}}}_{\vartheta } \left( \frac{n}{C_{i}^{(1)}}-1\right) ^2\\= & {} \frac{2n+8}{(n-2)(n-4)}=O(n^{-1}),\\ \end{aligned}$$
$$\begin{aligned}&{{\mathbb {E}}}_{\vartheta } (\log {C_{i}^{(1)}/2})-\log {(n/2)} = \psi (n/2)-\log {(n/2)}=O(n^{-1}), \\&\quad \text{ Var }_{\vartheta } (\log {C_{i}^{(1)}/2})=\psi '(n/2)=O(n^{-1}). \end{aligned}$$

Thus,

$$\begin{aligned} \frac{1}{\sqrt{N}}\sum _{i=1}^N \left( \frac{n}{S_{i}^{(1)}} - \Gamma _i\right)= & {} O_P\left( \max \left\{ \frac{1}{\sqrt{n}}, \frac{\sqrt{N}}{n}\right\} \right) , \frac{1}{\sqrt{N}}\sum _{i=1}^N \left( \log {\frac{n}{S_{i}^{(1)}}} - \log {\Gamma _i}\right) \nonumber \\= & {} O_P\left( \max \left\{ \frac{1}{\sqrt{n}}, \frac{\sqrt{N}}{n}\right\} \right) \end{aligned}$$
(39)

Let us detail the first computation. Note that \((\Gamma _i, i=1,\ldots ,N)\) and \((C_{i}^{(1)}, i=1,\ldots ,N)\) are independent. Thus,

$$\begin{aligned} {{\mathbb {E}}}_{\vartheta }\left( \frac{1}{\sqrt{N}}\sum _{i=1}^N \left( \frac{n}{S_{i}^{(1)}} - \Gamma _i\right) \right) ^2= & {} \frac{1}{N}{{\mathbb {E}}}_{\vartheta } \left( \sum _{i=1}^N \Gamma _i \left( \frac{n}{C_{i}^{(1)}} - 1\right) \right) ^2\\= & {} \frac{a^2}{\lambda (\lambda +1)}{{\mathbb {E}}}_{\vartheta }\left( \frac{n}{C_{1}^{(1)}}-1\right) ^2\\&+\,(N-1)\frac{a^2}{\lambda ^2}\left( {{\mathbb {E}}}_{\vartheta }\left( \frac{n}{C_{1}^{(1)}}-1\right) \right) ^2\\= & {} O(n^{-1}) + O(N/n^2)=O(\max \{n^{-1}, N/n^2\}). \end{aligned}$$

The second computation is similar.
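These exact moment computations can be checked by a quick Monte Carlo experiment; the sketch below (with an illustrative value of n) compares empirical moments of a \(\chi ^2(n)\) variable with the formulas above.

```python
import numpy as np
from scipy import stats
from scipy.special import digamma, polygamma

n = 50
C = stats.chi2.rvs(df=n, size=10**6, random_state=np.random.default_rng(0))  # C_i^{(1)} ~ chi^2(n)

# E(n/C - 1) = 2/(n-2)  and  E(n/C - 1)^2 = (2n+8)/((n-2)(n-4)) = O(1/n)
print(np.mean(n / C - 1), 2 / (n - 2))
print(np.mean((n / C - 1) ** 2), (2 * n + 8) / ((n - 2) * (n - 4)))

# E log(C/2) - log(n/2) = psi(n/2) - log(n/2) = O(1/n)  and  Var log(C/2) = psi'(n/2)
print(np.mean(np.log(C / 2)) - np.log(n / 2), digamma(n / 2) - np.log(n / 2))
print(np.var(np.log(C / 2)), polygamma(1, n / 2))
```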

Then, we have to study

$$\begin{aligned} \frac{1}{\sqrt{N}}\sum _{i=1}^N \left( Z_i^{-1}1_{F_{i}} - \frac{n}{S_{i}^{(1)}}\right) , \quad \frac{1}{\sqrt{N}}\sum _{i=1}^N \left( \log {(Z_i^{-1})}1_{F_{i}}- \log {\frac{n}{S_{i}^{(1)}}}\right) . \end{aligned}$$
(40)

We write:

$$\begin{aligned} \left( Z_i^{-1}1_{F_{i}} - \frac{n}{S_{i}^{(1)}}\right) =\left( Z_i^{-1}- \frac{n}{S_{i}^{(1)}}\right) 1_{F_{i}} - \frac{n}{S_{i}^{(1)}}1_{F_{i}^c}. \end{aligned}$$

For \(a>4\), \({{\mathbb {P}}}_{\vartheta }(F_{i}^c) \lesssim n^{-2}\) (see 17). We have, applying the Cauchy–Schwarz inequality:

$$\begin{aligned} {{\mathbb {E}}}_{\vartheta }\left( \frac{1}{\sqrt{N}}\sum _{i=1}^N \frac{n}{S_{i}^{(1)}} 1_{F_{i}^c}\right) \le \frac{1}{\sqrt{N}}\sum _{i=1}^N \left( {{\mathbb {E}}}_{\vartheta }\Gamma _i^2 \;{{\mathbb {E}}}_{\vartheta }(\frac{n}{C_{i}^{(1)}})^2 \;{{\mathbb {P}}}_\vartheta (F_{i}^c)\right) ^{1/2}\lesssim \frac{\sqrt{N}}{n}. \end{aligned}$$

Next, applying Lemma 2,

$$\begin{aligned}&{{\mathbb {E}}}_{\vartheta }\left| \frac{1}{\sqrt{N}}\sum _{i=1}^N (Z_i^{-1} - \frac{n}{S_{i}^{(1)}}) 1_{F_{i}}\right| \\&\quad \lesssim \frac{\sqrt{N}}{n}{{\mathbb {E}}}_{\vartheta } \left( 1 +(1+\Phi _i^2)(\Psi _i^{-2}+ \Psi _i^{-4})+ \Phi _i^4+\Psi _i^4+\Phi _i^4\Psi _i^{-4}\right) . \end{aligned}$$

As noted above, since \(\Psi _i=\Gamma _i^{-1/2}\), \({{\mathbb {E}}}_{\vartheta }(\Psi _i^{-q})< +\infty \) for all \(q\ge 0\). We can write \(\Phi _i=\mu +\omega \Psi _i \varepsilon _i\) with \(\varepsilon _i\) a standard Gaussian variable independent of \(\Psi _i\). We have

$$\begin{aligned} {{\mathbb {E}}}_{\vartheta }\Phi _i^2\Psi _i^{-2}= & {} {{\mathbb {E}}}_{\vartheta }(\mu \Psi _i^{-1} +\omega \varepsilon _i)^2<+\infty ,\\ {{\mathbb {E}}}_{\vartheta }\Phi _i^2\Psi _i^{-4}= & {} {{\mathbb {E}}}_{\vartheta }(\mu \Psi _i^{-2} +\omega \Psi _i^{-1} \varepsilon _i)^2<+\infty ,\quad \text{ and } \text{ for } a>2,\\ {{\mathbb {E}}}_{\vartheta }\Phi _i^4\lesssim & {} 1 + {{\mathbb {E}}}_{\vartheta }\Psi _i^4= 1 + {{\mathbb {E}}}_{\vartheta }\Gamma _i^{-2}<+\infty . \end{aligned}$$

Thus, for \(a>4\),

$$\begin{aligned} \frac{1}{\sqrt{N}}\sum _{i=1}^N \left( Z_i^{-1}1_{F_{i}} - \frac{n}{S_{i}^{(1)}}\right) = O_P\left( \frac{\sqrt{N}}{n}\right) . \end{aligned}$$
(41)

Combining the left parts of (39) and (40) with (41), we obtain that, for \(a>4\), \(R_1=O_P(\max \{n^{-1/2}, \sqrt{N}/n\})\).

Analogously,

$$\begin{aligned} \left( \log {(Z_i^{-1})}1_{F_{i}}- \log {\frac{n}{S_{i}^{(1)}}}\right) =\left( \log {(Z_i^{-1})}- \log {\frac{n}{S_{i}^{(1)}}}\right) 1_{F_{i}} - 1_{F_{i}^c}\log {\frac{n}{S_{i}^{(1)}}}. \end{aligned}$$

As above, we can prove

$$\begin{aligned} \left| {{\mathbb {E}}}_{\vartheta }\left( \frac{1}{\sqrt{N}}\sum _{i=1}^N \log {\frac{n}{S_{i}^{(1)}}} 1_{F_{i}^c}\right) \right| = O\left( \frac{\sqrt{N}}{n}\right) , \quad \quad \text{ for }\quad a>4. \end{aligned}$$
(42)

And:

$$\begin{aligned} \log {(Z_i^{-1})}- \log {\frac{n}{S_{i}^{(1)}}}= & {} \log {\frac{S_{i}^{(1)}}{n}} -\log {Z_i} \\= & {} \left( \frac{S_{i}^{(1)}}{n}-Z_i\right) \int _0^1 ds\left( \frac{1}{s\frac{S_{i}^{(1)}}{n}+ (1-s)Z_i} -\frac{1}{\frac{S_{i}^{(1)}}{n}} + \frac{1}{\frac{S_{i}^{(1)}}{n}}\right) \\= & {} \frac{1}{\frac{S_{i}^{(1)}}{n}} \left( \frac{S_{i}^{(1)}}{n}-Z_i\right) \\&+\,\frac{1}{\frac{S_{i}^{(1)}}{n}} \left( \frac{S_{i}^{(1)}}{n}-Z_i\right) ^2 \int _0^1 ds \frac{(1-s)}{s\frac{S_{i}^{(1)}}{n}+ (1-s)Z_i}. \end{aligned}$$

On \(F_{i}\),

$$\begin{aligned} \int _0^1 ds \frac{(1-s)}{s\frac{S_{i}^{(1)}}{n}+ (1-s)Z_i}\le Z_i^{-1}\le \frac{\sqrt{n}}{c}. \end{aligned}$$

Therefore,

$$\begin{aligned} \left| \log {(Z_i^{-1})}- \log {\frac{n}{S_{i}^{(1)}}}\right| \le \frac{1}{\frac{C_{i}^{(1)}}{n}} \Psi _i^{-2}\left( \left| \frac{S_{i}^{(1)}}{n}-Z_i\right| + \frac{\sqrt{n}}{c}(\frac{S_{i}^{(1)}}{n}-Z_i)^2\right) . \end{aligned}$$
(43)

Now, we take the conditional expectation given \(\Phi _i=\varphi , \Psi _i=\psi \), and apply first the Cauchy–Schwarz inequality and then Lemma 2. This yields:

$$\begin{aligned}&{{\mathbb {E}}}_{\vartheta }\left( \left| \log {(Z_i^{-1})}- \log {\frac{n}{S_{i}^{(1)}}}\right| |\Phi _i=\varphi , \Psi _i=\psi \right) \nonumber \\&\quad \lesssim \frac{1}{n}(1+ \psi ^{-2}(1+\varphi ^2) +\psi ^2(1+\varphi ^4) +\varphi ^2+\varphi ^4 +\psi ^6). \end{aligned}$$
(44)

We have to check that the expectation above is finite. The worst term is \({{\mathbb {E}}}_{\vartheta }\Psi _i^6= {{\mathbb {E}}}_{\vartheta }\Gamma _i^{-3}\) which requires the constraint \(a>3\). Thus, for \(a>3\), we have:

$$\begin{aligned} {{\mathbb {E}}}_{\vartheta }\frac{1}{\sqrt{N}}\left| \sum _{i=1}^N (\log {(Z_i^{-1})}- \log {\frac{n}{S_{i}^{(1)}}})1_{F_{i}}\right| = O\left( \frac{\sqrt{N}}{n}\right) . \end{aligned}$$

Therefore, we have proved that \(R_1, R_2\) are \(O_P(\max \{\frac{1}{\sqrt{n}}, \frac{\sqrt{N}}{n}\})\).

For \(R_3, R_4\), we proceed analogously but we have to deal with the terms \(A_{i}, A_{i}^2\). We write again:

$$\begin{aligned} Z_i^{-1}1_{F_{i}} - \Gamma _i= \left( Z_i^{-1}-\frac{n}{S_{i}^{(1)}}\right) 1_{F_{i}}-\frac{n}{S_{i}^{(1)}}1_{F_{i}^c}+ \frac{n}{S_{i}^{(1)}}- \Gamma _i. \end{aligned}$$

Using Lemma 6 and the Cauchy–Schwarz inequality, we obtain:

$$\begin{aligned} {{\mathbb {E}}}_{\vartheta } \left| \frac{1}{\sqrt{N}}\sum _{i=1}^N A_i\left( \frac{n}{S_{i}^{(1)}}- \Gamma _i\right) \right|\lesssim & {} \sqrt{\frac{N}{n}} {{\mathbb {E}}}_{\vartheta }\left( \Gamma _i|\Phi _i| +\Gamma _i^{1/2}\right) =O\left( \sqrt{\frac{N}{n}}\right) ,\\ {{\mathbb {E}}}_{\vartheta } \left| \frac{1}{\sqrt{N}}\sum _{i=1}^N A_i\frac{n}{S_{i}^{(1)}}1_{F_{i}^c}\right|\lesssim & {} \sqrt{\frac{N}{n}} {{\mathbb {E}}}_{\vartheta }\left( \Gamma _i|\Phi _i|+\Gamma _i^{1/2}\right) =O\left( \sqrt{\frac{N}{n}}\right) . \end{aligned}$$

Applying now Lemma 2, we obtain:

$$\begin{aligned}&{{\mathbb {E}}}_{\vartheta } \left| \frac{1}{\sqrt{N}}\sum _{i=1}^N A_i\left( Z_i^{-1}-\frac{n}{S_{i}^{(1)}}\right) 1_{F_{i}}\right| \\&\quad \lesssim \frac{\sqrt{N}}{n} {{\mathbb {E}}}_{\vartheta } (|\Phi _i|(1+\Psi _i+\Psi _i^2)+\Phi _i^2\Psi _i+\Psi _i+\Psi _i^2+\Psi _i^3). \end{aligned}$$

This requires the constraint \({{\mathbb {E}}}_{\vartheta }\Psi _i^3 ={{\mathbb {E}}}_{\vartheta } \Gamma _i^{-3/2}<+\infty \), i.e. \(a>3/2\). Note that \({{\mathbb {E}}}_{\vartheta }(\Psi _i^{-4p})\) is finite for any p. This is why these moments are omitted.

We proceed analogously for \(R_4\) and find that \(R_4= O_P(\sqrt{N/n})\) for \(a>2\). The different constraint on a derives from the presence of \(A_i^2\) in \(R_4\) which requires higher moments for \(\Gamma _i^{-1}\). \(\square \)

1.3 Proof of Theorem 3

We only give a sketch of the proof and assume \(d=1\) for simplicity. We compute \({{\mathcal {H}}}_{N,n}(\vartheta )\) (see 32) and set \(G_{i,n}=\{S_{i,n}\ge k\sqrt{n}\} \). We have:

$$\begin{aligned} \frac{\partial \mathbf V _{N,n}^{(1)}}{\partial \lambda }(\lambda ,a)= & {} \sum _{i=1}^N \left( \frac{a}{\lambda } -\xi _{i,n}^{-1}\right) ,\\ \frac{\partial \mathbf V _{N,n}^{(1)}}{\partial a}(\lambda ,a)= & {} N \left( \psi (a+n/2)-\log {(a+n/2)} -\psi (a)+\log {\lambda }\right) -\sum _{i=1}^N \log { \xi _{i,n} }, \\ \frac{\partial \mathbf W _{N,n}}{\partial \mu }(\mu , \omega ^2)= & {} \sum _{i=1}^N 1_{G_{i,n}}\;\frac{n}{S_{i,n}} A_{i,n},\\ \frac{\partial \mathbf W _{N,n}}{\partial \omega ^2} (\mu , \omega ^2)= & {} \frac{1}{2}\sum _{i=1}^N \left( 1_{G_{i,n}} \frac{n}{S_{i,n}} A_{i,n}^2 -B_{i,n}\right) . \end{aligned}$$

We can prove that, under (H1)–(H2), if \(a_0>2\), then \({{\mathbb {P}}}_{\vartheta _0} (G_{i,n}^c)\lesssim n^{-2}\), using tools analogous to, but simpler than, those of Lemma 1. The result of Lemma 2 holds with \(\xi _{i,n}\) instead of \(Z_{i,n}\) and without \(1_{F_{i,n}}\). This allows us to prove that:

$$\begin{aligned} N^{-1/2}\frac{\partial \mathbf V _{N,n}^{(1)}}{\partial \lambda }(\lambda ,a)= & {} N^{-1/2} \sum _{i=1}^N \left( \frac{a}{\lambda } -\Gamma _i\right) + r_1,\\ N^{-1/2} \frac{\partial \mathbf V _{N,n}^{(1)}}{\partial a}(\lambda ,a)= & {} N^{-1/2} \sum _{i=1}^N \left( \log { \Gamma _i }-\psi (a)+\log {\lambda }\right) +r_2 \end{aligned}$$

where \(r_1\) and \(r_2\) are \(O_P(\sqrt{N}/n)\).

The result of Lemma 2 holds with \(S_{i,n}/n\) instead of \(Z_{i,n}\) and \(G_{i,n}\) instead of \(F_{i,n}\) (and the proof is much simpler). This implies that:

$$\begin{aligned} N^{-1/2}\frac{\partial \mathbf W _{N,n}}{\partial \mu }(\mu , \omega ^2)= & {} N^{-1/2} \sum _{i=1}^N \Gamma _i A_i (T;\mu , \omega ^2)+r_3,\\ N^{-1/2}\frac{\partial \mathbf W _{N,n}}{\partial \omega ^2} (\mu , \omega ^2)= & {} N^{-1/2}\frac{1}{2}\sum _{i=1}^N ( \Gamma _i A_i^2(T;\mu ,\omega ^2)-B_i(T, \omega ^2))+r_4, \end{aligned}$$

and we can prove that \(r_3\) and \(r_4\) are \(O_P(N/n)\).

Auxiliary results

In Sect. 7.1, we explain why Assumption (H2) is required. Results on discretizations are recalled in Sect. 7.2. Sections 7.3 and 7.4 contain results on Gamma distributions and estimation from direct observations of the random effects.

7.1 Preliminary results for SDEs with random effects

SDEMEs have specific features that distinguish them from usual SDEs. The discussion below justifies our setup and assumptions. Consider a stochastic process \((X(t), t \ge 0)\) given by:

$$\begin{aligned} dX(t)= b(X(t), \Phi ) dt + \sigma (X(t), \Psi ) dW(t), \quad X(0)=x \end{aligned}$$
(45)

where the Wiener process (W(t)) and the r.v.'s \((\Phi ,\Psi )\) are defined on a common probability space \((\Omega , {{\mathcal {F}}}, {{\mathbb {P}}})\) and are independent. We set \(({{\mathcal {F}}}_t=\sigma ( \Phi , \Psi , W(s), s\le t), t\ge 0)\). To understand the properties of (X(t)), let us introduce the system of stochastic differential equations:

$$\begin{aligned} dX(t)= & {} b(X(t), \Phi (t)) dt + \sigma (X(t), \Psi (t)) dW(t), \quad X(0)=x\\ d\Phi (t)= & {} 0, \quad \Phi (0)=\Phi \\ d\Psi (t)= & {} 0, \quad \Psi (0)=\Psi . \end{aligned}$$

Existence and uniqueness of strong solutions are therefore ensured by the classical assumptions:

  1. (A1)

    The real valued functions \((x, \varphi )\in {{\mathbb {R}}}\times {{\mathbb {R}}}^{d} \rightarrow b(x, \varphi )\) and \((x, \psi )\in {{\mathbb {R}}}\times {{\mathbb {R}}}^{d'} \rightarrow \sigma (x, \psi )\) are \(C^1\).

  2. (A2)

    There exists a constant K such that, \(\forall x, \varphi , \psi \in {{\mathbb {R}}}\times {{\mathbb {R}}}^{d}\times {{\mathbb {R}}}^{d'}\), (\(\Vert .\Vert \) is the Euclidean norm):

    $$\begin{aligned} |b(x, \varphi )| \le K(1+|x| +\Vert \varphi \Vert ), \quad |\sigma (x, \psi )| \le K(1+|x| +\Vert \psi \Vert ) \end{aligned}$$

    If the following additional assumption holds:

  3. (A3)

    The r.v.’s \(\Phi ,\Psi \) satisfy \({{\mathbb {E}}}(\Vert \Phi \Vert ^{2p} + \Vert \Psi \Vert ^{2p})<+\infty \)

then, under (A1)–(A2), for all T, \(\sup _{t\le T}{{\mathbb {E}}}X^{2p}(t)<+\infty \).

To deal with discrete observations, moment properties are required to control error terms. Here, we consider a simple model for the drift term, \(b(x,\varphi )= \varphi ' b(x)\), and for the diffusion coefficient, \(\sigma (x,\psi )= \psi \sigma (x)\), with \(\varphi \in {{\mathbb {R}}}^{d}\), \(\psi \in {{\mathbb {R}}}\). The usual assumptions on b and \( \sigma \) are that \(b, \sigma \) are \(C^1({{\mathbb {R}}})\) and have linear growth w.r.t. x. Then, for all fixed \((\varphi , \psi ) \in {{\mathbb {R}}}^d\times (0,+\infty )\), the stochastic differential equation

$$\begin{aligned} d\, X^{\varphi , \psi }(t) = \varphi ' b(X^{\varphi ,\psi }(t))dt+\psi \sigma (X^{\varphi ,\psi }(t))\, dW(t),\quad X^{\varphi ,\psi }(0) =x \end{aligned}$$
(46)

admits a unique strong solution process \((X^{\varphi ,\psi }(t), t\ge 0)\) adapted to the filtration \(({{\mathcal {F}}}_t)\). Moreover, the stochastic differential equation with random effects (1) admits a unique strong solution adapted to \(({{\mathcal {F}}}_t)\) such that the joint process \((\Phi ,\Psi , X(t), t \ge 0)\) is strong Markov and the conditional distribution of (X(t)) given \(\Phi =\varphi , \Psi =\psi \) is identical to the distribution of (46). We need more than this property: moment properties for X(t) are required. Let us stress that, with \(b(x, \varphi )=\varphi 'b(x), \sigma (x,\psi )=\psi \sigma (x)\), (A2) does not hold even if \(b,\sigma \) have linear growth. In particular, (A3) does not ensure that X(t) has moments of order 2p. Let us illustrate this point with an example.

Example

Consider the mixed effect Ornstein–Uhlenbeck process \(dX(t)= \Phi X(t) dt + \Psi dW(t), X(0)=x\). Then, \(\displaystyle X(t)= x \exp {(\Phi t)} + \Psi \exp {(\Phi t)}\int _0^t \exp {(-\Phi s)}\; dW(s) \). The first order moment of X(t) is finite iff \({{\mathbb {E}}}\exp {(\Phi t)}<+\infty \) and \({{\mathbb {E}}}\left[ |\Psi |\left( (\exp {(2\Phi t)}-1)/(2\Phi )\right) ^{1/2}\right] <+\infty \), which is much stronger than the existence of moments for \(\Phi , \Psi \). When \((\Phi , \Psi )\) has distribution (2), \(\displaystyle {{\mathbb {E}}}\exp {(\Phi t)}= \exp {(\mu t)} {{\mathbb {E}}}(\exp {(\omega ^2 t^2/\Gamma )})=+\infty \).
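For illustration, X(T) can be simulated exactly from the explicit solution above, since conditionally on \((\Phi , \Psi )\) the integral term is centered Gaussian with variance \(\Psi ^2(\exp {(2\Phi T)}-1)/(2\Phi )\). The sketch below draws the random effects as in (2) (with purely illustrative parameter values) and shows that the running empirical mean of |X(T)| does not stabilise.

```python
import numpy as np

rng = np.random.default_rng(1)
a, lam, mu, omega2, x0, T = 2.0, 1.0, 0.0, 1.0, 1.0, 1.0   # illustrative values
N = 10**6

# Random effects as in (2): Gamma_i ~ G(a, lambda), Psi_i = Gamma_i^{-1/2}, Phi_i | Gamma_i ~ N(mu, omega^2 / Gamma_i)
Gam = rng.gamma(shape=a, scale=1.0 / lam, size=N)
Psi = Gam ** (-0.5)
Phi = mu + np.sqrt(omega2 / Gam) * rng.standard_normal(N)

# Exact draw of X(T) = x e^{Phi T} + Psi e^{Phi T} int_0^T e^{-Phi s} dW(s)
cond_var = (np.exp(2 * Phi * T) - 1) / (2 * Phi)            # conditional variance of the integral term
X_T = x0 * np.exp(Phi * T) + Psi * np.sqrt(cond_var) * rng.standard_normal(N)

running_mean = np.cumsum(np.abs(X_T)) / np.arange(1, N + 1)
print(running_mean[[10**3 - 1, 10**4 - 1, 10**5 - 1, 10**6 - 1]])  # jumps by orders of magnitude: E|X(T)| = +infinity
```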

In a more general setup, the existence of moments is usually proved using the Gronwall lemma. In this case, it would lead to an even stronger condition, such as \( {{\mathbb {E}}}(\exp {(\Phi ^2 t)}+ \exp {(\Psi ^2 t)})<+\infty \).

Hence, stronger assumptions than usual are necessary. Indeed, (A2) holds for model (1) either if \(\varphi \) and \(\psi \) belong to a bounded set, or if b and \(\sigma \) are uniformly bounded.

One could think of using a localisation device to obtain bounded b and \(\sigma \). The problem here is that we deal with N sample paths \((X_i(t)), i=1, \ldots ,N\), which possibly have no moments, so a localisation device would be complex here.

7.2 Approximation results for discretizations

The following lemmas are proved in Delattre et al. (2017). In the first two lemmas, we set \(X_1(t)=X(t), \Phi _1=\Phi , \Psi _1=\Psi \).

Lemma 4

Under (H1)–(H2), for \(s\le t\) and \(t-s \le 1\), \(p\ge 1\),

$$\begin{aligned} {{\mathbb {E}}}_{\vartheta }(|X(t)-X(s)|^{p}|\Phi =\varphi , \Psi =\psi )\lesssim K^{p} (t-s)^{p/2} (|\varphi |^{p}+\psi ^{p}). \end{aligned}$$

For \(t\rightarrow H(t,X.)\) a predictable process, let \(V(H;T)= \int _0^T H(s,X.)ds\) and \(U(H;T)=\int _0^T H(s,X.)dX(s)\). The following results can be proved by standard arguments.

Lemma 5

Assume (H1)–(H2) and \(p\ge 1\). If H is bounded, \( {{\mathbb {E}}}_{\vartheta }(|U(H;T)|^{p}|\Phi =\varphi , \Psi =\psi ) \lesssim | \varphi |^{p}+\psi ^{p}.\)

Consider \(f:{{\mathbb {R}}} \rightarrow {{\mathbb {R}}}\) and set \(H(s,X.)=f(X(s)), H_n(s,X.)= \sum _{j=1}^n f(X((j-1)\Delta ))1_{((j-1)\Delta ,j\Delta ]}(s)\). If f is Lipschitz,

$$\begin{aligned} {{\mathbb {E}}}_{\vartheta }(|V(H;T)-V(H_n;T)|^{p}|\Phi =\varphi , \Psi =\psi )\lesssim & {} \Delta ^{p/2} (|\varphi |^{p}+\psi ^{p}). \end{aligned}$$

If f is \(C^2\) with \(f',f''\) bounded

$$\begin{aligned} {{\mathbb {E}}}_{\vartheta }(|U(H;T)-U(H_n;T)|^{p}|\Phi {=}\varphi , \Psi {=}\psi )\lesssim & {} \Delta ^{p/2} (\varphi ^{2p}+|\varphi |^{p}\psi ^{p}+\psi ^{2p}+\psi ^{3p}). \end{aligned}$$

Lemma 6

Recall notations (25)–(26). Under (H1)–(H2),

$$\begin{aligned} {{\mathbb {E}}}_{\vartheta }(|B_{i,n} -B_i(T;\omega ^2)|^{p}|\Phi _i=\varphi , \Psi _i=\psi )\lesssim & {} \Delta ^{p/2} (|\varphi |^{p}+\psi ^{p})\\ {{\mathbb {E}}}_{\vartheta }(|A_{i,n} -A_i(T;\mu , \omega ^2)|^{p}|\Phi _i=\varphi , \Psi _i=\psi )\lesssim & {} \Delta ^{p/2} (\varphi ^{2p}+\psi ^{3p})\\ {{\mathbb {E}}}_{\vartheta }(|A_i(T;\mu , \omega ^2)|^{p}|\Phi _i=\varphi , \Psi _i=\psi ))\lesssim & {} |\varphi |^{p}+\psi ^{p}. \end{aligned}$$

Let

$$\begin{aligned} S_{i,n}^{(1)}=\frac{1}{ \Gamma _i} \sum _{j=1}^{n}\frac{(W_i(t_{j})-W_i(t_{j-1}))^2}{\Delta }. \end{aligned}$$

Lemma 7

Then, for all \(p\ge 1\),

$$\begin{aligned} {{\mathbb {E}}}_{\vartheta }\left( \left| \frac{S_{i,n}}{n}- \frac{S_{i,n}^{(1)}}{n}\right| ^{p}|\Phi _i=\varphi , \Psi _i=\psi \right) \lesssim \Delta ^{p} (\psi ^{2p}\varphi ^{2p} + \psi ^{4p} + \varphi ^{2p} ). \end{aligned}$$

7.3 Properties of the Gamma distribution

The digamma function \(\psi (a)= \Gamma '(a)/\Gamma (a)\) admits the following integral representation: \(\psi (z)= -\gamma +\int _0^1 (1-t^{z-1})/(1-t) dt\) (where \(-\gamma =\psi (1)=\Gamma '(1)\)). For all positive a, we have \( \psi '(a)= -\int _0^1 \frac{\log {t}}{1-t}\;t^{a-1} dt\). Consequently, using an integration by parts, \( -a\psi '(a)=-1- \int _0^1 t^a g(t)dt\), where \(g(t)=(\log {t}/(1-t))' \). A simple study yields that \(t^ag(t)\) is integrable on (0, 1) and positive except at \(t=1\). Thus, \(1-a\psi '(a) \ne 0\). The following asymptotic expansion as a tends to infinity holds:

$$\begin{aligned} \psi (a)= \log {a} - \frac{1}{2a} +O\left( \frac{1}{a^2}\right) ,\;\;\psi '(a) =\frac{1}{a} +O\left( \frac{1}{a^2}\right) . \end{aligned}$$
(47)

If X has distribution \(G(a, \lambda )\), then \(\lambda X\) has distribution G(a, 1). For every integer k, \({{\mathbb {E}}}(\lambda X)^k=\frac{\Gamma (a+k)}{\Gamma (a)}\). For \(a>k\), \({{\mathbb {E}}}(\lambda X)^{-k}=\frac{\Gamma (a-k)}{\Gamma (a)}\). Moreover, \({{\mathbb {E}}}\log {(\lambda X)}= \psi (a)\) and \(\text{ Var } [\log {(\lambda X)}]= \psi '(a)\).

In particular, if \(X=\sum _{j=1}^n \varepsilon _j^2\) where the \(\varepsilon _j\)'s are i.i.d. \({{\mathcal {N}}}(0,1)\), then \(X\sim \chi ^2(n)=G(n/2,1/2)\). Therefore, \({{\mathbb {E}}}X^{-p}<+\infty \) for \(n>2p\) and, as \(n\rightarrow +\infty \),

$$\begin{aligned} {{\mathbb {E}}}\left( \frac{X}{n}\right) ^p= O(1),\quad \quad {{\mathbb {E}}}\left( \frac{n}{X}\right) ^{p}= O(1) . \end{aligned}$$
(48)
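A quick numerical check of these identities (with illustrative values of a, \(\lambda \) and k):

```python
import numpy as np
from scipy.special import gammaln, digamma, polygamma

rng = np.random.default_rng(2)
a, lam, k = 5.0, 3.0, 2
Y = lam * rng.gamma(shape=a, scale=1.0 / lam, size=10**6)        # lambda * X ~ G(a, 1)

print(np.mean(Y ** k), np.exp(gammaln(a + k) - gammaln(a)))      # E(lambda X)^k  = Gamma(a+k)/Gamma(a)
print(np.mean(Y ** (-k)), np.exp(gammaln(a - k) - gammaln(a)))   # E(lambda X)^-k = Gamma(a-k)/Gamma(a), valid for a > k
print(np.mean(np.log(Y)), digamma(a))                            # E log(lambda X) = psi(a)
print(np.var(np.log(Y)), polygamma(1, a))                        # Var log(lambda X) = psi'(a)
print(digamma(50.0), np.log(50.0) - 1 / 100.0)                   # psi(a) ~ log a - 1/(2a) for large a, see (47)
```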

7.4 Direct observation of the random effects

Assume that a sample \((\Phi _i, \Gamma _i), i=1, \ldots ,N\), is observed and that \(d=1\) for simplicity. The Gamma distribution \(G(a, \lambda )\) with parameters \(a>0, \lambda >0\) has density \( \gamma _{a,\lambda }(x)=(\lambda ^a/\Gamma (a)) x^{a-1} e^{-\lambda x} \mathbb {1}_{(0, +\infty )}(x), \) where \(\Gamma (a)\) is the Gamma function. We set \(\psi (a)= \Gamma '(a)/\Gamma (a)\). The log-likelihood \(\ell _N(\vartheta )\) of the N-sample \((\Phi _i, \Gamma _i), i=1, \ldots ,N\), has score function \({{\mathcal {S}}}_N(\vartheta )= \left( \frac{\partial }{\partial \lambda } \ell _N(\vartheta ) \;\; \frac{\partial }{\partial a} \ell _N(\vartheta )\;\;\frac{\partial }{\partial \mu } \ell _N(\vartheta ) \;\; \frac{\partial }{\partial \omega ^2} \ell _N(\vartheta ) \right) '\) given by

$$\begin{aligned} \frac{\partial }{\partial \lambda } \ell _N(\vartheta )= & {} \sum _{i=1}^N \left( \frac{a}{\lambda } -\Gamma _i\right) , \quad \frac{\partial }{\partial a} \ell _N(\vartheta ) = \sum _{i=1}^N \left( -\psi (a)+ \log {\lambda } +\log {\Gamma _i}\right) , \\ \frac{\partial }{\partial \mu } \ell _N(\vartheta )= & {} \omega ^{-2}\sum _{i=1}^N \Gamma _i (\Phi _i -\mu ),\quad \frac{\partial }{\partial \omega ^2} \ell _N(\vartheta ) = \frac{1}{2 \omega ^4}\sum _{i=1}^N \left( \Gamma _i (\Phi _i -\mu )^2 -\omega ^2\right) . \end{aligned}$$

By standard properties, we have, under \({{\mathbb {P}}}_{\vartheta }\), \(N^{-1/2}{{\mathcal {S}}}_N(\vartheta ) \rightarrow _{{{\mathcal {D}}}}{{\mathcal {N}}}_4(0, {{\mathcal {J}}}_0(\vartheta )),\) where

$$\begin{aligned} {{\mathcal {J}}}_0(\vartheta )= & {} \left( \begin{array}{c|c} I_0(\lambda ,a) &{}\quad \mathbf{0}\\ \hline \mathbf{0} &{}\quad J_0(\lambda ,a,\mu , \omega ^2) \end{array}\right) ,\nonumber \\ I_0(\lambda ,a)= & {} \left( \begin{array}{ll} \frac{a}{\lambda ^2} &{}\quad -\frac{1}{\lambda }\\ -\frac{1}{\lambda } &{}\quad \psi '(a) \end{array}\right) , \quad J_0(\lambda ,a,\mu , \omega ^2)=\left( \begin{array}{ll} \frac{a}{\lambda \omega ^2} &{}\quad 0\\ 0 &{}\quad \frac{1}{2\omega ^4} \end{array}\right) . \end{aligned}$$
(49)

Using properties of the digamma function (\(a\psi '(a)-1\ne 0\)), \(I_0(\lambda ,a)\) is invertible for all \((a,\lambda )\in (0,+\infty )^2\). The maximum likelihood estimator based on the observation of \((\Phi _i,\Gamma _i, i=1, \ldots , N)\), denoted \(\vartheta _N=\vartheta _N(\Phi _i,\Gamma _i, i=1, \ldots , N)\), is consistent and satisfies \(\sqrt{N}(\vartheta _N-\vartheta )\rightarrow _{{{\mathcal {D}}}}{{\mathcal {N}}}_4(0, {{\mathcal {J}}}_0^{-1}(\vartheta ))\) under \({{\mathbb {P}}}_{\vartheta }\) as N tends to infinity.
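For illustration, the score equations above can be solved explicitly in \((\mu , \omega ^2)\) and reduce to a one-dimensional equation in a; the following sketch (with illustrative parameter values, d = 1) computes the MLE from a simulated sample of \((\Phi _i, \Gamma _i)\).

```python
import numpy as np
from scipy.special import digamma
from scipy.optimize import brentq

def mle_direct(Phi, Gamma):
    """Sketch of the MLE of (lambda, a, mu, omega^2) from a directly observed sample
    (Phi_i, Gamma_i), i = 1, ..., N, obtained by zeroing the score displayed above (d = 1)."""
    m_bar, t_bar = np.mean(Gamma), np.mean(np.log(Gamma))
    # Gamma part of the score: a / lambda = mean(Gamma) and psi(a) - log(lambda) = mean(log Gamma).
    # Eliminating lambda: solve psi(a) - log(a) = t_bar - log(m_bar) (< 0 by Jensen), then lambda = a / m_bar.
    f = lambda a: digamma(a) - np.log(a) - (t_bar - np.log(m_bar))
    a_hat = brentq(f, 1e-4, 1e6)
    lam_hat = a_hat / m_bar
    # Gaussian part of the score: mu = sum(Gamma_i Phi_i) / sum(Gamma_i), omega^2 = mean(Gamma_i (Phi_i - mu)^2).
    mu_hat = np.sum(Gamma * Phi) / np.sum(Gamma)
    omega2_hat = np.mean(Gamma * (Phi - mu_hat) ** 2)
    return lam_hat, a_hat, mu_hat, omega2_hat

# Illustrative check with (lambda, a, mu, omega^2) = (2, 8, 0.5, 0.5) and N = 10^4 individuals
rng = np.random.default_rng(3)
Gam = rng.gamma(shape=8.0, scale=1.0 / 2.0, size=10**4)
Phi = 0.5 + np.sqrt(0.5 / Gam) * rng.standard_normal(10**4)
print(mle_direct(Phi, Gam))   # should be close to (2, 8, 0.5, 0.5)
```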

In the simulations presented in Sect. 4, we took \(a=8\) and observed that the estimates of a are biased and have a large standard deviation. This can be seen from \(I_0^{-1}(\lambda ,a)\):

$$\begin{aligned} I_0^{-1}(\lambda ,a)=\frac{1}{a\psi '(a)-1}\left( \begin{array}{ll} \lambda ^2 \psi '(a) &{}\quad \lambda \psi '(a)\\ \lambda \psi '(a) &{}\quad a \end{array}\right) . \end{aligned}$$
(50)

If a is large, \(a/ (a\psi '(a)-1)=O(a^2)\).

However, natural parameters for Gamma distributions are \(m=a/\lambda \) and \(t= \psi (a)-\log {\lambda }\), with unbiased estimators \(\hat{m}= N^{-1}\sum _{i=1}^N \Gamma _i\), \(\hat{t}= N^{-1} \sum _{i=1}^N \log {\Gamma _i}\), such that the vector \(\sqrt{N}(\hat{m}-m, \hat{t} -t)\) is asymptotically Gaussian with limiting covariance matrix

$$\begin{aligned} \left( \begin{array}{ll} \frac{a}{\lambda ^2} &{}\quad \frac{1}{\lambda }\\ \frac{1}{\lambda } &{}\quad \psi '(a) \end{array}\right) . \end{aligned}$$
(51)

The asymptotic variance of \(\hat{t}\) is \(\psi '(a)=O(a^{-1})\) and both parameters (m, t) are well estimated.
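For instance, with \(a=8\) and an illustrative sample size \(N=50\), the asymptotic standard deviations implied by (50) and (51) can be compared directly:

```python
import numpy as np
from scipy.special import polygamma

a, N = 8.0, 50
psi1 = polygamma(1, a)                       # psi'(a)
sd_a_hat = np.sqrt(a / (a * psi1 - 1) / N)   # asymptotic sd of the MLE of a: (a, a) entry of I_0^{-1} / N, see (50)
sd_t_hat = np.sqrt(psi1 / N)                 # asymptotic sd of t-hat = mean(log Gamma_i), see (51)
print(sd_a_hat, sd_t_hat)                    # roughly 1.6 versus 0.05: a is poorly estimated, t is well estimated
```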

Cite this article

Delattre, M., Genon-Catalot, V. & Larédo, C. Approximate maximum likelihood estimation for stochastic differential equations with random effects in the drift and the diffusion. Metrika 81, 953–983 (2018). https://doi.org/10.1007/s00184-018-0666-z
