A new multimodal and asymmetric bivariate circular distribution

Environmental and Ecological Statistics

Abstract

Multimodal and asymmetric bivariate circular data arise in several disciplines, and fitting an appropriate distribution plays an important role in the analysis of such data. In this paper, we propose a new bivariate circular distribution that can model asymmetric and multimodal bivariate circular data simultaneously. The proposed density covers unimodality as well as multimodality, and symmetry as well as asymmetry, of bivariate circular data. A number of properties of the proposed density are presented. A Bayesian approach with an MCMC scheme is employed for statistical inference. Three real datasets and a simulation study illustrate the performance of the proposed model in comparison with alternative models such as the finite mixture Cosine model.

References

  • Abe T, Pewsey A (2011) Sine-skewed circular distributions. Stat Pap 52(3):683–707

  • Amos DE (1974) Computation of modified Bessel functions and their ratios. Math Comput 28(125):239–251

  • Arnold BC, Strauss DJ (1991) Bivariate distributions with conditionals in prescribed exponential families. J R Stat Soc Ser B Methodol 53:365–375

  • Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE (2000) The protein data bank. Nucl Acids Res 28:235–242

  • Best D, Fisher NI (1979) Efficient simulation of the von Mises distribution. Appl Stat 28:152–157

  • Cox DR (1975) Contribution to discussion of Mardia (1975a). J R Stat Soc Ser B Methodol 37:380–381

  • Dahl DB, Bohannan Z, Mo Q, Vannucci M, Tsai J (2008) Assessing side-chain perturbations of the protein backbone: a knowledge-based classification of residue Ramachandran space. J Mol Biol 378(3):749–758

  • De Finetti B (1972) Probability, induction, and statistics. Wiley, New York

  • Fernández-Durán JJ, Gregorio-Domínguez MM (2014) Modeling angles in proteins and circular genomes using multivariate angular distributions based on multiple nonnegative trigonometric sums. Stat Appl Genet Mol Biol 13(1):1–18

  • Ferreira JT, Juárez MA, Steel MF (2008) Directional log-spline distributions. Bayesian Anal 3(2):297–316

  • Gatto R, Jammalamadaka SR (2007) The generalized von Mises distribution. Stat Methodol 4(3):341–353

  • Geweke J, Tanizaki H (2001) Bayesian estimation of state-space models using the Metropolis–Hastings algorithm within Gibbs sampling. Comput Stat Data Anal 37(2):151–170

  • Green PJ, Mardia KV (2006) Bayesian alignment using hierarchical models, with applications in protein bioinformatics. Biometrika 93(2):235–254

  • Johnson RA, Wehrly T (1977) Measures and models for angular correlation and angular-linear correlation. J R Stat Soc Ser B Methodol 39(2):222–229

  • Jones MC, Pewsey A, Kato S (2015) On a class of circulas: copulas for circular distributions. Ann Inst Stat Math 67(5):843–862

  • Kato S (2009) A distribution for a pair of unit vectors generated by Brownian motion. Bernoulli 15(3):898–921

  • Kato S, Pewsey A (2015) A Möbius transformation-induced distribution on the torus. Biometrika 102(2):359–370

  • Kim S, SenGupta A (2013) A three-parameter generalized von Mises distribution. Stat Pap 54(3):685–693

  • Kim S, SenGupta A, Arnold BC (2016) A multivariate circular distribution with applications to the protein structure prediction problem. J Multivar Anal 143:374–382

  • Lennox KP, Dahl DB, Vannucci M, Tsai JW (2009) Density estimation for protein conformation angles using a bivariate von Mises distribution and Bayesian nonparametrics. J Am Stat Assoc 104(486):586–596

  • Mardia KV (1975a) Statistics of directional data (with discussion). J R Stat Soc Ser B Methodol 37:349–393

  • Mardia KV (1975b) Characterizations of directional distributions. In: Patil GP, Kotz S, Ord JK (eds) Statistical distributions in scientific work, vol 3. Reidel, Dordrecht, pp 365–386

  • Mardia KV (2013) Statistical approaches to three key challenges in protein structural bioinformatics. J R Stat Soc Ser C Appl Stat 62(3):487–514

  • Mardia KV, Taylor CC, Subramaniam GK (2007) Protein bioinformatics and mixtures of bivariate von Mises distributions for angular data. Biometrics 63(2):505–512

  • Mardia KV, Hughes G, Taylor CC, Singh H (2008) A multivariate von Mises distribution with applications to bioinformatics. Can J Stat 36(1):99–109

  • Rivest LP (1988) A distribution for dependent unit vectors. Commun Stat Theory Methods 17(2):461–483

  • Robert CP, Casella G (2004) Monte Carlo statistical methods. Springer texts in statistics. Springer, Berlin

  • SenGupta A (2004) On the constructions of probability distributions for directional data. Bull Calcutta Math Soc 96:139–154

  • Shieh GS, Johnson RA (2005) Inferences based on a bivariate distribution with von Mises marginal. Ann Inst Stat Math 57(4):789–802

  • Shieh GS, Zheng S, Johnson RA, Chang Y-F, Shimizu K, Wang C-C, Tang S-L (2011) Modeling and comparing the organization of circular genomes. Bioinformatics 27:912–918

  • Singh H, Hnizdo V, Demchuk E (2002) Probabilistic model for two dependent circular variables. Biometrika 89(3):719–723

  • Smith AFM, Roberts GO (1993) Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods. J R Stat Soc Ser B Methodol 55:3–25

  • Thompson JW (1975) Contribution to discussion of paper by K. V. Mardia. J R Stat Soc Ser B Methodol 37:379

  • Umbach D, Jammalamadaka SR (2009) Building asymmetry into circular distributions. Stat Probab Lett 79(5):659–663

  • Wehrly TE, Johnson RA (1980) Bivariate models for dependence of angular observations and a related Markov process. Biometrika 67(1):255–256

  • Yfantis EA, Borgman LE (1982) An extension of the von Mises distribution. Commun Stat Theory Methods 11:1695–1706

Acknowledgements

The authors gratefully acknowledge the Editor-in-Chief and the reviewers for their valuable comments. The first author is grateful to the Scientific and Technological Research Council of Turkey (TUBITAK) for its support.

Author information

Corresponding author

Correspondence to Fatemeh Hassanzadeh.

Additional information

Communicated by Pierre Dutilleul.

Appendix

Proof of Proposition 2.2

The proof follows immediately, since

$$\begin{aligned} C( \delta ,\kappa _{1},\kappa _{2},\kappa _{3},\kappa _{4}) =C\left( \frac{\pi }{4},\kappa _{1},\kappa _{2}\right) C( \delta ,\kappa _{3},\kappa _{4}) \end{aligned}$$

where \(C\left( \frac{\pi }{4},\kappa _{1},\kappa _{2}\right) \) and \(C(\delta ,\kappa _{3},\kappa _{4}) \) are the normalizing constants of the AGvM and GvM densities, respectively. Gatto and Jammalamadaka (2007) showed that

$$\begin{aligned} C( \delta ,\kappa _{3},\kappa _{4}) =I_{0}(\kappa _{3}) I_{0}(\kappa _{4}) +2\sum _{l=1}^{\infty }I_{2l}(\kappa _{3}) I_{l}(\kappa _{4}) \cos (2l\delta ) . \end{aligned}$$

Also \(C\left( \frac{\pi }{4},\kappa _{1},\kappa _{2}\right) \) can be obtained by choosing \(\delta =\frac{\pi }{4}\) in Eq. (4).
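The series for the GvM constant can be checked numerically. The sketch below is illustrative only and rests on two assumptions that are not stated in this appendix: the constant is normalized as \((2\pi )^{-1}\int _{0}^{2\pi }\exp \{\kappa _{3}\cos \theta +\kappa _{4}\cos 2(\theta +\delta )\}\,d\theta \), and the sum carries the factor 2 that the Jacobi–Anger expansion produces under that normalization. The function names are hypothetical, and the Bessel functions are computed from their integral representation so that only the standard library is needed.

```python
import math


def periodic_mean(f, n=2000):
    """Mean of a 2*pi-periodic function over one period (trapezoid rule)."""
    h = 2.0 * math.pi / n
    return sum(f(k * h) for k in range(n)) / n


def bessel_i(order, x):
    """Modified Bessel function I_order(x) via its integral representation."""
    return periodic_mean(lambda t: math.exp(x * math.cos(t)) * math.cos(order * t))


def gvm_constant_series(delta, k3, k4, terms=40):
    """Truncated Bessel series for the (assumed) GvM normalizing constant."""
    total = bessel_i(0, k3) * bessel_i(0, k4)
    for l in range(1, terms + 1):
        total += (2.0 * bessel_i(2 * l, k3) * bessel_i(l, k4)
                  * math.cos(2 * l * delta))
    return total


def gvm_constant_quadrature(delta, k3, k4):
    """Same constant by direct numerical integration of the GvM kernel."""
    return periodic_mean(
        lambda t: math.exp(k3 * math.cos(t) + k4 * math.cos(2.0 * (t + delta)))
    )


# Illustrative parameter values (not taken from the paper).
series_value = gvm_constant_series(0.3, 1.5, 0.8)
direct_value = gvm_constant_quadrature(0.3, 1.5, 0.8)
print(series_value, direct_value)
```

Under these assumptions the two values agree to numerical precision. Setting `delta = math.pi / 4` gives the AGvM case, since \(\kappa _{2}\sin 2\theta =\kappa _{2}\cos 2(\theta -\frac{\pi }{4})\) and the series depends on \(\delta \) only through \(\cos (2l\delta )\).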

Table 4 Critical points of MABvM when \(\theta _{2}=0\)
Table 5 Critical points of MABvM when \(\theta _{2}=\pi \)
Table 6 Critical points of MABvM when \(\theta _{2}=\arctan ( R_{3},R_{4}) \) under the conditions \(\kappa _{1}< 2|\kappa _{2}|\) and \(\kappa _{3}< 4\kappa _{4}\)
Table 7 Critical points of MABvM when \(\theta _{2}=\arctan (-R_{3},R_{4}) \) under the conditions \(\kappa _{1}< 2|\kappa _{2}|\) and \(\kappa _{3}< 4\kappa _{4}\)
Fig. 7 The posterior plots of \(\varphi \) in the simulation study with true value \(\varphi =0\) when \(n=30\). a History plot, b Gelman–Rubin diagnostic, c kernel density plot, d autocorrelation plot

Proof of Proposition 2.4

Without loss of generality, we derive the conditions on the number of modes of the MABvM density when \(\mu =\mu _{1}=\mu _{2}=0 \) and \(\varphi \ne 0\). To locate the critical points, we take the partial derivatives of \( \log ( f( \theta _{1},\theta _{2}) ) \) in Eq. (3) with respect to \(\theta _{1}\) and \(\theta _{2}\). Let \(g=\log ( f( \theta _{1},\theta _{2}) ) =\log C+\kappa _{1}\cos ( \theta _{1}-\varphi \theta _{2}) +\kappa _{2}\sin ( 2\theta _{1}-2\varphi \theta _{2}) +\kappa _{3}\cos ( \theta _{2}) +\kappa _{4}\cos ( 2\theta _{2}) \). Then

$$\begin{aligned} G_{\theta _{1}}=\frac{\partial g}{\partial \theta _{1}}=\kappa _{1}\sin (\varphi \theta _{2}-\theta _{1})+2\kappa _{2}\cos (2\varphi \theta _{2}-2\theta _{1}) \end{aligned}$$

and

$$\begin{aligned} G_{\theta _{2}}=\frac{\partial g}{\partial \theta _{2}}=-\varphi G_{\theta _{1}}+\kappa _{3}\sin (-\theta _{2})+2\kappa _{4}\sin (-2\theta _{2}) \end{aligned}$$

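There are sixteen critical points for \(G_{\theta _{1}}=G_{\theta _{2}}=0\). Let \(D_{\theta _{1},\theta _{2}}=G_{\theta _{1}, \theta _{1}}G_{\theta _{2}, \theta _{2}}-G_{\theta _{1}, \theta _{2}}^{2}\), where \(G_{\theta _{i},\theta _{j}}\) denotes the second partial derivative of \(g\) with respect to \(\theta _{i}\) and \(\theta _{j}\). Then, at a critical point \((\theta _{1},\theta _{2})\):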

  (i) if \(D_{\theta _{1}, \theta _{2}}>0\) and \(G_{\theta _{1},\theta _{1}}<0\) then \((\theta _{1},\theta _{2})\) is a mode.

  (ii) if \(D_{\theta _{1}, \theta _{2}}>0\) and \(G_{\theta _{1},\theta _{1}}>0\) then \((\theta _{1},\theta _{2})\) is an anti-mode.

  (iii) if \(D_{\theta _{1}, \theta _{2}}<0\) then \((\theta _{1},\theta _{2})\) is a saddle point.
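For reference, the second-order quantities used in (i)–(iii) can be written out explicitly. This is a sketch obtained by differentiating \(g\) directly. The shorthand \(u\), \(A\) and \(B\) is introduced here for illustration only and is not the paper's notation.

$$\begin{aligned} u&=\theta _{1}-\varphi \theta _{2},\quad A(u)=\kappa _{1}\cos u+\kappa _{2}\sin 2u,\quad B(\theta _{2})=\kappa _{3}\cos \theta _{2}+\kappa _{4}\cos 2\theta _{2},\\ G_{\theta _{1},\theta _{1}}&=A''(u)=-\kappa _{1}\cos u-4\kappa _{2}\sin 2u,\\ G_{\theta _{1},\theta _{2}}&=-\varphi A''(u),\\ G_{\theta _{2},\theta _{2}}&=\varphi ^{2}A''(u)+B''(\theta _{2}),\quad B''(\theta _{2})=-\kappa _{3}\cos \theta _{2}-4\kappa _{4}\cos 2\theta _{2},\\ D_{\theta _{1},\theta _{2}}&=A''(u)\left( \varphi ^{2}A''(u)+B''(\theta _{2})\right) -\varphi ^{2}A''(u)^{2}=A''(u)B''(\theta _{2}). \end{aligned}$$

Under this sketch, a critical point is a mode exactly when \(A''(u)<0\) and \(B''(\theta _{2})<0\), so the number of modes is the number of local maxima of \(A\) multiplied by the number of local maxima of \(B\).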

The number of modes is found after simplifying the underlying algebraic equations. The critical points are given in Tables 4, 5, 6 and 7, where \( R_{1}=(4\kappa _{2})^{-2}\left( -\kappa _{1}^{2}+16\kappa _{2}^{2}+\sqrt{\kappa _{1}^{4}+32\kappa _{1}^{2}\kappa _{2}^{2}}\right) \), \(R_{2}=(4\kappa _{2})^{-2}\left( -2\kappa _{1}^{2}+32\kappa _{2}^{2}-2\sqrt{\kappa _{1}^{4}+32\kappa _{1}^{2}\kappa _{2}^{2}}\right) \), \(R_{3}=\frac{\sqrt{-\kappa _{3}^{2}+16\kappa _{4}^{2}}}{4\kappa _{4} }\), and \(R_{4}=-\frac{\kappa _{3}}{4\kappa _{4}} \).

In these tables, \(\arctan (x,y) \) gives the arc tangent of \(\frac{y}{x}\). Table 4 illustrates that there are no restrictions on unimodality when \(\theta _{1}=\arctan \left( \frac{2\kappa _{2}}{\kappa _{1}}\left( \frac{1}{2} R_{1}-1\right) ,\sqrt{R_{1}}\right) \) and \(\theta _{2}=0\). Checking all of the critical points shows that if only one of the conditions \(\kappa _{1}< 2\vert \kappa _{2}\vert \) and \(\kappa _{3}< 4\kappa _{4}\) holds, then MABvM has two modes, whereas if both hold, then MABvM has four modes.
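The mode counts stated above can be reproduced numerically. The following is a minimal sketch, not the authors' procedure: it assumes \(\mu _{1}=\mu _{2}=0\), \(\kappa _{2}\ne 0\) and \(\kappa _{4}\ne 0\), solves \(G_{\theta _{1}}=G_{\theta _{2}}=0\) in closed form by treating \(u=\theta _{1}-\varphi \theta _{2}\) and \(\theta _{2}\) separately, and applies the second-derivative test. All function names and parameter values are hypothetical.

```python
import math
from collections import Counter


def gradient(t1, t2, phi, k1, k2, k3, k4):
    """First partial derivatives of g = log f (up to log C), mu1 = mu2 = 0."""
    u = t1 - phi * t2
    g1 = -k1 * math.sin(u) + 2.0 * k2 * math.cos(2.0 * u)
    g2 = -phi * g1 - k3 * math.sin(t2) - 2.0 * k4 * math.sin(2.0 * t2)
    return g1, g2


def critical_points(phi, k1, k2, k3, k4):
    """Closed-form critical points (assumes k2 != 0 and k4 != 0)."""
    # theta2 part: sin(t2) * (k3 + 4*k4*cos(t2)) = 0
    t2_roots = [0.0, math.pi]
    c = -k3 / (4.0 * k4)
    if abs(c) < 1.0:
        t2_roots += [math.acos(c), -math.acos(c)]
    # theta1 part: with s = sin(u), 4*k2*s^2 + k1*s - 2*k2 = 0
    disc = math.sqrt(k1 * k1 + 32.0 * k2 * k2)
    u_roots = []
    for s in ((-k1 + disc) / (8.0 * k2), (-k1 - disc) / (8.0 * k2)):
        if abs(s) < 1.0:
            u_roots += [math.asin(s), math.pi - math.asin(s)]
    # theta1 is left unreduced; reduce modulo 2*pi if a canonical range is needed.
    return [(u + phi * t2, t2) for t2 in t2_roots for u in u_roots]


def classify(t1, t2, phi, k1, k2, k3, k4):
    """Second-derivative test at a critical point."""
    u = t1 - phi * t2
    a2 = -k1 * math.cos(u) - 4.0 * k2 * math.sin(2.0 * u)
    b2 = -k3 * math.cos(t2) - 4.0 * k4 * math.cos(2.0 * t2)
    g11, g12, g22 = a2, -phi * a2, phi * phi * a2 + b2
    det = g11 * g22 - g12 * g12
    if det > 0 and g11 < 0:
        return "mode"
    if det > 0 and g11 > 0:
        return "anti-mode"
    if det < 0:
        return "saddle"
    return "degenerate"


def count_types(phi, k1, k2, k3, k4):
    """Tally modes, anti-modes and saddle points over all critical points."""
    return Counter(
        classify(t1, t2, phi, k1, k2, k3, k4)
        for t1, t2 in critical_points(phi, k1, k2, k3, k4)
    )


# Illustrative values only: both conditions hold, so four modes are expected.
print(count_types(0.5, 1.0, 1.0, 1.0, 1.0))
```

With the illustrative values \(\varphi =0.5\) and \(\kappa _{1}=\kappa _{2}=\kappa _{3}=\kappa _{4}=1\), the sketch finds sixteen critical points: four modes, four anti-modes and eight saddle points. Raising \(\kappa _{1}\) to 3 leaves two modes, and additionally raising \(\kappa _{3}\) to 5 leaves one.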

Posterior plots in simulation study when \(n=30\) and \(\varphi =0\)

Figure 7 shows the history, Gelman–Rubin diagnostic, kernel density, and autocorrelation plots from the simulation study. We run two chains from different starting points. The Gelman–Rubin diagnostic shows that the two chains behave alike, so the within-chain variance is close to the between-chain variance. The autocorrelation plot reveals low correlation between successive samples, and the history plot moves up and down around the mode of the distribution. Together, these diagnostics suggest that the chains have reached their stationary distribution.
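The two numerical diagnostics mentioned here can be sketched as follows. This is an illustration under stated assumptions rather than the software used in the study: it uses the basic form of the Gelman–Rubin potential scale reduction factor, treats the parameter as a linear quantity, and uses synthetic draws as stand-ins for posterior samples of \(\varphi \). The function names are hypothetical.

```python
import math
import random


def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # B/n: sample variance of the chain means (between-chain component).
    between_over_n = sum((mu - grand) ** 2 for mu in means) / (m - 1)
    # W: average of the within-chain sample variances.
    within = sum(
        sum((x - mu) ** 2 for x in c) / (n - 1) for c, mu in zip(chains, means)
    ) / m
    pooled = (n - 1) / n * within + between_over_n
    return math.sqrt(pooled / within)


def autocorrelation(chain, lag):
    """Sample autocorrelation of a single chain at the given lag."""
    n = len(chain)
    mu = sum(chain) / n
    var = sum((x - mu) ** 2 for x in chain)
    cov = sum((chain[i] - mu) * (chain[i + lag] - mu) for i in range(n - lag))
    return cov / var


# Synthetic stand-ins for two chains (NOT the paper's MCMC output).
rng = random.Random(0)
chain_a = [rng.gauss(0.0, 1.0) for _ in range(2000)]
chain_b = [rng.gauss(0.0, 1.0) for _ in range(2000)]

print(gelman_rubin([chain_a, chain_b]))   # near 1 when the chains agree
print(autocorrelation(chain_a, 1))        # near 0 for weakly dependent draws
```

A value of the factor close to 1 indicates that the within-chain and between-chain variances are similar. Chains centred at different values push it well above 1.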

3-D plots of protein data and the fitted models

The proposed models differ from the alternative models in how they describe bivariate circular data, and this difference is an advantage of our study. An additional figure illustrates it: the MABvM model seems to provide a better fit to the ruggedness of the histogram (i.e. the frequencies that should not be ignored). The 3-D histogram of the protein data and the kernel density plots of the competing models are given in Fig. 8.

Fig. 8 The 3-D histogram of protein data and kernel density plots of the fitted models. a Histogram plot, b MABvM model, c EMABvM model, d Shieh and Johnson model, e Kato model, f bivariate Cosine model, g bivariate Sine model, h mixture of bivariate Cosine model, i mixture of bivariate Sine model

Cite this article

Hassanzadeh, F., Kalaylioglu, Z. A new multimodal and asymmetric bivariate circular distribution. Environ Ecol Stat 25, 363–385 (2018). https://doi.org/10.1007/s10651-018-0409-3

  • DOI: https://doi.org/10.1007/s10651-018-0409-3
