Abstract
Multimodal and asymmetric bivariate circular data arise in several disciplines, and fitting an appropriate distribution plays an important role in the analysis of such data. In this paper, we propose a new bivariate circular distribution that can model asymmetric and multimodal bivariate circular data simultaneously. The proposed density covers unimodality as well as multimodality, and symmetry as well as asymmetry, of bivariate circular data. A number of properties of the proposed density are presented. A Bayesian approach with an MCMC scheme is employed for statistical inference. Three real datasets and a simulation study illustrate the performance of the proposed model in comparison with alternative models such as the finite mixture Cosine model.
References
Abe T, Pewsey A (2011) Sine-skewed circular distributions. Stat Pap 52(3):683–707
Amos DE (1974) Computation of modified Bessel functions and their ratios. Math Comput 28(125):239–251
Arnold BC, Strauss DJ (1991) Bivariate distributions with conditionals in prescribed exponential families. J R Stat Soc Ser B Methodol 53:365–375
Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE (2000) The protein data bank. Nucl Acids Res 28:235–242
Best D, Fisher NI (1979) Efficient simulation of the von Mises distribution. Appl Stat 28:152–157
Cox DR (1975) Contribution to discussion of Mardia (1975a). J R Stat Soc Ser B Methodol 37:380–381
Dahl DB, Bohannan Z, Mo Q, Vannucci M, Tsai J (2008) Assessing side-chain perturbations of the protein backbone: a knowledge-based classification of residue Ramachandran space. J Mol Biol 378(3):749–758
De Finetti B (1972) Probability, induction, and statistics. Wiley, New York
Fernández-Durán JJ, Gregorio-Domínguez MM (2014) Modeling angles in proteins and circular genomes using multivariate angular distributions based on multiple nonnegative trigonometric sums. Stat Appl Genet Mol Biol 13(1):1–18
Ferreira JT, Juárez MA, Steel MF (2008) Directional log-spline distributions. Bayesian Anal 3(2):297–316
Gatto R, Jammalamadaka SR (2007) The generalized von Mises distribution. Stat Methodol 4(3):341–353
Geweke J, Tanizaki H (2001) Bayesian estimation of state-space models using the Metropolis Hastings algorithm within Gibbs sampling. Comput Stat Data Anal 37(2):151–170
Green PJ, Mardia KV (2006) Bayesian alignment using hierarchical models, with applications in protein bioinformatics. Biometrika 93(2):235–254
Johnson RA, Wehrly T (1977) Measures and models for angular correlation and angular-linear correlation. J R Stat Soc Ser B Methodol 39(2):222–229
Jones MC, Pewsey A, Kato S (2015) On a class of circulas: copulas for circular distributions. Ann Inst Stat Math 67(5):843–862
Kato S (2009) A distribution for a pair of unit vectors generated by Brownian motion. Bernoulli 15(3):898–921
Kato S, Pewsey A (2015) A Möbius transformation-induced distribution on the torus. Biometrika 102(2):359–370
Kim S, SenGupta A (2013) A three-parameter generalized von Mises distribution. Stat Pap 54(3):685–693
Kim S, SenGupta A, Arnold BC (2016) A multivariate circular distribution with applications to the protein structure prediction problem. J Multivar Anal 143:374–382
Lennox KP, Dahl DB, Vannucci M, Tsai JW (2009) Density estimation for protein conformation angles using a bivariate von Mises distribution and Bayesian nonparametrics. J Am Stat Assoc 104(486):586–596
Mardia KV (1975a) Statistics of directional data (with discussion). J R Stat Soc Ser B Methodol 37:349–393
Mardia KV (1975b) Characterizations of directional distributions. In: Patil GP, Kotz S, Ord JK (eds) Statistical distributions in scientific work, vol 3. Reidel, Dordrecht, pp 365–386
Mardia KV (2013) Statistical approaches to three key challenges in protein structural bioinformatics. J R Stat Soc Ser C Appl Stat 62(3):487–514
Mardia KV, Taylor CC, Subramaniam GK (2007) Protein bioinformatics and mixtures of bivariate von Mises distributions for angular data. Biometrics 63(2):505–512
Mardia KV, Hughes G, Taylor CC, Singh H (2008) A multivariate von Mises distribution with applications to bioinformatics. Can J Stat 36(1):99–109
Rivest LP (1988) A distribution for dependent unit vectors. Commun Stat Theory Methods 17(2):461–483
Robert CP, Casella G (2004) Monte Carlo statistical methods. Springer texts in statistics. Springer, Berlin
SenGupta A (2004) On the constructions of probability distributions for directional data. Bull Calcutta Math Soc 96:139–154
Shieh GS, Johnson RA (2005) Inferences based on a bivariate distribution with von Mises marginal. Ann Inst Stat Math 57(4):789–802
Shieh GS, Zheng S, Johnson RA, Chang Y-F, Shimizu K, Wang C-C, Tang S-L (2011) Modeling and comparing the organization of circular genomes. Bioinformatics 27:912–918
Singh H, Hnizdo V, Demchuk E (2002) Probabilistic model for two dependent circular variables. Biometrika 89(3):719–723
Smith AFM, Roberts GO (1993) Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods. J R Stat Soc Ser B Methodol 55:3–25
Thompson JW (1975) Contribution to discussion of paper by K. V. Mardia. J R Stat Soc Ser B Methodol 37:379
Umbach D, Jammalamadaka SR (2009) Building asymmetry into circular distributions. Stat Probab Lett 79(5):659–663
Wehrly TE, Johnson RA (1980) Bivariate models for dependence of angular observations and a related Markov process. Biometrika 67(1):255–256
Yfantis EA, Borgman LE (1982) An extension of the von Mises distribution. Commun Stat Theory Methods 11:1695–1706
Acknowledgements
The authors gratefully acknowledge the Editor-in-Chief and the reviewers for their valuable comments. The first author is grateful to The Scientific and Technological Research Council of Turkey (TUBITAK) for its support.
Additional information
Communicated by Pierre Dutilleul.
Appendix
Proof of Proposition 2.2
The proof is easily obtained since
where \(C\left( \frac{\pi }{4},\kappa _{1},\kappa _{2}\right) \) and \(C(\delta ,\kappa _{3},\kappa _{4}) \) are normalizing constants of AGvM and GvM, respectively. Gatto and Jammalamadaka (2007) maintained
Also \(C\left( \frac{\pi }{4},\kappa _{1},\kappa _{2}\right) \) can be obtained by choosing \(\delta =\frac{\pi }{4}\) in Eq. (4).
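The following is a minimal numerical sanity check, not the authors' derivation. It assumes that the proof rests on the double integral of the unnormalized density splitting into a product of two one-dimensional integrals, one of AGvM type (shift \(\frac{\pi }{4}\), because \(\sin (2u)=\cos (2(u-\frac{\pi }{4}))\)) and one of GvM type. The kernel is the one written out in the proof of Proposition 2.4 with \(\mu _{1}=\mu _{2}=0\); the function names, the grid size and the parameter values are hypothetical choices made only for illustration.

```python
import numpy as np

def mabvm_kernel(t1, t2, k1, k2, k3, k4, phi):
    """Unnormalized MABvM density with mu1 = mu2 = 0 (kernel taken from the appendix)."""
    u = t1 - phi * t2
    return np.exp(k1 * np.cos(u) + k2 * np.sin(2 * u)
                  + k3 * np.cos(t2) + k4 * np.cos(2 * t2))

def torus_integral(k1, k2, k3, k4, phi, n=400):
    """Double integral of the kernel over [0, 2*pi)^2 using a periodic rectangle rule."""
    grid = 2 * np.pi * np.arange(n) / n
    t1, t2 = np.meshgrid(grid, grid, indexing="ij")
    return mabvm_kernel(t1, t2, k1, k2, k3, k4, phi).sum() * (2 * np.pi / n) ** 2

def circle_integral(a, b, shift, n=400):
    """Integral of exp(a*cos(t) + b*cos(2*(t - shift))) over [0, 2*pi)."""
    t = 2 * np.pi * np.arange(n) / n
    return np.exp(a * np.cos(t) + b * np.cos(2 * (t - shift))).sum() * (2 * np.pi / n)

# Illustrative parameter values (assumptions, not taken from the paper).
k1, k2, k3, k4, phi = 1.5, 0.8, 1.0, 0.6, 1.0

joint = torus_integral(k1, k2, k3, k4, phi)
# First factor: theta1 part with shift pi/4; second factor: theta2 part with no shift.
product = circle_integral(k1, k2, np.pi / 4) * circle_integral(k3, k4, 0.0)
print(joint, product)
```

The two printed numbers should agree to within floating-point error, and their reciprocal is the normalizing constant of the kernel for these parameter values. The check says nothing about the closed-form series of Gatto and Jammalamadaka (2007); it only confirms the product structure numerically.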
Proof of Proposition 2.4
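Without loss of generality, we derive the conditions for the number of modes of the MABvM density when \(\mu =\mu _{1}=\mu _{2}=0 \) and \(\varphi \ne 0\). To locate the critical points, we take partial derivatives of \( \log ( f( \theta _{1},\theta _{2}) ) \) in Eq. (3) with respect to \(\theta _{1}\) and \(\theta _{2}\). Let \(g=\log ( f( \theta _{1},\theta _{2}) ) =\log C+\kappa _{1}\cos ( \theta _{1}-\varphi \theta _{2}) +\kappa _{2}\sin ( 2\theta _{1}-2\varphi \theta _{2}) +\kappa _{3}\cos ( \theta _{2}) +\kappa _{4}\cos ( 2\theta _{2}) \). Then
\[ G_{\theta _{1}}=\frac{\partial g}{\partial \theta _{1}}=-\kappa _{1}\sin ( \theta _{1}-\varphi \theta _{2}) +2\kappa _{2}\cos ( 2\theta _{1}-2\varphi \theta _{2}) \]
and
\[ G_{\theta _{2}}=\frac{\partial g}{\partial \theta _{2}}=\varphi \kappa _{1}\sin ( \theta _{1}-\varphi \theta _{2}) -2\varphi \kappa _{2}\cos ( 2\theta _{1}-2\varphi \theta _{2}) -\kappa _{3}\sin ( \theta _{2}) -2\kappa _{4}\sin ( 2\theta _{2}) . \]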
There are sixteen critical points for \(G_{\theta _{1}}=G_{\theta _{2}}=0\). Let \(D_{\theta _{1},\theta _{2}}=G_{\theta _{1}, \theta _{1}}G_{\theta _{2}, \theta _{2}}-G_{\theta _{1}, \theta _{2}}^{2}\),
(i) if \(D_{\theta _{1}, \theta _{2}}>0\) and \(G_{\theta _{1},\theta _{1}}<0\) then \((\theta _{1},\theta _{2})\) is a mode.
(ii) if \(D_{\theta _{1}, \theta _{2}}>0\) and \(G_{\theta _{1},\theta _{1}}>0\) then \((\theta _{1},\theta _{2})\) is an anti-mode.
(iii) if \(D_{\theta _{1}, \theta _{2}}<0\) then \((\theta _{1},\theta _{2})\) is a saddle point.
We can find the number of modes after simplification of underlying algebraic equations. The critical points are given in Tables 4, 5, 6 and 7 where \( R_{1}=(4\kappa _{2})^{-2}\left( -\kappa _{1}^{2}+16\kappa _{2}^{2}+\sqrt{\kappa _{1}^{4}+32\kappa _{1}^{2}\kappa _{2}^{2}}\right) \), \(R_{2}=(4\kappa _{2})^{-2}\left( -2\kappa _{1}^{2}+32\kappa _{2}^{2}-2\sqrt{\kappa _{1}^{4}+32\kappa _{1}^{2}\kappa _{2}^{2}}\right) \), \(R_{3}=\frac{\sqrt{-\kappa _{3}^{2}+16\kappa _{4}^{2}}}{4\kappa _{4} }\), and \(R_{4}=-\frac{\kappa _{3}}{4\kappa _{4}} \).
In these tables, \(\arctan (x,y) \) gives the arc tangent of \(\frac{y}{x}\). Table 4 shows that there are no restrictions on unimodality when \(\theta _{1}=\arctan \left( \frac{2\kappa _{2}}{\kappa _{1}}\left( \frac{1}{2} R_{1}-1\right) ,\sqrt{R_{1}}\right) \) and \(\theta _{2}=0\). Checking all of the critical points shows that if exactly one of the conditions \(\kappa _{1}< 2\vert \kappa _{2}\vert \) and \(\kappa _{3}< 4\kappa _{4}\) holds, then MABvM has two modes, and if both conditions hold, then MABvM has four modes.
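The mode-counting argument can be illustrated with a short Python sketch. It does not reproduce Tables 4, 5, 6 and 7. Instead it assumes the critical points can be obtained directly from the gradient of \(g\): with \(u=\theta _{1}-\varphi \theta _{2}\) and \(s=\sin u\), the equation \(G_{\theta _{1}}=0\) becomes \(4\kappa _{2}s^{2}+\kappa _{1}s-2\kappa _{2}=0\), and \(G_{\theta _{2}}=0\) then reduces to \(\sin \theta _{2}\,(\kappa _{3}+4\kappa _{4}\cos \theta _{2})=0\). The function names are hypothetical, \(\kappa _{2}\) and \(\kappa _{4}\) are assumed nonzero, and degenerate boundary cases (such as \(\kappa _{1}=2\vert \kappa _{2}\vert \)) are skipped.

```python
import math

def classify(d, g11):
    """Second-derivative test, cases (i)-(iii) of the proof."""
    if d > 0 and g11 < 0:
        return "mode"
    if d > 0 and g11 > 0:
        return "anti-mode"
    if d < 0:
        return "saddle"
    return "inconclusive"

def critical_points(k1, k2, k3, k4, phi):
    """List (theta1, theta2, label) for the critical points of g with mu1 = mu2 = 0.

    Assumes k2 != 0 and k4 != 0; roots with |sin u| = 1 or |cos theta2| = 1
    (degenerate boundary cases) are not reported separately.
    """
    # G_theta1 = 0  <=>  4*k2*s^2 + k1*s - 2*k2 = 0 with s = sin(u), u = theta1 - phi*theta2.
    disc = math.sqrt(k1 ** 2 + 32 * k2 ** 2)
    us = []
    for s in ((-k1 + disc) / (8 * k2), (-k1 - disc) / (8 * k2)):
        if abs(s) < 1:
            us += [math.asin(s), math.pi - math.asin(s)]
    # Given G_theta1 = 0, G_theta2 = 0  <=>  sin(t2) * (k3 + 4*k4*cos(t2)) = 0.
    t2s = [0.0, math.pi]
    c = -k3 / (4 * k4)
    if abs(c) < 1:
        t2s += [math.acos(c), -math.acos(c)]
    points = []
    for u in us:
        a2 = -k1 * math.cos(u) - 4 * k2 * math.sin(2 * u)  # second derivative of the u-part
        for t2 in t2s:
            b2 = -k3 * math.cos(t2) - 4 * k4 * math.cos(2 * t2)  # second derivative of the theta2-part
            g11, g12, g22 = a2, -phi * a2, phi ** 2 * a2 + b2
            d = g11 * g22 - g12 ** 2
            theta1 = (u + phi * t2) % (2 * math.pi)
            points.append((theta1, t2 % (2 * math.pi), classify(d, g11)))
    return points

def count_modes(k1, k2, k3, k4, phi):
    """Number of critical points classified as modes."""
    return sum(label == "mode" for _, _, label in critical_points(k1, k2, k3, k4, phi))

# Illustrative parameter values (assumptions, not taken from the paper).
for params in [(1, 1, 1, 1, 1), (3, 1, 1, 1, 1), (1, 1, 5, 1, 1), (3, 1, 5, 1, 1)]:
    print(params, len(critical_points(*params)), count_modes(*params))
```

For these illustrative values the counts follow the stated pattern: four modes when both \(\kappa _{1}< 2\vert \kappa _{2}\vert \) and \(\kappa _{3}< 4\kappa _{4}\) hold, two modes when exactly one holds, and a single mode when neither holds. The full set of sixteen critical points appears only in the first case, where every root is admissible.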
Posterior plots in simulation study when \(n=30\) and \(\varphi =0\)
Figure 7 shows the history, Gelman–Rubin diagnostic, kernel density, and autocorrelation plots from the simulation study. We run two chains from different starting points. The Gelman–Rubin diagnostic shows that the two chains behave alike, that is, the variance within the chains is about the same as the variance between the chains. The autocorrelation plot reveals low correlation between successive samples, and the history plot moves up and down around the mode of the distribution. Together, these diagnostics indicate that the chains have reached a stationary distribution.
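For readers unfamiliar with the diagnostic, the sketch below computes the classic Gelman–Rubin potential scale reduction factor from several chains. The source does not say which software or which variant of the diagnostic produced Fig. 7, so this is an assumption about the general technique and not a reconstruction of the authors' computation. The function name is hypothetical and the chains are synthetic draws used only to show the behaviour.

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor for an (m, n) array: m chains of n draws each."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    within = chains.var(axis=1, ddof=1).mean()        # W: mean within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)     # B: scaled variance of the chain means
    pooled = (n - 1) / n * within + between / n       # pooled estimate of the posterior variance
    return np.sqrt(pooled / within)

rng = np.random.default_rng(0)
# Synthetic example: two chains sampling the same distribution ...
mixed = rng.normal(0.0, 1.0, size=(2, 2000))
# ... and two chains stuck around different values.
stuck = mixed + np.array([[0.0], [3.0]])

r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
print(r_mixed, r_stuck)
```

A value close to 1 means the within-chain and between-chain variances agree, which is the situation described for Fig. 7. A value well above 1, as for the second pair of synthetic chains, signals that the chains have not mixed.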
3-D plots of protein data and the fitted models
An advantage of our study is that the proposed model differs from the alternative models, since we have defined a different model to analyse bivariate circular data. The 3-D histogram of the protein data and the kernel density plots of the competing models are given in Fig. 8. The MABvM model seems to provide a better fit to the ruggedness of the histogram (i.e. the frequencies that should not be ignored).
Cite this article
Hassanzadeh, F., Kalaylioglu, Z. A new multimodal and asymmetric bivariate circular distribution. Environ Ecol Stat 25, 363–385 (2018). https://doi.org/10.1007/s10651-018-0409-3
DOI: https://doi.org/10.1007/s10651-018-0409-3