Principal component analysis is one of the most popular unsupervised learning methods for reducing the dimension of a given data set in a high-dimensional Euclidean space. However, computing principal components on a space of phylogenetic trees with fixed leaf labels is a challenging task, since a space of phylogenetic trees is not Euclidean. In 2017, Yoshida et al. defined a notion of tropical principal component analysis and applied it to a space of phylogenetic trees. The challenge they encountered, however, was computational time.
In this paper we estimate tropical principal components in a space of phylogenetic trees using the Metropolis-Hastings algorithm. We have implemented an R software package to efficiently estimate tropical principal components, and we have applied it to an African coelacanth genome data set.
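To make the idea concrete, the following is a minimal Python sketch of a Metropolis-Hastings search for the vertices of a tropical polytope that fits a set of points. It is not the authors' R package. It assumes that each tree is encoded as a numeric vector (for example, its pairwise leaf distances), that the objective is the sum of tropical distances from each point to its max-plus projection onto the polytope, and that proposals replace one vertex with a randomly chosen data point. All function names and the `temperature` parameter are hypothetical choices made for illustration.

```python
import math
import random


def tropical_distance(u, v):
    """Tropical metric: max_i(u_i - v_i) - min_i(u_i - v_i)."""
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


def project(x, vertices):
    """Max-plus projection of x onto the tropical polytope spanned by vertices.

    lambda_i = min_j (x_j - v_ij), and the projection has coordinates
    max_i (lambda_i + v_ij).
    """
    lambdas = [min(xj - vj for xj, vj in zip(x, v)) for v in vertices]
    return [max(lam + v[j] for lam, v in zip(lambdas, vertices))
            for j in range(len(x))]


def total_distance(data, vertices):
    """Sum of tropical distances from each point to its projection."""
    return sum(tropical_distance(x, project(x, vertices)) for x in data)


def mh_tropical_pca(data, n_vertices=3, n_steps=2000, temperature=1.0, seed=0):
    """Metropolis-Hastings over vertex sets drawn from the data (assumed scheme).

    The target is proportional to exp(-cost / temperature). The proposal
    (swap one vertex for a uniformly chosen data point) is symmetric, so the
    acceptance ratio reduces to exp(-(new_cost - old_cost) / temperature).
    """
    rng = random.Random(seed)
    current = rng.sample(data, n_vertices)
    current_cost = total_distance(data, current)
    best, best_cost = list(current), current_cost
    for _ in range(n_steps):
        proposal = list(current)
        proposal[rng.randrange(n_vertices)] = rng.choice(data)
        cost = total_distance(data, proposal)
        accept = cost <= current_cost or \
            rng.random() < math.exp(-(cost - current_cost) / temperature)
        if accept:
            current, current_cost = proposal, cost
            if cost < best_cost:
                best, best_cost = list(proposal), cost
    return best, best_cost
```

Calling `mh_tropical_pca(data, n_vertices=3)` on a list of equal-length vectors returns the best vertex set visited and its cost. Restricting proposals to data points keeps the search space finite; this is a simplifying assumption of the sketch rather than a requirement of the method.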
Keywords: Phylogenetic trees, Polytopes, Tropical geometry
R. Y. is supported by the NSF Division of Mathematical Sciences, CDS&E-MSS program, proposal number 1622369.