Bulletin of Mathematical Biology, Volume 81, Issue 2, pp 568–597

Tropical Principal Component Analysis and Its Application to Phylogenetics

  • Ruriko Yoshida
  • Leon Zhang
  • Xu Zhang
Special Issue: Algebraic Methods in Phylogenetics

Abstract

Principal component analysis is a widely used method for the dimensionality reduction of a given data set in a high-dimensional Euclidean space. Here we define and analyze two analogues of principal component analysis in the setting of tropical geometry. In one approach, we study the Stiefel tropical linear space of fixed dimension closest to the data points in the tropical projective torus; in the other approach, we consider the tropical polytope with a fixed number of vertices closest to the data points. We then give approximative algorithms for both approaches and apply them to phylogenetics, testing the methods on simulated phylogenetic data and on an empirical dataset of Apicomplexa genomes.
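The abstract does not state the distance or the projection it uses, so the following is only a minimal sketch of what "closest to the data points" could mean for the tropical polytope approach. It assumes the max-plus convention, the tropical metric d_tr(u, v) = max_i(u_i − v_i) − min_i(u_i − v_i) on the tropical projective torus, and the standard projection of a point onto the tropical convex hull of a set of vertices. The function names are hypothetical, and the code only evaluates a sum-of-distances objective for a given set of vertices; it is not the authors' algorithm for choosing those vertices.

```python
from typing import List, Sequence

Vector = Sequence[float]


def tropical_distance(u: Vector, v: Vector) -> float:
    """Tropical metric (assumed): max_i(u_i - v_i) - min_i(u_i - v_i).

    It is unchanged when a constant is added to every coordinate of u or v,
    so it is well defined on the tropical projective torus.
    """
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


def project_to_tropical_polytope(x: Vector, vertices: Sequence[Vector]) -> List[float]:
    """Project x onto the max-plus tropical convex hull of `vertices`.

    For each vertex v_k, lambda_k = min_i(x_i - v_k[i]) is the largest shift
    that keeps lambda_k + v_k coordinatewise below x. The projection is the
    coordinatewise maximum of the shifted vertices.
    """
    lambdas = [min(xi - vi for xi, vi in zip(x, v)) for v in vertices]
    return [
        max(lam + v[i] for lam, v in zip(lambdas, vertices))
        for i in range(len(x))
    ]


def sum_of_distances(data: Sequence[Vector], vertices: Sequence[Vector]) -> float:
    """Objective (assumed): total tropical distance from data to projections."""
    return sum(
        tropical_distance(x, project_to_tropical_polytope(x, vertices))
        for x in data
    )


if __name__ == "__main__":
    # Hypothetical toy data: two vertices and three points in dimension 3.
    verts = [[0, 0, 0], [0, 2, 1]]
    points = [[0, 1, 3], [0, 2, 1], [5, 5, 5]]
    for p in points:
        print(p, "->", project_to_tropical_polytope(p, verts))
    print("objective:", sum_of_distances(points, verts))
```

In this toy example, the points [0, 2, 1] and [5, 5, 5] already lie in the tropical polytope (the second is a constant shift of the first vertex), so only [0, 1, 3] contributes to the objective.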

Keywords

Dimensionality reduction, Phylogenomics, Tropical geometry

Notes

Acknowledgements

R. Y. was supported by Research Initiation Proposals from the Naval Postgraduate School and NSF Division of Mathematical Sciences 1622369. L. Z. was supported by an NSF Graduate Research Fellowship. X. Z. was supported by travel funding from the Department of Statistics at the University of Kentucky. The authors thank Bernd Sturmfels (UC Berkeley and MPI Leipzig) for many helpful conversations. The authors also thank Daniel Howe (University of Kentucky) for his input on Apicomplexa tree topologies.

Copyright information

© 2018. This is a U.S. government work and its text is not subject to copyright protection in the United States; however, its text may be subject to foreign copyright protection.

Authors and Affiliations

  1. Naval Postgraduate School, Monterey, USA
  2. University of California, Berkeley, Berkeley, USA
  3. University of Kentucky, Lexington, USA
