Towards a Universal Genomic Positioning System: Phylogenetics and Species IDentification

Garzon, Max H.; Mainali, Sambriddhi

doi:10.1007/978-3-319-56154-7_42

Conference paper
First Online: 01 April 2017

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 10209)

Abstract

Technology for gathering biomic data now far outpaces the tools available to extract useful information and knowledge from it, a challenging predicament given current demands such as personalized medicine. We propose a new family of data structures to represent and process omics data in a way that is more firmly anchored in biological reality, together with algorithms that are more consistent with it, so that DNA itself can be used to process the data, extract useful knowledge, and organize and store it as needed. These structures enable much more efficient processing of genomic and proteomic data and can serve as the foundation of a truly universal Genomic Positioning System (GenIS). The power of this approach is illustrated by applications to two important problems in biology: a new universal set of biomarkers, and methods for phylogenetic analysis and for species identification and classification. We show that certain metrics on these representations can be used to obtain ab initio, from genomic data alone (possibly including full genomes) and in a matter of minutes or hours, well-established and accepted phylogenies crafted in biology over the last 50 years (such as the 16S rRNA-based phylogenies). We also show how the same representation can be used to solve recognition problems associated with genomic data, including in particular species identification, and to store large genomes in compact representations while preserving the ability to query them efficiently. Finally, we sketch other applications to be explored in the future, including objective criteria for producing biological taxonomies and, ultimately, a truly universal and comprehensive “Atlas of Life”, as it is or as it could be on Earth.

Acknowledgements

Many thanks to the High Performance Computing Center (HPC) at the University of Memphis for the computing time used to compute the digital signatures.

Author information

Authors and Affiliations

Department of Computer Science, The University of Memphis, Tennessee, 38152, USA
Max H. Garzon & Sambriddhi Mainali

Corresponding author

Correspondence to Max H. Garzon.

Editor information

Editors and Affiliations

Universidad de Granada, Granada, Spain
Ignacio Rojas & Francisco Ortuño

Cite this paper

Garzon, M.H., Mainali, S. (2017). Towards a Universal Genomic Positioning System: Phylogenetics and Species IDentification. In: Rojas, I., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2017. Lecture Notes in Computer Science, vol 10209. Springer, Cham. https://doi.org/10.1007/978-3-319-56154-7_42

DOI: https://doi.org/10.1007/978-3-319-56154-7_42
Published: 01 April 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56153-0
Online ISBN: 978-3-319-56154-7
eBook Packages: Computer Science, Computer Science (R0)