Viral Gene Compression: Complexity and Verification
The smallest known biological organisms are, by far, the viruses. One of the unique adaptations that many viruses have acquired is the compression of the genes in their genomes. In this paper we study a formalized model of gene compression in viruses. Specifically, we define a set of constraints that describe viral gene compression strategies and investigate the properties of these constraints from the point of view of genomes as languages. We pay special attention to the finite case (representing real viral genomes) and describe a metric for measuring the level of compression in a real viral genome. We give an efficient algorithm for computing this metric, along with applications to real genomes, including automated classification of viruses and prediction of horizontal gene transfer between host and virus.
Keywords: Viral Genome, Horizontal Gene Transfer, Formal Language, Large Integer, Regular Language