Abstract
Bioinformatic analysis is critical for studies that use large volumes of DNA, RNA, and protein sequence data. This chapter does not attempt a comprehensive review of bioinformatic and computational research; instead, it discusses bioinformatic approaches developed or tested in the author’s laboratory. These approaches are: (1) a statistical method for gene direction analysis, (2) technical highlights of genome and chromosome base composition analysis, (3) technical highlights of RNA polyadenylation site analysis, (4) allele comparison for protein domains, and (5) protein network analysis. After these five methods are described, unsolved technical issues are highlighted and potential future research directions are discussed.
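As a small illustration of approach (2), base composition is often summarized as the C + G fraction of a sequence. The sketch below is not the chapter's own method, which the abstract does not detail. It is a minimal example under stated assumptions: the function name `gc_content` is hypothetical, and ambiguous symbols such as N are simply excluded from the denominator.

```python
from collections import Counter


def gc_content(sequence: str) -> float:
    """Return the C + G fraction among unambiguous bases (A, C, G, T, U).

    Hypothetical helper for illustration only. Ambiguous symbols
    (for example N) and gaps are ignored rather than counted.
    """
    counts = Counter(sequence.upper())
    gc = counts["C"] + counts["G"]
    at = counts["A"] + counts["T"] + counts["U"]
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / total


if __name__ == "__main__":
    # Toy sequence, not real data.
    print(gc_content("ATGCGCNNAT"))
```

Excluding ambiguous bases keeps masked or unsequenced regions from biasing the estimate. Whether that choice is appropriate depends on the assembly being analysed.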
Copyright information
© 2015 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Li, XQ. (2015). Bioinformatic Approaches for Analysis of Gene Direction, Chromosome Base Composition, mRNA Polyadenylation, and Protein Network. In: Li, XQ., Donnelly, D., Jensen, T. (eds) Somatic Genome Manipulation. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2389-2_15
DOI: https://doi.org/10.1007/978-1-4939-2389-2_15
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-2388-5
Online ISBN: 978-1-4939-2389-2
eBook Packages: Biomedical and Life Sciences (R0)