PATMAP: Polyadenylation Site Identification from Next-Generation Sequencing Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7208)

Abstract

Polyadenylation is an essential post-transcriptional processing step in the maturation of eukaryotic mRNA. The coming flood of next-generation sequencing (NGS) data creates new opportunities for intensive study of polyadenylation. We present an automated workflow called PATMAP that identifies polyadenylation sites (poly(A) sites) by integrating NGS data cleaning, processing, mapping, normalization and clustering. First, an ambiguous region was introduced to parse the genome annotation. Then a series of Perl scripts was integrated to iteratively map the single-end or paired-end sequences to the reference genome. After mapping, the poly(A) tags (PATs) at the same coordinate were grouped into one cleavage site, and internal priming artifacts were removed. Finally, the cleavage sites from different samples were normalized by an MA-based method and clustered into poly(A) clusters (PACs) by an empirical Bayesian method. The effectiveness of PATMAP was demonstrated by identifying thousands of reliable PACs from millions of NGS sequences in Arabidopsis and yeast.
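
The step that groups PATs by coordinate and removes internal priming artifacts can be illustrated with a minimal Python sketch. This is not the PATMAP implementation, which the abstract describes as a set of Perl scripts. The function names, the tuple layout, and the A-richness rule (at least 7 A's in the 10 nt downstream of the site) are assumptions chosen only for illustration; the abstract does not state the actual filtering criterion.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical site key: (chromosome, strand, cleavage coordinate)
Site = Tuple[str, str, int]


def group_pats(pats: Iterable[Site]) -> Dict[Site, int]:
    """Collapse PATs that share chromosome, strand and coordinate
    into one cleavage site, keeping the number of supporting tags."""
    return dict(Counter(pats))


def looks_internally_primed(downstream: str, window: int = 10, min_a: int = 7) -> bool:
    """Assumed heuristic: flag a site when the genomic sequence just
    downstream (already oriented in the transcript sense) is A-rich.
    The window and threshold are illustrative, not taken from the paper."""
    segment = downstream[:window].upper()
    return segment.count("A") >= min_a


def filter_sites(sites: Dict[Site, int], downstream_seq: Dict[Site, str]) -> Dict[Site, int]:
    """Keep only the cleavage sites that are not flagged as internal priming."""
    return {
        site: count
        for site, count in sites.items()
        if not looks_internally_primed(downstream_seq.get(site, ""))
    }


# Toy input: four mapped PATs, two of which share a coordinate.
pats = [
    ("chr1", "+", 1050),
    ("chr1", "+", 1050),
    ("chr1", "+", 1062),
    ("chr2", "-", 300),
]
sites = group_pats(pats)

# Toy downstream genomic sequences for each site (made up for the example).
downstream = {
    ("chr1", "+", 1050): "TTGCATGCAT",
    ("chr1", "+", 1062): "AAAAAAGAAA",  # A-rich, so treated as internal priming
    ("chr2", "-", 300): "GCGCTTAGCA",
}
kept = filter_sites(sites, downstream)
print(kept)
```

In this toy run the two tags at chr1:1050 collapse into one cleavage site with a count of 2, and the site at chr1:1062 is dropped because its downstream sequence is A-rich. The normalization and empirical Bayesian clustering steps are not sketched here, since the abstract gives no detail on them.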


Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wu, X., Tang, M., Yao, J., Lin, S., Xiang, Z., Ji, G. (2012). PATMAP: Polyadenylation Site Identification from Next-Generation Sequencing Data. In: Corchado, E., Snášel, V., Abraham, A., Woźniak, M., Graña, M., Cho, SB. (eds) Hybrid Artificial Intelligent Systems. HAIS 2012. Lecture Notes in Computer Science, vol 7208. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28942-2_44

  • DOI: https://doi.org/10.1007/978-3-642-28942-2_44

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28941-5

  • Online ISBN: 978-3-642-28942-2

  • eBook Packages: Computer Science, Computer Science (R0)
