Abstract
Next-generation sequencing has become the method of choice for investigating small RNA transcriptomes in plants and animals. Although the technical side of sequencing is becoming routine and experimental costs are affordable, data analysis remains a challenge, especially for researchers with limited computational experience. Here, we present a detailed description of a computational workflow that takes raw sequencing reads as input, produces small RNA predictions, and detects differentially expressed microRNAs. The exact commands and pieces of code are provided so that other researchers can adapt and use them to facilitate the study of small RNA regulation.
Copyright information
© 2017 Springer Science+Business Media New York
About this protocol
Cite this protocol
Ilnytskyy, S., Bilichak, A. (2017). Bioinformatics Analysis of Small RNA Transcriptomes: The Detailed Workflow. In: Kovalchuk, I. (eds) Plant Epigenetics. Methods in Molecular Biology, vol 1456. Humana Press, Boston, MA. https://doi.org/10.1007/978-1-4899-7708-3_16
DOI: https://doi.org/10.1007/978-1-4899-7708-3_16
Publisher Name: Humana Press, Boston, MA
Print ISBN: 978-1-4899-7706-9
Online ISBN: 978-1-4899-7708-3
eBook Packages: Springer Protocols