RNA-seq raw data processing

Cellerino, Alessandro; Sanguanini, Michele

doi:10.1007/978-88-7642-642-1_3


Part of the book series: CRM Series (LNSNS, volume 17)


Abstract

The human genome contains more than 20,000 protein-coding genes, but the complexity of the RNA population in any given human sample is at least one order of magnitude higher, because alternative splicing generates multiple isoforms from a single gene. To this, one has to add an increasing number of non-coding RNAs and various forms of RNA editing. This high complexity poses important technical and computational questions, such as:

How ‘deep’ should the planned sequencing be (i.e. how many clusters should be sequenced from the cDNA libraries) to obtain a good representation of the transcript diversity?
Is the processing of the dataset (i.e. the identification of the gene of origin for each sequence) feasible in terms of computation time?
Can the complexity be reduced?

In this chapter, the problems of complexity and of mapping the RNA-seq reads to the reference genome are addressed from a probabilistic and information-theoretic point of view. The issue of reducing the complexity is dealt with in Chapters 5 and 6.
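
As a rough illustration of the sequencing-depth question, the following minimal sketch estimates how many reads are needed to observe at least one read from a transcript of a given relative abundance. It is not taken from the chapter: it assumes that reads are sampled independently and that each read originates from the transcript with probability equal to its relative abundance, ignoring transcript length, library bias, and mapping errors. The function names `detection_probability` and `reads_needed` are hypothetical.

```python
import math


def detection_probability(rel_abundance: float, n_reads: int) -> float:
    """Probability of sampling at least one read from a transcript.

    Assumes independent reads, each coming from the transcript with
    probability `rel_abundance` (a simplifying assumption).
    Computes 1 - (1 - p)^N in a numerically stable way.
    """
    return -math.expm1(n_reads * math.log1p(-rel_abundance))


def reads_needed(rel_abundance: float, target: float = 0.95) -> int:
    """Smallest number of reads N such that 1 - (1 - p)^N >= target."""
    return math.ceil(math.log(1.0 - target) / math.log1p(-rel_abundance))


if __name__ == "__main__":
    # A transcript making up one millionth of the RNA population.
    p = 1e-6
    n = reads_needed(p, target=0.95)
    print(f"reads needed: {n}")
    print(f"detection probability at that depth: {detection_probability(p, n):.4f}")
```

Under these assumptions the required depth scales roughly as the inverse of the relative abundance, which is why rare isoforms dominate the sequencing budget.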


Author information

Authors and Affiliations

Scuola Normale Superiore, Piazza dei Cavalieri, 7, Pisa, 56126, Italy
Alessandro Cellerino
Gonville and Caius College, University of Cambridge, Trinity Street, Cambridge, Cambridgeshire, CB2 1TA, United Kingdom
Michele Sanguanini



About this chapter

Cite this chapter

Cellerino, A., Sanguanini, M. (2018). RNA-seq raw data processing. In: Transcriptome Analysis. CRM Series, vol 17. Edizioni della Normale, Pisa. https://doi.org/10.1007/978-88-7642-642-1_3


DOI: https://doi.org/10.1007/978-88-7642-642-1_3
Publisher Name: Edizioni della Normale, Pisa
Print ISBN: 978-88-7642-641-4
Online ISBN: 978-88-7642-642-1
eBook Packages: Mathematics and Statistics (R0)
