An Introduction to High-Throughput Sequencing Experiments: Design and Bioinformatics Analysis

Normand, Rachelly; Yanai, Itai

doi:10.1007/978-1-62703-514-9_1

Part of the book series: Methods in Molecular Biology (MIMB, volume 1038)

Abstract

The dramatic fall in the cost of DNA sequencing has revolutionized the experiments within reach in the life sciences. Here we provide an introduction to the domains of analysis possible using high-throughput sequencing, distinguishing between “counting” and “reading” applications. We discuss the steps in designing a high-throughput sequencing experiment, introduce the most widely used applications, and describe basic sequencing concepts. We review the various software programs available for many of the bioinformatics analyses required to make sense of the sequencing data. We hope that this introduction will be accessible to biologists who have no previous background in bioinformatics but a keen interest in applying the power of high-throughput sequencing in their research.

Author information

Authors and Affiliations

Life Sciences and Engineering, Technion Genome Center, Technion—Israel Institute of Technology, Haifa, Israel
Rachelly Normand
Department of Biology, Technion—Israel Institute of Technology, Haifa, Israel
Itai Yanai

Editor information

Editors and Affiliations

Faculty of Medicine, Tel Aviv University, Tel Aviv, 69978, Israel
Noam Shomron

About this protocol

Cite this protocol

Normand, R., Yanai, I. (2013). An Introduction to High-Throughput Sequencing Experiments: Design and Bioinformatics Analysis. In: Shomron, N. (eds) Deep Sequencing Data Analysis. Methods in Molecular Biology, vol 1038. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-514-9_1

DOI: https://doi.org/10.1007/978-1-62703-514-9_1
Published: 18 June 2013
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-513-2
Online ISBN: 978-1-62703-514-9
eBook Packages: Springer Protocols
