Genome-Scale Analysis of Data from High-Throughput Technologies

Wheelan, Sarah J.

doi:10.1007/978-0-387-69745-1_1

Part of the book series: Applied Bioinformatics and Biostatistics in Cancer Research (ABB)

Abstract

Few technical advances have excited such a broad spectrum of basic and clinical scientists as high-throughput technologies (microarrays and sequencing). Having learned in training that somewhere in the genome lies the key to just about any phenotype, scientists are fast joining the movement to decrease cost and improve access to these technologies. Generating enormous amounts of high-dimensional data brings certain challenges, and many researchers are turning even further from their training to collaborate with computer scientists and biostatisticians, who are equally excited to analyze these promising datasets. As new and truly interdisciplinary teams are created, we are seeing major advances; the current environment is exciting for all involved. Technology has brought entire scientific fields to the brink of discovery before, and will again, and thus the overall enthusiasm must be tempered by the fact that new technology brings new problems and new artifacts that we have not seen before. We can circumvent some of these by paying careful attention to experimental design, staying mindful of the complexities of the underlying biology, and soliciting assistance from analysts versed in high-dimensional data.

Author information

Authors and Affiliations

Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
Sarah J. Wheelan

Corresponding author

Correspondence to Sarah J. Wheelan.

Editor information

Editors and Affiliations

Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, 1650 Orleans Street, Baltimore, Maryland 21231, USA
Srinivasan Yegnasubramanian
School of Medicine, Urology, Pharmacology, Oncology, Johns Hopkins University, 600 N. Wolfe Street, Baltimore, Maryland 21287, USA
William B. Isaacs

About this chapter

Cite this chapter

Wheelan, S.J. (2010). Genome-Scale Analysis of Data from High-Throughput Technologies. In: Yegnasubramanian, S., Isaacs, W. (eds) Modern Molecular Biology. Applied Bioinformatics and Biostatistics in Cancer Research. Springer, New York, NY. https://doi.org/10.1007/978-0-387-69745-1_1

DOI: https://doi.org/10.1007/978-0-387-69745-1_1
Published: 29 July 2010
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-69744-4
Online ISBN: 978-0-387-69745-1
eBook Packages: Biomedical and Life Sciences (R0)
