Genome-Scale Analysis of Data from High-Throughput Technologies

Modern Molecular Biology

Abstract

Few technical advances have excited such a broad spectrum of basic and clinical scientists as high-throughput technologies (microarrays and sequencing). Having learned in training that somewhere in the genome lies the key to just about any phenotype, scientists are fast joining the movement to decrease the cost of these technologies and improve access to them. Generating enormous amounts of high-dimensional data brings certain challenges, and many researchers are turning even further from their training to collaborate with computer scientists and biostatisticians, who are equally excited to analyze these promising datasets. As new and truly interdisciplinary teams are created, we are seeing major advances; the current environment is exciting for all involved. Technology has brought entire scientific fields to the brink of discovery before and will again, but the overall enthusiasm must be tempered by the fact that new technology brings new problems and new artifacts that we have not seen before. We can circumvent some of these by paying careful attention to experimental design, staying mindful of the complexities of the underlying biology, and soliciting assistance from analysts versed in high-dimensional data.

Author information

Corresponding author

Correspondence to Sarah J. Wheelan.

Copyright information

© 2010 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Wheelan, S.J. (2010). Genome-Scale Analysis of Data from High-Throughput Technologies. In: Yegnasubramanian, S., Isaacs, W. (eds) Modern Molecular Biology. Applied Bioinformatics and Biostatistics in Cancer Research. Springer, New York, NY. https://doi.org/10.1007/978-0-387-69745-1_1
