A Distributed, Parallel System for Large-Scale Structure Recognition in Gene Expression Data

  • Conference paper
High Performance Computing and Communications (HPCC 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4208)

Abstract

Due to the development of very high-throughput lab technology, known as DNA microarrays, it has become feasible for scientists to monitor the transcriptional activity of all known genes in many living organisms. Such assays are typically conducted repeatedly, along a time course or across a series of predefined experimental conditions, yielding a set of expression profiles. Arranging these into subsets, based on their pairwise similarity, is known as clustering. Clusters of genes exhibiting similar expression behavior are often related in a biologically meaningful way, which is of central interest to research in functional genomics.
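
To make the notion of pairwise similarity between expression profiles concrete, the following is a minimal sketch that computes a full similarity matrix for a toy dataset. The choice of Pearson correlation as the similarity measure, the function name pairwise_similarity, and the toy values are assumptions for illustration only; the abstract does not state which measure the system uses.

```python
import numpy as np


def pairwise_similarity(profiles):
    """Return the Pearson correlation between every pair of rows.

    `profiles` has one row per gene and one column per time point or
    experimental condition. Pearson correlation is an assumed measure,
    chosen here only for illustration.
    """
    profiles = np.asarray(profiles, dtype=float)
    # np.corrcoef treats each row as one variable by default.
    return np.corrcoef(profiles)


# Hypothetical toy data: three genes measured under four conditions.
profiles = np.array([
    [1.0, 2.0, 3.0, 4.0],   # rising profile
    [2.0, 4.0, 6.0, 8.0],   # same shape, larger amplitude
    [4.0, 3.0, 2.0, 1.0],   # falling profile
])

S = pairwise_similarity(profiles)
print(np.round(S, 3))
```

In this toy case the first two genes are perfectly correlated and the third is perfectly anti-correlated with both. Note that the matrix has one entry per pair of profiles, so its size grows quadratically with the number of genes, which is what makes clustering with full similarity information expensive at the scale the abstract mentions.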

We present a distributed, parallel system based on spectral graph theory and numerical linear algebra that can solve this problem for datasets generated by the latest generation of microarrays, and at high levels of experimental noise. It allows us to process hundreds of thousands of expression profiles, thereby vastly increasing the current size limit for unsupervised clustering with full similarity information.
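
As a rough illustration of how spectral graph theory can turn a similarity matrix into clusters, the sketch below splits a small set of items into two groups using the sign pattern of the second-smallest eigenvector of the graph Laplacian (the Fiedler vector). This is a generic, single-machine textbook technique and is not claimed to be the algorithm of the paper; the function name spectral_bipartition, the clipping of negative similarities to zero, and the toy matrix are all assumptions.

```python
import numpy as np


def spectral_bipartition(similarity):
    """Split items into two groups using the Fiedler vector.

    `similarity` is a symmetric matrix of pairwise similarities.
    Returns a boolean array with one group label per item.
    """
    W = np.asarray(similarity, dtype=float).copy()
    np.fill_diagonal(W, 0.0)        # ignore self-similarity
    W = np.clip(W, 0.0, None)       # assumption: drop negative weights
    degree = W.sum(axis=1)
    laplacian = np.diag(degree) - W  # unnormalized graph Laplacian
    # eigh handles symmetric matrices and returns eigenvalues in
    # ascending order, so column 1 is the Fiedler vector.
    _, eigvecs = np.linalg.eigh(laplacian)
    fiedler = eigvecs[:, 1]
    return fiedler >= 0


# Hypothetical toy data: two groups of three items, strongly similar
# within a group (0.9) and weakly similar across groups (0.05).
strong = np.full((3, 3), 0.9)
weak = np.full((3, 3), 0.05)
similarity = np.block([[strong, weak], [weak, strong]])

labels = spectral_bipartition(similarity)
print(labels)
```

Which group receives the label True is arbitrary, because an eigenvector is only defined up to sign. A dense eigendecomposition like this one costs on the order of n³ operations for n profiles, which suggests why a distributed, parallel approach is needed at the scale of hundreds of thousands of profiles.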

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ernst, J. (2006). A Distributed, Parallel System for Large-Scale Structure Recognition in Gene Expression Data. In: Gerndt, M., Kranzlmüller, D. (eds) High Performance Computing and Communications. HPCC 2006. Lecture Notes in Computer Science, vol 4208. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11847366_3

  • DOI: https://doi.org/10.1007/11847366_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-39368-9

  • Online ISBN: 978-3-540-39372-6

  • eBook Packages: Computer Science, Computer Science (R0)