
Linear Coherent Bi-cluster Discovery via Line Detection and Sample Majority Voting

  • Conference paper
Combinatorial Optimization and Applications (COCOA 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5573)

Abstract

Discovering groups of genes that share common expression profiles is an important problem in DNA microarray analysis. Unfortunately, standard clustering algorithms often fail to retrieve common expression groups because (1) genes only exhibit similar behaviors over a subset of conditions, and (2) genes may participate in more than one functional process and therefore belong to multiple groups. Many bi-clustering algorithms have been proposed to address these problems in the past decade; however, most such algorithms are still unable to discover linear coherent bi-clusters, a strict generalization of the additive and multiplicative bi-clustering models. In this paper, we propose a novel bi-clustering algorithm that discovers linear coherent bi-clusters by first detecting linear correlations between pairs of gene expression profiles, then identifying groups by sample majority voting. Our experimental results on synthetic data and on two real datasets, Saccharomyces cerevisiae and Arabidopsis thaliana, show significant performance improvements over previous methods. One intriguing aspect of our approach is that it can easily be extended to identify bi-clusters of more complex gene-gene correlations.
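The two steps named in the abstract, pairwise line detection followed by sample majority voting, can be illustrated with a toy sketch. This is not the authors' algorithm: the function names `best_line` and `vote_samples`, the tolerance `tol`, the exhaustive two-point line search, and the toy profiles are all assumptions made for illustration. It assumes a linear coherent relationship means one profile equals `slope * other + intercept` over a subset of samples, so that the additive model (slope 1) and the multiplicative model (intercept 0) appear as special cases.

```python
from itertools import combinations


def best_line(x, y, tol=1e-6):
    """Find the line y = slope * x + intercept supported by the most samples.

    Every pair of samples proposes a candidate line; the candidate that the
    largest number of samples lie on (within tol) wins. Hypothetical helper.
    """
    best = (0.0, 0.0, [])
    for a, b in combinations(range(len(x)), 2):
        if x[a] == x[b]:
            continue  # a vertical candidate cannot be written as y = s*x + t
        slope = (y[b] - y[a]) / (x[b] - x[a])
        intercept = y[a] - slope * x[a]
        support = [j for j in range(len(x))
                   if abs(y[j] - (slope * x[j] + intercept)) <= tol]
        if len(support) > len(best[2]):
            best = (slope, intercept, support)
    return best


def vote_samples(seed, others, tol=1e-6):
    """Keep the samples that a strict majority of the other genes vote for.

    Each gene votes for the samples supporting its best line against the seed.
    """
    votes = [0] * len(seed)
    for profile in others:
        _, _, support = best_line(seed, profile, tol)
        for j in support:
            votes[j] += 1
    return [j for j, v in enumerate(votes) if 2 * v > len(others)]


# Toy profiles over five samples; the last sample breaks every relationship.
gene_a = [1.0, 2.0, 3.0, 4.0, 9.0]
gene_b = [3.0, 5.0, 7.0, 9.0, 0.5]    # 2 * gene_a + 1 (general linear)
gene_c = [2.0, 3.0, 4.0, 5.0, 1.0]    # gene_a + 1     (additive case)
gene_d = [3.0, 6.0, 9.0, 12.0, 2.0]   # 3 * gene_a     (multiplicative case)

slope, intercept, support = best_line(gene_a, gene_b)
shared_samples = vote_samples(gene_a, [gene_b, gene_c, gene_d])
```

In this toy case all three genes are linearly related to `gene_a` on the first four samples only, so the vote retains those samples and discards the fifth. Real expression data is noisy, so a practical version would need a tolerance suited to the data and a cheaper line search than the quadratic enumeration used here.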




Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Shi, Y., Cai, Z., Lin, G., Schuurmans, D. (2009). Linear Coherent Bi-cluster Discovery via Line Detection and Sample Majority Voting. In: Du, DZ., Hu, X., Pardalos, P.M. (eds) Combinatorial Optimization and Applications. COCOA 2009. Lecture Notes in Computer Science, vol 5573. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02026-1_7


  • DOI: https://doi.org/10.1007/978-3-642-02026-1_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02025-4

  • Online ISBN: 978-3-642-02026-1

  • eBook Packages: Computer Science, Computer Science (R0)
