Skip to main content

Querying Co-regulated Genes on Diverse Gene Expression Datasets Via Biclustering

  • Protocol
  • First Online:
Microarray Data Analysis

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1375))

Abstract

Rapid development and increasing popularity of gene expression microarrays have resulted in a number of studies on the discovery of co-regulated genes. One important way of discovering such co-regulations is the query-based search since gene co-expressions may indicate a shared role in a biological process. Although there exist promising query-driven search methods adapting clustering, they fail to capture many genes that function in the same biological pathway because microarray datasets are fraught with spurious samples or samples of diverse origin, or the pathways might be regulated under only a subset of samples. On the other hand, a class of clustering algorithms known as biclustering algorithms which simultaneously cluster both the items and their features are useful while analyzing gene expression data, or any data in which items are related in only a subset of their samples. This means that genes need not be related in all samples to be clustered together. Because many genes only interact under specific circumstances, biclustering may recover the relationships that traditional clustering algorithms can easily miss. In this chapter, we briefly summarize the literature using biclustering for querying co-regulated genes. Then we present a novel biclustering approach and evaluate its performance by a thorough experimental analysis.

“What we call chaos is just patterns we haven’t recognized. What we call random is just patterns we can’t decipher.”

— Chuck Palahniuk, Survivor

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Protocol
USD 49.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 159.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info
Hardcover Book
USD 109.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Ben-Dor A, Chor B, Karp R, Yakhini Z (2002) Discovering local structure in gene expression data: The order-preserving submatrix problem. In: Proceedings of the International Conference on Computational Biology, pp 49–57

    Google Scholar 

  2. Jiang D, Pei J, Zhang A (2003) DHC: a density-based hierarchical clustering method for time series gene expression data. In: Proceedings IEEE Symposium on BioInformatics and Bioengineering, pp 393–400

    Google Scholar 

  3. Madeira SC, Oliveira AL (2004) Biclustering algorithms for biological data analysis: a survey. IEEE/ACM Trans Comput Biol Bioinform 1(1):24–45

    Article  CAS  PubMed  Google Scholar 

  4. Pujana MA, Han J-DJ, LM Starita, Stevens KN, Tewari M, Ahn JS, Rennert G, Moreno V, Kirchhoff T, Gold B, Assmann V, ElShamy WM, Rual J-F, Levine D, Rozek LS, Gelman RS, Gunsalus KC, Greenberg RA, Sobhian B, Bertin N, Venkatesan K, Ayivi-Guedehoussou N, Sole X, Hernandez P, Lazaro C, Nathanson KL, Weber BL, Cusick ME, Hill DE, Offit K, Livingston DM, Gruber SB, Parvin JD, Vidal M (2007) Network modeling links breast cancer susceptibility and centrosome dysfunction. Nat Genet 39(11):1338–1349

    Article  CAS  PubMed  Google Scholar 

  5. Owen AB, Stuart J, Mach K, Villeneuve AM, Kim S (2003) A gene recommender algorithm to identify coexpressed genes in C. elegans. Genome Res 13(8):1828–1837

    CAS  PubMed  PubMed Central  Google Scholar 

  6. Hibbs MA, Hess DC, Myers CL, Huttenhower C, Li K, Troyanskaya OG (2007) Exploring the functional landscape of gene expression: directed search of large microarray compendia. Bioinformatics 23:2692–2699

    Article  CAS  PubMed  Google Scholar 

  7. Dhollander T, Sheng Q, Lemmens K, De Moor B, Marchal K, Moreau Y (2007) Query-driven module discovery in microarray data. Bioinformatics 23:2573–2580

    Article  CAS  PubMed  Google Scholar 

  8. Adler P, Kolde R, Kull M, Tkachenko A, Peterson H, Reimand J, Vilo J (2009) Mining for coexpression across hundreds of datasets using novel rank aggregation and visualization methods. Genome Biol 10:R139

    Article  PubMed  PubMed Central  Google Scholar 

  9. Bozdağ D, Parvin JD, Çatalyürek ÜV (2009) A biclustering method to discover co-regulated genes using diverse gene expression datasets. In: Proceedings of 1st International Conference on Bioinformatics and Computational Biology, pp 151–163

    Google Scholar 

  10. Zhao H, Cloots L, Van den Bulcke T, Wu Y, De Smet R, Storms V, Meysman P, Engelen K, Marchal K (2011) Query-based biclustering of gene expression data using probabilistic relational models. BMC Bioinf 12(Suppl 1):S37

    Article  Google Scholar 

  11. Cheng Y, Church GM (2000) Biclustering of expression data. In: Proceedings of International Conference on Intelligent Systems for Molecular Biology, pp 93–103

    Google Scholar 

  12. Segal E, Taskar B, Gasch A, Friedman N, Koller D (2001) Rich probabilistic models for gene expression. Bioinformatics 17(suppl_1):S243–S252

    Google Scholar 

  13. Wang H, Wang W, Yang J, Yu PS (2002) Clustering by pattern similarity in large data sets. In: Proceedings of ACM SIGMOD

    Book  Google Scholar 

  14. Lazzeroni L, Owen A (2000) Plaid models for gene expression data. Tech. Rep., Stanford University

    Google Scholar 

  15. Mitra S, Banka H (2006) Multi-objective evolutionary biclustering of gene expression data. Pattern Recognit 39(12):2464–2477

    Article  Google Scholar 

  16. Mejía-Roa E, Carmona-Saez P, Nogales R, Vicente C, Vázquez M, Yang XY, García C, Tirado F, Pascual-Montano A (2008) bioNMF: a web-based tool for nonnegative matrix factorization in biology. Nucleic Acids Res 36(suppl 2):W523–W528

    Article  PubMed  PubMed Central  Google Scholar 

  17. Gu J, Liu JS (2008) Bayesian biclustering of gene expression data. BMC Genomics 9(Suppl 1):S4

    Article  PubMed  PubMed Central  Google Scholar 

  18. Hochreiter S, Bodenhofer U, Heusel M, Mayr A, Mitterecker A, Kasim A, Khamiakova T, Van Sanden S, Lin D, Talloen W et al (2010) Fabia: factor analysis for bicluster acquisition. Bioinformatics 26(12):1520–1527

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  19. Painsky A, Rosset S (2012) Exclusive row biclustering for gene expression using a combinatorial auction approach. In: Proceedings of the 2012 I.E. 12th International Conference on Data Mining, pp 1056–1061. IEEE Computer Society

    Google Scholar 

  20. Joung J-G, Kim S-J, Shin S-Y, Zhang B-T (2012) A probabilistic coevolutionary biclustering algorithm for discovering coherent patterns in gene expression dataset. BMC Bioinf 13(Suppl 17):S12

    Article  Google Scholar 

  21. Flores JL, Inza I, Larrañaga P, Calvo B (2013) A new measure for gene expression biclustering based on non-parametric correlation. Comput Methods Prog Biomed 112(3):367–397

    Article  Google Scholar 

  22. Sun P, Speicher NK, Röttger R, Guo J, Baumbach J (2014) Bi-force: large-scale bicluster editing and its application to gene expression data biclustering. Nucleic Acids Res. doi:10.1093/nar/gku201

    Google Scholar 

  23. Chakraborty A (2005) Biclustering of gene expression data by simulated annealing. In: Proceedings of Eighth International Conference on High-Performance Computing in Asia-Pacific Region, 2005, pp 627–632

    Google Scholar 

  24. Liew AW-C, Law N-F, Yan H (2011) Recent patents on biclustering algorithms for gene expression data analysis. Recent Pat DNA Gene Seq 5(2):117–125

    Article  CAS  PubMed  Google Scholar 

  25. Hussain SF (2011) Bi-clustering gene expression data using co-similarity. In: Proceedings of the 7th International Conference on Advanced Data Mining and Applications - Volume Part I, ADMA’11, pp 190–200. Springer, Berlin/Heidelberg

    Google Scholar 

  26. An J, Liew AW-C, Nelson CC (2012) Seed-based biclustering of gene expression data. PLoS ONE 7:e42431, 08

    Google Scholar 

  27. Kiraly A, Abonyi J, Laiho A, Gyenesei A (2012) Biclustering of high-throughput gene expression data with bicluster miner. In: IEEE 12th International Conference on Data Mining Workshops (ICDMW), 2012, pp 131–138

    Google Scholar 

  28. Liu J, Wang J, Wang W (2004) Biclustering in gene expression data by tendency. In: Proceedings of IEEE Computational Systems Bioinformatics Conference, pp 182–193. IEEE Computer Society

    Google Scholar 

  29. Liu J, Wang J, Wang W (2004) Gene ontology friendly biclustering of expression profiles. In: Proceedings of IEEE Computational Systems Bioinformatics Conference, pp 436–447. IEEE Computer Society

    Google Scholar 

  30. Madeira S, Oliveira A (2005) A linear time biclustering algorithm for time series gene expression data. In: Casadio R, Myers G (eds) Algorithms in bioinformatics. Lecture Notes in Computer Science, vol 3692, pp 39–52, Springer, Berlin/Heidelberg

    Chapter  Google Scholar 

  31. Pontes B, Giraldéz R, Aguilar-Ruiz JS (2013) Configurable pattern-based evolutionary biclustering of gene expression data. Algorithms Mol Biol 8:4

    Article  PubMed  PubMed Central  Google Scholar 

  32. Yang W-H, Dai D-Q, Yan H (2011) Finding correlated biclusters from gene expression data. IEEE Trans Knowl Data Eng 23:568–584

    Article  Google Scholar 

  33. Yoon S, Nardini C, Benini L, De Micheli G (2005) Discovering coherent biclusters from gene expression data using zero-suppressed binary decision diagrams. IEEE/ACM Trans Comput Biol Bioinf 2:339–354

    Article  CAS  Google Scholar 

  34. Angiulli F, Cesario E, Pizzuti C (2008) Random walk biclustering for microarray data. Inf Sci 178(6):1479–1497

    Article  Google Scholar 

  35. Bryan K (2005) Biclustering of expression data using simulated annealing. In: Proceedings of the 18th IEEE Symposium on Computer-Based Medical Systems, CBMS’05, (Washington, DC, USA), pp 383–388. IEEE Computer Society

    Google Scholar 

  36. Bryan K, Cunningham P, Bolshakova N (2006) Application of simulated annealing to the biclustering of gene expression data. Trans Inf Tech Biomed 10:519–525

    Article  Google Scholar 

  37. Bleuler S, Prelic A, Zitzler E (2004) An EA framework for biclustering of gene expression data. In: Congress on Evolutionary Computation, 2004 (CEC2004), vol 1, pp 166–173

    Google Scholar 

  38. Divina F, Aguilar-Ruiz J (2006) Biclustering of expression data with evolutionary computation. IEEE Trans Knowl Data Eng 18:590–602

    Article  Google Scholar 

  39. Nepomuceno JA, Troncoso A, Aguilar-Ruiz JS (2010) Correlation-based scatter search for discovering biclusters from gene expression data. In: Proceedings of the 8th European Conference on Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics, EvoBIO’10, pp 122–133. Springer, Berlin/Heidelberg

    Chapter  Google Scholar 

  40. Nepomuceno JA, Troncoso A, Aguilar-Ruiz JS (2011) A comparative analysis of biclustering algorithms for gene expression data. BioData Mining 4:3

    Article  PubMed  PubMed Central  Google Scholar 

  41. Erten C, Sözdinler M (2009) Biclustering expression data based on expanding localized substructures. In: Rajasekaran S (ed) Bioinformatics and computational biology. Lecture Notes in Computer Science, vol 5462, pp 224–235. Springer, Berlin/Heidelberg

    Google Scholar 

  42. Tanay A, Sharan R, Shamir R (2002) Discovering statistically significant biclusters in gene expression data. Bioinformatics 18(Supplement 1):136–144

    Google Scholar 

  43. Bergmann S, Ihmels J, Barkai N (2003) Iterative signature algorithm for the analysis of large-scale gene expression data. Phys Rev E Stat Nonlinear Soft Matter Phys 67:031902

    Article  Google Scholar 

  44. Kluger Y, Basri R, Chang JT, Gerstein M (2003) Spectral biclustering of microarray data: coclustering genes and conditions. Genome Res 13(4):703–716

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  45. Prelić A, Bleuler S, Zimmermann P, Wille A, Bühlmann P, Gruissem W, Hennig L, Thiele L, Zitzler E (2006) A systematic comparison and evaluation of biclustering methods for gene expression data. Bioinformatics 22:1122–1129

    Article  PubMed  Google Scholar 

  46. Li G, Ma Q, Tang H, Paterson AH, Xu Y (2009) QUBIC: a qualitative biclustering algorithm for analyses of gene expression data. Nucleic Acids Res 37(15):e101

    Article  PubMed  PubMed Central  Google Scholar 

  47. Huttenhower C, Mutungu KT, Indik N, Yang W, Schroeder M, Forman JJ, Troyanskaya OG, Coller HA (2009) Detailing regulatory networks through large scale data integration. Bioinformatics 25:3267–3274

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  48. Voggenreiter O, Bleuler S, Gruissem W (2012) Exact biclustering algorithm for the analysis of large gene expression data sets. BMC Bioinf 13(Suppl 18):A10

    Google Scholar 

  49. Bryan K, Cunningham P (2006) Bottom-up biclustering of expression data. In: IEEE Symposium on Computational Intelligence and Bioinformatics and Computational Biology, 2006 (CIBCB ’06), pp 1–8

    Google Scholar 

  50. Murali T, Kasif S (2003) Extracting conserved gene expression motifs from gene expression data. Pac Symp Biocomput 8:77–88

    Google Scholar 

  51. Liu J, Wang W (2003) Op-cluster: clustering by tendency in high dimensional space. In: Proceedings of IEEE International Conference on Data Mining, p 187

    Google Scholar 

  52. Freitas AV, Ayadi W, Elloumi M, Oliveira J, Oliveira J, Hao J-K (2013) Survey on biclustering of gene expression data, pp 591–608. Wiley, New York

    Google Scholar 

  53. Bozdağ D, Kumar A, Çatalyürek ÜV (2010) Comparative Analysis of Biclustering Algorithms. In: ACM International Conference on Bioinformatics and Computational Biology

    Book  Google Scholar 

  54. Chia BKH, Karuturi RKM (2010) Differential co-expression framework to quantify goodness of biclusters and compare biclustering algorithms. Algorithms Mol Biol 5(1):8

    Article  Google Scholar 

  55. Eren K, Deveci M, Küçüktunç O, Çatalyürek ÜV (2012) A comparative analysis of biclustering algorithms for gene expression data. Brief Bioinform

    Google Scholar 

  56. Oghabian A, Kilpinen S, Hautaniemi S, Czeizler E (2014) Biclustering methods: Biological relevance and application in gene expression analysis. PloS one 9(3):e90801

    Article  PubMed  PubMed Central  Google Scholar 

  57. Bhattacharya A, De RK (2009) Bi-correlation clustering algorithm for determining a set of co-regulated genes. Bioinformatics 25(21):2795–2801

    Article  CAS  PubMed  Google Scholar 

  58. Casella G, Wells MT (1993) Is Pitman closeness a reasonable criterion: comment. J Am Stat Assoc 88(421):70–71

    Google Scholar 

  59. Mian O, Wang S, Zhu S, Gnanapragasam M, Graham L, Bear H, Ginder G (2011) Methyl-binding domain protein 2-dependent proliferation and survival of breast cancer cells. Mol Cancer Res 9(8):1152–62

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  60. Kioulafa M, Kaklamanis L, Stathopoulos E, Mavroudis D, Georgoulias V, Lianidou ES (2009) Kallikrein 10 (KLK10) methylation as a novel prognostic biomarker in early breast cancer. Ann Oncol 20:1020–1025

    Article  CAS  PubMed  Google Scholar 

  61. Dorszewska J, Florczak J, Rozycka A, Jaroszewska-Kolecka J, Trzeciak WH, Kozubski W (2005) Polymorphisms of the CHRNA4 gene encoding the alpha4 subunit of nicotinic acetylcholine receptor as related to the oxidative DNA damage and the level of apoptotic proteins in lymphocytes of the patients with Alzheimer’s disease. DNA Cell Biol 24:786–794

    Article  CAS  PubMed  Google Scholar 

  62. Zhang L, Farrell JJ, Zhou H, Elashoff D, Akin D, Park N-H, Chia D, Wong DT (2010) Salivary transcriptomic biomarkers for detection of resectable pancreatic cancer. Gastroenterology 138(3):949–957, e1–7

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  63. Lindahl M, Poteryaev D, Yu L, Arumae U, Timmusk T, Bongarzone I, Aiello A, Pierotti MA, Airaksinen MS, Saarma M (2001) Human glial cell line-derived neurotrophic factor receptor alpha 4 is the receptor for persephin and is predominantly expressed in normal and malignant thyroid medullary cells. J Biol Chem 276:9344–9351

    Article  CAS  PubMed  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Ümit V. Çatalyürek .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2015 Springer Science+Business Media New York

About this protocol

Cite this protocol

Deveci, M., Küçüktunç, O., Eren, K., Bozdağ, D., Kaya, K., Çatalyürek, Ü.V. (2015). Querying Co-regulated Genes on Diverse Gene Expression Datasets Via Biclustering. In: Guzzi, P. (eds) Microarray Data Analysis. Methods in Molecular Biology, vol 1375. Humana Press, New York, NY. https://doi.org/10.1007/7651_2015_246

Download citation

  • DOI: https://doi.org/10.1007/7651_2015_246

  • Published:

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-3172-9

  • Online ISBN: 978-1-4939-3173-6

  • eBook Packages: Springer Protocols

Publish with us

Policies and ethics