A Novel Approach to Revealing Positive and Negative Co-Regulated Genes

  • Regular Paper
  • Journal of Computer Science and Technology

Abstract

Biologists have a real and growing need to identify co-regulated gene clusters, which include both positively and negatively regulated gene clusters. However, existing pattern-based and tendency-based clustering approaches are designed to find only positively regulated gene clusters. In this paper, a new subspace clustering model called g-Cluster is proposed for gene expression data. The proposed model has the following advantages: 1) it finds both positive and negative co-regulated genes at once, 2) it is free of the restriction that co-regulated genes be related by a magnitude transformation, and 3) it guarantees the quality of clusters and the significance of regulations using a novel similarity measurement, gCode, and a user-specified regulation threshold δ, respectively. No previous work accomplishes this task. Moreover, the minimum description length (MDL) technique is introduced to avoid generating insignificant g-Clusters. A tree structure, the GS-tree, is also designed, along with two algorithms that combine efficient pruning and optimization strategies to identify all qualified g-Clusters. Extensive experiments are conducted on real and synthetic datasets. The experimental results show that 1) the algorithms find a number of co-regulated gene clusters missed by previous models, which are potentially of high biological significance, and 2) the algorithms are effective and efficient, and outperform the existing approaches.
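
To make the distinction between positive and negative co-regulation concrete, the following is a minimal illustrative sketch. It is not the paper's gCode measurement, whose definition is not given in the abstract. It assumes a simple encoding in which each step between adjacent conditions is marked as up, down, or insignificant, with δ treated as an absolute difference threshold. The function names `regulation_code` and `co_regulation` are hypothetical.

```python
from typing import List


def regulation_code(expr: List[float], delta: float) -> List[int]:
    """Encode each step between adjacent conditions as +1 (up),
    -1 (down) or 0 (change not larger than delta).
    Assumption: delta is an absolute difference threshold."""
    code = []
    for prev, curr in zip(expr, expr[1:]):
        diff = curr - prev
        if diff > delta:
            code.append(1)
        elif diff < -delta:
            code.append(-1)
        else:
            code.append(0)
    return code


def co_regulation(a: List[float], b: List[float], delta: float) -> str:
    """Classify two genes as 'positive', 'negative' or 'none'
    by comparing their step codes (illustrative rule only)."""
    code_a = regulation_code(a, delta)
    code_b = regulation_code(b, delta)
    if not any(code_a):
        return "none"  # gene a shows no significant regulation at all
    if code_a == code_b:
        return "positive"  # same direction at every step
    if code_a == [-c for c in code_b]:
        return "negative"  # opposite direction at every step
    return "none"


# Toy expression levels over four conditions (made-up values).
g1 = [1.0, 3.0, 2.0, 5.0]
g2 = [10.0, 14.0, 12.5, 20.0]  # same directions as g1, neither a shift nor a scaling of it
g3 = [9.0, 4.0, 6.0, 1.0]      # opposite directions to g1
g4 = [1.0, 1.2, 3.0, 2.9]      # mostly insignificant changes

print(co_regulation(g1, g2, delta=0.5))
print(co_regulation(g1, g3, delta=0.5))
print(co_regulation(g1, g4, delta=0.5))
```

In this toy setting, g2 follows g1 without being a shifted or scaled copy of it, which is the kind of relationship a magnitude-based pattern model would miss, and g3 mirrors g1, which a model limited to positive regulation would not report.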

Author information

Corresponding author

Correspondence to Ying Yin.

Additional information

This work is supported by the National Grand Fundamental Research 973 Program of China (Grant No. 2006CB303103) and the National Natural Science Foundation of China under Grants No. 60573089, No. 60273079 and No. 60473074.

About this article

Cite this article

Zhao, YH., Wang, GR., Yin, Y. et al. A Novel Approach to Revealing Positive and Negative Co-Regulated Genes. J Comput Sci Technol 22, 261–272 (2007). https://doi.org/10.1007/s11390-007-9033-7

  • DOI: https://doi.org/10.1007/s11390-007-9033-7