A New Approach for Clustering Gene Expression Data

  • Conference paper
Computational Intelligence, Communications, and Business Analytics (CICBA 2017)

Abstract

Most clustering algorithms are sensitive to noise, and many of them assign every gene in the dataset to a cluster. However, only a small subset of the genes in a gene expression dataset may be involved in the biological processes active under a particular set of experimental conditions or samples. To identify such gene clusters, we propose a method that finds co-expressed genes likely to be co-regulated, even in the presence of non-functional genes and a high level of noise. The proposed method clusters the genes that lie within a distance threshold t of a specific gene in each experimental condition, computing distances column-wise. To validate the method, we carried out an experimental analysis on a real gene expression dataset; the results indicate that the proposed method improves on an existing one.
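
The column-wise threshold idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the reference gene is chosen or which distance is used, so the function name `genes_within_threshold`, the use of absolute difference as the per-condition distance, and the toy matrix below are all assumptions made for illustration.

```python
from typing import List, Sequence


def genes_within_threshold(
    expr: Sequence[Sequence[float]], seed: int, t: float
) -> List[int]:
    """Return indices of genes that stay within t of the seed gene
    in every condition (hypothetical helper, not from the paper)."""
    n_conditions = len(expr[seed])
    candidates = list(range(len(expr)))
    # Column-wise pass: handle one condition at a time and drop any
    # gene whose expression differs from the seed's by more than t.
    for c in range(n_conditions):
        ref = expr[seed][c]
        # Assumption: per-condition distance is the absolute difference.
        candidates = [g for g in candidates if abs(expr[g][c] - ref) <= t]
    return candidates


# Toy expression matrix: rows are genes, columns are conditions.
expr = [
    [1.0, 2.0, 3.0],  # gene 0, used as the reference gene
    [1.2, 2.1, 2.8],  # gene 1: close to gene 0 in every condition
    [1.1, 5.0, 3.0],  # gene 2: far from gene 0 in the second condition
    [9.0, 9.0, 9.0],  # gene 3: unrelated profile
]

cluster = genes_within_threshold(expr, seed=0, t=0.5)
print(cluster)  # [0, 1]
```

In this sketch a gene that fails the threshold in a single condition is excluded, so genes with unrelated profiles are left unclustered rather than forced into a group, which matches the abstract's point that not every gene needs to be assigned.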

Corresponding author

Correspondence to Girish Chandra.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chandra, G., Tripathi, S. (2017). A New Approach for Clustering Gene Expression Data. In: Mandal, J., Dutta, P., Mukhopadhyay, S. (eds) Computational Intelligence, Communications, and Business Analytics. CICBA 2017. Communications in Computer and Information Science, vol 776. Springer, Singapore. https://doi.org/10.1007/978-981-10-6430-2_5

  • DOI: https://doi.org/10.1007/978-981-10-6430-2_5

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-6429-6

  • Online ISBN: 978-981-10-6430-2

  • eBook Packages: Computer Science, Computer Science (R0)
