Marked Point Processes for Microarray Data Clustering

  • Conference paper
Data Science

Abstract

Microarray technologies have become a powerful technique for simultaneously monitoring the expression patterns of thousands of genes under different conditions. An important task is then to identify groups of genes that show similar expression profiles and are activated by similar conditions. ClusterMPP (Clustering by Marked Point Process) is a new microarray data clustering algorithm that runs in two steps. The first step detects cluster modes, that is, regions of high observation density in the raw data space. It simulates a proposed Marked Point Process with the well-known Reversible Jump Markov Chain Monte Carlo algorithm, using several moves such as birth and death, and identifies the prototype observations of each cluster. The second step is a K nearest neighbors (KNN) assignment that allocates the remaining observations to the corresponding clusters. We evaluate ClusterMPP on several complex and scalable microarray datasets. The results show the efficiency of ClusterMPP compared to well-known microarray data clustering methods such as K-means, Spectral Clustering, and Mean-Shift.
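The second step described in the abstract, KNN assignment of the remaining observations, might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `knn_assign`, the Euclidean distance, the majority vote, and the toy data are all hypothetical, and the abstract does not say which distance or voting rule ClusterMPP uses. The prototype observations are taken as given here; in ClusterMPP they would come from the Marked Point Process simulation of the first step.

```python
import math
from collections import Counter


def knn_assign(prototypes, proto_labels, remaining, k=3):
    """Label each remaining observation by majority vote among its
    k nearest prototype observations (Euclidean distance, an assumption)."""
    if not 1 <= k <= len(prototypes):
        raise ValueError("k must be between 1 and the number of prototypes")
    labels = []
    for x in remaining:
        # Rank prototype indices from nearest to farthest.
        order = sorted(range(len(prototypes)),
                       key=lambda i: math.dist(x, prototypes[i]))
        # Count the cluster labels of the k nearest prototypes.
        votes = Counter(proto_labels[i] for i in order[:k])
        labels.append(votes.most_common(1)[0][0])
    return labels


# Toy 2-D data (hypothetical): two well-separated groups of prototypes.
prototypes = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
              (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
proto_labels = [0, 0, 0, 1, 1, 1]
remaining = [(0.3, 0.3), (4.8, 4.7), (1.0, 0.5)]

assigned = knn_assign(prototypes, proto_labels, remaining, k=3)
print(assigned)
```

In this sketch every unlabeled observation is compared only against the prototypes; a variant that also lets newly labeled observations vote would be an equally plausible reading of the abstract.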



Acknowledgements

This research was supported in part by the Erasmus Mundus—Al Idrisi II program.

Author information

Corresponding author

Correspondence to Ahmed Moussa.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Henni, K., Alata, O., El Idrissi, A., Vannier, B., Zaoui, L., Moussa, A. (2017). Marked Point Processes for Microarray Data Clustering. In: Palumbo, F., Montanari, A., Vichi, M. (eds) Data Science. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-55723-6_11
