Abstract
Microarray technologies have become a powerful technique for simultaneously monitoring the expression patterns of thousands of genes under different conditions. A key task is to identify groups of genes that show similar expression profiles and are activated by similar conditions. ClusterMPP (Clustering by Marked Point Process) is a new microarray data clustering algorithm that proceeds in two steps. The first step detects cluster modes, that is, regions with a high density of observations in the raw space. It simulates a proposed Marked Point Process with the well-known Reversible Jump Markov Chain Monte Carlo algorithm, using several moves such as birth and death, and thereby identifies prototype observations for each cluster. The second step is a K nearest neighbors (KNN) assignment that allocates the remaining observations to the corresponding clusters. We evaluate ClusterMPP on several complex and scalable microarray datasets. The results show the efficiency of ClusterMPP compared to well-known microarray data clustering methods such as K-means, Spectral Clustering, and Mean-Shift.
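The second step described above can be illustrated with a minimal sketch. It assumes that the first step has already produced labelled prototype observations, and it uses a plain majority vote over Euclidean distances; the function name `knn_assign`, the parameter `k`, and the toy data are hypothetical and are not taken from the chapter, whose actual assignment rule may differ.

```python
from collections import Counter
from math import dist


def knn_assign(prototypes, labels, remaining, k=3):
    """Assign each remaining observation to the majority cluster among its
    k nearest prototype observations (Euclidean distance).

    Illustrative only: the prototypes and their labels are assumed to come
    from a prior mode-detection step, which is not implemented here.
    """
    if len(prototypes) != len(labels):
        raise ValueError("each prototype needs exactly one cluster label")
    k = min(k, len(prototypes))
    assigned = []
    for x in remaining:
        # Rank prototype indices by distance to x and keep the k closest.
        nearest = sorted(range(len(prototypes)),
                         key=lambda i: dist(x, prototypes[i]))[:k]
        votes = Counter(labels[i] for i in nearest)
        # On a tie, most_common returns the label encountered first,
        # which here is the label of the closer prototype.
        assigned.append(votes.most_common(1)[0][0])
    return assigned


# Toy two-condition expression profiles (made-up values).
prototypes = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
              (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels = [0, 0, 0, 1, 1, 1]
remaining = [(0.3, 0.3), (4.7, 5.3), (1.0, 0.5)]
print(knn_assign(prototypes, labels, remaining, k=3))  # -> [0, 1, 0]
```

Keeping the prototypes fixed and voting only among them is the simplest reading of a KNN assignment; a variant that adds each newly assigned observation to the labelled set would be an equally plausible interpretation.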
Acknowledgements
This research was supported in part by the Erasmus Mundus—Al Idrisi II program.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Henni, K., Alata, O., El Idrissi, A., Vannier, B., Zaoui, L., Moussa, A. (2017). Marked Point Processes for Microarray Data Clustering. In: Palumbo, F., Montanari, A., Vichi, M. (eds) Data Science. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-55723-6_11
DOI: https://doi.org/10.1007/978-3-319-55723-6_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-55722-9
Online ISBN: 978-3-319-55723-6
eBook Packages: Mathematics and Statistics (R0)