Abstract
The paper provides a practical guide to initializing recursive mixture-based clustering of non-negative data. Such data can be modeled by mixtures of uniform, exponential, gamma, and other distributions. Initialization is known to be an important step in starting the mixture estimation algorithm. Within the considered recursive approach, the key point of initialization is the choice of initial statistics of the prior distributions involved. The paper describes several initialization techniques for the mentioned component types, which are useful primarily from a practical point of view.
Acknowledgements
The paper was supported by project GAČR GA15-03564S.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Suzdaleva, E., Nagy, I. (2020). Practical Initialization of Recursive Mixture-Based Clustering for Non-negative Data. In: Gusikhin, O., Madani, K. (eds) Informatics in Control, Automation and Robotics. ICINCO 2017. Lecture Notes in Electrical Engineering, vol 495. Springer, Cham. https://doi.org/10.1007/978-3-030-11292-9_34
DOI: https://doi.org/10.1007/978-3-030-11292-9_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11291-2
Online ISBN: 978-3-030-11292-9
eBook Packages: Intelligent Technologies and Robotics (R0)