Abstract
Many cluster analysis problems arise in the context of multi-criteria decision analysis. These problems often require the number of clusters and their boundaries to be determined simultaneously, yet no well-established method determines the number of clusters automatically. In this paper, we propose a simple and intuitive approach to address this issue. The proposed approach first aggregates a set of multi-criteria or multi-attribute data into a one-dimensional data set. Each data point is then considered in turn as a candidate split point that divides the data set into two groups. The between-groups distance and the within-group variances are combined into a clustering quality measure called the signal-to-noise ratio (SNR). The plot of SNR against the candidate split points indicates the number of clusters and their boundaries. Specifically, the cluster boundaries lie at the local maxima of the plot, which also determines the number of clusters. The proposed approach can be conveniently implemented in an Excel spreadsheet. Two real-world examples are included to illustrate the appropriateness of the proposed approach. The results are also validated by comparison with the results obtained from Gaussian kernel density estimation.
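The split-and-score procedure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation. The abstract does not give the SNR formula, so the sketch assumes SNR = (difference of group means) / sqrt(sum of within-group population variances). The function names `snr_profile` and `local_maxima` are hypothetical, and the input is assumed to be already aggregated into one dimension.

```python
from statistics import mean, pvariance


def snr_profile(values):
    """Return (split_value, snr) pairs for each admissible split of 1-D data.

    split_value is the largest point of the lower group.
    """
    xs = sorted(values)
    profile = []
    # Keep at least two points per group so both variances are meaningful.
    for i in range(2, len(xs) - 1):
        left, right = xs[:i], xs[i:]
        signal = mean(right) - mean(left)  # between-groups distance
        # ASSUMPTION: noise = sqrt(sum of within-group population variances);
        # the paper's exact SNR definition is not stated in the abstract.
        noise = (pvariance(left) + pvariance(right)) ** 0.5
        snr = signal / noise if noise > 0 else float("inf")
        profile.append((xs[i - 1], snr))
    return profile


def local_maxima(profile):
    """Return the interior (split_value, snr) pairs that exceed both neighbours."""
    return [
        profile[j]
        for j in range(1, len(profile) - 1)
        if profile[j][1] > profile[j - 1][1] and profile[j][1] > profile[j + 1][1]
    ]
```

For example, `local_maxima(snr_profile([1.0, 1.1, 1.2, 1.3, 5.0, 5.1, 5.2, 5.3]))` reports a single boundary at 1.3, which corresponds to two clusters. Under this assumed formula, the number of clusters is one more than the number of local maxima found.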
Acknowledgements
The research was supported by the National Natural Science Foundation of China (No. 71771029).
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Jiang, R., Huang, C. (2019). A Signal-to-Noise Ratio Based Optimization Approach for Data Cluster Analysis. In: Kim, K., Kim, H. (eds) Mobile and Wireless Technology 2018. ICMWT 2018. Lecture Notes in Electrical Engineering, vol 513. Springer, Singapore. https://doi.org/10.1007/978-981-13-1059-1_25
DOI: https://doi.org/10.1007/978-981-13-1059-1_25
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1058-4
Online ISBN: 978-981-13-1059-1
eBook Packages: Engineering, Engineering (R0)