A Signal-to-Noise Ratio Based Optimization Approach for Data Cluster Analysis

  • Renyan Jiang
  • Chaoqun Huang
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 513)

Abstract

Many cluster analysis problems arise in the context of multi-criteria decision analysis. These problems often require determining the number of clusters and their boundaries simultaneously, yet no good method is available for determining the number of clusters automatically. In this paper, we propose a simple and intuitive approach to address this issue. The proposed approach first aggregates a set of multi-criteria or multi-attribute data into a one-dimensional data set. Each data point is then considered in turn as a candidate split point that divides the data set into two groups. The between-group distance and the within-group variances are combined into a clustering quality measure called the signal-to-noise ratio (SNR). The plot of SNR against the data points indicates both the number of clusters and their boundaries: the cluster boundaries lie at the local maxima of the plot, which in turn fixes the number of clusters. The proposed approach can be conveniently implemented in an Excel spreadsheet. Two real-world examples are included to illustrate the appropriateness of the proposed approach. The results are validated by comparing them with those obtained from Gaussian kernel density estimation.
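
The procedure outlined in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the abstract does not give the exact SNR formula or the aggregation rule, so the code assumes a weighted sum for aggregation and takes the SNR of a split to be the difference of the two group means divided by the square root of the sum of the two within-group (population) variances. All function names (`aggregate`, `snr_profile`, `local_maxima`, `snr_clusters`) are hypothetical.

```python
from statistics import mean, pvariance


def aggregate(rows, weights):
    """Collapse each multi-criteria row into one score.

    Assumption: a plain weighted sum; the paper may aggregate differently.
    """
    return [sum(w * x for w, x in zip(weights, row)) for row in rows]


def snr_profile(values):
    """Return the sorted data and the SNR of every two-group split.

    profile[i] is the SNR when the sorted data is split after index i.
    Assumed formula: (mean_right - mean_left) / sqrt(var_left + var_right).
    """
    xs = sorted(values)
    profile = []
    for i in range(len(xs) - 1):
        left, right = xs[: i + 1], xs[i + 1:]
        signal = mean(right) - mean(left)  # between-group distance
        noise = (pvariance(left) + pvariance(right)) ** 0.5  # within-group spread
        profile.append(signal / noise if noise > 0 else float("inf"))
    return xs, profile


def local_maxima(profile):
    """Indices of interior points strictly higher than both neighbours."""
    return [
        i
        for i in range(1, len(profile) - 1)
        if profile[i - 1] < profile[i] > profile[i + 1]
    ]


def snr_clusters(values):
    """Cut the sorted data after every local maximum of the SNR profile."""
    xs, profile = snr_profile(values)
    clusters, start = [], 0
    for i in local_maxima(profile):
        clusters.append(xs[start: i + 1])
        start = i + 1
    clusters.append(xs[start:])
    return clusters
```

For example, calling `snr_clusters([1.0, 1.2, 1.1, 5.0, 5.2, 5.1, 9.0, 9.1, 9.2])` under these assumptions yields three groups, because the SNR profile peaks at the split after 1.2 and again at the split after 5.2. On real data the profile may be noisy, so some smoothing or a minimum peak height would likely be needed; that choice is outside what the abstract specifies.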

Keywords

  • Multi-criteria decision
  • Cluster analysis
  • Signal-to-noise ratio
  • Kernel density estimation

Notes

Acknowledgements

The research was supported by the National Natural Science Foundation of China (No. 71771029).

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Faculty of Automotive and Mechanical Engineering, Changsha University of Science and Technology, Changsha, China
  2. Xinjiang Goldwind Science & Technology Co., Ltd., Beijing, China
