Abstract
Outlier detection is an important data mining task that aims to find abnormal or atypical objects in a given data set. Outlier detection techniques have many applications, such as credit card fraud detection and environmental monitoring. In this paper, we propose a new definition of outlier, called the cluster-based outlier. Compared with existing definitions, the cluster-based outlier is better suited to complicated data sets that consist of many clusters with different densities. To detect cluster-based outliers, we first split the given data set into a number of clusters using unsupervised extreme learning machines. We then design a pruning technique to efficiently compute the outliers in each cluster. Finally, the effectiveness and efficiency of the proposed approaches are verified through extensive simulation experiments.
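The two-stage idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the method of the paper: cluster labels are taken as given (the paper obtains them with unsupervised extreme learning machines, which are not reproduced here), the outlier score is taken to be the distance to the k-th nearest neighbour inside the same cluster, and the pruning step is a simple cutoff rule in the spirit of Bay and Schwabacher's distance-based outlier mining. The names `kth_nn_distance` and `top_n_outliers_per_cluster` are hypothetical.

```python
import heapq
import math


def kth_nn_distance(p, others, k, cutoff=0.0):
    """Distance from p to its k-th nearest neighbour among `others`.

    Returns None (pruned) as soon as the running k-th distance drops
    below `cutoff`: the final score can only be smaller still, so p
    cannot enter the current top-n list.
    """
    nearest = []  # max-heap via negated distances: the k smallest so far
    for q in others:
        d = math.dist(p, q)
        if len(nearest) < k:
            heapq.heappush(nearest, -d)
        elif d < -nearest[0]:
            heapq.heapreplace(nearest, -d)
        if len(nearest) == k and -nearest[0] < cutoff:
            return None
    # With fewer than k neighbours, fall back to the farthest one seen.
    return -nearest[0] if nearest else 0.0


def top_n_outliers_per_cluster(points, labels, k=2, n=1):
    """Return {label: [(score, index), ...]} with the n highest scores per cluster.

    Assumption: the score is the k-th nearest-neighbour distance measured
    only against points that share the same cluster label.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    result = {}
    for lab, members in clusters.items():
        top = []  # min-heap of (score, index), at most n entries
        for i in members:
            # Weakest score currently in the top-n list acts as the cutoff.
            cutoff = top[0][0] if len(top) == n else 0.0
            others = (points[j] for j in members if j != i)
            score = kth_nn_distance(points[i], others, k, cutoff)
            if score is None:
                continue  # pruned: cannot beat the current top-n
            if len(top) < n:
                heapq.heappush(top, (score, i))
            elif score > top[0][0]:
                heapq.heapreplace(top, (score, i))
        result[lab] = sorted(top, reverse=True)
    return result


if __name__ == "__main__":
    # A dense cluster (label 0) and a sparse cluster (label 1), each with one stray point.
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5),
           (100, 100), (100, 110), (110, 100), (110, 110), (150, 150)]
    labs = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
    print(top_n_outliers_per_cluster(pts, labs, k=2, n=1))
```

In this toy data the stray point (5, 5) has a second-nearest-neighbour distance of about 6.4, which is smaller than the distance of 10 that ordinary points in the sparse cluster have. A single global threshold would therefore rank it below normal points, whereas scoring within each cluster singles it out. This is the kind of mixed-density situation the abstract refers to.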
Acknowledgments
This work is supported by the National Basic Research 973 Program of China under Grant No. 2012CB316201 and the National Natural Science Foundation of China under Grant Nos. 61033007 and 61472070.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, X., Shen, D., Bai, M., Nie, T., Kou, Y., Yu, G. (2016). Cluster-Based Outlier Detection Using Unsupervised Extreme Learning Machines. In: Cao, J., Mao, K., Wu, J., Lendasse, A. (eds) Proceedings of ELM-2015 Volume 1. Proceedings in Adaptation, Learning and Optimization, vol 6. Springer, Cham. https://doi.org/10.1007/978-3-319-28397-5_11
DOI: https://doi.org/10.1007/978-3-319-28397-5_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28396-8
Online ISBN: 978-3-319-28397-5
eBook Packages: Engineering, Engineering (R0)