Fast and unsupervised outlier removal by recurrent adaptive reconstruction extreme learning machine

  • Wang Siqi
  • Liu Qiang
  • Guo Xifeng
  • Zhu En
  • Yin Jianping
Original Article


Outlier removal is vital in machine learning. As massive amounts of unlabeled data are generated rapidly today, eliminating outliers from noisy data in a fast and unsupervised manner is gaining increasing attention in practical applications. This paper tackles this challenging problem by proposing a novel Recurrent Adaptive Reconstruction Extreme Learning Machine (RAR-ELM). Specifically, given a noisy data collection, RAR-ELM recurrently learns to reconstruct the data and automatically excludes data with high reconstruction errors as outliers through a novel adaptive labeling mechanism. Compared with existing methods, the proposed RAR-ELM has three major merits. First, RAR-ELM inherits the fast and sound learning property of the original extreme learning machine (ELM): it runs tens to hundreds of times faster than existing methods while achieving superior or comparable outlier removal performance, which makes it particularly suitable for application scenarios such as real-time outlier removal. Second, instead of requiring a decision threshold to be specified in advance, RAR-ELM adaptively finds a reasonable decision threshold when processing data with different proportions of outliers, which is vital for unsupervised outlier removal, where no prior knowledge of the outliers in the data is available. Third, we also propose Online Sequential RAR-ELM (OS-RAR-ELM), which can be run in an online or sequential mode and makes RAR-ELM easily applicable to massive noisy data or online sequential data. Extensive experiments on various datasets show that the proposed RAR-ELM achieves faster and better unsupervised outlier removal than existing methods.
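The loop described above (reconstruct, score by reconstruction error, relabel, repeat) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; it only assumes the general idea stated in the abstract. The function names, the tanh activation, the ridge-regularized closed-form solution, the hidden-layer size, and in particular the Otsu-style threshold used as a stand-in for the adaptive labeling mechanism are all assumptions made for illustration.

```python
import numpy as np


def elm_reconstruction_errors(X_train, X_all, n_hidden=32, ridge=1e-3, seed=None):
    """Fit an ELM-style autoencoder on X_train; return squared errors on X_all."""
    rng = np.random.default_rng(seed)
    d = X_train.shape[1]
    W = rng.standard_normal((d, n_hidden))  # random input weights, never trained
    b = rng.standard_normal(n_hidden)       # random biases, never trained
    H = np.tanh(X_train @ W + b)            # hidden-layer activations
    # Output weights in closed form (ridge-regularized least squares).
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ X_train)
    recon = np.tanh(X_all @ W + b) @ beta
    return np.sum((X_all - recon) ** 2, axis=1)


def otsu_threshold(values, n_bins=64):
    """Assumed stand-in for adaptive labeling: maximize between-class variance."""
    hist, edges = np.histogram(values, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(hist)             # sample count at or below each candidate cut
    w1 = hist.sum() - w0             # sample count above each candidate cut
    s0 = np.cumsum(hist * centers)   # running sum of values below the cut
    s1 = s0[-1] - s0                 # sum of values above the cut
    valid = (w0 > 0) & (w1 > 0)
    between = np.zeros(n_bins)
    between[valid] = (w0[valid] * w1[valid]
                      * (s0[valid] / w0[valid] - s1[valid] / w1[valid]) ** 2)
    return edges[np.argmax(between) + 1]


def recurrent_outlier_removal(X, n_rounds=5, **elm_kwargs):
    """Alternate between fitting on current inliers and relabeling all samples."""
    inlier = np.ones(len(X), dtype=bool)
    for _ in range(n_rounds):
        errors = elm_reconstruction_errors(X[inlier], X, **elm_kwargs)
        new_inlier = errors <= otsu_threshold(errors)
        if np.array_equal(new_inlier, inlier):
            break  # labels stopped changing
        inlier = new_inlier
    return ~inlier  # True marks a sample flagged as an outlier
```

Calling `recurrent_outlier_removal(X, seed=0)` on an array of shape `(n_samples, n_features)` returns a boolean mask of flagged samples. Because the threshold is recomputed from the error distribution in every round, no outlier proportion has to be supplied, which is the property the abstract attributes to adaptive labeling; how well this particular stand-in performs on real data is not something the sketch establishes.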


Outlier removal · Unsupervised learning · Extreme learning machine · Adaptive labeling




Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. College of Computer, National University of Defense Technology, Hunan, China
  2. Dongguan University of Technology, Guangdong, China
