A Novel Cluster Based Algorithm for Outlier Detection

  • Manish Mahajan
  • Santosh Kumar
  • Bhasker Pant
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 810)


Outlier detection is an important and challenging problem in data mining. It has been applied in many areas, such as fraud detection, intrusion detection, health care, and fault detection, where outliers are identified from the characteristics of the data. Many data mining methods find outliers by first forming clusters and then detecting the outliers within those clusters. Clustering methods are essential in data mining and have been applied to problems ranging from small to large scale. A cluster is a group of similar data objects brought together on the basis of their attributes and distinct features. Outlier detection is used to recognize and exclude inconsistent records from the available data sets. This work proposes an algorithm that applies a clustering approach to the given data sets. The proposed algorithm efficiently detects outliers inside the clusters by combining a clustering algorithm with a weight-based approach.
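The abstract does not spell out the weighting scheme, so the following is only a minimal sketch of the general cluster-then-flag idea and not the authors' algorithm. It assumes plain k-means (named in the keywords) with a deterministic start, and it substitutes a hypothetical distance-ratio rule for the weight-based step: a point is flagged when its distance to its own centroid exceeds `factor` times the mean distance in that cluster. The function names, the `factor` parameter, and the sample data are all illustrative assumptions.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means on a list of equal-length tuples."""
    # Assumption: the first k points serve as initial centroids (deterministic).
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


def flag_outliers(points, centroids, labels, factor=2.0):
    """Return sorted indices of points far from their own centroid."""
    distances = [math.dist(p, centroids[lab]) for p, lab in zip(points, labels)]
    outliers = []
    for c in range(len(centroids)):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if not idx:
            continue
        mean_dist = sum(distances[i] for i in idx) / len(idx)
        # Hypothetical rule (stand-in for the paper's weight-based step):
        # flag a point whose distance exceeds factor * mean cluster distance.
        outliers.extend(
            i for i in idx if mean_dist > 0 and distances[i] > factor * mean_dist
        )
    return sorted(outliers)


# Illustrative data: two tight groups plus one stray point at index 8.
data = [
    (1.0, 1.0), (9.0, 9.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2),
    (9.1, 8.8), (8.9, 9.2), (9.2, 9.1), (4.0, 1.0),
]
centroids, labels = kmeans(data, k=2)
outlier_indices = flag_outliers(data, centroids, labels)
print(outlier_indices)  # [8], i.e. the point (4.0, 1.0)
```

On this toy data the stray point is assigned to the lower-left cluster and is the only one flagged. A real weight-based scheme would replace the distance-ratio test with whatever per-point weights the method defines.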


Keywords: Data mining, Outlier, Outlier detection, K-means clustering



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Graphic Era Deemed to be University, Dehradun, India
