Outlier Detection

  • N. N. R. Ranga Suri
  • Narasimha Murty M
  • G. Athithan
Chapter
Part of the Intelligent Systems Reference Library book series (ISRL, volume 155)

Abstract

This chapter addresses the task of detecting outliers in data from a data mining perspective. It suggests a formal approach to outlier detection and highlights the computational aspects most frequently encountered in this task. It also gives a chapter-by-chapter overview of the book, indicating how it covers the material in this research domain.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • N. N. R. Ranga Suri (1)
  • Narasimha Murty M (2)
  • G. Athithan (3)

  1. Centre for Artificial Intelligence and Robotics (CAIR), Bangalore, India
  2. Department of Computer Science and Automation, Indian Institute of Science (IISc), Bangalore, India
  3. Defence Research and Development Organization (DRDO), New Delhi, India
