Variance Reduction in Outlier Ensembles

Abstract

The theoretical discussion in the previous chapter establishes that the expected error of an outlier detector can be decomposed into a squared-bias term and a variance term. Ensemble methods attempt to reduce the overall error by reducing either the squared bias or the variance.
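In symbols, a standard statement of this decomposition is the following (the notation is chosen here for illustration: f(X) denotes the unknown ideal score of a point X, and g(X, D) the score produced by a detector trained on data set D):

$$
\mathbb{E}_{\mathcal{D}}\big[(g(X,\mathcal{D}) - f(X))^2\big] \;=\; \underbrace{\big(\mathbb{E}_{\mathcal{D}}[g(X,\mathcal{D})] - f(X)\big)^2}_{\text{squared bias}} \;+\; \underbrace{\mathbb{E}_{\mathcal{D}}\big[\big(g(X,\mathcal{D}) - \mathbb{E}_{\mathcal{D}}[g(X,\mathcal{D})]\big)^2\big]}_{\text{variance}}
$$

Averaging the scores of m (approximately independent) randomized base detectors leaves the bias essentially unchanged but can shrink the variance term by up to a factor of m. A minimal sketch of this idea in Python, assuming a k-nearest-neighbor distance score as the base detector and random subsamples as the source of randomness; the function names and parameter choices are illustrative, not the chapter's specific algorithms:

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def knn_score(sample, points, k=5):
        # Base detector: distance of each point to its k-th nearest neighbor
        # in the training sample (larger distance = more outlying).
        nn = NearestNeighbors(n_neighbors=k).fit(sample)
        dist, _ = nn.kneighbors(points)
        return dist[:, -1]

    def averaged_score(data, m=25, subsample=0.1, k=5, seed=0):
        # Variance reduction by averaging: score every point against m random
        # subsamples and average the results. (For brevity, this sketch ignores
        # the self-neighbor correction for points that fall inside a subsample.)
        rng = np.random.default_rng(seed)
        n = data.shape[0]
        size = max(k + 1, int(subsample * n))
        scores = [knn_score(data[rng.choice(n, size=size, replace=False)],
                            data, k=k)
                  for _ in range(m)]
        return np.mean(scores, axis=0)

In practice, the base scores are usually standardized before averaging, so that scores produced by different randomized runs are on a comparable scale.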

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

IBM T. J. Watson Research Center, Yorktown Heights, USA
