Bias Reduction in Outlier Ensembles: The Guessing Game

  • Charu C. Aggarwal
  • Saket Sathe


Bias reduction is a difficult problem in unsupervised settings like outlier detection. The main reason is that bias-reduction algorithms often require a quantification of error in intermediate steps of the algorithm, and no ground truth is available in the unsupervised setting to compute it. A well-known bias-reduction algorithm from classification is boosting. In boosting, the outputs of highly biased detectors are used to identify portions of the decision space in which bias degrades performance, so that subsequent detectors can focus on those portions.
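The reweighting idea behind boosting can be illustrated with a minimal sketch of AdaBoost-style training over decision stumps. This is a hypothetical illustration of the classification-side technique referenced above, not the chapter's method: points misclassified by the current ensemble receive larger weights, so later rounds concentrate on the regions where the biased base detectors fail.

```python
import numpy as np

def stump_predict(X, thresh, sign):
    """Predict +1/-1 by thresholding a single 1-D feature."""
    return sign * np.where(X > thresh, 1, -1)

def adaboost(X, y, n_rounds=10):
    """AdaBoost over exhaustive 1-D decision stumps (illustrative sketch)."""
    n = len(X)
    w = np.full(n, 1.0 / n)                 # uniform initial point weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        # pick the stump (threshold, sign) with the lowest weighted error
        best = None
        for thresh in np.unique(X):
            for sign in (+1, -1):
                err = np.sum(w[stump_predict(X, thresh, sign) != y])
                if best is None or err < best[0]:
                    best = (err, thresh, sign)
        err, thresh, sign = best
        err = max(err, 1e-10)               # guard against division by zero
        alpha = 0.5 * np.log((1 - err) / err)
        pred = stump_predict(X, thresh, sign)
        w *= np.exp(-alpha * y * pred)      # up-weight misclassified points
        w /= w.sum()                        # renormalize to a distribution
        stumps.append((thresh, sign))
        alphas.append(alpha)
    return stumps, alphas

def ensemble_predict(X, stumps, alphas):
    """Weighted vote of the learned stumps."""
    scores = sum(a * stump_predict(X, t, s)
                 for (t, s), a in zip(stumps, alphas))
    return np.sign(scores)
```

The key obstacle in carrying this over to outlier detection is the weight-update step: it requires knowing which points were misclassified, which presupposes labels that the unsupervised setting does not provide.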





Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. IBM T. J. Watson Research Center, Yorktown Heights, USA
