Integrating outlier filtering in large margin training



Large margin classifiers such as support vector machines (SVMs) have been applied successfully in various classification tasks. However, their performance may degrade significantly in the presence of outliers. In this paper, we propose a robust SVM formulation that is shown to be less sensitive to outliers. The key idea is to employ an adaptively weighted hinge loss that explicitly incorporates outlier filtering into SVM training, so that outlier filtering and classification are performed simultaneously. The resulting robust SVM formulation is non-convex. We first relax it into a semi-definite program, which admits a global solution. To improve efficiency, we then develop an iterative approach. We have performed experiments using both synthetic and real-world data. The results show that the performance of the standard SVM degrades rapidly as more outliers are included, while the proposed robust SVM training is more stable in the presence of outliers.
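The abstract does not state the exact objective, so the following is only a minimal sketch of what an adaptively weighted hinge loss with outlier filtering could look like. It assumes a linear classifier, a per-sample weight `eta_i` in {0, 1}, and a simple alternating scheme: fit a weighted SVM by subgradient descent, then set `eta_i = 0` for samples whose hinge loss reaches 1. The names `fit_weighted_svm`, `robust_svm` and `eta`, the filtering threshold, and the toy data are all illustrative assumptions, not the paper's formulation or its iterative algorithm.

```python
def decision(w, b, x):
    """Linear decision value f(x) = w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def hinge(w, b, x, y):
    """Hinge loss max(0, 1 - y * f(x)) for one labelled sample."""
    return max(0.0, 1.0 - y * decision(w, b, x))


def fit_weighted_svm(X, Y, eta, C=1.0, steps=3000, lr=0.001):
    """Subgradient descent on 0.5*||w||^2 + C * sum_i eta_i * hinge_i."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(steps):
        grad_w, grad_b = list(w), 0.0  # gradient of the regulariser is w itself
        for x, y, e in zip(X, Y, eta):
            # Only samples with positive weight and positive loss contribute.
            if e > 0.0 and hinge(w, b, x, y) > 0.0:
                grad_w = [g - C * e * y * xj for g, xj in zip(grad_w, x)]
                grad_b -= C * e * y
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def robust_svm(X, Y, C=1.0, rounds=3):
    """Alternate between fitting the classifier and re-weighting samples."""
    eta = [1.0] * len(Y)  # start by trusting every sample
    for _ in range(rounds):
        w, b = fit_weighted_svm(X, Y, eta, C)
        # Assumed filtering rule: drop samples whose hinge loss is 1 or more,
        # i.e. samples on the wrong side of the current decision boundary.
        eta = [1.0 if hinge(w, b, x, y) < 1.0 else 0.0 for x, y in zip(X, Y)]
    return w, b, eta


# Toy data: two clean clusters plus one mislabelled point in the positive cluster.
X = [[2, 2], [2, 3], [3, 2], [3, 3],
     [-2, -2], [-2, -3], [-3, -2], [-3, -3],
     [2.5, 2.5]]
Y = [1, 1, 1, 1, -1, -1, -1, -1, -1]
w, b, eta = robust_svm(X, Y)
print(eta)
```

On this toy set the last sample, which carries the wrong label, is expected to end with weight 0 while the clean samples keep weight 1. Weights are recomputed from scratch in every round, so a sample filtered early can be readmitted later if the boundary moves.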

Key words

Support vector machines; Outlier filter; Semi-definite programming; Multi-stage relaxation




Copyright information

© Journal of Zhejiang University Science Editorial Office and Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  1. College of Communication Engineering, Chongqing University, Chongqing, China
  2. School of Electrical Engineering, Zhejiang University, Hangzhou, China
  3. Department of Computer Science and Engineering, Arizona State University, Tempe, USA
