Summary
We describe an algorithm for converting linear support vector machines (SVMs) and other hyperplane-based linear classifiers into a set of nonoverlapping rules that, unlike the original classifier, can be easily interpreted by humans.
Each iteration of the rule extraction algorithm is formulated as a constrained optimization problem that is computationally inexpensive to solve. We discuss various properties of the algorithm and provide a proof of convergence for two different optimization criteria. We demonstrate the performance and speed of the algorithm on linear classifiers learned from real-world datasets, including a medical dataset on the detection of lung cancer from medical images.
The ability to convert SVMs and other “black-box” classifiers into a set of human-understandable rules is critical not only for physician acceptance, but also for lowering the regulatory barrier for medical decision support systems based on such classifiers.
We also present some variations and extensions of the proposed mathematical programming formulations for rule extraction.
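The summary does not state the optimization problem itself, so the following is only an illustrative sketch of what a single rule-extraction step could look like under assumptions that are ours, not the chapter's: features are scaled to [0, 1], all classifier weights are positive, the rule is an axis-aligned box anchored at the origin, and the criterion is the volume of that box. The names `max_volume_rule`, `w`, `gamma`, and `upper` are hypothetical. Under these assumptions, maximizing the product of the side lengths subject to w·x = gamma has a closed-form stationary point at which every term w_i·x_i equals gamma / n.

```python
import numpy as np

def max_volume_rule(w, gamma, upper=1.0):
    """Sketch: corner x of a box [0, x] inside the half-space w.z <= gamma.

    Assumes w > 0 and gamma > 0 (an illustrative simplification).
    The unconstrained maximizer of prod(x) subject to w.x = gamma sets
    every w_i * x_i to gamma / n. Clipping to the feature range keeps the
    box inside the half-space but may no longer be volume-optimal.
    """
    w = np.asarray(w, dtype=float)
    if np.any(w <= 0) or gamma <= 0:
        raise ValueError("this sketch assumes positive weights and threshold")
    n = w.size
    x = gamma / (n * w)           # closed-form stationary point
    return np.minimum(x, upper)   # stay within the assumed [0, upper] range

# Hypothetical classifier: predict the class of interest when w.z <= gamma.
w = np.array([2.0, 1.0, 4.0])
gamma = 1.2
corner = max_volume_rule(w, gamma)

# Render the box as a human-readable conjunction of interval conditions.
rule = " AND ".join(f"0 <= x{i + 1} <= {c:.2f}" for i, c in enumerate(corner))
print(rule)
```

Each extracted box reads as a conjunction of per-feature intervals, which is the sense in which a rule is easier to interpret than the weight vector. An iterative procedure would then repeat a step of this kind on the region not yet covered.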
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Fung, G., Sandilya, S., Rao, R. (2008). Rule Extraction from Linear Support Vector Machines via Mathematical Programming. In: Diederich, J. (eds) Rule Extraction from Support Vector Machines. Studies in Computational Intelligence, vol 80. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75390-2_4
DOI: https://doi.org/10.1007/978-3-540-75390-2_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75389-6
Online ISBN: 978-3-540-75390-2
eBook Packages: Engineering, Engineering (R0)