Abstract
Incorporating prior knowledge into the learning process can significantly improve low-sample classification accuracy. We show how to introduce prior knowledge into linear support vector machines in the form of constraints on the rotation of the normal to the separating hyperplane. Such knowledge frequently arises naturally, e.g., as inhibitory and excitatory influences of input variables. By analyzing their VC and fat-shattering dimensions, we demonstrate that rotationally-constrained classifiers have improved generalization ability. Interestingly, the analysis shows that the large-margin classification framework justifies the use of stronger prior knowledge than the traditional VC framework. Experiments with text categorization and political party affiliation prediction confirm the usefulness of rotational prior knowledge.
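The abstract does not spell out the optimization, so the following is only a minimal sketch of the simplest case it mentions: excitatory and inhibitory influences expressed as sign constraints on the weights, which confine the hyperplane normal to an orthant. It is not the authors' algorithm. The function names, the `signs` encoding, the hyperparameters, and the toy data are all assumptions made for illustration, and projected subgradient descent is used here as a stand-in solver.

```python
def train_sign_constrained_svm(X, y, signs, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM trained by projected subgradient descent (illustrative only).

    signs[j] = +1 forces w[j] >= 0 (assumed "excitatory" feature),
    signs[j] = -1 forces w[j] <= 0 (assumed "inhibitory" feature),
    signs[j] =  0 leaves w[j] unconstrained.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        # Subgradient of (lam/2)*||w||^2 + mean hinge loss.
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # point violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
        # Projection step: clamp each constrained weight back into its
        # allowed half-line, keeping the normal inside the prior orthant.
        w = [max(wj, 0.0) if s > 0 else min(wj, 0.0) if s < 0 else wj
             for wj, s in zip(w, signs)]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical toy data: feature 0 is assumed excitatory, feature 1 inhibitory.
X = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
y = [1, 1, -1, -1]
w, b = train_sign_constrained_svm(X, y, signs=[1, -1])
```

The projection is a plain clamp because the Euclidean projection onto an orthant acts on each coordinate separately. More general rotational constraints, such as limiting the angle between the normal and a prior direction, would need a different projection or a constrained quadratic-programming solver.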
Keywords
- Support Vector Machine
- Prior Knowledge
- Neural Information Processing System
- Generalization Error
- Hypothesis Space
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Epshteyn, A., DeJong, G. (2005). Rotational Prior Knowledge for SVMs. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds) Machine Learning: ECML 2005. ECML 2005. Lecture Notes in Computer Science, vol 3720. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564096_15
DOI: https://doi.org/10.1007/11564096_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29243-2
Online ISBN: 978-3-540-31692-3
eBook Packages: Computer Science (R0)