Abstract
We study learning scenarios in which multiple learners are involved and “nature” imposes constraints that force the predictions of these learners to behave coherently. This is natural in cognitive learning situations, where multiple learning problems co-exist but their predictions are constrained to produce a valid sentence, image, or other domain representation.
Our theory addresses two fundamental issues in computational learning: (1) the apparent ease with which cognitive systems seem to learn concepts, relative to what is predicted by theoretical models, and (2) the robustness of learnable concepts to noise in their input. This type of robustness is especially important in cognitive systems, where multiple concepts are learned and cascaded to produce increasingly complex features.
We extend existing models of concept learning by requiring the target concept to cohere with other concepts from the concept class. Coherency is expressed via a Boolean constraint that the concepts must jointly satisfy. We show how coherency can reduce the complexity of learning and increase the robustness of the learned hypothesis.
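To make the setup concrete, here is a minimal sketch (not the paper's construction) of learning under a coherency constraint: two illustrative concepts f and g over Boolean instances whose joint behaviour is restricted by the Boolean constraint f(x) ⇒ g(x), so a standard mistake-driven Perceptron only ever sees examples drawn from the coherent part of the instance space. The concept definitions, the constraint, and the sample size are all assumptions made for illustration.

import random

random.seed(0)
DIM = 20

def sample_instance():
    return [random.choice([0, 1]) for _ in range(DIM)]

# Two hidden concepts: monotone disjunctions over disjoint variable sets.
def f(x):  # the target concept being learned
    return int(any(x[i] for i in range(5)))

def g(x):  # an auxiliary concept the target must cohere with
    return int(any(x[i] for i in range(5, 10)))

def coherent(x):
    # Boolean constraint imposed by "nature": f(x) implies g(x).
    return (not f(x)) or g(x)

# Draw training examples only from the coherent part of the instance space.
train = []
while len(train) < 500:
    x = sample_instance()
    if coherent(x):
        train.append((x, f(x)))

# Plain Perceptron learning f on the constrained distribution.
w, b, mistakes = [0.0] * DIM, 0.0, 0
for _ in range(5):  # a few passes over the sample
    for x, y in train:
        pred = int(sum(wi * xi for wi, xi in zip(w, x)) + b > 0)
        if pred != y:
            mistakes += 1
            sign = 1 if y == 1 else -1
            w = [wi + sign * xi for wi, xi in zip(w, x)]
            b += sign

print("Perceptron mistakes on the coherent distribution:", mistakes)

The point of the sketch is only that the constraint changes the effective distribution the learner faces: examples violating the constraint never arrive, which is the mechanism the abstract credits for easier and more robust learning.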
© 1999 Springer-Verlag Berlin Heidelberg
Roth, D., Zelenko, D. (1999). Coherent Concepts, Robust Learning. In: Pavelka, J., Tel, G., Bartošek, M. (eds) SOFSEM’99: Theory and Practice of Informatics. SOFSEM 1999. Lecture Notes in Computer Science, vol 1725. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47849-3_16
Print ISBN: 978-3-540-66694-3
Online ISBN: 978-3-540-47849-2