Abstract
A fundamental question in learning theory is the quantification of the basic tradeoff between the complexity of a model and its predictive accuracy. One valid way of quantifying this tradeoff, known as the “Information Bottleneck”, is to measure both the complexity of the model and its prediction accuracy by using Shannon’s mutual information. In this paper we show that the Information Bottleneck framework answers a well-defined and known coding problem and at the same time provides a general relationship between complexity and prediction accuracy, measured by mutual information. We study the nature of this complexity-accuracy tradeoff and discuss some of its theoretical properties. Furthermore, we present relations to classical information-theoretic problems, such as rate-distortion theory, the cost-capacity tradeoff, and source coding with side information.
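Both sides of the tradeoff described in the abstract are measured by Shannon mutual information. As an illustrative sketch (not taken from the paper), the snippet below computes I(X;Y) from a small, hypothetical joint distribution; by the data-processing inequality, this value upper-bounds the accuracy term I(T;Y) of any bottleneck representation T of X.

```python
import numpy as np

def mutual_information(p_xy):
    """I(X;Y) in bits, given a joint distribution p(x, y) as a 2-D array."""
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal p(x), column vector
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal p(y), row vector
    mask = p_xy > 0                         # 0 * log 0 is taken as 0
    return float((p_xy[mask] * np.log2(p_xy[mask] / (p_x @ p_y)[mask])).sum())

# Toy joint distribution over X (rows) and Y (columns); values are illustrative.
p_xy = np.array([[0.25, 0.05],
                 [0.05, 0.25],
                 [0.10, 0.30]])

i_xy = mutual_information(p_xy)

# The Information Bottleneck functional trades compression against prediction:
#   L[p(t|x)] = I(T;X) - beta * I(T;Y)
# Since T depends on Y only through X, I(T;Y) <= I(X;Y), so i_xy caps the
# achievable accuracy term for every representation T.
print(f"I(X;Y) = {i_xy:.4f} bits")
```

Varying the Lagrange multiplier beta traces out the complexity-accuracy curve the paper studies: small beta favors heavy compression (low I(T;X)), large beta favors prediction (I(T;Y) approaching I(X;Y)).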
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
Cite this paper
Gilad-Bachrach, R., Navot, A., Tishby, N. (2003). An Information Theoretic Tradeoff between Complexity and Accuracy. In: Schölkopf, B., Warmuth, M.K. (eds) Learning Theory and Kernel Machines. Lecture Notes in Computer Science, vol 2777. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45167-9_43
Print ISBN: 978-3-540-40720-1
Online ISBN: 978-3-540-45167-9