Abstract
Details of complex event sequences are often not predictable, but their reduced abstract representations are. I study an embedded active learner that can limit its predictions to almost arbitrary computable aspects of spatio-temporal events. It constructs probabilistic algorithms that (1) control interaction with the world, (2) map event sequences to abstract internal representations (IRs), and (3) predict IRs from IRs computed earlier. Its goal is to create novel algorithms generating IRs useful for correct IR predictions, without wasting time on those learned before. This requires an adaptive novelty measure, implemented here by a coevolutionary scheme involving two competing modules that collectively design (initially random) algorithms representing experiments. Using special instructions, the modules can bet on the outcome of IR predictions computed by algorithms they have agreed upon. If their opinions differ, the system checks who is right, punishes the loser (the surprised one), and rewards the winner. An evolutionary or reinforcement learning algorithm forces each module to maximize reward. This motivates each module to lure the other into agreeing upon experiments involving predictions that will surprise it. Since each module can essentially veto experiments it does not consider profitable, the system is motivated to focus on those computable aspects of the environment where both modules still have confident but different opinions. Once both share the same opinion on a particular issue (via the loser's learning process, e.g., the winner is simply copied onto the loser), the winner loses a source of reward — an incentive to shift the focus of interest onto novel experiments. My simulations include an example where surprise-generation of this kind helps to speed up the collection of external reward.
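The core betting dynamic described above can be illustrated with a minimal toy sketch. This is not the chapter's implementation: the world model, the `Module` class, and the one-shot "copy the winner onto the loser" learning rule are simplifying assumptions, and real experiments in the chapter are evolved probabilistic algorithms, not fixed outcome tables.

```python
import random

random.seed(0)

class Module:
    """A betting module holding a confident opinion (a predicted
    binary outcome) for each of several hypothetical experiments."""
    def __init__(self, n_experiments):
        self.opinion = [random.choice([0, 1]) for _ in range(n_experiments)]
        self.reward = 0.0

def true_outcome(exp_id):
    # Hypothetical deterministic "world": outcome is the parity of the id.
    return exp_id % 2

def surprise_step(left, right, exp_id):
    """If opinions differ, run the experiment, reward the winner,
    punish the loser, and copy the winner's opinion onto the loser.
    Returns True iff a bet actually took place."""
    if left.opinion[exp_id] == right.opinion[exp_id]:
        return False                    # agreement: no bet, no reward source
    outcome = true_outcome(exp_id)
    winner, loser = ((left, right) if left.opinion[exp_id] == outcome
                     else (right, left))
    winner.reward += 1.0                # zero-sum transfer to the winner
    loser.reward -= 1.0
    loser.opinion[exp_id] = winner.opinion[exp_id]   # loser learns
    return True

n = 8
a, b = Module(n), Module(n)
bets = 0
# Bet until both modules agree everywhere: at that point the winner has
# lost its reward source, modeling the shift of interest toward novelty.
while any(a.opinion[i] != b.opinion[i] for i in range(n)):
    for i in range(n):
        bets += surprise_step(a, b, i)
print(f"bets placed: {bets}, consensus reached: {a.opinion == b.opinion}")
```

Note that the reward transfer is zero-sum, as in the chapter's scheme: a module can only gain by holding a confident opinion that differs from its opponent's and turns out to be correct, so disagreement (surprise) is the only source of profit.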
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Schmidhuber, J. (2003). Exploring the Predictable. In: Ghosh, A., Tsutsui, S. (eds) Advances in Evolutionary Computing. Natural Computing Series. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18965-4_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-62386-8
Online ISBN: 978-3-642-18965-4