While it is customary to use the terms “computational theory of mind” and “computationalism” interchangeably, this custom may be misleading. Theories are usually thought to be not only explanatory but also predictive. Computationalism, however, remains too abstract to predict almost any non-trivial fact about biological brains. Its main assumption is that nervous systems are computers and that cognition can be explained in terms of nervous system computation. This assumption substantiates the claim that biological agents engage only in tractable computation (Frixione 2001; Rooij et al. 2010). But it is not of much use in explaining the phenomena typically listed in cognitive psychology textbooks.
Computationalism is not a theory but a research program that is helpful in creating both theories and individual models of cognitive phenomena. While the notion of a research program was introduced into the philosophy of science by Lakatos (1976) to account for scientific change and rationality, one part of his account does not fit the dynamics of research well. Lakatos assumes that there is an unchangeable core in any given research program. Laudan (1977) has shown that this is rarely the case: programs usually change over time. Moreover, Laudan has stressed that the fruitfulness of research programs lies not only in predicting empirical facts, but also in theoretical problem solving, which Lakatos ignored altogether.
To make it clear that Laudan’s account is embraced here, his notion of research tradition will be used instead. He cites three characteristics of research traditions (Laudan 1977, pp. 78–79). First, “every research tradition has a number of specific theories which exemplify and partially constitute it.” Indeed, there are multiple approaches to brain computation, from classical symbolic computation, to connectionism, to dynamic spiking neural networks. Second, they exhibit “certain metaphysical and methodological commitments which, as an ensemble, individuate the research tradition and distinguish it from others.” I analyze these methodological and metaphysical commitments below in this section. Third, traditions go through a number of formulations and usually have a long history, which is easily evidenced both in changes of basic assumptions and in the development of computational models of working memory, as described in Sect. 3. Early models were fairly sketchy, and computational talk could be understood as merely metaphorical. This is no longer the case.
One could continue to talk of the computational theory of mind while stressing that the term itself is an umbrella term. However, it is less misleading to think of computationalism as a diverse research tradition composed of multiple, historically variable computational theories of mind (or brain). By conflating the research tradition with one of its early theories, one could be tempted to reject the whole tradition. In particular, one might conflate computationalism with one of its versions, usually dubbed “GOFAI,” or Good Old-Fashioned Artificial Intelligence. John Haugeland defined GOFAI as the assumption that intelligence is based on the capacity to think reasonably, and that reasonable thinking amounts to a faculty for internal “automatic” symbol manipulation (Haugeland 1985, p. 113). Haugeland’s formulation is thought to be related, for example by Margaret A. Boden (2014), to the physical symbol system hypothesis defended by Allen Newell and Herbert Simon (Newell 1980; Newell and Simon 1976).
The physical symbol system hypothesis states that a “physical symbol system has the necessary and sufficient means for general intelligent action” (Newell and Simon 1976, p. 116). The problem with taking this claim at face value is that even universal computers may run extremely simple software throughout their lifetime. Such computers may have precious little to do with what we call “general intelligent action,” in particular because some of them may take no inputs at all. This is also why Newell and Simon elucidate what they mean by “sufficient means”: “any physical symbol system of sufficient size can be organized further to exhibit general intelligence” (Newell and Simon 1976, p. 116). In other words, while it may seem that computationalism claims that cognition is any computation, the proponents of this idea did not mean that all computation exhibits cognitive features; only some physical computation exhibits general intelligence. Thus, it is not just internal symbol manipulation; this manipulation has to be, in some sense, organized further.
The motivation for computationalism can be found in a number of texts on computer simulation in psychology. For example, Apter (1970, pp. 17–18), an early defender of the simulation approach taken by Newell and Simon, listed four ways that computers resemble brains:
(1) Both are general-purpose devices.
(2) Both are information-processing devices.
(3) Both may incorporate models within themselves.
(4) Both achieve their intellectual excellence by carrying out large numbers of primitive operations.
Let me, therefore, analyze the ways in which these properties are understood by Apter to show that the analogy has to be rephrased in modern terms. Then I will also briefly analyze influential methodological advice that has shaped computational modeling in psychology and neuroscience.
Flexibility
Both computers and brains are general-purpose devices: they can operate in a variety of ways, which makes them adaptable to their environment, and this, in turn, can explain intelligence. This is exactly why Newell and Simon made universal computation part of their hypothesis. Newell’s rendering of the argument deserves some attention:
Central to universality is flexibility of behavior. However, it is not enough just to produce any output behavior; the behavior must be responsive to the inputs. Thus, a universal machine is one that can produce an arbitrary input–output function; that is, that can produce any dependence of output on input (Newell 1980, p. 147).
But are cognitive agents actually unboundedly flexible in their behavior? Is there evidence that, say, human beings are able to adapt to any environmental constraints? To demonstrate this empirically, one would have to actually observe an infinite number of behaviors displayed by human beings, which is, alas, physically impossible for finite observers. The only evidence we have is that people sometimes do learn to adapt and to control their environment. But sometimes they fall short of recognizing more complex causal relationships, because they rely too much on the received feedback (for a discussion and review of experiments, see Osman 2014). In other words, the hypothesis that behavior is infinitely flexible not only cannot be confirmed, but also seems to be disconfirmed. This is shown by a number of phenomena related to our illusions of control or illusions of chaos, as well as by highly stereotypical behavior triggered by situational factors (Wood and Neal 2007). Human behavior may be vastly flexible, but it is not infinitely flexible. This is even more true of animal behavior, which is sometimes highly stereotypical.
However, no physical computer ever literally instantiates a universal Turing machine. We only idealize finite-state machines as universal Turing machines for the sake of simplicity. As Gary Drescher states: “There are no precise rules governing the suitability of this idealization; roughly, the idealization is appropriate when a finite-state automaton has a large array of state elements that it uses more or less uniformly that thereby serve as general memory” (Drescher 1991, p. 38). PCs, smartphones, and tablets are all, in reality, just finite-state automata that can compute a large class of the functions a universal Turing machine computes. They do not compute all of them: their memory and input are not really unbounded, hence they are not physically universal machines. Still, these finite-state machines can be made quite flexible by running various kinds of routines in a given situation.
Newell admits that real systems are bounded in their resources, but retorts that “the structural requirements for universality are not dependent on unbounded memory, only whether the absolute maximal class of input–output functions can be realized” (Newell 1980, p. 178). What counts is not unbounded, but “sufficiently open memory” and “the structure of the system,” because for “every machine with unbounded memory, there are machines with identical structure, but bounded memory, that behave in an identical fashion on all environments (or problems) below a certain size or complexity” (Newell 1980, p. 161). However, from the mathematical point of view, these machines are not technically universal, even if they may be usefully idealized as such.
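Newell’s point can be fixed with a deliberately simple sketch (mine, not his formalism): two machines that share the same transition structure, one with unbounded memory and one with a fixed memory bound, behave identically on every input small enough to fit within the bound.

```python
# A minimal illustration (not Newell's own formalism): the same increment routine
# run with unbounded memory and with a fixed bound of 16 cells agrees on all
# inputs below that bound; only larger inputs reveal the difference.

def increment_binary(bits, bound=None):
    """Increment a little-endian list of binary digits; cap memory at `bound` cells."""
    bits = list(bits)
    i = 0
    while True:
        if i == len(bits):
            if bound is not None and len(bits) >= bound:
                raise MemoryError("out of bounded memory")
            bits.append(0)              # the unbounded machine simply extends its tape
        if bits[i] == 0:
            bits[i] = 1
            return bits
        bits[i] = 0                     # carry over to the next cell
        i += 1

small = [1, 1, 1, 0]                    # 7 in little-endian binary
assert increment_binary(small, bound=16) == increment_binary(small)

big = [1] * 16                          # 2**16 - 1: incrementing it needs a 17th cell
try:
    increment_binary(big, bound=16)
except MemoryError:
    print("the bounded machine fails only beyond its memory bound")
```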
Other theorists were also at pains to show that cognition cannot happen without universal computation. For example, Pylyshyn (1984) claimed not only that cognition requires flexibility, which in turn requires universality, but also that semantic interpretability requires universality. In other words, he claimed that finite-state machines (FSMs) cannot be semantically interpretable; this argument, however, has been rebutted by Nelson (1987). Moreover, Pylyshyn relies on the argument from natural language acquisition, which claims that natural languages require universal computation (Pylyshyn 1973, p. 26; 1984, pp. 87–88). The argument is stated thus: at least some natural languages that human beings acquire are so complex that they require a universal Turing machine to process them (technically speaking, they are at least context-free languages). However, the empirical evidence for this claim is ambiguous and scarce (Shieber 1985).
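The flavor of such arguments can be conveyed with a toy case (my illustration, not Pylyshyn’s formulation): the “mirror” pattern aⁿbⁿ, often used as a stand-in for center-embedded clauses, cannot be recognized by any machine with a fixed number of states, whereas a single unbounded counter, i.e., a rudimentary pushdown-style memory, suffices.

```python
# A toy version of the language argument: the pattern a^n b^n cannot be
# recognized with a fixed number of states, but one unbounded counter suffices.

def accepts_with_counter(s):
    """Recognize a^n b^n (n >= 0) with an unbounded counter (pushdown-like memory)."""
    count, seen_b = 0, False
    for ch in s:
        if ch == 'a':
            if seen_b:
                return False
            count += 1
        elif ch == 'b':
            seen_b = True
            count -= 1
            if count < 0:
                return False
        else:
            return False
    return count == 0

def accepts_with_k_states(s, k):
    """The same recognizer, but with its counter saturating at k.
    Any finite-state machine must conflate counts somewhere; saturation at k
    is one concrete way this happens."""
    count, seen_b = 0, False
    for ch in s:
        if ch == 'a':
            if seen_b:
                return False
            count = min(count + 1, k)   # finite memory: counts beyond k collapse
        elif ch == 'b':
            seen_b = True
            count -= 1
            if count < 0:
                return False
        else:
            return False
    return count == 0

k = 5
bad_string = 'a' * (k + 1) + 'b' * k            # not in the language
print(accepts_with_counter(bad_string))          # False: the counter gets it right
print(accepts_with_k_states(bad_string, k))      # True: the k-state machine is fooled
```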
Additionally, systems relying on universal computation may seem brittle in some environments, lacking complex emotional motivation and being “single-minded,” which has been one of the major worries of critics of computationalism for years (Neisser 1963). This brittleness is now informally referred to as the inability to solve the frame problem (Dennett 1984; Wheeler 2005). Universal computation does not solve the brittleness problem by itself (even if the frame problem in the proper technical sense of the term is, in principle, solvable; cf. Shanahan 1997). As Newell and Simon would probably repeat, this computation has to be “further organized.”
To sum up, the arguments that were supposed to justify the explication of “general-purpose computation” in terms of universality were unsuccessful. There is no empirical evidence that, say, human memory is actually unbounded, or that it could be unbounded if human beings were to use external resources such as pencil and paper [as suggested by Turing (1937) in his initial idealization of human computers]. Universal computation is neither necessary nor sufficient for behavioral flexibility.
Information-Processing
The second aspect of the analogy is that both computers and minds (or brains) process information. Some contemporary theorists, in particular those inspired by the ecological psychology of James J. Gibson, reject the notion that people process information just like computers (Chemero 2003). But their criticism is related more to the issue of representational information (see Sect. 2.3). The same is at stake with regard to the enactive account of cognition. Enactivists have claimed that autonomous biological systems are all cognitive, but that cognition does not involve any information intake: “The notions of acquisition of representations of the environment or of acquisition of information about the environment in relation to learning, do not represent any aspect of the operation of the nervous system” (Maturana and Varela 1980, p. 133).
However, the information at stake need not be semantic information; if it were, this part of the analogy would already be included in the capacity for incorporating internal models. The notion of information sufficient to express this aspect of the analogy is structural information content, as defined by MacKay (1969): the minimum equivalent number of independent features which must be specified, i.e., the degrees of freedom or logical dimensionality of the information vehicle (whether or not it represents anything). This kind of information is pervasive and present in any causal process (Collier 1999). One can speak of structural information content in living beings as well.
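Read this way, structural information content admits an almost trivial operationalization. The sketch below is my simplification of MacKay’s notion: it merely counts the independently specifiable features of a vehicle, regardless of whether the vehicle represents anything.

```python
# A minimal sketch of MacKay-style structural information content, read here
# simply as the number of independently specifiable features (degrees of
# freedom) of a vehicle; what each feature represents, if anything, is irrelevant.

def structural_information_content(feature_dims):
    """Count the independent features needed to specify a state of the vehicle.

    feature_dims: mapping from feature name to the number of distinguishable
    values that feature can take. On this simplified reading, the number of
    values matters for metrical, not structural, information, so only the
    number of independent features is counted.
    """
    return len(feature_dims)

# A toy 'retinal' vehicle: an 8x8 grid of on/off elements has 64 degrees of
# freedom, whether it currently registers an edge, a face, or pure noise.
retina = {f"cell_{row}_{col}": 2 for row in range(8) for col in range(8)}
print(structural_information_content(retina))        # 64

# A single analog voltage sampled at 3 time points has 3 degrees of freedom,
# even though each sample can take many values.
voltage_trace = {f"sample_{t}": 1024 for t in range(3)}
print(structural_information_content(voltage_trace))  # 3
```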
To sum up, it is not controversial to say that brains and computers process structural information, which may, or may not, be digital. It is much more controversial, however, to say that nervous systems manipulate internal symbols that represent reality.
Modeling Reality
The third aspect of the analogy between brains and computers is that both can create models of reality. In other words, one of the basic functions of the brain is to build models of the environment. Apter (1970, p. 18) points out that this was particularly stressed by J. Z. Young, whose work is the focus of the next section on working memory, and by Kenneth Craik (1943), one of the early proponents of structural representations:
By a model we thus mean any physical or chemical system which has a similar relation-structure to that of the process it imitates. By ‘relation-structure’ I do not mean some obscure non-physical entity which attends the model, but the fact that it is a physical working model which works in the same way as the process it parallels, in the aspects under consideration at any moment. Thus, the model need not resemble the real object pictorially; Kelvin’s tide-predictor, which consists of a number of pulleys on levers, does not resemble tide in appearance, but it works in the same way in certain essential respects (Craik 1943, p. 51)
Craik’s approach is similar to what is now called structural representations (Cummins 1996; Gładziejewski and Miłkowski 2017; O’Brien and Opie 1999; Ramsey 2007). However, not all proponents of computational approaches to the study of cognition embrace the structural account of representations.
Notably, many computationalists rely on the notion of symbol, whose most important feature (in contrast to structural representations) is that the relationship between its vehicle and its content is arbitrary (Fodor 1975; Haugeland 1985; Newell 1980; Simon 1993). The notion of the symbol used by some of these theorists, however, remains ambiguous (Dennett 1993; Steels 2008). Sometimes Newell and Simon talked of symbols in the sense of computability theory, where they are just distinguishable entities in a computing device (e.g., on a Turing machine tape); sometimes they thought of Lisp symbols, which are pointers to other data structures; and sometimes of elements that designate or denote elements of reality (for further analysis and examples, see Miłkowski 2016a). While it is understandable that their intention was to show that computers can operate not only on numbers but on arbitrary elements, and that the obvious term for such an arbitrary element is symbol, their terminological confusion makes it difficult to evaluate what they meant by symbol in any given context.
At the same time, symbolic and structural representations are thought to be opposites, because the vehicles of structural representations are not arbitrarily related to their contents (Sloman 1978). Suppose that computation is the manipulation of internal symbols. Is processing structural models then not computation? The notion of symbol is too ambiguous to provide a single answer. It is far from clear whether even the broadest notion of the symbol, understood as a distinct element of the Turing machine’s alphabet, could apply to analog computational models, which were prominent among early theorists of artificial intelligence and computer models in neuroscience (Von Neumann 1958; Young 1964).
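The contrast can be made vivid with a small sketch (my illustration, with invented example values): an arbitrary symbol is linked to its content by stipulation alone, whereas a structural model can be exploited because its internal relations mirror relations among the things it stands in for, as in Kelvin’s tide predictor.

```python
# A small illustration of the contrast drawn above (mine, not Sloman's or
# Craik's formalism). A 'symbolic' representation links an arbitrary token to
# its content by stipulation; a 'structural' representation is exploitable
# because its internal relations mirror relations among the things represented.

# Symbolic: the tokens could be swapped for any others without loss, since the
# vehicles themselves carry no relation-structure.
symbol_table = {"sym_0041": "high tide at noon", "sym_0042": "low tide at six"}

# Structural: a toy 'tide predictor' whose ordering of internal states mirrors
# the ordering of the water levels it stands in for (values are made up).
tide_model = {"06:00": 0.4, "09:00": 1.1, "12:00": 1.9, "15:00": 1.2}
observed   = {"06:00": 0.5, "09:00": 1.2, "12:00": 2.0, "15:00": 1.3}

def preserves_order(model, world):
    """Check that whenever the model ranks one time above another, the world does too."""
    times = list(model)
    return all(
        (model[t1] < model[t2]) == (world[t1] < world[t2])
        for t1 in times for t2 in times if t1 != t2
    )

# The structural model can be 'read off' by exploiting its relation-structure;
# the symbol table can only be looked up.
print(preserves_order(tide_model, observed))   # True
print(symbol_table["sym_0041"])                # content fixed by arbitrary convention
```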
Moreover, unclear ideas about intentionality or aboutness, analyzed in terms of syntactical relationships between symbols by many proponents of symbolic computation (Haugeland 1985), might engender the appearance that computationalists confuse syntax with semantics (Harnad 1990; Searle 1980). The computationalist may say that one can treat syntax as semantics in an idealizing fashion (Dennett 1987; Haugeland 1985). But this idealization is not forced by computationalism itself. Craik and Young had little reason to think of their models as arbitrary pieces of syntax.
This part of early computationalism is in need of overhaul and reappraisal.
Complex Behavior Out of Simple Building Blocks
The fourth part of the analogy is that both computers and brains contain a large number of similar building blocks that perform simple operations and that, taken together, give rise to complex behavior. This was part of the first computational model of the nervous system, which treated it as a large collection of logic gates (McCulloch and Pitts 1943). It has remained a part of the connectionist research program, which stresses that it is the connections between (computational) neurons that count the most, not individual neurons.
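The McCulloch–Pitts picture is easy to make concrete: each unit merely sums its binary inputs and fires when a threshold is reached, and yet such units already compose into logic gates and, by further composition, into circuits. The following minimal sketch is my own illustration, not a reconstruction of their 1943 calculus.

```python
# A minimal sketch of the simple-building-blocks idea: threshold units acting
# as logic gates, composed into a small circuit.

def mp_unit(inputs, weights, threshold):
    """A McCulloch-Pitts-style unit: fire (1) iff the weighted sum meets the threshold."""
    return int(sum(w * x for w, x in zip(weights, inputs)) >= threshold)

def AND(a, b):
    return mp_unit([a, b], weights=[1, 1], threshold=2)

def OR(a, b):
    return mp_unit([a, b], weights=[1, 1], threshold=1)

def NOT(a):
    return mp_unit([a], weights=[-1], threshold=0)

def XOR(a, b):
    # XOR is not computable by a single unit, but a small circuit of them suffices:
    return AND(OR(a, b), NOT(AND(a, b)))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", XOR(a, b))
```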
However, it is not a conceptual truth that individual neurons can perform only simple computations. Some argue that single neurons may be suited to efficient computation over tree-like data structures (Fitch 2014). If this is true, it does not mean that computationalism has come to a grinding halt. It just shows that it is not only the connections between neurons that count in neural computation. Therefore, we need to revise this part of the analogy as well.
From Analogy to Strict Algorithms
The notion that mere analogy is not enough was a major methodological breakthrough in computational neuroscience: the function to be modeled should be described in a strict fashion. This influential methodological advice for the computational study of cognition was formulated by David Marr (1982) in his three-level account of computational explanation.
At the first level, called computational, the explanation deals with the operations performed by the system and the reasons it performs them. In other words, one assumes that the system in question is flexibly adapted to some environment (see Sect. 2.1), in which some kind of input–output transformation is useful (see Sect. 2.2). At the second level, one specifies the representation of the input and output, and the algorithm for transforming the input into the output. Although Marr’s approach to representation may be interpreted variously, it clearly mirrors the assumption that cognitive computational systems model their worlds (Sect. 2.3) and that they perform certain basic operations (Sect. 2.4). These basic operations and representations must then be shown to be biologically realized at the last Marrian level, the level of implementation. Only then is the explanation complete.
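To see how the three levels divide the explanatory labor, consider Marr’s own cash-register example of addition. The code below is merely an illustrative sketch of mine, with the level distinctions marked in comments.

```python
# A schematic illustration of Marr's three levels, using his cash-register
# example of addition (the code is mine; the example is Marr's).

# Computational level: WHAT is computed and WHY -- the mapping from a list of
# prices to their sum, because the cost of a basket combines additively.
def total_cost(prices):
    return add_all(prices)

# Algorithmic level: a REPRESENTATION (decimal digits, least significant first)
# and an ALGORITHM (column-wise addition with carry). Other choices -- binary,
# Roman numerals, analog voltages -- would compute the same function.
def add_digits(x, y):
    result, carry = [], 0
    for i in range(max(len(x), len(y))):
        s = (x[i] if i < len(x) else 0) + (y[i] if i < len(y) else 0) + carry
        result.append(s % 10)
        carry = s // 10
    if carry:
        result.append(carry)
    return result

def add_all(prices):
    total = [0]
    for p in prices:
        digits = [int(d) for d in reversed(str(p))]
        total = add_digits(total, digits)
    return int("".join(map(str, reversed(total))))

# Implementational level: how the representation and algorithm are physically
# realized -- in a cash register's gears, in silicon, or in neural tissue.
# That level is not captured by this code at all, which is precisely Marr's
# point that the levels are distinct.
print(total_cost([199, 250, 99]))   # 548
```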
The popularity of Marr’s approach should be no surprise, given that it makes use of the basic aspects of the general analogy between the computer and the brain. By following Marr strictly, one can pursue the goal of the computational research tradition and offer detailed computational models of cognitive phenomena. Yet Marr’s methodology is theoretically almost vacuous when it comes to cognition, and remains very abstract. Almost any of the competing computational models in cognitive science can be cast in terms of the three levels. The methodology offers only a check on whether all the important questions have been asked; it supplies no preferred answers.
In spite of the popularity of Marr’s methodology, it is rarely observed that Marr could not make sense of some computational approaches. For example, he criticized the heuristic approach taken by Simon and Newell as mere gimmickry (Marr 1982, pp. 347–348). According to Marr, when people do mental arithmetic, they actually do something more basic. Here, Marr seems committed to a version of the basic-operations assumption, which states that modeling should refer to the really basic building blocks. But his own account of levels cannot justify this commitment; nowhere does he explain why one should specify mental arithmetic in terms of the operations of neural ensembles, or of individual neurons, for example. This is no coincidence: his account of the implementation level remains awkwardly abstract, which makes it the Achilles heel of his methodology.
To summarize this section, the early statements of the basic theoretical assumptions are faulty in several ways. They unpack the analogy between nervous systems and computational devices in an outdated way. Marr’s level-based methodology was a serious improvement, but it suffered from the problem of how to include evidence about the workings of the nervous system. In the next section, I show that the analogy can be motivated better from the contemporary mechanistic point of view. In particular, the mechanistic solution sheds light on the role of neural evidence.