Embodied Cognition, Dynamic Field Theory of
Keywords: Tuning curve; Selection decision; Local input; Attractor solution; Neural interaction
The insight that cognition is grounded in sensorimotor processing and shares many properties with motor control, captured by the notion of “embodied cognition,” has been a starting point for neural process models of cognition. Neural Field models represent spaces relevant to cognition, including physical space, perceptual feature spaces, or movement parameters, in activation fields that may receive input from the sensory surfaces and may project onto motor systems. Peaks of activation are units of representation. Their positive levels of activation indicate the instantiation of a representation, while their location specifies metric values along the feature dimensions. By ensuring that peaks are stable states (attractors) of a neural activation dynamics, cognitive processes are endowed with the stability properties required when cognition is linked to sensory and motor processes. Instantiations of cognitive processes arise from instabilities that may induce or suppress peaks. Such events may represent detection, selection, or classification decisions. Neural Field models account for classical behavioral signatures of cognition, including response times, error rates, and metric estimation biases, but also link to neurophysiological correlates of behavior such as patterns of population activation and their temporal evolution. Robotic demonstrations of Neural Field models are used to establish the capacity of these models to provide process accounts of cognition that may link to real sensory information and generate real movement in physical environments.
Elementary forms of cognition are the detection and selection decisions that control attention and eye movements but are also the basis for object perception. Committing detected perceptual states into working memory and then long-term memory is a key element of cognition. Serially organized sequences of cognitive states are the basis for cognitive processes. Motor actions require that the initiation, termination, and potentially the online update of planned movements be autonomously generated. Neural Field models of cognition and control provide a neural account for such elementary cognition based on four elements: the spaces that such cognitive processes are about, the activation fields defined over these spaces within which neural representations can be created, the neural activation dynamics that drive neural representations forward in time, and the instabilities that give rise to the elementary forms of cognition. This use of Neural Field models to account for cognition and its sensorimotor grounding has been called Dynamic Field Theory (DFT). DFT is a mathematically formalized conceptual framework for understanding embodied cognition that is linked to neural process modeling but abstracts from some of the specific anatomical and biophysical details of neurophysiology to enable a close link to behavior (Schneegans and Schöner 2008).
Neural Field models of embodied cognition represent the state outside the nervous system by neural activation patterns that are inside the nervous system. Neural activation, as used in Dynamic Field Theory, is a real number, u, that may be either positive or negative. The link to biophysically detailed accounts of neural activity can be established in multiple different ways (see entries “Neural Field Model, Continuum”; “Neural Population Models and Cortical Field Theory: Overview”). Critical for the link to behavior is the assumption that only sufficiently positive levels of activation impact on downstream structures and ultimately on motor systems. This is expressed mathematically through a sigmoidal function, g(u), often chosen as 1/(1 + exp(−βu)), where β is the steepness of this nonlinearity.
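As a concrete illustration (not part of the original entry), the sigmoidal nonlinearity can be written in a few lines of Python; the steepness value β = 4 below is an arbitrary assumption:

```python
import numpy as np

def g(u, beta=4.0):
    """Sigmoidal output nonlinearity 1 / (1 + exp(-beta * u)).
    Only sufficiently positive activation is passed on to
    downstream structures; beta sets the steepness."""
    return 1.0 / (1.0 + np.exp(-beta * u))
```

For strongly negative activation the output is effectively zero, for strongly positive activation it saturates at one, and the transition around u = 0 sharpens as β grows.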
A Neural Field is a continuum of such activation variables, one for each location of the represented space. For instance, in Fig. 1, a level of activation represents each possible horizontal position and motion direction of a moving object. The field notion is thus analogous to how fields are used in physics. Peaks of activation are units of representation. Their positive levels of activation imply that they impact on downstream processes. Their location specifies the represented state.
How does a location in a Neural Field acquire the meaning ascribed to it in this interpretation? It is ultimately the connectivity to the sensory or the motor surfaces that determines what a field location “stands for.” In perceptual representations such as the one illustrated in Fig. 1, this would be, for instance, the connectivity from the retina, through simple and complex cells, to motion detectors, anatomically probably located in area MT of the cortex. Field locations thus have a “tuning curve.” Similarly, the forward projection onto the motor system implies a specificity that could be interpreted as a tuning curve for movement parameters. We know that in the cortex as well as in such subcortical structures as the colliculus and thalamus, tuning curves tend to be broad and overlapping, an indication that broad populations of neurons are activated for any individual perceptual or motor state represented. This fact together with detailed analysis of how strongly all activated neurons contribute to a percept or motor action has given rise to the hypothesis that the activity of small populations of neurons is the best neural correlate of behavior (Cohen and Newsome 2009). DFT is based on the further hypothesis that Neural Fields represent the activity of such small populations of neurons in the higher nervous system that are tuned to particular sensory or motor states. In fact, it is possible to estimate Neural Fields from recorded population activity (Erlhagen et al. 1999). This link to population activity frees Neural Fields from the more literal interpretation prevalent in some neural modeling, in which the activation fields are directly defined over the cortical surface. Instead, the Neural Fields on which DFT is based are organized in terms of the topology of the outer space that is being represented. The two interpretations are aligned where cortical maps are topographic. 
In other cases, however, topography is violated, such as for the tuning of neurons in motor cortex to the direction of a planned hand movement. The Neural Fields of DFT effectively rearrange neurons so that neighboring sites always represent neighboring states. In fact, strictly speaking, neurons are smeared out across the field dimensions by contributing their entire tuning curve to the representation (Erlhagen et al. 1999).
The activation dynamics of a Neural Field, u(x,t), is captured by the Amari equation,

τ u̇(x,t) = −u(x,t) + h + s(x,t) + ∫ c(x − x′) g(u(x′,t)) dx′.

The parameter, τ, determines the overall timescale of the evolution of u(x,t). The “−u” term provides stability to the dynamics and is a reflection of the intrinsic dynamics of neural populations. The parameter h < 0 is the resting level of the field, which is stable in the absence of input, s(x,t). Interaction integrates over all field sites, x′. Each site contributes to the extent that its activation exceeds the threshold of the sigmoidal function, g(u(x′,t)), with a coupling strength, c(x − x′), that is a function of the distance between interacting field sites. For small distances, coupling is excitatory (c(small) > 0); for larger distances, it is inhibitory (c(large) < 0). The Amari model is a simplification over biophysically more detailed models that, among other approximations, neglects the time delays involved in synaptic transmission and lumps together excitatory and inhibitory neural populations (see entry “Neural Field Model, Continuum”).
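The Amari dynamics can be sketched as an Euler integration over a discretized field. The code below is a minimal illustration, not a reference implementation; all parameter values (field size, timescale, kernel widths and strengths, input amplitude) are assumptions chosen so that a localized input induces a self-stabilized peak:

```python
import numpy as np

# Illustrative parameters (assumptions, not values from the entry)
n = 101
x = np.arange(n, dtype=float)          # field sites 0..100, spacing 1
tau, h, beta, dt = 10.0, -5.0, 4.0, 1.0

def g(u):
    # sigmoidal output nonlinearity
    return 1.0 / (1.0 + np.exp(-beta * u))

# interaction kernel c(x - x'): local excitation, mid-range inhibition
d = x[:, None] - x[None, :]
c = 2.0 * np.exp(-d**2 / (2 * 3.0**2)) - 1.0 * np.exp(-d**2 / (2 * 9.0**2))

# localized input centered on site 50
s = 8.0 * np.exp(-(x - 50.0)**2 / (2 * 3.0**2))

# Euler integration of  tau * du/dt = -u + h + s + integral c * g(u)
u = np.full(n, h)                      # field starts at resting level
for _ in range(300):
    u += (dt / tau) * (-u + h + s + c @ g(u))

# A self-stabilized peak forms at the input location (u > 0 there),
# while sites far from the input remain below threshold (u < 0).
```

The matrix product `c @ g(u)` approximates the interaction integral on the unit-spaced grid; local excitation sharpens and stabilizes the peak, while surround inhibition keeps it from spreading.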
Both subthreshold and self-stabilized peak solutions are continuously linked to sensory input. The peak may track, for instance, continuously shifting local input patterns. Moreover, if input strength increases in a graded, time-continuous way, the Neural Field autonomously creates a discrete detection event when it goes through the detection instability. Similarly, if input that supports a self-stabilized peak is gradually reduced in strength, the peak collapses at a critical point through the reverse detection instability. Such discrete events emerge from continuous-time neural dynamics through the dynamic instabilities. This provides a mechanism that is critical for understanding how sequential processes may arise in neural dynamics (Sandamirskaya and Schöner 2010).
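The detection instability can be sketched in a discretized Amari field: although the input strength ramps up continuously in time, the field crosses into the suprathreshold state at a single discrete step. All parameter values below are illustrative assumptions:

```python
import numpy as np

n = 101
x = np.arange(n, dtype=float)
tau, h, beta, dt = 10.0, -5.0, 4.0, 1.0
g = lambda u: 1.0 / (1.0 + np.exp(-beta * u))
d = x[:, None] - x[None, :]
c = 2.0 * np.exp(-d**2 / 18.0) - 1.0 * np.exp(-d**2 / 162.0)
shape = np.exp(-(x - 50.0)**2 / 18.0)   # fixed input profile

u = np.full(n, h)
detection_step = None
for step in range(600):
    a = 8.0 * step / 599.0              # input strength ramps 0 -> 8
    u += (dt / tau) * (-u + h + a * shape + c @ g(u))
    if detection_step is None and u.max() > 0.0:
        detection_step = step           # discrete detection event
```

Despite the smooth, graded increase of the input, the transition to a peak happens at one critical step well into the ramp, illustrating how discrete events emerge from continuous-time dynamics.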
With sufficiently strong interaction, or when broad inputs or a high resting level push activation close enough to threshold, the reverse detection instability may not be reached when the strength of a local input is reduced. In this case, a self-stabilized peak induced by local input remains stable and is sustained, even after any trace of the local input has disappeared. Such self-sustained activation peaks are the standard model of working memory (Fuster 2005). Mathematically, self-sustained peaks are marginally stable. They resist change in their shape but are not stable against shifts of the peak along the field dimensions. This leads to drift under the influence of noise or broad inputs. Such drift is psychophysically real: memory for metric information develops metric bias and increases variance over the timescale of tens of seconds (Spencer et al. 2009). Moreover, sustained peaks may be destabilized by competing inputs at other field locations. Again, this limitation of the stability of sustained peaks matches properties of working memory, which is subject to interference from new items entered into working memory.
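Self-sustained peaks can be sketched in the same formalism by strengthening the interaction and raising the resting level. In the assumed parameter regime below, with local excitation and global inhibition, the peak survives the removal of its inducing input:

```python
import numpy as np

n = 101
x = np.arange(n, dtype=float)
tau, h, beta, dt = 10.0, -2.0, 4.0, 1.0   # higher resting level
g = lambda u: 1.0 / (1.0 + np.exp(-beta * u))
d = x[:, None] - x[None, :]
# strong local excitation plus global inhibition (illustrative choice)
c = 3.0 * np.exp(-d**2 / 32.0) - 0.5

s = 6.0 * np.exp(-(x - 50.0)**2 / 18.0)   # transient localized input
u = np.full(n, h)
for _ in range(200):                      # input present: peak forms
    u += (dt / tau) * (-u + h + s + c @ g(u))
for _ in range(300):                      # input removed: peak persists
    u += (dt / tau) * (-u + h + c @ g(u))
# the peak at site 50 is self-sustained although the input is gone
```

Without noise in this deterministic sketch the sustained peak stays where the input placed it; adding noise would reveal the drift along the field dimension described above.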
The resistance to change is also a reason why selection decisions are typically made in response to a transient in the input stream. Once locked into a decision, a Neural Field is not open to change, unless the differences in input strength become very large and the selection instability moves the system from a bistable or multi-stable regime to a monostable regime in which only one selection decision is stable. Such resistance to change can be observed as change blindness, in which observers fail to detect a change in an image (Simons and Levin 1997). Normally, when an image changes in some location, the visual transients in the sensory input attract visual attention and help detect the change. That transient signal may be masked by turning the entire image off and then turning the locally changed image back on. Observers are blind to change when transients are masked this way, unless they happen to attend to the changed location. Because sensory inputs in the nervous system are typically transient in nature, the mechanism for making selection decisions in DFT is normally engaged by change, so that change detection in the absence of a masking stimulus may also be understood within DFT (Johnson et al. 2009).
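A selection decision can be sketched in the same discretized dynamics: with two localized inputs of unequal strength and an assumed kernel combining local excitation with global inhibition, only one self-stabilized peak survives, centered on the stronger input. Parameter values are again illustrative assumptions:

```python
import numpy as np

n = 101
x = np.arange(n, dtype=float)
tau, h, beta, dt = 10.0, -5.0, 4.0, 1.0
g = lambda u: 1.0 / (1.0 + np.exp(-beta * u))
d = x[:, None] - x[None, :]
# local excitation plus global inhibition enforcing selection
c = 2.0 * np.exp(-d**2 / 18.0) - 1.5

# two localized inputs; the one at site 30 is stronger
s = 7.0 * np.exp(-(x - 30.0)**2 / 18.0) + 6.0 * np.exp(-(x - 70.0)**2 / 18.0)

u = np.full(n, h)
for _ in range(400):
    u += (dt / tau) * (-u + h + s + c @ g(u))
# a single peak remains, centered on the stronger input at site 30;
# the site of the weaker input is suppressed below threshold
```

The global inhibitory term makes peaks compete: the stronger input's peak grows first, and the inhibition it generates prevents a second peak from stabilizing.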
Neural Field Models of Embodied Cognition
Neural Field models have been used within the framework of DFT to account for a large and broad set of experimental signatures of the neural mechanisms underlying embodied cognition. Sensorimotor selection decisions were modeled for saccadic eye movements (Kopecz and Schöner 1995) and infant perseverative reaching (Thelen et al. 2001). Influences of non-stimulus factors were accounted for both for saccades (Trappenberg et al. 2001) and in the infant model. The capacity of Neural Field models to account for the confluence of multiple factors is, in fact, a major strength of the approach. An exhaustive account of the influence of many different task and intrinsic factors on motor response times was foundational for DFT (Erlhagen and Schöner 2002). That model also accounted for the time course of movement preparation as observed in the timed movement initiation paradigm. The same model was linked to neural population data from motor and premotor cortex (Bastian et al. 1998), and related ideas were used to account at a much more neurally detailed level for spatial decision making (Cisek 2006). The temporal evolution of population activity in the visual cortex has been modeled using Neural Fields (Jancke et al. 1999), including recent data that assessed cortical activity through voltage-sensitive dye imaging (Markounikau et al. 2010).
DFT has been the basis of a neural processing approach to the development of cognition (Spencer and Schöner 2003) that emphasizes the sensorimotor basis of cognitive development but has extended to an understanding of how spatial and metric memory develops (Spencer et al. 2006) and how infants build memories through their looking behavior (Schöner and Thelen 2006; Perone and Spencer 2013). A Neural Field model of metric working memory has led to predictions that have been confirmed experimentally (Johnson et al. 2009) and included an account of change detection (Johnson et al. 2008). Neural Field models of motion pattern perception (Hock et al. 2003), attention (Fix et al. 2011), imitation (Erlhagen et al. 2006), and the perceptual grounding of spatial language (Lipinski et al. 2012) illustrate the breadth of phenomena accessible to Neural Field modeling.
Neural Fields can also be used to endow autonomous robots with simple forms of cognition (Bicho et al. 2000). This work has shown how peaks of activation may couple into motor control systems, an issue first addressed in Kopecz and Schöner (1995). Robotic implementations of the concepts of DFT have generally been useful demonstrations of the capacity of Neural Fields to work directly with realistic online sensory inputs and to control motor behavior in closed loop. Robotic settings have also been useful to develop new theoretical methods that solve conceptual problems. A notable example is the generation of serially ordered sequences of actions (Sandamirskaya and Schöner 2010). Because the inner states of Neural Fields are stable, they resist change. Advancing from one step in a sequence to the next therefore requires the controlled generation of instabilities: the release of a previous state from stability and the activation of the subsequent state. The innovative step in Sandamirskaya and Schöner (2010) was to use a representation of a “condition of satisfaction” that compares current sensory input to the sensory input predicted at the conclusion of an action. This concept has since been used in a variety of models that autonomously generate sequences of mental events. A robotic example is the autonomous acquisition of a scene representation from sequences of covert shifts of attention (Zibner et al. 2011).
Ongoing work in Neural Field modeling of embodied cognition advances on three fronts. On the one hand, the elementary forms of cognition must be integrated into the more complex dynamics of movement generation, coordination, and motor control (Martin et al. 2009). This entails dynamically more complex attractor states including limit cycles, as well as understanding how motor control in closed loop may interface with the population representations modeled by Neural Fields. On the other hand, a systematic push from embodied toward higher cognition (Sandamirskaya et al. 2013) ultimately aims at an understanding of all cognition in neural processing terms. Such an account faces the challenge of how to reach the power of symbol manipulation while maintaining the grounding in sensory and motor processes. Finally, Neural Field modeling needs to interface more closely with neural mechanisms of learning (Sandamirskaya 2014).
- Sandamirskaya Y (2014) Dynamic neural fields as a step toward cognitive neuromorphic architectures. Front Neurosci 7
- Spencer JP, Perone S, Johnson JS (2009) Dynamic field theory and embodied cognitive dynamics. In: Spencer J, Thomas M, McClelland J (eds) Toward a unified theory of development: connectionism and dynamic systems theory re-considered. Oxford University Press, Oxford, pp 86–118
- Trappenberg T (2008) Decision making and population decoding with strongly inhibitory neural field models. In: Heinke D, Mavritsak E (eds) Computational modelling in behavioural neuroscience: closing the gap between neurophysiology and behaviour. Psychology Press, London, pp 1–19