1 Introduction

Traditionally, cognition has been understood in terms of computational operations carried out on internal mental representations. If radical theorists of cognition are correct, different conceptual and theoretical tools may however be required for conceptualizing cognition. In earlier work we’ve proposed a conceptual framework for an ecological-enactive cognitive science: the Skilled Intentionality Framework (SIF) (Bruineberg and Rietveld 2014; Rietveld et al. 2018; Kiverstein and Rietveld 2018). This framework is summarized by the following three interrelated theses:

  1. There is no divide between “higher” and “lower” cognition. Both can be understood in terms of skilled activities of engaging with situations in the world.

  2. Skilled activities are temporally extended processes in which agents coordinate to multiple relevant affordances simultaneously.

  3. The affordances the environment offers are relative to the abilities available in a form of life.

It follows from these three theses that the concept of skilled intentionality should apply to cases of higher-order cognition, such as imagination, long-term planning, language understanding, taking into account the perspective of other people, and mathematical and logical reasoning. So long as one operates with a traditional understanding of cognition, each of these cognitive accomplishments may however seem to necessitate an explanatory appeal to rule-like operations carried out on internal mental representations (Clark and Toribio 1994; cf. Kiverstein and Rietveld 2018). So-called “higher-order” cognitive processes often involve thinking about objects or states of affairs that are absent, abstract or counterfactual. A traditional understanding of cognition may encourage one to think that the only way a subject could entertain such thoughts is by having internal states whose function is to stand in for objects that are absent, abstract, or counterfactual. But any state of mind that has the function of standing in for something, and thus can fail in its function, just is a mental representation (Haugeland 1991). Thus it would seem to follow that “higher-order” cognition is a form of cognition necessarily mediated by internal representations. From the standpoint of SIF, however, such a line of reasoning must be mistaken because “higher-order” cognition doesn’t necessarily depend on mental representation. In what follows, we will show how people can think about what is absent, abstract or counterfactual because of their skills for engaging with an enlanguaged environment.Footnote 1 Once one recognizes how much of the human ecological niche has become structured by past activities of talking and writing, this takes away at least some of the motivation for understanding linguistic thinking in terms of content-bearing internal representations. Mental representations are supposed to help us to understand how “higher-order” cognition is possible. We will show how the explanatory work that is supposedly done by mental representations can instead be done by looking outside of the head to the environment structured by sociomaterial practices, and the affordances it makes available.Footnote 2

Based on our earlier work we define affordances as relations between aspects of the socio-material environment in flux, and the abilities available in a form of life (Rietveld and Kiverstein 2014). The scope of the concept of affordances is broad, expanding to include all of the possibilities for action that are made available to people taking part in sociomaterial practices. Among the skills and abilities people develop in the human form of life are skills for expressing, either in the activity of speech or in writing, ways of thinking about the world. Speech and writing are deeply sedimented in the sociomaterial practices in which humans participate, seamlessly interwoven into much of what we do in concrete situations, and often transparently in ways that we as speakers fail to notice. Skills and abilities for expressing things in speech and writing have transformed human life by permeating the ecological niche humans inhabit. The affordances of the human ecological niche can thus be said to be “enlanguaged” because they have been formed, at least in part, through the activities of people taking part in sociomaterial practices of speaking and writing.

Our aim in this paper is to show how linguistic thought is made possible not by internal mental representations but by a person’s skilled engagement with enlanguaged affordances. By linguistic thought we mean the thoughts a person expresses in speech or in writing.Footnote 3 We focus on speech in this paper. What is expressed in speech is the speaker’s way of thinking about the world. In speech people exercise abilities for thinking about aspects of the world as meant in a particular way. The architect for instance may think of the door in a building as being “too low” (Rietveld 2008). In doing so he makes use of a “technique for grasping hold of something”, in this case the height of the door (in its context).Footnote 4 The architect’s utterance “too low” expresses a way of thinking about the door’s height in the wider situation of the architectural project. In making this utterance the architect is engaging with an affordance of the door that has become enlanguaged. Following Merleau-Ponty, we deny the thought has an existence in the mind of the speaker independently from, and prior to its being expressed in speech. The thought is accomplished in the bodily activity of talking with others, or in writing, and doesn’t exist in the speaker’s head as a ready-made thought prior to this activity of talking or writing.

Our aim in this paper is not to provide a fully worked out account of linguistic thought in all its guises. Linguistic thought takes a wide variety of shapes and forms, and serves many different purposes (Wittgenstein 1953). The arguments of this paper are advanced as a call for further research, and are in no way intended to be the final word on how to understand linguistic thought in terms of skilled intentionality. Our set-up for the rest of our paper is as follows: We begin in Sect. 2 by returning to a remark of Gibson’s in his 1966 book that language belongs to the material environment. However, Gibson ended up mistakenly separating the affordances that can be directly perceived by means of ecological information from language, which he took to depend on social convention. In Sect. 3 we turn to the phenomenological philosopher Merleau-Ponty for an alternative account of the enlanguaged environment (Merleau-Ponty 1945/2002). Merleau-Ponty shows how linguistic thinking is something we accomplish in the process of speaking. He recognises the materiality of linguistic thinking by denying that any separation can be made between the bodily activity of speaking and the thoughts that are given expression in this activity. He also recognises the sociality of language by situating bodily expressive activities in the form of life of a community of language speakers. In Sect. 4 we show how speech is typically smoothly integrated with the other skills an agent embodies. We make this argument on the basis of Merleau-Ponty’s discussion of neurological patients in which this integration is disturbed in ways that interfere with how they live their lives. We suggest that the reason speaking is typically integrated with other bodily skills is because the affordances agents are responsive to in skilled action are enlanguaged. We finish up in Sect. 5 by showing how to understand Merleau-Ponty’s socio-material account of linguistic thought in terms of our concept of skilled intentionality. Linguistic thinking is made possible by the enlanguaged affordances our sociomaterial environment makes available.

2 Gibson’s insight

In his 1966 book Gibson rightly observes that the distinction between “material” and “non-material” culture is “seriously misleading” (1966: p. 26). Such a distinction is mistaken, he explains, because it implies that “language, tradition, art, music, law, and religion are immaterial, insubstantial and intangible” (Op cit.). Gibson describes how speaking, sculpting, painting and writing are examples of techniques humans developed over the course of their history for making others aware of things outside of their immediate environment. Speech and the other “representative media” Gibson mentions, such as sculpture, painting and writing, made possible what he describes as “second hand perception” of the environment. An individual becomes aware of what is around him through first hand perception. Second hand perception is different insofar as an individual is made aware of things outside of his immediate environment by other observers (p. 234).

Gibson’s insight was to recognise that language is just as much material as anything else in the environment humans are able to perceive. It does not belong to a non-material symbolic culture. He rightly insists that “no symbol exists except as realised in sound, projected light, mechanical contact or the like.” (1966: p. 26) However, Gibson’s treatment of his distinction between two types of perception (first and second-hand) had the consequence that he was unable to do full justice to this insight. He was led to separate linguistic symbols as material inscriptions or vocalisations from the thoughts these symbols are used to express. What led him to make such a separation was his attempt at explaining how it is possible for perceptual invariants contained in an individual’s vocalisations to also express thoughts about things in the environment to other speakers. In addressing this question, Gibson wrote: “A stimulus is related to its source in the world by laws of ecological physics, whereas a word is related to its referent by social convention.” (Gibson 1966: p. 244) Thus, Gibson was led to invoke a problematic distinction between law-based and conventional information (see also Golonka 2015).

Information is law-based when a “one-to-one-to-one” specifying relation holds between states of perceptual systems, the structure in the ambient energetic array and the affordances of the surrounding environment (Turvey et al. 1981). This means that there is no possibility for an animal to make a mistake about the presence of an affordance based on the available ecological information. Speech by contrast can only be used to express thoughts by means of social convention. Gibson’s appeal to social convention rests on a separation of the material (and sensible) dimension of speech from the thoughts people express in concrete situations of social interaction. Token utterances as speech events and the thoughts they express are only brought back together again through the mediation of social conventions.

Gibson’s distinction between ecological physics and social convention implies what we call a “layered” picture of the ecological niche. First there was a physical environment containing lawfully specifying information. Then at a certain point in evolutionary history, social life emerged in which people’s conduct began to be regulated by social conventions. At this juncture, conventional information was added or superimposed onto the already existing but pre-social, ecological information. Such a layered picture of the ecological niche assumes an ontological privileging of physical reality over social life. This is mistaken for at least two reasons. First, it suggests that the reality of ecological physics is somehow more basic and fundamental than the environment structured by sociomaterial practices that forms out of people’s regular and relatively stable ways of doing things. Second, such a layering of the ecological niche suggests that the affordances lawfully specified by ecological physics stand apart from language as made possible by social convention. However, the material structure the human–environment offers should not, and cannot, be ontologically separated from the social, technical and historical lives people lead (Law and Mol 1996). The alternative we favour is to see social practices and the affordances of the material environment as forming together in action. Social practices are made possible by affordances, but at the same time affordances form out of the activities people engage in when they take part in those social practices (Van Dijk and Rietveld 2017).

We’ve suggested that Gibson’s distinction between law-based and conventional information also leads to a separation of language from the affordances of the human–environment. But we’ve just argued that such a separation is premised on a mistaken privileging of ecological physics. Such an isolation of affordances from language can however be avoided if we do not follow Gibson in distinguishing speaking from thinking. Following Merleau-Ponty we will argue that it is in speaking that linguistic thinking is accomplished. The thought doesn’t pre-exist the expressive bodily activity of speaking because it is only in the act of talking to ourselves or to others that the thought is articulated, and becomes a determinate thought. Before we give voice to thoughts, they are little more than inchoate feelings. Think of what one experiences when one struggles to recall the name of an acquaintance, or to find the right word in writing a formal letter.

Interestingly, Gibson seems to have recognised a tight connection between the meaningful structure the child can perceive in the environment, and the thoughts the child learns to give expression to in learning how to talk. He observes that the “learning of language is not simply the associating, naming or labelling of impressions from the world. It is also and more importantly an expression of the distinctions, abstractions and recognitions the child is coming to achieve in perceiving” (Op cit. (our italics)). Talking makes it possible to mark out and call attention to abstract and recurrent patterns of invariance in perceptual experience. Speech more generally can be thought of in terms of expressive bodily activities used to point out, and give articulation to the relatively stable regularities and structure available in the ecological niche.

3 Merleau-Ponty’s account of linguistic thought

Perceiving animals that are responsive to affordances have a practical understanding of what the environment makes it possible for them to do. Gibson, in highlighting the expressive possibilities of speech, notices how language can be used to educate the attention of children to practical ways of understanding the environment. The speaker can mark out aspects of the environment as being thought about or understood in a particular way to the child. Gibson did not say much more about how he took expression in speech to work. The point we wish to foreground is how a speaker’s utterances can be used to identify, and make articulate, how things are. What a speaker does is make a way of understanding or thinking about things explicitly manifest. If for instance, I describe a certain shape using the word “triangle”, I thereby point the shape out to others as being a triangle. This requires me as a skilled speaker to be able to make a whole set of contrasts and distinctions, showing you how to distinguish it from other figures, getting you to attend to its shape rather than, say, its size or colour. The word “triangle” expresses the understanding of what it is to be a triangle as contrasted with a square, circle, rectangle and so on, and what it is to have a certain shape, as contrasted with size, colour, number or some other property. The use of a word to point out a particular aspect of the environment thus implies mastery of a vast number of other words, and much else besides. It arguably implies the mastery of a whole web of practices (Taylor 1985; Wittgenstein 1953). It is by learning the point of these practices and how to take part in them that the child learns how to use the word. If I am teaching children about triangles in a geometry lesson for example, I must help them to understand something about the point of geometry.

Now consider in this light Gibson’s claim that words express “the distinctions, abstractions and recognitions the child is coming to achieve in perceiving” (Gibson 1966: p. 281). We suggest reading Gibson’s talk of “distinctions, abstractions and recognitions” here as the holistically structured understanding of the world the child is being initiated into under the guidance of its caregivers and teachers.Footnote 5 The child is learning what it is to use a word correctly in a practice, and in doing so the child learns something about the world. It learns for example the features of a figure that make it a triangle in the practices of doing geometry. Mastery of the word is transformatory of the world the child lives in, since it opens the child to new ways of being involved with the world. Features of the world come to possess a new significance or meaning for the child, which they wouldn’t have were it not for the child’s mastery of language. The child develops a feel for the rightness or wrongness of ways of talking because of the significance features of the world take on within a whole web of practices (Taylor 2016).

We follow Merleau-Ponty in taking the bodily activity of speaking to “accomplish” ways of thinking about the world. Speech is not the clothing with which we dress our thoughts, because thoughts do not exist already formed inside speakers’ heads prior to their expression in language. Thinking is instead accomplished in the activity of speaking just as music is performed in the playing of musical instruments, or in whistling. It is only in the activity of articulating them that one’s thoughts take shape, and prior to this articulation there is at best an inarticulate feeling, as when we search for the right word.

A natural objection to Merleau-Ponty’s description of thinking is to point to the many thoughts we keep to ourselves that never get expressed to others. Notice however that such private thoughts tend to be heard in one’s own voice. Romdenh-Romluc has suggested that what one is doing in such cases is imagining giving voice to one’s thoughts (Romdenh-Romluc 2011: p. 189). Private thoughts are, she suggests, best thought of as “imagined acts of speaking”. From the outside, saying a thought aloud and imagining saying it silently to oneself seem to be very different activities. But what they share in common is the activity of speaking: what one imagines is the very same activity one performs when one speaks aloud. What one does is “summon up the demands made by speaking and hearing an utterance. In this way one brings about the pseudo-presence of the utterance” (Romdenh-Romluc 2011: pp. 190–191). When one imagines talking to oneself, one is responding to a demand the world makes on one—one is responding to an inviting enlanguaged affordance. We’ll have more to say on this last point later in Sect. 5. The point we wish to make for now is that both in talking and imagined talking, the thought does not pre-exist but is rather the outcome of the activity of speaking, be it an imagined act, or an act the person performs in speaking with other persons.

It is a distinctive feature of speech that its expressive possibilities can be “indefinitely reiterated” (Merleau-Ponty 1945/2012: p. 196). Merleau-Ponty describes this feature of language in terms of meanings becoming “sedimented” in utterances over time (Op cit, p. 221). The term “sedimentation” is borrowed from Husserl who used it to describe how activities when repeated can transform over time into bodily habits and routines (Husserl 1989). Merleau-Ponty suggests the same can happen in language as phrases and expressions are repeated and become routine, idiomatic ways of saying things. He thus makes a distinction between “speaking speech” (parole parlante) and “spoken speech” (parole parlée). Speaking speech is creative and novel. Its meanings are “original” insofar as words are used spontaneously to say something new. Examples include literary and poetic uses of language. As Taylor points out: “People are constantly shaping language, straining the limits of expression, minting new terms, displacing old ones, giving language a changed gamut of meanings” (Taylor 1985: p. 232). Spoken speech is by contrast Merleau-Ponty’s term for more mundane, everyday routine uses of language. The thoughts expressed in spoken speech are well established, they are available in the community to be picked up and used as and when the situation requires.Footnote 6

For Merleau-Ponty all human activities, linguistic and non-linguistic, have an expressive aspect. It is through the caring engagement of their bodies that subjects experience a meaningful world. Speaking is a dimension of our bodily being in the world (Baldwin 2007). The thoughts one gives articulation to are one’s ways of being practically and affectively involved with the world. Think back to the example of the triangle and how, to catch on to the word “triangle” and its meaning, the child had to come to appreciate what in our human practices made the use of this word appropriate. She had to come to see the point of a whole web of practices, of a whole context that surrounds the use of the word “triangle”. Linguistic expression is just another form of bodily comportment or skilful engagement with the environment. Speech is among our many human ways of skilfully engaging with the world through our bodies, as we take up, and build upon, the possibilities for spoken speech set up by other speakers in the past in their caring engagement with the world. Indeed, we will argue next, speech couldn’t succeed in opening a person to the world separately, in isolation from the person’s other bodily skills. In real life situations bodily skills, linguistic and non-linguistic, are not isolated from one another but combine in complex ways to open the person to an enlanguaged environment.

4 The entanglement of language with everyday life

When language comes adrift from other skilled activities the result can be that what the person says no longer gets a grip on the world. We’ve seen in the example of the child learning about geometric shapes like triangles how speech is expressive of ways of being involved with the world, and the significance things take on within practices. Speech can give us a new manner of disclosing what already exists, but it can also open up new ways of existing. Think of what happens when we find the right words for expressing what we feel, perhaps in response to another person we care for. But speech can only open us to the world in this way—it can only allow the subject to take up a position in the world—when speech connects in the right way with the person’s other bodily skills. This can be seen clearly by reflecting on Merleau-Ponty’s discussion of neurological patients deprived of this integration of bodily skills by brain damage.

Consider first Merleau-Ponty’s discussion of aphasics who have amnesia for colour terms. He describes how the patient doesn’t know what to do when given a colour term and asked to name a colour sample. He simply “repeats the name as if he were expecting something from it. But the name is no longer useful to him, it says nothing to him, it is bizarre and absurd, just as names are for us when we have repeated them for too long” (Merleau-Ponty 1945/2012: p. 199). The patient might still be able to form associations with the word, but what the word has lost is its living sense: it no longer speaks to the patient because they no longer see the point of sorting things by their colour. The word has become cut adrift from their broader life and their way of being involved with the world. However, so long as the word is used concretely in a context of “affective and vital interest” the aphasic has no problem in using the word (Merleau-Ponty 1945/2012: pp. 180–181). Their difficulties arise when they are given an abstract task in which they must take up what Merleau-Ponty calls the “categorial attitude”.

Gelb and Goldstein’s patient Schneider has, in some sense, the opposite problem to the aphasic patients Merleau-Ponty discusses. Schneider suffered from a pure form of apperceptive agnosia following severe brain damage he sustained while serving in the German army in the First World War. Merleau-Ponty tells us that Schneider’s vocabulary and syntax seemed to be intact but he would hardly ever speak unless he was questioned. In his motor behaviour, Schneider was able to perform habitual and skilled movements such as those required for his work in a factory producing wallets. He had no trouble understanding common idiomatic expressions. This is an example of what Goldstein and Gelb called “the concrete attitude” (Goldstein and Gelb 1918). However, he ran into difficulties in carrying out a spontaneous conversation, particularly when words were used creatively, as in metaphorical and analogical uses of language such as “the foot of the chair” or “light is to a lamp what heat is to a stove” (Merleau-Ponty 1945/2012: p. 129). Schneider could only understand these metaphors and analogies when he had fixed on some common characteristic. He is an example of what can happen when speech is only used in the categorial attitude and is no longer attuned to what is of affective and vital significance.

He can only speak according to a plan settled in advance: “he cannot give himself over to the inspiration of the moment in order to find the necessary thoughts in response to a complex situation in the conversation, and this is the case whether it is a question of new points of view or of old ones” (Benary 1922). There is something meticulous and serious in all of his behaviour, which comes from the fact that he is incapable of playing. To play is to place oneself momentarily in an imaginary situation, to amuse oneself in changing one’s “milieu”. The patient, however, cannot enter into a fictional situation without converting it into a real situation. (Merleau-Ponty 1945/2012: p. 136)

Goldstein and Scheerer (1964) argue that Schneider is an example of what can go wrong when the abstract and concrete attitudes are no longer properly integrated.Footnote 7 The possibilities for abstract thought—such as for example possibilities to think about geometry—must connect to the wider life of the speaker, and what is of vital and affective significance to them. In the absence of such a connection, the speaker loses their capacity to speak spontaneously, as was the case with Schneider. Alternatively, they may only be motivated to speak in situations that mean something to them emotionally and practically, as we have seen in the aphasic patients Merleau-Ponty describes.

We suggest this connection of speech to other bodily skills may be necessary because the affordances the individual is responsive and sensitive to in acting skilfully are woven into practices of speaking with others. The speaker may suddenly shout for instance: “Run, the bus is coming!” What they say is tied to the context of using public transport for travelling to an appointment, hence the urgency to catch this bus, and not the next one. Inkpin (2016) has usefully labelled the type of meaning that is expressed in contexts of practical activity “pre-predicative meaning”.Footnote 8 In such pre-predicative uses of language we use words in ways that are bound to and embedded in contexts of practical activity. The thoughts one is giving expression to in such pre-predicative uses of language are articulating one’s practical grasp of a context of practical activity. What one is articulating is one’s way of being practically involved with the world. The problem in the neurological patients Merleau-Ponty discusses is that speaking has become somehow unhinged from the patient’s other bodily modes of engagement with the world. In most real-life situations speech is fluidly integrated with the person’s other bodily skills because speaking is something people are typically invited to do in the flow of other activities. In acting skilfully they are coordinating their activities to multiple affordances including those that open up to them because of their skill as speakers of a language.

5 Skilled engagement with enlanguaged affordances

So far we’ve argued that the thoughts about the world a speaker expresses should not be taken to exist independently of their being embodied in the activity of speaking. But even original uses of words in what Merleau-Ponty described as speaking speech can outlive any of their individual users. Speech belongs to communities of language users, and is not the exclusive property of individual body subjects. We argue the same is true of affordances. In this final section we will therefore aim to bring out the sense in which linguistic thought can be conceptualised in terms of skills for engaging with enlanguaged affordances.

Drawing on Dennett (1998), Chemero has compared the reality of affordances to the property of loveliness. A female hippopotamus X is lovely if some observer would appreciate the beauty of X upon encountering her (Chemero 2003: p. 193). No one may have ever admired the hippo’s beauty but still she remains lovely. Her loveliness is dependent on the existence of some individual observer that appreciates her beauty. Similarly, Chemero claims an affordance X exists so long as some individual exists with the necessary abilities for taking advantage of the opportunities X offers. We agree with Chemero that affordances do not depend for their existence on any particular individual. We’ve expressed this point by arguing that affordances depend on the abilities that are available within a wider form of life as a whole (Rietveld and Kiverstein 2014; Van Dijk and Rietveld 2017). We borrow the phrase “form of life” from Wittgenstein (1953), using the phrase as Wittgenstein did to refer to the regular patterns of activity that can be observed as individuals engage in coordinated activities over time.Footnote 9 Each of the many practices in the human ecological niche consists of relatively stable ways of acting.Footnote 10 We suggest that each of these regular ways of acting makes available abilities. Affordances should be understood as independent of the existence of any particular individual because they depend for their existence in part on the abilities made available in a form of life.

Our reasons for defining affordances in relation to forms of life can perhaps be seen most clearly by considering the affordances of the human–environment. Each of the affordances of the human–environment occupies a place within a larger “constellation” of social practices shared among multiple individuals (Costall 1997, 2012; cf. Van Dijk and Rietveld 2017). The affordances of chairs for instance are in part dependent on social practices organized around sitting such as eating together, office work, public transport systems, cinemas and theatres, and so on. The affordance of chairs to support sitting in each of these contexts is sustained over time through these and many other practices. The affordances of the human environment thus have a sociomaterial reality. They depend for their existence on how the materials from which they are made are used by people engaging in practices, including unconventional and novel forms of material engagement.

The same points we’ve just made about affordances of artefacts like chairs can also be made about speech. The materials from which speech is made—expressive bodily activities—take form through the regular ways of acting of the members of the linguistic community. The expressive possibilities available to speakers of a language—what Merleau-Ponty called “spoken speech”—are sustained by the regular, habitual patterns of talking. These established ways of speaking lay out what makes sense, and what doesn’t in the language speaking community to which the individual belongs. If a person is to speak and make themselves understood, it will only be by acting in ways that fit with the patterns for doing things already mapped out in the standing practices. The regular pattern of doing things is essential because it is relative to this agreement in how to take part in the practice that evaluations can then be made as to whether a use of a word in an utterance is appropriate, or inappropriate, correct or incorrect. The form of life thus serves as the basis for making normative judgements concerning what uses of words make sense, and under what conditions a use of words makes no sense.Footnote 11

The meaning of a word doesn’t come from the phrases and sentences in which it is used but from a “language game” as a whole (Wittgenstein 1960: p. 180). The term “language game” is employed by Wittgenstein to refer to “the whole process of using words…the whole: of language and of the activities with which it is interwoven” (Wittgenstein 1953: §7). Spoken utterances are made within broader contexts of practical activity. Talking serves purposes that relate in some way to how people live in the form of life in which they are situated. What this means in practice is that utterances must be sufficiently unambiguous in meaning for them to do the work required of them (Inkpin 2016: p. 176). The contexts of practical activity in which utterances are made place constraints on their meaning. Thus the builders working on the building site need words that will do the work of instructions for bringing each of the types of building block. The terms they use (“block”, “slab”, “column”, “pillar”, and so on) will need to be fit for purpose within this wider practice. Words can thus be thought of as instruments or tools that serve multiple purposes, and our understanding of words as mastery of techniques for using those tools within concrete contexts of practical activity (Noë 2012). One should not be misled by this tool metaphor: each word expresses a meaning only as a part of a holistically structured web of practices, not as an individual tool separable from other tools. It is the nature of the web, as Taylor nicely puts it, to be “present as a whole in any one of its parts. To speak is to touch a bit of the web, and this is to make the whole resonate” (Taylor 1985: p. 231). What an individual speaker does with words takes form and unfolds within a whole complex web of interrelated practices in which the individual is always situated.

Van Dijk and Rietveld (2017) describe three perspectives one can take on the human form of life. It will be helpful to revisit each of these perspectives in turn as they relate to linguistic thought, since it will help us to make more concrete and precise the relation of the expressive bodily activities of individual body subjects to the sociomaterial practices in which they are situated. The first perspective they describe is on the whole interrelated nexus of practices in which materiality takes shape in such a way as to constrain individual human activity. What is distinctive of this perspective is its focus on regularity and persisting structure in the form of life. A time-lapse camera could for instance give a perspective on some of the regular ways people use language in interaction with each other at different moments of the day. We call the affordances that over time take shape in the whole web of socio-material practices the “landscape of affordances” (Rietveld and Kiverstein 2014).Footnote 12 The landscape of affordances both enables and constrains the meanings individual speakers can give expression to in their bodily speaking activity.

We suggest Merleau-Ponty’s category of spoken speech (parole parlée) should be thought of as expressive of the regular ways of doing things made available within the landscape of affordances of a form of life (e.g. the form of life of teachers of mathematics training their students to do geometry). “Spoken speech”, recall, refers to the already established uses of language that are available to members of a language speaking community. In spoken speech words already have a meaning because of how they have been taken up and used by speakers in the past. Thus, it is not the case that linguistic thinking can be reduced to the capacities of the individual body subject taken in isolation from the wider community of speakers. The bodily activity of speaking is expressive of care—the ways of being involved with the world manifest in the regular patterns of doing in the speaker’s form of life. What each speaker is doing in spoken speech is a continuation of what other speakers have done with words in the past. Many of the thoughts speakers regularly express will be already established ways of thinking about the world that circulate in their linguistic community. They are thoughts about the world already sedimented in the history of practical activity that can be taken up and repeated skilfully against the background of the public consensus about what makes sense, and what doesn’t make sense.

The second perspective one can adopt on a form of life is that of an observer studying how people behave in particular concrete settings and situations. From this observer’s perspective we see “how the details of the sociomaterial environment are changing and affordances are forming in the socio-material entanglement of people coordinating with others and materials in real time” (Van Dijk and Rietveld 2017: p. 6). In the case of linguistic thought this would mean the activities of people as they take part in particular language games here and now. The regular patterns of activity of a given language game set up a space of possible moves within which speakers operate as they attune to the particularities of their situation. At the same time this space of possible moves has been set up in the first place by the activities of the earlier participants in the language game. The moves are recreated each time an individual takes part in a game, and this opens up the possibility for the moves the game permits to be “extended, altered and reshaped” (Taylor 1985: p. 232). Following Merleau-Ponty, we’ve argued that thoughts are realised in the activity of talking. This has the consequence that among the possibilities our language games make available is the possibility to realise new ways of thinking, feeling and responding to things and relating to each other (Taylor 1985: pp. 233–234). How people relate to each other (intimately or formally, jokingly or seriously) is something people can shape through how they talk to each other.

There is thus an interdependence of spoken speech and speaking speech (Baldwin 2007). The established routine ways of talking that are made available by the practices of a linguistic community to be taken up in conversation must once have been given articulation by an individual. The continuation of an established pattern of activity is likewise due to the activity of individuals. It should be noted that in arguing for an interdependence of speaking and spoken speech we are perhaps departing from Merleau-Ponty’s view. Merleau-Ponty sometimes writes as if he takes speaking speech to play an “original”, almost foundational role in getting linguistic meaning off the ground. Spoken speech is described as “secondary” speech, while speaking speech is “original” and “primordial” (Merleau-Ponty 1945/2012: p. 409). Merleau-Ponty privileged the creative uses of language in poetry and song, for instance, as “authentic” instances of speaking. Spoken speech is often transparent, relying upon the pre-existing, routine and taken-for-granted meanings in circulation in the linguistic practice. In speaking speech one is aware of the style of what is being said, and one’s awareness is directed at the new meaning the poet or songwriter is forging. Merleau-Ponty takes speaking speech to have primacy because he rightly denies that linguistic thought is sitting in the minds of speakers with a fixed and determinate content waiting to be given expression. Instead he takes thought to be always indeterminate, and whatever determinacy our thoughts take on must be made in the activity of speaking with others, and to ourselves. We agree with Merleau-Ponty on this point. However, we think such a view of linguistic thought as realised in speech is consistent with also thinking of speaking speech as depending on already available meanings that are the result of previous acts of expression. Instead of thinking of speaking speech as primary and spoken speech as secondary we follow Baldwin (2007) in taking them to be interdependent.Footnote 13

The third perspective one can adopt on the form of life is that of the lived perspective of the individual as they relate selectively to the rich landscape of affordances in a concrete situation. The individual stands in a relation to a field of multiple relevant affordances. Relevant affordances are those affordances in the landscape that invite a response from the individual because the individual cares about them in some way. The expanse of this field was restricted in Schneider—he was only able to deal with the affordances of the current situation and was unable to “reckon with the possible” (Romdenh-Romluc 2007). This incapacity had the consequence, as we saw above, that Schneider was unable to speak spontaneously—he was unable to engage appropriately with the particular situations in which conversations typically unfold. But Schneider’s predicament is fortunately not the situation most of us ordinarily find ourselves in. We are typically ready for many possible actions simultaneously. This is what allows us to switch rapidly and flexibly between activities. We’ve seen how the affordances that are relevant to us are enlanguaged—they are interwoven into our practices of talking and writing. Thus among the action possibilities the skilled individual can be drawn into performing are activities of making articulate and determinate ways of thinking about and being involved with the world.

The field of relevant affordances can be restructured by talking. We’ve seen above how attention can be guided to patterns of similarity and difference by talking. Certain affordances can be made manifest and articulated for others in talking and thereby “promoted” to them (Reed 1995; c.f. Taylor 2010; Van Den Herik 2018). This possibility to restructure the field of relevant affordances of oneself and others is constrained both by the concrete socio-material situation and by the patterns of practices that delimit the space of moves that can be made within the language. The past patterns of practices that have unfolded give us as speakers a sense of what we can do—which moves would make sense and which would not. The unfolding process of language use in a form of life can thus be thought of as offering invitations to speakers to continue patterns of activity set up in the past. But exactly what the individual speaker does in a particular situation is never fully settled by what has happened in the past. What people do is always continuing, extending, and transforming—often in creative and innovative ways as in speaking speech—the expressive possibilities set up in a language game. Even in more routine and everyday uses of language the speaker is adapting what they say and how they say things to the particularities of the situation in which they find themselves.

An example from the work of our own group comes from observing a skilled practice of architects at work in the planning and building of an art installation. An effect of the architects’ talking with each other is to enable the coordination of their activities over multiple time-scales (Van Dijk and Rietveld, under review). Talking is situated but it can also situate the architects’ wider activity. Conversing can help the architects to articulate patterns of similarity and dissimilarity between their activities over time, or it can enable them to bring aspects of how things were in their past situations into the current situation. A particular action possibility can be promoted which leads to a restructuring of the fields of relevant affordances of the team members involved. Conversing can thus enable them to join forces to engage well with a large-scale inviting affordance: an installation that doesn’t yet exist but which the architects are working together to design and build.

We’ve also observed how language can have the opposite effect when it is too ambiguous (Van Dijk and Rietveld, under review). An overly ambiguous utterance can lead to a temporary faltering in the attempts of the architects to coordinate their preparatory activities in the studio now with their building the installation next year on the selected site. Speaking as an expressive activity is always ambiguous and indeterminate. There are no determinate thoughts waiting in the mind of the individual to be expressed. The individual’s thoughts take form only in the act of giving expression to them. All the speaker can ever do is gesture at a thought, but exactly what they mean will always remain somewhat indeterminate, and that typically works just fine. However, in order for language to do the work of situating activities, or to fail to do this work, it must be able to make sufficiently articulate and manifest an understanding of how things are. We’ve been arguing these are possibilities that are made available by engaging with enlanguaged affordances. The affordances one is responsive to in acting skilfully are always enlanguaged because among the action possibilities always open to us as skilled participants in language games is the possibility to give articulation to our ways of being practically involved with the environment.

In this section we set out to show how the speaking activity of an individual relates to the practices of a community of language users to which they belong. We have approached this issue from three complementary points of view. The first perspective was that of the form of life as a whole. From this perspective, we find patterns of language use set up through a history of engagement in a web of interrelated language games. The second perspective we adopted was that of an observer of speakers engaged in conversation. From this perspective we were able to see the interdependence of Merleau-Ponty’s speaking and spoken speech at work. Finally, we can approach the speaking activity of the individual from the perspective of the individual thinker invited to make articulate a way of being practically involved with the world. We can think of past patterns of practice as material configurations in the landscape of affordances that come to exist through the repeated activities of individuals participating in the language games of the communities to which they belong. What the speaker is called upon to do is always, at least to some extent, a continuation, extension, and transformation of the expressive possibilities set up in the language games speakers have established through their past activities.

6 Conclusion

Our aim in this paper was to show how the affordances that take shape in our human sociomaterial practices are enlanguaged: they are entangled with practices of giving expression to ways of thinking about the world in speech and writing. We’ve shown how the capacity for linguistic thought admits of conceptualisation in terms of skilled intentionality, i.e. in terms of coordinating with multiple affordances at the same time. There is thus no problem of scaling-up from perception and action to supposedly representation-hungry ‘higher-order’ cognition. Linguistic thought about the absent, distal, and counterfactual can instead be conceptualized in terms of skills for engaging with enlanguaged affordances. We’ve used Merleau-Ponty’s work on speech and thought to develop an account of linguistic thought that is in line with the Skilled Intentionality Framework (SIF). Our account of linguistic thought is able to do justice to the simultaneous social and material life of language. It recognises the materiality of linguistic thinking by denying that any separation can be made between the bodily activity of speaking or writing and the particular thoughts that are given expression in these embodied activities. It recognises the sociality of linguistic thinking by situating bodily expressive activities in the form of life of a community of language speakers. Linguistic thought, we’ve argued, is a dimension of our bodily being-in-the-world. The thoughts a speaker expresses articulate their caring mode of engagement with the world.

Linguistic thought is often thought to be contentful, explainable only by making appeal to semantic properties such as reference, truth or falsity, correctness and incorrectness (Hutto and Myin 2013, 2017). We do not deny that some cases of linguistic thought are content-involving—assertions for instance can be assessed for truth or falsity. What we do deny however is that such cases of linguistic thought in which truth and falsity are at stake need to be understood in terms of content-carrying mental representations. We’ve argued instead that linguistic thinking across the board is only achieved in the activity of body subjects as they engage in sociomaterial practices of speaking and writing. As members of linguistic communities, the possibility is always available to us, through our participation in sociomaterial practices of speaking and writing, to say how things matter to us. Sometimes what is at stake, when for instance we are doing science, or we are a witness in a court of law, is telling the truth.

Talking can help multiple individuals to coordinate their activities over time (Van Dijk and Rietveld, under review). For example, something said in conversation by the other may bring into the present cycling situation the plan that we made yesterday to go visit the forest this weekend on our bikes. Such an expression can promote an action possibility, and at the same time create a shared situation, thereby restructuring the field of relevant affordances of each of us. Mentioning the possibility to go to the forest shows something of shared relevance to us. This function of speech to restructure the field of relevant affordances connects back to the scaling-up problem. Speech enables humans to make an issue of affordances over longer time-scales, of say days, weeks or years. Thus it allows us to engage with affordances in their absence (such as a building that doesn’t yet exist, or two architects taking the concerns of an absent collaborator into account, Rietveld and Brouwers 2016). Speech also allows us to articulate abstract ways of thinking about the world (e.g. triangles in geometry). Once we have access to the expressive possibilities set up in past linguistic practices we can let our imaginations run wild, summoning up the fictitious worlds we encounter in works of fiction, and in the cinema. So-called “higher-order” cognition is traditionally seen as a problem for radical approaches to cognition that try to do without explanation in terms of inner mental representations. We’ve shown how the problem can be dissolved once we use the notion of skilled intentionality to think about linguistic thinking.