1 Introduction

An important question for any philosophical account focussed on basic arithmetic is how we learn that the natural numbers ‘keep going’, i.e. that for every natural number there is another, larger, natural number. This aspect of the natural numbers is codified in the successor axiom of Peano arithmetic (with \(\mathbb {N}x\) meaning ‘x is a natural number’ and xPy meaning that x is the predecessor of y):

$$\begin{aligned} ({\rm Successor} )\quad \forall x(\mathbb {N} x\rightarrow \exists y\, xPy) \end{aligned}$$

While this is a difficult question for any account, my main interest here will be to see how we might answer it on the basis of what actually happens in childhood. In other words, how is it that virtually everyone manages to acquire number concepts in childhood that are consistent with the successor axiom? In order to answer that question, it is important not only to look at philosophical accounts, but also to connect these to the empirical data that is available. My aim is to provide a philosophical account of our knowledge of the successor axiom that is at the same time empirically feasible, that is, an account that accords with what we know about actual child development. In doing so I will not provide an account that is specific to platonism or nominalism, as I suspect that the underlying data can be interpreted in both directions. I will also not provide a more general account of our knowledge of actual infinity, which has been the aim of e.g. Lakoff and Núñez (2000) and Pantsar (2015). I only intend to provide an account of how we can acquire number concepts that match the successor axiom, in other words, to explain how children figure out that the numbers ‘go on forever’, so that there are infinitely many numbers. I will talk of ‘matching’ or ‘being consistent with’ the successor axiom as a way to state more precisely that children have figured out that there are infinitely many numbers.

My aim is to offer an account of children’s knowledge that is based on the structure of the numeral system and not on the procedure of adding one. I argue that such an account is a viable alternative to the existing (plus-one) account, even though it is not yet possible to decide between the two. First, I lay some of the groundwork for my own position by discussing what we know about our grasp of numbers that are represented by more than one digit (e.g. 84). That provides a basis of information about the mechanisms that are used to understand the structure of our numeral system, which I claim can explain how we acquire concepts that are consistent with the successor axiom. The work in Sect. 2 thus prepares the different elements for this account by discussing the structural aspects of the numeral system and how children learn these aspects. It is particularly relevant for Sect. 4, where the aspects of the numeral system identified in Sect. 2 are used to interpret parts of the empirical data.

Before working out my own account I briefly discuss the kind of account that has been prevalent in the philosophical literature. This is the alternative explanation of how we acquire concepts consistent with the successor axiom. Philosophers such as Parsons (2007) have claimed that we can acquire this knowledge on the basis of the successor relation. I do not criticise the possibility of such an approach, but do claim that this is not the only currently viable hypothesis. The alternative is my own account based on the syntactic structure of the numeral system. This account consists of two claims: (1) the syntactic numeral system supports children’s learning that the numbers never end and (2) children use digits with a high place-value (further to the left) to form continually larger numbers and thus figure out that one can always arrive at a larger number. The argument that this alternative account is viable is distributed over Sects. 3 and 4. In Sect. 3.2 I discuss some of the empirical data that shows that children are able to form larger numbers by attending to the leftmost digits of numbers, thus exploiting the structure of our numeral system. Section 4 then presents my alternative account in detail and argues that it fits the data from Sect. 3.2 as well as additional data identifying the stages in which children seem to learn that there is always a larger number. Yet, although the account fits the data, there are not sufficiently many studies to argue that it is correct, or to determine its relation to the traditional accounts. It is a viable alternative, but exactly what place it should get (e.g. whether it is merely a first step children go through or captures the entire learning process) is still unclear.

2 Working with multi-digit numerals

2.1 Processing of multi-digit numerals

One may already suspect that there is a cognitive difference between small and large numbers on the basis of simple introspection. It is simple to imagine four apples, but to have a clear image of 83 apples is nowhere near as simple. We do not learn about these ‘large’ numbers on the basis of counting, as we do in the case of four (e.g. Sarnecka and Lee 2009). Of course this does not imply that there are significant differences in the way our brain handles these numbers. It turns out, however, that we also process single-digit and multi-digit numbers differently. There is more and more evidence that our brain processes multi-digit numbers on a digit-by-digit basis, which is usually referred to as compositional processing (Nuerk et al. 2015). The alternative, namely processing the number as a whole, is commonly referred to as holistic processing. There is also a hybrid option, but that is usually not preferred over the other two for parsimony reasons.

Before going into the consequences of the high likelihood that our brains process multi-digit numbers compositionally, it may help to mention some of the reasons (there are more in the article cited above) why this is considered to be the best explanation of the data. One of the main reasons is based on response times to questions of the form ‘which of these two numbers is larger?’ One well-known phenomenon, called the distance effect, is that response times go down if the numerical difference between the two numbers increases (Moyer and Landauer 1967). What is curious is that for multi-digit numbers there is not only a distance effect, but also a unit-decade compatibility effect: response times are faster if the unit and decade digits of the larger number are individually larger than the corresponding unit and decade digits of the smaller number. So, response times to \(67 > 52\) are shorter than response times to \(62 > 47\), because for the former \(6 > 5\) and \(7 > 2\), but for the latter we have \(6 > 4\) and \(2 < 7\) (Nuerk et al. 2001, 2002; Mann et al. 2012). This is an effect that can be explained only if processing of multi-digit numbers is compositional or hybrid, where compositional processing is the preferred hypothesis because hybrid processing doesn’t explain any features better than compositional processing (Moeller et al. 2011).
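The compatibility condition can be made concrete with a small sketch. The function name `unit_decade_compatible` is only an illustrative label and does not come from the cited studies; the sketch merely classifies a pair of two-digit numbers and says nothing about response times. Note that both example pairs have the same numerical distance (15), so the distance effect alone would not separate them.

```python
def unit_decade_compatible(larger: int, smaller: int) -> bool:
    """Return True if a two-digit comparison is unit-decade compatible.

    A pair counts as compatible when both the decade digit and the unit
    digit of the larger number exceed the corresponding digits of the
    smaller number.
    """
    if not (10 <= smaller < larger <= 99):
        raise ValueError("expected two-digit numbers with larger > smaller")
    decade_l, unit_l = divmod(larger, 10)   # e.g. 67 -> (6, 7)
    decade_s, unit_s = divmod(smaller, 10)  # e.g. 52 -> (5, 2)
    return decade_l > decade_s and unit_l > unit_s


print(unit_decade_compatible(67, 52))  # True: 6 > 5 and 7 > 2
print(unit_decade_compatible(62, 47))  # False: 6 > 4 but 2 < 7
print(67 - 52, 62 - 47)                # 15 15: same overall distance
```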

Another important reason for thinking that multi-digit numbers are processed compositionally is known as the serial-order effect. This effect is found when participants are first presented with two numbers and are subsequently asked if a third number was among those two initial numbers. It turns out that if there is a simple arithmetic relationship between the digits of the three numbers (e.g. 23, 45, 67) then it takes longer for participants to reject the third number as not being among the initial two, as compared to when this is not the case (e.g. 23, 45, 65). Such an effect is best explained by compositional processing, as in that case there is attention to the digits on their own and not only to the two-digit numbers as a whole (García-Orza and Damas 2011).

A final piece of data that may be worth mentioning is that there is a significant influence of the structure of number words in one’s native language on how easy it is to learn the Indo-arabic numerals. One language-related effect is the inversion effect, where children make more errors in writing down the corresponding Indo-arabic numeral if their native language reverses the unit and decade digit (e.g. ‘fünfundvierzig’, five-and-forty, for 45; Klein et al. 2013). On the other hand, such errors have not been observed for languages that do not have this kind of inversion, such as French (Barrouillet et al. 2004; Camos 2008). This difference between languages also leads to an increased unit-decade compatibility effect for languages with inversion, because inversion makes it more difficult to determine which digits have a higher value (Nuerk et al. 2005; Pixner et al. 2011). Language inversion even makes it more difficult to solve addition problems that involve carries, because again one needs to know which digits are the unit digits and which the decade digits in order to be able to correctly carry the sum of the units (Moeller et al. 2011a; Göbel et al. 2014).

2.2 Our grasp of multi-digit numbers

What the psychological data shows is that the structure of the numeral system is an incredibly important component of our grasp of multi-digit numbers. Because these numbers are processed compositionally it is essential to be aware of the fact that Indo-arabic numerals use a place-value system with base 10. In other words, it is not possible for us to parse an Indo-arabic numeral as a whole without paying attention to the peculiarities of the numeral system. Consequently processing would be different if we were to use another numeral system, such as the Roman numeral system (Lengnink and Schlimm 2010). For in the case of the Roman numeral system the specific place in which a digit is located has little to do with the value, especially in the original numeral system, which was purely additive (and therefore wrote four as IIII instead of as IV).
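The contrast between a place-value system and a purely additive one can be illustrated with a minimal sketch. The function names and the symbol table are assumptions made for this illustration; the additive evaluator only models the original, purely additive Roman notation (IIII, not IV) and deliberately ignores the later subtractive convention.

```python
def place_value(numeral: str, base: int = 10) -> int:
    """Evaluate a positional numeral: each digit is weighted by its place."""
    total = 0
    for digit in numeral:
        total = total * base + int(digit)
    return total


# Symbol values for the purely additive Roman system (illustrative table).
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def additive_roman(numeral: str) -> int:
    """Evaluate a purely additive Roman numeral by summing its symbols."""
    return sum(ROMAN_VALUES[symbol] for symbol in numeral)


# Swapping digits changes the value in a place-value system ...
print(place_value("84"), place_value("48"))          # 84 48
# ... but reordering symbols does not change a purely additive sum.
print(additive_roman("XVI"), additive_roman("IVX"))  # 16 16
print(additive_roman("IIII"))                        # 4
```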

In the case of the numerals we use nowadays, children begin learning to recite these numbers in a long count list. They learn to count on from the numbers that they do know, but do not seem to realise the cardinal value of the numerals they learn to recite. While children around four, when they can count to at least 10, know what number comes after a number within their count list, they often do not know what number you get by adding one to that same number (Davidson et al. 2012). Therefore it is possible to correctly recite the first thirty or fifty natural numbers while not knowing what cardinal value to associate with the numbers above (roughly) ten. The syntactic structure of the numeral system, namely that one changes decade digit after the unit digit has reached 9, may be grasped at this stage, as children do manage to produce multi-digit numerals in the correct order. However, I would say that they do not have the corresponding number concepts, because they do not associate these numerals with specific cardinal values. That happens only if one can reason in exact terms how much more or less this number is compared to other numbers. Merely knowing that a different number needs to be assigned if one adds one is not enough to grasp a specific number concept. It is certainly a prerequisite, but it is not enough to distinguish one’s concept of 35 from that of 36, if one doesn’t also realise that one added to 35 is 36. In order to have a number concept, rather than a merely formalistic concept of a numeral, one needs to view these as specific cardinal values. This involves various elements:

  • The syntactic structure of the numeral system, which is required for producing a correct count list

  • The knowledge that these numerals indicate cardinal values, so that in applying them we need to use a different numeral if an object is added to the relevant collection of items

  • Linking the syntactic structure of the numeral system to the interpretation of numerals as cardinal values

It is this last element, the linking of the syntactic structure of the numeral system to the interpretation of numerals as cardinal values, that children are missing if they know what numeral is next in the list but do not know what numeral should be mentioned if one adds one. Only when children manage to connect those two elements do they grasp the number concept.

Importantly, in order to link the syntactic structure (with its place-value system) to the interpretation as cardinal values for multi-digit numbers, one has to interpret the separate digits in terms of the different values assigned. If you don’t do so (and children occasionally fail to do so), one sees mistakes such as the following [from Lengnink and Schlimm (2010)]:

[Figure a: example of a child’s addition error, from Lengnink and Schlimm (2010)]

In this case a child hasn’t assigned a higher numerical value to the digit in the decade position, instead adding the two multi-digit numbers as \(4 + 3 + 2 + 6 = 15\). When linking the syntactic structure, already grasped from the counting routine, and the interpretation as cardinal values it is crucial to assign the right cardinal value to the individual digits. Because the digits are processed independently, this means that one has to take the additional step of assigning the decade digit a value in terms of tens. That assignment is harder if one’s native language has a different structure from the Indo-arabic system, which helps establish that this is a step that children have to take. That there is such a step is particularly relevant for my interpretation of how children acquire number concepts that are consistent with the successor axiom: it will matter there that it is possible to grasp the syntactic structure of the numeral system without interpreting these numerals as cardinal values. First, however, I will discuss the current philosophical accounts of our knowledge of the successor axiom.
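The difference between the mistaken and the correct procedure can be sketched as follows. The operands 43 and 26 are an assumption chosen only because their digits match the sum \(4 + 3 + 2 + 6 = 15\) mentioned above; the original figure is not reproduced here, and the function names are hypothetical.

```python
def digit_sum_error(a: int, b: int) -> int:
    """Add all digits as if each stood for units (the mistake described above)."""
    return sum(int(d) for d in str(a) + str(b))


def place_value_sum(a: int, b: int) -> int:
    """Add two non-negative integers with each digit weighted by its place."""
    def value(n: int) -> int:
        digits = str(n)
        # The leftmost digit gets the highest power of ten.
        return sum(int(d) * 10 ** (len(digits) - 1 - i)
                   for i, d in enumerate(digits))
    return value(a) + value(b)


# Assumed operands: 43 and 26 (illustrative only).
print(digit_sum_error(43, 26))  # 15
print(place_value_sum(43, 26))  # 69
```

The only difference between the two functions is the weighting of the decade digit, which is the additional step that the text argues children have to take.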

3 The successor axiom via adding one

3.1 Current philosophical accounts

Usually philosophical accounts that try to explain specifically how we come to know that there are infinitely many natural numbers appeal to the procedure of adding one. This may be indirect, in terms of the proof of the successor axiom when the Dedekind–Peano axioms are embedded in Frege arithmetic (Linnebo 2004). The suggestion in that case is that one might learn about the infinity of the natural numbers through the construction of the number sequence. The Fregean proof of the successor axiom is based on defining the successor number of n as the number of numbers that are smaller than or equal to n. Since the numbers start with 0, this new number is exactly equal to \(n+1\). Naturally, no one expects children to acquire concepts consistent with the successor axiom by following this proof (and it is indirect for that reason; no learning procedure is suggested and it is only shown that there are infinitely many numbers in Frege arithmetic). Still, the general strategy of going from n to \(n+1\) is visible here.
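The counting idea behind the Fregean definition can be checked for small cases with a short sketch. This is only a finite illustration under the assumption that the numbers are modelled as Python integers starting at 0; it is not a rendering of the proof in Frege arithmetic, and the name `frege_successor` is hypothetical.

```python
def frege_successor(n: int) -> int:
    """Count the natural numbers (starting at 0) that are <= n."""
    if n < 0:
        raise ValueError("expected a natural number")
    return len([k for k in range(0, n + 1)])


# Because counting starts at 0, the count of numbers <= n is n + 1.
print([frege_successor(n) for n in range(5)])  # [1, 2, 3, 4, 5]
```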

It may also be direct (so by giving an explicit learning procedure, even though typically not one that has children in mind), such as in the account of Parsons (2007), further expanded upon by Jeshion (2014). Their account describes a situation where one acquires knowledge of the Dedekind–Peano axioms for a particular kind of numeral system: the stroke-language. This is a tally system, where zero is encoded by a single | and the successor function is the function that appends a | to a string. On the basis of this stroke-language, Jeshion gives two ways in which one may acquire knowledge of the successor axiom, both based on Parsons’ less elaborate account that we learn this by figuring out that we can always extend a string of strokes by appending one more stroke. Note moreover that, while these are ways to acquire knowledge of the successor axiom, the suggestion is not that we literally learn the successor axiom. They are rather suggestions for a learning process with the outcome that we realize that there are infinitely many numbers, i.e. the process at the end of which our number concepts are (merely) consistent with the successor axiom.

First, it is possible for someone to imagine a string of strokes, to which one then appends another stroke in order to realise that for every stroke string there is a successor that is exactly one stroke longer. In order for that general conclusion to be reached, it is important that the initial string is imagined in a very particular way. Namely, while one has to imagine a particular stroke string (to be sure that one is representing a stroke string), the internal structure should not be represented, in order to have the required generality. So, one is supposed to imagine something of the form \(||| \ldots |\) to which one can then imagine appending a stroke, giving \(||| \ldots ||\). Since the appending of the stroke didn’t depend on the representation of the internal structure of the string, it is possible to conclude that the successor axiom, in all its generality, is true of the stroke language (Jeshion 2014, p. 338).

Second, instead of building the generalisation into the thought-experiment, one may take that step afterwards. So, one starts by imagining a particular stroke string, along with its internal structure. Then one imagines appending a stroke to that particular string. In order to arrive at knowledge of the successor axiom, one then has to realise that the ability to append another stroke is completely independent from the number of strokes in the string to which these are appended. Consequently it is possible to arrive at a successor string for any string (Jeshion 2014, p. 338).
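The stroke-language and its successor function can be sketched in a few lines. The encoding follows the description above (zero is a single stroke, the successor appends a stroke); the names `ZERO`, `successor` and `encoded_number` are assumptions introduced for illustration and are not taken from Parsons or Jeshion.

```python
ZERO = "|"  # zero is encoded by a single stroke


def successor(string: str) -> str:
    """Append one stroke to a stroke string."""
    if not string or set(string) != {"|"}:
        raise ValueError("expected a non-empty string of strokes")
    # The step only appends; it never uses how many strokes there are.
    return string + "|"


def encoded_number(string: str) -> int:
    """Number encoded by a stroke string, given that '|' encodes zero."""
    return len(string) - 1


s = "|||"
t = successor(s)
print(t, encoded_number(s), encoded_number(t))  # |||| 2 3
```

That `successor` only appends to its argument, whatever its length, mirrors the point of the second strategy: the ability to append a stroke does not depend on the number of strokes already present.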

From the psychology side there is little in the way of explicit accounts as to how one learns that for every number there is another, larger, number. There is, however, an implicit suggestion in the way in which the final learning stage, being a cardinal-principle-knower after having been a subset-knower, is characterized (Carey 2009). A subset-knower is a child who knows the meaning of the number words up to three or four. Children are counted as knowing the meaning of a number word if they succeed at the give-N task: the task where they are asked to give the experimenter N items. If they make no more than two mistakes when prompted to give three items, then they are counted as knowing the meaning of the word three. It turns out that children learn the meaning of the first few number words in order, so first one, then two, etc. (Wynn 1990, 1992). When they get past four they also typically learn the Cardinal Principle and are counted as cardinal-principle-knowers. Sarnecka and Carey explain the principle as follows: “Only cardinal-principle-knowers understand that adding exactly 1 object to a set means moving forward exactly 1 word in the list, whereas subset-knowers do not understand the unit of change” (Sarnecka and Carey 2008, p. 662). Knowing this, which is the Cardinal Principle, is what distinguishes a CP-knower from a subset-knower; it is just that typically children learn the Cardinal Principle after learning the meaning of four.

The suggestion that the plus-one strategy is also endorsed here is found in another description of CP-knowers: “it seems that CP-knowers (but not subset-knowers) have begun to connect the counting list and counting routine to the idea of a successor – that each number has a next number, exactly one more, which is named by the next word in the list” (Sarnecka 2015, p. 12). Given this attribution of knowledge to children [it is still somewhat contested in the literature whether children actually know all of this at the stage discussed by Sarnecka and Carey (Davidson et al. 2012)] there is a clear way of figuring out that there is always a larger number. The strategy outlined by Parsons and Jeshion in fact fits rather well: one either imagines adding an object to a collection without its internal structure, which would force one to move to the next numeral (possibly then not in the count list), or one imagines a particular collection to which an object is added, after which one can realise that the possibility of adding an object is completely independent from the number of objects already in the collection.

The relevant commonality between these different philosophical positions is that they all work along the same line: we acquire number concepts that are consistent with the successor axiom on the basis of the successor function. In other words, we can learn that there are infinitely many numbers through the procedure of adding one. In the rest of this paper I argue that there is a viable alternative explanation: that we learn that there are infinitely many numbers on the basis of the structure of the numeral system.

3.2 Children’s behaviour when constructing larger numbers

If children acquire number concepts consistent with the successor axiom in virtue of a realisation regarding the successor function, one might expect them to appeal to the process of adding one when asked questions related to the successor axiom. Both of the above strategies from Jeshion involve constructing a longer string (a larger number) by direct application of the successor function. One may thus expect that on the plus-one account children are predicted to form larger numbers by adding one, at least in typical tasks. They don’t form larger numbers in this way, at least not in the case of younger children. As I explain a little later this is not a good argument against the plus-one strategy, but it does provide an argument in favour of the viability of my alternative presented in Sect. 4. This data helps establish that children can reason on the basis of the structure of the numeral system to form larger numbers.

Researchers have used responses to a game as an important tool to test whether children realise that for every natural number there is a larger natural number (in the studies described here children were between 6 and 15). The game itself is very simple and consists of two moves: first the starting player names a number and then the second player names a number. The person that names the higher number wins, which means that the second player can always win. The game is then used to test whether children grasp that there is always a larger number by asking the child before the game whether she wants to be first or second. Specifically, children were presented with the following query: “Lets play a game. Each of us will say a number, the one whose number is greater wins. Would you like to be first or second in choosing a number?” (Falk 2010, p. 8) After answering, the child was asked to explain his or her choice. Regardless of the provided answer both players then give a number and a winner is determined, a procedure that was repeated no more than four times (primarily to make sure that the children understood the game). Having ended the game, the experimenter suggested taking turns with the child at naming larger and larger numbers. This part was ended either when the child failed to name a higher number or after it had gone well for a period of time, leading to the final part of the experiment, which consisted of an open-ended interview about whether or not the sequence could be continued indefinitely. In these studies children who choose to go second are generally considered to display knowledge that you can always find a larger number, i.e. that there is no largest number. This was further verified in the interviews by explicit questioning.
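The structure of the game, and why going second guarantees a win, can be sketched as follows. The functions `play_round` and `plus_one` are hypothetical names for this illustration; the reply strategy is passed in as a parameter because the game itself does not prescribe how the second player finds a larger number.

```python
from typing import Callable


def play_round(first_number: int, reply: Callable[[int], int]) -> str:
    """Return who wins one round: the player naming the higher number."""
    second_number = reply(first_number)
    if second_number > first_number:
        return "second"
    if second_number < first_number:
        return "first"
    return "draw"


def plus_one(n: int) -> int:
    """One possible reply strategy: name the given number plus one."""
    return n + 1


print(play_round(500, plus_one))       # second
print(play_round(10**9, plus_one))     # second
print(play_round(500, lambda n: 600))  # second: a different winning reply
```

Any reply function that always returns a larger number wins for the second player; adding one is just one such function.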

On the (questionable) assumption that this study displays children’s reasoning with forming larger numbers one would expect them to add 1 to a number if they follow the plus-one strategy. If they are unfamiliar with a number they might even repeat the given number and add ‘plus one’, to be certain that their number is higher. If they already think about numbers increasing constantly through the ‘+ 1’ operator then it also gives a very simple reason for going second, and for why the sequence rehearsed afterwards will continue indefinitely. If children form larger numbers in a different way, and explain their choices differently, then that means that there is another method available to them on the basis of which they might learn that the numbers go on indefinitely. It does not mean that they in fact use a different strategy, but does widen the options. So, here are three examples of the responses that have been noted from children at the age where they just start to succeed at winning the game. The gender and age of the child are indicated at the start of each set of responses.

G9 [in the increasing sequence]: (500) 600 (20,000) 30,000 (million) 2 millions (billion). What is that? (A very large number) Aha! [ironically] 2 billions (trillion) 2 trillions (googol) 2 googols. (Will the numbers end up for us?) No, because each time you make up a number, I can add to it. (Did you recognize these numbers?) No. (How did you manage to find bigger numbers?) Because you said. (Falk 2010, p. 35)

G10 [in the continuing sequence of Game 1]: (120) 900 (1,000) 1,200 (2,000) 3,000 (million) million and a hundred (2 millions) 10  millions (billion). What is that? (A very large number). I don’t know, I cannot go on because I don’t know any more names of numbers. (Falk 2010, p. 37)

B6 [Game 1]: (How long will the game go on?) Until no end; the numbers will never stop, because if there is, say, a milliard, one can add to it another and another milliard until infinity (Falk 2010, p. 35)

These examples show the pattern in which children choose to give higher numbers. The last example, from the reasoning related to the increasing sequence, also shows that the underlying reasoning here is not focussed on adding one. When asked in an ordinary interview if children can find larger numbers they tend to give responses along the same lines (with I the interviewer and A an 8-year-old child):

I: You wouldnt stop? How is that? I claim that I can say a very big number and you will stop there!

A: Which?

I: Well...27 quadrillion 842 trillion 520.

A: There is one bigger!

I: Which, which one?

A: 100 quadrillions of quadrillions of quadrillions. (Singer and Voica 2008, p. 195)

It seems that children only start to reference the procedure of adding one to obtain a larger number in these tasks later on. For example, one fifth grader (so 10–11 year old) says: “There are infinitely many natural numbers because if I pretend that I found the biggest, I can add 1 and I get a bigger one” (Singer and Voica 2008, p. 196). Around that age the explanations given with regards to the game in Falk’s study also start to change:

G10: I want to be second because so I’ll hear what you say and I’ll say something bigger ...I think one cannot ever finish that game because each time you can add one and another one, and there is no end to it. (Falk 2010, p. 35)

G13 [Game 1]: There is always a larger number. Not always all of them have names. There are numbers that were not given a name because they are so big, but you can always add one to them. (Falk 2010, p. 36)

G10 [Game 1]: Each number that you say I can add one to it. One cannot finish the game. (Falk 2010, p. 36)

Yet even here there is the occasional mention of an alternative strategy:

G11: No, the game will not end. Perhaps in words there is an end, but when you write a number, you can always write another zero and another zero, so that it will not end for the life of me. (Falk 2010, p. 35)

The pattern of responses and the bits of reasoning that have been recorded show that children do not always rely on the procedure of adding one to form higher numbers, or come to the conclusion that one can continue a sequence of numbers indefinitely. Although children do eventually adopt the more uniform strategy of forming a larger number by adding one, there is a period of time before that during which they reach the same conclusions, but without the reasoning associated with the plus-one strategy. Children do not start by responding in accordance with the pattern outlined by the current philosophical theories, even if they do conform to that pattern of responses a few years later.

One reason why children may give these response patterns is that they are syntactically simpler. It is syntactically simpler to respond to ‘five hundred’ with ‘six hundred’ than with ‘five hundred and one’. Similarly, the recorded response to ‘one hundred and twenty’ (‘nine hundred’) is syntactically much simpler than the response obtained by adding one (‘one hundred and twenty-one’). That might be why children give these responses rather than responses in line with adding one. For that reason, the response pattern should not be taken as evidence that children do not reason along the lines of the plus-one strategy. It may well be that they understand everything needed for the plus-one strategy and reason in that way, but choose their responses differently for reasons of simplicity. Nevertheless, the fact that they can provide responses in this way does show that the strategy I will work out in the next section is available to them. Since they prefer answering by increasing the leftmost digit, they are aware of the fact that doing so produces a larger number.

The response patterns that we find are therefore not a good argument against the plus-one strategy. The explicit motivations given by children when asked why the game never stops are in a bit of a conflict with the plus-one strategy, if one thinks that these motivations display children’s reasoning about infinity. When young children, as cited above, say that the game never ends because even to a milliard (a dated, but acceptable, term for a billion) one can add another milliard, then this reasoning doesn’t fit the plus-one strategy. But here too there may be reasons for this that are different from their not using the plus-one strategy to acquire number concepts consistent with the successor axiom. In short, we don’t have any argument that shows that the plus-one strategy is not the one used by children.

These response patterns should, moreover, not be confused with the fact that children display some understanding of the successor function before they figure out that there are infinitely many natural numbers. Davidson et al. (2012) and Cheung et al. (2017) found that a little before learning that the numbers go on forever children manage to consistently name the next numeral, also for complex numerals (i.e. up to 100). That finding is consistent with results presented by Fuson and Hall (1983) that children learn the syntax of these numerals quite late. It is tempting to interpret this as evidence in favour of the plus-one strategy, but caution is in order. What these findings show is that children acquire a good grasp of the syntax of complex numerals right before they figure out that there are infinitely many numbers.

Knowing the next numeral in the list, and that you arrive at the next numeral when one item is added to a collection [the CP principle, which children supposedly learn before understanding the syntax of complex numerals (Sarnecka and Lee 2009)—though it is not clear that they immediately apply this principle to the more complex numerals (Davidson et al. 2012)] is not sufficient for having concepts consistent with the successor axiom through the plus-one strategy. It could also lead to their figuring out the structure of the numeral system and following the strategy I present in the next section, since this ability can be interpreted as displaying primarily syntactic knowledge about the numerals. In line with such a hypothesis Le Corre (2014) argues that children learn the ordering of slightly more complex numbers (numbers up to 10 were tested) before learning their cardinal meaning, even though they do not yet know the ordering for higher numbers (e.g. that 8 < 10) when they become CP-knowers. This dissociation between syntax learning and the learning of cardinal meaning further supports the idea that the mere understanding of the structure of the numeral system does not automatically lead to an interpretation of this structure in terms of the plus-one operation. At the moment, then, these findings do not decide between the two accounts.

4 The successor axiom via the structure of the numeral system

4.1 Forming larger numbers

In order to be able to realise that every natural number has a successor one has to be able to form a larger number, regardless of which number is presented. Adding one to the given number is one way to form a larger number, but there are of course others. What seems to happen in the above response patterns is that children use the structure of the numeral system to form larger numbers. This is the first claim for my alternative account: children use the syntax to learn something about the domain. This claim is still compatible with the plus-one strategy, as children might learn the operation of adding one from the syntax. My second claim is that instead of making changes to the unit digit of the presented number they make changes to digits with higher place-value (i.e. further to the left), such as when they move from 500 to 600 or from a billion to 2 billion. Furthermore, in the above responses they do so even when they are unfamiliar with the number words used, as was evident in the cases of billions, trillions and googols. Given that digits are processed independently (and this is not just true for numbers that are written down), it would make sense to focus on just one of the digits, and to then pick the largest one. One reason for doing so would be that picking the leftmost digit guarantees that you have at the very most one carry to execute. Here, then, is a basic procedure for generating larger numbers that comes from one’s realisation that the leftmost digit indicates the largest number:

figure b
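As a rough illustration of the procedure just described (the original figure is not reproduced here), the following Python sketch raises the leftmost digit of a written base-10 numeral by one. The function name `increase_leftmost_digit` is hypothetical, and treating a leading 9 as a single carry into a new digit is an assumption made for this sketch rather than something taken from the figure.

```python
def increase_leftmost_digit(n: int) -> int:
    """Return a larger number by raising the leftmost digit of n by one."""
    if n < 1:
        raise ValueError("expects a positive integer")
    digits = str(n)
    # Raise only the leftmost digit; every digit to its right stays untouched.
    # A leading 9 becomes "10", which is the single carry mentioned in the text.
    raised = str(int(digits[0]) + 1)
    return int(raised + digits[1:])


print(increase_leftmost_digit(500))            # 600
print(increase_leftmost_digit(1_000_000_000))  # 2000000000
print(increase_leftmost_digit(950))            # 1050
```

The sketch never inspects the unit digit, which is the contrast with adding one that the second claim draws.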

This method is guaranteed to yield a larger number in virtue of the place-value system that our (written) numeral system is based on. For the spoken numeral system it is, in languages with inversion, trickier to make sure that one has identified the right digit. With spoken numerals one also needs to pay attention to the words indicating the different base-10 values, but for numbers above 99 the leftmost written digit is usually said first (footnote 2). So, while trickier, it is not significantly more difficult to follow this strategy for spoken numerals. Yet perhaps it is because it is more difficult for spoken numerals under 99 that in the above interview A answers 100 quadrillions of quadrillions of quadrillions to 27 quadrillion 842 trillion 520. A slightly different strategy is used, because not just the leftmost digit is increased. Instead A makes an increase from 27 to 100, thus adding a digit to the left and changing the two digits following after that. In some cases, then, a child may choose to add a digit on the lefthand side. Furthermore, the numbers mentioned may be too unfamiliar for A to know with certainty that 100 quadrillion is larger than the number mentioned by the interviewer. In such a situation repeated mention of the highest sounding numeral (seemingly meant to indicate multiplication) is a safer bet than just to mention 100 quadrillion, even though that number would also be larger. These observations thus lead to a slightly more general method (even though I will ignore the repeated use of the highest sounding numeral):

figure c
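The more general method can be sketched in the same hedged way (again, the original figure is not reproduced here). The names `add_digit_on_left` and `form_larger`, and the `keep_rest` option, are hypothetical choices for this illustration. With the default setting, adding a digit on the left resets the remaining digits to zero, which mirrors A's move from 27 to 100.

```python
def add_digit_on_left(n: int, digit: int = 1, keep_rest: bool = False) -> int:
    """Place a non-zero digit to the left of the numeral for n."""
    if n < 1 or not 1 <= digit <= 9:
        raise ValueError("expects a positive integer and a digit from 1 to 9")
    width = len(str(n))
    # The new digit occupies a place one position higher than any digit of n.
    new_place = digit * 10 ** width
    return new_place + n if keep_rest else new_place


def form_larger(n: int, add_digit: bool = False) -> int:
    """Form a larger number by one of the two place-value moves."""
    if add_digit:
        return add_digit_on_left(n)
    digits = str(n)
    # Otherwise raise the leftmost digit by one, as in the basic procedure.
    return int(str(int(digits[0]) + 1) + digits[1:])


print(form_larger(500))                       # 600
print(form_larger(27, add_digit=True))        # 100
print(add_digit_on_left(27, keep_rest=True))  # 127
```

Both branches return a number larger than the input for every positive integer, so the choice between them can be left to whatever is easiest to say.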

Even now, this gloss doesn’t cover all of the responses that have been given by children. A 10-year-old girl responded to 120 with 900, in line with this strategy, but to 1000 with 1200 (probably uttered as ‘twelve hundred’) and to a million with a million and a hundred. At least the last response goes against the considerations of syntactic simplicity mentioned earlier, though notably the more complex response is not simply adding one. This deviating case also illustrates the fact that there is far too little data to say anything definite about the strategy being employed in answering these questions. There have been, as far as I know, no tests with numbers such as 2001 where one can see if children prefer 3001 or 2002, both syntactically equally simple answers. Nor have there been tests with numbers below 100, to see if children prefer answering to 79 with 80, 89 or something like 100. Since (more on this later) children only seem to display knowledge of the endlessness of the natural numbers when they can count to a hundred, it may well be that they follow a completely different strategy for numbers below a hundred. Yet, at the same time, what strategy they follow below 100 may not be that important: what matters for getting number concepts consistent with the successor axiom is that one has a strategy for forming ever larger numbers. If this strategy only kicks in after 100, because children already know all the numbers up to 100, then it might be that a dual account where children add 1 below 100 and go by the place-value of digits above 100 is a better fit. In short, there are still some significant gaps in our knowledge of these response patterns and the reasoning behind children’s conviction that one can always form a larger number. The current suggestion is an initial attempt at structuring the available data into a hypothesis that then needs to be subjected to further testing. What is interesting in this alternative hypothesis is that it offers another way in which one might come to know that there is always a larger number. That this is the case has to be argued for, and it is what I turn to next.

In order to know if this is a strategy that could be used to acquire number concepts that are consistent with the successor axiom, two things need to be established. First, that this strategy can be used to learn that \(\forall x \mathbb {N}x \rightarrow \exists y(\mathbb {N}y \wedge y > x)\). Second, that this also implies that one’s number concepts are consistent with the standard formulation of the successor axiom: \(\forall x (\mathbb {N}x \rightarrow \exists y\, xPy)\). The next subsection takes up these two elements and thus argues that the current hypothesis gives a viable alternative account, though it cannot yet be established if it is also the correct account.

4.2 From forming larger numbers to consistency with the successor axiom

Let me start with the question whether the strategy of increasing the value of the leftmost digit or adding a digit on the lefthand side may be used to learn that there is another, larger, number for every natural number. Because of the fact that the numeral system is based on a place-value system this is a mathematically sound strategy for creating a larger number. Note that this hinges on the use of a place-value system: if we were to use the Roman numeral system that also includes subtraction then adding a digit to the left may produce a lower number (e.g. adding I to the left of V gives us IV, which is lower). Because in our numeral system adding a (non-zero) digit to the left is guaranteed to generate a larger number, this is an acceptable strategy. Similarly, increasing the value of the leftmost digit is always guaranteed to generate a larger number. So as a method of generating a larger number for any number presented it is one that, regardless of how one fills in the specifics, will always work. Therefore this method can be used to show that \(\forall x \mathbb {N}x \rightarrow \exists y(\mathbb {N}y \wedge y > x)\), because we have a way of generating a y for an arbitrary x.
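The contrast with Roman numerals can be checked with a small Python sketch. The helper `roman_to_int` is a hypothetical, simplified reader of Roman numerals (it applies the usual subtraction rule and does not validate its input), introduced only to reproduce the IV example from the text.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral: str) -> int:
    """Read a Roman numeral, subtracting a symbol that precedes a larger one."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = ROMAN_VALUES[symbol]
        # A smaller symbol directly before a larger one is subtracted.
        if i + 1 < len(numeral) and value < ROMAN_VALUES[numeral[i + 1]]:
            total -= value
        else:
            total += value
    return total


# Roman numerals: putting I to the left of V lowers the value.
print(roman_to_int("V"), roman_to_int("I" + "V"))  # 5 4

# Place-value numerals: putting 1 to the left of 5 raises the value.
print(int("5"), int("1" + "5"))  # 5 15
```

In the place-value case the result generalises: prefixing any non-zero digit to the numeral of a positive integer adds a multiple of a power of ten that exceeds the original number.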

How might we realise that this method of increasing a number does indeed produce a larger number? After all, a philosophical account cannot just stop at noting that a method is sufficient for producing larger numbers. It also has to provide an account of a way in which one can convince oneself of the fact that the method in question always produces a larger number. It should be mentioned that children seem to realise that this is a method that always produces larger numbers. In the interviews there are some indications that they think along the lines of the above strategy and think that that will always yield an even larger number. So, if this strategy fits their thinking then it is not just a procedure that they happen to follow without being consciously aware of the implications of this construction method for the endlessness of the natural numbers. Hence, there has to be some way that the children use to get to this knowledge from the availability of the above method for increasing numbers.

Here is one way in which this can happen. Children need to learn the syntactic structure of the numeral system (recall the discussion of this structure in Sect. 2), which means that they need to learn that the leftmost digit indicates a value ten times that of the digit that’s to the right of it (if there is one). Or, put more sparsely, they need to learn that you increase the leftmost digit by one only after ten increases in the digit to its right (if there is one). As a result of this knowledge, which is necessary for the most elementary grasp of the numeral system, children can realise that the leftmost digit is the most important when it comes to the designated value. It is the one that carries the most weight when comparing two numbers, etc. When the syntactic structure is linked to an interpretation of numerals as designating cardinal values, children need to learn that the leftmost digit therefore designates a higher value (tens, hundreds, etc.) than digits to its right. Here a syntactic feature of the numeral system is linked to the semantic interpretation. Because of the place-value system the leftmost digit always indicates the highest value. Knowing how the place-value system works therefore enables children to know that the leftmost digit always indicates the highest value of all the digits that are present.
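The idea that the leftmost digit carries the most weight when comparing two numbers can be made concrete with a purely syntactic comparison. The following Python sketch is an illustration only: the function name `compare_numerals` is hypothetical, and it assumes written base-10 numerals without leading zeros. It is not offered as a model of what children actually do.

```python
def compare_numerals(a: str, b: str) -> int:
    """Return -1, 0 or 1 as numeral a designates a smaller, equal or larger value than b."""
    # A numeral with more digits has a leftmost digit in a higher place.
    if len(a) != len(b):
        return -1 if len(a) < len(b) else 1
    # With equal length, the first differing digit from the left decides.
    for digit_a, digit_b in zip(a, b):
        if digit_a != digit_b:
            return -1 if digit_a < digit_b else 1
    return 0


print(compare_numerals("600", "599"))   # 1
print(compare_numerals("1000", "999"))  # 1
print(compare_numerals("84", "84"))     # 0
```

The comparison never computes a cardinal value: it only uses the length of the numeral and the order of the single digits, which is the kind of knowledge the paragraph above attributes to a grasp of the place-value system.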

Once children know enough about the structure of the numeral system they can realise that the above method is guaranteed to provide a larger number. They already know that the leftmost digit always indicates the highest value. Consequently they can know that increasing that value will produce a larger number. And increasing a digit is quite simple as they mainly need to know that \(2 > 1\), \(3 > 2\), up to 9. Adding a digit to the left can be seen to guarantee a larger number on the same principle: the leftmost digit always indicates the highest value, so placing a 1 to the left will yield a higher number. The reasoning can therefore happen in relation to something that is already general: the structure of the numeral system. There is no need to imagine specific numerals, because we can start from a more general principle. Therefore there is also no need, as in Jeshion’s first thought experiment, to imagine a numeral without representing its internal structure.

One last thing to discuss before moving to the question whether concepts consistent with \(\forall x \mathbb {N}x \rightarrow \exists y(\mathbb {N}y \wedge y > x)\) are also consistent with the successor axiom is whether the current hypothesis also fits with the moment in time when children seem to realise that the numbers go on forever. For the hypothesis to be viable it shouldn’t be the case, for example, that children seem to have number concepts consistent with the successor axiom years before they are aware of the structure of the numeral system. In fact, were children to learn that we can always form a larger number on the basis of their knowledge of the structure of the numeral system, then the moment when they realise that you can keep on forming larger numbers should be correlated with their knowledge of the structure of the numeral system. The work of Cheung et al. (2017) found that these two are indeed correlated. They looked at the relation between children’s ability to count up to 100, their ability to name the successor (in terms of adding one) of a number and whether or not they realise that we can always form a larger number. They found that most children only start to answer that you can always find a larger number and that there is no largest number when they can count at least to 100 and when they perform near-perfectly on the task of naming the successor of a given number. In other words, there is a strong correlation between knowing that it is always possible to form a larger number and counting ability. There is also a strong correlation between this knowledge and being able to name the successor of a given number, which I interpreted above as being able to interpret the numerals as specific cardinal values. Cheung et al. (2017) only report that these correlations hold with the combination: knowing that you can keep adding one (which seems to have been interpreted by some as asking if you can keep counting) and knowing that there is no largest number.

Further research will have to show what the exact correlations are between counting ability and knowledge that you can always keep counting. For my current purposes the study by Cheung et al. (2017) shows that there is no mismatch between the acquisition of concepts consistent with the successor axiom and knowledge of the structure of the numeral system. More practice with counting, in particular with not making mistakes with the peculiarities of the place-value system, indicates a better grasp of the structure of the numeral system. Learning this structure of the numeral system, i.e. the place-value system, is moreover quite difficult and children only succeed at doing so relatively late (Fuson 1990; Fuson and Briars 1990). For the hypothesis that children learn that you can always form a larger number on the basis of the structure of the numeral system to be viable there has to be evidence of such a correlation. The fact that this correlation was significant, but that there was no significant correlation with age, supports my argument that the hypothesis is a viable alternative. Furthermore, there has been some research into how well children can separate out the different digits. As it turns out, they only become good at doing so when they are able to count to roughly 100 (Fuson and Hall 1983; Siegler and Robinson 1982; Rule et al. 2015), which is exactly the point where Cheung et al. (2017) noticed that children start to realise that there is always a larger number. Knowledge that you can always form larger numbers correlates with knowledge of the structure of the numeral system. The strategy outlined above, or something similar dependent on place-values, is thus a viable alternative hypothesis for how children learn that the natural numbers go on forever. The argument for this claim will be extended in the next subsection, by showing that the hypothesis can also account for the developmental stages that have been observed. There is just one technical detail to establish first, namely that the strategy really gets us to the successor axiom.

I have been describing a strategy for acquiring concepts consistent with \(\forall x \mathbb {N}x \rightarrow \exists y(\mathbb {N}y \wedge y > x)\). Naturally, it is required that this is also formally sufficient for consistency with \(\forall x (\mathbb {N}x \rightarrow \exists y\, xPy)\). Fortunately, \(\forall x \mathbb {N}x \rightarrow \exists y(\mathbb {N}y \wedge y > x)\) implies the standard formulation of the successor axiom under the standard definition of \(y > x\). This is usually defined as \(\exists z (x + z = y)\). Addition, in turn, is defined in terms of the successor (or predecessor) relation (footnote 3). So, if there is a natural number larger than x then this means that for some z there are z numbers between x and that larger number. Therefore, there has to be a y such that xPy (that y is the successor of x). In the other direction \(\forall x (\mathbb {N}x \rightarrow \exists y\, xPy)\) also implies that \(\forall x \mathbb {N}x \rightarrow \exists y(\mathbb {N}y \wedge y > x)\), because the y that is supposed to be larger than x can simply be chosen to be the successor of x that is guaranteed to exist by the successor axiom. Therefore, the two are logically equivalent and in terms of the structure of one’s number concepts it doesn’t matter if they are consistent with \(\forall x \mathbb {N}x \rightarrow \exists y(\mathbb {N}y \wedge y > x)\) or with \(\forall x (\mathbb {N}x \rightarrow \exists y\, xPy)\). Consistency with either one of these has the same consequences for what our natural number concepts have to be like.

4.3 The different developmental stages

Another finding that Cheung et al. (2017) report is that they have found four different stages of knowledge that children can be in. This corroborates earlier findings regarding stages of knowledge by Evans (1983), Gelman (1980), Hartnett (1991) and Hartnett and Gelman (1998). First, they may not realise that there is no largest number, nor that it is possible to always add 1 (mean age 5.2 years). Second, they may know that there is no largest number but not realise that this implies that you can always add 1 to a number (mean age 5.5 years). Third, children may not know that there is no largest number, even though they think that you can always add 1 (mean age 5.3 years). Fourth, they may know both that there is no largest number and that you can always add 1 to a number (mean age 6.2 years). When, and in what order, children go through these stages is mostly unclear, since there was a range for all of these categories from about 4 to 7 years. This seems to fit with Falk (2010), who found that at 6–7 about 50% manage to explain why going second in their game gives a winning strategy, which increases to 80% at 10–11. This supports the relevance of the data from Falk (2010) for showing that children are in a position to follow my alternative strategy, since the younger children who were interviewed were thus still in the process of learning about the infinity of the natural numbers. However, since we see that there are these stages, a hypothesis about how children acquire number concepts that are consistent with the successor axiom should be able to account for the existence of the different stages. The last part of my argument that this hypothesis is a viable alternative is therefore to show that these stages can be interpreted in terms of aspects of the structure of the numeral system, as those were highlighted in Sect. 2.

Particularly relevant is to see how it is that children may end up in the intermediate stages (two and three), as these help to indicate distinctions we should make when thinking about children’s mathematical abilities (and should thus be reflected by an account of the development of those abilities). The second stage was the least common with only 6 children out of 100 (and wasn’t found at all in earlier studies), but is also the more puzzling of the two. If children know that there is no largest number, then how can they deny that you can always add one? I can see at least two possible interpretations. One is that the children in question interpreted the question if you can keep adding one as a question whether they could personally do this. In that case, the answer may well be no, even though they realise that there is no largest number. Perhaps they don’t feel comfortable enough with large numbers (since they’re explicitly asked to name the highest number they know) to say that they can keep on adding one. Yet this still sits uneasily (but no more than that—it may just be that the later answer indicates general confusion) with the claim that the number they name is not the biggest that could ever be:

E: Can you think of a bigger number?

C: A million.

E: Is that the biggest number there could ever be?

C: No.

E: Can you think of a bigger number?

C: I don’t know. (Cheung et al. 2017, p. 33)

The other interpretation that I want to put forward is based instead on the distinction I made earlier between grasping the syntactic structure of the numeral system and interpreting the numerals as designating cardinal values. If the child (the one in question was 5 years, 3 months) hasn’t made the connection between these two, then we may explain the above behaviour. A claim about there not being a largest number may be related to cardinal values, while the question about counting relates to the syntactic structure of the numeral system. When that syntactic structure is grasped it is easy for children to form larger numbers, as the earlier interviews should show. This child hasn’t grasped the syntactic structure of the numeral system and therefore doesn’t know how to keep on counting, even though the child does know (e.g. on the basis of explicit instruction from his or her parents) that there are ever larger cardinal values. One indication for that latter claim is that the response to ‘why do they go on forever?’ was “Because God made them” (Cheung et al. 2017, p. 33). This doesn’t clash with my account that children learn about the infinity of the natural numbers via the syntactic structure of the numeral system, since on this explanation their answer that there is no largest number is a rehearsal of what the parents told the child. Proponents of the plus-one strategy will have to say something similar, since children deny that you can keep adding one, and so they presumably haven’t learned about the infinity of the natural numbers along the lines of the plus-one strategy either. So, on either interpretation of the data this stage can be accounted for by my alternative account.

The third stage, where children claim that the numbers don’t go on forever but that you can keep counting, can be analysed in the same way. The one interview that Cheung et al. (2017) report is from a child who says that the largest number is 2083 and that while you can keep counting, “C: Numbers do not go on forever because if you keep on counting, it takes you back to 0” (Cheung et al. 2017, p. 34). In this case the child has simply misunderstood the syntactic structure of the numeral system. Yet it also means that there is something missing in the interpretation of numerals as designating cardinal values. Since the child is happy to affirm that you can keep counting, i.e. that you can keep adding one, it can’t be that he or she fully realises that this increases the cardinal value from 2083 to something higher. Here an abnormal count sequence is maintained, but that can only be possible (as a consistent practice) if the count sequence and the procedure of adding one aren’t also viewed as changing the designated cardinal value.

The data we currently have can be interpreted as support for the hypothesis that a child has correctly grasped the syntactic structure of the numeral system without interpreting numerals as designating specific cardinal values, especially since there is further data in support of the claim that (syntactic) knowledge of the numeral list precedes knowledge of the progression of cardinal values (Fuson 1988). On this interpretation children maintain that you can keep counting forever, because syntactically that is possible. However, they also maintain that there is a largest number, because number is viewed as cardinal value (not to claim that this is something a child would say, but for example: the number of atoms in the universe is the largest number). As long as these two elements are not combined, children end up with at most partial knowledge regarding the successor axiom. This means that the current hypothesis can account for all the developmental stages that were identified by Cheung et al. (2017), Evans (1983), Gelman (1980), Hartnett (1991) and Hartnett and Gelman (1998) in terms of aspects of the structure of the numeral system. In fact, Cheung et al. (2017, p. 32) also briefly suggest that a viable interpretation of the data is that the structure of the numeral system is instrumental in learning that there are infinitely many natural numbers (though via the plus-one strategy, so without my second claim).

Even so, there is one last argument that needs to be discussed. The fact that stage 2 is very uncommon and stage 3 common can be seen as an argument in favour of the plus-one strategy. The argument would be that my alternative account cannot explain why it is strange for children not to realise that you can keep adding one, even though they know that there is no largest number. After all, I do not claim that they learn this by reflecting on the procedure of adding one and so they could just ignore this procedure. My account would have a harder time explaining why stage 2 is uncommon, and so the plus-one strategy better fits the data. My interpretation of the data offers an answer to this argument. I suggest that children in this stage generally don’t grasp the syntactic structure of the numeral system. This means that they also cannot arrive at the conclusion that there is no largest number via my alternative account. The explanations for how we still get this data, i.e. the two interpretations I gave above, will be basically the same for my account and for the plus-one strategy. So, on either account stage 2 is expected to be uncommon, and on either it presents some difficulties to explain how it’s possible that children give these answers. In short, this fact doesn’t decide between the two accounts and my alternative account is empirically viable.

5 Conclusion

Current philosophical accounts explain our knowledge of the successor axiom (directly or indirectly) on the basis of the procedure of adding 1. I have put forward an alternative hypothesis that explains this knowledge on the basis of the structure of the numeral system. Yet much is still unclear. Children eventually reason along the lines of the adding one strategy and we don’t yet know how they get there from the earlier stage I have focussed on. Even for that early stage there is too little data to claim with any confidence that children only rely on the structure of the numeral system, or that they then focus on digits with a high place-value. As I have noted in passing, it might well be that children rely on the adding one procedure for numbers below 100 and on the structure of the numeral system for numbers above 100. A lot of data is still needed (not only for numbers below 100, but also for syntactically more complex numbers, such as 2001) and for that reason neither of the accounts can currently be established as the correct one. Instead, I have argued that my alternative hypothesis is currently viable and therefore merits attention.

My argument for the viability of my alternative hypothesis consisted of several parts. First, in Sect. 3, I argued that the response patterns seen in a game played by children indicate that they are capable of forming larger numbers in accordance with my suggested alternative. In relation to the same experiments I noted that their reasoning, when queried whether one can keep forming larger numbers, also displays this kind of reasoning. Even so, this data does not undermine the plus-one strategy, since there are other reasons (such as syntactic simplicity) why they might have preferred these kinds of responses.

Second, in Sect. 4, I argued that the alternative hypothesis accounts for the consistency of children’s concepts with the successor axiom in such a way that it is in line with what we know about the development of this knowledge. The moment at which children display behaviour that is consistent with the successor axiom is correlated with the moment at which they have a fairly robust knowledge of the structure of the numeral system. Furthermore, the different developmental stages identified by Cheung et al. (2017) and others can be interpreted with the help of the different aspects of the structure of the numeral system, as put forward in Sect. 2. Finally, I also argued that the alternative hypothesis is a formally viable way of acquiring number concepts consistent with the successor axiom. All in all, I have argued that there are in fact two viable accounts that explain how we learn that there are infinitely many natural numbers. The plus-one strategy is one option that fits the data. Another option that fits the data is that we use the syntactic structure of the numeral system, paying special attention to the leftmost digits.