Abstract
Most accounts of our knowledge of the successor axiom claim that this knowledge is based on the procedure of adding one. While they usually don’t claim to provide an account of how children actually acquire this knowledge, one may well think that this is how they acquire it. I argue that when we look at children’s responses in interviews, the time when they learn the successor axiom, and the intermediate learning stages they find themselves in, there is an empirically viable alternative: they could also learn it on the basis of a method that exploits the structure of the numeral system. Specifically, they (1) use the syntactic structure of the numeral system and (2) attend to the leftmost digit, the one with the highest place-value. Children can learn that this is a reliable method of forming larger numbers by combining two elements. The first is a grasp of the syntactic structure of the numeral system, through which they know that the leftmost digit receives the highest value. The second is an interpretation of numerals as designating cardinal values, so that they also realise that increasing or adding digits on the left-hand side of a numeral produces a larger number. There are thus two, currently equally well-supported, ways in which children might learn that there are infinitely many natural numbers.
Keywords
Successor axiom, Epistemology, Number concepts, Arithmetical cognition
1 Introduction
My aim is to offer an account of children’s knowledge that is based on the structure of the numeral system and not on the procedure of adding one. I argue that such an account is a viable alternative to the existing (plus-one) account, even though it is not yet possible to decide between the two. First, I lay some of the groundwork for my own position by discussing what we know about our grasp of numbers that are represented by more than one digit (e.g. 84). That provides a basis of information about the mechanisms that are used to understand the structure of our numeral system, which I claim can explain how we acquire concepts that are consistent with the successor axiom. The work in Sect. 2 thus prepares the different elements for this account by discussing the structural aspects of the numeral system and how children learn these aspects. It is particularly relevant for Sect. 4, where the aspects of the numeral system identified in Sect. 2 are used to interpret parts of the empirical data.
Before working out my own account I briefly discuss the kind of account that has been prevalent in the philosophical literature. This is the alternative explanation of how we acquire concepts consistent with the successor axiom. Philosophers such as Parsons (2007) have claimed that we can acquire knowledge on the basis of the successor relation. I do not criticise the possibility of such an approach, but do claim that this is not the only currently viable hypothesis. The alternative is my own account based on the syntactic structure of the numeral system. This account consists of two claims: (1) the syntactic numeral system supports children’s learning that the numbers never end and (2) children use digits with a high place-value (further to the left) to form continually larger numbers and thus figure out that one can always arrive at a larger number. The argument that this alternative account is viable is distributed over Sects. 3 and 4. In Sect. 3.2 I discuss some of the empirical data that shows that children are able to form larger numbers by attending to the leftmost digits of numbers, thus exploiting the structure of our numeral system. Section 4 then presents my alternative account in detail and argues that it fits the data from Sect. 3.2 as well as additional data identifying the stages in which children seem to learn that there is always a larger number. Yet, although it fits the data, there are not enough studies to argue that my alternative account is correct, or to determine its relation to the traditional accounts. It is a viable alternative, but exactly what place it should get (e.g. whether it is merely a first step children go through or captures the entire learning process) is still unclear.
2 Working with multidigit numerals
2.1 Processing of multidigit numerals
One may already suspect that there is a cognitive difference between small and large numbers on the basis of simple introspection. It is simple to imagine four apples, but to have a clear image of 83 apples is nowhere near as simple. We do not learn about these ‘large’ numbers on the basis of counting, as we do in the case of four (e.g. Sarnecka and Lee 2009). Of course this does not imply that there are significant differences in the way our brain handles these numbers. It turns out, however, that we also process single-digit and multidigit numbers differently. There is more and more evidence that our brain processes multidigit numbers on a digit-by-digit basis, which is usually referred to as compositional processing (Nuerk et al. 2015). The alternative, namely processing the number as a whole, is commonly referred to as holistic processing. There is also a hybrid option, but that is usually not preferred over the other two for reasons of parsimony.
Before going into the consequences of the high likelihood that our brains process multidigit numbers compositionally, it may help to mention some of the reasons (there are more in the article cited above) why this is considered to be the best explanation of the data. One of the main reasons is based on response times to questions of the form ‘which of these two numbers is larger?’ One well-known phenomenon, called the distance effect, is that response times go down if the numerical difference between the two numbers increases (Moyer and Landauer 1967). What is curious is that for multidigit numbers there is not only a distance effect, but also a unit-decade compatibility effect: response times are faster if the unit and decade digits of the larger number are individually larger than the corresponding unit and decade digits of the smaller number. So, response times to \(67 > 52\) are shorter than response times to \(62 > 47\), because for the former \(6 > 5\) and \(7 > 2\), but for the latter we have \(6 > 4\) and \(2 < 7\) (Nuerk et al. 2001, 2002; Mann et al. 2012). This is an effect that can be explained only if processing of multidigit numbers is compositional or hybrid, where compositional processing is the preferred hypothesis because hybrid processing doesn’t explain any features better than compositional processing (Moeller et al. 2011).
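The compatibility criterion just described can be stated as a small check. The following Python sketch is only an illustration of the definition given above; the function name `unit_decade_compatible` is my own, and treating pairs with equal digits as not compatible is an assumption, since the text only discusses cases where the digits differ.

```python
def unit_decade_compatible(larger: int, smaller: int) -> bool:
    """Return True if a two-digit comparison is unit-decade compatible.

    A pair counts as compatible when both the decade digit and the unit
    digit of the larger number exceed the corresponding digits of the
    smaller number (equal digits are treated as not compatible, which is
    an assumption of this sketch).
    """
    if not (10 <= smaller < larger <= 99):
        raise ValueError("expected two-digit numbers with larger > smaller")
    larger_decade, larger_unit = divmod(larger, 10)
    smaller_decade, smaller_unit = divmod(smaller, 10)
    return larger_decade > smaller_decade and larger_unit > smaller_unit


# Both comparisons from the text have the same numerical distance (15),
# so the distance effect alone would not separate them.
print(unit_decade_compatible(67, 52))  # 6 > 5 and 7 > 2
print(unit_decade_compatible(62, 47))  # 6 > 4 but 2 < 7
```

Because the two example pairs are matched for distance, any difference in response times between them has to come from the individual digits, which is why the effect is taken to favour compositional (or hybrid) processing.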
Another important reason for thinking that multidigit numbers are processed compositionally is known as the serial-order effect. This effect is found when participants are first presented with two numbers and are subsequently asked if a third number was among those two initial numbers. It turns out that if there is a simple arithmetic relationship between the digits of the three numbers (e.g. 23, 45, 67) then it takes longer for participants to reject the third number as not being among the initial two, as compared to when this is not the case (e.g. 23, 45, 65). Such an effect is best explained by compositional processing, as in that case there is attention to the digits on their own and not only to the two-digit numbers as a whole (García-Orza and Damas 2011).
A final piece of data that may be worth mentioning is that there is a significant influence of the structure of number words in one’s native language on how easy it is to learn the Indo-Arabic numerals. One language-related effect is the inversion effect, where children make more errors in writing down the corresponding Indo-Arabic numeral if their native language reverses the unit and decade digit (e.g. ‘fünfundvierzig’, five-and-forty, for 45; Klein et al. 2013). On the other hand, such errors have not been observed for languages that do not have this kind of inversion, such as French (Barrouillet et al. 2004; Camos 2008). This difference between languages also leads to an increased unit-decade compatibility effect for languages with inversion, because inversion makes it more difficult to determine which digits have a higher value (Nuerk et al. 2005; Pixner et al. 2011). Language inversion even makes it more difficult to solve addition problems that involve carries, because again one needs to know which digits are the unit digits and which the decade digits in order to be able to correctly carry the sum of the units (Moeller et al. 2011a; Göbel et al. 2014).
2.2 Our grasp of multidigit numbers
What the psychological data shows is that the structure of the numeral system is an incredibly important component of our grasp of multidigit numbers. Because these numbers are processed compositionally it is essential to be aware of the fact that Indo-Arabic numerals use a place-value system with base 10. In other words, it is not possible for us to parse an Indo-Arabic numeral as a whole without paying attention to the peculiarities of the numeral system. Consequently, processing would be different if we were to use another numeral system, such as the Roman numeral system (Lengnink and Schlimm 2010). For in the case of the Roman numeral system the specific place in which a digit is located has little to do with its value, especially in the original numeral system, which was purely additive (and therefore wrote four as IIII instead of as IV).

The syntactic structure of the numeral system, which is required for producing a correct count list

The knowledge that these numerals indicate cardinal values, so that in applying them we need to use a different numeral if an object is added to the relevant collection of items

Linking the syntactic structure of the numeral system to the interpretation of numerals as cardinal values
3 The successor axiom via adding one
3.1 Current philosophical accounts
Usually philosophical accounts that try to explain specifically how we come to know that there are infinitely many natural numbers appeal to the procedure of adding one. This may be indirect, in terms of the proof of the successor axiom when the Dedekind–Peano axioms are embedded in Frege arithmetic (Linnebo 2004). The suggestion in that case is that one might learn about the infinity of the natural numbers through the construction of the number sequence. The Fregean proof of the successor axiom is based on defining the successor number of n as the number of numbers that are smaller than or equal to n. Since the numbers start with 0, this new number is exactly equal to \(n+1\). Naturally, no one expects children to acquire concepts consistent with the successor axiom by following this proof (and it is indirect for that reason; no learning procedure is suggested and it is only shown that there are infinitely many numbers in Frege arithmetic). Still, the general strategy of going from n to \(n+1\) is visible here.
It may also be direct (so by giving an explicit learning procedure, even though typically not one that has children in mind), such as in the account of Parsons (2007), further expanded upon by Jeshion (2014). Their account describes a situation where one acquires knowledge of the Dedekind–Peano axioms for a particular kind of numeral system: the stroke language. This is a tally system, where zero is encoded by a single stroke | and the successor function is the function that appends a stroke | to a string. On the basis of this stroke language, Jeshion gives two ways in which one may acquire knowledge of the successor axiom, both based on Parsons’ less elaborate account that we learn this by figuring out that we can always extend a string of strokes by appending one more stroke. Note moreover that, while these are ways to acquire knowledge of the successor axiom, the suggestion is not that we literally learn the successor axiom. They are rather suggestions for a learning process with the outcome that we realise that there are infinitely many numbers, i.e. a process at the end of which our number concepts are (merely) consistent with the successor axiom.
First, it is possible for someone to imagine a string of strokes, to which one then appends another stroke in order to realise that for every stroke string there is a successor that is exactly one stroke longer. In order for that general conclusion to be reached, it is important that the initial stroke string is imagined in a very particular way. Namely, while one has to imagine a particular stroke string (to be sure that one is representing a stroke string), the internal structure should not be represented, in order to have the required generality. So, one is supposed to imagine something of the form \(| \ldots |\) to which one can then imagine appending a stroke, giving \(| \ldots ||\). Since the appending of the stroke didn’t depend on the representation of the internal structure of the string, it is possible to conclude that the successor axiom, in all its generality, is true of the stroke language (Jeshion 2014, p. 338).
Second, instead of building the generalisation into the thought experiment, one may take that step afterwards. So, one starts by imagining a particular stroke string, along with its internal structure. Then one imagines appending a stroke to that particular string. In order to arrive at knowledge of the successor axiom, one then has to realise that the ability to append another stroke is completely independent of the number of strokes in the string to which it is appended. Consequently it is possible to arrive at a successor string for any string (Jeshion 2014, p. 338).
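The point shared by both thought experiments, that appending a stroke never depends on how long the string already is, can be made concrete in a short Python sketch. It is only an illustration: the names `ZERO`, `successor` and `numeral_for` are my own, and writing the stroke as the character `|` is an assumption about notation.

```python
ZERO = "|"  # assumption: the stroke is written "|", and zero is a single stroke


def successor(stroke_string: str) -> str:
    """Append one stroke to a stroke string.

    The rule checks only that the input is a stroke string; it never
    inspects how many strokes the string contains.
    """
    if not stroke_string or set(stroke_string) != {"|"}:
        raise ValueError("not a stroke string")
    return stroke_string + "|"


def numeral_for(n: int) -> str:
    """Build the stroke numeral for n by applying successor n times to zero."""
    numeral = ZERO
    for _ in range(n):
        numeral = successor(numeral)
    return numeral


three = numeral_for(3)
print(three, successor(three))
```

Nothing in `successor` branches on the length of its argument, which mirrors the step in the second thought experiment: once one sees that the operation is indifferent to the particular string, the generalisation to every string follows.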
From the psychology side there is little in the way of explicit accounts as to how one learns that for every number there is another, larger, number. There is, however, an implicit suggestion in the way in which the final learning stage, becoming a cardinal-principle-knower after having been a subset-knower, is characterised (Carey 2009). A subset-knower is a child who knows the meaning of the number words up to three or four. Children are counted as knowing the meaning of a number word if they succeed at the give-N task: the task where they are asked to give the experimenter N items. If they make no more than two mistakes when prompted to give three items, then they are counted as knowing the meaning of the word three. It turns out that children learn the meaning of the first few number words in order, so first one, then two, etc. (Wynn 1990, 1992). When they get past four they also typically learn the Cardinal Principle and are counted as cardinal-principle-knowers (CP-knowers). Sarnecka and Carey explain the principle as follows: “Only cardinal-principle-knowers understand that adding exactly 1 object to a set means moving forward exactly 1 word in the list, whereas subset-knowers do not understand the unit of change” (Sarnecka and Carey 2008, p. 662). Knowing this, which is the Cardinal Principle, is what distinguishes a CP-knower from a subset-knower; it is just that children typically learn the Cardinal Principle after learning the meaning of four.
The suggestion that the plus-one strategy is also endorsed here is found in another description of CP-knowers: “it seems that CP-knowers (but not subset-knowers) have begun to connect the counting list and counting routine to the idea of a successor – that each number has a next number, exactly one more, which is named by the next word in the list” (Sarnecka 2015, p. 12). Given this attribution of knowledge to children [it is still somewhat contested in the literature whether children actually know all of this at the stage discussed by Sarnecka and Carey (Davidson et al. 2012)] there is a clear way of figuring out that there is always a larger number. The strategy outlined by Parsons and Jeshion in fact fits rather well. Either one imagines adding an object to a collection without its internal structure, which would force one to move to the next numeral (possibly then not in the count list), or one imagines a particular collection to which an object is added, after which one can realise that the possibility of adding an object is completely independent from the number of objects already in the collection.
The relevant commonality between these different philosophical positions is that they all work along the same line: we acquire number concepts that are consistent with the successor axiom on the basis of the successor function. In other words, we can learn that there are infinitely many numbers through the procedure of adding one. In the rest of this paper I argue that there is a viable alternative explanation: that we learn that there are infinitely many numbers on the basis of the structure of the numeral system.
3.2 Children’s behaviour when constructing larger numbers
If children acquire number concepts consistent with the successor axiom in virtue of a realisation regarding the successor function, one might expect them to appeal to the process of adding one when asked questions related to the successor axiom. Both of the above strategies from Jeshion involve constructing a longer string (a larger number) by direct application of the successor function. One may thus expect that on the plus-one account children are predicted to form larger numbers by adding one, at least in typical tasks. Younger children, at least, don’t form larger numbers in this way. As I explain a little later, this is not a good argument against the plus-one strategy, but it does provide an argument in favour of the viability of my alternative presented in Sect. 4. This data helps establish that children can reason on the basis of the structure of the numeral system to form larger numbers.
Researchers have used responses to a game as an important tool to test whether children realise that for every natural number there is a larger natural number (in the studies described here children were between 6 and 15 years old). The game itself is very simple and consists of two moves: first the starting player names a number and then the second player names a number. The person that names the higher number wins, which means that the second player can always win. The game is then used to test whether children grasp that there is always a larger number by asking the child before the game whether she wants to be first or second. Specifically, children were presented with the following query: “Lets play a game. Each of us will say a number, the one whose number is greater wins. Would you like to be first or second in choosing a number?” (Falk 2010, p. 8) After answering, the child was asked to explain his or her choice. Regardless of the provided answer both players then gave a number and a winner was determined, a procedure that was repeated no more than four times (primarily to make sure that the children understood the game). Having ended the game, the experimenter suggested taking turns with the child at naming larger and larger numbers. This part was ended either when the child failed to name a higher number or after it had gone well for a period of time, leading to the final part of the experiment, which consisted of an open-ended interview about whether or not the sequence could be continued indefinitely. In these studies children who choose to go second are generally considered to display knowledge that you can always find a larger number, i.e. that there is no largest number. This was further verified in the interviews by explicit questioning.
G9 [in the increasing sequence]: (500) 600 (20,000) 30,000 (million) 2 millions (billion). What is that? (A very large number) Aha! [ironically] 2 billions (trillion) 2 trillions (googol) 2 googols. (Will the numbers end up for us?) No, because each time you make up a number, I can add to it. (Did you recognize these numbers?) No. (How did you manage to find bigger numbers?) Because you said. (Falk 2010, p. 35)
G10 [in the continuing sequence of Game 1]: (120) 900 (1,000) 1,200 (2,000) 3,000 (million) million and a hundred (2 millions) 10 millions (billion). What is that? (A very large number). I don’t know, I cannot go on because I don’t know any more names of numbers. (Falk 2010, p. 37)
B6 [Game 1]: (How long will the game go on?) Until no end; the numbers will never stop, because if there is, say, a milliard, one can add to it another and another milliard until infinity (Falk 2010, p. 35)
Not only are there examples here of the pattern in which children choose to give higher numbers. The last example, from the reasoning related to the increasing sequence, also shows that the underlying reasoning here is not focussed on adding one. When asked in an ordinary interview if children can find larger numbers they tend to give responses along the same lines (with I the interviewer and A an 8 year old child):
I: You wouldnt stop? How is that? I claim that I can say a very big number and you will stop there!
A: Which?
I: Well...27 quadrillion 842 trillion 520.
A: There is one bigger!
I: Which, which one?
A: 100 quadrillions of quadrillions of quadrillions. (Singer and Voica 2008, p. 195)
It seems that children only start to reference the procedure of adding one to obtain a larger number in these tasks later on. For example, one fifth grader (so 10–11 year old) says: “There are infinitely many natural numbers because if I pretend that I found the biggest, I can add 1 and I get a bigger one” (Singer and Voica 2008, p. 196). Around that age the explanations given with regards to the game in Falk’s study also start to change:
G10: I want to be second because so I’ll hear what you say and I’ll say something bigger ...I think one cannot ever finish that game because each time you can add one and another one, and there is no end to it. (Falk 2010, p. 35)
G13 [Game 1]: There is always a larger number. Not always all of them have names. There are numbers that were not given a name because they are so big, but you can always add one to them. (Falk 2010, p. 36)
G10 [Game 1]: Each number that you say I can add one to it. One cannot finish the game. (Falk 2010, p. 36)
Yet even here there is the occasional mention of an alternative strategy:
G11: No, the game will not end. Perhaps in words there is an end, but when you write a number, you can always write another zero and another zero, so that it will not end for the life of me. (Falk 2010, p. 35)
The pattern of responses and the bits of reasoning that have been recorded show that children do not always rely on the procedure of adding one to form higher numbers, or come to the conclusion that one can continue a sequence of numbers indefinitely. Although children do eventually adopt the more uniform strategy of forming a larger number by adding one, there is a period of time before that during which they reach the same conclusions, but without the reasoning associated with the plus-one strategy. Children do not start by responding in accordance with the pattern outlined by the current philosophical theories, even if they do conform to that pattern of responses a few years later.
One reason why children may give these response patterns is that they are syntactically simpler. It is syntactically simpler to respond to ‘five hundred’ with ‘six hundred’ than with ‘five hundred and one’. Similarly, the recorded response to ‘one hundred and twenty’ (‘nine hundred’) is syntactically much simpler than the response obtained by adding one (‘one hundred and twenty-one’). That might be why children give these responses rather than responses in line with adding one. For that reason, the response pattern should not be taken as evidence that children do not reason along the lines of the plus-one strategy. It may well be that they understand everything needed for the plus-one strategy and reason in that way, but choose their responses differently for reasons of simplicity. Nevertheless, the fact that they can provide responses in this way does show that the strategy I will work out in the next section is available to them. Since they prefer answering by increasing the leftmost digit, they are aware of the fact that doing so produces a larger number.
The response patterns that we find are therefore not a good argument against the plus-one strategy. The explicit motivations given by children when asked why the game never stops are, admittedly, somewhat in conflict with the plus-one strategy, if one thinks that these motivations display children’s reasoning about infinity. When young children, as cited above, say that the game never ends because even to a milliard (a dated, but acceptable, term for a billion) one can add another milliard, then this reasoning doesn’t fit the plus-one strategy. But here too there may be reasons for this that are different from their not using the plus-one strategy to acquire number concepts consistent with the successor axiom. In short, we don’t have any argument that shows that the plus-one strategy is not the one used by children.
These response patterns should, moreover, not be confused with the fact that children display some understanding of the successor function before they figure out that there are infinitely many natural numbers. Davidson et al. (2012) and Cheung et al. (2017) found that a little before learning that the numbers go on forever children manage to consistently name the next numeral, also for complex numerals (i.e. up to 100). That finding is consistent with results presented by Fuson and Hall (1983) that children learn the syntax of these numerals quite late. It is tempting to interpret this as evidence in favour of the plus-one strategy, but caution is in order. What these findings show is that children acquire a good grasp of the syntax of complex numerals right before they figure out that there are infinitely many numbers.
Knowing the next numeral in the list, and that you arrive at the next numeral when one item is added to a collection, is not sufficient for having concepts consistent with the successor axiom through the plus-one strategy. (The latter piece of knowledge is the Cardinal Principle, which children supposedly learn before understanding the syntax of complex numerals (Sarnecka and Lee 2009), though it is not clear that they immediately apply this principle to the more complex numerals (Davidson et al. 2012).) It could also lead to their figuring out the structure of the numeral system and following the strategy I present in the next section, since this ability can be interpreted as displaying primarily syntactic knowledge about the numerals. In line with such a hypothesis Le Corre (2014) argues that children learn the ordering of slightly more complex numbers (numbers up to 10 were tested) before learning their cardinal meaning, even though they do not yet know the ordering for higher numbers (e.g. that 8 < 10) when they become CP-knowers. This dissociation between syntax learning and the learning of cardinal meaning further supports the idea that the mere understanding of the structure of the numeral system does not automatically lead to an interpretation of this structure in terms of the plus-one operation. At the moment, then, these findings do not decide between the two accounts.
4 The successor axiom via the structure of the numeral system
4.1 Forming larger numbers
Even now, this gloss doesn’t cover all of the responses that have been given by children. A 10 year old girl responded to 120 with 900, in line with this strategy, but to 1000 with 1200 (probably uttered as ‘twelve hundred’) and to a million with a million and a hundred. At least the last response goes against the considerations of syntactic simplicity mentioned earlier, though notably the more complex response is not simply adding one. This deviating case also illustrates the fact that there is far too little data to say anything definite about the strategy being employed in answering these questions. There have been, as far as I know, no tests with numbers such as 2001 where one can see if children prefer 3001 or 2002, both syntactically equally simple answers. Nor have there been tests with numbers below 100, to see if children prefer answering to 79 with 80, 89 or something like 100. Since (more on this later) children only seem to display knowledge of the endlessness of the natural numbers when they can count to a hundred, it may well be that they follow a completely different strategy for numbers below a hundred. Yet, at the same time, what strategy they follow below 100 may not be that important: what matters for getting number concepts consistent with the successor axiom is that one has a strategy for forming ever larger numbers. If this strategy only kicks in after 100, because children already know all the numbers up to 100, then it might be that a dual account where children add 1 below 100 and go by the place-value of digits above 100 is a better fit. In short, there are still some significant gaps in our knowledge of these response patterns and the reasoning behind children’s conviction that one can always form a larger number. The current suggestion is an initial attempt at structuring the available data into a hypothesis, which then needs to be subjected to further testing.
What is interesting in this alternative hypothesis is that it offers another way in which one might come to know that there is always a larger number. That this is the case has to be argued for, and it is what I turn to next.
In order to know if this is a strategy that could be used to acquire number concepts that are consistent with the successor axiom, two things need to be established. First, that this strategy can be used to learn that \(\forall x \mathbb {N}x \rightarrow \exists y(\mathbb {N}y \wedge y > x)\). Second, that this also implies that one’s number concepts are consistent with the standard formulation of the successor axiom: \(\forall x (\mathbb {N}x \rightarrow \exists y\, xPy)\). The next subsection takes up these two elements and thus argues that the current hypothesis gives a viable alternative account, though it cannot yet be established if it is also the correct account.
4.2 From forming larger numbers to consistency with the successor axiom
Let me start with the question whether the strategy of increasing the value of the leftmost digit or adding a digit on the left-hand side may be used to learn that there is another, larger, number for every natural number. Because the numeral system is based on a place-value system this is a mathematically sound strategy for creating a larger number. Note that this hinges on the use of a place-value system: if we were to use the Roman numeral system that also includes subtraction then adding a digit to the left may produce a lower number (e.g. adding I to the left of V gives us IV, which is lower). Because in our numeral system adding a digit to the left is guaranteed to generate a larger number, this is an acceptable strategy. Similarly, increasing the value of the leftmost digit (whenever that digit is below 9) is always guaranteed to generate a larger number. So as a method of generating a larger number for any number presented it is one that, regardless of how one fills in the specifics, will always work. Therefore this method can be used to show that \(\forall x \mathbb {N}x \rightarrow \exists y(\mathbb {N}y \wedge y > x)\), because we have a way of generating a y for an arbitrary x.
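To make the strategy concrete, here is a minimal Python sketch of one way of filling in the specifics. It is an illustration under my own assumptions rather than a model of what children do: the function names `larger_numeral` and `roman_value` are hypothetical, and the choice to raise the leftmost digit by exactly one, and to prepend a 1 when that digit is already 9, is just one of many fillings-in that fit the description above.

```python
def larger_numeral(numeral: str) -> str:
    """Form a larger base-10 numeral by editing only its left-hand side."""
    if not (numeral.isascii() and numeral.isdigit()):
        raise ValueError("expected a base-10 numeral")
    if len(numeral) > 1 and numeral[0] == "0":
        raise ValueError("expected a numeral without leading zeros")
    leftmost = int(numeral[0])
    if leftmost < 9:
        # Increase the digit with the highest place-value.
        return str(leftmost + 1) + numeral[1:]
    # The leftmost digit cannot be increased, so add a digit on the left.
    return "1" + numeral


ROMAN_DIGITS = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_value(numeral: str) -> int:
    """Evaluate a Roman numeral using the subtractive convention."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = ROMAN_DIGITS[symbol]
        next_is_larger = (
            i + 1 < len(numeral) and ROMAN_DIGITS[numeral[i + 1]] > value
        )
        # A symbol placed before a larger one is subtracted, not added.
        total += -value if next_is_larger else value
    return total


print(larger_numeral("500"), larger_numeral("99"))
print(roman_value("V"), roman_value("I" + "V"))
```

The first function only ever touches the left-hand side of the numeral and still returns a larger number for every input, which is the guarantee the place-value system provides. The second function shows why the guarantee fails for subtractive Roman numerals: prefixing I to V yields a numeral with a lower value.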
How might we realise that this method of increasing a number does indeed produce a larger number? After all, a philosophical account cannot just stop at noting that a method is sufficient for producing larger numbers. It also has to provide an account of a way in which one can convince oneself of the fact that the method in question always produces a larger number. It should be mentioned that children seem to realise that this is a method that always produces larger numbers. In the interviews there are some indications that they think along the lines of the above strategy and think that that will always yield an even larger number. So, if this strategy fits their thinking then it is not just a procedure that they happen to follow without being consciously aware of the implications of this construction method for the endlessness of the natural numbers. Hence, there has to be some way that the children use to get to this knowledge from the availability of the above method for increasing numbers.
Here is one way in which this can happen. Children need to learn the syntactic structure of the numeral system (recall the discussion of this structure in Sect. 2), which means that they need to learn that the leftmost digit indicates a value ten times that of the digit that’s to the right of it (if there is one). Or, put more minimally, they need to learn that you increase the leftmost digit by one only after ten increases in the digit to its right (if there is one). As a result of this knowledge, which is necessary for the most elementary grasp of the numeral system, children can realise that the leftmost digit is the most important when it comes to the designated value. It is the one that carries the most weight when comparing two numbers, etc. When the syntactic structure is linked to an interpretation of numerals as designating cardinal values, children need to learn that the leftmost digit therefore designates a higher value (tens, hundreds, etc.) than digits to its right. Here a syntactic feature of the numeral system is linked to the semantic interpretation. Because of the place-value system the leftmost digit always indicates the highest value. Knowing how the place-value system works therefore enables children to know that the leftmost digit always indicates the highest value of all the digits that are present.
Once children know enough about the structure of the numeral system they can realise that the above method is guaranteed to provide a larger number. They already know that the leftmost digit always indicates the highest value. Consequently they can know that increasing that value will produce a larger number. And increasing a digit is quite simple as they mainly need to know that \(2 > 1\), \(3 > 2\), up to 9. Adding a digit to the left can be seen to guarantee a larger number on the same principle: the leftmost digit always indicates the highest value, so placing a 1 to the left will yield a higher number. The reasoning can therefore happen in relation to something that is already general: the structure of the numeral system. There is no need to imagine specific numerals, because we can start from a more general principle. Therefore there is also no need, as in Jeshion’s first thought experiment, to imagine a numeral without representing its internal structure.
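The guarantee can be written out as a short worked inequality. The notation is introduced here for illustration only and is an assumption, not taken from the sources cited: a numeral with digits \(d_k \ldots d_0\) and \(d_k \ge 1\) denotes \(n\), \(n'\) is the result of increasing the leftmost digit by one, and \(n''\) is the result of placing a 1 on the left.

```latex
\begin{align*}
n &= \sum_{i=0}^{k} d_i\,10^{i}, \qquad 0 \le d_i \le 9,\; d_k \ge 1,\\
\sum_{i=0}^{k-1} d_i\,10^{i} &\le \sum_{i=0}^{k-1} 9\cdot 10^{i} = 10^{k} - 1 < 10^{k},\\
n' &= (d_k + 1)\,10^{k} + \sum_{i=0}^{k-1} d_i\,10^{i} = n + 10^{k} > n \qquad (d_k \le 8),\\
n'' &= 1\cdot 10^{k+1} + n > n.
\end{align*}
```

The second line is the sense in which the leftmost digit carries the most weight: all the digits to its right together contribute less than one unit of the leftmost place, so nothing to the right can offset an increase on the left.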
One last thing to discuss before moving to the question whether concepts consistent with \(\forall x \mathbb {N}x \rightarrow \exists y(\mathbb {N}y \wedge y > x)\) are also consistent with the successor axiom is whether the current hypothesis also fits with the moment in time when children seem to realise that the numbers go on forever. For the hypothesis to be viable it shouldn’t be the case, for example, that children seem to have number concepts consistent with the successor axiom years before they are aware of the structure of the numeral system. In fact, were children to learn that we can always form a larger number on the basis of their knowledge of the structure of the numeral system, then the moment when they realise that you can keep on forming larger numbers should be correlated with their knowledge of the structure of the numeral system. The work of Cheung et al. (2017) found that these two are indeed correlated. They looked at the relation between children’s ability to count up to 100, their ability to name the successor (in terms of adding one) of a number and whether or not they realise that we can always form a larger number. They found that most children only start to answer that you can always find a larger number and that there is no largest number when they can count at least to 100 and when they perform near-perfect on the task of naming the successor of a given number. In other words, there is a strong correlation between knowing that it is always possible to form a larger number and counting ability. There is also a strong correlation between this knowledge and being able to name the successor of a given number, which I interpreted above as being able to interpret the numerals as specific cardinal values. Cheung et al. (2017) only report that these correlations hold with the combination: knowing that you can keep adding one (which seems to have been interpreted by some as asking if you can keep counting) and knowing that there is no largest number.
Further research will have to show what the exact correlations are between counting ability and knowledge that you can always keep counting. For my current purposes the study by Cheung et al. (2017) shows that there is no mismatch between the acquisition of concepts consistent with the successor axiom and knowledge of the structure of the numeral system. More practice with counting, in particular with not making mistakes with the peculiarities of the place-value system, indicates a better grasp of the structure of the numeral system. Learning this structure of the numeral system, i.e. the place-value system, is moreover quite difficult and children only succeed at doing so relatively late (Fuson 1990; Fuson and Briars 1990). For the hypothesis that children learn that you can always form a larger number on the basis of the structure of the numeral system to be viable there has to be evidence of such a correlation. The fact that this correlation was significant, but that there was no significant correlation with age, supports my argument that the hypothesis is a viable alternative. Furthermore, there has been some research into how well children can separate out the different digits. As it turns out, they only become good at doing so when they are able to count to roughly 100 (Fuson and Hall 1983; Siegler and Robinson 1982; Rule et al. 2015), which is exactly the point where Cheung et al. (2017) noticed that children start to realise that there is always a larger number. Knowledge that you can always form larger numbers correlates with knowledge of the structure of the numeral system. The strategy outlined above, or something similar dependent on place-values, is thus a viable alternative hypothesis for how children learn that the natural numbers go on forever. The argument for this claim will be extended in the next subsection, by showing that the hypothesis can also account for the developmental stages that have been observed. There is just one technical detail to establish first, namely that the strategy really gets us to the successor axiom.
I have been describing a strategy for acquiring concepts consistent with \(\forall x \mathbb {N}x \rightarrow \exists y(\mathbb {N}y \wedge y > x)\). Naturally, it is required that this is also formally sufficient for consistency with \(\forall x (\mathbb {N}x \rightarrow \exists y\, xPy)\). Fortunately, \(\forall x \mathbb {N}x \rightarrow \exists y(\mathbb {N}y \wedge y > x)\) implies the standard formulation of the successor axiom under the standard definition of \(y > x\). This is usually defined as \(\exists z (x + z = y)\). Addition, in turn, is defined in terms of the successor (or predecessor) relation.^{3} So, if there is a natural number larger than x then this means that for some z there are z numbers between x and that larger number. Therefore, there has to be a y such that xPy (that y is the successor of x). In the other direction \(\forall x (\mathbb {N}x \rightarrow \exists y\, xPy)\) also implies that \(\forall x \mathbb {N}x \rightarrow \exists y(\mathbb {N}y \wedge y > x)\), because the y that is supposed to be larger than x can simply be chosen to be the successor of x that is guaranteed to exist by the successor axiom. Therefore, the two are logically equivalent and in terms of the structure of one’s number concepts it doesn’t matter if they are consistent with \(\forall x \mathbb {N}x \rightarrow \exists y(\mathbb {N}y \wedge y > x)\) or with \(\forall x (\mathbb {N}x \rightarrow \exists y\, xPy)\). Consistency with either one of these has the same consequences for what our natural number concepts have to be like.
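The two directions can be laid out schematically. This is a sketch of the intended reasoning under stated assumptions, not a derivation from first principles: \(z\) is read as ranging over non-zero numbers so that the inequality is strict, and \(x + 1\) is taken to be the number that stands in the relation \(P\) to \(x\), as the recursive definition of addition has it.

```latex
\begin{align*}
&\text{(1)}\quad y > x \;:\equiv\; \exists z\,(z \neq 0 \wedge x + z = y)\\
&\text{(2)}\quad x P y \;\Rightarrow\; x + 1 = y \;\Rightarrow\; y > x \qquad (\text{take } z = 1)\\
&\text{(3)}\quad x + z = y,\; z \neq 0 \;\Rightarrow\; z = 1 + z' \;\Rightarrow\; (x + 1) + z' = y
\end{align*}
```

Step (2) gives the direction from the successor axiom to the existence of a larger number. Step (3) gives the other direction: a larger \(y\) lies a non-zero number of steps beyond \(x\), and the first of those steps is \(x + 1\), which is the successor of \(x\). Step (3) takes the definition of addition for granted, as the text does.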
4.3 The different developmental stages
Another finding that Cheung et al. (2017) report is that they have found four different stages of knowledge that children can be in. This corroborates earlier findings regarding stages of knowledge by Evans (1983), Gelman (1980), Hartnett (1991) and Hartnett and Gelman (1998). First, they may not realise that there is no largest number, nor that it is possible to always add 1 (mean age 5.2 years). Second, they may know that there is no largest number but not realise that this implies that you can always add 1 to a number (mean age 5.5 years). Third, children may not know that there is no largest number, even though they think that you can always add 1 (mean age 5.3 years). Fourth, they may know both that there is no largest number and that you can always add 1 to a number (mean age 6.2 years). When, and in what order, children go through these stages is mostly unclear, since there was a range for all of these categories from about 4 to 7 years. This seems to fit with Falk (2010), who found that at 6–7 about 50% manage to explain why going second in their game gives a winning strategy, which increases to 80% at 10–11. This supports the relevance of the data from Falk (2010) for showing that children are in a position to follow my alternative strategy, since the younger children who were interviewed were thus still in the process of learning about the infinity of the natural numbers. However, since we see that there are these stages, a hypothesis about how children acquire number concepts that are consistent with the successor axiom should be able to account for the existence of the different stages. The last part of my argument that this hypothesis is a viable alternative is therefore to show that these stages can be interpreted in terms of aspects of the structure of the numeral system, as those were highlighted in Sect. 2.
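The four stages amount to the four combinations of two yes/no answers, which the following sketch makes explicit. The function name `classify_stage` and the dictionary `MEAN_AGE_YEARS` are hypothetical labels introduced for illustration; the stage numbering and the mean ages are the ones stated above.

```python
# Hypothetical names; the numbering and ages are those given in the text.
MEAN_AGE_YEARS = {1: 5.2, 2: 5.5, 3: 5.3, 4: 6.2}


def classify_stage(no_largest_number: bool, can_always_add_one: bool) -> int:
    """Map a child's two answers onto the stage numbering used in the text."""
    if not no_largest_number and not can_always_add_one:
        return 1  # neither piece of knowledge
    if no_largest_number and not can_always_add_one:
        return 2  # no largest number, but adding one is not always possible
    if not no_largest_number and can_always_add_one:
        return 3  # adding one is always possible, yet there is a largest number
    return 4  # both pieces of knowledge


if __name__ == "__main__":
    for answers in [(False, False), (True, False), (False, True), (True, True)]:
        stage = classify_stage(*answers)
        print(answers, "-> stage", stage, "mean age", MEAN_AGE_YEARS[stage])
```

Laying the stages out this way shows that stages 2 and 3 are the mixed cases, which is where the two accounts have to explain how one answer can be given without the other.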
The other interpretation that I want to put forward is based instead on the distinction I made earlier between grasping the syntactic structure of the numeral system and interpreting the numerals as designating cardinal values. If the child (the one in question was 5 years, 3 months) hasn’t made the connection between these two, then we may explain the above behaviour. A claim about there not being a largest number may be related to cardinal values, while the question about counting relates to the syntactic structure of the numeral system. When that syntactic structure is grasped it is easy for children to form larger numbers, as the earlier interviews should show. This child hasn’t grasped the syntactic structure of the numeral system and therefore doesn’t know how to keep on counting, even though the child does know (e.g. on the basis of explicit instruction from his or her parents) that there are ever larger cardinal values. One indication for that latter claim is that the response to ‘why do they go on forever?’ was “Because God made them” (Cheung et al. 2017, p. 33). This doesn’t clash with my account that children learn about the infinity of the natural numbers via the syntactic structure of the numeral system, since on this explanation their answer that there is no largest number is a rehearsal of what the parents told the child. Proponents of the plus-one strategy will have to say something similar, since children deny that you can keep adding one, and so they presumably haven’t learned about the infinity of the natural numbers along the lines of the plus-one strategy either. So, on either interpretation of the data this stage can be accounted for by my alternative account.
E: Can you think of a bigger number?
C: A million.
E: Is that the biggest number there could ever be?
C: No.
E: Can you think of a bigger number?
C: I don’t know. (Cheung et al. 2017, p. 33)
The third stage, where children claim that the numbers don’t go on forever but that you can keep counting, can be analysed in the same way. The one interview that Cheung et al. (2017) report is from a child who says that the largest number is 2083 and that while you can keep counting, “C: Numbers do not go on forever because if you keep on counting, it takes you back to 0” (Cheung et al. 2017, p. 34). In this case the child has simply misunderstood the syntactic structure of the numeral system. Yet it also means that there is something missing in the interpretation of numerals as designating cardinal values. Since the child is happy to affirm that you can keep counting, i.e. that you can keep adding one, it can’t be that he or she fully realises that this increases the cardinal value from 2083 to something higher. Here an abnormal count sequence is maintained, but that can only be possible (as a consistent practice) if the count sequence and the procedure of adding one aren’t also viewed as changing the designated cardinal value.
The data we currently have can be interpreted as support for the hypothesis that a child has correctly grasped the syntactic structure of the numeral system without interpreting numerals as designating specific cardinal values, especially since there is further data in support of the claim that (syntactic) knowledge of the numeral list precedes knowledge of the progression of cardinal values (Fuson 1988). On this interpretation children maintain that you can keep counting forever, because syntactically that is possible. However, they also maintain that there is a largest number, because number is viewed as cardinal value (not to claim that this is something a child would say, but for example: the number of atoms in the universe is the largest number). As long as these two elements are not combined, children end up with at most partial knowledge regarding the successor axiom. This means that the current hypothesis can account for all the developmental stages that were identified by Cheung et al. (2017), Evans (1983), Gelman (1980), Hartnett (1991) and Hartnett and Gelman (1998) in terms of aspects of the structure of the numeral system. In fact, Cheung et al. (2017, p. 32) also briefly suggest that a viable interpretation of the data is that the structure of the numeral system is instrumental in learning that there are infinitely many natural numbers (though via the plus-one strategy, so without my second claim).
Even so, there is one last argument that needs to be discussed. The fact that stage 2 is very uncommon and stage 3 common can be seen as an argument in favour of the plus-one strategy. The argument would be that my alternative account cannot explain why it is strange for children not to realise that you can keep adding one, even though they know that there is no largest number. After all, I do not claim that they learn this by reflecting on the procedure of adding one and so they could just ignore this procedure. My account would have a harder time explaining why stage 2 is uncommon, and so the plus-one strategy better fits the data. My interpretation of the data offers an answer to this argument. I suggest that children in this stage generally don’t grasp the syntactic structure of the numeral system. This means that they also cannot arrive at the conclusion that there is no largest number via my alternative account. The explanations for how we still get this data, i.e. the two interpretations I gave above, will be basically the same for my account and for the plus-one strategy. So, on either account stage 2 is expected to be uncommon, and on either it is somewhat difficult to explain how it’s possible that children give these answers. In short, this fact doesn’t decide between the two accounts and my alternative account is empirically viable.
5 Conclusion
Current philosophical accounts explain our knowledge of the successor axiom (directly or indirectly) on the basis of the procedure of adding 1. I have put forward an alternative hypothesis that explains this knowledge on the basis of the structure of the numeral system. Yet much is still unclear. Children eventually reason along the lines of the adding-one strategy and we don’t yet know how they get there from the earlier stage I have focussed on. Even for that early stage there is too little data to claim with any confidence that children only rely on the structure of the numeral system, or that they then focus on digits with a high place-value. As I have noted in passing, it might well be that children rely on the adding-one procedure for numbers below 100 and on the structure of the numeral system for numbers above 100. A lot of data is still needed (not only for numbers below 100, but also for syntactically more complex numbers, such as 2001) and for that reason neither of the accounts can currently be established as the correct one. Instead, I have argued that my alternative hypothesis is currently viable and therefore merits attention.
My argument for the viability of my alternative hypothesis consisted of several parts. First, in Sect. 3, I argued that the response patterns seen in a game played by children indicate that they are capable of forming larger numbers in accordance with my suggested alternative. In relation to the same experiments I noted that their answers, when queried whether one can keep forming larger numbers, also display this kind of reasoning. Even so, this data does not undermine the plus-one strategy, since there are other reasons (such as syntactic simplicity) why they might have preferred these kinds of responses.
Second, in Sect. 4, I argued that the alternative hypothesis accounts for the consistency of children’s concepts with the successor axiom in such a way that it is in line with what we know about the development of this knowledge. The moment at which children display behaviour that is consistent with the successor axiom is correlated with the moment at which they have a fairly robust knowledge of the structure of the numeral system. Furthermore, the different developmental stages identified by Cheung et al. (2017) and others can be interpreted with the help of the different aspects of the structure of the numeral system, as put forward in Sect. 2. Finally, I also argued that the alternative hypothesis is a formally viable way of acquiring number concepts consistent with the successor axiom. All in all, I have argued that there are in fact two viable accounts that explain how we learn that there are infinitely many natural numbers. The plus-one strategy is one option that fits the data. Another option that fits the data is that we use the syntactic structure of the numeral system, paying special attention to the leftmost digits.
Footnotes
1. Meaning that it is shown that the numbers defined in Frege arithmetic are in an important respect ‘the same’ as the numbers defined by the Dedekind–Peano axioms. The relevant proof shows that a suitably reformulated version of the Dedekind–Peano axioms is true in Frege arithmetic. So, the embedding is the reformulation of the Dedekind–Peano axioms to fit the language of Frege arithmetic.
2. In some cases, the two leftmost digits would be said first: if there is inversion for the numbers under 100, then spoken numerals such as eighty two thousand would have the second numeral mentioned first. In that case, though, it is still clear that the leftmost digits are first. There are, to my knowledge, no situations involving numbers over 99 where one would start by pronouncing the rightmost digit.
3. There are no problems in doing so, as children by this age already have number concepts consistent with all the other Peano axioms, or so I argue in Buijsman (forthcoming).
References
Barrouillet, P., Camos, V., Perruchet, P., & Seron, X. (2004). ADAPT: A developmental, asemantic, and procedural model for transcoding from verbal to Arabic numerals. Psychological Review, 111(2), 368–394.
Buijsman, S. (forthcoming). Learning the natural numbers as a child. Noûs, 1–20.
Camos, V. (2008). Low working memory capacity impedes both efficiency and learning of number transcoding in children. Journal of Experimental Child Psychology, 99(1), 37–57.
Carey, S. (2009). Where our number concepts come from. Journal of Philosophy, 106(4), 220–254.
Cheung, P., Rubenson, M., & Barner, D. (2017). To infinity and beyond: Children generalize the successor function to all possible numbers years after learning to count. Cognitive Psychology, 92, 22–36.
Davidson, K., Eng, K., & Barner, D. (2012). Does learning to count involve a semantic induction? Cognition, 123, 162–173.
Evans, D. (1983). Understanding zero and infinity in the early school years. Philadelphia: University of Pennsylvania. (Unpublished doctoral dissertation).
Falk, R. (2010). The infinite challenge: Levels of conceiving the endlessness of numbers. Cognition and Instruction, 28(1), 1–38.
Fuson, K. (1988). Children’s counting and concepts of number. New York: Springer.
Fuson, K. (1990). Issues in place-value and multidigit addition and subtraction learning and teaching. Journal for Research in Mathematics Education, 21(4), 273–280.
Fuson, K., & Briars, D. (1990). Using a base-ten blocks learning/teaching approach for first- and second-grade place-value and multidigit addition and subtraction. Journal for Research in Mathematics Education, 21(3), 180–206.
Fuson, K. C., & Hall, J. W. (1983). The acquisition of early number word meanings: A conceptual analysis and review. In H. P. Ginsburg (Ed.), The development of children’s mathematical thinking (pp. 49–107). London, New York: Academic Press.
García-Orza, J., & Damas, J. (2011). Sequential processing of two-digit numbers: Evidence of decomposition from a perceptual number matching task. Journal of Psychology, 219(1), 23–29.
Gelman, R. (1980). What young children know about numbers. Educational Psychologist, 15, 54–68.
Göbel, S., Moeller, K., Pixner, S., Kaufmann, L., & Nuerk, H. (2014). Language affects symbolic arithmetic in children: The case of number word inversion. Journal of Experimental Child Psychology, 119, 17–25.
Hartnett, P. (1991). The development of mathematical insight: From one, two, three to infinity. Philadelphia: University of Pennsylvania. (Unpublished doctoral dissertation).
Hartnett, P., & Gelman, R. (1998). Early understandings of numbers: Paths or barriers to the construction of new understandings? Learning and Instruction, 8, 341–374.
Jeshion, R. (2014). Intuiting the infinite. Philosophical Studies, 171, 327–349.
Klein, E., Bahnmueller, J., Mann, A., Pixner, S., Kaufmann, L., Nuerk, H., et al. (2013). Language influences on numerical development—Inversion effects on multidigit number processing. Frontiers in Psychology, 4(480), 1–6.
Lakoff, G., & Núñez, R. (2000). Where mathematics comes from: How the embodied mind brings mathematics into being. New York: Basic Books.
Le Corre, M. (2014). Children acquire the later-greater principle after the cardinal principle. British Journal of Developmental Psychology, 32(2), 163–177.
Lengnink, K., & Schlimm, D. (2010). Learning and understanding numeral systems: Semantic aspects of number representations from an educational perspective. Philosophy of Mathematics: Sociological Aspects and Mathematical Practice, 11, 235–264.
Linnebo, Ø. (2004). Predicative fragments of Frege arithmetic. The Bulletin of Symbolic Logic, 10(2), 153–174.
Mann, A., Moeller, K., Pixner, S., Kaufmann, L., & Nuerk, H. (2012). On the development of Arabic three-digit number processing in primary school children. Journal of Experimental Child Psychology, 113, 594–601.
Moeller, K., Huber, S., Nuerk, H., & Willmes, K. (2011). Two-digit number processing: Holistic, decomposed or hybrid? A computational modelling approach. Psychological Research, 75(4), 290–306.
Moeller, K., Klein, E., & Nuerk, H. (2011). Three processes underlying the carry effect in addition—Evidence from eye-tracking. British Journal of Psychology, 102(3), 623–645.
Moyer, R., & Landauer, T. (1967). Time required for judgements of numerical inequality. Nature, 215, 1519–1520.
Nuerk, H., Weger, U., & Willmes, K. (2001). Decade breaks in the mental number line? Putting the tens and units back in different bins. Cognition, 82(1), B25–B33.
Nuerk, H., Weger, U., & Willmes, K. (2002). A unit-decade compatibility effect in German number words. Current Psychology Letters: Behaviour, Brain and Cognition, 7, 19–38.
Nuerk, H., Weger, U., & Willmes, K. (2005). Language effects in magnitude comparison: Small, but not irrelevant. Brain and Language, 92(3), 262–277.
Nuerk, H., Moeller, K., & Willmes, K. (2015). Multidigit number processing: Overview, conceptual clarifications, and language influences. In R. C. Kadosh, & A. Dowker (Eds.), The Oxford Handbook of numerical cognition (pp. 106–139).
Pantsar, M. (2015). In search of \(\aleph _0\): How infinity can be created. Synthese, 192, 2489–2511.
Parsons, C. (2007). Mathematical thought and its objects. Cambridge: Cambridge University Press.
Pixner, S., Moeller, K., Hermanova, V., Nuerk, H., & Kaufmann, L. (2011). Whorf reloaded: Language effects on nonverbal number processing in 1st grade—A trilingual study. Journal of Experimental Child Psychology, 108(2), 371–382.
Rule, J., Dechter, E., & Tenenbaum, J. (2015). Representing and learning a large system of number concepts with latent predicate networks. In D. C. Noelle, R. Dale, A. S. Warlaumont, J. Yoshimi, T. Matlock, C. D. Jennings, & P. P. Maglio (Eds.), Proceedings of the 37th annual meeting of the cognitive science society (pp. 2051–2056).
Sarnecka, B. (2015). Learning to represent exact numbers. Synthese, 1–18. https://doi.org/10.1007/s1122901508546.
Sarnecka, B., & Carey, S. (2008). How counting represents number: What children must learn and how they learn it. Cognition, 108, 662–674.
Sarnecka, B., & Lee, M. (2009). Levels of number knowledge during early childhood. Journal of Experimental Child Psychology, 103, 325–337.
Siegler, R., & Robinson, M. (1982). The development of numerical understandings. In Reese and Lipsett (Eds.), Advances in child development and behavior (Vol. 16, pp. 242–312).
Singer, F., & Voica, C. (2008). Between perception and intuition: Learning about infinity. Journal of Mathematical Behavior, 27, 188–205.
Wynn, K. (1990). Children’s understanding of counting. Cognition, 36(2), 155–193.
Wynn, K. (1992). Children’s acquisition of number words and the counting system. Cognitive Psychology, 24, 220–251.
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.