The Logic of Learning

Bennet, Christian

doi:10.1007/s10516-018-9394-2

Original Paper
Open access
Published: 17 August 2018

Volume 29, pages 173–187, (2019)

Axiomathes

Christian Bennet ORCID: orcid.org/0000-0003-4939-9561¹

Abstract

An intensional logic is presented and suggested as a framework for a formal investigation of learning. The framework allows for discussing and comparing concepts and representations, and makes it possible to view learning processes as iterations of a certain type of function (contractions). It is shown how this framework may be used to shed light on Meno’s paradox, but also on concepts such as Vygotsky’s ZPD and learning trajectories. In the case of mathematics, where there are recent attempts to merge ideas from the philosophy of mathematics with ideas from the philosophy of education, a formal framework such as the one presented here may constitute a common arena of discussion.

1 Introduction

Learning has been discussed by philosophers at least since the days of Socrates. In Meno Plato (1997) lets Socrates state the paradoxical thought: You cannot inquire into what you know, since you already know it, but neither can you inquire into what you do not know since you then don’t know into what you inquire. Thus we may, it seems, conclude that inquiring is impossible or, at best, useless. Otherwise stated it seems that you cannot learn what you do not already know, since you do not know what knowledge to look for, and should you somehow come across new knowledge, you have no possibility of recognizing it, since you do not know it. Now obviously learning is possible; whence the paradox.

There are two typical “solutions” to Meno’s (alleged) paradox. The first, which we may call “The Target Argument”, argues that, although one cannot inquire into something of which one knows nothing at all, one may well inquire into something of which one has a true belief: We may not be able to inquire into the whereabouts of a total stranger. We wouldn’t know at all where to look or even what to look for. But if we knew enough about the person, for example if we had a drawing of him, which we truly believed to be correct, and some true beliefs about where he might be found, our search could well be successfully organized. The second argument, “The Recognition Argument”, states that even if one may inquire into something of which one knows nothing at all, it would be impossible to recognize this new knowledge once found: Looking for the stranger’s whereabouts, we wouldn’t know that we had found him, even if we had indeed done so, since we don’t know what he looks like. But, again, if we knew enough about the stranger’s looks, and had the appropriate true beliefs, we might well be in a position to appreciate a successful search.^{Footnote 1}

Thus Meno’s paradox may be dissolved, if accepting cognitive states of beliefs, in some sense close to knowledge. Here are two somewhat more realistic examples: Chris doesn’t know whether there are finitely or infinitely many primes, but she knows some elementary arithmetic, including the definition of prime number, and she can recognize a proof when she sees one. Thus her teacher leads her through the different steps of Euclid’s proof of the infinity of primes, and after some thought she has gained new knowledge. It may be noted here that Chris has then not only come to know the single fact that there are infinitely many primes. She has also obtained a deeper understanding of what a prime number is, and thereby a better understanding of the concept of natural number. Her conceptions, or concept images (we will return to this), have changed, and should in some sense be considered as having become closer to the corresponding “real” (intersubjective) concepts.

It also happens that Chris has no knowledge of what democracy is. This time Chris’s teacher, acknowledging this fact, decides to spend a suite of social science classes to highlight the concept. Starting out from Chris’s and her classmates’ pre-knowledge from geography, history, and social science, the learners obtain a fairly good knowledge of what constitutes democracy. During these classes Chris and her classmates gradually change their conceptions (or concept images) of democracy to become closer to the general (intersubjective) concept.

Now, in order to explain Meno’s (alleged) paradox in these terms, we need an ontology which recognizes cognitive states of different types. Typically we need a context in which we may discuss states of belief or knowledge, individual concepts (or conceptions or concept images), intersubjective concepts, likeness of concept images to each other as well as to concepts, etc. Such a context should, we believe, as far as possible be open to different theoretical stances, and (thus) open to different views on ontology. The aim of this paper is to present such a context—a logic of learning.

Intensional logic is a field which traces its origin to Aristotle (1963). In Prior Analytics and On Interpretation he investigates primarily extensional contexts, but also intensional contexts connected to temporality, necessity, and knowledge (the famous sea battle example in Chapter IX of On Interpretation). The modern, formalized versions of intensional logic go back to Lewis and Langford (1959), building on much earlier work by Lewis, and they received their first formal semantics from Carnap, Hintikka, Kripke and others in terms of possible worlds and Kripke frames in the mid twentieth century. Today different intensional logics are used with success in different fields, from investigating moral concepts in natural language contexts to investigating provability in formalized arithmetic.

Here we suggest the possibility of using intensional logic as a framework for investigating learning. After presenting the logic we have in mind in a semi-formal manner, we show how to understand learning and related concepts in a formal intensional context. There are several gains to be made here. One is to understand the logic of learning, i.e. to gain insight into what constitutes a learning process and how such a process is connected to the concepts learnt. Another is to provide a common, ideologically and ontologically neutral context within which educational researchers could meet in a common discourse. This will be exemplified and discussed towards the end of the paper.

Note that our interest here is in a formal analysis of learning in an intensional setting, rather than in a formal analysis of concepts. The latter may be found in Ganter and Wille (1998) Formal Concept Analysis, a highly successful mathematical (extensional) analysis of extensions and intensions of concepts. There, however, the intensional character of learning is not analyzed.

Throughout the paper we keep the formalism at a minimal level, but we do need to use some formal concepts along the way. These will be retrieved from elementary logic, but to some extent also from topology. All concepts used are, as far as possible, explained in the text, but the mathematically inexperienced reader may need to give the text a somewhat superficial first reading.

2 An Intensional Logic

In this section we present an intensional logic called ILAO, Intensional Logic of Abstract Objects. We need to use some formalism, but refer the reader interested in technicalities to Zalta (2001, 1988). For details, further background, and references on intensional logic, we suggest Fitting and Mendelsohn (1998).

ILAO was created by Edward Zalta in order to provide a formal context for reasoning about abstract and non-abstract objects, the latter from now on called ordinary objects. Here ordinary objects are things like elephants, chairs, and students, while abstract objects are things like concepts, feelings, and propositions. Note that we don’t say anything about the ontological status of either kind of object, and we don’t assume a fixed classification. As an example we may leave open if the number five is an abstract or an ordinary object, even though we do have a personal opinion here; numbers are abstract, just as, e.g., classes and sets.

In general a logic consists of a formal language and a consequence relation. A formal language, in turn, is defined by its syntax and its semantics. Thus, we need to specify the syntax, the semantics, and the relation of logical consequence for ILAO. Now, the core of ILAO is standard second order predicate calculus. Thus the vocabulary may contain names for individuals, names for (first order) concepts and relations, individual variables, predicate variables, and the standard logical apparatus for connectives ($\wedge , \vee , \lnot , \rightarrow , \leftrightarrow$ for and, or, not, if ...then, if and only if (abbreviated iff)), and (first and second order) quantification ($\forall , \exists$ for For all, There is). In such a language identity for ordinary objects is defined in the Leibnizian way by $u = v \leftrightarrow \forall X(Xu \leftrightarrow Xv)$, i.e. u is identical to v iff u and v share the same properties. For this part of the language, the semantics is usual (extensional) Tarski style semantics.

In order to include also intensional contexts, we need some auxiliary machinery. Thus we include formulas $\square \phi$ (necessarily $\phi$), $\blacksquare \phi$ ($\phi$ has always been the case), E!x (x exists, i.e. x has a place in space), $\lambda x\phi$ (to be an individual x such that $\phi$), $\iota x\psi$ (the individual x such that $\psi$), and xP (x encodes the property P, as distinct from the usual Px, i.e. x has, or exemplifies, the property P). The idea behind the latter distinction is that in our vocabulary ordinary objects exemplify properties while abstract objects encode properties. Thus an elephant may exemplify the property of having a trunk, while my mental representation of an elephant may (or may not) encode the same property. We will have more to say here later on. Further, we use $\lozenge \phi$ and $\blacklozenge \phi$ as abbreviations for $\lnot \square \lnot \phi$ and $\lnot \blacksquare \lnot \phi$, respectively. Thus $\lozenge \phi$ means it is possible that $\phi$ and $\blacklozenge \phi$ means $\phi$ holds at some point in time.

With this technical machinery in place, we may define, e.g., the concept of being an ordinary object: $O!y =_{def} (\lambda x\lozenge \blacklozenge E!x)y$, i.e. y is such that it, in some world and at some point in time, exists in space. Negating this gives us the definition of abstract objects as those which necessarily never exist in space: $A!y =_{def} (\lambda x\square \blacksquare \lnot E!x)y$.

We also include a number of axioms like the following, which guarantee that objects behave the way we want, sufficiently fix our terminology, and give us a sufficiently rich universe:

Axiom 1
No ordinary objects may ever encode a concept: $\forall x(O!x \rightarrow \square \blacksquare \lnot \exists PxP)$.
Axiom 2
Any object that encodes some property in some world and at some point in time, encodes that property necessarily always: $\forall x\forall F(\lozenge \blacklozenge xF \rightarrow \square \blacksquare xF)$.
Axiom 3
For any condition $\phi$ on properties there is necessarily always an abstract object which encodes precisely those properties that fulfil $\phi$: $\square \blacksquare \exists x(A!x \wedge \forall P( xP \leftrightarrow \phi ))$.

Axiom 2 means that abstract objects rigidly encode the properties they encode irrespective of possible world or point in time. An example of a consequence of Axiom 3 is the following:

$\square \blacksquare \exists x(A!x \wedge \forall P(xP \leftrightarrow P =$ “to be round” $\vee$ $P =$ “to be a square”)).

Thus there is necessarily always an abstract object which is a round square. It is possible to think about this object, have opinions of it, discuss it, etc., even though it has no existence in space.

Here we may note in passing that it is not contradictory for an abstract object to encode both a concept and its complement. Thus a may encode both being a square and not being a square. No contradiction follows from aP together with $a\lnot P$, as opposed to Pa contradicting $\lnot Pa$; of course no object exemplifies both P and $\lnot P$.

3 Semantics

Next we must say a few words on semantics. In fact we have already assumed a semantics, by spelling out the formulas above in English. It may, however, be preferable to be somewhat more formal.

In general a semantics for a formal language consists in a definition of a truth relation between formulas and structures (models), thus specifying under what conditions a formula $\phi$ is true in a structure V. Once this is done the consequence relation is defined as: $\phi$ is a logical consequence of the set of formulas X iff $\phi$ is true in all structures in which all formulas in X are true. The idea here may again be traced back to Aristotle, but in modern terms it was the Polish logician Alfred Tarski who managed to define the formal, today standard, concept.

Now Tarski style semantics is extensional. Thus if, e.g., we have the concepts x is the morning star (x is the first heavenly body being seen in the morning), x is the evening star (x is the first heavenly body being seen in the evening), and the name Venus in our vocabulary, it is a logical consequence of The morning star is Venus together with The morning star is the same as the evening star (which both, given a certain amount of idealization, happen to be true) that The evening star is Venus. From this in turn it follows, if we have the predicate Chris knows that x in our language, that if Chris knows that The morning star is Venus she also knows that The evening star is Venus. But this may, obviously, not be the case. Since knowledge is intimately connected to learning, we must, therefore, include also intensionality in our semantics. This is done by utilizing the notion of abstract objects and possible worlds. The latter is a notion that draws on ideas from Leibniz, Wittgenstein, and Carnap, and which in its modern, technical setting is due to Kripke.

Here is how this is done: First we let a structure V be a sequence

$$\begin{aligned} (D, R, (W, w_0), (T, t_0, <), ext_{w, t}, ext_A, I) \end{aligned}$$

such that

D is a (non-empty) set of ordinary and abstract individuals
$R = R_0 \cup R_1 \cup \dots \cup R_n \cup \dots$ is the set of all n-ary relations, where 0-ary relations are propositions
W is the set of all possible worlds, of which $w_0$ is the actual world
$(T, t_0, <)$ is an ordered set of points in time, where $t_0$ is “now”
$ext_{w, t}$ is a function that provides every relation in R with an (exemplification) extension in w at t, and each element in $R_0$ with a truth value. Thus, if P is a unary predicate, $ext_{w,t}$ applied to P gives us the extension of P in w at t, i.e. the set of objects which in that particular world at that particular moment have (exemplify) the property P.
$ext_A$ is a function that provides every relation in R with an (encoding) extension in the form of a subset of $D^n$, where n is the arity of the relation. Again, if P is a unary predicate, $ext_A$ applied to P gives us the set of objects which (in any world, and at all moments) encode the property P.
I provides every constant with an object of relevant type in relation to D. This means that proper names are rigid in the sense that they denote the same object in every possible world.^{Footnote 2}

Next we define truth relative to V, and here, leaving out all technical details, we just show by examples how things are supposed to work^{Footnote 3}:

Pa is true in w at time t iff the object provided by I to a is in the exemplification extension provided to P by $ext_{w, t}$. Thus Pa is true iff the object denoted by a has the property denoted by P.
aP is true in w at time t iff the object provided by I to a is in the encoding extension provided to P by $ext_A$. Thus aP is true iff the (in this case abstract) object denoted by a encodes the property denoted by P.
$\lnot \phi$ is true in w at t iff $\phi$ is not true at w at t.
$\forall x \phi$ is true in w at t iff $\phi$ is true whatever object in D we let x designate.^{Footnote 4}
$\forall X \phi$ is true in w at t iff $\phi$ is true whatever property in $R_1$ we let X designate.
The intended meaning of $\square \phi$ is It is necessarily the case that $\phi$. The interpretation here of necessity is “true in all possible worlds”. Thus we define $\square \phi$ to be true in a world w iff $\phi$ is true in all possible worlds.^{Footnote 5}
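As a rough illustration of the truth clauses above, the following Python sketch builds a tiny structure with two worlds and two points in time. The individuals (`jumbo`, `rep_jumbo`), the property names, the sample extensions, and the choice to evaluate necessity at a fixed point in time are all assumptions made for this example; they are not part of ILAO itself.

```python
# Toy structure in the spirit of V. All names and data are illustrative assumptions.
WORLDS = ["w0", "w1"]
TIMES = [0, 1]

# Exemplification extensions vary with world and time (the role of ext_{w,t}).
ext = {
    ("w0", 0): {"HasTrunk": {"jumbo"}},
    ("w0", 1): {"HasTrunk": {"jumbo"}},
    ("w1", 0): {"HasTrunk": set()},
    ("w1", 1): {"HasTrunk": {"jumbo"}},
}

# Encoding extensions are fixed once and for all (the role of ext_A).
ext_A = {"HasTrunk": {"rep_jumbo"}, "IsPink": {"rep_jumbo"}}


def exemplifies(a, P, w, t):
    """Pa is true in w at t iff a is in the exemplification extension of P."""
    return a in ext[(w, t)].get(P, set())


def encodes(a, P):
    """aP is true iff a is in the encoding extension of P (world and time play no role)."""
    return a in ext_A.get(P, set())


def necessarily(formula, t):
    """Box: the formula holds in every possible world (time held fixed here)."""
    return all(formula(w, t) for w in WORLDS)


# Jumbo has a trunk in the actual world now, but not necessarily.
trunk_now = exemplifies("jumbo", "HasTrunk", "w0", 0)
trunk_necessarily = necessarily(lambda w, t: exemplifies("jumbo", "HasTrunk", w, t), 0)

# The representation encodes its properties rigidly, even an incorrect one.
pink_necessarily = necessarily(lambda w, t: encodes("rep_jumbo", "IsPink"), 0)
```

In this sketch exemplification depends on the world and time of evaluation, while encoding does not, which is the asymmetry that Axiom 2 is meant to capture.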

4 Some Preliminary Examples

Before we go on, we give some simple examples of how the formalism of ILAO may be used. Let V be as above, and suppose our language is sufficiently rich for the following reasoning.

4.1 A First Example

Let Venus denote the planet Venus, supposing D includes ordinary objects like heavenly bodies and human beings. Further, in any possible world w, let The morning star be a concept exemplified by the first heavenly body seen in the morning in that world, and The evening star be a concept exemplified by the first heavenly body seen in the evening. In (an idealized) $w_0$ it happens to be the case that both these heavenly bodies are the planet Venus. But in some other possible world, $w_1$ say, this may not be so. Now $\square \phi$ is true at w iff $\phi$ is true in all possible worlds. Thus it is not true in $w_0$ that It is necessarily the case that the morning star is Venus, even if it is true that The morning star is Venus.

4.2 A Second Example

Suppose that Chris believes that 10 is an even number. Now the solution to the equation ${3\over 4} x + {5\over 6} = 5x - {125\over 3}$ happens to be 10. Does this mean that Chris also believes that the solution to this equation is an even number? Well, that depends on how the proposition Chris believes that the solution to the equation is even is understood. If the solution to the equation is understood as referring to the solution in an extensional sense, then the proposition is true; since Chris believes that 10 is even, and 10 is the solution to the equation, she obviously believes that solution to be even. But if the solution to the equation is understood as a conception of Chris’s, then it may very well be that the proposition is false; Chris believes that 10 is an even number, but she (wrongly) believes the solution of the equation to be 9, and thus she believes the solution to the equation, from her point of view, to be odd and not even.

Now the first reading may be formalized in ILAO by B(c, Ea), where B(c, p) designates the belief relation between the person c (Chris in this case) and the proposition p. Here p is Ea, i.e. the proposition that a (the number 10) has the property E (is an even number). In this formula a is in what is usually called de re position (just as c is), and the context is extensional in the sense that substituting another name for the number 10 for a in B(c, Ea) will not change the truth value of the sentence.

The second reading is different, the context being intensional; substituting another expression with the same reference for a may well change the truth value. In this case we should, however, formalize the proposition as , where denotes the abstract object which we may denote as the number being believed by Chris to be the solution to the equation, i.e. the number which, for Chris, is the reference of the solution to the equation.^{Footnote 6} This is called a de dicto reading.

One significant difference between this formalization, and the one above, is then that from B(c, Ea) and $a = b$ it follows that B(c, Eb), i.e. from Chris believes a to be even and $a = b$ it follows that Chris believes b to be even. But from $B(c, \underline{Ea})$ and it doesn’t follow that , i.e. from Chris believes a to be even and a is (in fact) the solution to the equation, it doesn’t follow that Chris believes the solution to the equation (according to her beliefs) to be even.^{Footnote 7}
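The two readings of the second example can be sketched in a few lines of Python. The code first solves the equation exactly and then contrasts a belief about the actual solution with a belief about Chris's own conception of the solution. The variable names and the modelling of Chris's belief state as a set of numbers are assumptions made for this illustration only.

```python
from fractions import Fraction

# Solve (3/4)x + 5/6 = 5x - 125/3 exactly: (5 - 3/4) x = 5/6 + 125/3.
solution = (Fraction(5, 6) + Fraction(125, 3)) / (Fraction(5) - Fraction(3, 4))

# Hypothetical belief state for Chris.
chris_believes_even = {10}       # numbers Chris believes to be even
chris_solution_conception = 9    # what Chris (wrongly) takes the solution to be

# Extensional (de re) reading: the belief concerns the actual solution.
de_re = solution in chris_believes_even

# Intensional (de dicto) reading: the belief concerns her own conception of the solution.
de_dicto = chris_solution_conception in chris_believes_even
```

Under these assumptions the extensional reading comes out true and the intensional reading false, which mirrors the failure of substitution described above.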

Instead of indulging further into these and similar examples, we now move on towards implementing the concept of learning in our formal intensional context.

5 Representations

Given an individual a (in some world w at some point t in time),^{Footnote 8} let $\phi$ describe properties of a according to $ext_{w,t}$. According to Axiom 3 above, there is then an abstract object b that encodes precisely the properties described by $\phi$. We call such an object a representation of a. Thus, a representation of an ordinary object a is an abstract object that encodes some (or all) properties that a has.

If, e.g., Chris sees the elephant Jumbo at the zoo, some (physical, non-linguistic) structure S in her brain may be seen as representing Jumbo. The cognitive content of S may be taken to be an abstract object encoding those properties that Chris (correctly or not) ascribes to Jumbo, while Jumbo himself is the objective content of S. We call this abstract object a mode of presentation or just a representation of Jumbo for Chris. The representation of Jumbo for Chris depends, among other things, on Chris’s knowledge of elephants; if she learns more, her representation of Jumbo, and of her concept elephant, changes.

For another example, suppose that Chris knows some basic facts about natural numbers. There is then an abstract object, p say, that represents the greatest known prime number for Chris. p may encode properties such as being an odd number, being divisible only by 1 and itself, being larger than 50,000,000 (if, e.g., she happens to know that 50,000,017 is prime), and not being the largest prime (for example if she knows that there are infinitely many primes). Assuming that Chris doesn’t believe any particular number to be the largest known prime, there is, however, no natural number k such that p encodes being equal to k.

Now, Chris also knows some basic facts about Mersenne numbers.^{Footnote 9} Thus she knows, e.g., that $2^{57885161} - 1$ is a Mersenne number, and that in order for $2^n - 1$ to be a prime number, n needs to be prime. Suppose now that m represents $2^{57885161} - 1$ for Chris. Then m and the above p are different abstract objects; they encode different properties. On the other hand, as it happens, $2^{57885161} - 1$ is in fact the largest known prime (at the time of writing). Thus, if Chris learns this fact, new abstract objects, $m'$, extending m but now encoding also the property being equal to $p'$, and $p'$, extending p but also encoding the property being equal to $m'$, will then be new representations of the two objects for Chris.
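One simple way to picture these representations is as sets of encoded properties. The following Python sketch does this for p, m and their extensions; the property labels are informal strings chosen for the example, and modelling an abstract object as a `frozenset` is an assumption of this sketch rather than part of ILAO.

```python
# Representations modelled as frozensets of property labels (labels are illustrative).
p = frozenset({"odd", "divisible only by 1 and itself", "greater than 50,000,000"})
m = frozenset({"Mersenne number", "of the form 2^n - 1 with n prime"})


def extend(rep, *new_props):
    """Return a new representation encoding everything in rep plus new_props."""
    return rep | frozenset(new_props)


# After Chris learns that the Mersenne number is the largest known prime,
# each representation is replaced by an extension encoding the identity.
p_new = extend(p, "equal to m'")
m_new = extend(m, "equal to p'")
```

Since a representation is taken to be fixed by the properties it encodes, learning does not modify p or m; it replaces them with the new objects `p_new` and `m_new`.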

We are now ready to (re)turn to the main topic of this article.

6 A Formal Concept of Learning

In the last example above, learning was implicitly presented as resulting in a change of representations. That Chris knows all about elephants would, in this nomenclature, mean that the representation elephant for Chris encodes the properties elephants have (according to $ext_{w,t}$ etc.). This representation of course also encodes a number of, in a sense, irrelevant properties like being the favorite animal of Chris’s best friend. There is no need to make this distinction precise here, but let us, just to keep it in mind, denote essential properties as e-properties.^{Footnote 10}

To learn a concept like elephant thus means to acquire a representation of elephant that encodes as many correct e-properties as possible, and (thus) as few incorrect e-properties as possible, of elephants. It may even be possible to speak here of the learner’s representation of a thing or a concept being as similar as possible to the real thing or concept.

Thus learning a concept may be described as a function acting on representations encoding properties of the concept. The process of Chris learning the concept may then be seen as a sequence of applications of a learning function F successively acting on Chris’s representations: Starting out from a vague representation r, maybe encoding few properties in common with the concept in question, and even encoding many properties not in common with it (misconceptions, if you like), F(r) is a somewhat “better” representation for Chris with more properties in common with (exemplified by) the concept to be learnt, and fewer misconceptions. Successively the sequence r, F(r), $F^2(r) = F(F(r))$, $F^3(r)$, ..., will produce representations of the concept in question for Chris that become more and more similar to the concept itself.^{Footnote 11}
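The iteration r, F(r), $F^2(r)$, ... can be sketched as follows. The concept, the starting representation, and in particular the rule used by `learn_step` (drop one misconception if there is one, otherwise add one missing property) are hypothetical choices made only to have something concrete to iterate; the text deliberately leaves open how a real learning function operates.

```python
# A concept modelled as the set of e-properties it exemplifies (illustrative labels).
CONCEPT = frozenset({"has trunk", "is mammal", "is herbivore", "has tusks"})


def learn_step(rep):
    """One application of a hypothetical learning function F."""
    misconceptions = sorted(rep - CONCEPT)
    if misconceptions:
        return rep - {misconceptions[0]}   # unlearn one incorrect property
    missing = sorted(CONCEPT - rep)
    if missing:
        return rep | {missing[0]}          # acquire one correct property
    return rep                             # nothing left to change


def iterate(rep, steps):
    """Return the sequence r, F(r), F^2(r), ..., F^steps(r)."""
    history = [rep]
    for _ in range(steps):
        rep = learn_step(rep)
        history.append(rep)
    return history


start = frozenset({"has trunk", "lays eggs"})   # vague, with one misconception
history = iterate(start, 5)

# Number of properties on which each representation disagrees with the concept.
mismatches = [len(rep ^ CONCEPT) for rep in history]
```

Here `mismatches` is only a crude count of disagreement between representation and concept; a proper notion of distance is what the next section sets out to provide.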

Here, as indicated in the above footnote, we avoid the question of precisely how such a process is performed. A natural view could, e.g., be that learning is in accordance with a prototype theory of conceptual formation, where a student, in our terminology, acquires better and better representations of a certain concept by being shown more and more central representatives of the corresponding class of objects.^{Footnote 12} But it may also be described in accordance with other philosophical views on what constitutes a concept. This philosophical issue is, however, not in focus in this paper.

An effective learning process is then a process for which the corresponding learning function in few iterations produces a good representation of the appropriate concept, a function that in a sense rapidly approaches the object of learning. Thus learning is here seen as functions acting in models for an intensional logic, the logic of learning.

For this to make sense, there is need to spell out what is to be counted as a good representation, i.e. there is a need for a concept of likeness or closeness between concepts and representations. There is also need for a measure of distance when talking about a sequence of representations approaching a concept or another representation. Mathematically concepts of distance are often modeled topologically by a metric.

7 Partial Metric Spaces

Again leaving out technical details, a metric space consists of a pair (X, d), where X is a set of objects, often called points, and d is a function from $X\times X$ to the non-negative real numbers.^{Footnote 13} The value d(x, y) is called the distance between the two points x and y, and must satisfy the conditions

M1
If $x = y$ then $d(x, y) = 0$ (equality implies indistancy)
M2
If $d(x, y) = 0$ then $x = y$ (indistancy implies equality)
M3
$d(x, y) = d(y, x)$ (symmetry)
M4
$d(x, z) \le d(x, y) + d(y, z)$ (triangle inequality).

Typical examples of metric spaces are the Euclidean plane with d(x, y) as the ordinary distance between x and y, and the real number line with $d(x, y) = |x - y|$. An, at least here, more interesting example is the following: Let S be any set, and X be the set of all infinite sequences $x = (x_0, x_1, x_2, \dots )$, with every $x_n$ a member of S. Further let, for sequences x and y in X, $d(x, y) = 2^{-k}$, where k is the largest number such that $x_i = y_i$ for all $i < k$, if such a number exists. If not, i.e. if $x = y$, let $d(x, y) = 0$. Thus d(x, y) measures the similarity between x and y in terms of the length of the longest initial sequence common to x and y. It is an easy exercise to show that (X, d) thus defined is a metric space.

Applied to our context, imagine Chris learning a certain concept, a. Associated with a is a potentially infinite number of e-properties exemplified by a. Describe a as a sequence of these properties $(p_0, p_1, p_2, \dots )$. Now Chris has some representation of a, b say, encoding some of the properties of a. Deciding on a prearranged numbering of all possible properties (or at least the essential ones) which are relevant to a particular field of interest,^{Footnote 14} we may compare the properties encoded by Chris’s representation with the properties exemplified by the true object, and measure the distance between them in terms of d(a, b). The smaller the distance, the “better” the representation; knowing the concept a thus means to have a representation of a close to a itself.

Now, there is a problem with this model of learning: Since concepts are described as infinite sequences, we may in fact never learn a concept in full through a finite process. In order to remedy this, we would rather like to include also finite sequences of properties. But then we no longer have a metric space, since $d(x, x) = 2^{-k}\not = 0$, for all sequences x of length k, whence M1 doesn’t hold.

The corresponding problem has, however, been noticed before in the context of computer science, where one is interested in measuring the “distances” between partial (finite) computations of potentially infinite strings. The solution is the concept of a partial metric space, where one accepts small, but possibly nonzero, self-distances. Thus a partial metric space is a set X together with a function d from $X\times X$ to the nonnegative reals, such that

P1
$0 \le d(x, x) \le d(x, y)$ (nonnegative (small) self-indistancy)
P2
If $d(x, x) = d(x, y) = d(y, y)$ then $x = y$ (indistancy implies equality)
P3
$d(x, y) = d(y, x)$ (symmetry)
P4
$d(x, z) \le d(x, y) + d(y, z) - d(y, y)$ (triangle inequality).
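To make the conditions P1 to P4 concrete, the following Python sketch implements the distance $2^{-k}$, with k the length of the longest common initial segment, on finite sequences and checks the four conditions by brute force on a small universe. Representing sequences as tuples over {0, 1} and restricting to length at most 3 are assumptions made to keep the check finite; a passing check on this sample is an illustration, not a proof.

```python
from itertools import product


def common_prefix_len(x, y):
    """Length of the longest initial segment shared by the tuples x and y."""
    n = 0
    for a, b in zip(x, y):
        if a != b:
            break
        n += 1
    return n


def d(x, y):
    """Distance 2^-k, where k is the length of the longest common prefix."""
    return 2.0 ** -common_prefix_len(x, y)


# All sequences over {0, 1} of length at most 3, as a small test universe.
points = [seq for n in range(4) for seq in product((0, 1), repeat=n)]


def satisfies_partial_metric_axioms(points, d):
    """Brute-force check of P1 to P4 on the given finite set of points."""
    for x, y in product(points, repeat=2):
        if not (0 <= d(x, x) <= d(x, y)):              # P1
            return False
        if d(x, x) == d(x, y) == d(y, y) and x != y:   # P2
            return False
        if d(x, y) != d(y, x):                         # P3
            return False
    return all(
        d(x, z) <= d(x, y) + d(y, z) - d(y, y)         # P4
        for x, y, z in product(points, repeat=3)
    )


axioms_hold = satisfies_partial_metric_axioms(points, d)
```

Note that a sequence of length k has self-distance $2^{-k}$, so shorter sequences, read as more partial knowledge, have larger self-distances.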

Again, we refer the technically interested reader to other literature.^{Footnote 15} We may, however, note that the theory of partial metric spaces is in a strong sense an extension of the theory of metric spaces, and here we wish to utilize this fact in a specific way.

Permitting finite sequences of properties as elements of a partial metric space, enables us to compare distances between representations of finite knowledge (in the form of finite sequences of properties) of concepts. In such a context it, thus, makes sense to compare Chris’s representation b of a concept a in the form of a finite sequence of properties that Chris, right or wrong, attributes to a, i.e. properties encoded by b. Further a certain representation may or may not have self distance 0. A (finite) representation b of a, may in this respect be considered to be partial knowledge of the true concept, just as in computer science a finite sequence may be viewed as a partial computation of an infinite sequence.

As mentioned above, we suggest viewing learning as a function in our, now partial, metric space. In fact some such functions are of special interest here: In a (partial) metric space (X, d), a function $f: X \rightarrow X$ is called a contraction if there is a real $0 \le r < 1$ such that, for all x, y, $d(f(x), f(y)) \le r\cdot d(x, y)$. Thus a contraction is a function that shrinks distances. Identifying learning with a contraction, we are interested in what happens if a contraction is iterated. To that end, we need yet another concept or two:

A sequence $(x_0, x_1, \dots )$ of points in a partial metric space (X, d) is a Cauchy sequence if there exists $r \ge 0$ such that for each $\epsilon > 0$ there exists k such that for all $n, m > k$, $|d(x_n, x_m) - r| < \epsilon$. Thus a Cauchy sequence is a sequence for which the distances $d(x_n, x_m)$ converge to some r as n and m approach infinity. This implies that $d(x_n, x_n)$ approaches r when n approaches infinity, and thus, if the space is in fact a metric space, $r = 0$.

Further, a sequence $(x_0, x_1, \dots )$ converges to a point y if $d(x_n, y)$ and $d(x_n, x_n)$ approach d(y, y) as n approaches infinity. A sequence thus converges to y if the points in the sequence approach the very vicinity of y, in as close a manner as possible.

Finally a partial metric space is said to be complete if every Cauchy sequence converges. Typical examples of complete metric spaces are the Euclidean plane and the real numbers with their usual metrics.
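The notions of contraction and Cauchy sequence can be illustrated in the space of finite sequences with the common-prefix distance. In the Python sketch below, the map `f`, which prepends a fixed symbol, is a hypothetical example of a contraction with $r = 1/2$; the sample points and the number of iterations are likewise assumptions of the example.

```python
def d(x, y):
    """2^-k, with k the length of the longest common prefix of tuples x and y."""
    k = 0
    for a, b in zip(x, y):
        if a != b:
            break
        k += 1
    return 2.0 ** -k


def f(x):
    """Hypothetical contraction: prepend the fixed symbol 0."""
    return (0,) + x


# Prepending a common symbol lengthens every common prefix by one,
# so every distance is exactly halved.
samples = [(), (0,), (1,), (0, 1), (1, 1, 0)]
ratios_ok = all(d(f(x), f(y)) == 0.5 * d(x, y) for x in samples for y in samples)

# Iterates of f starting from the empty sequence: (), (0,), (0, 0), ...
iterates = [()]
for _ in range(6):
    iterates.append(f(iterates[-1]))

# Self-distances shrink towards 0, as do the mutual distances d(x_n, x_m).
self_distances = [d(x, x) for x in iterates]
```

The iterates form a Cauchy sequence with $r = 0$. Their limit would be the infinite sequence consisting only of the symbol 0, which is not itself a finite tuple; this is why infinite sequences need to be included for the space to be complete.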

We are now ready to state a classical result from topology, but in the setting of partial, rather than ordinary, metric spaces:

Theorem (Mathews 1995)

For each contraction f over a complete partial metric space (X, d), there exists a unique point x such that $f(x) = x$. In fact this point x has self distance 0.^{Footnote 16}

Thus any contraction in a complete partial metric space has a unique fixed point. Now if such a contraction represents a learning process operating on representations of a concept, this means that the learning process has a fixed point, i.e., there is a representation b on which the process gives no further result. I suggest that b in that case is a very good representation of the learnt concept. This will be spelled out in more detail in the next section.
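The fixed point can be watched emerging in the simplest special case. An ordinary metric space is a partial metric space in which every self distance is 0, so the theorem covers the real line with its usual metric. The sketch below iterates a contraction there. The particular function `f`, the starting value and the number of steps are illustrative choices, not taken from the text.

```python
def f(x):
    """A contraction on the real line with factor r = 1/2; its fixed point is 2."""
    return x / 2 + 1

def iterate(func, x0, steps):
    """Return the list x0, func(x0), func(func(x0)), ... with steps applications."""
    xs = [x0]
    for _ in range(steps):
        xs.append(func(xs[-1]))
    return xs

xs = iterate(f, 10.0, 30)
gaps = [abs(x - 2.0) for x in xs]  # distance of each iterate from the fixed point
```

Each application of `f` halves the distance to 2, whatever the starting value, which is the behaviour the theorem guarantees for any contraction on a complete space.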

8 The Possibility of Learning

Suppose Chris is to learn arithmetic. Thus she is to learn concepts such as prime number, multiplication, and greatest common factor. As indicated above, we may then take the relevant e-properties to be all properties describable by formulas in the language of arithmetic. An arithmetical concept a may in turn be considered a sequence of such properties, the properties exemplified by a, and a representation b may be taken to be a similar sequence: the properties encoded by b (for Chris). All this is relative to an appropriate model of ILAO.

Now, if Chris is to learn what a is, the objective is to have b as similar as possible to a, i.e. the e-properties encoded by Chris’s representation of a should ideally be the same as those exemplified by a. In order to obtain such a result, Chris starts a learning process (e.g., in school with the help of teachers, classmates, schoolbooks, etc.). This process may be seen as a function acting on Chris’s representations of a, and providing new representations of a.

Since both concepts and representations of concepts are identified with (finite or infinite) sequences of e-properties, we may in principle view them as subsets of the natural numbers, where a number n belongs to the subset if and only if the property $p_n$ belongs to the sequence in question. Now the subsets of the natural numbers may in turn be seen as a complete partial metric space (essentially inheriting a metric from the reals). Thus, if Chris’s learning process is a contraction, the above theorem guarantees that there is a fixed point relative to this process, i.e. a representation b which will be a final representation of a for Chris. When, if ever (there is, of course, a question of time here), Chris represents a by b, we may say that she has in fact learnt the concept a. Thus, theoretically, Chris has a possibility of learning any concept, if provided with a contraction learning process.

Of course, there is in principle nothing mystical about a contraction learning process. Somewhat vaguely we may consider a concept to be an infinite sequence of properties, and a representation a finite such sequence. Through a didactical analysis of the subject matter in question one may order the e-properties based on some pre-knowledge ordering, and then measure distance according to similarity of initial segments.^{Footnote 17}
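A toy version of such a process may be sketched as follows. Everything specific in it is an assumption made for illustration: the five listed properties of "prime number" and their order (only an initial segment of what would be an infinite sequence), the distance $2^{-k}$ based on the length k of the common initial segment, and the step `learn`, which hands the learner the correct initial segment one property longer than her current representation.

```python
from fractions import Fraction

def d(x, y):
    """Assumed partial metric: 2 ** -(length of the common initial segment)."""
    n = 0
    for a, b in zip(x, y):
        if a != b:
            break
        n += 1
    return Fraction(1, 2 ** n)

# Hypothetical pre-knowledge ordering of e-properties for "prime number".
ORDERING = (
    "is a natural number",
    "is greater than 1",
    "has exactly two positive divisors",
    "is not a product of two smaller natural numbers",
    "is 2 or odd",
)

def learn(b):
    """One step: the correct initial segment, one property longer than b."""
    return ORDERING[: len(b) + 1]

def trajectory(b0, steps):
    """Successive representations obtained by iterating the learning step."""
    reps = [b0]
    for _ in range(steps):
        reps.append(learn(reps[-1]))
    return reps

chris_start = ("is a natural number", "is odd")  # second entry is a misconception
reps = trajectory(chris_start, 3)
distances = [d(b, ORDERING) for b in reps]
```

The distances to the listed segment shrink at every step, and the first step also repairs the misconception. Because only five properties are listed, the sketch stops there; the final distance is the self distance of the finite listing, not 0.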

Suggesting a formal logic of learning may be considered a theoretical exercise, but it does show that there is a formal framework in which learning may be shown to be possible. For Plato this was not at all obvious, and, as we have seen, Meno’s paradox is discussed even today. The formalism presented here shows that learning is in fact possible (which we already knew, of course), but also that knowledge may be the result of a (contraction) process operating on beliefs in the form of representations encoding e-properties. Thus, we may point to where Socrates’ reasoning goes astray.

9 Final Remarks

For subjects like mathematics there is yet another reason to put forward a logic of learning. Any view on knowledge emanating from some form of empiricism, and such views are today standard, may be taken as a philosophical basis for learning empirical subjects, but it is not as clear how to provide such a basis for mathematics. Numbers and operations like multiplication are not as easily pointed out ostensively as elephants, or even stars and dinosaurs, and there is an ongoing debate on the possibility of empirically grounding mathematics.^{Footnote 18} A formalism such as the one presented here could narrow the gap between a philosophy of mathematics approach and a philosophy of education approach in that learning is seen to live in the same world as mathematical concepts.^{Footnote 19}

We round off by giving just one example of possible benefits for a philosophy of mathematics and mathematics education. Within his theory of learning, Vygotsky (1962, 1978) defines the notion of a learner’s zone of proximal development, ZPD. Much has been written about this concept from an educational perspective, and the concept (or varieties of it) is typically an essential ingredient in constructivist views on learning (cf. e.g. Gredler and Clayton-Shields 2008). Now, ZPD is a notion that teachers recognize in their praxis; it is ineffective to teach a learner what she can successfully manage on her own or to give her tasks too hard to manage at all. In our formalism, this notion may be explicated in topological terms: Given a concept a and a learner’s representation b of a, the relevant ZPD will be a suitable (topologically) open set of points (representations) containing both a and b. A contraction function (learning/teaching process) with a as limit may then involve an effective scaffolding within this ZPD in order to support a narrowing of distance between a and the learner’s successive representations of a. It is, of course, also easy to relate this to the development of students’ learning in terms of “learning trajectories” (Daro et al. 2011), but we will not elaborate on this here.
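One way to picture such an open set is as an open ball around the learner’s representation. The sketch below rests on assumptions of our own: the distance $2^{-k}$ (k being the length of the common initial segment), one common convention for open balls in partial metric spaces, namely all y with $d(x, y) < d(x, x) + \epsilon$, and a hypothetical 0/1 concept whose initial segment is `concept_prefix`.

```python
from fractions import Fraction

def d(x, y):
    """Assumed partial metric: 2 ** -(length of the common initial segment)."""
    n = 0
    for a, b in zip(x, y):
        if a != b:
            break
        n += 1
    return Fraction(1, 2 ** n)

def in_open_ball(center, y, eps):
    """True if y lies in the open ball around center: d(center, y) < d(center, center) + eps."""
    return d(center, y) < d(center, center) + eps

b = (0, 1)                        # the learner's current representation
concept_prefix = (0, 1, 1, 0, 1)  # initial segment of a hypothetical concept a
eps = d(b, b)                     # with this radius the ball is the set of extensions of b

contains_b = in_open_ball(b, b, eps)
contains_concept = in_open_ball(b, concept_prefix, eps)
contains_detour = in_open_ball(b, (0, 0, 1), eps)  # departs from b at the second entry
```

With this radius the ball consists of exactly those representations that extend b. It therefore contains the concept whenever b is a correct initial segment of it, and it excludes representations that abandon what the learner already has right.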

From a philosophy of education perspective, this view may deepen our understanding of ZPD by explicating the relationship between representations and concepts in a learning process beyond a common Venn diagram. From a philosophy of mathematics perspective, we gain insight into how knowledge of mathematical concepts may arise in a process that is not necessarily linked to an empirical grounding of mathematics. Thus philosophical aspects of mathematics and of education meet in a common arena, which is beneficial for both parties and worth further exploration.

Notes

1. For further discussion, see Fine (2014).
2. Cf. Kripke (1980) on rigid designators.
3. In fact there are a number of technical details that must be settled in order to get the precise concept of truth that we want. Which such concept we do want may in turn depend on the context we are interested in. Such details, however, are of no concern here.
4. The observant reader notices that we here allow ourselves to hastily pass over the very point where the Tarski notion of truth is essential. Formally one must introduce an evaluation function (or a similar device) which assigns objects to variables, and define truth via satisfaction using these evaluations.
5. Actually we may restrict all possible worlds here to all worlds which are possible relative to w, assuming a possibility relation between worlds. Different possibility relations give rise to different logics; as an easy example, a transitive possibility relation will induce all formulas of the form $\square \phi \rightarrow \square \square \phi$ to be true in all possible worlds. Thus in such a context all necessary truths are themselves true by necessity.
6. Underlined expressions denote abstract counterparts to non-underlined expressions. Thus E may denote the concept of being even (for Chris), and may, with an appropriate $\psi$, refer to the number Chris thinks is the solution to the equation.
7. Actually, the predicate even may also be read in either a de re or a de dicto sense, giving a number of possible interpretations.
8. From now on, we leave out such comments.
9. Mersenne numbers are the numbers of the form $2^n - 1$, for natural numbers n. Examples of Mersenne numbers are, thus, 3, 7, 15, and 31.
10. Here we are at the brink of entering the realms of essentialism, which is not a topic of this paper. But we do want to note, at least, that learning (or knowing) a concept (trivially) doesn’t involve learning (or knowing) all properties of that concept. But exactly which properties are to be considered essential to learn (or know) depends on context; in arithmetic being The Number of the Beast is not essential to 666, while it may very well be so when discussing War and Peace.
11. This description of a learning process does not necessarily presume that learning has to be seen as conceptual change in either the cognitive research tradition of Carey (1985) or the education research tradition of e.g. Posner et al. (1982). We do, however, see an obvious possibility of interpreting our (formal) learning processes in terms of concept change within such (constructivist) traditions.
12. Prototype theory was introduced by Rosch (1973) within a cognitive science framework. See also (Rosch 1999) and (Rosch and Mervis 1975).
13. Metric spaces were introduced by Fréchet (1906) and are today a standard ingredient in graduate mathematics.
14. As an example we may take arithmetic to be the field of interest, and let the e-properties be all properties describable in the standard language of arithmetic taken in some alphabetic order.
15. Good sources of information concerning partial metric spaces are (Bukatin et al. 2009) and (Han et al. 2017).
16. This is a version for partial metric spaces of the Banach Fixed Point Theorem for metric spaces.
17. Such a pre-knowledge ordering is described and used as a base for the national diagnostic material Diamant in Sweden. Further information on this may be found (so far) only in Swedish in (Löwing 2016).
18. It would go too far to recapitulate this debate here. The interested reader may consult Jenkins (2008), Roland (2009) and Sjögren and Bennet (2014).
19. Recently a number of suggestions have been made to view (versions of) constructivism as common ground for mathematics and (mathematics) education. As shown in the present debate there are, however, a number of problems here, and it is questionable if this road actually leads anywhere. For references see Sjögren and Bennet (2013).

References

Aristotle (1963) Aristotle in twenty-three volumes. Heinemann, Harvard University Press, London
Bukatin M, Kopperman R, Mathews S, Pajoohesh H (2009) Partial metric spaces. Am Math Mon 116:708–718
Carey S (1985) Conceptual change in childhood. MIT, Cambridge
Daro P, Mosher FA, Corcoran T (2011) Learning trajectories in mathematics: a foundation for standards, curriculum, assessment and instruction. CPRE Research Report # RR-6
Fine G (2014) The possibility of inquiry: Meno’s paradox from Socrates to Sextus. Oxford University Press, Oxford
Fitting M, Mendelsohn R (1998) First order modal logic. Kluwer, Dordrecht
Fréchet M (1906) Sur quelques points du calcul fonctionnel. Rend Circ Mat Palermo 22:1–74
Ganter B, Wille R (1998) Formal concept analysis: mathematical foundations. Springer, Berlin
Gredler GE, Clayton-Shields C (2008) Vygotsky’s legacy. A foundation for research and practice. The Guilford Press, New York
Han S, Wu J, Zhang D (2017) Properties and principles on partial metric spaces. Topol Appl 230:77–98
Jenkins C (2008) Grounding concepts: an empirical basis for arithmetical knowledge. Oxford University Press, Oxford
Kripke S (1980) Naming and necessity. Harvard University Press, Cambridge. Also in Davidson D, Harman G (eds) Semantics of natural language. Reidel, Dordrecht
Lewis CI, Langford CH (1959) Symbolic logic, 2nd edn. Dover, New York. (First published 1932, Century, London)
Löwing M (2016) Diamant - diagnoser i matematik: ett kartläggningsmaterial baserat på didaktisk ämnesanalys. Acta universitatis Gothoburgensis, Gothenburg
Mathews S (1995) An extensional treatment of lazy data flow deadlock. Theor Comput Sci 151:195–205
Plato (1997) Meno (G.M.A. Grube, Trans.). In: Cooper JM, Hutchinson DS (eds) Plato: complete works. Hackett, Indianapolis, pp 870–897. (Original work published ca. 380 B.C.)
Posner GJ, Strike KA, Hewson PW, Gertzog WA (1982) Accommodation of a scientific conception: toward a theory of conceptual change. Sci Educ 66(2):211–227
Rosch EH (1973) Natural categories. Cognit Psychol 4:328–350
Rosch E (1999) Reclaiming concepts. J Conscious Stud 6:61–78
Rosch E, Mervis CB (1975) Family resemblances: studies in the internal structure of categories. Cognit Psychol 7:573–605
Roland JW (2009) Concept grounding and knowledge of set theory. Philosophia 38(1):179–193
Sjögren J, Bennet C (2013) The viability of social constructivism as a philosophy of mathematics. Croat J Philos 13(3):341–355
Sjögren J, Bennet C (2014) Concept formation and concept grounding. Philosophia 42(3):827–839
Vygotsky LS (1962) Thought and language. MIT Press, Cambridge
Vygotsky LS (1978) Mind in society: the development of higher psychological processes. Harvard University Press, Cambridge
Zalta EN (1988) Intensional logic and the metaphysics of intentionality. MIT Press/Bradford Books, Cambridge
Zalta EN (2001) Fregean senses, modes of presentation, and concepts. Philos Perspect 15:333–359

Author information

Authors and Affiliations

Department of Pedagogical Curricular and Professional Studies, University of Gothenburg, Box 100, 405 30, Gothenburg, Sweden
Christian Bennet

Corresponding author

Correspondence to Christian Bennet.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Bennet, C. The Logic of Learning. Axiomathes 29, 173–187 (2019). https://doi.org/10.1007/s10516-018-9394-2

Received: 28 February 2018
Accepted: 10 August 2018
Published: 17 August 2018
Issue Date: 15 April 2019
DOI: https://doi.org/10.1007/s10516-018-9394-2
