Rationale for Matrix Multiplication in Linear Algebra Textbooks
Abstract
Although matrix multiplication is simple enough to perform, there is reason to believe that it presents conceptual challenges for undergraduate students. For example, it differs from the forms of multiplication with which Linear Algebra students have experience because it is not commutative and does not involve scaling one quantity by another. Rather, matrix multiplication is a multiplication in the sense of abstract algebra: it is associative and distributes over matrix addition. Exposure to abstract algebra’s general treatment of multiplication, however, usually occurs after students have taken Linear Algebra. This raises the following question: How is matrix multiplication being presented in introductory Linear Algebra courses? In response, we analyzed the rationale provided for matrix multiplication in 24 introductory Linear Algebra textbooks. We found the ways in which matrix multiplication was explained and justified to be quite varied. In particular, two commonly employed rationalizations are somewhat contradictory, with one approach (isomorphization) suggesting that matrix multiplication can be understood from an early stage, while another (postponement) suggests that it can only be understood upon consideration of more advanced concepts. We also coordinate these findings with the literature on student thinking in Linear Algebra.
Keywords
Linear algebra, Matrix multiplication, Textbook analysis
1 Introduction
Matrix multiplication is foundational to many of the core concepts in introductory Linear Algebra. Indeed, results concerning systems of linear equations, span, linear (in)dependence, and linear transformations can all be (and often are) formulated in terms of matrix multiplication. This means that the conceptual coherence of students’ understanding of matrix multiplication has broad implications for their potential understanding of much of the Linear Algebra curriculum. Harel (1997), in his commentary on the undergraduate Linear Algebra curriculum, argued that “understanding must imply knowing why, not just how” (p. 111, emphasis ours). Matrix multiplication presents some conceptual challenges for students in this regard. For example, matrix multiplication is fundamentally different from the familiar multiplication of integers because it does not ‘multiply’ in the literal sense; moreover, it is generally not commutative. Instead, matrix multiplication is a ‘multiplication’ in the sense of abstract algebra: it distributes over (matrix) addition and is associative. This is an unfamiliar and nonintuitive idea for students who have had no prior exposure to abstract algebra. These conceptual peculiarities, along with the importance of matrix multiplication, bring to the fore the question of what approaches are available for motivating and explaining matrix multiplication to students. In other words, what rationale for matrix multiplication can be provided to students to help them overcome the challenges associated with this unfamiliar and nonintuitive operation?
One might ask why matrix equality and matrix addition are defined in such a natural way, while matrix multiplication appears to be much more complicated. Only a thorough understanding of the composition of functions and the relationship that exists between matrices and what are called linear transformations would show that the definition of multiplication given previously is a natural one. These topics are covered later in the book. (Kolman & Hill, 2007, p. 24)
Kolman and Hill’s (2007) connection to linear transformations is made 5 chapters after the above quote. Lay, Lay, and McDonald (2015), on the other hand, first introduced matrix multiplication in terms of the matrix-vector product Ax, which they defined as a linear combination of the columns of A. The authors justify the need for, and importance of, this definition in the following way: “a system of linear equations may now be viewed in three different but equivalent ways: as a matrix equation, as a vector equation, or as a system of linear equations. … you are free to choose whichever viewpoint is more natural” (p. 36). The connection to the matrix-matrix product is made several sections later in the same chapter and leverages the composition of linear transformations.
These two textbooks illuminate substantial differences in how matrix multiplication can be introduced, both in terms of the order of presentation and the rationale that is provided. Lay et al. (2015) go to considerable lengths to emphasize the utility of the matrix-vector product in terms of linear combinations of vectors, then use linear transformations to mathematically justify the matrix-matrix product in the first chapter of their text. On the other hand, Kolman and Hill (2007) define the matrix-matrix product outright with a cursory reference to linear transformations, the full treatment of which appears in Chap. 6. Moreover, there are drastic differences in the rationale these authors provided in their initial presentations of matrix multiplication. Kolman and Hill asserted that the ability to view their initial definition as natural depends on a thorough understanding of concepts that occur much later in the text and are presumably unfamiliar to students. In contrast, Lay et al. (2015) explained their initial definition in terms of concepts that occur earlier in the text and are presumably already familiar to students.
The contrast between the above examples provides impetus for the research questions that guided this study: Are other introductory Linear Algebra textbooks just as different? How else might matrix multiplication be motivated, justified, and explained? To these ends, we documented and analyzed the rationale for the definition(s) of matrix multiplication given in 24 introductory Linear Algebra textbooks. Our analysis revealed that there is indeed substantial variation in how matrix multiplication is rationalized, and that these rationalizations (as in the excerpts above) are not always entirely compatible. We conclude with a discussion of the potential pedagogical implications by coordinating our findings with research on student learning of matrix multiplication and, more generally, Linear Algebra.
2 Literature and Theory
Given that we are studying the presentation of matrix multiplication in textbooks, we first review literature pertaining to textbook analyses, including Harel’s (1987) analysis of Linear Algebra textbooks. In particular, we detail Harel’s framework for classifying the means by which textbook authors bridge the gap between students’ existing knowledge and new content, which we operationalize in this study as a means to classify and analyze the rationale that the textbook authors provide for matrix multiplication. Research on student thinking in Linear Algebra appears later in the discussion section, in which we coordinate the results of our investigation with this body of literature.
2.1 Textbook Analysis as an Avenue of Insight into Instruction
Researchers have argued that textbook analysis can provide insight into how particular content is presented in mathematics classrooms (e.g. Reys, Reys, & Chavez, 2004; Robitaille & Travers, 1992). Though there is no guarantee that classroom instruction mirrors textbooks’ content presentation, we note that textbooks “help teachers identify content to be taught [and] instructional strategies” (Thompson, Senk, & Johnson, 2012). Textbook analysis can thus be an efficient (though not comprehensive) means of gathering insight into classroom instruction. Harel (1987), for example, examined the differences in content presentation in Linear Algebra textbooks in order to “examine existing approaches to teaching” (p. 29); other textbook analyses that have been conducted at the undergraduate level include calculus (Weinberg & Weisener, 2011), combinatorics (Lockwood, Reed, & Caughman, 2016), and abstract algebra (Capaldi, 2012). Regarding the specific goals of our study, differences in content presentation are particularly revealing because of their potential influence on how students understand particular concepts (Bierhoff, 1996). Such differences also reflect “how experts in the field … define and frame foundational concepts” (Lockwood et al., 2016, p. 9) and can therefore be useful for identifying key components of understanding these concepts.
2.2 Harel’s Framework for Textbook Analysis
Harel (1987) conducted an analysis of Linear Algebra textbooks and reported differences in several respects; those particularly relevant to this study are differences in the sequencing of content and the justification provided for the introductory content. Our current study is distinct in two ways. First, Harel’s analysis occurred nearly three decades ago (at the time of this writing), a period of time in which impactful attempts at nationwide Linear Algebra curriculum reform—such as the Linear Algebra Curriculum Study Group (Carlson, Johnson, Lay, & Porter, 1993; Harel, 1997)—were made. Presumably, these attempts, the broader changes in mathematics education, and the passage of time precipitated a different landscape of Linear Algebra textbooks than those studied by Harel. Second, Harel’s study focused on general trends throughout entire textbooks (what might be called a macroanalysis) and did not focus specifically on the different presentations of matrix multiplication (though he did affirm the status of matrix arithmetic in introductory Linear Algebra texts). Our current study, in contrast, focuses on the presentation of one specific concept (a microanalysis).

Isomorphization involves “[imposing] an isomorphism on two mathematical structures where one of these structures is familiar to the student” (p. 31). In other words, isomorphization introduces students to a new concept in a way that highlights how the new concept preserves the mathematical structure of a familiar concept. Because the current study focuses on introductory textbooks, we note that the term ‘isomorphism’ itself is not likely to be explicitly mentioned. As an example, Harel specified how some textbook authors motivated matrix multiplication by showing how it preserves the composition of linear transformations (we documented several cases of this exact use of isomorphization to justify the matrix-matrix product). Lay et al.’s (2015) rationale for the matrix-vector product (from our introduction) is another example of isomorphization because it highlights the structural equivalencies between linear systems, vector equations, and matrix equations.

Postponement involves remarking on the necessity for and magnitude of a particular concept for which, in the authors’ estimation, such considerations are not yet clear to the student. Uses of postponement include cases for which the authors are depending on future, currently unfamiliar, ideas to temporarily justify the concept at hand. Kolman and Hill’s (2007) justification for the matrix-matrix product (from our introduction) is an example of postponement because they argue that understanding matrix multiplication depends on understanding a future concept (linear transformations).

Analogy is a technique used to demonstrate connections “between new ideas to be learned and familiar ones that are outside the content area of immediate interest” (Harel, 1987, p. 30). There are two types of analogies: mathematical, in which connections are made with a familiar mathematical concept, and real-world, in which connections are drawn between the new mathematical concept and an application to a real-world problem where the concept is relevant. Note that analogy is quite similar to isomorphization. Though we acknowledge that these classifications are certainly not disjoint, we reserved ‘isomorphization’ for instances of literal mathematical isomorphism that emphasize the preservation of mathematical structure; we used analogy for all other comparisons.

Abstraction is a very similar strategy in which students are first introduced to general ideas in specific, familiar, and more concrete, situations. The most common example of abstraction, Harel noted, is when an entire concept is motivated by a small number of examples of that same general concept. For example, demonstrating the utility of and justification for a result related to matrix multiplication by initially focusing on specific cases with 2 × 2 matrices would be a use of abstraction. The distinction between abstraction and analogy is that abstraction invokes a specific example of the general concept to be learned, whereas an analogy involves the comparison of two different (albeit similar) concepts or situations.
3 Methods
3.1 Textbook Selection
We narrowed our focus to introductory Linear Algebra textbooks because, as Harel (1987) reported, introductory textbooks have a strong, foundational emphasis on matrix arithmetic. We also restricted our search to textbooks written in English. To ensure that we did not omit any English-language textbooks currently in widespread use, we examined syllabi available online for introductory Linear Algebra courses at more than 106 Research-1 universities around the United States, conducted online searches of textbook provider websites, and examined the textbooks in our own respective university libraries. Notably, among the 106 Research-1 Linear Algebra syllabi that we examined, the most frequently appearing were Lay et al. (2015) (43), Bretscher (2012) (10), Leon (2014) (8), Strang (6), Kolman and Hill (5), Edwards and Penney (5), and Poole (4) (or previous versions of these textbooks). All other textbooks in our sample appeared 0, 1, or 2 times in this list. The texts appearing 0 times were those that we included in our sample via searches of online textbook provider websites or our own university libraries.
Furthermore, we focused on textbooks published within the past decade (at the time of this writing, since 2006) to obtain a more accurate snapshot of how matrix multiplication is being presented in today’s Linear Algebra classrooms (though we did not exclude a textbook outside this range if we found evidence that it was in widespread use). Additionally, we omitted textbooks for which Linear Algebra was not the sole focus, such as those designed for courses in Linear Algebra and differential equations. Due to their propensity for introducing topics in very similar (if not identical) ways, we included at most one textbook from each author in our sample. Similarly, whenever possible, we examined the most recently published editions of textbooks, omitting all other releases. In some cases, however, we were not able to obtain access to the most recent edition and thus opted for the most recent edition available (e.g. Andrilli & Hecker, 2010). Overall, our sample included 24 introductory Linear Algebra textbooks. A complete list of the textbooks in our sample can be found at the beginning of Sect. 4, and their corresponding bibliographic information can be found in the “Bibliography of Textbooks” section following the references.
3.2 Procedure for Data Collection and Analysis
In order to explore and contextualize the entirety of the rationale that textbooks presented for matrix multiplication, we decided that it was necessary to document the various forms of matrix multiplication in each text along with how they were sequenced before identifying, recording, and analyzing the associated rationale and justification. The data collection process began with using the table of contents and the index to identify the places where matrix multiplication appeared in each textbook, and then photocopying (or printing out) all pages in any section of a textbook in which matrix multiplication was defined, exemplified, or discussed. To ensure that our sample included all relevant presentations and discussions of matrix multiplication, this process was independently repeated by another researcher contributing to this project.
Next, we documented all forms of matrix multiplication featured in each textbook, the order in which they appeared, and any rationale the authors provided for the given forms. Our operational definition of rationale was broadly interpreted as any explicit attempt by the textbook author(s) to mathematically or pedagogically explain, justify, or demonstrate the purpose(s) or derivation of matrix multiplication. Each instance was classified using Harel’s (1987) framework (isomorphization, postponement, analogy, abstraction); additional categories were created as necessary for rationale that did not conform to these four classifications (we adapted Harel’s framework to include one additional category, detailed below: computational efficiency). We note that use of one strategy did not preclude use of another: as matrix multiplication is such a rich and connected concept, we allowed for the possibility (or, perhaps, probability) that textbooks would employ multiple strategies to communicate their rationale for this important concept. We also must acknowledge that, because we did not examine every page of the textbooks in our study, we cannot discount the possibility that we inadvertently omitted particular subtleties related to the sequencing of and rationale for matrix multiplication in certain textbooks. As such, the documentation in this paper should be regarded only as affirmation that these types of rationale do indeed appear in the textbooks in which they are cited and referenced. The absence of attribution of a type of rationale to a particular textbook does not necessarily imply that the textbook in question does not employ that type of rationale. For example, we will often use parenthetical citations to provide examples of textbooks employing the rationale in question; a citation like (e.g. Bretscher, 2012; Shifrin & Adams, 2010) means only that we documented such rationale in Bretscher’s and Shifrin and Adams’s respective textbooks, not that these were the only texts in which this form of rationale appeared.
We used constant comparison (Creswell, 2007, 2008) of textbook materials to identify common themes across the data set, including common sequences for the forms of matrix multiplication and commonalities in rationale both within and across sequences. The final stage of our analysis—appearing in Sect. 5—was to triangulate the rationale that textbooks provided for matrix multiplication with the relevant literature on teaching and learning of Linear Algebra.
4 Results

Ax as LCC: The matrix-vector product Ax as a linear combination of the columns of A: \(Ax = x_{1} \text{col}_{1} \left( A \right) + \cdots + x_{n} \text{col}_{n} \left( A \right)\).

Ax as DP: The matrix-vector product Ax as a dot product of the rows of A with the column vector x: \(Ax = \left[ {\begin{array}{c} {\text{row}_{1} \left( A \right) \cdot x} \\ \vdots \\ {\text{row}_{m} \left( A \right) \cdot x} \\ \end{array} } \right].\)

AB as DP: The matrix-matrix product AB determined by dot products of row/column vectors: AB is the matrix in which the entry in row \(i\), column \(j\) (where \(1 \le i \le m,1 \le j \le p\)) is given by: \(\text{row}_{i} \left( A \right) \cdot \text{col}_{j} \left( B \right)\).

AB as [Acol(B)]: The matrix-matrix product^{2} AB as a matrix whose columns are determined by the action of A on the columns of B: \(AB = \left[ {A\,\text{col}_{1} \left( B \right) \mid \cdots \mid A\,\text{col}_{p} \left( B \right)} \right]\).
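
The four forms above can be compared side by side in a short sketch. The following is an illustration only, written in plain Python under our own assumptions: matrices are lists of rows, vectors are flat lists, and the function names (`ax_as_lcc`, `ax_as_dp`, `ab_as_dp`, `ab_as_acol`) are hypothetical labels chosen to mirror the abbreviations used here, not notation from any of the textbooks.

```python
# Illustrative sketch: matrices are lists of rows, vectors are flat lists.

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def col(M, j):
    """Column j of matrix M as a flat list."""
    return [row[j] for row in M]

def ax_as_lcc(A, x):
    """Ax as a linear combination of the columns of A."""
    result = [0] * len(A)
    for j, xj in enumerate(x):
        result = [r + xj * c for r, c in zip(result, col(A, j))]
    return result

def ax_as_dp(A, x):
    """Ax as dot products of the rows of A with x."""
    return [dot(row, x) for row in A]

def ab_as_dp(A, B):
    """AB with entry (i, j) equal to row_i(A) . col_j(B)."""
    p = len(B[0])
    return [[dot(row, col(B, j)) for j in range(p)] for row in A]

def ab_as_acol(A, B):
    """AB whose j-th column is A acting on the j-th column of B."""
    p = len(B[0])
    columns = [ax_as_lcc(A, col(B, j)) for j in range(p)]
    # Turn the list of columns back into a list of rows.
    return [list(r) for r in zip(*columns)]

A = [[1, 2], [3, 4], [5, 6]]    # 3 x 2
B = [[7, 8, 9], [10, 11, 12]]   # 2 x 3
x = [1, -1]

print(ax_as_lcc(A, x), ax_as_dp(A, x))        # both [-1, -1, -1]
print(ab_as_dp(A, B) == ab_as_acol(A, B))     # True
```

Each pair of functions computes the same object by a different route, which is the sense in which the forms are alternative presentations of one operation rather than different operations.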

Sequence 1: initiating with Ax as LCC (Ax as a linear combination of the columns of A);

Sequence 2: initiating with Ax as DP (Ax as a dot product of the rows of A with the column vector x); and

Sequence 3: initiating with AB as DP (AB as a dot product of the rows of A with the columns of B).
Textbooks in which Ax as LCC is the initial form of matrix multiplication
| Sequence | First | Second | Third | Fourth | Textbooks exhibiting this sequence |
|---|---|---|---|---|---|
| Sequence 1a | Ax as LCC | Ax as DP | AB as [Acol(B)] | AB as DP | Cheney and Kincaid (2012); Lay et al. (2015); Nicholson (2013); Solomon (2014); Spence, Insel, and Friedberg (2007); Strang (2009) |
| Sequence 1b | Ax as LCC | Ax as DP | AB as DP | AB as [Acol(B)] | Ricardo (2009) |
| Sequence 1c | Ax as LCC | AB as [Acol(B)] | AB as DP | – | Beezer (2015); Holt (2012) |
Textbooks in which Ax as DP is the initial form of matrix multiplication
| Sequence | First | Second | Third | Fourth | Textbooks exhibiting this sequence |
|---|---|---|---|---|---|
| Sequence 2a | Ax as DP | Ax as LCC | AB as [Acol(B)] | AB as DP | Bretscher (2012) |
| Sequence 2b | Ax as DP | Ax as LCC | AB as DP | AB as [Acol(B)] | Hefferon (2008); Leon (2014)^{a} |
| Sequence 2c | Ax as DP | AB as DP | AB as [Acol(B)] | – | Shifrin and Adams (2010) |
| Sequence 2d | Ax as DP | AB as DP | Ax as LCC | AB as [Acol(B)] | DeFranza and Gagliardi (2015) |
Textbooks in which AB as DP is the initial form of matrix multiplication
| Sequence | First | Second | Third | Fourth | Textbooks exhibiting this sequence |
|---|---|---|---|---|---|
| Sequence 3a | AB as DP | Ax as DP | Ax as LCC | – | Larson (2016) |
| Sequence 3b | AB as DP | Ax as DP | Ax as LCC | AB as [Acol(B)] | Edwards and Penney (1988); Kolman and Hill (2007); Poole (2014) |
| Sequence 3c | AB as DP | Ax as DP | – | – | Robinson (1991); Venit, Bishop, and Brown (2013) |
| Sequence 3d | AB as DP | Ax as LCC | Ax as DP | – | Anthony and Harvey (2012) |
| Sequence 3e | AB as DP | AB as [Acol(B)] | Ax as DP | Ax as LCC | Anton and Rorres (2014); Williams (2012) |
| Sequence 3f | AB as DP | AB as [Acol(B)] | Ax as LCC | Ax as DP | Andrilli and Hecker (2010) |
The 5 textbooks following Sequence 2 (initiating with Ax as DP) exhibited considerably more variation.
The textbooks in Sequence 3 were also quite varied.
We now shift to documenting and analyzing the rationale—both mathematical and pedagogical—that these textbooks offered for matrix multiplication and the way in which they presented it. We found examples of all four of Harel’s (1987) classifications of rationale: isomorphization, postponement, analogy (both mathematical and realworld), and abstraction. We also documented examples in which a form of matrix multiplication was introduced for the purposes of computational efficiency, a classification occurring frequently enough amongst our sample to warrant adapting Harel’s framework.
4.1 Isomorphization
We documented two distinct examples of isomorphization: (1) rationalizing the matrix-vector product by identifying the advantages of equivalently reformulating linear systems and/or vector equations as matrix equations, and (2) framing the matrix-matrix product as an operation on matrices that preserves the composition of the corresponding linear transformations. Each of these is explained in detail below.
4.1.1 Reformulating Linear Systems and/or Vector Equations as Matrix Equations
Highlighting equivalencies between the matrix equation Ax = b and both systems of linear equations and vector equations was particularly prominent across each of the textbooks in Sequence 1; it also appeared in textbooks in Sequences 2 (e.g. Bretscher, 2012; Leon, 2014; Shifrin & Adams, 2010) and 3 (e.g. Edwards & Penney, 1988; Larson, 2016; Poole, 2014). It is important to note that the textbooks in Sequences 1 and 2 typically invoked isomorphization to accompany their initial definition of matrix multiplication, whereas the textbooks in Sequence 3 invoked isomorphization for forms of matrix multiplication presented after their initial treatment.
Indeed, a matrix equation preserves the algebraic structure of its corresponding linear system and vector equation, allowing these authors to link the matrix-vector product (in the form of a matrix equation) with a familiar idea (systems of linear equations). The following excerpt (adapted from Lay et al., 2015, p. 36) typifies this approach:
Equation (3) has the form A x = b. Such an equation is called a matrix equation, to distinguish it from the vector equation such as is shown in (2).

“By now we are comfortable with translating back and forth between vector equations and linear systems. … Ax = b is a compact form of the vector equation \(x_{1} a_{1} + x_{2} a_{2} = b\), which in turn is equivalent to [a] linear system” (Holt, 2012, p. 63).

“We can use these new concepts to understand a system of equations Ax = b. If A and b are given, such a system challenges us to determine whether b is in the span of the columns of A and, if so, to find the coefficients needed to express b as a linear combination of the columns of A” (Cheney & Kincaid, 2012, p. 42).

“For a linear equation with \(n\) unknowns of the form \(a_{1} x_{1} + a_{2} x_{2} + \cdots + a_{n} x_{n} = b\) if we let \(A = \left[ {a_{1} a_{2} \ldots a_{n} } \right]\) and \(x = \left[ {\begin{array}{*{20}c} {x_{1} } \\ {x_{2} } \\ \vdots \\ {x_{n} } \\ \end{array} } \right]\) and define the product Ax by \(Ax = a_{1} x_{1} + a_{2} x_{2} + \cdots + a_{n} x_{n}\) then the system can be written in the form Ax = b” (Leon, 2014, p. 31).

“We reiterate that a solution x of the system of equations Ax = b is a vector having the requisite dot products with the row vectors \(A_{i}\)” (Shifrin & Adams, 2010, p. 39).

“The initial purpose of matrix multiplication is to simplify the notation for systems of linear equations” (Edwards & Penney, 1988, p. 35).

“Then the matrix equation \(AX = B\) is equivalent to the linear system … Here is further evidence that we got the definition of the matrix product right” (Robinson, 1991, p. 10).
Many of these texts also argued that multiple representations afford flexibility with respect to selecting a problem-solving approach. For example, Nicholson (2013) stated that “a change in perspective is useful because one approach or the other may be better in a particular situation … there is a choice” (p. 45). Strang (2009), noting that a linear systems (rows) approach is easy to visualize for a \(2 \times 2\) case but exceptionally difficult (if not impossible) to visualize for higher dimensions, stated that his “own preference is to combine column vectors. It is a lot easier to see a combination of four column vectors in four-dimensional space, than to visualize how four hyperplanes might possibly meet at a point. (Even one hyperplane is hard enough …)” (p. 33, emphasis in original). Interestingly, these arguments align with the arguments in the literature about the importance of being able to move flexibly between multiple representations in Linear Algebra (e.g. Dorier, 2000; Harel, 1997; Larson & Zandieh, 2013), which we discuss further in Sect. 5.
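
To make the equivalence of viewpoints concrete, the following sketch checks a candidate solution of one small system in two ways: row by row (the linear-system view) and as a combination of columns (the vector-equation view). The system, the candidate vectors, and the function names are our own illustrative choices, not examples drawn from the textbooks cited; the matrix equation Ax = b is simply the compact name for either check.

```python
# Illustrative system (our own example):
#   1*x1 + 2*x2 = 5
#   3*x1 + 4*x2 = 11
A = [[1, 2], [3, 4]]
b = [5, 11]

def satisfies_linear_system(A, x, b):
    """Row view: every equation (row of A dotted with x) matches its b_i."""
    return all(
        sum(a * xi for a, xi in zip(row, x)) == bi
        for row, bi in zip(A, b)
    )

def satisfies_vector_equation(A, x, b):
    """Column view: x_1 * col_1(A) + ... + x_n * col_n(A) equals b."""
    columns = [list(c) for c in zip(*A)]
    combination = [0] * len(b)
    for xj, cj in zip(x, columns):
        combination = [s + xj * c for s, c in zip(combination, cj)]
    return combination == b

solution = [1, 2]
non_solution = [0, 0]

print(satisfies_linear_system(A, solution, b))        # True
print(satisfies_vector_equation(A, solution, b))      # True
print(satisfies_linear_system(A, non_solution, b))    # False
print(satisfies_vector_equation(A, non_solution, b))  # False
```

The two checks always agree, so a solver is free to pick whichever view suits the problem, which is the flexibility the quoted authors emphasize.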
4.1.2 Framing the Matrix-Matrix Product in Terms of Preserving the Composition of Linear Transformations
The second example of isomorphization involved framing the matrix-matrix product in terms of preserving the composition of the corresponding linear transformations. Textbooks in Sequences 1 (e.g. Cheney & Kincaid, 2012; Holt, 2012) and 2 (e.g. Bretscher, 2012; Hefferon, 2008) invoked this rationale. Textbooks in Sequence 3, on the other hand, typically treated linear transformations as tangential, rather than interrelated, at this early stage, opting to delay more formal treatments until later in the text.

“When a matrix B multiplies a vector x, it transforms x into the vector Bx. If this vector is then multiplied in turn by a matrix A, the resulting vector is A(Bx). … Thus A(Bx) is produced from x by a composition of mappings—the linear transformations studied [previously]. Our goal is to represent this composite mapping as multiplication by a single matrix, denoted by AB, so that A(Bx) = (AB)x” (Lay, Lay, and McDonald, 2015, p. 96).

“Because x was an arbitrary vector in \(R^{n}\), this shows that \(T_{A}\;\circ \;T_{B}\) is the matrix transformation induced by the matrix \(\left[ {Ab_{1} ,Ab_{2} , \ldots ,Ab_{k} } \right]\). This motivates the following definition” (Nicholson, 2013, p. 57).

“The definition of matrix multiplication was framed precisely to make this equation valid” (Cheney & Kincaid, 2012, p. 152).

“The matrix of the linear transformation \(T\left( x \right) = B\left( {Ax} \right)\) is called the product of the matrices \(B\) and \(A\), written as \(BA\)” (Bretscher, 2012, p. 77, emphasis in original).

“The matrix representing \(g \; \circ \; h\) has the rows of \(G\) combined with the columns of \(H\)” (Hefferon, 2008, p. 226).
It is worth noting that, because matrix multiplication appeared early in most of these textbooks, such an approach necessitated that linear transformations also be treated early. Another approach centered on the similar task of finding one matrix \(C\) such that \(A\left( Bx \right) = C\left( {x} \right)\) but without formally treating linear transformations first (e.g. DeFranza & Gagliardi, 2015; Shifrin & Adams, 2010; Strang, 2009). This alternative approach either avoided invoking linear transformations or made minimal references to them in passing (often with a note that a full treatment of linear transformations would follow in a subsequent chapter/section). For example, DeFranza and Gagliardi (2015) motivated matrix multiplication in a way that strongly suggested the relevance of linear transformations, asking “is there a single matrix which can then be used to transform the original vector \(\left[ {\begin{array}{*{20}c} 1 \\ 3 \\ \end{array} } \right]\) to \(\left[ {\begin{array}{*{20}c} 4 \\ 1 \\ \end{array} } \right]\)?” (p. 30). Shortly thereafter, they remarked that “the notion of matrices as transformations is taken up again in Chap. 4” (p. 31). Several other textbooks—particularly those in Sequence 3—also opted for this tangential reference to the importance of linear transformations, presumably to provide some insight into the mathematical structure of matrices and linear transformations without committing to a formal treatment so early in the text (e.g. Anton & Rorres, 2014; Shifrin & Adams, 2010; Williams, 2012). The methods that such textbooks employed for justifying the associative law—which the above textbooks achieved by leveraging the associativity of composing linear transformations—utilized different strategies, including analogy and abstraction (and are thus detailed in subsequent sections).
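
The requirement that a single matrix reproduce the composite mapping, A(Bx) = (AB)x, can be checked numerically in a few lines. This is a minimal sketch under our own assumptions: the matrices, the test vectors, and the helper names `matvec` and `matmul_by_columns` are hypothetical, and checking a handful of vectors illustrates the identity rather than proving it.

```python
def matvec(M, x):
    """Return Mx; each entry is a row of M dotted with x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def matmul_by_columns(A, B):
    """Return AB, whose j-th column is A applied to the j-th column of B."""
    columns_of_B = [list(c) for c in zip(*B)]
    columns_of_AB = [matvec(A, c) for c in columns_of_B]
    # Turn the list of columns back into a list of rows.
    return [list(r) for r in zip(*columns_of_AB)]

A = [[1, 2], [0, 1]]
B = [[0, 1], [1, 0]]
AB = matmul_by_columns(A, B)

test_vectors = [[3, 4], [1, 0], [0, 1], [-2, 5]]
composition_preserved = all(
    matvec(A, matvec(B, x)) == matvec(AB, x) for x in test_vectors
)

print(AB)                     # [[2, 1], [1, 0]]
print(composition_preserved)  # True
```

Applying B and then A gives the same result as applying the single matrix AB, which is the structural property that the isomorphization rationale places at the center of the definition.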
4.2 Postponement

“[This definition] is frequently the most useful for its connections with deeper ideas like the null space and the upcoming column space” (Beezer, 2015, p. 182).

“We now make an observation that will be crucial in our future work: the matrix product Ax can also be written as [a linear combination of the columns of A]” (Shifrin & Adams, 2010, p. 53).

“Note that the product Ax is the linear combination of the columns of A with the components of \(\vec{x}\) as the coefficients … Take a good look at this equation, because it is the most frequently used formula in this text. Particularly in theoretical work, it will often be useful” (Bretscher, 2012, p. 31).
Note that many of these textbooks also employed isomorphization, similar to what we discussed in the previous section, in order to motivate linear combinations; thus, these texts are attempting to justify the present importance of linear combinations and vector equations (as an alternative viewpoint on linear systems) while also emphasizing their future importance.

“Since matrices are added by adding corresponding entries and subtracted by subtracting corresponding entries, it would seem natural to define multiplication of matrices by multiplying corresponding entries. However, it turns out that such a definition would not be very useful for most problems. Experience has led mathematicians to the following more useful definition of matrix multiplication” (Anton & Rorres, 2014, p. 29).

“The most natural way of multiplying two matrices might seem to be to multiply corresponding elements when the matrices are of the same size, and to say that the product does not exist if they are of different size. However, mathematicians have introduced an alternative rule that is more useful. It involves multiplying the rows of the first matrix times the columns of the second matrix in a systematic manner” (Williams, 2012, p. 71).

“We now define the product of two matrices. From the way the other matrix operations have been defined, you might guess that we obtain the product of two matrices by simply multiplying corresponding entries. The definition of product given below is much more complicated than this but also considerably more useful in applications” (Venit, Bishop, & Brown, 2013, p. 90).
Considering the sequencing of these textbooks, the widespread use of postponement makes a certain amount of sense, as textbooks that have not yet discussed the matrix-vector product or linear transformations when they present the matrix-matrix product have fewer familiar concepts with which to justify their definition. It should be noted, though, that postponement was only used as a temporary (and not permanent) strategy: all of these textbooks eventually connected matrix multiplication to linear transformations.
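One concrete sense in which the row-by-column definition is "more useful" than the entrywise one can be sketched as follows. This is our own illustration, not an argument taken from the quoted textbooks, and the function names and sample values are hypothetical. The row-by-column product AB is the single matrix whose action on x agrees with applying B and then A, whereas the entrywise product generally is not.

```python
def matvec(A, x):
    """Matrix-vector product Ax via dot products with the rows of A."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def entrywise_product(A, B):
    """The 'natural guess': multiply corresponding entries of same-size matrices."""
    return [[a * b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def row_by_column_product(A, B):
    """The standard definition: entry (i, j) is row i of A dotted with column j of B."""
    cols_b = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_b] for row in A]


# Illustrative values (hypothetical).
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
x = [5, 7]

print(matvec(A, matvec(B, x)))                # [17, 41]  A applied to Bx
print(matvec(row_by_column_product(A, B), x)) # [17, 41]  (AB)x agrees
print(matvec(entrywise_product(A, B), x))     # [14, 15]  entrywise product does not
```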
4.3 Analogy
We documented uses of both mathematical and real-world analogies. The mathematical analogies focused on relating aspects of matrix multiplication to familiar arithmetic domains (notably the real numbers and the integers). The real-world analogies involved use of a practical real-world scenario to justify or explain the formula for the matrix-matrix product AB.
4.3.1 Mathematical Analogy
Strang (2009) motivated the matrix-matrix product by expressing the desire for a single matrix \(C\) such that \(A\left( {Bx} \right) = Cx\), which was fairly common. What distinguishes his approach, however, is that he does not formally treat linear transformations until near the end of the textbook (Chap. 6), and thus is unable to use linear transformations to justify the associativity of matrix multiplication. Instead, he used the associativity of integer multiplication as an analogy: “When multiplying \(EAC\), you can do AC first or EA first. This is the point of an “associative law” like \(3 \times \left( {4 \times 5} \right) = \left( {3 \times 4} \right) \times 5\). Multiply 3 times 20, or multiply 12 times 5. Both answers are 60. That law seems so clear that it is hard to imagine it could be false” (p. 58). The other instance of mathematical analogy involved comparison of the matrix equation \(Ax = b\) to the real number equation \(ax = b\). Edwards and Penney (1988), for instance, wrote that “[the matrix equation \(Ax = b\)] is analogous in notation to the single scalar equation \(ax = b\) in a single variable \(x\)” (p. 38). This technique was typically used to justify the importance of the matrix equation \(Ax = b\) (and thus the matrix-vector product) while also setting the stage for the importance of the inverse of a matrix (as a common method for solving both equations involves multiplication on the left by the appropriate inverse element). Accordingly, some textbooks (e.g. Bretscher, 2012) used the scalar equation \(ax = b\) in order to accentuate the importance of inverse matrices.
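The parallel between the two equations can be written out step by step. The following is a sketch under the assumptions that \(a \ne 0\) and that \(A\) is a square, invertible matrix with identity matrix \(I\); it is our own summary rather than a derivation quoted from any one textbook.

```latex
\begin{align*}
ax = b \;&\Longrightarrow\; a^{-1}(ax) = a^{-1}b \;\Longrightarrow\; (a^{-1}a)x = a^{-1}b \;\Longrightarrow\; x = a^{-1}b, \\
Ax = b \;&\Longrightarrow\; A^{-1}(Ax) = A^{-1}b \;\Longrightarrow\; (A^{-1}A)x = A^{-1}b \;\Longrightarrow\; Ix = x = A^{-1}b.
\end{align*}
```

In both lines the regrouping in the middle step relies on associativity. In the matrix case the inverse must be applied on the left of both sides, because matrix multiplication is generally not commutative.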
4.3.2 Real-World Analogy
We found examples of textbooks leveraging real-world scenarios to justify the formula for the matrix-matrix product AB across Sequence 1 (e.g. Ricardo, 2009), Sequence 2 (e.g. Bretscher, 2012), and Sequence 3 (e.g. Andrilli & Hecker, 2010; Larson, 2016; Williams, 2012). The scenario in Ricardo’s (2009) presentation was typical: the textbook described two hypothetical universities, Alpha College and Beta University, that plan to purchase the same computer equipment (in different quantities). Information about the quantity and price of the equipment is provided in the following tables (adapted from Ricardo, 2009, p. 182):
Ricardo followed this scenario with a remark on the matrix-matrix product, noting that “we can generalize this row-by-column operation in a natural way” (p. 183).
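A scenario of this general kind can be sketched in a few lines of Python. All items, quantities, prices, and the second vendor below are invented for illustration and are not Ricardo's figures; the function name is likewise our own. The sketch shows how total costs arise as a row-by-column product of a quantity table and a price table.

```python
# Hypothetical figures, invented for illustration (not taken from Ricardo, 2009).
items = ["desktop", "laptop", "printer"]

# Rows: institutions; columns: quantity of each item, in the order of `items`.
quantities = [
    [10, 20, 5],   # Alpha College
    [30, 15, 8],   # Beta University
]

# Rows: items, in the order of `items`; columns: unit price from two hypothetical vendors.
prices = [
    [800, 750],    # desktop
    [1000, 1100],  # laptop
    [200, 150],    # printer
]


def row_by_column_product(A, B):
    """Entry (i, j) is row i of A dotted with column j of B."""
    cols_b = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_b] for row in A]


# Rows: institutions; columns: total cost if everything is bought from that vendor.
total_cost = row_by_column_product(quantities, prices)
print(total_cost)  # [[29000, 30250], [40600, 40200]]
```

The product is meaningful only because the columns of the quantity table and the rows of the price table are indexed by the same items in the same order.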
As with the use of postponement, the use of real-world analogy seems well-situated for Sequence 3 because the sequencing of these textbooks afforded few mathematical footholds to introduce the matrix-matrix product. The textbooks following the other sequences, on the other hand, were able to leverage the matrix-vector product en route to developing their characterizations of the matrix-matrix product, lessening the need for real-world comparisons for justification.
4.4 Abstraction
We documented two distinct ways in which textbooks explicitly employed abstraction, a classification that we reserved for cases in which textbooks made direct comments about the relationship between a specific example and its associated general concept, representation, or formula. First, DeFranza and Gagliardi (2015), in the absence of a formal treatment of linear transformations, used an argument with \(2 \times 2\) matrices to justify that matrix multiplication is associative (i.e. that \(A\left( {Bx} \right) = \left( {AB} \right)x\)). This is a use of abstraction because it introduces students to and justifies associativity in a particular situation in order to justify the associativity of general matrix multiplication. Second, Shifrin and Adams’s (2010) presentation focused explicitly on the matrix-vector product as a special case of the matrix-matrix product, emphasizing that the matrix-matrix product “is a generalization of multiplication of matrices by vectors” (p. ix). In a similar way, Andrilli and Hecker (2010) characterized the matrix-vector product as “a generalization of the dot product of vectors” (p. 59). Though the focus of abstraction is different in each case (the matrix-vector product and the vector dot product, respectively), both of these examples frame the matrix-matrix product in terms of versions of matrix multiplication that are more concrete and familiar. One possible reason for the lack of documented cases of abstraction is that, as we noted in the introduction, matrix multiplication is a notion for which students have little experiential basis, and thus the capacity for connecting general concepts to their more familiar, concrete instantiations is limited.
4.5 Computational Efficiency

“We can define AB using dot products and ‘fast’ matrix-vector multiplication … the column-wise description above is usually the best way to understand matrix multiplication, but the dot-product formula gives a convenient way to compute matrix products” (Solomon, 2014, p. 1.10).

“This formula enables us to compute any element in the dot product with one simple dot product” (Cheney & Kinkaid, 2012, p. 192).

“In some applications, we only need a single entry of the matrix product AB” (Holt, 2012, p. 99).

“It is useful to have a formula for the ijth entry of the product …” (Bretscher, 2012, p. 79).
Several textbooks, particularly those in Sequence 3, employed computational efficiency to motivate their introduction of the matrix-matrix product as an action of A on the columns of B (AB as [Acol(B)]) (e.g. Andrilli & Hecker, 2010; Anton & Rorres, 2014; Kolman & Hill, 2007). Anton and Rorres (2014), for instance, stated that this form of matrix multiplication “has many uses, one of which is for finding particular rows or columns of a matrix product AB without computing the entire product” (p. 31). This recasting of certain forms of matrix multiplication purely in terms of their capacity to streamline calculations certainly seems to insinuate (or, in Solomon’s case, explicitly assert) that the dot product methods are less conceptually illuminating for students, further demarcating the contrast in rationale around which we framed this analysis.
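The efficiency claims in these quotations can be made concrete. The following minimal Python sketch uses function names and sample matrices of our own choosing, not taken from any textbook. It computes a single entry of AB with one dot product, and a single column of AB by applying A to one column of B, without forming the full product.

```python
def entry(A, B, i, j):
    """(i, j) entry of AB: the dot product of row i of A with column j of B."""
    return sum(A[i][k] * B[k][j] for k in range(len(B)))


def column(A, B, j):
    """Column j of AB: A applied to column j of B (AB as [A col(B)])."""
    b_col = [row[j] for row in B]
    return [sum(a * b for a, b in zip(row, b_col)) for row in A]


def full_product(A, B):
    """The entire product, assembled entry by entry, for comparison."""
    return [[entry(A, B, i, j) for j in range(len(B[0]))] for i in range(len(A))]


# Illustrative 2x3 and 3x2 matrices (hypothetical values).
A = [[1, 2, 0],
     [0, 1, 3]]
B = [[2, 1],
     [1, 0],
     [4, 5]]

print(entry(A, B, 1, 0))   # 13
print(column(A, B, 1))     # [1, 15]
print(full_product(A, B))  # [[4, 1], [13, 15]]
```

Indices here are zero-based, following Python convention, so `entry(A, B, 1, 0)` is the entry in the second row and first column of AB.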
5 Pedagogical Implications and Future Research
In this section we coordinate the results of our analysis with the literature on student thinking in Linear Algebra in order to hypothesize which approaches might be advantageous (or disadvantageous) for student learning. We acknowledge, however, that we are not positioned to comment definitively on the relative pedagogical effectiveness of any particular content presentation, and thus any hypotheses resulting from this coordination are offered tentatively as avenues for future research.
We used an example of isomorphization (from Lay et al., 2015) and postponement (from Kolman & Hill, 2007) in the introduction to highlight the potential for variation in rationale regarding a central concept like matrix multiplication. As we noted before, there is a subtle tension between these explanations for the initially presented form of matrix multiplication. On one hand, textbooks invoking isomorphization (to vector equations and linear systems) for their initial form of matrix multiplication are attempting to frame matrix multiplication in terms of familiar concepts, procedures, and ideas. On the other hand, textbooks invoking postponement are (oftentimes explicitly) stating that the rationale for matrix multiplication cannot yet be easily understood. Informally, we might characterize these two approaches as “this can be reasonably understood now using familiar ideas” and “this can only be understood later using more advanced ideas.” Our analysis indicates that this tension is indeed reflected amongst a substantial number of other textbooks in our sample as well, particularly between textbooks in Sequence 1 (Ax as LCC) and Sequence 3 (AB as DP). Some of these other textbooks also clearly delineated this tension. For example, Solomon’s (2014) comment that “the column-wise description above is usually the best way to understand matrix multiplication, but the dot-product formula gives a convenient way to compute matrix products” (p. 1.10) seems to suggest that the dot product approach in Sequence 3 is less conceptually enlightening and should be used purely for computation.
Larson and Zandieh (2013) described three interpretations of the matrix equation \(Ax = b\):

viewing b as a linear combination of the columns of A (i.e. Ax as LCC),

viewing the rows of Ax = b as the equations in a linear system (i.e. Ax as DP), and

viewing b as a linear transformation of the vector x.
Larson and Zandieh reiterated the importance of understanding all three viewpoints, citing the Invertible Matrix Theorem (a set of equivalent conditions to determine the invertibility of a matrix) as an example. The textbooks in Sequence 1 (and some in Sequence 2) appear to be well-positioned to emphasize and foster such flexibility right from the initial definition of matrix multiplication. How alternative emphases in the early stages of matrix multiplication might affect students’ understanding of subsequent concepts remains an unexplored question in the literature. We also note that initiating with the matrix-vector product and its different characterizations naturally extends to the matrix-matrix product and can lead to flexible ways of conceptualizing the matrix-matrix product. For example, understanding the matrix-vector product as an isomorphization of a linear system can support understanding the matrix-matrix product as the structure needed to support a change of variables (substitution) in the linear system. Similarly, understanding the matrix-vector product as a linear transformation can support viewing the matrix-matrix product as the composition of linear transformations. In the same way that Harel (1997) and Larson and Zandieh (2013) argue in favor of multiple ways of understanding the matrix-vector product, we propose that it is similarly valuable to understand the matrix-matrix product in different ways. Specific affordances of these ways of understanding could be explored in future research.
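The change-of-variables view can be made explicit in the \(2 \times 2\) case. The following worked substitution is our own sketch, with the symbols \(z\), \(y\), \(a_{ij}\), and \(b_{ij}\) introduced here for illustration. Suppose \(z = Ay\) and \(y = Bx\), written out as linear systems:

```latex
\begin{align*}
z_1 &= a_{11}y_1 + a_{12}y_2, & y_1 &= b_{11}x_1 + b_{12}x_2,\\
z_2 &= a_{21}y_1 + a_{22}y_2, & y_2 &= b_{21}x_1 + b_{22}x_2.
\end{align*}
Substituting the expressions for $y_1$ and $y_2$ and collecting terms in $x_1$ and $x_2$ gives
\begin{align*}
z_1 &= (a_{11}b_{11} + a_{12}b_{21})\,x_1 + (a_{11}b_{12} + a_{12}b_{22})\,x_2,\\
z_2 &= (a_{21}b_{11} + a_{22}b_{21})\,x_1 + (a_{21}b_{12} + a_{22}b_{22})\,x_2.
\end{align*}
```

The four coefficients are exactly the entries of the row-by-column product \(AB\), so \(z = (AB)x\). Under the transformation view, the same computation shows that \(AB\) represents the composition of the transformation given by \(B\) followed by the one given by \(A\).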
Textbooks invoking postponement, particularly those in Sequence 3, typically supplemented their rationale with (1) a real-world example, and/or (2) a brief, tangential reference to linear transformations in order to justify the formula for matrix multiplication. Regarding real-world examples, Harel (1987) argued that such scenarios require the student to “distinguish between relevant and irrelevant features,” which, in turn, might “weaken the anticipated motivational effect” (p. 30). We argue that one particularly relevant feature of such scenarios is potentially underemphasized: the appropriate arrangement of the arrays of information (from which the matrices are constructed). Indeed, most textbooks presenting a real-world scenario conveniently prearranged these arrays so that the formula for the matrix-matrix product was immediately clear. While we acknowledge that such application problems are intended as an introduction (and not as examples that comprehensively embody the structure of matrix multiplication), we question the effects of this convenient prearrangement because it partially sidesteps the potential for students to identify and abstract the interplay between the row and column vectors of the matrices in the product. Future research could explore how students might be able to construct the relevant features of the mathematical structure of matrix multiplication via such application problems.
Much of this discussion has focused on Sequences 1 and 3, largely because they espoused drastically different initial approaches to matrix multiplication (isomorphization vs. postponement, respectively). We found far less consistency amongst the 5 textbooks in Sequence 2. Their rationale for the matrix-matrix product, for example, spanned all categories of rationale (isomorphization, postponement, analogy, abstraction, and computational efficiency). Sequence 2 was, in some sense, a hybrid of textbooks that were similar in approach to the other two sequences. Bretscher (2012) and Leon (2014), for example, seemed to have more in common with the textbooks in Sequence 1 (due to their early emphasis on Ax as a linear combination of the columns of A, even though this was not their initial definition), and Shifrin and Adams (2010) and DeFranza and Gagliardi (2015) seemed to have more in common with Sequence 3 (due to their limited emphasis in the early stages on viewing Ax as a linear combination). Perhaps what Sequence 2 reveals most, then, is that the approaches of Sequences 1 and 3 are not altogether pedagogically incompatible. Indeed, a textbook that first defines the product in terms of dot products could very well afford ample focus to the linear combinations definition (e.g. Bretscher, 2012; Leon, 2014). Thus, significant conclusions drawn from the sequencing of matrix multiplication alone should be regarded cautiously and as in need of additional study. On a more general level, though, the fact that there are apparent contradictions in the two most conspicuous methods of rationale for these respective approaches does highlight the need for research to examine which approach might be more pedagogically effective.
6 Conclusions
This study contributes to the literature in three important ways. First, the primary contribution of this paper is our documentation and analysis of the ways in which introductory, English-language linear algebra textbooks conceptualize and sequence matrix multiplication. This analysis provided very specific information about the four characterizations of matrix multiplication that expert mathematicians believe to be the most important (Ax as LCC, Ax as DP, AB as DP, and AB as [Acol(B)]). Particularly, we noticed that experts value fluency amongst these multiple characterizations of matrix multiplication and often discussed unique insights or abilities that each one offered (for example, Ax as LCC allows one to reformulate the concept of span in terms of linear systems, and Ax as DP enables one to calculate individual entries in a product). The order in which these characterizations appeared often had significant implications for the way in which they were rationalized. The textbook authors collectively employed a wide variety of techniques to rationalize and explain matrix multiplication: each category of rationale in Harel’s (1987) framework (in addition to computational efficiency) appeared in our sample.
Second, we coordinated these findings with the literature on the teaching and learning of Linear Algebra to hypothesize about ways of understanding matrix multiplication that might be advantageous (or disadvantageous) for students to have. In addition to highlighting productive avenues for future research, this information can be used to inform a conceptual analysis, a description of “what students might understand when they know an idea in various ways” (Thompson, 2008, p. 57). Conceptual analyses are particularly important in research on student learning because, as noted by Thompson (2008), their uses include (1) devising ways of understanding a particular concept that might be powerful for students, and (2) characterizing the nature of student struggles with that concept. Conceptual analyses can also be used to design and analyze student thinking in the context of instructional sequences that aim to develop these powerful ways of understanding.
Third, we have adapted Harel’s (1987) framework for analyzing rationale to include a category for computational efficiency. Additionally, though it has been used exclusively to study rationale for Linear Algebra concepts in Linear Algebra textbooks, we suggest that its potential scope is much broader and extends beyond textbook analysis. For example, the different types of rationale in this framework—isomorphization, postponement, abstraction, analogy, and computational efficiency—could also be used to analyze the rationale that an instructor provides for the introduction of a new concept in a lecture setting. In this way, we anticipate that our adaptation of Harel’s framework can be useful for future research.
Footnotes
1. Our descriptions given here are not necessarily identical to those given in each textbook but are instead offered as summaries of these methods that are mathematically equivalent.
2. Methods for multiplying using block/partitioned matrices, if formally addressed in a textbook, typically appeared along with this form of the matrix-matrix product.
References
Bierhoff, H. (1996). Laying the Foundations of Numeracy. A Comparison of Primary School Textbooks in Britain, Germany and Switzerland. Teaching Mathematics and its Applications, 15(4), 141–60.
Carlson, D. (1993). Teaching Linear Algebra: Must the Fog Always Roll In? College Mathematics Journal, 24(1), 29–40.
Carlson, D., Johnson, C. R., Lay, D. C., & Porter, A. D. (1993). The Linear Algebra Curriculum Study Group recommendations for the first course in Linear Algebra. The College Mathematics Journal, 24(1), 41–46.
Capaldi, M. (2012, February). A study of abstract algebra textbooks. In Proceedings of the 15th Annual Conference on Research in Undergraduate Mathematics Education (pp. 364–368).
Creswell, J. W. (2007). Qualitative inquiry and research design: Choosing among five Approaches (2nd Edition). California: Sage Publications.
Creswell, J. W. (2008). Educational research: Planning, conducting, and evaluating quantitative and qualitative research (3rd Edition). Upper Saddle River, NJ: Pearson.
Dorier, J. L. (Ed.). (2000). On the teaching of Linear Algebra (Vol. 23). Springer Science & Business Media.
Harel, G. (1987). Variations in Linear Algebra content presentations. For the Learning of Mathematics, 7(3), 29–32.
Harel, G. (1997). The Linear Algebra curriculum study group recommendations: Moving beyond concept definition. MAA NOTES, 107–126.
Larson, C., & Zandieh, M. (2013). Three interpretations of the matrix equation Ax = b. For the Learning of Mathematics, 33(2), 11–17.
Lockwood, E., Reed, Z., & Caughman, J. S. (2016). An Analysis of Statements of the Multiplication Principle in Combinatorics, Discrete, and Finite Mathematics Textbooks. International Journal of Research in Undergraduate Mathematics Education, 1–36.
Reys, B. J., Reys, R. E., & Chavez, O. (2004). Why Mathematics Textbooks Matter. Educational Leadership, 61(5), 61–66.
Robitaille, D. F., & Travers, K. J. (1992). International studies of achievement in mathematics.
Thompson, P. W. (2008). Conceptual analysis of mathematical ideas: Some spadework at the foundations of mathematics education. In Proceedings of the annual meeting of the International Group for the Psychology of Mathematics Education (Vol. 1, pp. 45–64). PME Morelia, Mexico.
Thompson, D. R., Senk, S. L., & Johnson, G. J. (2012). Opportunities to learn reasoning and proof in high school mathematics textbooks. Journal for Research in Mathematics Education, 43(3), 253–295.
Weinberg, A., & Wiesner, E. (2011). Understanding mathematics textbooks through reader oriented theory. Educational Studies in Mathematics, 76(1), 49–63.
Bibliography of Textbooks
Andrilli, S., & Hecker, D. (2010). Elementary Linear Algebra (4th ed.). Massachusetts: Academic Press.
Anthony, M., & Harvey, M. (2012). Linear Algebra: concepts and methods. Cambridge University Press.
Anton, H., & Rorres, C. (2014). Elementary Linear Algebra: Applications version (11th ed.). Hoboken, NJ: Wiley.
Beezer, R. A. (2015). A first course in Linear Algebra (version 3.50). Retrieved from http://linear.ups.edu/download/fcla3.50tablet.pdf.
Bretscher, O. (2012). Linear Algebra with applications (5th ed.). New Jersey: Pearson.
Cheney, W., & Kinkaid, D. (2012). Linear Algebra: Theory and applications (2nd ed.). Massachusetts: Jones & Bartlett Learning.
DeFranza, J., & Gagliardi, D. (2015). Introduction to Linear Algebra with applications. Illinois: Waveland Press.
Edwards, C. H., & Penney, D. E. (1988). Elementary Linear Algebra: Custom Edition for Arizona State University. Pearson College Div.
Hefferon, J. (2008). Linear Algebra. Available online.
Holt, J. (2012). Linear Algebra with applications. New York: W.H. Freeman.
Kolman, B., & Hill, D. (2007). Introductory Linear Algebra (9th ed.). New Jersey: Pearson.
Larson, R. (2016). Elementary Linear Algebra (8th ed.). Massachusetts: Houghton Mifflin.
Lay, D. C., Lay, S., & McDonald, J. (2015). Linear Algebra and its applications (5th ed.). Pearson.
Leon, S. (2014). Linear Algebra with applications (9th ed.). New Jersey: Pearson.
Nicholson, K. (2013). Linear Algebra with applications (7th ed.). New York: McGraw Hill.
Poole, D. (2014). Linear Algebra: A modern introduction (4th ed.). Massachusetts: Houghton Mifflin.
Ricardo, H. (2009). A modern introduction to Linear Algebra. CRC Press.
Robinson, D. J. S. (1991). A course in Linear Algebra with applications (pp. I–XIII). Singapore: World Scientific.
Shifrin, T., & Adams, M. (2010). Linear Algebra: A geometric approach (2nd ed.). New York: W.H. Freeman.
Solomon, B. (2014). Linear Algebra, geometry and transformation. CRC Press.
Spence, L., Insel, A., & Friedberg, S. (2007). Elementary Linear Algebra: A matrix approach (2nd ed.). New Jersey: Pearson.
Strang, G. (2009). Introduction to Linear Algebra (4th ed.). Massachusetts: Wellesley Cambridge Press.
Venit, S., Bishop, W., & Brown, J. (2013). Elementary Linear Algebra (2nd ed.). Ontario: Nelson Education.
Williams, G. (2012). Linear Algebra with applications (8th ed.). Massachusetts: Jones & Bartlett Learning.