In humans, social learning can take many forms. It can occur within academe, outside academe, as group learning and individual learning. We have evolved as social and learning beings.
It may be stated that the role of evolution (Darwin 1859) has not been widely embraced by researchers in psychology and the social sciences. Increasingly, however, a number of contemporary scholars have suggested that the evolutionary process enables a better understanding of human cognition and socialization (e.g., Geary and Berch 2016; Bjorklund and Ellis 2014; Sweller 2008; Wellman 2014). A discussion of the numerous ideologies of evolutionary psychology and their associations with social theories is beyond the limits of this chapter; therefore, it will concentrate on two major perspectives: Sweller’s cognitive load theory and Geary’s evolutionary cognition theory. They provide evidence from an evolutionary standpoint of how our memory system and cognitive abilities developed, and of the implications for learning. These are not the only ways to look at the domains, but they do provide some compelling evidence.
Cognitive Load Theory
Cognitive load theory (Sweller 2016) is a cognition theory originating in the mid-1970s related to our human memory system structures. Since its foundation, the theory has provided a framework for thousands of empirical studies in educational psychology and instructional design.
The theory is based upon substantial evidence that we acquire and process information via a very small, limited, finite working memory. It has been demonstrated that the recall limit after being shown novel information (e.g., a set of numbers or letters) is seven plus or minus two items (Miller 1956). Later research suggests it is possibly less at four plus or minus one (Cowan 2001). Furthermore, Peterson and Peterson (1959) established that the duration of working memory recall of items is only up to 20 s.
Without a strategy to process the held items, such as grouping them into meaningful blocks or chunks (Miller 1956), the novel information is lost forever. As an example, when faced with the task of recalling a random ten-digit number (such as a new cell phone number), recall fails after around the first seven to nine digits, and even those digits may only be retained for the limited time. Fortunately, working memory does interact with a long-term memory that, as far as we know, has unlimited capacity (Ericsson and Kintsch 1995).
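The chunking strategy can be made concrete with a small sketch. The ten-digit number and the 3-3-4 grouping below are hypothetical choices for illustration and are not taken from Miller (1956); the point is only that grouping reduces the number of items to be held from ten single digits to three chunks.

```python
def chunk_digits(digits, sizes):
    """Split a digit string into consecutive chunks of the given sizes."""
    if sum(sizes) != len(digits):
        raise ValueError("chunk sizes must add up to the number of digits")
    chunks = []
    start = 0
    for size in sizes:
        chunks.append(digits[start:start + size])
        start += size
    return chunks


# Hypothetical ten-digit number (an assumption for illustration only).
number = "4155550123"
chunks = chunk_digits(number, (3, 3, 4))

print(len(number))  # items to hold when every digit is treated separately
print(chunks)       # the same number held as three chunks
```

Held digit by digit, the number exceeds the capacity estimates cited above; held as three chunks, it falls within them.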
Long-term memory is composed of prior knowledge in the form of hierarchically organized networks of schemas (Miller 1956). De Groot’s foundational research with chess masters and chess novices indicated that master chess players do not have a larger memory system than novice or weekend chess players but rather more schemas, in the form of previously remembered chess board configurations (see De Groot 1965 for a full description).
An Update of the Theory Aligned with Evolutionary Psychology
As mentioned above, cognitive load theory is based on the premise that if the capacity and duration limitations of working memory are ignored when instructional design procedures are developed, incomplete learning, or a failure of learning, will result (Sweller 2016). When the theory was developing, it was based on the interactions between working and long-term memory. These interactions are still completely valid. However, an association with the theories of evolution had not been established. Mapping evolutionary cognition theories against what we now know about contemporary cognitive architecture has afforded context, given substantial explanatory power to cognitive load theory, and provided a wider range of hypotheses. Sweller is not unique in his analysis of cognition linked with evolution; it has been comprehensively elaborated on by others such as Geary (2005), Campbell (1960), and Darwin (1859). Critically for this chapter, a later reinterpretation of cognitive load theory by Sweller and Sweller (2006) specifically uses five principles aligned with evolutionary theory. These are the information store principle, the borrowing and reorganizing principle, the randomness as genesis principle, the narrow limits of change principle, and the environmental organizing and linking principle.
The information store principle indicates that a very large knowledge base is essential. In nature, the storage facility is the genome, which holds an immense amount of genetic information. Although there is no agreed information unit size for a particular genome capacity, it is in the thousands for smaller genomes and much more for larger genomes (Stotz and Griffiths 2004). So too for the store of human long-term memory, which has, as far as we are aware, an unlimited capacity for holding information. Our long-term memory holds so much that if we were asked to write down all our knowledge, the task would probably be endless. As humans, we can hold simple information about minimal concepts such as an exclamation mark and a comma. We retain that a lemon is yellow and an egg is oval. We also hold larger procedural networks of how to use a pen, drive a vehicle, write a computer program, or read a spreadsheet. All this knowledge plus vastly more is retained in our long-term memory.
The borrowing and reorganizing principle for natural information systems is realized during either asexual or sexual reproduction. Prior hereditary lines are the providers of the information. In the case of sexual reproduction, that information is necessarily reorganized as an essential part of the process. Genetic information is not replicated identically. For example, offspring are usually similar to, but not indistinguishable duplicates of, their parents. Likewise, in an equivalent process, the majority of the acquired knowledge kept in long-term memory is not a reproduction of the borrowed information and is always changed in some way. Sophisticated communication is an indication of how humans have evolved to acquire most of their information from each other. A learner borrows knowledge by observing, listening, and reading what others write. Almost all we know is borrowed knowledge. We emulate what others do (Bandura 1986). We imitate others’ gestures from infancy. For example, a child comes to understand that “clapping” is used as a signal of appreciation (at least in Western society). In addition, we listen to information from family members and the media. We utilize text to increase our knowledge in a domain. This process involves reorganization, in that new material must be combined with previous material stored in long-term memory using a constructive and reconstructive process.
Unique information is also created by the evolutionary process. Evolution by natural selection does so through random mutation, which is a necessity and the primary source of all biological differences. Similarly, the randomness as genesis principle is based on the foundation that, if knowledge cannot be borrowed from others, learners have to randomly create new knowledge and test it for value during problem solving. When faced with a novel problem, all we can do is test it against previously learnt knowledge retained in long-term memory. This in turn generates new knowledge that we deem either effective or ineffective. Effective new knowledge, if of use, can be retained in long-term memory, while ineffective new knowledge is discarded.
To illustrate, if faced with four potential trails while lost on a walk, we can only choose one and proceed. If it is ineffective, we will turn back and choose one of the three remaining trails. The strategy will continue until we discover the correct trail. The productive trail is retained in long-term memory and the others are discarded. Clearly, a random generate-and-test procedure could result in an unworkable number of possible outcomes if there were too many trails. As another example, consider two elements, A and B, that need to be combined when there is no prior knowledge of the correct combination. Using a random generate-and-test procedure, the number of permutations for A and B is 2. A combination of A, B, C, and D is more difficult to determine, as there are 24 permutations. For 7 letters, the total is 5040 permutations. If we must deal with 10 elements, there are 3,628,800 permutations. This makes the task of problem solving essentially unmanageable. Due to the number of possibilities, combining just 10 items, or even 7, will take much longer and will clearly be beyond the limited capacity and duration of working memory.
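The counts quoted in this example follow from the factorial: n distinct elements can be ordered in n! ways. A minimal sketch, assuming that every ordering of the elements counts as one candidate to be generated and tested (the function name is illustrative):

```python
import math


def orderings_to_test(n_elements):
    """Number of distinct orderings (permutations) of n distinct elements."""
    return math.factorial(n_elements)


# Element counts used in the text: A-B, A-D, 7 letters, 10 elements.
for n in (2, 4, 7, 10):
    print(n, orderings_to_test(n))
```

Each additional element multiplies the search space by the new element count, which is why the growth from 7 to 10 elements is so steep.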
Evolution by natural selection also imposes restrictions on the number of mutations that can be generated. A genome is generally limited in the number of changes to its genetic composition. Variations are usually slow and incremental, as too many alterations could be unproductive or fatal. Genetic mutations thus are slight and occur over a prolonged and measured period. Similarly, the narrow limits of change principle is a safeguard ensuring that variations to long-term memory are minor and incremental, due to the very limited capacity and duration of working memory. The quantity of randomly created new knowledge has to be restricted to preserve the functionality of the information stored in long-term memory.
The environmental organizing and linking principle endorses the previous four principles outlined. In nature, this final principle assures that a change is suitable for the prevailing environment. The environment does influence fluctuations to the information store; however, the eventual goal of this store is to permit adaptive functioning in the given environment. The epigenetic system here is the stable governor and can turn genes on and off after cell division. Regardless, it does so evenly. The environmental organizing and linking principle similarly shows that working memory is restricted when processing novel information; it has, however, much more extensive, unidentified boundaries when processing known information imported from long-term memory.
The Link Between Cognitive Load Theory and Geary’s Evolutionary Cognition Theory
The updating of cognitive load theory and its implications for instructional design is due to the relationship with Geary’s theory. Many aspects of Geary’s thesis have been implemented and supported by contemporary cognitive load theory research. In addition, previous theoretical problems with cognitive load theory findings have been clarified. Geary (2006) and Geary and Berch (2016) suggest that evolved and nonevolved abilities offer an equivalency between natural selection and human learning that can inform instructional procedures. Some knowledge is inherent and will need no instruction. Geary divides knowledge into what is known as biological primary knowledge and biological secondary knowledge. The two categories align with Sweller’s theory in that it is only biological secondary knowledge that requires an awareness of the constructs of our memory system and their implications for instructional design. This will be explained in the following section.
Biological Primary Knowledge
Biological primary knowledge tends to be concerned with generic skills that are essential to human functioning. It is knowledge that we need to function at least basically in a society. It can be termed a cultural knowledge. Clearly, there are numerous adaptive difficulties accompanying the navigation of the social world. Humans have to maintain allies, manage status hierarchies, cope with the competing interests of others, interact with groups and other group members, coordinate social activities, and participate in joint decision-making. Humans have to create favorable conditions using others to their benefit and avoid being used by others for gain.
As well, they have to deal with the realities of the physical world. This is exactly what happens in the natural world (Geary and Berch 2016). A species has to cope with the competing interests of other species, create its own habitat, share its own habitat, and avoid certain habitats of other species. Species need to utilize other species by hunting and evade being used by others by avoiding becoming quarry.
We too, as humans, have evolved to acquire this knowledge automatically. To reiterate, it is knowledge that we do not need to be explicitly taught. The capacity to learn our first language is a necessity for our species, as we are required to communicate verbally. The commencement of speech is an example (Kuhl 2000). The ability of toddlers, with virtually no training, to learn and be motivated to learn their first language by listening (to the extent of at least being understandable) is biological primary knowledge. An infant does not need instruction to utter sounds, as it will make its first sounds independently. Teaching a child to say a word would require consideration of the physical mechanics needed to produce the sound of the word. It requires a concurrent arrangement of lips, tongue, breath, and voice (Sweller 2008). Yet, without any explicit instruction, a child can independently verbalize this word and others at some stage of its development.
To be mobile is crucial, and a child does not need to be taught how to rise onto its legs and take its first steps. Similarly, the recognition of faces (Bentin et al. 1999) is essential, and a child has no need to be explicitly taught to distinguish parents, caregivers, family, or friends. As well, a child is able to read emotions from facial expressions and can follow the gazes and gestures of others (Herrmann et al. 2007). In agreement is Bandura (1986), who states that imitation is a skill that has no need to be taught. Imitation is exhibited at an early age, when young children begin to imitate others frequently without instruction. Although Piaget demonstrated that this occurs from around the age of two onwards (Piaget 1951), it has been shown to commence as early as 6 weeks, when babies imitated the facial expressions of the mother (Meltzoff and Moore 1994).
Biological Secondary Knowledge
The second category, biological secondary knowledge, is information where instruction is certainly required. This class of knowledge differs from biological primary knowledge in that it is culturally specific (Sweller in Geary and Berch 2016). To be literate and numerate are skills that need to be explicitly taught. Educational institutions teach biological secondary knowledge to the members of their society to enable full and successful participation in that society. For example, academe usually instructs in language, mathematics, science, history, geography, and economics, among other subjects in a curriculum. The knowledge within these domains is not automatically inherent in the genesis and progression of the human mind and therefore needs instruction.
Pronumerals and algebra are also examples of biological secondary knowledge. That the pronumeral a can have one value in one equation and another value in a different equation must be specifically taught, as must the procedures to solve equations. We are not genetically programmed to spontaneously comprehend these types of mathematical constructs. In English, the grammatical rule that a sentence must have a subject is not a tacit construct for us. As infants, we may be able to learn to speak our first language by simply listening, as it is biological primary knowledge. However, to speak our language fluently, as required by our society, requires the biological secondary knowledge of the rules of grammar. An infant’s expression “Me want drink,” attained through biological primary knowledge, is not a grammatical sentence.
David Geary’s categorization of knowledge is a logical proposition. The second category, biological secondary knowledge, has implications for learning and the role of explicit instruction.
Discovery Learning Versus Explicit Instruction?
For at least half a century, there has been an energetic belief in many educational circles that we learn more efficiently by a discovery approach. Exemplified by Jerome Bruner (1961) and Seymour Papert (1980), the philosophy has had many variants, carrying labels such as discovery learning, problem-based learning, inquiry learning, constructivist learning, and experiential learning (see Kirschner et al. 2006). The Piagetian theory of the construction of knowledge is also related here. The terminology tends to signify the same constructivist/discovery viewpoint of emphasizing minimally guided instruction for the learner. It seems that a driver for the movement is an argument that travels along the following lines:
An immense amount of information and knowledge is learnt outside educational institutions. Furthermore, this information and knowledge are accumulated with ease and without intervention of an instructor. If this information is absorbed without structured, explicit methods and is successful, then it must be a useful paradigm. Moreover, the artificial methods by which we teach such as guided or direct instruction and transmission from a teacher must be insufficient. Certainly, the argument goes, current teaching methods are not exemplified by students learning syllabus content with ease and without intervention. So, we must then utilize the way information is absorbed in the outside world for use inside the educational institutions. If people learn by being immersed in a society/culture we can then immerse students in a subject or topic and thus somehow they will accrue the required knowledge and more importantly obtain understanding.
A common example of the unguided method is in science instruction. Students are placed in inquiry learning contexts and have to discover recognized principles by demonstrating the investigatory attributes of professional researchers (Van Joolingen et al. 2005). Another illustration is where university medical students in problem-based learning (PBL) courses are urged to find solutions for patients’ common conditions using problem-solving techniques (Schmidt 2000). Guided, direct instruction and rote learning were deemed to be unproductive. Yet, what has been outlined in this chapter regarding secondary biological information and its processing by our cognitive architecture makes clear that having one discover information alone is not efficient. There are too many possibilities for a discovery path which may overload working memory.
There is no compelling empirical evidence that discovery approaches are efficient methods of learning, in spite of decades of research. In stark contrast, there is a large body of evidence from numerous controlled empirical experiments that worked examples, the epitome of direct instruction, are substantially superior as instruction (see Renkl 2005). A worked example is a step-by-step guide given to novices in a domain that shows the solution to a problem. The alternative would have a learner discover the answer.
Consider the following algebra problem: a + b = d; solve for b where a = 3 and d = 4. A full worked example would give a step-by-step guide in perhaps fewer than 8 steps. However, a novice with minimal prior knowledge in the domain may not find a solution at all, or may at least take some time to discover a correct solution. The reason for this is our human cognitive architecture. Working memory may be faced with numerous possible solution paths and be overloaded, resulting in a learning failure.
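A worked example for a problem of this form might read as follows. This is an illustrative sketch of the format and does not reproduce any published instructional material; it isolates b symbolically, after which the given values can be substituted.

```latex
\begin{align*}
a + b &= d \\
a + b - a &= d - a && \text{subtract } a \text{ from both sides} \\
b &= d - a
\end{align*}
```

Each line shows the learner the single permitted move, so no search among alternative solution paths is required.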
Research on the traditional use of worked examples has shown that usually one or two worked examples are given, followed by a number of similar problems to solve. This is ineffective; it is more productive to present many worked examples initially, followed by similar problems to solve (Sweller and Cooper 1985).
As another example of the debate, rote learning is a strategy that is frowned upon within a discovery learning philosophy. Discovery-type learning appears to use the rationale that it develops better understanding within the learner and that no understanding will ever develop by rote learning. A problem with this premise is that understanding is not defined (Sweller 2008). We know that the way our human memory system operates is critical for learning, so it can now be used to examine what we mean by understanding as well as rote learning, and how they could be placed within the operation of this system.
Although this may not be evident under a discovery learning philosophy, both understanding and rote learning are crucially dependent on our long-term memory store (Sweller 2008). Elementary learning of multiplication tables by novices is an illustration of rote learning in early education. If a child can rote learn that 5 × 9 = 45, that knowledge is then stored in long-term memory. If the child can also understand that × means repeated addition, so that 9 + 9 + 9 + 9 + 9 = 45, this too can be stored in long-term memory. It is not the case that the first is stored and the second is not. What is more important is the difference between the two in the structure of the learning content. Learning with understanding requires large amounts of connected information stored in long-term memory. Making those connections imposes a huge amount of cognitive load on working memory during the initial processing needed to have them stored in long-term memory. Thus, as an alternative strategy, novice learners will rote learn (Sweller 2008). Learning with understanding does not negate the function of long-term or working memory. It means that learning with understanding results in a large amount of linked information that needs to be stored in long-term memory. This linked information has features that are difficult to process in working memory. If we return to the algebra worked example research of Sweller and Cooper (1985) described previously, the continued processing of a number of worked examples initially results in more linked knowledge. This is then stored in long-term memory, resulting in better understanding of, in this case, basic algebra.
Schooling, Observation, and Learning
An integral feature of social learning theory is people’s observation of another (the model). The observation takes place within the context of social interactions and experiences. A model carrying out a behavior and the significance of that behavior are recalled along with the order of events. The information is then utilized to guide subsequent behaviors (Bandura 1986). Humans do not learn new behaviors merely by testing them and experiencing the resultant achievement or failure. Observing a model can also prompt the viewer to engage in behavior they have already learned. The ability to reproduce the actions of others, including reproducing the learning of others, could be viewed as essential to the survival of a species. Educational institutions such as schools, colleges, and universities were set up for this reason. They are considered sites not only for social learning but also for “content” learning. As previously outlined, almost all we know has been directly perceived or learnt through another person or group. Geary’s category of biologically secondary knowledge covers the types of information that are found in every syllabus area within an educational institution. School was devised for the teaching of biologically secondary knowledge. This category of knowledge is unlike primary knowledge and is not likely to be attained without the constructs and tasks established in educational institutions (Sweller 2015).
Domain General and Domain-Specific Knowledge
According to Sweller (2015), knowledge that is domain general can be considered generic and is aligned with Geary’s biological primary knowledge. As outlined previously, it comprises essential human functioning skills that we have genetically attained without instruction. When solving any problem, looking at the differences between the goal and the initial state of the problem and minimizing these differences is a first general problem-solving heuristic (Newell and Simon 1972). For example, when faced with a novel problem of direction, we automatically look for differences between the present state of where we are and the goal state of where we wish to go. We do not just stand there inertly. The knowledge is generic, essential for the survival of our species, and does not have to be taught. A secondary response of studying the position of the sun or stars, if possible, as direction indicators is an example of domain-specific knowledge and has to be attained by prior instruction. Other examples of domain-specific knowledge are mathematical operations, the grammar of a language, chemistry principles, and computer coding or medical procedures. These too require detailed training. Educational institutions were established to address the need to teach domain-specific knowledge (which really can be considered biologically secondary knowledge). The chapter will now outline a cognitive load effect related to socialization.
A Collective Memory Through Socialization
Paas and Sweller (2012) state that from the viewpoint of evolution, natural selection supports the “fittest.” According to Brem et al. (2003), self-interest would appear to be a dominating and predisposing force for an individual. Furthermore, Richard Dawkins even labeled this tendency the “selfish gene” (Dawkins 1989). Although this argument of selfishness is questioned due to definitions (see Hawley 2016 for a fuller account), co-operation has the potential to strengthen the group through jointly available resources, in contrast to an individual working independently.
The use of group work in educational institutions is common. Group work can be defined as a small group of students who collaborate to learn and complete tasks. Humans working as a group in many situations can provide an individual with information more expediently than trying to obtain that information without support from a group. In a group, detailed individual knowledge is not a requirement. Clearly, the group can access more resources than an individual working alone. A team of collaborating learners may well be able to solve complex problems that may be insolvable for an individual learner (Paas and Sweller 2012).
Paas and Sweller have completed related research on what is termed the “collaborative learning effect.” Recall the borrowing and reorganizing principle (Sweller and Sweller 2006), under which borrowing and reorganizing information from others is the main source of our knowledge. A corollary of this principle is that, rather than acquiring information alone, we can learn it better from an instructor or via instructional materials.
The collaborative learning effect occurs when, during shared learning, information can be obtained from other sufficiently knowledgeable people engaged in the same task (Paas and Sweller 2012). The effect has been demonstrated in cognitive load research comparing individual with collaborative learning environments. The Paas and Sweller contribution is more empirically specific than that of Vygotsky and his construct of the Zone of Proximal Development (Vygotsky 1962), whose terminology and specific outcomes have been criticized by some as vague.
A detailed examination of group work from a cognitive load perspective will now be outlined. A co-joined working memory has the potential to increase the ability of the group by producing a larger working memory. The elements of information necessary for a learning task can be divided between group members so that the memory load is shared across the group’s cognitive capacity. Note that general communication and coordination between the group members is a requirement as well; however, this is a biological primary knowledge process that has evolved and does not need to be learned, whereas the content of the learning task is biological secondary knowledge. Yet the collaboration (group work) versus independent (working alone) approach has two contrary consequences (Paas and Sweller 2012).
Firstly, a substantial working memory load can be imposed by the learning material (the biologically secondary information). However, that load can be shared over several cooperating members so that they have to devote less cognitive effort than if they were learning alone. As a result, the working memory load on any single individual can be greatly reduced by distributing it among group members. Secondly, dividing the work means that the communication and coordination processes (group interactions) require the group members to invest additional cognitive effort, which is an effort that individuals working alone do not have to exert.
This cost of group interactions can be further divided into two categories: biologically primary knowledge and biologically secondary knowledge. The working memory costs of general group interactions may be minor because they are biologically primary. In contrast, the working memory costs of the task-specific group interactions may be considerable because they involve biologically secondary information. Consequently, a prospective advantage of group learning may only be evident if the working memory costs are largely biologically primary. According to Serfaty et al. (1998), the working memory costs of task-specific communication and coordination processes can be reduced by teaching those processes or by working in carefully designed learning settings.
In this chapter, evidence has been presented that our memory system has been shaped by evolution and has developed to make us what we are. As outlined in the Introduction, these perspectives from David Geary, John Sweller, and others are not the only ways to look at the domains, but they are undoubtedly compelling ones. The interactions between evolution and our human cognitive architecture may enable a better understanding of why, what, and how we learn in both a social and educational context.
- Bandura, A. (1986). Social foundations of thought and action: A social cognitive theory. London: Prentice-Hall.
- Bruner, J. S. (1961). The art of discovery. Harvard Educational Review, 31, 21–32.
- Campbell, D. (1960). Blind variation and selective retention in creative thought as in other knowledge processes. Psychological Review, 67, 380–400.
- Darwin, C. (1859). The origin of species by means of natural selection. London: John Murray.
- Dawkins, R. (1989). The selfish gene. Oxford: Oxford University Press.
- De Groot, A. (1965). Thought and choice in chess. The Hague: Mouton. (Original work published in 1946).
- Hawley, P. H. (2016). Eight myths of child social development: An evolutionary approach to power, aggression, and social competence. In D. C. Geary & D. B. Berch (Eds.), Evolutionary psychology (pp. 145–166). Switzerland: Springer.
- Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63, 81–97.
- Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice Hall.
- Papert, S. (1980). Mindstorms: Children, computers, and powerful ideas. New York: Basic Books.
- Piaget, J. (1951). Psychology of intelligence. London: Routledge.
- Serfaty, D., Entin, E. E., & Johnston, J. H. (1998). Team adaptation and coordination training. In J. A. Cannon-Bowers & E. Salas (Eds.), Making decisions under stress: Implications for individual and team training (pp. 221–246). Washington, DC: American Psychological Association.
- Sweller, J., & Sweller, S. (2006). Natural information processing systems. Evolutionary Psychology, 4, 434–458.