5. Information processing theory

Cass Paquin, a middle school mathematics teacher, seemed sad when she met with her team members Don Jacks and Fran Killian.

Don: What’s the matter, Cass? Things got you down?

Cass: They just don’t get it. I can’t get them to understand what a variable is. “X” is a mystery to them.

Fran: Yes, “x” is too abstract for kids.

Don: It’s abstract to adults too. “X” is a letter of the alphabet, a symbol. I’ve had the same problem. Some seem to pick it up, but many don’t.

Fran: In my master’s program they teach that you have to make learning meaningful. People learn better when they can relate the new learning to something they know. “X” has no meaning in math. We need to change it to something the kids know.

Cass: Such as what—cookies?

Fran: Well, yes. Take your problem 4x + 7 = 15. How about saying: 4 times how many cookies plus 7 cookies equals 15 cookies? Or use apples. Or both. That way the kids can relate “x” to something tangible—real. Then “x” won’t just be something they memorize how to work with. They’ll associate “x” with things that can take on different values, such as cookies and apples.

Don: That’s a problem with a lot of math—it’s too abstract. When kids are little, we use real objects to make it meaningful. We cut pies into pieces to illustrate fractions. Then when they get older we stop doing that and use abstract symbols most of the time. Sure, they have to know how to use those symbols, but we should try to make the concepts meaningful.

Cass: Yes. I’ve fallen into that trap—teach the material like it’s in the book. I need to try to relate the concepts better to what the kids know and what makes sense to them.

Information processing theories focus on how people attend to environmental events, encode information to be learned and relate it to knowledge in memory, store new knowledge in memory, and retrieve it as needed (Shuell, 1986). The tenets of these theories are as follows: “Humans are processors of information.
The mind is an information-processing system. Cognition is a series of mental processes. Learning is the acquisition of mental representations.” (Mayer, 1996, p. 154)

Information processing is not the name of a single theory; it is a generic name applied to theoretical perspectives dealing with the sequence and execution of cognitive events. Although certain theories are discussed in this chapter, there is no one dominant theory, and some researchers do not support any of the current theories (Matlin, 2009). Given this situation, one might conclude that information processing lacks a clear identity. In part this may be due to its influence by advances in various domains including communications, technology, and neuroscience.

Much early information processing research was conducted in laboratories and dealt with phenomena such as eye movements, recognition and recall times, attention to stimuli, and interference in perception and memory. Subsequent research has explored learning, memory, problem solving, visual and auditory perception, cognitive development, and artificial intelligence. Despite a healthy research literature, information processing principles have not always lent themselves readily to school learning, curricular structure, and instructional design. This situation does not imply that information processing has little educational relevance, only that many potential applications are yet to be developed. Researchers increasingly are applying principles to educational settings involving such subjects as reading, mathematics, and science, and applications remain research priorities. The participants in the opening scenario are discussing meaningfulness, a key aspect of information processing.

This chapter initially discusses the assumptions of information processing and gives an overview of a prototypical two-store memory model.
The bulk of the chapter is devoted to explicating the component processes of attention, perception, short-term (working) memory, and long-term memory (storage, retrieval, forgetting). Relevant historical material on verbal learning and Gestalt psychology is mentioned, along with alternative views involving levels of processing and of memory activation. Language comprehension is discussed, and the chapter concludes by addressing mental imagery and instructional applications.

When you finish studying this chapter, you should be able to do the following:

■ Describe the major components of information processing: attention, perception, short-term (working) memory, long-term memory.
■ Distinguish different views of attention, and explain how attention affects learning.
■ Compare and contrast Gestalt and information processing theories of perception.
■ Discuss the major forms of verbal learning research.
■ Differentiate short- and long-term memory on the basis of capacity, duration, and component processes.
■ Define propositions, and explain their role in encoding and retrieval of long-term memory information.
■ Explain the major factors that influence encoding, retrieval, and forgetting.
■ Discuss the major components of language comprehension.
■ Explain the dual-code theory and apply it to mental imagery.
■ Identify information processing principles inherent in instructional applications involving advance organizers, the conditions of learning, and cognitive load.

1. Information processing system

1.1. Assumptions

Information processing theorists challenged the idea inherent in behaviorism (Chapter 3) that learning involves forming associations between stimuli and responses. Information processing theorists do not reject associations, because they postulate that forming associations between bits of knowledge helps to facilitate their acquisition and storage in memory.
Rather, these theorists are less concerned with external conditions and focus more on internal (mental) processes that intervene between stimuli and responses. Learners are active seekers and processors of information. Unlike behaviorists, who said that people respond when stimuli impinge on them, information processing theorists contend that people select and attend to features of the environment, transform and rehearse information, relate new information to previously acquired knowledge, and organize knowledge to make it meaningful (Mayer, 1996).

Information processing theories differ in their views on which cognitive processes are important and how they operate, but they share some common assumptions. One is that information processing occurs in stages that intervene between receiving a stimulus and producing a response. A corollary is that the form of information, or how it is represented mentally, differs depending on the stage. The stages are qualitatively different from one another.

Another assumption is that information processing is analogous to computer processing, at least metaphorically. The human system functions similarly to a computer: It receives information, stores it in memory, and retrieves it as necessary. Cognitive processing is remarkably efficient; there is little waste or overlap. Researchers differ in how far they extend this analogy. For some, the computer analogy is nothing more than a metaphor. Others employ computers to simulate activities of humans. The field of artificial intelligence is concerned with programming computers to engage in human activities such as thinking, using language, and solving problems (Chapter 7).

Researchers also assume that information processing is involved in all cognitive activities: perceiving, rehearsing, thinking, problem solving, remembering, forgetting, and imaging (Farnham-Diggory, 1992; Matlin, 2009; Mayer, 1996; Shuell, 1986; Terry, 2009).
Information processing extends beyond human learning as traditionally delineated. This chapter is concerned primarily with those information functions most germane to learning.

1.2. Two-store memory model

Figure 5.1 shows an information processing model that incorporates processing stages. Although this model is generic, it closely corresponds to the classic model proposed by Atkinson and Shiffrin (1968, 1971). Information processing begins when a stimulus input (e.g., visual, auditory) impinges on one or more senses (e.g., hearing, sight, touch). The appropriate sensory register receives the input and holds it briefly in sensory form. It is here that perception (pattern recognition) occurs, which is the process of assigning meaning to a stimulus input. This typically does not involve naming, because naming takes time and information stays in the sensory register for only a fraction of a second. Rather, perception involves matching an input to known information.

The sensory register transfers information to short-term memory (STM). STM is a working memory (WM) and corresponds roughly to awareness, or what one is conscious of at a given moment. WM is limited in capacity. Miller (1956) proposed that it holds seven plus or minus two units of information. A unit is a meaningful item: a letter, word, number, or common expression (e.g., “bread and butter”). WM also is limited in duration; for units to be retained in WM, they must be rehearsed (repeated). Without rehearsal, information is lost after a few seconds.

While information is in WM, related knowledge in long-term memory (LTM), or permanent memory, is activated and placed in WM to be integrated with the new information. To name all the state capitals beginning with the letter A, students recall the names of states—perhaps by region of the country—and scan the names of their capital cities. When students who do not know the capital of Maryland learn “Annapolis,” they can store it with “Maryland” in LTM.
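The flow just described (sensory register to working memory to long-term memory) can be illustrated with a toy simulation. This is a sketch, not part of Atkinson and Shiffrin’s formal model: the seven-unit capacity follows Miller (1956), but the rehearsal threshold, the transfer rule, and the class design are illustrative assumptions.

```python
from collections import deque

class TwoStoreMemory:
    """Toy sketch of a two-store memory model.

    Assumptions for illustration only: WM holds 7 units, and three
    rehearsals are enough to transfer an item to LTM. The theory
    itself does not fix a transfer rule this precisely.
    """

    def __init__(self, wm_capacity=7, transfer_threshold=3):
        self.wm = deque(maxlen=wm_capacity)  # limited-capacity working memory
        self.ltm = set()                     # long-term (permanent) store
        self.rehearsals = {}                 # rehearsal count per item
        self.transfer_threshold = transfer_threshold

    def perceive(self, item):
        # Sensory register passes the item into WM; when WM is full,
        # the oldest item is displaced (lost without rehearsal).
        self.wm.append(item)
        self.rehearsals.setdefault(item, 0)

    def rehearse(self, item):
        # Rehearsal maintains an item in WM; enough rehearsal
        # transfers it into LTM.
        if item in self.wm:
            self.rehearsals[item] += 1
            if self.rehearsals[item] >= self.transfer_threshold:
                self.ltm.add(item)

    def recall(self, item):
        # An item can be recalled if it is still in WM or stored in LTM.
        return item in self.wm or item in self.ltm

mem = TwoStoreMemory()
for word in ["cat", "tree", "lamp"]:
    mem.perceive(word)
for _ in range(3):
    mem.rehearse("cat")        # "cat" reaches LTM
for letter in "abcdefgh":
    mem.perceive(letter)       # eight new items push the originals out of WM

print(mem.recall("cat"))   # True: rehearsed enough to reach LTM
print(mem.recall("tree"))  # False: displaced from WM before transfer
```

In this sketch, the bounded deque models displacement from limited-capacity WM, and rehearsal is the only route into LTM, mirroring the model’s claim that unrehearsed information is quickly lost.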
It is debatable whether information is lost from LTM (i.e., forgotten). Some researchers contend that it can be, whereas others say that failure to recall reflects a lack of good retrieval cues rather than forgetting. If Sarah cannot recall her third-grade teacher’s name (Mapleton), she might be able to if given the hint, “Think of trees.” Regardless of theoretical perspective, researchers agree that information remains in LTM for a long time.

Control (executive) processes regulate the flow of information throughout the information processing system. Rehearsal is an important control process that occurs in WM. For verbal material, rehearsal takes the form of repeating information aloud or subvocally. Other control processes include coding (putting information into a meaningful context—an issue being discussed in the opening scenario), imaging (visually representing information), implementing decision rules, organizing information, monitoring level of understanding, and using retrieval, self-regulation, and motivational strategies. Control processes are discussed in this chapter and in Chapter 7.

The two-store model can account for many research results. One of the most consistent research findings is that when people have a list of items to learn, they tend to recall best the initial items (primacy effect) and the last items (recency effect), as portrayed in Figure 5.2. According to the two-store model, initial items receive the most rehearsal and are transferred to LTM, whereas the last items are still in WM at the time of recall. Middle items are recalled the poorest because they are no longer in WM at the time of recall (having been pushed out by subsequent items), they receive fewer rehearsals than initial items, and they are not properly stored in LTM.

Research suggests, however, that learning may be more complex than the basic two-store model stipulates (Baddeley, 1998).
One problem is that this model does not fully specify how information moves from one store to the other. The control processes notion is plausible but vague. We might ask: Why do some inputs proceed from the sensory registers into WM and others do not? Which mechanisms decide that information has been rehearsed long enough and transfer it into LTM? How is information in LTM selected to be activated?

Another concern is that this model seems best suited to handle verbal material. How nonverbal representation occurs with material that may not be readily verbalized, such as modern art and well-established skills, is not clear.

The model also is vague about what really is learned. Consider people learning word lists. With nonsense syllables, they have to learn the words themselves and the positions in which they appear. When they already know the words, they must only learn the positions; for example, “cat” appears in the fourth position, followed by “tree.” People must take into account their purpose in learning and modify learning strategies accordingly. What mechanism controls these processes?

Whether all components of the system are used at all times is also an issue. WM is useful when people are acquiring knowledge and need to relate incoming information to knowledge in LTM. But we do many things automatically: get dressed, walk, ride a bicycle, respond to simple requests (e.g., “Do you have the time?”). For many adults, reading (decoding) and simple arithmetic computations are automatic processes that place little demand on cognitive processes. Such automatic processing may not require the operation of WM. How does automatic processing develop, and what mechanisms govern it?

These and other issues not addressed well by the two-store model (e.g., the role of motivation in learning and the development of self-regulation) do not disprove the model; rather, they are issues to be addressed.
Although the two-store model is the best-known example of information processing theory, many researchers do not fully accept it (Matlin, 2009; Nairne, 2002). Alternative theories covered in this chapter are levels (or depth) of processing and activation level, and the newer connectionism and parallel distributed processing (PDP) theories. Before components of the two-store model are described in greater detail, levels of processing and activation level theories are discussed (connectionism and PDP are covered later in this chapter).

1.3. Alternatives to the two-store model

Levels (Depth) of Processing. Levels (depth) of processing theory conceptualizes memory according to the type of processing that information receives rather than its location (Craik, 1979; Craik & Lockhart, 1972; Craik & Tulving, 1975; Lockhart, Craik, & Jacoby, 1976). This view does not incorporate stages or structural components such as WM or LTM (Terry, 2009). Rather, different ways to process information (such as levels or depth at which it is processed) exist: physical (surface), acoustic (phonological, sound), and semantic (meaning). These three levels are dimensional, with physical processing being the most superficial (such as “x” as a symbol devoid of meaning, as discussed by the teachers in the introductory scenario) and semantic processing being the deepest. For example, suppose you are reading and the next word is wren. This word can be processed on a surface level (e.g., it is not capitalized), a phonological level (rhymes with den), or a semantic level (small bird). Each level represents a more elaborate (deeper) type of processing than the preceding level; processing the meaning of wren expands the information content of the item more than acoustic processing, which expands content more than surface-level processing.

These three levels seem conceptually similar to the sensory register, WM, and LTM of the two-store model.
Both views contend that processing becomes more elaborate with succeeding stages or levels. The levels of processing model, however, does not assume that the three types of processing constitute stages. In levels of processing, one does not have to move to the next process to engage in more elaborate processing; depth of processing can vary within a level. Wren can receive low-level semantic processing (small bird) or more extensive semantic processing (its similarity to and difference from other birds).

Another difference between the two information processing models concerns the order of processing. The two-store model assumes information is processed first by the sensory register, then by WM, and finally by LTM. The levels of processing model does not make a sequential assumption. To be processed at the meaning level, information does not have to be first processed at the surface and sound levels (beyond what processing is required for information to be received) (Lockhart et al., 1976).

The two models also have different views of how type of processing affects memory. In levels of processing, the deeper the level at which an item is processed, the better the memory, because the memory trace is more ingrained. The teachers in the opening scenario are concerned about how they can help students process algebraic information at a deeper level. Once an item is processed at a particular point within a level, additional processing at that point should not improve memory. In contrast, the two-store model contends that memory can be improved with additional processing of the same type. This model predicts that the more a list of items is rehearsed, the better it will be recalled.

Some research evidence supports levels of processing. Craik and Tulving (1975) presented individuals with words. As each word was presented, they were given a question to answer. The questions were designed to facilitate processing at a particular level.
For surface processing, people were asked, “Is the word in capital letters?” For phonological processing they were asked, “Does the word rhyme with train?” For semantic processing, “Would the word fit in the sentence, ‘He met a _____ in the street’?” The time people spent processing at the various levels was controlled. Their recall was best when items were processed at a semantic level, next best at a phonological level, and worst at a surface level. These results suggest that forgetting is more likely with shallow processing and is not due to loss of information from WM or LTM.

Levels of processing implies that student understanding is better when material is processed at deeper levels. Glover, Plake, Roberts, Zimmer, and Palmere (1981) found that asking students to paraphrase ideas while they read essays significantly enhanced recall compared with activities that did not draw on previous knowledge (e.g., identifying key words in the essays). Instructions to read slowly and carefully did not assist students during recall.

Despite these positive findings, levels of processing theory has problems. One concern is whether semantic processing always is deeper than the other levels. The sounds of some words (kaput) are at least as distinctive as their meanings (“ruined”). In fact, recall depends not only on level of processing but also on type of recall task. Morris, Bransford, and Franks (1977) found that, given a standard recall task, semantic coding produced better results than rhyming coding; however, given a recall task emphasizing rhyming, asking rhyming questions during coding produced better recall than semantic questions. Moscovitch and Craik (1976) proposed that deeper processing during learning results in a higher potential memory performance, but that potential will be realized only when conditions at retrieval match those during learning.
Another concern with levels of processing theory is whether additional processing at the same level produces better recall. Nelson (1977) gave participants one or two repetitions of each stimulus (word) processed at the same level. Two repetitions produced better recall, contrary to the levels of processing hypothesis. Other research shows that additional rehearsal of material facilitates retention and recall as well as automaticity of processing (Anderson, 1990; Jacoby, Bartz, & Evans, 1978).

A final issue concerns the nature of a level. Investigators have argued that the notion of depth is fuzzy, both in its definition and measurement (Terry, 2009). As a result, we do not know how processing at different levels affects learning and memory (Baddeley, 1978; Nelson, 1977). Time is a poor criterion of level because some surface processing (e.g., “Does the word have the following letter pattern: consonant-vowel-consonant-consonant-vowel-consonant?”) can take longer than semantic processing (“Is it a type of bird?”). Neither is processing time within a given level indicative of deeper processing (Baddeley, 1978, 1998). A lack of clear understanding of levels (depth) limits the usefulness of this perspective.

Resolving these issues may require combining levels of processing with the two-store idea to produce a refined memory model. For example, information in WM might be related to knowledge in LTM superficially or more elaborately. Also, the two memory stores might include levels of processing within each store. Semantic coding in LTM may lead to a more extensive network of information and a more meaningful way to remember information than surface or phonological coding.

Activation Level. An alternative concept of memory, but one similar to the two-store and levels of processing models, contends that memory structures vary in their activation level (Anderson, 1990).
In this view, we do not have separate memory structures but rather one memory with different activation states. Information may be in an active or inactive state. When active, the information can be accessed quickly. The active state is maintained as long as information is attended to. Without attention, the activation level will decay, in which case the information can be activated when the memory structure is reactivated (Collins & Loftus, 1975).

Active information can include information entering the information processing system and information that has been stored in memory (Baddeley, 1998). Regardless of the source, active information either is currently being processed or can be processed rapidly. Active material is roughly synonymous with WM, but the former category is broader than the latter. WM includes information in immediate consciousness, whereas active memory includes that information plus material that can be accessed easily. For example, if I am visiting Aunt Frieda and we are admiring her flower garden, that information is in WM, but other information associated with Aunt Frieda’s yard (trees, shrubs, dog) may be in an active state. Rehearsal allows information to be maintained in an active state (Anderson, 1990). As with working memory, only a limited number of memory structures can be active at a given time. As one’s attention shifts, activation level changes.

We encounter the activation level idea again later in this chapter (i.e., Anderson’s ACT theory) because the concept is critical for storage of information and its retrieval from memory. The basic notion involves spreading activation, which means that one memory structure may activate another structure adjacent (related) to it (Anderson, 1990). Activation spreads from active to inactive portions of memory. The level of activation depends on the strength of the path along which the activation spreads and on the number of competing (interfering) paths.
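The spreading activation idea can be sketched as propagation over a weighted associative network. The network, path strengths, decay rule, and activation threshold below are invented for illustration (using the Aunt Frieda garden example from the text), and competing paths are omitted for brevity; Anderson’s theory specifies these mechanisms in far more detail.

```python
# Illustrative associative network: nodes are memory structures,
# weights are path strengths (all values are invented assumptions).
network = {
    "garden": {"flowers": 0.9, "trees": 0.7, "shrubs": 0.6},
    "flowers": {"bees": 0.8},
    "trees": {"birds": 0.5},
}

def spread_activation(source, initial=1.0, threshold=0.3):
    """Propagate activation from a source node along weighted paths.

    Activation weakens with each link traversed (multiplied by path
    strength), and nodes falling below the threshold stay inactive.
    """
    activation = {source: initial}
    frontier = [source]
    while frontier:
        node = frontier.pop()
        for neighbor, strength in network.get(node, {}).items():
            new_level = activation[node] * strength
            if new_level > activation.get(neighbor, 0) and new_level >= threshold:
                activation[neighbor] = new_level
                frontier.append(neighbor)
    return activation

# Attending to "garden" partially activates related structures,
# with activation weakening along longer or weaker paths.
levels = spread_activation("garden")
print(sorted(levels.items(), key=lambda kv: -kv[1]))
```

Note how activation weakens with distance from the source: directly linked structures (flowers, trees, shrubs) end up more active than structures reached through an intermediate link (bees, birds), which is the sense in which path strength governs activation level.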
Activation spread becomes more likely with increased practice, which strengthens structures, and less likely with longer retention intervals, as strength weakens.

One advantage of activation level theory is that it can explain retrieval of information from memory. By dispensing with the notion of separate memory stores, the model eliminates the potential problem of transferring information from one store to the other. STM (WM) is that part of memory that is currently active. Activation decays with the passage of time unless rehearsal keeps the information activated (Nairne, 2002).

At the same time, the activation level model has not escaped the dual-store model’s problems, because it too dichotomizes the information system (active-inactive). We also have the problem of the strength level needed for information to pass from one state to another. We intuitively know that information may be partially activated (e.g., a crossword item on the “tip of your tongue”—you know it but cannot recall it), so we might ask how much activation is needed for material to be considered active. These concerns notwithstanding, the activation level model offers important insights into the processing of information.

We now examine in greater depth the components of the two-store model: attention, perception, encoding, storage, and retrieval (Shuell, 1986). The next section discusses attention; perception, encoding, storage, and retrieval are addressed in subsequent sections.

2. Attention

The word attention is heard often in educational settings. Teachers and parents complain that students do not pay attention to instructions or directions. (This does not seem to be the problem in the opening scenario; rather, the issue involves meaningfulness of processing.) Even high-achieving students do not always attend to instructionally relevant events. Sights, sounds, smells, tastes, and sensations bombard us; we cannot and should not attend to them all.
Our attentional capabilities are limited; we can attend to only a few things at once. Thus, attention can be construed as the process of selecting some of many potential inputs. Alternatively, attention can refer to a limited human resource expended to accomplish one’s goals and to mobilize and maintain cognitive processes (Grabe, 1986). Attention is not a bottleneck in the information processing system through which only so much information can pass. Rather, it describes a general limitation on the entire human information processing system.

Theories of Attention

Research has explored how people select inputs for attending. In dichotic listening tasks, people wear headphones and receive different messages in each ear. They are asked to “shadow” one message (report what they hear); most can do this quite well. Cherry (1953) wondered what happened to the unattended message. He found that listeners knew when it was present, whether it was a human voice or a noise, and when it changed from a male to a female voice. They typically did not know what the message was, what words were spoken, which language was being spoken, or whether words were repeated.

Broadbent (1958) proposed a model of attention known as filter (bottleneck) theory. In this view, incoming information from the environment is held briefly in a sensory system. Based on their physical characteristics, pieces of information are selected for further processing by the perceptual system. Information not acted on by the perceptual system is filtered out—not processed beyond the sensory system. Attention is selective because of the bottleneck—only some messages receive further processing. In dichotic listening studies, filter theory proposes that listeners select a channel based on their instructions. They know some details about the other message because the physical examination of information occurs prior to filtering.

Subsequent work by Treisman (1960, 1964) identified problems with filter theory.
Treisman found that during dichotic listening experiments, listeners routinely shifted their attention between ears depending on the location of the message they were shadowing. If they were shadowing the message coming into their left ear and the message suddenly shifted to the right ear, they continued to shadow the original message and not the new message coming into the left ear. Selective attention thus depends not only on the physical location of the stimulus but also on its meaning.

Treisman (1992; Treisman & Gelade, 1980) proposed a feature-integration theory. Sometimes we distribute attention across many inputs, each of which receives low-level processing. At other times we focus on a particular input, which is more cognitively demanding. Rather than blocking out messages, attention simply makes them less salient than those being attended to. Information inputs initially are subjected to different tests for physical characteristics and content. Following this preliminary analysis, one input may be selected for attention. Treisman’s model is problematic in the sense that much analysis must precede attending to an input, which is puzzling because presumably the original analysis involves some attention.

Norman (1976) proposed that all inputs are attended to sufficiently to activate a portion of LTM. At that point, one input is selected for further attention based on the degree of activation, which depends on the context. An input is more likely to be attended to if it fits into the context established by prior inputs. While people read, for example, many outside stimuli impinge on their sensory systems, yet they attend to the printed symbols. In Norman’s view, stimuli activate portions of LTM, but attention involves more complete activation.

Neisser (1967) suggested that preattentive processes are involved in head and eye movements (e.g., refocusing attention) and in guided movements (e.g., walking, driving).
Preattentive processes are automatic—people implement them without conscious mediation. In contrast, attentional processes are deliberate and require conscious activity. In support of this point, Logan (2002) postulated that attention and categorization occur together. As an object is attended to, it is categorized based on information in memory. Attention, categorization, and memory are three aspects of deliberate, conscious cognition. Researchers currently are exploring the neurophysiological processes (Chapter 2) involved in attention (Matlin, 2009).

2.1. Attention and learning

Attention is a necessary prerequisite of learning. In learning to distinguish letters, a child learns the distinctive features: To distinguish b from d, students must attend to the position of the vertical line on the left or right side of the circle, not to the mere presence of a circle attached to a vertical line. To learn from the teacher, students must attend to the teacher’s voice and ignore other sounds. To develop reading comprehension skills, students must attend to the printed words and ignore such irrelevancies as page size and color.

Attention is a limited resource; learners do not have unlimited amounts of it. Learners allocate attention to activities as a function of motivation and self-regulation (Kanfer & Ackerman, 1989; Kanfer & Kanfer, 1991). As skills become routine, information processing requires less conscious attention. In learning to work multiplication problems, students must attend to each step in the process and check their computations. Once students learn multiplication tables and the algorithm, working problems becomes automatic and is triggered by the input. Research shows that much cognitive skill processing becomes automatic (Phye, 1989).

Differences in the ability to control attention are associated with student age, hyperactivity, intelligence, and learning disabilities (Grabe, 1986). Attention deficits are associated with learning problems.
Hyperactive students are characterized by excessive motor activity, distractibility, and low academic achievement. They have difficulty focusing and sustaining attention on academic material. They may be unable to block out irrelevant stimuli, which overloads their processing systems. Sustaining attention over time requires that students work in a strategic manner and monitor their level of understanding. Normal achievers and older children sustain attention better than do low achievers and younger learners on tasks requiring strategic processing (Short, Friebert, & Andrist, 1990).

Teachers can spot attentive students by noting their eye focus, their ability to begin working on cue (after directions are completed), and physical signs (e.g., handwriting) indicating they are engaged in work (Good & Brophy, 1984). But physical signs alone may not be sufficient; strict teachers can keep students sitting quietly even though students may not be engaged in class work.

Teachers can promote student attention to relevant material through the design of classroom activities (Application 5.1). Eye-catching displays or actions at the start of lessons engage student attention. Teachers who move around the classroom also help keep students attentive.

2.2. Attention and reading

A common research finding is that students are more likely to recall important text elements than less important ones (R. Anderson, 1982; Grabe, 1986). Good and poor readers locate important material and attend to it for longer periods (Ramsel & Grabe, 1983; Reynolds & Anderson, 1982). What distinguishes these readers is subsequent processing and comprehension. Perhaps poor readers, being more preoccupied with basic reading tasks (e.g., decoding), become distracted from important material and do not process it adequately for retention and retrieval.
While attending to important material, good readers may be more apt to relate the information to what they know, make it meaningful, and rehearse it, all of which improve comprehension (Resnick, 1981). The importance of text material can affect subsequent recall through differential attention (R. Anderson, 1982). Text elements apparently are processed at some minimal level so importance can be assessed. Based on this evaluation, the text element either is dismissed in favor of the next element (unimportant information) or receives additional attention (important information). Comprehension suffers when students do not pay adequate attention. Assuming attention is sufficient, the actual types of processing students engage in must differ to account for subsequent comprehension differences. Better readers may engage in much automatic processing initially and attend to information deemed important, whereas poorer readers might engage in automatic processing less often. Hidi (1995) noted that attention is required during many phases of reading: processing orthographic features, extracting meanings, judging information for importance, and focusing on important information. This suggests that attentional demands vary considerably depending on the purpose of reading—for example, extracting details, comprehending, or new learning. Future research—especially neurophysiological—should help to clarify these issues (Chapter 2).

3. Perception

Perception (pattern recognition) refers to attaching meaning to environmental inputs received through the senses. For an input to be perceived, it must be held in one or more of the sensory registers and compared to knowledge in LTM. These registers and the comparison process are discussed in the next section. Gestalt theory was an early cognitive view that challenged many assumptions of behaviorism.
Although Gestalt theory no longer is viable, it offered important principles that are found in current conceptions of perception and learning. This theory is explained next, followed by a discussion of perception from an information processing perspective.

3.1. Gestalt Theory

The Gestalt movement began with a small group of psychologists in early twentieth-century Germany. In 1912, Max Wertheimer wrote an article on apparent motion. The article was significant among German psychologists but had no influence in the United States, where the Gestalt movement had not yet begun. The subsequent publication in English of Kurt Koffka's The Growth of the Mind (1924) and Wolfgang Köhler's The Mentality of Apes (1925) helped the Gestalt movement spread to the United States. Many Gestalt psychologists, including Wertheimer, Koffka, and Köhler, eventually emigrated to the United States, where they applied their ideas to psychological phenomena. In a typical demonstration of the apparent motion perceptual phenomenon, two lines close together are exposed successively for a fraction of a second with a short time interval between each exposure. An observer sees not two lines but rather a single line moving from the line exposed first toward the line exposed second. The timing of the demonstration is critical. If the time interval between exposure of the two lines is too long, the observer sees the first line and then the second but no motion. If the interval is too short, the observer sees two lines side by side but no motion. This apparent motion is known as the phi phenomenon and demonstrates that subjective experiences cannot be explained by referring to the objective elements involved. Observers perceive movement even though none occurs. Phenomenological experience (apparent motion) differs from sensory experience (exposure of lines).
The attempt to explain this and related phenomena led Wertheimer to challenge psychological explanations of perception as the sum of one's sensory experiences, because these explanations did not take into account the unique wholeness of perception. Meaningfulness of Perception. Imagine a woman named Betty who is 5 feet tall. When we view Betty at a distance, our retinal image is much smaller than when we view her close up. Yet Betty is 5 feet tall, and we know that regardless of how far away she is. Although the perception (retinal image) varies, the meaning of the image remains constant. The German word Gestalt translates as "form," "figure," "shape," or "configuration." The essence of Gestalt psychology is that objects or events are viewed as organized wholes (Köhler, 1947/1959). The basic organization involves a figure (what one focuses on) against a ground (the background). What is meaningful is the configuration, not the individual parts (Koffka, 1922). A tree is not a random collection of leaves, branches, roots, and trunk; it is a meaningful configuration of these elements. When viewing a tree, people typically do not focus on individual elements but rather on the whole. The human brain transforms objective reality into mental events organized as meaningful wholes. This capacity to view things as wholes is an inborn quality, although perception is modified by experience and training (Köhler, 1947/1959; Leeper, 1935). Gestalt theory originally applied to perception, but when its European proponents came to the United States they found an emphasis on learning. Applying Gestalt ideas to learning was not difficult. In the Gestalt view, learning is a cognitive phenomenon involving reorganizing experiences into different perceptions of things, people, or events (Koffka, 1922, 1926). Much human learning is insightful, which means that the transformation from ignorance to knowledge occurs rapidly.
When confronted with a problem, individuals figure out what is known and what needs to be determined. They then think about possible solutions. Insight occurs when people suddenly "see" how to solve the problem. Gestalt theorists disagreed with Watson and other behaviorists about the role of consciousness (Chapter 3). In Gestalt theory, meaningful perception and insight occur only through conscious awareness. Gestalt psychologists also disputed the idea that complex phenomena can be broken into elementary parts. Behaviorists stressed associations—the whole is equal to the sum of the parts. Gestalt psychologists felt that the whole is meaningful and loses meaning when it is reduced to individual components; instead, the whole is greater than the sum of its parts. (In the opening scenario, "x" loses meaning unless it can be related to broader categories.) Interestingly, Gestalt psychologists agreed with behaviorists in objecting to introspection, but for a different reason. Behaviorists viewed it as an attempt to study consciousness; Gestalt theorists felt it was inappropriate to modify perceptions to correspond to objective reality. People who used introspection tried to separate meaning from perception, whereas Gestalt psychologists believed that perception was meaningful. Principles of Organization. Gestalt theory postulates that people use principles to organize their perceptions. Some of the most important principles are figure–ground relation, proximity, similarity, common direction, simplicity, and closure (Figure 5.3; Koffka, 1922; Köhler, 1926, 1947/1959). The principle of figure–ground relation postulates that any perceptual field may be subdivided into a figure against a background. Such salient features as size, shape, color, and pitch distinguish a figure from its background. When figure and ground are ambiguous, perceivers may alternatively organize the sensory experience one way and then another (Figure 5.3a).
The principle of proximity states that elements in a perceptual field are viewed as belonging together according to their closeness to one another in space or time. Most people will view the lines in Figure 5.3b as three groups of three lines each, although other ways of perceiving this configuration are possible. The principle of proximity also is involved in the perception of speech. People hear (organize) speech as a series of words or phrases separated by pauses. When people hear unfamiliar speech sounds (e.g., foreign languages), they have difficulty discerning pauses. The principle of similarity means that elements similar in aspects such as size or color are perceived as belonging together. Viewing Figure 5.3c, people tend to see a group of three short lines, followed by a group of three long lines, and so on. Proximity can outweigh similarity; when dissimilar stimuli are closer together than similar ones (Figure 5.3d), the perceptual field tends to be organized into four groups of two lines each. The principle of common direction implies that elements appearing to constitute a pattern or flow in the same direction are perceived as a figure. The lines in Figure 5.3e are most likely to be perceived as forming a distinct pattern. The principle of common direction also applies to an alphabetic or numeric series in which one or more rules define the order of items. Thus, the next letter in the series abdeghjk is m, as determined by the rule: Beginning with the letter a and moving through the alphabet sequentially, list two letters and omit one. The principle of simplicity states that people organize their perceptual fields into simple, regular features and tend to form good Gestalts comprising symmetry and regularity.
This idea is captured by the German word Prägnanz, which roughly translated means "meaningfulness" or "precision." Individuals are most likely to see the visual patterns in Figure 5.3f as one geometrical pattern overlapping another rather than as several irregularly shaped geometric patterns. The principle of closure means that people fill in incomplete patterns or experiences. Despite the missing lines in the pattern shown in Figure 5.3g, people tend to complete the pattern and see a meaningful picture. Many of the concepts embodied in Gestalt theory are relevant to our perceptions; however, Gestalt principles are quite general and do not address the actual mechanisms of perception. To say that individuals perceive similar items as belonging together does not explain how they perceive items as similar in the first place. Gestalt principles are illuminating but vague and not explanatory. Research does not support some of the Gestalt predictions. Kubovy and van den Berg (2008) found that the joint effect of proximity and similarity was equal to the sum of their separate effects, not greater than it as Gestalt theory predicts. Information processing principles, discussed next, are clearer and provide a better explanation of perception.

3.3. Sensory Registers

Environmental inputs are attended to and received through the senses: vision, hearing, touch, smell, and taste. Information processing theories contend that each sense has its own register that holds information briefly in the same form in which it is received (e.g., visual information is held in visual form, auditory information in auditory form). Information stays in the sensory register for only a fraction of a second. Some sensory input is transferred to WM for further processing. Other input is erased and replaced by new input. The sensory registers operate in parallel fashion because several senses can be engaged simultaneously and independently of one another.
The two sensory memories that have been most extensively explored are iconic (vision) and echoic (hearing) (Neisser, 1967). In a typical experiment to investigate iconic memory, a researcher presents learners with rows of letters briefly (e.g., 50 milliseconds) and asks them to report as many as they remember. They commonly report only four to five letters from an array. Early work by Sperling (1960) provided insight into iconic storage. Sperling presented learners with rows of letters, then cued them to report letters from a particular row. Sperling estimated that, after exposure to the array, they could recall about nine letters. Sensory memory could hold more information than was previously believed, but while participants were recalling letters, the traces of other letters quickly faded. Sperling also found that the more time between the end of the presentation of the array and the beginning of recall, the poorer was the recall. This finding supports the idea that forgetting involves trace decay, or the loss of a stimulus from a sensory register over time. Researchers debate whether the icon is actually a memory store or a persisting image. Sakitt argued that the icon is located in the rods of the eye's retina (Sakitt, 1976; Sakitt & Long, 1979). The active role of the icon in perception is diminished (but not eliminated) if the icon is a physical structure, although not all researchers agree with Sakitt's position. There is evidence for an echoic memory similar in function to iconic memory. Studies by Darwin, Turvey, and Crowder (1972) and by Moray, Bates, and Barnett (1965) yielded results comparable to Sperling's (1960). Research participants heard three or four sets of recordings simultaneously and then were asked to report one. Findings showed that echoic memory is capable of holding more information than can be recalled. Similar to iconic information, traces of echoic information rapidly decay following removal of stimuli.
The echoic decay is not quite as rapid as the iconic, but periods beyond 2 seconds between cessation of stimulus presentation and onset of recall produce poorer recall.

3.4 LTM Comparisons

Perception occurs through bottom-up and top-down processing (Matlin, 2009). In bottom-up processing, physical properties of stimuli are received by sensory registers, and that information is passed to WM for comparison with information in LTM to assign meanings. Environmental inputs have tangible physical properties. Assuming normal color vision, everyone who looks at a yellow tennis ball will recognize it as a yellow object, but only those familiar with tennis will recognize it as a tennis ball. The types of information people have acquired account for the different meanings they assign to objects. But perception is affected not only by objective characteristics but also by prior experiences and expectations. Top-down processing refers to the influence of our knowledge and beliefs on perception (Matlin, 2009). Motivational states also are important. Perception is affected by what we wish and hope to perceive. We often perceive what we expect and fail to perceive what we do not expect. Have you ever thought you heard your name spoken, only to realize that another name was being called? While waiting to meet a friend at a public place or to pick up an order in a restaurant, you may hear your name because you expect to hear it. Also, people may not perceive things whose appearance has changed or that occur out of context. You may not recognize co-workers you meet at the beach because you do not expect to see them dressed in beach attire. Top-down processing often occurs with ambiguous stimuli or those registered only briefly (e.g., a stimulus spotted in the "corner of the eye"). An information processing theory of perception is template matching, which holds that people store templates, or miniature copies of stimuli, in LTM.
When they encounter a stimulus, they compare it with existing templates and identify the stimulus if a match is found. This view is appealing but problematic. People would have to carry around millions of templates in their heads to be able to recognize everyone and everything in their environment. Such a large stock would exceed the brain's capability. Template theory also does a poor job of accounting for variations among stimuli: Chairs, for example, come in many sizes, shapes, colors, and designs; hundreds of templates would be needed just to perceive a chair. The problems with templates can be solved by assuming that they can have some variation. Prototype theory addresses this. Prototypes are abstract forms that include the basic ingredients of stimuli (Matlin, 2009; Rosch, 1973). Prototypes are stored in LTM and are compared with encountered stimuli, which are subsequently identified based on the prototype they match or resemble in form, smell, sound, and so on. Some research supports the existence of prototypes (Franks & Bransford, 1971; Posner & Keele, 1968; Rosch, 1973). A major advantage of prototypes over templates is that each stimulus has only one prototype instead of countless variations; thus, identification of a stimulus should be easier because comparing it with several templates is not necessary. One concern with prototypes deals with the amount of acceptable variability of the stimuli, or how closely a stimulus must match a prototype to be identified as an instance of that prototype. A variation of the prototype model involves feature analysis (Matlin, 2009). In this view, one learns the critical features of stimuli and stores these in LTM as images or verbal codes (Markman, 1999). When a stimulus enters the sensory register, its features are compared with memorial representations. If enough of the features match, the stimulus is identified. For a chair, the critical features may be legs, a seat, and a back. Many other features (e.g., color, size) are irrelevant.
Any exceptions to the basic features need to be learned (e.g., bleacher and beanbag chairs that have no legs). Unlike the prototype analysis, the information stored in memory is not an abstract representation of a chair but rather includes its critical features. One advantage of feature analysis is that each stimulus does not have just one prototype, which partially addresses the concern about the amount of acceptable variability. There is empirical research support for feature analysis (Matlin, 2009). Treisman (1992) proposed that perceiving an object establishes a temporary representation in an object file that collects, integrates, and revises information about its current characteristics. The contents of the file may be stored as an object token. For newly perceived objects, we try to match the token to a memorial representation (dictionary) of object types, which may or may not succeed. The next time the object appears, we retrieve the object token, which specifies its features and structure. The token will facilitate perception if all of the features match but may impair it if many do not match. Regardless of how LTM comparisons are made, research supports the idea that perception depends on bottom-up and top-down processing (Anderson, 1980; Matlin, 2009; Resnick, 1985). In reading, for example, bottom-up processing analyzes features and builds a meaningful representation to identify stimuli. Beginning readers typically use bottom-up processing when they encounter letters and new words and attempt to sound them out. People also use bottom-up processing when experiencing unfamiliar stimuli (e.g., handwriting). Reading would proceed slowly if all perception required analyzing features in detail. In top-down processing, individuals develop expectations regarding perception based on the context. Skilled readers build a mental representation of the context while reading and expect certain words and phrases in the text (Resnick, 1985).
Effective top-down processing depends on extensive prior knowledge.

4. Two-Store Memory Model

The two-store (dual) memory model serves as our basic information processing perspective on learning and memory, although as noted earlier not all researchers accept this model (Matlin, 2009). Research on verbal learning is covered next to provide a historical backdrop.

4.1. Verbal Learning

Stimulus–Response Associations. The impetus for research on verbal learning derived from the work of Ebbinghaus (Chapter 1), who construed learning as the gradual strengthening of associations between verbal stimuli (words, nonsense syllables). With repeated pairings, the response dij became more strongly connected with the stimulus wek. Other responses also could become connected with wek during learning of a list of paired nonsense syllables, but these associations became weaker over trials. Ebbinghaus showed that three important factors affecting the ease or speed with which one learns a list of items are the meaningfulness of the items, the degree of similarity between them, and the length of time separating study trials (Terry, 2009). Words (meaningful items) are learned more readily than nonsense syllables. With respect to similarity, the more alike items are to one another, the harder they are to learn. Similarity in meaning or sound can cause confusion. An individual asked to learn several synonyms such as gigantic, huge, mammoth, and enormous may fail to recall some of these but instead may recall words similar in meaning but not on the list (large, behemoth). With nonsense syllables, confusion occurs when the same letters are used in different positions (xqv, khq, vxh, qvk). The length of time separating study trials can vary from short (massed practice) to longer (distributed practice). When interference is probable (discussed later in this chapter), distributed practice yields better learning (Underwood, 1961). Learning Tasks.
Verbal learning researchers commonly employed three types of learning tasks: serial, paired-associate, and free-recall. In serial learning, people recall verbal stimuli in the order in which they were presented. Serial learning is involved in such school tasks as memorizing a poem or the steps in a problem-solving strategy. Results of many serial learning studies typically yield a serial position curve (Figure 5.2): Words at the beginning and end of the list are readily learned, whereas middle items require more trials for learning. The serial position effect may arise due to differences in the distinctiveness of the various positions. People must remember not only the items themselves but also their positions in the list. The ends of a list appear to be more distinctive and are therefore "better" stimuli than the middle positions of a list. In paired-associate learning, one stimulus is provided for one response item (e.g., cat–tree, boat–roof, bench–dog). Participants respond with the correct response upon presentation of the stimulus. Paired-associate learning has three aspects: discriminating among the stimuli, learning the responses, and learning which responses accompany which stimuli. Debate has centered on the process by which paired-associate learning occurs and the role of cognitive mediation. Researchers originally assumed that learning was incremental and that each stimulus–response association was gradually strengthened. This view was supported by the typical learning curve (Figure 5.4): The number of errors people make is high at the beginning, but errors decrease with repeated presentations of the list. Research by Estes (1970) and others suggested a different perspective. Although list learning improves with repetition, learning of any given item has an all-or-none character: The learner either knows the correct association or does not know it. Over trials, the number of learned associations increases.
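The all-or-none account can be illustrated with a short simulation (a sketch for illustration only; the parameter values are assumptions, not Estes's data). Suppose each unlearned pair is mastered with some fixed probability on each study trial and, once learned, stays learned. Even though every individual item flips from "unknown" to "known" in a single step, the error count for the whole list declines gradually, resembling the smooth incremental learning curve:

```python
import random

def simulate_all_or_none(n_items=20, p_learn=0.3, n_trials=10, seed=1):
    """Simulate all-or-none learning of a paired-associate list.

    On each trial, every still-unlearned item is mastered with
    probability p_learn; learned items remain learned. Returns the
    number of errors (unlearned items) at the start of each trial.
    """
    rng = random.Random(seed)
    learned = [False] * n_items
    errors_per_trial = []
    for _ in range(n_trials):
        errors_per_trial.append(learned.count(False))
        for i in range(n_items):
            if not learned[i] and rng.random() < p_learn:
                learned[i] = True
    return errors_per_trial

curve = simulate_all_or_none()
print(curve)  # errors never increase; the list-level curve declines gradually
```

This is why the group learning curve alone cannot distinguish incremental strengthening from all-or-none learning: aggregating many all-or-none items produces a gradual curve.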
A second issue involves cognitive mediation. Rather than simply memorizing responses, learners often impose their own organization to make material meaningful. They may use cognitive mediators to link stimulus words with their responses. For the pair cat–tree, one might picture a cat running up a tree or think of the sentence, "The cat ran up the tree." When presented with cat, one recalls the image or sentence and responds with tree. Research shows that verbal learning processes are more complex than originally believed (Terry, 2009). In free-recall learning, learners are presented with a list of items and recall them in any order. Free recall lends itself well to organization imposed to facilitate memory. Often during recall, learners group words presented far apart on the original list. Groupings often are based on similar meaning or membership in the same category (e.g., rocks, fruits, vegetables). In a classic demonstration of the phenomenon of categorical clustering, learners were presented with a list of 60 nouns, 15 each drawn from the following categories: animals, names, professions, and vegetables (Bousfield, 1953). Words were presented in scrambled order; however, learners tended to recall members of the same category together. The tendency to cluster increases with the number of repetitions of the list (Bousfield & Cohen, 1953) and with longer presentation times for items (Cofer, Bruce, & Reicher, 1966). Clustering has been interpreted in associationist terms (Wood & Underwood, 1967); that is, words recalled together tend to be associated under normal conditions, either to one another directly (e.g., pear–apple) or to a third word (fruit). A cognitive explanation is that individuals learn both the words presented and the categories of which they are members (Cooper & Monk, 1976). The category names serve as mediational cues: When asked to recall, learners retrieve category names and then their members.
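Categorical clustering can be quantified. One simple measure (a sketch using hypothetical data, not Bousfield's materials or scoring) counts how often adjacent items in a learner's recall sequence come from the same category:

```python
def clustering_score(recall_order, category_of):
    """Proportion of adjacent recall pairs drawn from the same category.

    recall_order: list of recalled words, in output order.
    category_of:  dict mapping each word to its category label.
    """
    pairs = list(zip(recall_order, recall_order[1:]))
    if not pairs:
        return 0.0
    same = sum(category_of[a] == category_of[b] for a, b in pairs)
    return same / len(pairs)

categories = {"cow": "animal", "horse": "animal", "carrot": "vegetable",
              "pea": "vegetable", "nurse": "profession", "judge": "profession"}

# Clustered recall: same-category words emitted together.
print(clustering_score(["cow", "horse", "carrot", "pea", "nurse", "judge"],
                       categories))  # 0.6
# Alternating recall: no adjacent pair shares a category.
print(clustering_score(["cow", "carrot", "nurse", "horse", "pea", "judge"],
                       categories))  # 0.0
```

A score well above what chance ordering would produce indicates that learners are retrieving by category, consistent with the mediational-cue explanation above.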
Clustering provides insight into the structure of human memory and supports the Gestalt notion that individuals organize their experiences. Verbal learning research identified the course of acquisition and forgetting of verbal material. At the same time, the idea that associations could explain learning of verbal material was simplistic. This became apparent when researchers moved beyond simple list learning to more meaningful learning from text. One might question the relevance of learning lists of nonsense syllables or words paired in arbitrary fashion. In school, verbal learning occurs within meaningful contexts, for example, word pairs (e.g., states and their capitals, English translations of foreign words), ordered phrases and sentences (e.g., poems, songs), and meanings for vocabulary words. With the advent of information processing views of learning and memory, many of the ideas propounded by verbal learning theorists were discarded or substantially modified. Researchers increasingly address learning and memory of context-dependent verbal material (Bruning, Schraw, Norby, & Ronning, 2004). We now turn to a key information processing topic—working memory.

4.2. Short-Term Memory

In the two-store model, once a stimulus is attended to and perceived, it is transferred to short-term (working) memory (STM or WM; Baddeley, 1992, 1998, 2001; Terry, 2009). WM is our memory of immediate consciousness. WM performs two critical functions: maintenance and retrieval (Unsworth & Engle, 2007). Incoming information is maintained in an active state for a short time and is worked on by being rehearsed or related to information retrieved from long-term memory (LTM). As students read a text, WM holds for a few seconds the last words or sentences they read. Students might try to remember a particular point by repeating it several times (rehearsal) or by asking how it relates to a topic discussed earlier in the book (relating to information in LTM).
As another example, assume that a student is multiplying 45 by 7. WM holds these numbers (45 and 7), along with the product of 5 and 7 (35), the number carried (3), and the answer (315). The information in WM (5 × 7 = ?) is compared with activated knowledge in LTM (5 × 7 = 35). Also activated in LTM is the multiplication algorithm, and these procedures direct the student's actions. Research has provided a reasonably detailed picture of the operation of WM. WM is limited in duration: If not acted upon quickly, information in WM decays. In a classic study (Peterson & Peterson, 1959), participants were presented with a nonsense syllable (e.g., khv), after which they performed an arithmetic task before attempting to recall the syllable. The purpose of the arithmetic task was to prevent learners from rehearsing the syllable, but because the numbers did not have to be stored, they did not interfere with storage of the syllable in WM. The longer participants spent on the distracting activity, the poorer was their recall of the nonsense syllable. These findings imply that WM is fragile; information is quickly lost if not learned well. If, for example, you are given a phone number to call but then are distracted before being able to call or write it down, you may not be able to recall it. WM also is limited in capacity: It can hold only a small amount of information. Miller (1956) suggested that the capacity of WM is seven plus or minus two items, where items are such meaningful units as words, letters, numbers, and common expressions. One can increase the amount of information by chunking, or combining information in a meaningful fashion. The phone number 555-1960 consists of seven items, but it can easily be chunked into two as follows: "Triple 5 plus the year Kennedy was elected president." Sternberg's (1969) research on memory scanning provides insight into how information is retrieved from WM.
Participants were presented rapidly with a small number of digits that did not exceed the capacity of WM. They then were given a test digit and were asked whether it was in the original set. Because the learning was easy, participants rarely made errors; however, as the original set increased from two to six items, the time to respond increased about 40 milliseconds per additional item. Sternberg concluded that people retrieve information from active memory by successively scanning items. Control (executive) processes direct the processing of information in WM, as well as the movement of knowledge into and out of WM (Baddeley, 2001). Control processes include rehearsal, predicting, checking, monitoring, and metacognitive activities (Chapter 7). Control processes are goal directed; they select information relevant to people's plans and intentions from the various sensory receptors. Information deemed important is rehearsed. Rehearsal (repeating information to oneself aloud or subvocally) can maintain information in WM and improve recall (Baddeley, 2001; Rundus, 1971; Rundus & Atkinson, 1970). Environmental or self-generated cues activate a portion of LTM, which then is more accessible to WM. This activated memory holds a representation of events occurring recently, such as a description of the context and the content. It is debatable whether active memory constitutes a separate memory store or merely an activated portion of LTM. Under the activation view, rehearsal keeps information in WM. In the absence of rehearsal, information decays with the passage of time (Nairne, 2002). High research interest in the operation of WM continues (Davelaar, Goshen-Gottstein, Ashkenazi, Haarmann, & Usher, 2005). WM plays a critical role in learning. Compared with normally achieving students, those with mathematical and reading disabilities show poorer WM operation (Andersson & Lyxell, 2007; Swanson, Howard, & Sáez, 2006).
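Sternberg's scanning result described above can be summarized as a simple linear model: response time grows by a roughly constant amount (about 40 ms in his data) for each item held in the memory set. The sketch below is illustrative only; the slope reflects the approximately 40 ms per item reported, while the intercept (time to encode the probe and respond) is an assumed placeholder value:

```python
def predicted_rt_ms(set_size, base_ms=400.0, slope_ms=40.0):
    """Predicted response time (ms) for scanning a memory set.

    base_ms:  encoding and response time; an illustrative assumption.
    slope_ms: added comparison time per item in the set, approximating
              the roughly 40 ms per item in Sternberg's (1969) data.
    """
    return base_ms + slope_ms * set_size

# Each additional item in the set adds one comparison's worth of time.
for n in range(2, 7):
    print(n, predicted_rt_ms(n))
```

The model's key signature is the constant slope: going from a two-item to a six-item set adds four comparisons, hence four times the per-item scanning cost, regardless of the intercept.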
A key instructional implication is not to overload students' WM by presenting too much material at once or too rapidly (see the section Cognitive Load later in this chapter). Where appropriate, teachers can present information visually and verbally to ensure that students retain it in WM long enough to process it further (e.g., relate it to information in LTM).

4.3. Long-Term Memory

Knowledge representation in LTM depends on frequency and contiguity (Baddeley, 1998). The more often that a fact, event, or idea is encountered, the stronger is its representation in memory. Furthermore, two experiences that occur closely in time are apt to be linked in memory, so that when one is remembered, the other is activated. Thus, information in LTM is represented in associative structures. These associations are cognitive, unlike those in conditioning theories, which are behavioral (stimuli and responses). Information processing models often use computers as analogies, but some important differences exist, which are highlighted by associative structures. Human memory is content addressable: Information on the same topic is stored together, so that knowing what is being looked for will most likely lead to recalling the information (Baddeley, 1998). In contrast, computers are location addressable: They have to be told where information is stored. The nearness of files or data sets on a hard drive to other files or data sets is purely arbitrary. Another difference is that information is stored precisely in computers. Human memory is less precise but often more colorful and informative. The name Daryl Crancake is stored in a computer's memory as "Daryl Crancake." In human memory it may be stored as "Daryl Crancake" or become distorted to "Darrell," "Darel," or "Derol," and "Cupcake," "Cranberry," or "Crabapple." A useful analogy for the human mind is a library.
Information in a library is content addressable because books on similar content are stored under similar call numbers. Information in the mind (as in the library) is also cross-referenced (Calfee, 1981). Knowledge that cuts across different content areas can be accessed through either area. For example, Amy may have a memory slot devoted to her 21st birthday. The memory includes what she did, whom she was with, and what gifts she received. These topics can be cross-referenced as follows: The jazz CDs she received as gifts are cross-referenced in the memory slot dealing with music. The fact that her next-door neighbor attended is filed in the memory slot devoted to the neighbor and neighborhood. Knowledge stored in LTM varies in its richness. Each person has vivid memories of pleasant and unpleasant experiences. These memories can be exact in their details. Other types of knowledge stored in memory are mundane and impersonal: word meanings, arithmetic operations, and excerpts from famous documents. To account for differences in memory, Tulving (1972, 1983) proposed a distinction between episodic and semantic memory. Episodic memory includes information associated with particular times and places that is personal and autobiographical. The fact that the word cat occurs in position three on a learned word list is an example of episodic information, as is information about what Amy did on her 21st birthday. Semantic memory involves general information and concepts available in the environment and not tied to a particular context. Examples include the words to the "Star-Spangled Banner" and the chemical formula for water (H2O). The knowledge, skills, and concepts learned in school are semantic memories.
The two types of memories often are combined, as when a child tells a parent, "Today in school I learned [episodic memory] that World War II ended in 1945 [semantic memory]." Researchers have explored differences between declarative and procedural memories (Gupta & Cohen, 2002). Declarative memory involves remembering new events and experiences. Information typically is stored in declarative memory quickly, and it is the memory most impaired in patients with amnesia. Procedural memory is memory for skills, procedures, and languages. Information in procedural memory is stored gradually—often with extensive practice—and may be difficult to describe (e.g., riding a bicycle). We return to this distinction shortly. Another important issue concerns the form or structure in which LTM stores knowledge. Paivio (1971) proposed that knowledge is stored in verbal and visual forms, each of which is functionally independent but interconnected. Concrete objects (e.g., dog, tree, book) tend to be stored as images, whereas abstract concepts (e.g., love, truth, honesty) and linguistic structures (e.g., grammars) are stored in verbal codes. Knowledge can be stored both visually and verbally: You may have a pictorial representation of your home and also be able to describe it verbally. Paivio postulated that for any piece of knowledge, an individual has a preferred storage mode activated more readily than the other. Dual-coded knowledge may be remembered better, which has important educational implications and confirms the general teaching principle of explaining (verbal) and demonstrating (visual) new material (Clark & Paivio, 1991). Paivio's work is discussed further under mental imagery later in this chapter. His views have been criticized on the grounds that a visual memory exceeds the brain's capacity and requires some brain mechanism to read and translate the pictures (Pylyshyn, 1973).
Some theorists contend that knowledge is stored only verbally (Anderson, 1980; Collins & Quillian, 1969; Newell & Simon, 1972; Norman & Rumelhart, 1975). Verbal models do not deny that knowledge can be represented pictorially but postulate that the ultimate code is verbal and that pictures in memory are reconstructed from verbal codes. Table 5.2 shows some characteristics and distinctions of memory systems. The associative structures of LTM are propositional networks, or interconnected sets comprising nodes or bits of information (Anderson, 1990; Calfee, 1981; see next section). A proposition is the smallest unit of information that can be judged true or false. The statement, "My 80-year-old uncle lit his awful cigar," consists of the following propositions:
■ I have an uncle.
■ He is 80 years old.
■ He lit a cigar.
■ The cigar is awful.
Various types of propositional knowledge are represented in LTM. Declarative knowledge refers to facts, subjective beliefs, scripts (e.g., events of a story), and organized passages (e.g., Declaration of Independence). Procedural knowledge consists of concepts, rules, and algorithms. The declarative–procedural distinction also is referred to as explicit and implicit knowledge (Sun, Slusarz, & Terry, 2005). Declarative and procedural knowledge are discussed in this chapter. Conditional knowledge is knowing when to employ forms of declarative and procedural knowledge and why it is beneficial to do so (Gagné, 1985; Paris, Lipson, & Wixson, 1983; Chapter 7). Information processing theories contend that learning can occur in the absence of overt behavior because learning involves the formation or modification of propositional networks; however, overt performance typically is required to ensure that students have acquired skills.
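The decomposition of "My 80-year-old uncle lit his awful cigar" into propositions can be sketched as a small data structure. The (relation, subject, argument) tuple format below is an illustrative simplification of propositional notation, not a representation taken from the theories cited:

```python
# The cigar sentence decomposed into propositions, each the smallest
# unit that can be judged true or false. Format: (relation, subject,
# argument); None marks a missing argument. Illustrative only.

sentence_propositions = [
    ("have", "I", "uncle"),
    ("age", "uncle", 80),
    ("lit", "uncle", "cigar"),
    ("awful", "cigar", None),
]

def is_proposition(p):
    """A proposition here is a relation plus its arguments."""
    relation, subject, argument = p
    return isinstance(relation, str) and subject is not None

print(all(is_proposition(p) for p in sentence_propositions))
print(len(sentence_propositions))  # four propositions, as in the text
```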
Research on skilled actions (e.g., solving mathematical problems) shows that people typically execute behaviors according to a sequence of planned segments (Ericsson et al., 1993; Fitts & Posner, 1967; VanLehn, 1996). Individuals select a performance routine they expect will produce the desired outcome, periodically monitor their performances, make necessary corrections, and alter their performances following corrective feedback. Because performances often need to vary to fit contextual demands, people find it helpful to practice adapting skills to different situations. Transfer (Chapter 7) depends on the links between propositions in memory: Information must be cross-referenced, or the uses of information must be stored along with it. Students understand that skills and concepts are applicable in different domains if that knowledge is stored in the respective networks. Teaching students how information is applicable in different contexts helps ensure that appropriate transfer occurs.

4.4 Influences on Encoding

Encoding is the process of putting new (incoming) information into the information processing system and preparing it for storage in LTM. Encoding usually is accomplished by making new information meaningful and integrating it with known information in LTM. Although information need not be meaningful to be learned—one unfamiliar with geometry could memorize the Pythagorean theorem without understanding what it means—meaningfulness improves learning and retention. Attending to and perceiving stimuli do not ensure that information processing will continue. Many things teachers say in class go unlearned (even though students attend to the teacher and the words are meaningful) because students do not continue to process the information. Important factors that influence encoding are organization, elaboration, and schema structures.

Organization.
Gestalt theory and research showed that well-organized material is easier to learn and recall (Katona, 1940). Miller (1956) argued that learning is enhanced by classifying and grouping bits of information into organized chunks. Memory research demonstrates that even when items to be learned are not organized, people often impose organization on the material, which facilitates recall (Matlin, 2009). Organized material improves memory because items are linked to one another systematically. Recall of one item prompts recall of items linked to it. Research supports the effectiveness of organization for encoding among children and adults (Basden, Basden, Devecchio, & Anders, 1991). One way to organize material is to use a hierarchy into which pieces of information are integrated. Figure 5.5 shows a sample hierarchy for animals. The animal kingdom as a whole is on top, and underneath are the major categories (e.g., mammals, birds, reptiles). Individual species are found on the next level, followed by breeds. Other ways of organizing information include the use of mnemonic techniques (Chapter 7) and mental imagery (discussed later in this chapter). Mnemonics enable learners to enrich or elaborate material, such as by forming the first letters of words to be learned into an acronym, familiar phrase, or sentence (Matlin, 2009). Some mnemonic techniques employ imagery; in remembering two words (e.g., honey and bread), one might imagine them interacting with each other (honey on bread). Using audiovisuals in instruction can improve students' imagery.

Elaboration. Elaboration is the process of expanding upon new information by adding to it or linking it to what one knows. Elaborations assist encoding and retrieval because they link the to-be-remembered information with other knowledge. Recently learned information is easier to access in this expanded memory network. Even when the new information is forgotten, people often can recall the elaborations (Anderson, 1990).
A problem that many students (not just the ones discussed in the introductory scenario) have in learning algebra is that they cannot elaborate the material because it is abstract and does not easily link with other knowledge. Rehearsing information keeps it in WM but does not necessarily elaborate it. A distinction can be drawn between maintenance rehearsal (repeating information over and over) and elaborative rehearsal (relating the information to something already known). Students learning U.S. history can simply repeat "D-Day was June 6, 1944," or they can elaborate it by relating it to something they know (e.g., in 1944 Roosevelt was elected president for the fourth time). Mnemonic devices elaborate information in different ways. One such device is to form the first letters into a meaningful sentence. For example, to remember the order of the planets from the sun, you might learn the sentence, "My very educated mother just served us nine pizzas," in which the first letters correspond to those of the planets (Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune, Pluto). You first recall the sentence and then reconstruct planetary order from the first letters. Students may be able to devise elaborations, but if they cannot, they need not labor needlessly; teachers can provide effective elaborations. To assist storage in memory and retrieval, elaborations must make sense. Elaborations that are too unusual may not be remembered. Precise, sensible elaborations facilitate memory and recall (Bransford et al., 1982; Stein, Littlefield, Bransford, & Persampieri, 1984).

Schemas. A schema (plural schemas or schemata) is a structure that organizes large amounts of information into a meaningful system. Schemas include our generalized knowledge about situations (Matlin, 2009). Schemas are plans we learn and use during our environmental interactions.
Larger units are needed to organize propositions representing bits of information into a coherent whole (Anderson, 1990). Schemas assist us in generating and controlling routine sequential actions (Cooper & Shallice, 2006). In an early study, Bartlett (1932) found that schemas aid in comprehending information. In this experiment, a participant read a story about an unfamiliar culture, after which this person reproduced it for a second participant, who reproduced it for a third participant, and so on. By the time the story reached the 10th person, its unfamiliar context had been changed to one that participants were familiar with (e.g., a fishing trip). Bartlett found that as stories were repeated, they changed in predictable ways. Unfamiliar information was dropped, a few details were retained, and the stories became more like participants' experiences. They altered incoming information to fit their preexisting schemas. Any well-ordered sequence can be represented as a schema. One type of schema is "going to a restaurant." The steps consist of activities such as being seated at a table, looking over a menu, ordering food, being served, having dishes picked up, receiving a bill, leaving a tip, and paying the bill. Schemas are important because they indicate what to expect in a situation. People recognize a problem when reality and schema do not match. Have you ever been in a restaurant where one of the expected steps did not occur (e.g., you received a menu but no one returned to your table to take your order)? Common educational schemas involve laboratory procedures, studying, and comprehending stories. When given material to read, students activate the type of schema they believe is required. If students are to read a passage and answer questions about main ideas, they may periodically stop and quiz themselves on what they believe are the main points (Resnick, 1985).
Schemas have been used extensively in research on reading and writing (McVee, Dunsmore, & Gavelek, 2005). Schemas assist encoding because they elaborate new material into a meaningful structure. When learning material, students attempt to fit information into the schema's spaces. Less important or optional schema elements may or may not be learned. In reading works of literature, students who have formed the schema for a tragedy can easily fit the characters and actions of the story into the schema. They expect to find elements such as good versus evil, human frailties, and a dramatic denouement. When these events occur, they are fit into the schema students have activated for the story (Application 5.2). Schemas may facilitate recall independently of their benefits on encoding. Anderson and Pichert (1978) presented college students with a story about two boys skipping school. Students were advised to read it from the perspective of either a burglar or a home buyer; the story had elements relevant to both. Students recalled the story and later recalled it a second time. For the second recall, half of the students were advised to use their original perspective and the other half the other perspective. On the second recall, students recalled additional information relevant to the second perspective that they had not recalled the first time, and they recalled less information that was unimportant to the second perspective but important to the first. Kardash, Royer, and Greene (1988) also found that schemas exerted their primary benefits at the time of recall rather than at encoding. Collectively, these results suggest that at retrieval, people recall a schema and attempt to fit elements into it. This reconstruction may not be accurate but will include most schema elements. Production systems, which are discussed later, bear some similarity to schemas.

5. Long-Term Memory: Storage

This section discusses information storage in LTM.
Although our knowledge about LTM is limited because we do not have a window into the brain, research has painted a reasonably consistent picture of the storage process. The characterization of LTM in this chapter involves a structure with knowledge being represented as locations or nodes in networks, with networks connected (associated) with one another. Note the similarity between these cognitive networks and the neural networks discussed in Chapter 2. When discussing networks, we deal primarily with declarative knowledge and procedural knowledge. Conditional knowledge is covered in Chapter 7, along with metacognitive activities that monitor and direct cognitive processing. It is assumed that most knowledge is stored in LTM in verbal codes, but the role of imagery also is addressed at the end of this chapter.

Propositions

The Nature of Propositions. A proposition is the smallest unit of information that can be judged true or false. Propositions are the basic units of knowledge and meaning in LTM (Anderson, 1990; Kosslyn, 1984; Norman & Rumelhart, 1975). Each of the following is a proposition:
■ The Declaration of Independence was signed in 1776.
■ 2 + 2 = 4.
■ Aunt Frieda hates turnips.
■ I'm good in math.
■ The main characters are introduced early in a story.
These sample propositions can be judged true or false. Note, however, that people may disagree on their judgments. Carlos may believe that he is bad in math, but his teacher may believe that he is very good. The exact nature of propositions is not well understood. Although they can be thought of as sentences, it is more likely that they are the meanings of sentences (Anderson, 1990). Research supports the point that we store information in memory as propositions rather than as complete sentences. Kintsch (1974) gave participants sentences to read that were of the same length but varied in the number of propositions they contained.
The more propositions contained in a sentence, the longer it took participants to comprehend it. This implies that, although students can generate the sentence, "The Declaration of Independence was signed in 1776," what they most likely have stored in memory is a proposition containing only the essential information (Declaration of Independence—signed—1776). With certain exceptions (e.g., memorizing a poem), it seems that people usually store meanings rather than precise wordings. Propositions form networks that are composed of individual nodes or locations. Nodes can be thought of as individual words, although their exact nature is unknown and probably abstract. For example, students taking a history class likely have a "history class" network comprising such nodes as "book," "teacher," "location," "name of student who sits on their left," and so forth.

Propositional Networks. Propositions are formed according to a set of rules. Researchers disagree on which rules constitute the set, but they generally believe that rules combine nodes into propositions and, in turn, propositions into higher-order structures or networks, which are sets of interrelated propositions. Anderson's ACT theory (Anderson, 1990, 1993, 1996, 2000; Anderson et al., 2004; Anderson, Reder, & Lebiere, 1996) proposes an ACT-R (Adaptive Control of Thought-Rational) network model of LTM with a propositional structure. ACT-R is a model of cognitive architecture that attempts to explain how all components of the mind work together to produce coherent cognition (Anderson et al., 2004). A proposition is formed by combining two nodes with a subject–predicate link, or association; one node constitutes the subject and another node the predicate.
Examples are (implied information in parentheses): "Fred (is) rich" and "Shopping (takes) time." A second type of association is the relation–argument link, in which the relation is a verb (in meaning) and the argument is the recipient of the relation, or what is affected by it. Examples are "eat cake" and "solve puzzles." Relation–argument links can serve as subjects or predicates to form complex propositions. Examples are "Fred eat(s) cake" and "solv(ing) puzzles (takes) time." Propositions are interrelated when they share a common element. Common elements allow people to solve problems, cope with environmental demands, draw analogies, and so on. Without common elements, transfer would not occur; all knowledge would be stored separately, and information processing would be slow. One would not recognize that knowledge relevant to one domain is also relevant to other domains. Figure 5.6 shows an example of a propositional network. The common element is "cat" because it is part of the propositions, "The cat walked across the front lawn," and "The cat caught a mouse." One can imagine that the former proposition is linked with other propositions relating to one's house, whereas the latter is linked with propositions about mice. Evidence suggests that propositions are organized in hierarchical structures. Collins and Quillian (1969) showed that people store information at the highest level of generality. For example, the LTM network for "animal" would have stored at the highest level such facts as "moves" and "eats." Under this category would come such categories as "birds" and "fish." Stored under "birds" are "has wings," "can fly," and "has feathers" (although there are exceptions—chickens are birds, but they do not fly). The fact that birds eat and move is not stored at the level of "bird" because that information is stored at the higher level of "animal." Collins and Quillian found that retrieval times increased the farther apart concepts were stored in memory.
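The Collins and Quillian storage scheme can be sketched as a small hierarchy in which each property is stored once, at its most general level, and verification walks upward; more levels crossed means slower retrieval. The property sets follow the text's animal/bird example; the "canary" node and the step counts are illustrative:

```python
# Hierarchical storage with "cognitive economy": each property is
# stored at the most general level that has it. Retrieval walks up
# the hierarchy; the number of levels traversed models retrieval time.

hierarchy = {
    "animal": {"parent": None,     "properties": {"moves", "eats"}},
    "bird":   {"parent": "animal", "properties": {"has wings", "can fly", "has feathers"}},
    "canary": {"parent": "bird",   "properties": {"sings", "is yellow"}},
}

def lookup(concept, prop):
    """Return the number of levels crossed to verify the property,
    or None if it is not found anywhere up the chain."""
    steps = 0
    node = concept
    while node is not None:
        if prop in hierarchy[node]["properties"]:
            return steps
        node = hierarchy[node]["parent"]
        steps += 1
    return None

# "A canary sings" (0 levels) verifies faster than "a canary can fly"
# (1 level) or "a canary eats" (2 levels), mirroring the finding that
# retrieval time grows with distance in the stored hierarchy.
print(lookup("canary", "sings"), lookup("canary", "can fly"), lookup("canary", "eats"))
```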
The hierarchical organization idea has been modified by research showing that information is not always stored hierarchically. Thus, "collie" is closer to "mammal" than to "animal" in an animal hierarchy, but people are quicker to agree that a collie is an animal than to agree that it is a mammal (Rips, Shoben, & Smith, 1973). Furthermore, familiar information may be stored both with its concept and at the highest level of generality (Anderson, 1990). If you have a bird feeder and often watch birds eating, you might have "eat" stored with both "birds" and "animals." This finding does not detract from the central idea that propositions are organized and interconnected. Although some knowledge may be hierarchically organized, much information is probably organized less systematically in propositional networks.

5.1 Storage of Knowledge

Declarative Knowledge. Declarative knowledge (knowing that something is the case) includes facts, beliefs, opinions, generalizations, theories, hypotheses, and attitudes about oneself, others, and world events (Gupta & Cohen, 2002; Paris et al., 1983). It is acquired when a new proposition is stored in LTM, usually in a related propositional network (Anderson, 1990). ACT theory postulates that declarative knowledge is represented in chunks comprising the basic information plus related categories (Anderson, 1996; Anderson, Reder, & Lebiere, 1996). The storage process operates as follows. First, the learner receives new information, such as when the teacher makes a statement or the learner reads a sentence. Next, the new information is translated into one or more propositions in the learner's WM. At the same time, related propositions in LTM are cued. The new propositions are associated with the related propositions in WM through the process of spreading activation (discussed in the following section). At this point, learners might generate additional propositions.
Finally, all the new propositions—those received and those generated by the learner—are stored together in LTM (Hayes-Roth & Thorndyke, 1979). Figure 5.7 illustrates this process. Assume that a teacher is presenting a unit on the U.S. Constitution and says to the class, "The vice president of the United States serves as president of the Senate but does not vote unless there is a tie." This statement may cue other propositional knowledge stored in students' memories relating to the vice president (e.g., elected with the president, becomes president when the president dies or resigns, can be impeached for crimes of treason) and the Senate (e.g., 100 members, two elected from each state, 6-year terms). Putting these propositions together, the students should infer that the vice president would vote if 50 senators voted for a bill and 50 voted against it. Storage problems can occur when students have no preexisting propositions with which to link new information. Students who have not heard of the U.S. Constitution and do not know what a constitution is will draw a blank when they hear the word for the first time. Conceptually meaningless information can be stored in LTM, but students learn better when new information is related to something they know. Showing students a facsimile of the U.S. Constitution or relating it to something they have studied (e.g., the Declaration of Independence) gives them a referent to link with the new information. Even when students have studied related material, they may not automatically link it with new information. Often the links need to be made explicit. When discussing the function of the vice president in the Senate, teachers could remind students of the composition of the U.S. Senate and the other roles of the vice president. Propositions sharing a common element are linked in LTM only if they are active in WM simultaneously.
This point helps to explain why students might fail to see how new material relates to old material, even though the link is clear to the teacher. Instruction that best establishes propositional networks in learners' minds incorporates review, organization of material, and reminders of things students know but are not thinking of at the moment. As with many memory processes, meaningfulness, organization, and elaboration facilitate storing information in memory. Meaningfulness is important because meaningful information can be easily associated with preexisting information in memory. Consequently, less rehearsal is necessary, which conserves WM space and processing time. The students discussed in the opening scenario are having a problem making algebra meaningful, and the teachers express their frustration at not teaching the content in a meaningful fashion. A study by Bransford and Johnson (1972) provides a dramatic illustration of the role of meaningfulness in storage and comprehension. Consider the following passage:

The procedure is actually quite simple. First you arrange things into different groups. Of course, one pile may be sufficient depending on how much there is to do. If you have to go somewhere else due to lack of facilities that is the next step, otherwise you are pretty well set. It is important not to overdo things. That is, it is better to do too few things at once than too many. In the short run this may not seem important, but complications can easily arise. A mistake can be expensive as well. At first the whole procedure will seem complicated. Soon, however, it will become just another facet of life. It is difficult to foresee any end to the necessity for this task in the immediate future, but then one never can tell. After the procedure is completed one arranges the materials into different groups again. Then they can be put into their appropriate places. Eventually they will be used once more and the whole cycle will then have to be repeated.
However, that is part of life. (p. 722)

Without prior knowledge this passage is difficult to comprehend and store in memory because relating it to existing knowledge in memory is hard to do. However, knowing that it is about "washing clothes" makes remembering and comprehension easier. Bransford and Johnson found that students who knew the topic recalled about twice as much as those who were unaware of it. The importance of meaningfulness in learning has been demonstrated in numerous other studies (Anderson, 1990; Chiesi, Spilich, & Voss, 1979; Spilich, Vesonder, Chiesi, & Voss, 1979). Organization facilitates storage because well-organized material is easier to relate to preexisting memory networks than is poorly organized material (Anderson, 1990). To the extent that material can be organized into a hierarchical arrangement, it provides a ready structure to be accepted into LTM. Without an existing LTM network, creating a new one is easier with well-organized information than with poorly organized information. Elaboration, or the process of adding information to material to be learned, improves storage because by elaborating information learners may be able to relate it to something they know. Through spreading activation, the elaborated material may be quickly linked with information in memory. For example, a teacher might be discussing the Mt. Etna volcano. Students who can elaborate that knowledge by relating it to their personal knowledge of volcanoes (e.g., Mt. St. Helens) will be able to associate the new and old information in memory and better retain the new material.

Spreading Activation. Spreading activation helps to explain how new information is linked to knowledge in LTM (Anderson, 1983, 1984, 1990, 2000; Collins & Loftus, 1975). The basic underlying principles are as follows (Anderson, 1984):
■ Human knowledge can be represented as a network of nodes, where nodes correspond to concepts and links to associations among these concepts.
■ The nodes in this network can be in various states that correspond to their levels of activation. More active nodes are processed "better."
■ Activation can spread along these network paths by a mechanism whereby nodes can cause their neighboring nodes to become active. (p. 61)
Anderson (1990) cites the example of an individual presented with the word dog. This word is associatively linked with such other concepts in the individual's LTM as bone, cat, and meat. In turn, each of these concepts is linked to other concepts. The activation of dog in LTM will spread beyond dog to linked concepts, with the spread lessening for concepts farther away from dog. Experimental support for the existence of spreading activation was obtained by Meyer and Schvaneveldt (1971). These investigators used a reaction time task that presented participants with two strings of letters and asked them to decide whether both were words. Words associatively linked (bread, butter) were recognized faster than words not linked (nurse, butter). Spreading activation results in a larger portion of LTM being activated than just the knowledge immediately associated with the content of WM. Activated information stays in LTM unless it is deliberately accessed, but this information is more readily accessible to WM. Spreading activation also facilitates transfer of knowledge to different domains. Transfer depends on propositional networks in LTM being activated by the same cue, so that students recognize that the knowledge is applicable in those domains.

Schemas. Propositional networks represent small pieces of knowledge. Schemas (or schemata) are large networks that represent the structure of objects, persons, and events (Anderson, 1990). Structure is represented with a series of "slots," each of which corresponds to an attribute. In the schema for houses, some attributes (and their values) might be as follows: material (wood, brick), contents (rooms), and function (human dwelling).
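Anderson's (1984) spreading-activation principles quoted above can be simulated in a few lines. The network below uses the dog/bone/cat/meat example from the text; the decay factor, the threshold, and the breadth-first mechanics are illustrative assumptions rather than parameters from the theory:

```python
# Minimal spreading-activation sketch: activation starts at a cued
# node and spreads to neighbors, weakening at each step.

network = {
    "dog":  ["bone", "cat", "meat"],
    "bone": ["meat"],
    "cat":  ["mouse"],
    "meat": [],
    "mouse": [],
}

def spread(source, decay=0.5, threshold=0.1):
    """Breadth-first spread of activation from a cued concept."""
    activation = {source: 1.0}
    frontier = [source]
    while frontier:
        next_frontier = []
        for node in frontier:
            passed = activation[node] * decay
            if passed < threshold:   # spread dies out below threshold
                continue
            for neighbor in network[node]:
                if passed > activation.get(neighbor, 0.0):
                    activation[neighbor] = passed
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return activation

acts = spread("dog")
# Concepts farther from "dog" receive weaker activation.
print(sorted(acts.items(), key=lambda kv: -kv[1]))
```

Cuing "dog" activates its direct associates strongly and more distant concepts (such as "mouse", reached through "cat") more weakly, mirroring the lessening spread described in the text.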
Schemas are hierarchical; they are joined to superordinate ideas (building) and subordinate ones (roof). Brewer and Treyens (1981) found research support for the underlying nature of schemas. Individuals were asked to wait in an office for a brief period, after which they were brought into a room where they wrote down everything they could recall about the office. Recall reflected the strong influence of a schema for "office." They correctly recalled the office having a desk and a chair (typical attributes) but not that the office contained a skull (nontypical attribute). Books are a typical attribute of offices; although the office had no books, many persons incorrectly recalled books. Schemas are important during teaching and for transfer (Matlin, 2009). Once students learn a schema, teachers can activate this knowledge when they teach any content to which the schema is applicable. Suppose an instructor teaches a general schema for describing geographical formations (e.g., mountain, volcano, glacier, river). The schema might contain the following attributes: height, material, and activity. Once students learn the schema, they can employ it to categorize new formations they study. In so doing, they would create new schemata for the various formations.

Procedural Knowledge. Procedural knowledge, or knowledge of how to perform cognitive activities (Anderson, 1990; Gupta & Cohen, 2002; Hunt, 1989; Paris et al., 1983), is central to much school learning. We use procedural knowledge to solve mathematical problems, summarize information, skim passages, and perform laboratory techniques. Procedural knowledge may be stored as verbal codes and images, much the same way as declarative knowledge is stored. ACT theory posits that procedural knowledge is stored as a production system (Anderson, 1996; Anderson, Reder, & Lebiere, 1996).
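The slot structure of schemas, such as the geographical-formation schema just described, can be sketched as a small data structure. The slot names (height, material, activity) follow the text; the fillers and the default value are illustrative:

```python
# A schema as a structure of slots (attributes) that get filled in
# when the schema is applied to a particular case. Illustrative.

formation_schema = ["height", "material", "activity"]

def instantiate(schema, **fillers):
    """Fill a schema's slots; unfilled slots keep a default expectation."""
    return {slot: fillers.get(slot, "unknown") for slot in schema}

volcano = instantiate(formation_schema,
                      height="tall", material="rock/lava", activity="erupts")
river = instantiate(formation_schema, material="water", activity="flows")

print(volcano)
print(river)  # the unfilled "height" slot keeps its default value
```

The default for an unfilled slot loosely parallels the Brewer and Treyens finding: when a slot is not filled by observation, people tend to supply a typical value.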
A production system (or production) is a network of condition–action sequences, or rules (Anderson, 1990; Andre, 1986; see next section). Production systems seem conceptually similar to neural networks (discussed in Chapter 2). Production Systems and Connectionist Models Production systems and connectionist models provide paradigms for examining the operation of cognitive learning processes (Anderson, 1996, 2000; Smith, 1996). Connectionist models represent a relatively new perspective on cognitive learning. To date, there is little research on connectionist models that is relevant to education. Additional sources provide further information about connectionist models (Bourne, 1992; Farnham-Diggory, 1992; Matlin, 2009; Siegler, 1989). Production Systems. ACT—an activation theory—specifies that a production system (or production) is a network of condition–action sequences (rules), in which the condition is the set of circumstances that activates the system and the action is the set of activities that occurs (Anderson, 1990, 1996, 2000; Anderson, Reder, & Lebiere, 1996; Andre, 1986). A production consists of if–then statements: The if statements (the condition) include the goal and test statements; the then statements are the actions. As an example: ■ IF I see two numbers and they must be added, ■ THEN decide which is larger and start with that number and count up to the next one. (Farnham-Diggory, 1992, p. 113) Although productions are forms of procedural knowledge that can have conditions (conditional knowledge) attached to them, they also include declarative knowledge. Learning procedures for performing skills often occurs slowly (J. Anderson, 1982). First, learners represent a sequence of actions in terms of declarative knowledge. Each step in the sequence is represented as a proposition.
Learners gradually drop out individual cues and integrate the separate steps into a continuous sequence of actions. For example, children learning to add a column of numbers are apt initially to perform each step slowly, possibly even verbalizing it aloud. As they become more skillful, adding becomes part of an automatic, smooth sequence that occurs rapidly and without deliberate, conscious attention. Automaticity is a central feature of many cognitive processes (e.g., attention, retrieval) (Moors & De Houwer, 2006). When processes become automatic, the processing system can devote itself to the complex parts of tasks (Chapter 7). A major constraint on skill learning is the size limitation of WM (Baddeley, 2001). Procedures would be learned more quickly if WM could simultaneously hold all the declarative knowledge propositions. Because it cannot, students must combine propositions slowly and periodically stop and think (e.g., “What do I do next?”). WM contains insufficient space to create large procedures in the early stages of learning. As propositions are combined into small procedures, the latter are stored in WM simultaneously with other propositions. In this fashion, larger productions are gradually constructed. These ideas explain why skill learning proceeds faster when students can perform the prerequisite skills (i.e., when they become automatic). When the latter exist as well-established productions, they are activated in WM at the same time as new propositions to be integrated. In learning to solve long-division problems, students who know how to multiply simply recall the procedure when necessary; it does not have to be learned along with the other steps in long division. Although this does not seem to be the problem in the opening scenario, learning algebra is difficult for students with basic skill deficiencies (e.g., addition, multiplication), because even simple algebra problems become difficult to answer correctly.
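The condition–action structure of a production can be sketched with the counting-up addition rule quoted earlier. The rule encoding and control loop below are illustrative assumptions of mine, not ACT's actual machinery:

```python
# A toy production system: each production pairs a condition (when to
# fire) with an action (what to do). This one implements "IF two numbers
# must be added, THEN start with the larger and count up."

def counting_up(state):
    a, b = state["numbers"]
    total = max(a, b)
    for _ in range(min(a, b)):   # count up by ones from the larger number
        total += 1
    state["answer"] = total

productions = [
    (lambda s: s.get("goal") == "add" and "answer" not in s, counting_up),
]

def run(state):
    fired = True
    while fired:                 # keep firing until no condition matches
        fired = False
        for condition, action in productions:
            if condition(state):
                action(state)
                fired = True
    return state

print(run({"goal": "add", "numbers": (4, 7)})["answer"])  # 11
```

In a full ACT-style system many rules would compete, with conflict resolution deciding which fires; the single-rule loop here only illustrates the condition–action cycle.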
Children with reading disabilities seem to lack the capability to effectively process and store information at the same time (de Jong, 1998). In some cases, specifying the steps in detail is difficult. For example, thinking creatively may not follow the same sequence for each student. Teachers can model creative thinking to include such self-questions as, “Are there any other possibilities?” Whenever steps can be specified, teacher demonstrations of the steps in a procedure, followed by student practice, are effective (Rosenthal & Zimmerman, 1978). One problem with the learning of procedures is that students might view them as lockstep sequences to be followed regardless of whether they are appropriate. Gestalt psychologists showed how functional fixedness, or an inflexible approach to a problem, hinders problem solving (Duncker, 1945; Chapter 7). Adamantly following a sequence while learning may assist its acquisition, but learners also need to understand the circumstances under which other methods are more efficient. Sometimes students overlearn skill procedures to the point that they avoid using alternative, easier procedures. At the same time, there are few, if any, alternatives for many of the procedures students learn (e.g., decoding words, adding numbers, determining subject–verb agreement). Overlearning these skills to the point of automatic production becomes an asset to students and makes it easier to learn new skills (e.g., drawing inferences, writing term papers) that require mastery of these basic skills. One might argue that teaching problem-solving or inference skills to students who are deficient in basic mathematical facts and decoding skills, respectively, makes little sense. Research shows that poor grasp of basic number facts is related to low performance on complex arithmetic tasks (Romberg & Carpenter, 1986), and slow decoding relates to poor comprehension (Calfee & Drum, 1986; Perfetti & Lesgold, 1979).
Not only is skill learning affected, but self-efficacy (Chapter 4) suffers as well. Practice is essential to instate basic procedural knowledge (Lesgold, 1984). In the early stages of learning, students require corrective feedback highlighting the portions of the procedure they implemented correctly and those requiring modification. Often students learn some parts of a procedure but not others. As students gain skill, teachers can point out their progress in solving problems more quickly or more accurately. Transfer of procedural knowledge occurs when the knowledge is linked in LTM with different content. Transfer is aided by having students apply the procedures to the different content and altering the procedures as necessary. General problem-solving strategies (Chapter 7) are applicable to varied academic content. Students learn about their generality by applying them to different subjects (e.g., reading, mathematics). Productions are relevant to cognitive learning, but several issues need to be addressed. ACT theory posits a single set of cognitive processes to account for diverse phenomena (Matlin, 2009). This view conflicts with other cognitive perspectives that delineate different processes depending on the type of learning (Shuell, 1986). Rumelhart and Norman (1978) identified three types of learning. Accretion involves encoding new information in terms of existing schemata; restructuring (schema creation) is the process of forming new schemata; and tuning (schema evolution) refers to the slow modification and refinement of schemata that occurs when using them in various contexts. These involve different amounts of practice: much for tuning and less for accretion and restructuring. ACT is essentially a computer program designed to simulate learning in a coherent manner. As such, it may not address the range of factors involved in human learning.
One issue concerns how people know which production to use in a given situation, especially if situations lend themselves to different productions being employed. Productions may be ordered in terms of likelihood, but a means for deciding which production is best given the circumstances must be available. Also of concern is the issue of how productions are altered. For example, if a production does not work effectively, do learners discard it, modify it, or retain it but seek more evidence? What is the mechanism for deciding when and how productions are changed? Another concern relates to Anderson’s (1983, 1990) claim that productions begin as declarative knowledge. This assumption seems too strong given evidence that this sequence is not always followed (Hunt, 1989). Because representing skill procedures as pieces of declarative knowledge is essentially a way station along the road to mastery, one might question whether students should learn the individual steps. Because the individual steps eventually will not be used, time may be better spent having students practice the procedure itself. Providing students with a list of steps they can refer to as they gradually develop a procedure facilitates learning and enhances self-efficacy (Schunk, 1995). Finally, one might question whether production systems, as generally described, are nothing more than elaborate stimulus–response (S-R) associations (Mayer, 1992). Propositions (bits of procedural knowledge) become linked in memory so that when one piece is cued, others also are activated. Anderson (1983) acknowledged the associationist nature of productions but believes they are more advanced than simple S-R associations because they incorporate goals. In support of this point, ACT associations are analogous to neural network connections (Chapter 2). Perhaps, as is the case with behaviorist theories, ACT can explain performance better than it can explain learning.
These and other questions (e.g., the role of motivation) need to be addressed by research and related to learning of academic skills to better establish the usefulness of productions in education. Connectionist Models. A line of recent theorizing about complex cognitive processes involves connectionist models (or connectionism, but not to be confused with Thorndike’s connectionism discussed in Chapter 3; Baddeley, 1998; Farnham-Diggory, 1992; Smith, 1996). Like productions, connectionist models represent computer simulations of learning processes. These models link learning to neural system processing where impulses fire across synapses to form connections (Chapter 2). The assumption is that higher-order cognitive processes are formed by connecting a large number of basic elements such as neurons (Anderson, 1990, 2000; Anderson, Reder, & Lebiere, 1996; Bourne, 1992). Connectionist models include distributed representations of knowledge (i.e., spread out over a wide network), parallel processing (many operations occur at once), and interactions among large numbers of simple processing units (Siegler, 1989). Connections may be at different stages of activation (Smith, 1996) and linked to input into the system, output, or one or more in-between layers. Rumelhart and McClelland (1986) described a system of parallel distributed processing (PDP). This model is useful for making categorical judgments about information in memory. These authors provided an example involving two gangs and information about gang members, including age, education, marital status, and occupation. In memory, the similar characteristics of each individual are linked. For example, Members 2 and 5 would be linked if they both were about the same age, married, and engaged in similar gang activities. To retrieve information about Member 2, we could activate the memory unit with the person’s name, which in turn would activate other memory units.
The pattern created through this spread of activation corresponds to the memory representation for the individual. Borowsky and Besner (2006) described a PDP model for making lexical decisions (e.g., deciding whether a stimulus is a word). Connectionist units bear some similarity to productions in that both involve memory activation and linked ideas. At the same time, differences exist. In connectionist models all units are alike, whereas productions contain conditions and actions. Units are differentiated in terms of pattern and degree of activation. Another difference concerns rules. Productions are governed by rules; connectionism has no set rules. Neurons “know” how to activate patterns; after the fact we may provide a rule as a label for the sequence (e.g., rules for naming patterns activated; Farnham-Diggory, 1992). One problem with the connectionist approach is explaining how the system knows which of the many units in memory to activate and how these multiple activations become linked in integrated sequences. This process seems straightforward in the case of well-established patterns; for example, neurons know how to react to a ringing telephone, a cold wind, and a teacher announcing, “Everyone pay attention!” With less-established patterns the activations may be problematic. We also might ask how neurons become self-activating in the first place. This question is important because it helps to explain the role of connections in learning and memory. Although the notion of connections seems plausible and grounded in what we know about neurological functioning (Chapter 2), to date this model has been more useful in explaining perception than learning and problem solving (Mayer, 1992). The latter applications require considerable research.

6. Long-term memory – retrieval and forgetting

6.1 Retrieval

Retrieval Strategies.
What happens when a student is asked a question such as, “What does the vice president of the United States do in the Senate?” The question enters the student’s WM and is broken into propositions. The process by which this occurs has a neurological basis and is not well understood, but available evidence indicates that the question’s propositions activate associated information in memory networks through spreading activation, and the activated propositions are examined to determine whether they answer the question. If they do, that information is translated into a sentence and verbalized to the questioner or into motor patterns to be written. If the activated propositions do not answer the query, activation spreads until the answer is located. When insufficient time is available for spreading activation to locate the answer, students may make an educated guess (Anderson, 1990). Much cognitive processing occurs automatically. We routinely remember our home address and phone number, Social Security number, and close friends’ names. People are often unaware of all the steps taken to answer a question. However, when people must judge several activated propositions to determine whether the propositions properly answer the question, they are more aware of the process. Because knowledge is encoded as propositions, retrieval proceeds even though the information to be retrieved does not exist in exact form in memory. If a teacher asks whether the vice president would vote on a bill when the initial vote was 51 for and 49 against, students could retrieve the proposition that the vice president votes only in the event of a tie. By implication, the vice president would not vote. Processing like this, which involves construction, takes longer than when a question requires information coded in memory in the same form, but students should respond correctly assuming they activate the relevant propositions in LTM.
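The retrieval process just described, in which activation spreads outward from cue concepts until a linked proposition is found, can be sketched as a toy graph. The network contents, decay factor, and two-step spread below are invented for illustration; they are not claims about actual memory parameters:

```python
# A toy semantic network: each node lists its neighbors. Cue nodes are
# activated, activation spreads outward with decay, and the most active
# non-cue node is "retrieved".
network = {
    "vice president": ["presides over Senate", "votes to break ties"],
    "Senate": ["presides over Senate", "passes bills"],
    "presides over Senate": [],
    "votes to break ties": [],
    "passes bills": [],
}

def spread(cues, decay=0.5, steps=2):
    activation = {node: 0.0 for node in network}
    for cue in cues:
        activation[cue] = 1.0
    for _ in range(steps):
        nxt = dict(activation)
        for node, level in activation.items():
            for neighbor in network[node]:
                nxt[neighbor] += level * decay  # weaker farther from the cue
        activation = nxt
    return activation

cues = ["vice president", "Senate"]
act = spread(cues)
candidates = {n: a for n, a in act.items() if n not in cues}
print(max(candidates, key=candidates.get))  # presides over Senate
```

The proposition linked to both cue concepts accumulates the most activation, which mirrors the idea that activation lessens with distance from the cue and that multiply linked knowledge is retrieved most readily.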
The same process is involved in rule learning and transfer (Chapter 7): students learn a rule (e.g., the Pythagorean theorem in mathematics) and recall and apply it to arrive at solutions of problems they have never seen before. Encoding Specificity. Retrieval depends on the manner of encoding. According to the encoding specificity hypothesis (Brown & Craik, 2000; Thomson & Tulving, 1970), the manner in which knowledge is encoded determines which retrieval cues will effectively activate that knowledge. In this view, the best retrieval occurs when retrieval cues match those present during learning (Baddeley, 1998). Some experimental evidence supports encoding specificity. When people are given category names while they are encoding specific instances of the categories, they recall the instances better if they are given the category names at recall than if not given the names (Matlin, 2009). A similar benefit occurs when people learn words along with associates and then are given those associates at recall. Brown (1968) gave students a partial list of U.S. states to read; others read no list. Subsequently all students recalled as many states as they could. Students who received the list recalled more of the states on the list and fewer states not on it. Encoding specificity also includes context. In one study (Godden & Baddeley, 1975), scuba divers learned a word list either on shore or underwater. On a subsequent free recall task, learners recalled more words when they were in the same environment as the one in which they learned the words than when they were in the other environment. Encoding specificity can be explained in terms of spreading activation among propositional networks. Cues associated with material to be learned are linked in LTM with the material at the time of encoding. During recall, presentation of these cues activates the relevant portions in LTM.
In the absence of the same cues, recall depends on recalling individual propositions. Because the cues lead to spreading activation (not the individual propositions or concepts), recall is facilitated by presenting the same cues at encoding and recall. Other evidence suggests that retrieval is guided in part by expectancies about what information is needed and that people may distort inconsistent information to make it coincide with their expectations (Hirt, Erickson, & McDonald, 1993). Retrieval of Declarative Knowledge. Although declarative knowledge often is processed automatically, there is no guarantee that it will be integrated with relevant information in LTM. We can see this in the scenario at the start of this chapter. Information about algebraic variables and operations has little meaning for students, and they cannot integrate it well with existing information in memory. Meaningfulness, elaboration, and organization enhance the potential for declarative information to be effectively processed and retrieved. Application 5.3 provides some classroom examples. Meaningfulness improves retrieval. Nonmeaningful information will not activate information in LTM and will be lost unless students rehearse it repeatedly until it becomes established in LTM, perhaps by forming a new propositional network. One also can connect the sounds of new information, which are devoid of meaning, to other similar sounds. The word constitution, for example, may be linked phonetically with other uses of the word stored in learners’ memories (e.g., Constitution Avenue). Meaningful information is more likely to be retained because it easily connects to propositional networks. In the opening scenario, one suggestion offered is to relate algebraic variables to tangible objects—things that students understand—to give the algebraic notation some meaning. Meaningfulness not only promotes learning, but it also saves time.
Propositions in WM take time to process; Simon (1974) estimated that each new piece of information takes 10 seconds to encode, which means that only six new pieces of information can be processed in a minute. Even when information is meaningful, much knowledge is lost before it can be encoded. Although not every piece of incoming information is important and some loss usually does not impair learning, students typically retain little information even under the best circumstances. When we elaborate, we add to the information being learned with examples, details, inferences, or anything that serves to link new and old information. A learner might elaborate the role of the vice president in the Senate by imagining a roll call that ends in a tie, with the vice president casting the deciding vote. Elaboration facilitates learning because it is a form of rehearsal: By keeping information active in WM, elaboration increases the likelihood that information will be permanently stored in LTM. This facilitates retrieval, as does the fact that elaboration establishes links between old and new information. Students who elaborate the role of the vice president in the Senate link this new information with what they know about the Senate and the vice president. Well-linked information in LTM is easier to recall than poorly linked information (Stein et al., 1984). Although elaboration promotes storage and retrieval, it also takes time. Comprehending sentences requiring elaboration takes longer than comprehending sentences not requiring it (Haviland & Clark, 1974).
For example, the following sentences require drawing an inference that Marge took her credit card to the grocery store: “Marge went to the grocery store,” and “Marge charged her groceries.” The link is clarified in the following sentences: “Marge took her credit card to the grocery store,” and “Marge used her credit card to pay for her groceries.” Making explicit links between adjoining propositions assists their encoding and retention. An important aspect of learning is deciding on the importance of information. Not all learned information needs to be elaborated. Comprehension is aided when students elaborate only the most important aspects of text (Reder, 1979). Elaboration aids retrieval by providing alternate paths along which activation can spread, so that if one path is blocked, others are available (Anderson, 1990, 2000). Elaboration also provides additional information from which answers can be constructed (Reder, 1982), such as when students must answer questions with information in a different form from that of the learned material. In general, almost any type of elaboration assists encoding and retrieval; however, some elaborations are more effective than others. Activities such as taking notes and asking how new information relates to what one knows build propositional networks. Effective elaborations link propositions and stimulate accurate recall. Elaborations not linked well to the content do not aid recall (Mayer, 1984). Organization takes place by breaking information into parts and specifying relationships between the parts. In studying U.S. government, organization might involve breaking government into three branches (executive, legislative, judicial), breaking each of these into subparts (e.g., functions, agencies), and so on. Older students employ organization more often, but elementary children are capable of using organizational principles (Meece, 2002). Children studying leaves may organize them by size, shape, and edge pattern.
Organization improves retrieval by linking relevant information; when retrieval is cued, spreading activation accesses the relevant propositions in LTM. Teachers routinely organize material, but student-generated organization is also effective for retrieval. Instruction on organizational principles assists learning. Consider a schema for understanding stories with four major attributes: setting, theme, plot, and resolution (Rumelhart, 1977). The setting (“Once upon a time . . .”) places the action in a context. The theme is then introduced, which consists of characters who have certain experiences and goals. The plot traces the actions of the characters to attain their goals. The resolution describes how the goal is reached or how the characters adjust to not attaining the goal. By describing and exemplifying these phases of a story, teachers help students learn to identify them on their own. Retrieval of Procedural Knowledge. Retrieval of procedural knowledge is similar to that of declarative knowledge. Retrieval cues trigger associations in memory, and the process of spreading activation activates and recalls relevant knowledge. Thus, if students are told to perform a given procedure in the chemistry laboratory, they will cue that production in memory, recall it, and implement it. When declarative and procedural knowledge interact, retrieval of both is necessary. While adding fractions, students use procedures (i.e., convert fractions to their lowest common denominator, add numerators) and declarative knowledge (addition facts). During reading comprehension, some processes operate as procedures (e.g., decoding, monitoring comprehension), whereas others involve only declarative knowledge (e.g., word meanings, functions of punctuation marks). People typically employ procedures to acquire declarative knowledge, such as mnemonic techniques for remembering it (see Chapter 7).
Having declarative information is typically a prerequisite for successfully implementing procedures. To solve for roots using the quadratic formula, students must know multiplication facts. Declarative and procedural knowledge vary tremendously in scope. Individuals possess declarative knowledge about the world, themselves, and others; they understand procedures for accomplishing diverse tasks. Declarative and procedural knowledge are different in that procedures transform information. Such declarative statements as “2 × 2 = 4” and “Uncle Fred smokes smelly cigars” change nothing, but applying the long-division algorithm to a problem changes an unsolved problem into a solved one. Another difference is in speed of processing. Retrieval of declarative knowledge often is slow and conscious. Even assuming people know the answer to a question, they may have to think for some time to answer it. For example, consider the time needed to answer “Who was the U.S. president in 1867?” (Andrew Johnson). In contrast, once procedural knowledge is established in memory, it is retrieved quickly and often automatically. Skilled readers decode printed text automatically; they do not have to consciously reflect on what they are doing. Processing speed distinguishes skilled from poor readers (de Jong, 1998). Once we learn how to multiply, we do not have to think about what steps to follow to solve problems. The differences between declarative and procedural knowledge have implications for teaching and learning. Students may have difficulty with a particular content area because they lack domain-specific declarative knowledge or because they do not understand the prerequisite procedures. Discovering which is deficient is a necessary first step for planning remedial instruction. Not only do deficiencies hinder learning, they also produce low self-efficacy (Chapter 4).
Students who understand how to divide but do not know multiplication facts become demoralized when they consistently arrive at wrong answers.

6.2 Language comprehension

An application illustrating storage and retrieval of information in LTM is language comprehension (Carpenter, Miyake, & Just, 1995; Corballis, 2006; Clark, 1994; Matlin, 2009). Language comprehension is highly relevant to school learning, especially in light of the increasing number of students whose native language is not English (Fillmore & Valadez, 1986; Hancock, 2001; Padilla, 2006). Comprehending spoken and written language represents a problem-solving process involving domain-specific declarative and procedural knowledge (Anderson, 1990). Language comprehension has three major components: perception, parsing, and utilization. Perception involves attending to and recognizing an input; sound patterns are translated into words in working memory (WM). Parsing means mentally dividing the sound patterns into units of meaning. Utilization refers to the disposition of the parsed mental representation: storing it in LTM if it is a learning task, giving an answer if it is a question, asking a question if it is not comprehended, and so forth. This section covers parsing and utilization; perception was discussed earlier in this chapter (Application 5.4). Parsing. Linguistic research shows that people understand the grammatical rules of their language, even though they usually cannot verbalize them (Clark & Clark, 1977). Beginning with the work of Chomsky (1957), researchers have investigated the role of deep structures containing prototypical representations of language structure. The English language contains a deep structure for the pattern “noun 1–verb–noun 2,” which allows us to recognize these patterns in speech and interpret them as “noun 1 did verb to noun 2.” Deep structures may be represented in LTM as productions.
Chomsky postulated that the capacity for acquiring deep structures is innately human, although which structures are acquired depends on the language of one’s culture. Parsing includes more than just fitting language into productions. When people are exposed to language, they construct a mental representation of the situation. They recall from LTM propositional knowledge about the context into which they integrate new knowledge. A central point is that all communication is incomplete. Speakers do not provide all information relevant to the topic being discussed. Rather, they omit the information listeners are most likely to know (Clark & Clark, 1977). For example, suppose Sam meets Kira and Kira remarks, “You won’t believe what happened to me at the concert!” Sam is most likely to activate propositional knowledge in LTM about concerts. Then Kira says, “As I was locating my seat . . .” To comprehend this statement, Sam must know that one purchases a ticket with an assigned seat. Kira did not tell Sam these things because she assumed he knew them. Effective parsing requires knowledge and inferences (Resnick, 1985). When exposed to verbal communication, individuals access information from LTM about the situation. This information exists in LTM as propositional networks hierarchically organized as schemas. Networks allow people to understand incomplete communications. Consider the following sentence: “I went to the grocery store and saved five dollars with coupons.” Knowledge that people buy merchandise in grocery stores and that they can redeem coupons to reduce cost enables listeners to comprehend this sentence. The missing information is filled in with knowledge in memory. People often misconstrue communications because they fill in missing information with the wrong context.
When given a vague passage about four friends getting together for an evening, music students interpreted it as a description of playing music, whereas physical education students described it as an evening of playing cards (Anderson, Reynolds, Schallert, & Goetz, 1977). The interpretative schemas salient in people’s minds are used to comprehend problematic passages. As with many other linguistic skills, interpretations of communications become more reliable with development as children realize both the literal meaning of a message and its intent (Beal & Belgrad, 1990). That spoken language is incomplete can be shown by decomposing communications into propositions and identifying how propositions are linked. Consider this example (Kintsch, 1979): The Swazi tribe was at war with a neighboring tribe because of a dispute over some cattle. Among the warriors were two unmarried men named Kakra and his younger brother Gum. Kakra was killed in battle. Although this passage seems straightforward, analysis reveals the following 11 distinct propositions:
1. The Swazi tribe was at war.
2. The war was with a neighboring tribe.
3. The war had a cause.
4. The cause was a dispute over some cattle.
5. Warriors were involved.
6. The warriors were two men.
7. The men were unmarried.
8. The men were named Kakra and Gum.
9. Gum was the younger brother of Kakra.
10. Kakra was killed.
11. The killing occurred during battle.
Even this propositional analysis is incomplete. Propositions 1 through 4 link together, as do Propositions 5 through 11, but a gap occurs between 4 and 5. To supply the missing link, one might have to change Proposition 5 to “The dispute involved warriors.” Kintsch and van Dijk (1978) showed that features of communication influence comprehension. Comprehension becomes more difficult when more links are missing and when propositions are further apart (in the sense of requiring inferences to fill in the gaps).
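The link analysis just described can be made concrete with a short sketch: represent each proposition by the concepts it mentions and treat two propositions as linked when they share a concept. The concept sets below are my own reading of the passage, not Kintsch's actual coding:

```python
# Each proposition is reduced to the set of concepts it mentions.
propositions = {
    1: {"Swazi tribe", "war"},
    2: {"war", "neighboring tribe"},
    3: {"war", "cause"},
    4: {"cause", "dispute", "cattle"},
    5: {"warriors"},
    6: {"warriors", "men"},
    7: {"men", "unmarried"},
    8: {"men", "Kakra", "Gum"},
    9: {"Gum", "Kakra"},
    10: {"Kakra", "killed"},
    11: {"killed", "battle"},
}

def linked(i, j):
    """Two propositions are linked when they share a concept."""
    return bool(propositions[i] & propositions[j])

# Scan adjacent pairs for missing links: the reader must bridge these.
gaps = [i for i in range(1, 11) if not linked(i, i + 1)]
print(gaps)  # [4]: the only break in the chain is between 4 and 5
```

Revising Proposition 5 to "The dispute involved warriors," as suggested above, would add "dispute" to its concept set and close the gap.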
When much material has to be inferred, WM becomes overloaded and comprehension suffers. Just and Carpenter (1992) formulated a capacity theory of language comprehension, which postulates that comprehension depends on WM capacity and individuals differ in this capacity. Elements of language (e.g., words, phrases) become activated in WM and are operated on by other processes. If the total amount of activation available to the system is less than the amount required to perform a comprehension task, then some of the activation maintaining older elements will be lost (Carpenter et al., 1995). Elements comprehended at the start of a lengthy sentence may be lost by the end. Production-system rules presumably govern activation and the linking of elements in WM.

We see the application of this model in parsing of ambiguous sentences or phrases (e.g., “The soldiers warned about the dangers . . .”; MacDonald, Just, & Carpenter, 1992). Although alternative interpretations of such constructions initially may be activated, the duration of maintaining them depends on WM capacity. Persons with large WM capacities maintain the interpretations for quite a while, whereas those with smaller capacities typically maintain only the most likely (although not necessarily correct) interpretation. With increased exposure to the context, comprehenders can decide which interpretation is correct, and such identification is more reliable for persons with large WM capacities who still have the alternative interpretations in WM (Carpenter et al., 1995; King & Just, 1991).

In building representations, people include important information and omit details (Resnick, 1985). These gist representations include propositions most germane to comprehension. Listeners’ ability to make sense of a text depends on what they know about the topic (Chiesi et al., 1979; Spilich et al., 1979).
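The capacity idea can be caricatured in a short sketch. The fixed per-element cost and the capacity values below are invented for illustration; the actual theory deals in graded activation rather than wholesale loss of elements:

```python
# Toy sketch of a fixed WM activation budget: each language element
# costs one unit, and when the budget is exceeded the oldest elements
# lose their activation (parameter values are invented).
def comprehend(elements, capacity, cost=1):
    wm = []
    for el in elements:
        wm.append(el)
        while len(wm) * cost > capacity:  # budget exceeded
            wm.pop(0)                     # oldest element is lost
    return wm

sentence = ["The", "soldiers", "warned", "about", "the", "dangers"]
large = comprehend(sentence, capacity=6)  # whole sentence retained
small = comprehend(sentence, capacity=3)  # sentence-initial words lost
```

With the larger budget every element survives; with the smaller one only the last three words remain, mirroring the claim that elements comprehended at the start of a lengthy sentence may be lost by the end.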
When the appropriate network or schema exists in listeners’ memories, they employ a production that extracts the most central information to fill the slots in the schema. Comprehension proceeds slowly when a network must be constructed because it does not exist in LTM.

Stories exemplify how schemas are employed. Stories have a prototypical schema that includes setting, initiating events, internal responses of characters, goals, attempts to attain goals, outcomes, and reactions (Black, 1984; Rumelhart, 1975, 1977; Stein & Trabasso, 1982). When hearing a story, people construct a mental model of the situation by recalling the story schema and gradually fitting information into it (Bower & Morrow, 1990). Some categories (e.g., initiating events, goal attempts, consequences) are nearly always included, but others (internal responses of characters) may be omitted (Mandler, 1978; Stein & Glenn, 1979). Comprehension proceeds more quickly when schemas are easily activated. People recall stories better when events are presented in the expected order (i.e., chronological) rather than in a nonstandard order (i.e., flashback). When a schema is well established, people rapidly integrate information into it. Research shows that early home literacy experiences that include exposure to books relate positively to the development of listening comprehension (Sénéchal & LeFevre, 2002).

Utilization. Utilization refers to what people do with the communications they receive. For example, if the communicator asks a question, listeners retrieve information from LTM to answer it. In a classroom, students link the communication with related information in LTM. To use sentences properly, as speakers intend them, listeners must encode three pieces of information: speech act, propositional content, and thematic content. A speech act is the speaker’s purpose in uttering the communication, or what the speaker is trying to accomplish with the utterance (Austin, 1962; Searle, 1969).
Speakers may be conveying information to listeners, commanding them to do something, requesting information from them, promising them something, and so on. Propositional content is information that can be judged true or false. Thematic content refers to the context in which the utterance is made. Speakers make assumptions about what listeners know. On hearing an utterance, listeners infer information not explicitly stated but germane to how it is used. The speech act and the propositional and thematic contents are most likely encoded with productions.

As an example of this process, assume that Jim Marshall is giving a history lesson and is questioning students about text material. Mr. Marshall asks, “What was Churchill’s position during World War II?” The speech act is a request and is signaled by the sentence beginning with a WH word (e.g., who, which, where, when, and why). The propositional content refers to Churchill’s position during World War II; it might be represented in memory as follows: Churchill–Prime Minister–Great Britain–World War II. The thematic content refers to what the teacher left unsaid; the teacher assumes students have heard of Churchill and World War II. Thematic content also includes the classroom question-and-answer format. The students understand that Mr. Marshall will be asking questions for them to answer.

Of special importance for school learning is how students encode assertions. When teachers utter an assertion, they are conveying to students that they believe the stated proposition is true. If Mr. Marshall says, “Churchill was the Prime Minister of Great Britain during World War II,” he is conveying his belief that this assertion is true. Students record the assertion with related information in LTM. Speakers facilitate the process whereby people relate new assertions to information in LTM by employing the given-new contract (Clark & Haviland, 1977).
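The three-part encoding of Mr. Marshall’s question described above can be sketched as a simple structure. The field names and the dictionary layout are labels chosen here for illustration; only the three-way split into speech act, propositional content, and thematic content comes from the text:

```python
# Hypothetical encoding of "What was Churchill's position during
# World War II?" -- the three components follow the text, the field
# names are invented for this sketch.
utterance = {
    "speech_act": "request",  # signaled by the WH word "What"
    "propositional_content":
        ("Churchill", "Prime Minister", "Great Britain", "World War II"),
    "thematic_content": {
        "assumed_known": ["Churchill", "World War II"],
        "format": "classroom question-and-answer",
    },
}
```

The thematic slot holds exactly what the teacher left unsaid, which is why listeners who lack that background knowledge fail to use the utterance as intended.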
Given information should be readily identifiable, and new information should be unknown to the listener. We might think of the given-new contract as a production. In integrating information into memory, listeners identify given information, access it in LTM, and relate new information to it (i.e., store it in the appropriate “slot” in the network). For the given-new contract to enhance utilization, given information must be readily identified by listeners. When given information is not readily available because it is not in listeners’ memories or has not been accessed in a long time, using the given-new production is difficult.

Although language comprehension is often overlooked in school in favor of reading and writing, it is a central component of literacy. Educators lament students’ poor listening and speaking skills, which also are valued attributes of leaders. Habit 5 of Covey’s (1989) Seven Habits of Highly Effective People is “Seek first to understand, then to be understood,” which emphasizes listening first and then speaking. Listening is intimately linked with high achievement. A student who is a good listener is rarely a poor reader. Among college students, measures of listening comprehension may be indistinguishable from those of reading comprehension (Miller, 1988).
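Returning to the given-new contract described above, its three steps (identify given information, access it in LTM, attach the new information to it) might be sketched as a production. The LTM store, the grocery-store network, and the function name below are all illustrative inventions:

```python
# Toy given-new production: find the LTM node for the given
# information and attach the new information to it. The network
# contents here are invented for illustration.
ltm = {"grocery store": {"sells": "merchandise"}}

def given_new(given, relation, new):
    """Relate `new` to the LTM node for `given`; if the node is absent
    it must first be constructed, which is why comprehension is slower
    when given information is not already in memory."""
    node = ltm.setdefault(given, {})
    node[relation] = new
    return node

# "I went to the grocery store and saved five dollars with coupons."
given_new("grocery store", "redeems", "coupons")
```

After the call, the “grocery store” node links both the prior knowledge (it sells merchandise) and the new information (it redeems coupons), i.e., the new fact occupies a slot in the existing network.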