Working Memory and Affective State: Managing Cognitive Load in TurboTax Online

Introduction

Working memory is central to our conceptions of thinking and our sense of self. It is the nexus of our interactions with the world, where sensory perceptions interact with information from long-term memory and where attention is directed and controlled. Given this important role, it is somewhat surprisingly limited in capacity and duration and highly susceptible to disruption. Individual working memory capacity, which can predict performance on higher-order cognitive tasks such as reading comprehension and problem solving (Unsworth, Fukuda, Awh & Vogel, 2014), varies in real or effective ways according to a variety of factors. Building on the original distinction between short- and long-term memory, recent models of working memory subdivide it further, including Baddeley’s (2000) prominent four-component model. In terms of design, the capacity of working memory is an important consideration that affects efficiency, effectiveness, and satisfaction. An understanding of the observed functions and limitations of working memory, along with Cognitive Load Theory (CLT; Sweller, 1994) and the modulating effects of emotion, provides the basis for design strategies that can better tailor content to its strengths and weaknesses.

Filing a tax return is an example of an often confusing and frustrating process. It involves information gathered from multiple sources and many variables to be considered, which may place considerable strain on working memory. The many online filing programs represent an improvement over a traditional form-based experience for those wishing to complete the process without the help of a tax professional, generally offering a question-and-answer format to guide users through the process and providing clarifying information for the many technical or unfamiliar terms. These options vary in how successfully they regulate the flow of information to keep cognitive load at a reasonable level, but TurboTax stands out for its additional attention to users’ affective state and trust management.

Models of Working Memory

Although the terms are sometimes used interchangeably, Cowan (2008) suggests that short-term memory (STM) can be understood as short-term storage of sensory information, while working memory (WM) is the entire system of manipulating and processing information, including that retrieved from long-term memory (LTM). The most prominent model of WM was first proposed by Baddeley & Hitch (1974). It represented a departure from the then-current conception of STM as “a unitary store in favour of a multi-component system” and emphasized its role in complex cognition (Baddeley, 2000, p. 417). This model originally consisted of the phonological loop, an auditory channel that handles echoic memory traces and their rehearsal (repetition with the goal of delaying decay); the visuo-spatial sketchpad, a visual channel that deals with both visual and spatial information; and the central executive, where attention is directed and controlled (Baddeley, 2003).

In 2000, a fourth component, the episodic buffer, was proposed to account for additional interaction between the other elements of WM. The episodic buffer is “capable of integrating information from a variety of sources” (Baddeley, 2000, p. 241). This is the true ‘workbench of the mind’ where information from the phonological loop, visuospatial sketchpad, and LTM is temporarily stored, manipulated, and combined. Miyake et al. (2000) refined understanding of the central executive by identifying its three major control functions: inhibition, or preventing the dedication of resources as when resisting distraction; shifting, in response to changes in task demands or multitasking; and updating, or monitoring and revising working memory representations (pp. 56-57).

Baddeley’s model provides an account for some of the central observations of how WM functions. The phonological loop is compatible with the similarity effect, in which items that sound the same are more difficult to remember accurately, indicating an acoustic component. It is also compatible with the word length effect, in which fewer long words than short words can be accurately recalled, since long words take more time to pronounce in rehearsal, leaving greater opportunity for decay (Baddeley, 2000). Likewise, the effects of articulatory suppression, or preventing rehearsal through the repetition of an unrelated word, which dramatically reduces memory duration, are integral to the phonological loop. The limited capacity of the visuospatial sketchpad, approximately 3-4 objects, provides insight into change blindness (Baddeley, 2003). Furthermore, a multi-component model is foundational to CLT and to Attentional Control Theory (ACT; Eysenck, Derakshan, Santos, & Calvo, 2007), which offers an explanation for how anxiety affects cognitive performance, among others. Although this model has been modified and challenged, Baddeley maintains that most competing explanations, such as Cowan’s attention-based Embedded Process Theory of WM, differ mainly in terms of emphasis and terminology, not fundamental understanding (Baddeley, 2012).

Capacity & Duration

The reasons for the limited capacity and duration of WM are not clear, and there is division over whether these limits are a strength or a weakness (Cowan, 2010, p. 56). In a famous but somewhat misinterpreted paper, Miller (1956) “discussed the ‘magical number seven plus or minus two’ as a constant in short-term processing, including list recall, absolute judgment, and numerical estimation experiments” (Cowan, 2008, p. 323). Though absolute judgment and working memory capacity are subject to a similar numerical limit, “absolute judgment is limited by the amount of information,” while WM is limited “by the number of items” (Miller, 1956, p. 91). These items, or “chunks,” may be anything treated as a single unit, from individual digits, to learned word-pairs or familiar sentences, to entire schemata, as is characteristic of expertise (van Merrienboer & Sweller, 2005). Later evidence has adjusted this rule of thumb down to 3-5 chunks, particularly for active processing (Low, Jin & Sweller, 2005; Cowan, 2010). Some of the variability in WM capacity can be explained by the difference between processing-related and storage-specific capacity limits; basic storage capacity is relatively stable, but WM ability will vary “widely depending on what processes can be applied to a given task,” including rehearsal and various mnemonic devices (Cowan, 2010, pp. 51-52).
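The effect of chunking on the number of items held can be illustrated with a minimal sketch. The digit string, the grouping sizes, and the limit of four chunks (an assumed midpoint of the 3-5 range) are illustrative assumptions, not values from the cited studies, and a group only functions as a chunk if it is familiar or meaningful to the person holding it.

```python
# Illustrative sketch only: counts items before and after grouping.
# ASSUMED_CHUNK_LIMIT is a hypothetical midpoint of the 3-5 chunk range.
ASSUMED_CHUNK_LIMIT = 4


def chunk_by(sequence, sizes):
    """Split a sequence into consecutive groups with the given sizes."""
    chunks, start = [], 0
    for size in sizes:
        chunks.append(sequence[start:start + size])
        start += size
    return chunks


def within_limit(units, limit=ASSUMED_CHUNK_LIMIT):
    """Return True if the number of units does not exceed the assumed limit."""
    return len(units) <= limit


digits = "8005551234"                     # hypothetical ten-digit number
as_digits = list(digits)                  # ten separate items
as_groups = chunk_by(digits, (3, 3, 4))   # three grouped items

print(len(as_digits), within_limit(as_digits))
print(len(as_groups), within_limit(as_groups))
```

The same ten digits exceed the assumed limit when held individually but fall within it once grouped, which is the sense in which capacity is counted in items rather than in amount of information.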

Early evaluation of duration established that almost all contents of WM are lost within about 20 seconds without rehearsal (Low, Jin & Sweller, 2005, p. 97). Cowan (2008) describes two main competing and somewhat controversial explanations for duration limits: one in which storage is a direct function of time, and another in which duration is an extension of capacity, such that “a number of items smaller than the capacity limit could remain in short-term storage until they are replaced by other items” (p. 323). Recent work by Oberauer et al. (2016) found the decay explanation untenable and tentatively favored a combination of limited capacity and interference for the limits of WM, though the observed effects remain the same regardless of underlying mechanism.

Disruption & Interference

WM is susceptible to disruption and interference, which can further limit its duration and capacity. Disruption may be voluntary, as in some forms of task-switching (Monsell, 2003), or involuntary, as when concentration is interrupted by a loud noise. Interference is a corruption of information in WM by either retroactive or proactive influences. Retroactive interference results when new information disrupts the recall of previously known information, while proactive interference is the disruption of the recall of new information by the intrusion of unrelated information from LTM. The effects of proactive interference are stronger in tasks of recall than in recognition, and proactive interference “also varies as a function of stimulus type, number of prior lists, and time between learning and retrieval” (Bunting, 2006, p. 193). Interference is more likely to occur with information that is processed by the same channel (visual vs. auditory) or that is similar in type (e.g. words vs. numbers) (Kane & Engle, 2000).

Cognitive Load Theory

Cognitive Load Theory provides a framework in which to understand how the limits of WM affect learning and more general information processing tasks. Under this model, expertise is gained through more efficient chunking and the construction of increasingly comprehensive schemata, which allow more information to be held for use in WM (van Merrienboer & Sweller, 2005). Automation of repeatedly applied schemata to “those aspects of performance that are consistent across problem situations” further frees up WM for novel information (van Merrienboer & Sweller, 2005, p. 149). High element interactivity, when multiple components need to be considered or understood together, will increase cognitive load (Low, Jin & Sweller, 2005).

As schemata are built, there are three general sources of cognitive load that impact performance. Intrinsic load is based on the number of items that need to be processed simultaneously, which in turn depends on the interactivity of the material, and remains constant. There are also two forms of extrinsic load that are related to the method of presentation: germane load represents effective presentation and the direct results of schema construction and automation, while extraneous load is unproductive noise or inefficiency that should be reduced or eliminated wherever possible (Choi, van Merrienboer & Paas, 2014).

In terms of HCI, extraneous cognitive load may be further divided into two main sources: instructional or information design and the underlying demands of computer or ICT use (Hollender, Hofmann, Deneke & Schmitz, 2010, p. 1284). Additionally, Choi, van Merrienboer & Paas (2014) suggest consideration of the effects of the physical learning environment, including visual or auditory noise, physiological responses to temperature, context-dependent memory effects, and influence on affect, as an important addition to CLT.

Within CLT, a number of effects have been identified that can either add to or reduce cognitive load depending on their implementation. The first is the split attention effect, which decreases efficiency when there is a need to “integrate multiple sources of information that are physically or temporally separate from each other” (Low, Jin & Sweller, 2005, pp. 100-101). The redundancy effect describes the point at which redundant information ceases to provide benefits due to the increased cognitive load that the additional material imposes. This point will vary by individual according to expertise, with the value of redundancy being greater for novices (Low, Jin & Sweller, 2005). Lastly, the modality effect is a consequence of the relative independence of the phonological loop and visuospatial sketchpad: a mix of visual and auditory presentation can increase effective WM capacity compared to either channel alone (Low, Jin & Sweller, 2005).

Affective State

Emotional responses and an individual’s affective state further affect WM capacity, attentional resources, and cognitive load, in both positive and negative ways. Arousal, “a hypothetical construct that represents the level of central nervous system activity along a behavioral continuum ranging from sleep to alertness,” gives rise to the idea of an optimal level of arousal, often used synonymously with stress, at which performance peaks (Staal, 2004, p. 2-3). This “inverted U” relationship is referred to as the Yerkes-Dodson law (Staal, 2004, p. 3) and provides a basis for understanding the observed effects of stress or anxiety, in which moderate levels increase performance but excessive levels are severely detrimental. Affective events affect both perceptual competition at the sensory level and allocation of attention within executive control. Pessoa (2009) differentiates between affective influences that are stimulus-driven (external, either inherent or by association), which will positively or negatively affect performance depending on whether they are task-relevant, and those that are state-dependent (general mood and anxiety level).

ACT suggests that anxiety particularly impairs the shifting and inhibition functions of the central executive (Eysenck et al., 2007), reducing the ability to resist distraction and interference and imposing higher task-switching costs, particularly in more complex tasks (Derakshan, Smyth & Eysenck, 2009). In word memory and recognition tests, Delleman & Fernandes (2015) found that high-anxious individuals had lower accuracy than the low-anxious group, and that even at high levels of accuracy, their confidence judgements remained low. The authors suggested that while this “state-dependent anxiety” (Pessoa, 2009) reduced performance, in other experiments situational anxiety, such as evaluation by raters, has shown evidence of enhanced memory performance (Delleman & Fernandes, 2015, p. 79).

Likewise, motivation can affect cognitive efficiency in a similar but generally more positive way. Motivation results from a combination of extrinsic (external incentives like money, recognition, or other rewards) and intrinsic (“behaviors [that] are themselves enjoyable, purposive, and provide sufficient reason to persist”) factors (Cerasoli, Nicklin & Ford, 2014, p. 981). Engelmann & Pessoa (2007) found that “elevated motivation leads to improved efficiency in orienting and reorienting” spatial attention, sharpening information-processing abilities during “motivationally salient conditions” (p. 668). Motivation reallocates resources to pursue reward, but the shared-resource model of WM means that this can have negative results in certain circumstances by reducing resources for indirectly related processes (Pessoa, 2009). In other words, sometimes this attentional narrowing is effective, improving reaction and processing speed, but in some more complex situations like creative problem-solving, it may be detrimental and limit innovation (Pink, 2011).

Case Study

Figure 1

Filing taxes is a critical and complex task with both intrinsic and extrinsic sources of cognitive load, and it frequently involves a negative emotional component as well. Money is commonly a topic that people are uncomfortable with or anxious about. Tax code in particular is confusing and may be a source of frustration exacerbated by larger political concerns. Additional associated fears, resentments, or other negative emotions about an individual’s job and compensation, investments, health care and education costs, or other financial obligations like child support may also be activated. Time pressure is another factor that can contribute to increased anxiety, since many people may put off this often unpleasant task until the last moment. Lastly, taxes are a universal obligation, meaning that the process should be achievable by people with a full range of cognitive abilities, including diminished WM as a result of cognitive disability or aging. As a result of this context, it is especially important that the extraneous sources of cognitive load resulting from interface design and the presentation of information be reduced as much as possible to best accommodate the effects of lowered WM capacity, whether innate or anxiety-induced.

Figure 2

Presentation of Information

In general, TurboTax presents a manageable flow of information that is restricted to the step at hand and presented in a clean, easily readable way. A persistent navigation bar presents an overview of the process and allows users to move through it in a non-linear way if desired (fig. 2). There are built-in opportunities for progress review, and the option to go back or change information is always presented, reducing stress related to making errors. The rate of information presentation is adaptive in a basic way, which can better accommodate users who are struggling with the process. If the user leaves the form on a particular page partially or entirely blank, instead of showing an error message, the system will switch from an all-at-once to a line-by-line presentation for data entry with additional guidance and explanation (fig. 3).
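The adaptive behavior described above can be sketched as a simple decision rule. This is a hypothetical illustration of the observed behavior, not TurboTax’s actual implementation; the names FormPage, missing, and next_view, as well as the field names, are assumptions introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FormPage:
    """Hypothetical model of one all-at-once data entry page."""
    required_fields: list
    values: dict = field(default_factory=dict)

    def missing(self):
        """Return required fields that are absent or blank, in page order."""
        return [name for name in self.required_fields
                if not str(self.values.get(name, "")).strip()]


def next_view(page):
    """Choose the next presentation mode for a submitted page.

    A complete page continues normally. A partially or entirely blank page
    switches to guided line-by-line entry for the missing fields rather
    than returning an error message.
    """
    missing = page.missing()
    if not missing:
        return ("continue", [])
    return ("line_by_line", missing)


# Hypothetical field names, for illustration only.
page = FormPage(["wages", "federal_withheld"], {"wages": "52000"})
print(next_view(page))
```

A rule of this kind treats every blank field as a sign of difficulty, which is the design choice that makes the response helpful to struggling users but potentially inconvenient for those who skipped a field deliberately.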

Figure 3

TurboTax errs on the side of the novice, anxious user, running the risk that more experienced or confident users may find the tone patronizing and feel that the pace slows them down. This danger is most acute in the adaptive response to missing information. Once the task flow is switched to line-by-line entry, it is not possible to go back to the one-page view. A user who accidentally left a box blank or wanted to temporarily skip a section, rather than being unsure about how to proceed, may therefore be significantly inconvenienced.

Anxiety-Reduction Strategies

Figure 4

The vocabulary is accessible and jargon-free, with additional information available as needed for common questions or unclear terms (fig. 4). Furthermore, the tone is informal and personable, with frequent encouragement and positive messages (fig. 5). Recurrent opportunities for users to give feedback help create a feeling of empowerment (fig. 4).

Figure 5

Promoting Trust

Fig. 6 is an example of several positive features. This information helps reduce cognitive load by letting users know what to expect and where they are in the process. The essential information is presented in the clear headline, while additional information is provided for those who are interested and/or attentive. Failing even that, the image provides a basic signpost for the step in the process. The informal tone and commonly used language prevent intimidation and foster a sense of collaboration.

Figure 6

According to a study of over 2,500 participants, the top three criteria considered when evaluating website credibility are Design Look, Information Design/Structure, and Information Focus (Fogg et al., 2003). Here, TurboTax’s clean design, ample use of white space, subdued color scheme, and consistency are likely to engender positive assessments of credibility. Transparency is used both to lower cognitive load and to demonstrate trustworthiness. In fig. 7, demands on the user are reduced by making an automatic selection, but a clear explanation of what was done and why is provided as feedback and reassurance. Personalization (fig. 8) is another technique that may increase feelings of trust.

Figure 7

 

Figure 8

Another downside to the personal and informal tone adopted to reduce negative affect, however, is that it may read as unprofessional to a portion of users. The same is true of the ‘cute’ icons and graphics, which do not reflect the common design standards of other major financial institutions. This is a particular concern when dealing with sensitive personal and financial information. While TurboTax has done a good job of communicating why the information is needed and can rely on its name recognition, more effort may be needed to establish trustworthiness in the current climate of general distrust of institutions, particularly in the financial sector.

Overall, TurboTax is a case study of an excellent effort at managing anxiety and cognitive load for a historically difficult and sometimes emotionally fraught task. Further changes could increase flexibility for more advanced users and promote user trust. By understanding the limitations of working memory and the ways to exploit its strengths, similar improvements can be brought to a range of online and in-person services and experiences.

References

Baddeley, A. D., & Hitch, G. (1974). Working memory. Psychology of learning and motivation, 8, 47-89.

Baddeley, A. (2000). The episodic buffer: a new component of working memory? Trends in Cognitive Sciences, 4(11), 417-423.

Baddeley, A. (2003). Working memory: Looking back and looking forward. Nature Reviews Neuroscience, 4(10), 829-39. doi:http://dx.doi.org/10.1038/nrn1201

Baddeley, A. (2012). Working memory: theories, models, and controversies. Annual review of psychology, 63, 1-29.

Bunting, M. (2006). Proactive interference and item similarity in working memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32(2), 183.

Cerasoli, C. P., Nicklin, J. M., & Ford, M. T. (2014). Intrinsic motivation and extrinsic incentives jointly predict performance: A 40-year meta-analysis. Psychological Bulletin, 140(4), 980-1008.

Choi, H., van Merriënboer, J. J. G., & Paas, F. (2014). Effects of the physical environment on cognitive load and learning: Towards a new model of cognitive load. Educational Psychology Review, 26(2), 225-244. doi:http://dx.doi.org/10.1007/s10648-014-9262-6

Cowan, N. (2008). What are the differences between long-term, short-term, and working memory?. Progress in brain research, 169, 323-338.

Cowan, N. (2010). The Magical Mystery Four: How Is Working Memory Capacity Limited, and Why? Current Directions in Psychological Science, 19(1), 51-57. Retrieved from http://www.jstor.org/stable/41038538

Delleman, B., & Fernandes, M. (2015). Individual differences in anxiety influence verbal memory accuracy and confidence. Journal Of Individual Differences, 36(2), 73-79. doi:10.1027/1614-0001/a000158

Derakshan, N., Smyth, S., & Eysenck, M. W. (2009). Effects of state anxiety on performance using a task-switching paradigm: An investigation of attentional control theory. Psychonomic Bulletin & Review, 16(6), 1112-1117.

Engelmann, J. B., & Pessoa, L. (2007). Motivation sharpens exogenous spatial attention. Emotion, 7, 668-674.

Eysenck, M. W., Derakshan, N., Santos, R., & Calvo, M. G. (2007). Anxiety and cognitive performance: attentional control theory. Emotion, 7(2), 336.

Fogg, B. J., Soohoo, C., Danielson, D. R., Marable, L., Stanford, J., & Tauber, E. R. (2003, June). How do users evaluate the credibility of Web sites?: a study with over 2,500 participants. In Proceedings of the 2003 conference on Designing for user experiences (pp. 1-15). ACM.

Hollender, N., Hofmann, C., Deneke, M., & Schmitz, B. (2010). Integrating cognitive load theory and concepts of human–computer interaction. Computers in Human Behavior, 26(6), 1278-1288.

Kane, M. J., & Engle, R. W. (2000). Working-memory capacity, proactive interference, and divided attention: limits on long-term memory retrieval. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26(2), 336.

Low, R., Jin, P., & Sweller, J. (2005). Cognitive load theory, attentional processes, learning. In Roda, C. (Ed.), Human Attention in Digital Environments (pp. 93-113). Cambridge University Press: New York, NY.

Miller, G. A. (1956). The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychological review, 63(2), 81.

Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. D. (2000). The unity and diversity of executive functions and their contributions to complex “frontal lobe” tasks: A latent variable analysis. Cognitive Psychology, 41, 49-100.

Monsell, S. (2003). Task switching. Trends in Cognitive Sciences, 7, 134-140.

Oberauer, K., Farrell, S., Jarrold, C., & Lewandowsky, S. (2016). What limits working memory capacity?. Psychological Bulletin, 142(7), 758-799. doi:10.1037/bul0000046

Pessoa, L. (2009). How do emotion and motivation direct executive control?. Trends in cognitive sciences, 13(4), 160-166.

Pink, D. H. (2011). Drive: The surprising truth about what motivates us. New York, NY: Penguin.

Staal, M. A. (2004). Stress, cognition, and human performance: A literature review and conceptual framework. Hanover, MD: National Aeronautics & Space Administration.

Sweller, J. (1994). Cognitive load theory, learning difficulty, and instructional design. Learning and instruction, 4(4), 295-312.

Unsworth, N., Fukuda, K., Awh, E., & Vogel, E. K. (2014). Working memory and fluid intelligence: Capacity, attention control, and secondary memory retrieval. Cognitive Psychology, 71, 1-26.

Van Merrienboer, J. J., & Sweller, J. (2005). Cognitive load theory and complex learning: Recent developments and future directions. Educational psychology review, 17(2), 147-177.