Cognitive Analysis of Dynamic Performance: Cognitive process analysis and modeling
Wayne D. Gray, Deborah A. Boehm-Davis
HFES/IEA, San Diego, CA, July 30, 2000
This slide contains information in Note View. Switch to Note View and print all of these out. Be sure to set your printer to black and white. You may need to change fonts if we have used some that you do not have (or if your printer or computer does funny things to our slides).

Abstract Many cognitive task analysis techniques provide static descriptions of the declarative knowledge possessed by domain experts. However, skilled performance is dynamic, not static. This workshop will describe a family of cognitive process techniques which allow the analyst to go beyond static descriptions of declarative knowledge to develop analytic models that capture the interactions between human cognition, the design of objects, and task performance. Specifically, the workshop will combine lecture with a hands-on approach to provide an overview of the GOMS family of cognitive process analysis techniques and the ways in which GOMS can be used to represent activities that occur in parallel. During this workshop, participants will be provided hands-on experience at developing some types of GOMS models and in using other types of models to quickly see and evaluate design alternatives. Participants are expected to be GOMS novices but to be experienced at (or, at least, exposed to) more traditional approaches to cognitive task analysis.

Cognitive Analysis of Dynamic Performance: Cognitive process analysis and modeling
Workshop Presented at the HFES/IEA-2000 Conference in San Diego, CA
Wayne D. Gray, Ph.D.
Deborah A. Boehm-Davis, Ph.D.

Outline of Workshop
Intro to the GOMS family of models
Keystroke Level GOMS
NGOMSL GOMS
CPM-GOMS
REVISED Tutorial Materials may be downloaded from:

Definition of GOMS
Characterization of a task in terms of
The user's Goals
The Operators available to accomplish those goals
Methods (frequently used sequences of operators and sub-goals) to accomplish those goals, and
If there is more than one method to accomplish a goal, the Selection rules used to choose between methods.
GOMS was introduced by Card, Moran, & Newell in 1980, but is most extensively described and exemplified in The Psychology of Human-Computer Interaction (Card, Moran, & Newell, 1983).

What GOMS Is
GOMS is a task analysis technique
Very similar to Hierarchical Task Analysis (indeed, GOMS is a hierarchical task analysis technique)
The hard part of task analysis is goal-subgoal decomposition
The hard part of GOMS is goal-subgoal decomposition
Different members of the GOMS family provide you with different sets of operators
With established parameters
But the set is not complete (it is extensible)
Lower case vs upper case -- HTA vs hta.

Where does GOMS go? (When do you need to do a CTA versus a TA?)
Task analysis versus cognitive task analysis? What distinguishes GOMS cognitive task analysis from other task analysis techniques? E.g., Cognitive Task Analysis™

TASK ANALYSIS It is obvious that tasks such as making coffee have a certain structure that is dictated by the task itself and the design of the artifacts used to perform the task. For example, coffee must be ground before it can go into the container. For a percolator, water goes into the pot before the grounds. For a drip system, the grounds go in before the water. Similarly, the design of the artifact may dictate a certain number of design- specific subtasks; for example the subtask of putting in a coffee filter exists for a drip system, but not for a percolator system. This is the standard task analysis and although its lowest level operators may be actions that are executed by a human (e.g., add six scoops of coffee), I would not want to call this a cognitive task analysis. GOMS provides a task analysis of embodied cognition -- its lowest level operators are cognitive elements, perceptual operations, and elementary motor movements. This is where GOMS parts company from most task analysis methods, including those that call themselves cognitive task analyses.

Time Scale for GOMS Levels of analysis (based on Newell’s Time Scale of Human Action) I believe that this point, that GOMS provides a method of task analysis that includes embodied cognition is worth driving home as it is little understood by many in the Human Factors community. Tasks take place in the bounded rationality band (see extreme right hand column) in which the influence of human cognition is hard to discern. Events in the bounded rationality band take from 30 seconds to days to occur. As in the coffee making example, the task and artifacts available to perform the task determine the structural components of tasks, that is, subtasks.

Level 1: Task Analysis: Task --> Subtasks --> Subtasks -->
As in the coffee making example, at the bounded rationality level tasks can be analyzed into subtasks without direct reference to human cognitive processes This example is from a task analysis of Argus -- a simulated task environment that we are studying in our lab.

Levels of analysis (based on Newell’s Time Scale of Human Action)
At the bottom of the bounded rationality band, subtasks are decomposed into unit tasks (see arrow). “The unit task is fundamentally a control construct, not a task construct.” The unit task is not determined by the task and tool alone but results from the interaction of these with the control problems faced by the user.

Level 2: Subtask --> Unit Tasks
We have built many variations of Argus Prime and there are many possible variations that we have not built. However, the simple unit task structure provided by the figure fits all past and future variations. Participants acquire (or attend to) one of the targets on the screen. They then must determine whether that target has already been classified. If classified, participants acquire another target. If not classified then it is classified on a 1-7 scale based on a computation of its threat value. Finally, after the classification, feedback is provided that compares the participant’s calculated threat value to the target’s actual threat value. Participants can attend to this feedback or ignore it.

Below the Bounded Rationality band is the Cognitive Band. It is in this band that the influence of human cognition on task performance can be discerned. Events in the cognitive band take from 30 milliseconds to 30 seconds to occur. In the cognitive band, structural components are determined by the interaction of embodied cognition with the task and artifacts designed to accomplish the task. We refer to the first level of analysis in the cognitive band, the analysis of unit tasks into activities, as the cognitive task analysis level.

Level 3: Cognitive Task Analysis: Unit Tasks --> Activities
Cognitive task analysis works to decompose unit tasks into operators or activities. This example shows how one part of the unit task, the “Classified?” decision, can be expanded into multiple steps at the activity level of analysis. In this case, we have taken the simple decision point (has the acquired target been classified?) and expanded it into four NGOMSL steps. A large difference between the left and right sides of the figure is the level of abstractness. The “Classified?” from Figure 1 can be instantiated in any of a number of ways. In contrast, the four NGOMSL steps imply that the artifact used to accomplish this step will have certain characteristics. For example, a graphic user interface is assumed by step 1. Step 2 assumes that each target will have an identification number. Step 3 requires that a target’s classification status be indicated by radio buttons. Finally, step 4 requires that the identification number of the currently selected target be presented in the information window. Hence, two tasks that appear similar at the unit task level could be very different at the activity level.

We can go a step further and analyze activities into microstrategies -- this is the analysis of embodied cognition.

Level 4: Embodied Cognition: Activity --> Microstrategy
Human cognition is embodied cognition. If we look, we will almost always find evidence that perceptual and motor factors influence performance. Here we have expanded the four NGOMSL steps into their underlying cognitive-perceptual-motor operators. As can be seen in the figure, many of the cognitive operators initiate motor movements (dependency lines are drawn from left-to-right). Other cognitive operators serve to direct attention or to verify the results of a perceptual operator. At the CPM-GOMS level of analysis the interdependencies among cognitive, perceptual, and motor operators become apparent. It is at this level of analysis that we can begin to appreciate the relative contribution of each operator type to interactive behavior. Correlations in performance across similar tasks may be as much due to common patterns of cognitive, perceptual, and motor operators (i.e., structural components: the adoption of common microstrategies) as to a common demand on a cognitive function such as STM or WM. See: Gray, W. D., & Boehm-Davis, D. A. (in press). Milliseconds Matter: An introduction to microstrategies and to their use in describing and predicting interactive behavior. Journal of Experimental Psychology: Applied. We will discuss this later in the workshop.

Level 5: Microstrategies --> Elements (Where GOMS doesn’t go!)
Production Rules
Declarative memory elements
Internal (created on RHS of production rules)
External (created by shifts of attention in the external environment)
Computational models of embodied cognition work their magic by analyzing microstrategies into elements: production rules and declarative memory elements. A production rule is a condition-action dyad. For example, if x can be retrieved from memory and y is in the environment then add z to memory (or move visual attention to y or pop the current goal, and so on). Across similar tasks some or most of the production rules may be in common. However, although any given production rule may be used in more than one task, its frequency of use between tasks may vary greatly.
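The condition-action dyad described above can be sketched in a few lines. This is only an illustration of the "if x and y then add z" example from the notes; the `State` class and `rule_add_z` function are hypothetical names, not part of any cognitive architecture.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Toy model state (hypothetical, for illustration only)."""
    memory: set = field(default_factory=set)       # internal declarative elements
    environment: set = field(default_factory=set)  # elements visible externally


def rule_add_z(state: State) -> bool:
    """IF x can be retrieved from memory AND y is in the environment
    THEN add z to memory. Returns True when the rule fires."""
    if "x" in state.memory and "y" in state.environment:
        state.memory.add("z")  # action side: create an internal memory element
        return True
    return False               # condition side not satisfied, nothing happens


state = State(memory={"x"}, environment={"y"})
fired = rule_add_z(state)
```

The same rule could fire in many different tasks; what differs between tasks is how often its condition is met.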

Below the cognitive band is the biological band. Events in this band, whose duration ranges from 0.3 to 30 milliseconds reflect the working of non-cognitive, biological processes. It is at the top of the biological band that cognitive parameters emerge. Such parameters may be influenced by fatigue or anxiety, but are considered relatively impervious to the influence of task and artifact.

Level 6: Elements --> Parameters (GOMS doesn’t go here either!)
w -- amount of attentional capacity
d -- decay rate
s -- fluctuations in the strength of declarative memory elements
rt -- retrieval threshold
An architecture of cognition, such as ACT-R 4.0, provides a control structure for cognition. The interaction of cognitive elements such as production rules and memory elements is governed by the underlying architecture. The architecture controls the cycle time (how fast productions can fire), the increase in activation with each retrieval and the decrease of activation as a function of time, how fast a given memory element can be retrieved, the amount of activation available, and so on. Any given model uses a fixed set of cognitive parameters that govern a model-specific set of productions and types of memory elements. Typically most of the parameters use the default settings of the architecture, while other parameters are tweaked to provide better fits to the data.

KR not KE
GOMS (as with most task analysis methods) focuses on
Knowledge representation, not on
Knowledge elicitation
There are many sources of knowledge elicitation techniques; see:
Cooke, N. J. (1994). Varieties of Knowledge Elicitation Techniques. International Journal of Human-Computer Studies, 41(6),
Ericsson, K. A., & Simon, H. A. (1993). Protocol analysis: Verbal reports as data (Revised ed.). Cambridge, MA: The MIT Press.
Olson, J. R., & Biolsi, K. J. (1991). Techniques for representing expert knowledge. In K. A. Ericsson & J. Smith (Eds.), Toward a general theory of expertise: Prospects and limits (pp ). New York: Cambridge.
vanSomeren, M. W., Barnard, Y. F., & Sandberg, J. A. C. (1994). The think aloud method: A practical guide to modelling cognitive processes. New York: Academic Press.

Task Analysis versus Functionality
A task analysis does not guarantee functionality. For a cogent discussion of this issue, see Kieras, D. E. (in press). Task analysis and the design of functionality. CRC Handbook of Computer Science and Engineering. CRC Press, Inc.

Example of G-O-M-S
To carry out a GOMS analysis of the following task involving a digital clock: Set the clock
Top level goal: SET-CLOCK

Example of G-O-M-S: Goals
Goals and subgoals

Example of G-O-M-S: Operators
Operators are the most elementary steps in which you choose to analyze the task.
Reach <type> button
Hold <type> button
Release <type> button
ClickOn <type> button
Decide: if <x> then <y>
Verify
One analyst’s operator may be another analyst’s goal. It depends upon why you are doing the analysis.

Example of G-O-M-S: Methods
Top-level user goals: SET-CLOCK
Method for goal: SET-CLOCK
Step 1. Hold TIME button
Step 2. Accomplish goal: SET-HOUR
Step 3. Accomplish goal: SET-MIN
Step 4. Release TIME button
Step 5. Return with goal accomplished
Method for goal: SET-<digit>
Step 1. ClickOn <digit> button
Step 2. Decide: If target <digit> = current <digit>, then return with goal accomplished
Step 3. Goto 1
Be sure to list assumptions, e.g., be sure that the hand/arm/finger is in position to press the button.
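The two methods above can be traced with a small simulation. This is a sketch under assumptions that the slide does not state: each click advances the displayed value by one and wraps around, the hour display runs 0-23, and the target values are valid for the display. The function names are hypothetical.

```python
def set_digit(current, target, modulus, name, trace):
    """Method for goal: SET-<digit>."""
    while True:
        current = (current + 1) % modulus  # Step 1. ClickOn <digit> button
        trace.append(f"ClickOn {name}")
        if current == target:              # Step 2. Decide: target = current?
            return current                 #         return with goal accomplished
        # Step 3. Goto 1


def set_clock(hour, minute, target_hour, target_minute):
    """Method for goal: SET-CLOCK. Returns the new time and the operator trace."""
    trace = ["Hold TIME"]                                              # Step 1
    hour = set_digit(hour, target_hour, 24, "HOUR", trace)             # Step 2
    minute = set_digit(minute, target_minute, 60, "MIN", trace)        # Step 3
    trace.append("Release TIME")                                       # Step 4
    return hour, minute, trace                                         # Step 5


hour, minute, trace = set_clock(10, 58, 12, 1)
```

Note that the method as written clicks before it checks, so a digit that already shows the target value would be cycled all the way around. Writing the method out as code makes that kind of assumption easy to spot.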

Example of G-O-M-S: Selection rules
No selection rules in this example as this clock has only ONE method for accomplishing each goal, but . . .
Selection rule for goal: SET-HOUR
If target HOUR ≤ 4 hours from current HOUR, then Accomplish goal: ClickOn HOUR
If target HOUR > 4 hours from current HOUR, then Accomplish goal: Click&Hold HOUR
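The hypothetical selection rule for SET-HOUR can be written as a single function. This sketch assumes that the clock only advances forward and that "hours from current HOUR" is counted forward on a 12-hour dial; neither assumption is stated on the slide, and `select_hour_method` is an illustrative name.

```python
def select_hour_method(current_hour, target_hour, hours_on_dial=12):
    """Selection rule for goal: SET-HOUR (forward distance is an assumption)."""
    distance = (target_hour - current_hour) % hours_on_dial
    if distance <= 4:
        return "ClickOn HOUR"      # few clicks needed: click repeatedly
    return "Click&Hold HOUR"       # far away: hold the button to auto-advance


near = select_hour_method(3, 6)    # 3 hours ahead
far = select_hour_method(3, 11)    # 8 hours ahead
wrap = select_hour_method(11, 2)   # 3 hours ahead, across the wrap-around
```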

Applications of GOMS
Case 1. Design of mouse-driven text editor
Case 2. Directory assistance workstation
Case 3. Space operations database system (for orbital objects)
Case 4. Bank deposit reconciliation system
Case 5. CAD system for mechanical design
Case 6. Television control system
Case 7. Nuclear power plant operator's associate
Case 8. Intelligent tutoring system
Case 9. Industrial scheduling system
Case 10. CAD system for ergonomic design
Case 11. Telephone operator workstation
List compiled by John & Kieras (1996a).
John, B. E., & Kieras, D. E. (1996a). Using GOMS for user interface design and evaluation: Which technique? ACM Transactions on Computer-Human Interaction, 3(4),
John, B. E., & Kieras, D. E. (1996b). The GOMS family of user interface analysis techniques: Comparison and contrast. ACM Transactions on Computer-Human Interaction, 3(4),

Where does GOMS go?

Development process without analytic modeling

Development process with analytic modeling

GOMS as Analytic Modeling
GOMS analysis produces a model of behavior
Given a task, the model predicts the methods, or sequences of operators, that a person will perform to accomplish that task
Can look at the GOMS model in different ways to qualitatively and quantitatively assess different types of performance
GOMS is an example of an "engineering model" of human performance:
can be used to make a priori, quantitative predictions of human performance
approximations appropriate for design situation
with the psychology "built-in" so designers can use it effectively (without having to get a PhD in psychology)
The types of things we'd like to predict about computer-based tasks are:
functionality: coverage & consistency
sequence of operators
performance time
learning time
errors and error-recovery

Scope of GOMS: What it can do
Predict the sequence of operators an expert will perform
Predict performance time of expert users - even in real-world situations
Predict learning time in relatively simple domains
Predict savings due to previous learning
Help design on-line help and manuals
Examples of predicting the performance of expert users:
text-editing in a laboratory setting; Card, Moran, & Newell (1983) is the most extensive (sequence and time predictions)
using a browser to search for on-line help; Peck & John (1992) (sequence predictions)
9-year-old playing Super Mario Bros. 3 ™; John & Vera (1992) (sequence predictions)
telephone operators in the real world; Gray, John, & Atwood (1993) (time predictions) (We'll talk about this a lot today.)
time to extract information from direct-walk interactive information visualizations; Card, Pirolli, & Mackinlay (1994)
Examples of predicting learning time and transfer of learning:
text-editing in a laboratory setting; Polson & Kieras (1985)
graphics editing in a laboratory setting; Polson, Muncher, & Engelbeck (1986)
using an oscilloscope; Lee, Polson & Bailey (1989)
Examples of the design of instructional materials:
on-line help; Elkerton & Palmiter (1991)
computer manuals; Gong & Elkerton (1990)

Scope of GOMS: What it can do, cont’d
GOMS has been applied to both: User-driven interaction “Situated” or event-driven interaction User-driven interactions are those such as text-editing and spreadsheets in which primary control of what happens next rests with the user. The interaction between the computer and human can be conceived as following plans that are prestored in the user’s head. Event-driven interactions are those where the environment (e.g., computer, other people, etc.) acts independently of the user. Therefore specific interactions, including major subgoals, are as much determined by the environment’s input to the user as by the user’s input to the system. Examples are videogames, intelligent interfaces, systems with surrogate users, etc. Most interfaces contain both user-driven and event-driven aspects.

Scope of GOMS: What it might be able to do
Research has made progress on Predicting the number of some types of errors see discussion in: Gray, W. D. (2000). The nature and processing of errors in interactive behavior. Cognitive Science, 24(2), Predicting the effects of display layout on performance time Example of Errors in spreadsheet formulas; Lerch, F. J., Mantei, M. M., & Olson, J. R. (1989). Example of Errors in VCR use; Gray, W. D. (2000). The nature and processing of errors in interactive behavior. Cognitive Science, 24(2), Example of use of GOMS in graphical perception; Lohse, G. L. (1993). A cognitive model for understanding graphical perception. Human-Computer Interaction, 8(4),

Scope of GOMS: What it can't do
Predict problem-solving behavior
Predict how GOMS structure grows from user experience
Predict behavior of casual users, individual differences...
Predict the effects of fatigue, user preference, organizational impact...
For in-depth discussions of the coverage and limitations of GOMS, see:
John, B. E., & Kieras, D. E. (1996a). Using GOMS for user interface design and evaluation: Which technique? ACM Transactions on Computer-Human Interaction, 3(4),
John, B. E., & Kieras, D. E. (1996b). The GOMS family of user interface analysis techniques: Comparison and contrast. ACM Transactions on Computer-Human Interaction, 3(4),
Newell, A. & Card, S. K. (1985). The prospects for psychological science in Human-Computer Interaction. Human-Computer Interaction, 1, 3,
Newell, A. & Card, S. K. (1986). Straightening out softening up: Response to Carroll and Campbell. Human-Computer Interaction, 2, 3,
Olson, J. R. & Olson, G. M. (1990). The growth of cognitive modeling in Human-Computer Interaction since GOMS. Human-Computer Interaction, 5, 2&3,

General Factors to Consider in GOMS Models
When deciding what type of GOMS model you need, you must consider... what control structure what level of analysis whether to approximate behavior with serial or parallel processes ...different uses of GOMS models lead to different values of these factors These factors will be a recurring theme The definition of GOMS, as a characterization of tasks in terms of goals, operators, methods, and selection rules, leaves a lot of room for interpretation, judgment, and different forms of models. Some of the distinctions between different interpretations and forms are substantive, and some are more a matter of style or preference. We'll be discussing these issues at length during the day.

GOMS Family of Analysis Methods
Keystroke-Level Model CMN-GOMS NGOMSL CMN-GOMS for Highly Interactive Tasks CPM-GOMS Good examples of the different kinds of GOMS models can be found in the following articles: Keystroke-Level Model (Card, Moran & Newell, 1980a; 1983) CMN-GOMS (Card, Moran & Newell, 1980b; 1983) NGOMSL (Kieras, 1988) CMN-GOMS for Highly Interactive Tasks (John, Vera & Newell, 1994) CPM-GOMS (John, 1990; Gray, John & Atwood, 1993; Gray & Boehm-Davis, in press) = “worked example” provided in this HFES workshop

Break time

Keystroke-Level Model: Intro
The simplest of all GOMS models: OM only!!!
No explicit goals or selection rules
Operators and Methods (in a limited sense) only
“Useful where it is possible to specify the user’s interaction sequence in detail” (CMN83, p. 259).
Control structure: Flat
Serial or Parallel: Serial
Level of Analysis: Keystroke-level operators
Sources:
Card, S. K., Moran, T. P., & Newell, A. (1980). The keystroke-level model for user performance time with interactive systems. Communications of the ACM, 23(7),
Card, S. K., Moran, T. P., & Newell, A. (1983). The psychology of human-computer interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.
Kieras, D. (1993). Using the keystroke-level model to estimate execution times. University of Michigan. (see his website)

Keystroke-Level Model: Example
TAO example Example based upon an industrial application of the KLM In one workstation design, toll and assistance operators type in the calling card number “blind.” They do not see the numbers they are typing until all 14 digits have been typed. Correcting an error requires that all 14 digits be retyped!!! Question to the HF expert from the software design team: “This is obviously bad. We can do better. You guys (the HF folk) just tell us what system to build and we will build it. By the way, can you tell us by tomorrow?”

Keystroke-Level Model: Overview
Step 1: Lay out assumptions
Step 2: Write out the basic action sequence (list the keystroke-level physical operators involved in doing the task)
Step 3: Select the operators and durations that will be used
Step 4: List the times next to the physical operators for the task
Step 4a: If necessary, include system response time operators for when the user must wait for the system to respond
Step 5: Next add the mental operators and their times
Step 6: Sum the times of the operators
Sources: Card, S. K., Moran, T. P., & Newell, A. (1980a; 1983)
Notes: This is the procedure for using the KLM to predict expert performance. The total of the operator times is the estimated time to complete the task. Before we get started, let me walk you through the operators we will use (step 3) and some heuristics for placing the mentals (step 5).

Keystroke-level Model: Operators
K: Keystroke
T(n): Type a sequence of n characters on a keyboard
P: Point with mouse to a target on a display
B: Press or release mouse button
BB: Click mouse button
H: Home hands to keyboard or mouse
M: Mental act of routine thinking
W(t): Waiting time for system to respond
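As a rough sketch of the final steps of the KLM procedure (list the operator times, then sum them), the operators above can be given durations and added up. The 1.2 s mental and 1.1 s pointing times are the values quoted in these notes; the K, B, and H durations are assumed placeholder values, and `klm_time` is a hypothetical helper rather than part of any GOMS tool. Substitute whatever durations you select in Step 3.

```python
import re

# Seconds per operator. M and P come from the workshop notes;
# K, B, and H are ASSUMED illustrative values.
OPERATOR_SECONDS = {
    "K": 0.28,  # keystroke (assumed)
    "P": 1.10,  # point with mouse
    "B": 0.10,  # press or release mouse button (assumed)
    "H": 0.40,  # home hands to keyboard or mouse (assumed)
    "M": 1.20,  # mental act of routine thinking
}


def klm_time(sequence, wait_seconds=0.0):
    """Sum operator times for a '+'-separated sequence such as 'M+P+BB'.

    A count prefix repeats an operator ('4K' is four keystrokes, i.e. T(4)),
    and a run of B's is counted one B at a time ('BB' is a click).
    W(t) is passed separately as wait_seconds.
    """
    total = wait_seconds
    for token in sequence.split("+"):
        match = re.fullmatch(r"(\d*)(K|P|H|M|B+)", token.strip())
        if match is None:
            raise ValueError(f"unrecognized operator: {token!r}")
        count = int(match.group(1) or 1)
        op = match.group(2)
        total += count * OPERATOR_SECONDS[op[0]] * len(op)
    return round(total, 2)


menu_click = klm_time("M+P+BB")   # think, point, click
typed_word = klm_time("H+M+4K")   # home to keyboard, think, type 4 characters
```

Keeping the durations in one table makes it easy to rerun the same sequences with a different set of operator times.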

Card, Moran, and Newell on “Mentals”
“M operations represent acts of mental preparation for applying physical operations. Their occurrence does not follow directly from the physical encoding, but from the specific knowledge and skill of the user” p. 267 “The rules for placing M’s embody psychological assumptions about the user and are necessarily heuristic, especially given the simplicity of the model” p. 267. In our experience, adding the mental operators is the source of greatest confusion. Remember, GOMS does NOT apply to problem solving, only to routine, skilled behavior. The KLM typically assumes expert performance. Here is what Kieras (1993) says about the mentals: M - Mental act of routine thinking or perception ( sec; use 1.2 sec). Of course, how long it takes to perform a mental act depends on what cognitive processes are involved, and is highly variable from situation to situation or person to person. This operator is based on the fact that when reasonably experienced users are engaged in routine operation of a computer, there are pauses in the stream of actions that are about a second long and that are associated with routine acts such as remembering a filename or finding something on the screen. The M operator is intended to represent this routine thinking, not complex, lengthy, problem-solving, racking the brain, or creative meditations. In a variety of routine computer usage tasks such as word processing and spreadsheet usage, these routine pauses are fairly uniform in length, justifying the simplifying assumption that all Ms take the same amount of time, around one sec.

Heuristics for inserting mental operators
Basic psychological principle: physical operations in methods are chunked into submethods.
RULE 0: Insert M’s in front of all K’s or B’s that are not part of argument strings proper (e.g., text or numbers). Place M’s in front of all P’s that select commands (not arguments) or that begin a sequence of direct-manipulation operations belonging to a cognitive unit.
Pointing to a cell on a spreadsheet is pointing to an argument -- no M
Pointing to a word in a manuscript is pointing to an argument -- no M
Pointing to an icon on a toolbar is pointing to a command -- M
Pointing to the label of a drop-down menu is pointing to a command -- M
Heuristics for inserting mentals: based upon CMN’83, Figure 8.2, p. 265. Rule 0 is as amended by John & Kieras, 1996b, p. 327. Begin with a method (sequence of keystroke-level operators) that includes all physical operations and system responses. Use CMN rule 0 to place candidate M’s, then cycle through rules 1 to 4 for each candidate M to determine if it should be deleted.

Rules 1-4 are heuristics (rules of thumb) for deleting mentals. “A single psychological principle lies behind all the deletion heuristics: physical operations in methods are chunked into submethods” p. 268. “The user cognitively organizes his methods according to these submethod chunks, which usually reflect syntactic constituents of the system’s command language. Hence, the user mentally prepares for the next physical chunk, not just the next physical operation. It follows that in executing methods the user is more likely to pause between chunks than within chunks. The rules attempt to identify submethod chunks” p. 268.

RULE 1: If an operator following an M is fully anticipated in an operator just previous to M, then delete the M (e.g., PMK --> PK or PMBB --> PBB).
That is, the “M” drops out because the “P” and “BB” belong together in a chunk -- a mental unit. The button press “BB” is fully anticipated as the cursor is being moved to the target.
Note on “fully anticipated”: the time required to do the mental is pretty close to the time it takes to point (P = 1100 ms whereas M = 1200 ms); hence the user is assumed to be simultaneously moving the mouse and performing the mental.

RULE 2: If a string of MK’s or MB’s belongs to a cognitive unit (e.g., the name of a command), then delete all M’s but the first.
Works with command names -- but what is a command name in a GUI interface?
Physical actions: P(File) + B + P(Save) + B
RULE 0: MP + MB + MP + MB
RULE 1: MPB + MPB
Does rule 2 apply to eliminate the middle mental? MPBPB ?
Based on the available results (Olson & Olson, 1989), a good overall estimate for the duration of an M is 1.2 sec. Choosing how many Ms are involved, and where they appear, is the hardest part of using the KLM.

RULE 3: If a K is a redundant terminator (e.g., the terminator of a command immediately following the terminator of its argument), then delete the M in front of it.
Applies to clicking OKAY in dialog buttons after you select a command; e.g., in Powerpoint, you have selected text, gone to the FORMAT:FONT palette, clicked on bold, and now point and click on OKAY -- pointing to and clicking on OKAY is PBB, not MPBB.

RULE 4: If a K terminates a constant string (e.g., a command name), then delete the M in front of it; but if the K terminates a variable string (e.g., an argument string), then keep the M in front of it.

The four heuristics do NOT capture the notion of method chunks precisely -- these are only approximations Ambiguities: Is something “fully anticipated” or is something else a “cognitive unit”? Much of this ambiguity stems from variations in expertise of the users we are modeling

Basic psychological principle: physical operations in methods are chunked into submethods.
RULE 0: Insert M’s in front of all K’s or B’s that are not part of argument strings proper (e.g., text or numbers). Place M’s in front of all P’s that select commands (not arguments) or that begin a sequence of direct-manipulation operations belonging to a cognitive unit.
RULE 1: If an operator following an M is fully anticipated in an operator just previous to M, then delete the M (e.g., PMK --> PK or PMBB --> PBB).
RULE 2: If a string of MK’s or MB’s belongs to a cognitive unit (e.g., the name of a command), then delete all M’s but the first.
RULE 3: If a K is a redundant terminator (e.g., the terminator of a command immediately following the terminator of its argument), then delete the M in front of it.
RULE 4: If a K terminates a constant string (e.g., a command name), then delete the M in front of it; but if the K terminates a variable string (e.g., an argument string), then keep the M in front of it.

KLM--mentals: example 1.
Example: SET COLUMN WIDTH 5<cr>
List the keystroke level physical operators involved in doing the task: KKKKKKKKKKKKKKKKKKK (19 K’s)
RULE 0: M+KKKK+M+KKKKKKK+M+KKKKKK+K+M+K, or M+4K(set_)+M+7K(column_)+M+6K(width_)+1K(5)+M+1K(<cr>)
RULE 1: No change in this example
RULE 2: M+17K(set_column_width_)+1K(5)+M+1K(<cr>)
RULE 3: No change in this example
RULE 4: “SET_COLUMN_WIDTH_” is the command, “5” is the argument, and “<cr>” is the carriage return that terminates the command. The <cr> follows a variable string (the argument), so its M is kept.
Notes: Rule 0 -- M’s in front of SET, COLUMN, WIDTH, <CR>. No M in front of “5” as that is part of an argument.
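The arithmetic behind example 1 can be sketched in a few lines of Python. This is only an illustration, not part of the workshop materials: the helper name `klm_time` is hypothetical, K = 0.28 sec is the keystroke time used in the workshop tables, and M = 1.2 sec is an assumption inferred from the 6.88 sec total that the workshop reports for 2M + 16K.

```python
import re

# Operator durations in seconds.
# K = 0.28 is the keystroke time used in the workshop tables.
# M = 1.2 is an ASSUMPTION, inferred from the 6.88 sec reported for 2M + 16K.
DURATIONS = {"K": 0.28, "M": 1.2}


def klm_time(expression, durations=DURATIONS):
    """Sum a KLM string such as 'M+17K(set_column_width_)+1K(5)+M+1K(<cr>)'.

    Annotations in parentheses are ignored; a missing count means 1.
    """
    total = 0.0
    for term in expression.split("+"):
        term = re.sub(r"\(.*?\)", "", term).strip()  # drop "(set_)" etc.
        match = re.fullmatch(r"(\d*)([A-Z])", term)
        if match is None:
            raise ValueError(f"cannot parse term: {term!r}")
        count = int(match.group(1) or 1)
        total += count * durations[match.group(2)]
    return total


after_rule_0 = "M+4K(set_)+M+7K(column_)+M+6K(width_)+1K(5)+M+1K(<cr>)"
after_rule_2 = "M+17K(set_column_width_)+1K(5)+M+1K(<cr>)"

print(f"after RULE 0: {klm_time(after_rule_0):.2f} sec")  # 4M + 19K
print(f"after RULE 2: {klm_time(after_rule_2):.2f} sec")  # 2M + 19K
```

Under these assumed durations, RULE 2 removes two M’s, so the estimate drops by 2.4 sec. That is why decisions about cognitive units dominate the prediction.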

KLM--mentals: example 2
Example: spellcheck “catelog”
List the keystroke level physical operators involved in doing the task: P+BBBB+P+BB (where BB is a mousedown + mouseup, and BBBB is a doubleclick)
RULE 0: P+M+BBBB+M+P+BB
RULE 1: n/a
RULE 2: n/a (“catelog” + spellcheck do not form a cognitive unit)
RULE 3:
RULE 4:
Notes: Example 2: point to and double click the word “catelog” to select it for spell checking, then invoke the spell check command by pointing to and single clicking on its icon. The word is an argument; spell checking is the command.
SEQUENCE =
POINT TO <argument>: “catelog”
doubleClick on <argument>: BBBB
POINT TO <command>: icon
clickOn <command>: BB
So the argument is selected first, then the command. Hence by Rule 0:
No M before the P that points to “catelog”, as that is an argument, not a command
M before the BBBB that selects “catelog” (BBBB is a command, not an argument)
M before the P that points to the spellcheck icon (a command)
M before the BB that clicks on the icon
RULE 1 -- clicking on the argument and clicking on the command are each fully anticipated in the operator just previous to the M, so delete these M’s.

KLM--mentals: example 3
Example: save a file on a Mac using menus
List the keystroke level physical operators involved in doing the task: P+B+P+B
RULE 0: M+P+M+B+M+P+M+B
RULE 1: M+P+B+M+P+B
RULE 2: n/a or M+P+B+P+B ???
Issue: Is this FILE-->SAVE menu selection a single cognitive unit or two?
Notes: Example 3: saving a file on the Macintosh using the menu sequence (all commands, no args!!):
point to FILE menu
mouseDn on FILE menu
point to SAVE item
mouseUp at SAVE item
Q: How do we identify what the chunks are for different users?
A: Good point. That is why these are called heuristics and not rules. So FILE:SAVE may be a cognitive unit for expert users, but INSERT:BREAK probably is not (it is much less frequently used).
The KLM is not rocket science. It is rude and crude and fast, approximate modeling with all of its advantages and disadvantages!!!

Keystroke-Level Model: m1 current
Step 1: Lay out your assumptions
There are several fields on the display; the first thing that any error recovery method must do is to identify the field to be changed. In this case the field is the calling-card field (CCN).
For purposes of this exercise, we assume the error is made in the second number of the exchange.
TAO’s hands are on the keyboard.
Step 2: Write out the basic action sequence (the physical operators)
ƒkey(ccn) + digit(14) + enterKey
Sources: Card, S. K., Moran, T. P., & Newell, A. (1980); Card, S. K., Moran, T. P., & Newell, A. (1983); Kieras, D. (1993)
Instructor Notes: 1. Returning to the problem on slide 42, how do you do a KLM of current practice? 2. Put slide 42 on second screen (keep heuristics for mentals handy)

Step 3: Select the operators and durations that will be used
We will use the ones from Kieras (1993).
Note: this is the lowest level of task analysis. For the KLM, it is the only level of analysis.
Sources of times: There are many sources; some of our favorite (most useful) sources include:
Barnes, R. M. (1963). Motion and time study: Design and measurement of work (5th ed.). New York: John Wiley & Sons, Inc.
Card, S. K., Moran, T. P., & Newell, A. (1980). The keystroke-level model for user performance time with interactive systems. Communications of the ACM, 23(7),
Card, S. K., Moran, T. P., & Newell, A. (1983). The psychology of human-computer interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.
John, B. E. (1990). Extensions of GOMS Analyses to Expert Performance Requiring Perception of Dynamic Visual and Auditory Information. In J. C. Chew & J. Whiteside (Eds.), ACM CHI'90 Conference on Human Factors in Computing Systems, (pp ). New York: ACM Press.
Kieras, D. (1993). Using the keystroke-level model to estimate execution times. University of Michigan.

Step 4: List the times next to the physical operators for the task. This slide is simply the physical actions from step 2 plus the times from step 3. 16K

Step 5: Next add the mental operators and their times
Step 6: Sum the times of the operators
Predicted time for current method is 6.88 sec (note: this time is the same regardless of “where” the error is made)
RULE 0: M + K + 14K(argument) + M + K
A. M before ƒCCN as that is a command
B. No M before 14 digits as that is an argument
C. M before ENTER as that terminates an argument -- it is not a redundant terminator, and it terminates a variable string
None of rules 1, 2, 3, or 4 applies.
Notes: In our experience, adding the mental operators is the source of greatest confusion. Remember, GOMS does NOT apply to problem solving, only to routine, skilled behavior. The KLM typically assumes expert performance.
Remember, here we have:
fCCN = a command
14 digits = argument string
enter = keystroke that terminates an argument string
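As a check on Steps 4 through 6, the sum for the current method can be laid out as a small table in Python. K = 0.28 sec comes from the workshop tables; M = 1.2 sec is not stated in the text and is assumed here because it is the value that makes 2M + 16K equal the reported 6.88 sec.

```python
K = 0.28  # seconds per keystroke, as used in the workshop tables
M = 1.2   # seconds per mental operator: an ASSUMPTION, chosen because
          # 2M + 16K then reproduces the 6.88 sec reported for method 1

# (operator, description, count, unit time in seconds)
method_1 = [
    ("M", "mental preparation before the command", 1, M),
    ("K", "fCCN function key (command)", 1, K),
    ("K", "14 digits (argument, so no M)", 14, K),
    ("M", "mental preparation before ENTER", 1, M),
    ("K", "ENTER key (terminates a variable string)", 1, K),
]

total = 0.0
for op, description, count, unit in method_1:
    step_time = count * unit
    total += step_time
    print(f"{op} x{count:<3} {step_time:5.2f}  {description}")
print(f"total  {total:.2f} sec")
```

Because every digit is retyped in this method, the count of 14 does not depend on where the error occurred, which is why the prediction is the same for any error position.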

Keystroke-Level Model: m2 bs/delete
Step 1: Lay out your assumptions
Same as model 1, except that the delete key backs up and deletes each digit.
Step 2: Write out the basic action sequence (the physical operators)
ƒkey(ccn) + delKey(10) + digit(10) + enterKey
Step 3: Same operators as for model 1.
Notes: This represents the model for an alternate approach to correcting errors.
Instructors Note: dab starts here

Step 4: List the times next to the physical operators for the task.
Notes: This slide is simply the physical actions from step 2 plus the times from step 3.
fCCN = command
bs/delete = argument or command?
digits to retype = argument
enter = keystroke that terminates an argument string

Step 5: Next add the mental operators and their times
Step 6: Sum the times of the operators
Notes: An issue here is whether bs/delete to digit is a command or an argument. We choose to interpret it as a command+argument, “bs/delete(10)”. By Rule 0 the argument does not need a mental. By Rule 1 the command of bs/delete is fully anticipated by the prior operator -- ƒCCN.

Keystroke-Level Model: m3 bkup/delete
Step 1: Lay out your assumptions
Same as model 1, except that the backup key backs up without deleting and the delete key backs up and deletes.
Step 2: Write out the basic action sequence (the physical operators)
ƒkey(ccn) + bkupKey(9) + delKey(1) + digit(1) + enterKey
Step 3: Same operators as for model 1.
Notes: This is the model for a third approach to correcting errors.
fCCN = command
bs(digits) = argument or command; or command (bs) + arguments (9 digits)
delKey = command
digit to retype = argument
enter = keystroke that terminates an argument string

Keystroke-Level Model: m3 bkup/delete
Step 4: List the times next to the physical operators for the task. This slide is simply the physical actions from step 2 plus the times from step 3.

Step 5: Next add the mental operators and their times (Your turn!!! Our answers are on the next page, no peeking!!!)
Method 3: bkup-delete
action                      op     NUM  type  time
press reset function key    fCCN   1    K     0.28
backup to digit             bkup   9    K     2.52
delete digit                del    1    K     0.28
digits to retype            digit  1    K     0.28
outpulse new num to dbase   enter  1    K     0.28
total time
Note: It is definitely fair to look at models 1 and 2 when you decide where to put the mentals for model 3. Remember, “consistency” is all. The relative predictions of keystroke level GOMS are much more accurate than the absolute time predictions!!! But only if the analyst is consistent.

Step 5: KLM w/mentals.

Keystroke-Level Model: m4 zap-gp
Step 1: Lay out your assumptions
Same as model 1, except that there are four separate function keys, each of which zaps (deletes) either the area code, exchange, line, or PIN. Only the zapped numbers need to be retyped.
Step 2: Write out the basic action sequence (the physical operators)
ƒkey(ccn) + zapExch(1) + digit(3) + enterKey
Step 3: Same operators as for model 1.
Notes: This is the model for a fourth approach to correcting errors.
fCCN = command
zapExch = command
digits to retype = argument
enter = keystroke that terminates an argument string

Step 4: List the times next to the physical operators for the task. Your turn!! (Our answers are on the next page, no peeking!!!)
Method 4: zap-gp
action                      op     NUM  type  time
total time
Notes: This slide is simply the physical actions from step 2 plus the times from step 3.

Step 5: Next add the mental operators and their times (Your turn!!! Our answers are on the next page, no peeking!!!)
Method 4: zap-gp
action                      op     NUM  type  time
press reset function key    fCCN   1    K     0.28
zap-group                   zap    1    K     0.28
digits to retype            digit  3    K     0.84
outpulse new num to dbase   enter  1    K     0.28
total time
Note: It is definitely fair to look at your previous models when you decide where to put the mentals for this model. Again, remember, “consistency” is all. The relative predictions of keystroke level GOMS are much more accurate than the absolute time predictions!!! But only if the analyst is consistent.

Step 5: KLM model w/mentals

Keystroke-Level Model: Summary
Of the three new methods, only one seems likely to be fast enough to justify the expense of redesign.
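To see where the methods differ before any mentals are added, the physical operator counts from Step 2 of each model can be totaled side by side. The sketch below uses only the keystroke counts given in the four action sequences and K = 0.28 sec from the tables. It deliberately leaves out the M operators, whose placement is the exercise, so these are physical-only times and not the full predictions.

```python
K = 0.28  # seconds per keystroke, as used in the workshop tables

# Keystroke counts taken from Step 2 of each model (physical operators only).
methods = {
    "m1 current":     {"fCCN": 1, "digit": 14, "enter": 1},
    "m2 bs/delete":   {"fCCN": 1, "del": 10, "digit": 10, "enter": 1},
    "m3 bkup/delete": {"fCCN": 1, "bkup": 9, "del": 1, "digit": 1, "enter": 1},
    "m4 zap-gp":      {"fCCN": 1, "zapExch": 1, "digit": 3, "enter": 1},
}

physical_time = {
    name: round(sum(counts.values()) * K, 2) for name, counts in methods.items()
}

for name, seconds in physical_time.items():
    keystrokes = sum(methods[name].values())
    print(f"{name:15} {keystrokes:3d} K  {seconds:5.2f} sec (no mentals)")
```

The final ranking also depends on how many M’s each method needs, so the comparison is only complete once the mentals are placed consistently across all four models.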

Keystroke-Level Model: Dialog box interface
The task: To issue a query on telephone numbers to one out of several databases. The users will be working in a window system, and the information that is to be looked up in the databases can be assumed to be visible on the screen.
Notes: This project involves two GUI versions of the infamous MANTEL interface. These were developed and used by Jakob Nielsen in experiments originally conducted at BellCore. The first interface is called the Dialog Box interface and the second is the Pop-Up Menu interface. The task is to do a KLM on BOTH interface alternatives and predict the expert performance time for two tasks (single telephone number query, double telephone number query). We will start on the first one during the workshop and see how far we get before lunch!!
"The users' task in this study is that of issuing a query to one out of several databases on one or more data items such as telephone numbers, customer names, addresses, etc. For the purpose of the current study, we looked at queries on telephone numbers. Furthermore, we only studied queries on one or two telephone numbers even though the actual application sometimes involves queries for more than two telephone numbers. The users will be working in a window system, and the information that is to be looked up in the databases can be assumed to be visible on the screen..."
[Figure 1] The screen as it looks before the menu choices. The telephone numbers to be looked up are in the window at the bottom of the screen.
Thanks to Bonnie John for allowing us to use this material!!!

Keystroke-Level Model: Dialog box interface (2)
To use this interface, the user first pulls down a menu from the menubar at the top of the screen. This menu contains the names of the databases, and the user positions the mouse cursor over the name of the relevant database in this list. (Fig 2)
POINT to databases menu
clickOn menu
POINT to database item

A hierarchical submenu appears with legal queries for the chosen database. This submenu contains an average of four alternatives, and the user moves the mouse until it highlights the option "query on telephone number." The user then releases the mouse button. (Fig 3)
POINT to databases menu
clickOn menu
POINT to database item (submenu appears)
********************
POINT to queryType
clickOn queryType “query on telephone number”
W (system response -- palette appears)

This causes a standard-sized dialog box to appear in the middle of the screen as shown in Figure 4. The large field at the top is initially empty. This field will eventually list the telephone numbers to be submitted to the database as a query. The dialog box does not overlap the window with the list of telephone numbers.
POINT to databases menu
clickOn menu
POINT to database item (submenu appears)
POINT to queryType
clickOn queryType “query on telephone number”
W (system response -- palette appears)
***************

The user clicks in the input field (the one-line field below the prompt "Inquiry on telephone number") and types the telephone number. The user clicks on the "Add" button to add the number to the query. (Fig 5, Fig 6)
POINT to databases menu
clickOn menu
POINT to database item (submenu appears)
POINT to queryType
clickOn queryType “query on telephone number”
W (system response -- palette appears)
***************
POINT to input field
clickOn field (BB)
home hands to keyboard
type number (8 K)
home hands to mouse
POINT to ADD
clickOn ADD

The state of the dialog box after the user's click on "Add." If the query is for a single telephone number, the user then clicks on the "OK" button to submit the query. (Fig 7, Fig 8)
POINT to databases menu
clickOn menu
POINT to database item (DB4 -- submenu appears)
POINT to queryType
clickOn queryType “query on telephone number”
W (system response -- palette appears)
POINT to input field
clickOn field (BB)
home hands to keyboard
type number (8 K)
home hands to mouse
POINT to ADD
clickOn ADD
*************
POINT to OK
clickOn OK

Assume the hand is on the mouse at the beginning.
POINT to databases menu: M before P as it is selecting a command (RULE 0)
clickOn menu: fully anticipated (RULE 1 & RULE 2 -- assuming expertise, so the entire sequence of DATABASES:DB4:query on telephone number is well-practiced)
POINT to database item (submenu appears) (RULE 2)
POINT to queryType (RULE 2)
clickOn queryType “query on telephone number” (RULE 1 & RULE 2)
W (system response -- palette appears)
POINT to input field (RULE 0 -- assuming this is part of a new cognitive unit)
clickOn field (BB) (RULE 1 & RULE 2)
home hands to keyboard (RULE 1 & RULE 2 -- M has ample time to occur while homing, but it seems reasonable to regard homing as part of the command that results in typing in the number)
PERCEIVE number on screen (one might argue that this is fully anticipated during the homing, RULE 1 -- here we assume that once hands are on the keyboard the user doublechecks the number immediately prior to typing it in)
type number (8 K): argument (RULE 0)
Verify correctness (RULE 4 -- variable argument)
home hands to mouse
POINT to ADD (RULE 1 -- fully anticipated during homing)
clickOn ADD (RULE 1 & RULE 2)
POINT to OK (RULE 0 -- needs M) --> Ss could decide to add a second number to the list.
clickOn OK (RULE 1)

Keystroke-Level Model: Pop-up menu (1)
The second interface is the Pop-Up Menu interface.
As previously mentioned, it can be assumed that the telephone number(s) in question is/are already visible on the screen. (Figure 9)
To query a number, the user moves the mouse cursor to the telephone number on the screen and clicks on the mouse button. This causes a pop-up menu to appear over the number with one element for each database for which queries can be performed with telephone numbers as keys. (Figure 10)
move hands to mouse
POINT to number
clickOn number

The user moves the mouse until the desired database is highlighted in the pop-up menu. (Figure 11)
The user then clicks the mouse button (the system knows which number to search on because the user pointed to it when calling up the menu). (Figure 12)
move hands to mouse
POINT to number
clickOn number
*********
POINT to database
clickOn target database

Your turn!! Action/operator sequence only. No mentals. (Our answers are on the next page, no peeking!!!)
Model 3: Pop-Up Menu, one query
action/operator sequence    #ops  KLM op  time
Enter telephone number
total time

Now try adding the mentals. (Our answers are on the next page, no peeking!!!) INSTRUCTORS’ SOLUTION

KLM w/mentals: INSTRUCTORS’ SOLUTION
Assume the hand is on the mouse.
POINT to number: M before P (RULE 0)
clickOn number: fully anticipated (RULE 1 & RULE 2)
POINT to database (RULE 2, consistent with model for Dialog box interface)
clickOn target database (RULE 1 & RULE 2, consistent with model for Dialog box interface)

KLM vs “expert judgment”

Break time (Lunch Time!)

NGOMSL: Natural GOMS Language
Based on a structured natural language notation for GOMS models and a procedure for constructing them
Models are in program form
Control Structure: Hierarchical goal stack
Serial or parallel: Serial
Level of Analysis: As necessary for your design question, from Unit-Task Level (operator duration ~ 30 sec) down to elementary perceptual, cognitive and motor operators (duration ~ 100 msec)
Key reference: Kieras, D. (1997). A guide to GOMS model usability evaluation using NGOMSL. In M. Helander, T. K. Landauer, & P. Prabhu (Eds.), Handbook of Human-Computer Interaction. New York: Elsevier.
Notes: Breadth-first, progressive deepening of the goals in a goal hierarchy. Stop deepening when methods and operators satisfy the requirements of your design problem. Include selection rules to choose between methods. Include the cognitive overhead of goal-hierarchy and working-memory manipulation.

NGOMSL - why?
More powerful than KLM
Much more useful for analyzing large systems
More built-in cognitive theory
Provides predictions of operator sequence, execution time, and time to learn the methods
Allows for goals and subgoals as well as explicit selection rules; not possible with KLM
Based on MHP (KLM is brain-damaged MHP; NGOMSL uses more of MHP theory)
Notes: Represents methods in terms of a cognitive architecture called cognitive complexity theory (CCT; Kieras & Polson, 1985; Bovair, Kieras, & Polson, 1990)
- CCT is a production-system implementation of GOMS that assumes a simple serial stage architecture in which working memory triggers production rules that apply at a fixed rate; these rules alter the contents of working memory or execute primitive external operators such as making a keystroke.
- GOMS methods are represented by sets of production rules in a prescribed format.
- Learning procedural knowledge consists of learning individual production rules (which can be transferred from other learned tasks).
Can provide estimates/predictions of sequence, execution time, and time to learn. E.g., if you provide estimates of operator duration, you can get predictions of error-free expert performance time as well as predictions of consistency of methods, learning time, and transfer of training (see Polson and Kieras, 1985).
- Because they specify methods in program form, NGOMSL models can characterize the procedural complexity of a task, both in terms of how much must be learned and how much has to be executed. But since NGOMSL assumes a serial stage model of information processing, it works only for hierarchical and sequential methods.
- So, limitations: there is no provision for representing methods whose steps can be done in any order, or which could be interrupted, suspended, and resumed.
Also, since perceptual and motor activities are represented by operators which are embedded in the sequential methods, there is no way to represent how these might overlap with other activities.

NGOMSL - Overall Approach
Step 1: Perform goal/subgoal decomposition
Step 2: Develop a method to accomplish each goal
List the actions/steps the user has to do to accomplish the goal (as general and high-level as possible for the current level of analysis)
Identify similar methods/collapse where appropriate
Step 3: Add flow of control (decides)
Step 4: Add verifies
Step 5: Add perceptuals, etc.
Step 6: Add mentals for retrieves, forgets, recalls
Step 7: Add times for each step
Step 8: Calculate total time
Notes:
- Use the principles for KLM for guidance on steps
- Define new high-level operators, and bypass complex psychological processes as needed (make note of new operators and task descriptions)
- Make simplifying assumptions as needed, such as deferring consideration of shortcuts, etc.
- If there is more than one method for accomplishing a goal, draft each method and then draft the selection rule set for the goal (or make the simplifying assumption that alternative methods are not used, and defer them to later)

NGOMSL - Examples
Car clock
Presetting radio stations: simple, perverse

NGOMSL--Car Radio example (1)
Provides predictions of methods and operators used to complete a task. If you provide estimates of operator-duration, you can get predictions of error-free expert performance time.

NGOMSL-- Car Radio example (2)
Goal/subgoal hierarchy

Assumptions
User's hands are NOT on the buttons at the beginning.
Time to move hand & arm to time change level is estimated by 2ft "reach" from Barnes (1963): 410 msec.
D = device time, time for clock to move forward one number, = 500 msec (we estimate this on the next slide)
I am assuming a 12-hr clock. This seems a good assumption for a car clock.
Radio is off at beginning and end.
Begin knowing the time you want to set the clock to.

D = device time, time for clock to move forward one number, = 500 msec
Based on cognitive task analysis: the display needs about 0.5 sec per increment so that people can read each number and anticipate stopping.
This assumes that people don’t anticipate future increments, which is probably a faulty assumption.
Example of using a lower level analysis -- CPM-GOMS -- to derive the minimum duration for a wait (W) activity.
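The wait (W) implied by this device time is simple to compute: the number of forward increments times D. The sketch below is an illustration under stated assumptions, not the workshop's model. It assumes the clock only advances forward, one number per D = 500 msec, and that hours (12-hr clock) and minutes are set separately; the function names are hypothetical.

```python
D = 0.500  # device time per increment in seconds, from the slide


def steps_forward(current, target, modulus):
    """Number of forward increments from current to target on a wrapping display."""
    return (target - current) % modulus


def wait_time(current, target, modulus):
    """Wait (W) in seconds while the display advances to the target value."""
    return steps_forward(current, target, modulus) * D


hour_wait = wait_time(3, 7, 12)       # 3 -> 7 on a 12-hr clock: 4 increments
wrap_wait = wait_time(11, 2, 12)      # 11 -> 12 -> 1 -> 2: 3 increments
minute_wait = wait_time(15, 40, 60)   # 15 -> 40 minutes: 25 increments

print(hour_wait, wrap_wait, minute_wait)
```

Because the wait grows with the distance between the current and target values, the predicted time to set the clock depends on the starting time as well as on the method.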

Instructor notes: Explicitly discuss 1. statement time 2. compare with KLM M’s 3. Additive nature of time within steps (statement time + operator time) 4. different levels: subgoals and operators in this method. 5. Stylistic differences in use of subgoals. Key is consistency.
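The additive nature of time within steps (point 3) can be sketched as follows. The method below is hypothetical and is not the workshop's car-clock model. The 0.1 sec statement time is an assumption based on the value commonly used for NGOMSL statements in Kieras's guide; the 410 msec reach and 500 msec device time are the values from the assumptions slide.

```python
STATEMENT_TIME = 0.1  # seconds per NGOMSL statement: an ASSUMED value
REACH = 0.410         # 2 ft reach, from the assumptions slide (Barnes, 1963)
D = 0.500             # device time per clock increment, from the slides

# HYPOTHETICAL method for illustration: (statement, operator time in seconds).
# Each step costs one statement time plus the time of any operator it executes.
method_set_hour = [
    ("Step 1. Reach to the hour button.", REACH),
    ("Step 2. Hold the button until the target hour shows.", 4 * D),
    ("Step 3. Return with goal accomplished.", 0.0),
]

total = 0.0
for statement, operator_time in method_set_hour:
    step_time = STATEMENT_TIME + operator_time
    total += step_time
    print(f"{step_time:4.2f}  {statement}")
print(f"{total:4.2f}  total")
```

Compared with the KLM, the cognitive overhead here is spread across statements and not concentrated in separate M operators, which is the comparison suggested in point 2.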

Note: common methods for similar tasks: set-HR and set-MIN. Finding such commonalities is a feature of the ngomsl level of analysis. Would have missed this with the KLM. Note: be sure to talk about SRT issues.

NGOMSL-- PreSetting a station (1)

A general flattening of the goal hierarchy has taken place. Several subgoals have been collapsed into one or two simple operators. See NGOMSL analysis for a picture of what has happened. BTW: This is based upon the car radio shown several slides ago. Note differences between this and previous slide.

NGOMSL-- PreSetting perverse (1)
Note major differences between this design, the optimal design, and the general goal hierarchy. This design uses all the subgoals of the general goal hierarchy plus it introduces a new layer of subgoals. Other differences: Which of the three goal structures would be easier to learn? Which depends upon display-based features and which depends more on “in-the-head” or prelearned knowledge?

NGOMSL--comparison of optimal and perverse designs
For optimal design each method is used just once. For perverse design the enter/exit mode method is used 7 times!!! N.B. no real examples of selection rules in any of our examples

Break time

CPM-GOMS: The Embodied Cognition level
Analyzing activities into microstrategies

CPM-GOMS Extension to GOMS
Critical Path Method (or Cognitive Perceptual Motor) GOMS
Control structure: Relaxed hierarchy; can be interrupted and continued based on new information in the world
Serial or Parallel: Parallel
Level of Analysis: Primarily elementary perceptual, cognitive and motor operators
Key references: John, 1990; Gray, John, & Atwood, 1993; John & Kieras, 1997a, 1997b; Gray & Boehm-Davis, in press
Notes: A task analysis technique for predicting execution time based on an analysis of how the user performs cognitive, perceptual, and motor actions while executing a task, whether the activities occur serially or in parallel.

CPM-GOMS - why? Need for analysis suitable for parallel activities
Human cognition is embodied cognition
In some cases we need to understand the interleaving and interdependencies between elementary cognitive, perceptual, and motor operations
Based on the model human processor (Card, Moran, and Newell, 1983); more direct correspondence than other GOMS
Uses a schedule chart (or PERT chart) to represent individual operators and the dependencies among them
CPM can represent both Cognitive-Perceptual-Motor and Critical Path Method, since the time to execute the critical path through the schedule chart provides the prediction of total task time
Primitive operators in CPM-GOMS must be at the level of the cycle times for each processor, so they are much more detailed than previous GOMS models
Notes: If we look, we will almost always find evidence that perceptual and motor factors influence performance.

Brief Introduction to PERT Charts
PERT charts are a project management tool for planning and tracking schedules. Also called schedule charts, they can be adapted for cognitive modeling. From MacProject II: Getting Started (Claris Corporation, 1989) "Tasks are the units of work in a project. Each task has a box in the Schedule Chart." (p. 3-5) "Milestones mark important points in a project, such as the beginning or end..." (p. 3-11) "Milestones often do not have durations..." (p.3-16) "In most projects, some tasks...cannot begin until another task has ended...You add dependency lines between boxes in the Schedule Chart to indicate relationships among tasks... You can draw these lines from one task to another task that depends on it--in other words, only from left to right." (pp. 3-11, 3-12) "A task duration is the amount of working time that you expect a task to take." (p. 3-16) "The critical path is the sequence of tasks that takes the longest amount of time to complete, and thus needs to be completed within its planned time or will delay the whole schedule." (p. 3-20) "Tasks that are not on a critical path ordinarily have some extra time called slack time. This is the amount of time the task can slip without affecting the rest of the schedule." (p. 3-20)
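The definitions quoted above (dependencies, durations, critical path, slack) translate directly into a forward and a backward pass over the chart. The sketch below is one minimal way to compute them in Python; the task names and durations are hypothetical, and the function `schedule` is illustrative and unrelated to MacProject.

```python
from graphlib import TopologicalSorter


def schedule(tasks):
    """tasks maps name -> (duration, [predecessor names]).

    Returns (project length, zero-slack tasks in dependency order, slack per task).
    """
    graph = {name: preds for name, (_, preds) in tasks.items()}
    order = list(TopologicalSorter(graph).static_order())

    # Forward pass: earliest finish of each task.
    finish = {}
    for name in order:
        duration, preds = tasks[name]
        finish[name] = duration + max((finish[p] for p in preds), default=0)
    length = max(finish.values())

    # Backward pass: latest finish that does not delay the project.
    latest = {name: length for name in tasks}
    for name in reversed(order):
        duration, preds = tasks[name]
        for p in preds:
            latest[p] = min(latest[p], latest[name] - duration)

    slack = {name: latest[name] - finish[name] for name in tasks}
    critical = [name for name in order if slack[name] == 0]
    return length, critical, slack


# HYPOTHETICAL project with integer durations (e.g., days).
tasks = {
    "design": (5, []),
    "build": (10, ["design"]),
    "write manual": (4, ["design"]),
    "test": (3, ["build", "write manual"]),
}

length, critical, slack = schedule(tasks)
print(length, critical, slack)
```

In this example "write manual" runs in parallel with "build" and has slack, so it could slip without delaying the project; the zero-slack tasks form the critical path. If a chart has several critical paths, the zero-slack list contains the tasks of all of them.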

CPM-GOMS Schedule Charts: A Notation for Representing Parallelism.
Example illustrating part of a CPM-GOMS Model. Note that the cognitive operators are highlighted to illustrate the sequential nature of operators within a category. The critical path is in bold.

But are composites of simpler, basic-level activities
Microstrategies Unit-tasks We will illustrate these in the context of moving and clicking a mouse

Moving: a CPM-GOMS model of one of the two basic activities that can be performed with a mouse. The operators and their dependencies are identical in A and B with one exception. The move-cursor operator in A has a duration of 545 msec, whereas that used in B has a duration of 301 msec. This difference has a dramatic effect on the critical path (indicated by bold lines connecting shadowed boxes) as well as on the prediction of the total time needed to execute this activity. Instructors note: MacProject demo here of critical path changes.
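The way a single duration can redirect the critical path can be sketched with a toy network. Only the 545 msec and 301 msec move-cursor durations come from the text; every other operator name, duration, and dependency below is hypothetical and does not reproduce the model in the figure.

```python
from functools import lru_cache


def critical_path(tasks, end):
    """Longest path to `end`; tasks maps name -> (duration_ms, [predecessors])."""

    @lru_cache(maxsize=None)
    def best(name):
        duration, preds = tasks[name]
        if not preds:
            return duration, (name,)
        time, path = max(best(p) for p in preds)
        return time + duration, path + (name,)

    return best(end)


def network(move_cursor_ms):
    # HYPOTHETICAL operators: a motor branch (move-cursor) runs in parallel
    # with a perceptual/cognitive branch (attend, perceive, verify).
    return {
        "start": (50, []),
        "move-cursor": (move_cursor_ms, ["start"]),
        "attend-target": (50, ["start"]),
        "perceive-target": (290, ["attend-target"]),
        "verify-target": (50, ["perceive-target"]),
        "click": (50, ["move-cursor", "verify-target"]),
    }


time_a, path_a = critical_path(network(545), "click")
time_b, path_b = critical_path(network(301), "click")
print(time_a, path_a)
print(time_b, path_b)
```

With the slower movement the motor branch determines the total; with the faster movement the perceptual/cognitive branch takes over, so a further speedup of the cursor would not shorten the predicted time in this toy network.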

MOVE-CLICK: Alternative microstrategies for when to mouse down
The different microstrategies within a family may be represented by different operators, different sequences of the same operators, and by different dependencies between operators. The critical path is indicated by bold lines connecting shadowed boxes. Different members of the same microstrategy family often reflect upper limits on human performance as constrained by the design of the interface.

CLICK-MOVE: Alternative microstrategies for when to move the cursor
Each microstrategy starts after the end of the mouse down operator and ends before the start of the move cursor operator. The different microstrategies within a family may be represented by different operators, different sequences of the same operators, and by different dependencies between operators. The critical path is indicated by bold lines connecting shadowed boxes.

The Button Study

From HOME Button to TARGET Button
The location of TARGET is not known until it is perceived and verified. The CPM-GOMS prediction for time from mouse down on HOME to mouse down on TARGET is 970 msec. The critical path is indicated by bold lines connecting shadowed boxes.

From TARGET button to HOME button
HOME location is known prior to its being perceived and verified. The CPM-GOMS prediction for time from mouse down on TARGET to mouse down on HOME is 820 msec. The critical path is indicated by bold lines connecting shadowed boxes.

Experimental Data
Data and examples are from: Gray, W. D., & Boehm-Davis, D. A. (in press). Milliseconds Matter: An introduction to microstrategies and to their use in describing and predicting interactive behavior. Journal of Experimental Psychology: Applied. Advanced copy available from:

Telephone Operator Workstation: The Real-World Problem
A telephone company wanted to replace old telephone operator workstations with a new workstation.
For this company, each second saved per call translates into a savings of $3 million/year in operating costs.
Will the new workstation be faster than the current workstation, and if so, how much will it save each year?
Notes: This work is summarized in: Gray, W. D., John, B. E., & Atwood, M. E. (1993). Project Ernestine: Validating a GOMS analysis for predicting and explaining real-world performance. Human Computer Interaction, 8 (3),
In addition to the CPM-GOMS model, this paper reports the results of a four-month field trial that compared the performance of the current and proposed workstations, and how the GOMS predictions compared to the field data.
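The business question reduces to one multiplication once a model predicts the per-call time of each workstation. The sketch below uses the $3 million per second per year figure from the slide; the call times are hypothetical placeholders, not predictions or results from the study.

```python
SAVINGS_PER_SECOND = 3_000_000  # dollars per year per second saved per call (from the slide)


def annual_savings(current_call_sec, proposed_call_sec):
    """Positive when the proposed workstation is faster, negative when it is slower."""
    return (current_call_sec - proposed_call_sec) * SAVINGS_PER_SECOND


# HYPOTHETICAL average call times in seconds, for illustration only.
faster = annual_savings(20.0, 19.5)
slower = annual_savings(20.0, 20.5)

print(f"{faster:,.0f} dollars/year")
print(f"{slower:,.0f} dollars/year")
```

At this scale a difference of half a second per call is worth more than a million dollars a year in either direction, which is why millisecond-level modeling is worth the effort here.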

Telephone Operator Workstation: The Task
Assist a customer in completing calls
Record the correct billing information
NOT directory assistance
Notes: This type of telephone operator handles many different tasks, including person-to-person calls, collect calls, calling-card calls and calls billed to a third number (NOT directory assistance). To complete these tasks, the telephone operators must converse with the customer, enter information into the workstation, read information from the workstation screen, and sometimes write notes to themselves. These activities often occur in parallel.

Telephone Operator Workstation: An Example Call
Workstation: “Beep”
TAO: “New England Telephone may I help you?”
Customer: “Operator, bill this to ”
TAO: Hits a series of function & numeric keys
Workstation: Displays the authorization for the calling card
TAO: “Thank you.”
TAO: Hits a key to release the workstation so another call can come in

Telephone Operator Workstation: Goals
Overall Goal: Handle the call
Subgoals explicitly trained:
Determine and enter who pays for the call
Determine and enter the appropriate billing rate
Determine when the call is complete and release the workstation
Notes: Examples of the various subgoals:
Who pays? calling party? called party? third-party? calling card?
What rate? direct-dial rate? person-to-person rate? coin phone rate?
Call complete? station-to-station call and calling card verified -- release call; collect call and called party accepts charges -- release call; person-to-person and conversation has begun -- release call.

Telephone Operator Workstation: Operators
Criteria for selecting a level of analysis:
Captures observed behavior
Captures distinctions between workstations
Different levels of analysis:
Unit Task Level
Functional Level
Cognitive task analysis Level
Embodied cognition Level
Notes: These levels are modified from those discussed in Gray, John, & Atwood (1993). The unit-task and functional levels are directly analogous to the models of the same name in Card, et al., 1983. The cognitive task analysis level is a finer granularity (somewhere between Card, et al.’s argument and keystroke levels) that distinguishes between different types of activities that models of text-editing did not require. The embodied cognition level uses the critical path method and uses operators at the level of elementary cognitive, perceptual, and motor acts (based on the Model Human Processor, Card, Moran & Newell, 1983).

Telephone Operator Workstation Operators: The cognitive task analysis level
Operators reflect the activities the TAO must perform
• listen-to-beep
• read-screen
• greet-customer
• listen-to-customer
• enter-command
• enter-credit-card-number
• thank-customer

Telephone Operator Workstation: Cognitive Task Analysis
goal: handle-calls
. goal: handle-call
. . goal: initiate-call
. . . goal: receive-information
. . . . listen-to-beep                Workstation: beep
. . . . read-screen                   Workstation: displays information
. . . goal: request-information
. . . . greet-customer                TAO: "New England Telephone may I help you?"
. . goal: enter-who-pays
. . . listen-to-customer              Customer: “Operator, bill this to ”
. . . goal: enter-information
. . . . enter-command                 TAO: hit F1 key
. . . . enter-calling-card-number     TAO: hit 14 numeric keys
. . goal: enter-billing-rate
. . . read-screen
. . . and so on

Operators from the functional level become subgoals at the cognitive task analysis level, composed of finer-grained operators. At the cognitive task analysis level, the functional goal RECEIVE-INFORMATION is accomplished with one or more activity operators specific to the type of information being received (LISTEN-FOR-BEEP, READ-INFO-FROM-SCREEN, LISTEN-TO-CUSTOMER). At this level, differences between call categories would be seen as differences in the patterns of activities for different goals and subgoals. The cognitive task analysis level is not appropriate for predicting time differences, either between workstations or between call categories. Also, the sequential nature of GOMS becomes apparent at the cognitive task analysis level.

Telephone Operator Workstation: Cognitive Task Analysis
Problems with Assumption of Sequential Operators

Set the durations of the observable operators (LISTEN-TO-BEEP, GREET-CUSTOMER, LISTEN-TO-CUSTOMER, ENTER-COMMAND, ENTER-CREDIT-CARD-NUMBER, and THANK-CUSTOMER) and System Response Time to those observed in the sample call. Estimate the duration of READ-SCREEN from previous work on reading short words from a CRT screen (John & Newell, 1989). Sum of these operators and system response times = seconds. Observed call = 13 seconds. The reason for the mismatch is obvious from a timeline of actual events recorded in the videotape and shown above: these activities are happening in parallel. Such concurrent activities are not captured by sequential GOMS.
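The sequential assumption can be sketched in a few lines of Python. This is only an illustration: the goal tree follows the hierarchy on the previous slide, but every duration is a made-up placeholder (the values observed in the sample call are not reproduced here), and the names `flatten` and `sequential_time` are hypothetical helpers. The point is that a sequential GOMS model predicts the plain sum of operator durations, with no allowance for overlap.

```python
# A goal is (name, [children]); an operator is (name, duration_in_seconds).
# All durations are hypothetical placeholders, not the observed values.
handle_call = ("handle-call", [
    ("initiate-call", [
        ("receive-information", [("listen-to-beep", 0.5), ("read-screen", 0.5)]),
        ("request-information", [("greet-customer", 1.5)]),
    ]),
    ("enter-who-pays", [
        ("listen-to-customer", 3.0),
        ("enter-information", [
            ("enter-command", 0.5),
            ("enter-calling-card-number", 4.0),
        ]),
    ]),
])


def flatten(node):
    """Return the operators under a goal, in execution order."""
    name, body = node
    if isinstance(body, list):          # a goal: recurse into its children
        ops = []
        for child in body:
            ops.extend(flatten(child))
        return ops
    return [(name, body)]               # an operator: a leaf with a duration


def sequential_time(node):
    """Sequential GOMS prediction: operators run strictly one after another."""
    return sum(duration for _, duration in flatten(node))


print(sequential_time(handle_call))
```

Because nothing in this representation lets two operators overlap, any task in which talking, keying, and reading proceed in parallel will be overpredicted.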

Telephone Operator Workstation: Embodied Cognition Level
Operators at the level of elementary operations of
• Cognition
• Perception
• Motor acts
Uses the Critical Path Method

Telephone Operator Workstation: CPM-GOMS Level
Serial within operator type, parallel between types.

Each MHP-level operator is represented as a box with a name centered inside it and an associated duration above the top right corner (in msec). Lines connecting the boxes represent information-flow dependencies; that is, when a line joins two operators, the operator to the left produces information required by the operator to the right. For visual clarity, we place operators of the same category along a horizontal line. The critical path is shown in bold and dictates the total time to perform the task.

The goal hierarchy of the classic GOMS analysis is not explicitly represented in the schedule chart, but is implicit in the operators that are represented. For example, an activity-level operator READ-SCREEN(1) represents the TAO reading the screen to find information about the billing rate. This becomes an activity-level goal in the CPM-GOMS model, READ-SCREEN(1), and is accomplished by five CPM-level operators explicitly represented (second shaded group):
• attend-info(1), a cognitive operator that decides to look for this information;
• initiate-eye-movement(1), a cognitive operator that initiates an eye-movement to where that information will appear on the screen;
• eye-movement(1), a motor operator that positions the eyes;
• perceive-complex-info(1), a visual perception operator that takes in and comprehends the information when it is displayed; and
• verify-info(1), another cognitive operator that confirms that the information is as expected (that is, not something unexpected like a workstation failure producing an error message).
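The critical path idea can be sketched as a small Python example over the READ-SCREEN(1) operators. This is an illustrative sketch, not the model from the workshop: the durations are assumed placeholder values in msec, the `system-displays-info` operator is a hypothetical addition standing for system response time, and the rule "serial within operator type" is encoded here simply as explicit dependencies between the cognitive operators.

```python
# name -> (category, duration_ms, predecessors)
# Durations are assumed for illustration; "system-displays-info" is hypothetical.
operators = {
    "attend-info":           ("cognitive",  50, []),
    "initiate-eye-movement": ("cognitive",  50, ["attend-info"]),
    "eye-movement":          ("motor",      30, ["initiate-eye-movement"]),
    "system-displays-info":  ("system",    400, []),
    "perceive-complex-info": ("perception", 290,
                              ["eye-movement", "system-displays-info"]),
    "verify-info":           ("cognitive",  50, ["perceive-complex-info"]),
}


def critical_path(ops):
    """Return (total_ms, [operator names on the critical path])."""
    finish = {}

    def earliest_finish(name):
        # An operator can start only when all its predecessors have finished.
        if name not in finish:
            _, duration, preds = ops[name]
            start = max((earliest_finish(p) for p in preds), default=0)
            finish[name] = start + duration
        return finish[name]

    # The operator that finishes last ends the critical path.
    last = max(ops, key=earliest_finish)
    path = [last]
    # Walk backwards through whichever predecessor finished latest.
    while ops[path[-1]][2]:
        path.append(max(ops[path[-1]][2], key=earliest_finish))
    path.reverse()
    return finish[last], path


total_ms, path = critical_path(operators)
print(total_ms, path)
```

With these assumed numbers the eye movement finishes long before the screen is updated, so it has slack and is not on the critical path; shortening it would not shorten the task.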

Telephone Operator Workstation: Making Quantitative Predictions
Use the CPM-GOMS model of the benchmark call on the current workstation as a baseline
Modify the model to reflect design decisions
• substitute the layout, response times, and procedures of the proposed workstation
• propose new features or designs
• obtain quantitative predictions of the effects on performance time

Making predictions for the proposed workstation:
• change the keystroke procedure as described in the manufacturer's training manual
• change the horizontal movement times between keys as calculated using Fitts' Law and the proposed keyboard layout
• change the system response times as predicted by the manufacturer
• do not change the eye movements because, although the screen layout has changed, the eyes must move as frequently for the proposed workstation as for the current workstation. At this grain size, all eye movements have the same estimated duration. The eye movements are never on the critical path and have enough slack time that small variations in their duration would not affect the length of the call.

For the benchmark task described, the predicted length of this benchmark call on the proposed workstation is 12.7 sec, 6% greater than the predicted length of the call on the current workstation.
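As a rough illustration of how movement times between keys might be recomputed for a new keyboard layout, here is a minimal Python sketch of Fitts' Law. The slides do not say which formulation or coefficient was used, so both are assumptions here: the sketch uses Welford's variant, T = I_M * log2(D/S + 0.5), with an assumed slope of 100 msec/bit. The function name `fitts_time_ms` is hypothetical.

```python
import math


def fitts_time_ms(distance, target_size, slope_ms_per_bit=100.0):
    """Movement time (msec) to a target of width `target_size` at `distance`.

    Welford's variant of Fitts' Law; the slope is an assumed value.
    `distance` and `target_size` must be in the same units.
    """
    index_of_difficulty = math.log2(distance / target_size + 0.5)
    # Very short moves would give a negative value; treat them as zero time.
    return max(0.0, slope_ms_per_bit * index_of_difficulty)


# Moving 7.5 key-widths to a key one key-width wide: log2(8) = 3 bits.
print(fitts_time_ms(7.5, 1.0))
```

In a CPM-GOMS model only the horizontal-movement operators would take these new durations; whether the change matters for the call depends on whether those operators lie on the critical path.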

Telephone Operator Workstation: Quantitative Predictions
Difference between the proposed workstation and the current workstation for 15 benchmark calls

This graph shows T(Proposed Workstation) - T(Current Workstation), so a positive value indicates that the proposed workstation is predicted to produce a longer time for the benchmark call. These predictions for the 15 benchmark calls were validated against real-world data (Gray, John, & Atwood, 1993). The GOMS models predicted that the proposed workstation would be 3% slower than the current workstation, averaging across the call types and weighting for frequency. The field data showed that the proposed workstation was indeed 3% slower than the current workstation.
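The frequency-weighted comparison can be sketched as follows. The three call categories, their frequencies, and their times are invented for illustration and are not the Project Ernestine data; the exact weighting procedure used in the study is also an assumption here (each category's predicted time is weighted by its share of calls).

```python
# (share_of_calls, current_sec, proposed_sec) per call category.
# All values are hypothetical placeholders.
benchmark_calls = [
    (0.5, 20.0, 21.0),
    (0.3, 30.0, 30.0),
    (0.2, 40.0, 42.0),
]


def weighted_percent_difference(calls):
    """Percent by which the proposed system is slower (+) or faster (-)."""
    total_share = sum(share for share, _, _ in calls)
    current = sum(share * cur for share, cur, _ in calls) / total_share
    proposed = sum(share * prop for share, _, prop in calls) / total_share
    return 100.0 * (proposed - current) / current


print(round(weighted_percent_difference(benchmark_calls), 2))
```

A positive result corresponds to a positive T(Proposed) - T(Current), that is, a slower proposed workstation.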

Telephone Operator Workstation: Qualitative Explanations
Examine the critical path of the different models to explain the quantitative predictions. E.g., why this call is longer on the proposed than on the current workstation.

Beginning of call
Figure 1. Section of CPM-GOMS analysis from near the beginning of the call. Notice that the proposed workstation (bottom) has removed two keystrokes (which had required 7 motor and 3 cognitive operators) from this part of the call. However, none of the ten operators removed were along the critical path (shown in bold).

The models go beyond the data. Remember I told you that when the data started to come in, NYNEX and the manufacturers were ready to hang the trial organizers. The data by themselves did not explain why the proposed was slower than the current. The models did!!! Here is some of it.

Qualitative Explanations
Models as explanation.

End of call
Figure 2. Section of CPM-GOMS analysis from the end of the call. Notice that the proposed workstation (bottom) has added one keystroke to this part of the call, which results in four operators (3 motor and 1 cognitive) being added to the critical path (shown in bold).

Uses of CPM-GOMS in Design
Project Ernestine emphasized
• Quantitative comparisons between alternative systems
• Qualitative explanations for differences between alternative systems

Quantitative comparisons between alternative systems:
• current vs. proposed telephone operator workstations
Qualitative explanations for differences between alternative systems:
• the differences caused by the different keying procedures in the telephone operator workstations
• other such explanations can be found in Gray, John, & Atwood (1993)

Uses of CPM-GOMS in Design
CPM-GOMS can also be used for
• Directing design effort
• Bracketing, Profiling, and Diagnosis

Directing design effort: assessing the impact of design features and of changes in design.
Bracketing: determining the fastest-reasonable and slowest microstrategies that can be used in a situation.

Directing Design Effort
Designing telephone operator workstations: Should we redesign keyboards? screens? system response times?

Directing Design Effort Keyboards & Screens
How much of the time is each activity on the critical path?

Call Cat. Sys RT Talking Keying Reading Ring/Coin
cc01 25% 40% 1% 3% 31%
cc02 93% 0% 4%
cc03 20% 71% 2% 6%
cc04 35%
cc05 19% 44% 22%
cc06 12% 79%
cc07 30% 57% 8%
cc08 26% 41% 23% 10%
...
Average 16% 64% 5%

As a quick upper bound, a redesign of the keyboard could have, at most, a 6% impact on the performance time of these 15 benchmark calls. (Actually, any redesign would have less than a 6% impact because, as the keystroke time goes to zero, other activities will move onto the critical path and determine the length of the call.) Redesign of the screen would have, at most, a 5% impact. System response time seems to be the more promising candidate for redesign effort (see following slides).

Or, a radical suggestion: put the money that would go into a redesign into an advertising campaign to educate the customer to give the necessary information without talking so much! (Less radical suggestion: get the telephone operators to train the customers during the slack time. For instance, if a customer making a collect call does not give his or her name immediately, have the telephone operator say, "Next time, if you give your name when asking to reverse the charges, it will speed your call." This could be said during the slack time while waiting for the phone to ring and the called party to answer.)

Directing Design Effort
• Bracketing
• Profiling
• Diagnosis

Bracketing. Taking into account only those constraints imposed by the interface, could we define two sets of models that defined the lower and upper bounds on performance and, therefore, bracket individual performance?

Profiling. If we could bracket performance, could we also characterize systematic deviations from the FASTEST REASONABLE model as due solely to the use of a slower microstrategy? To address this question, we constructed BEST FIT models to identify which microstrategies people were using.

Diagnosis. Places where the microstrategy used in the BEST FIT model differs from that used in the FASTEST REASONABLE model indicate less than optimal performance. The BEST FIT models identify what people are doing instead of the optimal and may lead to insights regarding why they are doing it.

A screen print of the target display
A screen print of the target display. The labels for three buttons (NORM, EXP2, and NAVDESG) have been enhanced to make it easier for the reader to follow the procedures that we discuss in class.

Bracketing: Slow & Fastest Reasonable

Bracketing               SLOWEST    FASTEST-REASONABLE
P1  RMSD                 99         165
    mean % difference    11.2%      18.6%
P2  RMSD                 20.4%      7.5%
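For readers unfamiliar with the two fit measures in this table, the following Python sketch shows one way to compute them. The slides do not define "mean % difference", so the definition used here (mean absolute difference between predicted and observed times, as a percentage of the observed time) is an assumption, and the sample numbers are invented.

```python
import math


def rmsd(predicted, observed):
    """Root mean squared deviation between predicted and observed times."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def mean_percent_difference(predicted, observed):
    """Mean absolute difference as a percentage of the observed value.

    Assumed definition; the slides do not spell one out.
    """
    ratios = [abs(p - o) / o for p, o in zip(predicted, observed)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical times (msec) for two parts of a task.
model_times = [110.0, 190.0]
observed_times = [100.0, 200.0]
print(rmsd(model_times, observed_times))
print(mean_percent_difference(model_times, observed_times))
```

A model that brackets performance from the slow side and one that brackets it from the fast side can both show large deviations on these measures; the BEST FIT model is expected to show smaller ones.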

Profiling: BEST FIT models
Profiling consists of building a BEST FIT model. For each part of the task where performance does not match the FASTEST REASONABLE model, faster microstrategies are swapped for successively slower microstrategies until either a good fit is obtained or the slowest microstrategy of a family has been used. This produces a BEST FIT model.

When a model mismatches performance, a common way to increase its fit is to change one or more of its parameters. However, the microstrategy approach constrains the set of possible model changes in several important ways. In building a BEST FIT model:
• The parameters must be constant: the SLOW, BEST FIT, and FASTEST REASONABLE models must all use the same parameter set;
• Corresponding parts of the SLOW, BEST FIT, and FASTEST REASONABLE models must use members of the same microstrategy family; and
• Improving the fit of the BEST FIT model can only be accomplished by swapping microstrategies in and out from the same family.

Hence, the BEST FIT models are post hoc, but not ad hoc. We argue that, as with the mouse-button example, the set of microstrategies for any given interactive technology is small, constrained, and determinable.
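The swapping procedure described above can be sketched in Python. This is a simplified illustration under stated assumptions: each microstrategy family is reduced to a list of (name, predicted time) pairs ordered from fastest to slowest, "good fit" is taken to mean that the prediction falls within a fixed tolerance of the observed time, and all names and numbers (`best_fit`, the family members, the 50 msec tolerance) are hypothetical.

```python
def best_fit(families, observed_ms, tolerance_ms=50.0):
    """Pick one microstrategy per task part, starting from the fastest.

    families: part -> [(microstrategy_name, predicted_ms), ...], fastest first.
    Only members of the same family are ever swapped in, and no model
    parameters are changed.
    """
    chosen = {}
    for part, strategies in families.items():
        for name, predicted_ms in strategies:
            chosen[part] = name
            if abs(predicted_ms - observed_ms[part]) <= tolerance_ms:
                break  # good fit: stop swapping in slower microstrategies
        # If no member fits, the loop ends on the slowest member of the family.
    return chosen


# Hypothetical families and observations.
families = {
    "part1": [("fast", 300.0), ("medium", 450.0), ("slow", 600.0)],
    "part2": [("fast", 200.0), ("slow", 350.0)],
}
observed = {"part1": 460.0, "part2": 900.0}
print(best_fit(families, observed))
```

Parts where the chosen microstrategy is not the fastest member of its family are the places the Diagnosis step would examine.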

Profiling Fit of the SLOWEST, FASTEST-REASONABLE, and BEST-FIT Models to the Empirical Data for P1 and P2

Diagnosis: example CLICK-MOVE Used in part 2 and part 3
Use of slower microstrategies may reflect a confusion among the three unit tasks
• Not perceptually distinct
• Participants might have had to rely more on memory than is usual for interactive behavior with current interactive technology

Diagnosis: Places where the microstrategy used in the BEST FIT model differs from that used in the FASTEST REASONABLE model indicate less than optimal performance. The BEST FIT models identify what people are doing instead of the optimal and may lead to insights regarding why they are doing it.

Summary of CPM-GOMS
• Predicts expert performance time and provides qualitative explanations for tasks involving parallel activities
• Once the models are built, schedule charts allow designers to rapidly play what-if games with design ideas and parameters
• Not intended to predict learning time or the occurrence of errors; other forms of GOMS models are more helpful there
• Subject to the limitations of GOMS models in general (no casual use, fatigue, etc.)

Summary of Workshop http://hfac.gmu.edu/~graypubs
Intro to the GOMS family of models Keystroke Level GOMS NGOMSL GOMS CPM-GOMS

References:
Barnes, R. M. (1963). Motion and time study: Design and measurement of work (5th ed.). New York: John Wiley & Sons, Inc.
Bovair, S., Kieras, D. E., & Polson, P. G. (1990). The acquisition and performance of text editing skill: A cognitive complexity analysis. Human-Computer Interaction, 5, 1-48.
Card, S. K., Moran, T. P., & Newell, A. (1980a). The keystroke-level model for user performance time with interactive systems. Communications of the ACM, 23(7).
Card, S. K., Moran, T. P., & Newell, A. (1980b). Computer text-editing: An information-processing analysis of a routine cognitive skill. Cognitive Psychology, 12.
Card, S. K., Moran, T. P., & Newell, A. (1983). The psychology of human-computer interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.
Card, S. K., Pirolli, P., & Mackinlay, J. D. (1994). The Cost-of-Knowledge characteristic function: Display evaluation for direct-walk dynamic information visualizations. In Proceedings of CHI, 1994 (Boston, Massachusetts, 1994), ACM, New York.
Claris Corporation. (1989). MacProject II: Getting started.
Elkerton, J., & Palmiter, S. L. (1991). Designing help using a GOMS model: An information retrieval evaluation. Human Factors, 33.
Gong, R., & Elkerton, J. (1990). Designing minimal documentation using a GOMS model: A usability evaluation of an engineering approach. In Proceedings of CHI, 1990 (Seattle, Washington, April 30-May 4, 1990), ACM, New York.
Gray, W. D. (2000). The nature and processing of errors in interactive behavior. Cognitive Science, 24(2).
Gray, W. D., & Boehm-Davis, D. A. (in press). Milliseconds Matter: An introduction to microstrategies and to their use in describing and predicting interactive behavior. Journal of Experimental Psychology: Applied.
Gray, W. D., John, B. E., & Atwood, M. E. (1993). Project Ernestine: Validating a GOMS analysis for predicting and explaining real-world performance. Human-Computer Interaction, 8(3).

John, B. E. (1990). Extensions of GOMS analyses to expert performance requiring perception of dynamic visual and auditory information. In J. C. Chew & J. Whiteside (Eds.), ACM CHI'90 Conference on Human Factors in Computing Systems (pp ). New York: ACM Press.
John, B. E., & Kieras, D. (1997a). The GOMS family of user interface analysis techniques: Comparison and contrast. ACM Transactions on Computer-Human Interaction.
John, B. E., & Kieras, D. (1997b). Using GOMS for user interface design and evaluation: Which technique? ACM Transactions on Computer-Human Interaction.
John, B. E., & Newell, A. (1989). Cumulating the science of HCI: From S-R compatibility to transcription typing. In Proceedings of CHI, 1989 (Austin, Texas, April 30-May 4, 1989), ACM, New York.
John, B. E., & Vera, A. H. (1992). A GOMS analysis for a graphic, machine-paced, highly interactive task. In Proceedings of CHI (Monterey, May 3-7, 1992), ACM, New York.
John, B. E., Vera, A. H., & Newell, A. (1994). Toward real-time GOMS: A model of expert behavior in a highly interactive task. Behavior and Information Technology, 13(4).
Kieras, D. (1997). A guide to GOMS model usability evaluation using NGOMSL. In M. Helander, T. K. Landauer, & P. Prabhu (Eds.), Handbook of Human-Computer Interaction. New York: Elsevier.
Kieras, D. (1993). Using the keystroke-level model to estimate execution times. University of Michigan.
Kieras, D. E., & Polson, P. G. (1985). An approach to the formal analysis of user complexity. International Journal of Man-Machine Studies, 22.
Lee, A. Y., Polson, P. G., & Bailey, W. A. (1989). Learning and transfer of measurement tasks. In Proceedings of CHI, 1989 (Austin, Texas, April 30-May 4, 1989), ACM, New York.

Lerch, F. J., Mantei, M. M., & Olson, J. R. (1989). Translating ideas into action: Cognitive analysis of errors in spreadsheet formulas. In Proceedings of CHI, 1989 (Austin, Texas, April 30-May 4, 1989), ACM, New York.
Lohse, G. L. (1993). A cognitive model for understanding graphical perception. Human-Computer Interaction, 8(4).
Newell, A., & Card, S. K. (1985). The prospects for psychological science in Human-Computer Interaction. Human-Computer Interaction, 1(3).
Newell, A., & Card, S. K. (1986). Straightening out softening up: Response to Carroll and Campbell. Human-Computer Interaction, 2(3).
Olson, J. R., & Olson, G. M. (1990). The growth of cognitive modeling in Human-Computer Interaction since GOMS. Human-Computer Interaction, 5(2&3).
Peck, V. A., & John, B. E. (1992). Browser-Soar: A cognitive model of a highly interactive task. In Proceedings of CHI, 1992 (Monterey, California, May 3-May 7, 1992), ACM, New York.
Polson, P. G., & Kieras, D. E. (1985). A quantitative model of learning and performance of text editing knowledge. In Proceedings of CHI, 1985 (San Francisco, California, April 14-April 18, 1985), ACM, New York.
Polson, P. G., Muncher, E., & Engelbeck, G. (1986). A test of the common elements theory of transfer. In Proceedings of CHI, 1986, ACM, New York.
