1
Neuropsychological Assessment Battery (NAB): Introduction and Overview
Travis White, PhD PAR, Inc.
2
Acknowledgment Development of the NAB was made possible and funded in part by the following grants from the National Institute of Mental Health: 1 R43 MH 2 R44 MH
3
Introduction The NAB is a comprehensive, modular battery of 33 new neuropsychological tests, each with an equivalent form, developed to examine a wide array of cognitive skills and functions in adults, age 18 and older. Decisions pertaining to content and format were guided by results of a national survey of neuropsychological assessment needs and practices (Stern & White, 2000), and by guidance from members of the NAB Advisory Council and other consultants.
4
Rationale for the NAB Arthur Benton (1992): The field of neuropsychology has lacked an integrated battery of instruments capable of providing highly sophisticated test data while requiring only a relatively brief administration time. Oscar Parsons (1993): To meet current needs, such a battery should (a) have good psychometric characteristics, (b) include extensive normative and standardization data, (c) provide clinical information that satisfies a broad range of modern referral sources and questions, and (d) facilitate systematic research.
5
Goal of Development The goal underlying the development of the NAB was to address these needs by producing a new and innovative neuropsychological test battery that provides a comprehensive evaluation of neuropsychological functions in less than 4 hours. The NAB incorporates the conceptual framework of Bauer (1994) and Tarter and Edwards (1986) by offering a separate Screening Module to indicate the need to administer additional domain-specific Modules.
6
Screening Module → Main Modules
Screening Attention Domain Score → Attention Module
Screening Language Domain Score → Language Module
Screening Memory Domain Score → Memory Module
Screening Spatial Domain Score → Spatial Module
Screening Executive Functions Domain Score → Executive Functions Module
7
Flexibility For those areas of functioning not included in the NAB (e.g., motor functioning, effort, mood/personality), the examiner can expand upon the NAB assessment with his or her favored instruments. The individual examiner may choose to forego the Screening Module and administer any or all of the five domain-specific modules to a patient, based on specific clinical needs. In addition, the flexibility inherent in the NAB also allows for selection of individual tests from each Module – rather than administering an entire domain-specific Module – when this type of non-battery focused assessment is clinically warranted.
8
Survey In order to ascertain the needs of the potential users of a new neuropsychological test battery, PAR conducted a comprehensive national Survey of Neuropsychological Assessment Needs (Stern & White, 2000). The results served as a basis for the development of the NAB, vis-à-vis areas of functioning to include, length of battery, and other salient content and format characteristics of the battery.
9
Survey An important finding was the discrepancy between:
(a) the amount of time respondents thought was ideally needed for a comprehensive neuropsychological evaluation given current instrumentation (Mode = 5 to 6 hours; 25% stated 4 hours or less), and
(b) the amount of time they thought was required to conduct a realistic and reimbursable neuropsychological evaluation in today's health care climate (Mode = 3 to 4 hours; 49% stated 4 hours or less).
89% of respondents stated that there was no commercially available instrument that provided a comprehensive evaluation within the current time/funding constraints.
10
Innovative Features of the NAB
Screening for both severely impaired and fully intact performance
Comprehensive coverage of functional domains
Combined strengths of flexible and fixed battery approaches to assessment
Avoidance of floor and ceiling effects
Reduced administration time
Coordinated norming (entire NAB normed on a single standardization group)
Demographically corrected norms based on age, education level, and sex
Provision of equivalent/alternate forms
Increased user-friendliness for both examiner and examinee
Focus on ecological validity
11
Dual-Screening Capability
Screening capability was rated as moderate-to-very important by 75% of the survey respondents. In practice, neuropsychological screening is typically geared toward identifying patients who show no signs of brain dysfunction and no need for extensive follow-up testing. This approach has been formally incorporated into two popular assessment instruments, the Dementia Rating Scale–Second Edition (DRS-2; Jurica, Leitten, & Mattis, 2001) and Cognistat (also known as the Neurobehavioral Cognitive Status Examination [NCSE]; Kiernan, Mueller, & Langston, 1987). The NAB incorporates this screening approach for each functional domain assessed, but also extends the capability to screen out patients who are too impaired to benefit from additional testing.
12
Dual-Screening Capability
The NAB Screening Module provides screening recommendations at both ends of the ability spectrum. For each NAB Screening Domain score, two recommendations are offered: (1) administer the related main module or (2) do not administer the related main module. A recommendation to forego the main module is made if:
the patient is fully intact (i.e., lacks impairment) and thus does not require administration of the analogous NAB main module, because he or she would obtain similarly intact/above-average scores, or
the patient is moderately-to-severely impaired and thus would not require administration of the analogous NAB main module, because he or she would likely obtain similarly impaired scores.
If the referral question requires greater quantification and description of the patient's functioning, the user can always disregard the screening recommendation and administer the entire battery or the selected main module(s).
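To make the dual-screening logic concrete, here is a minimal sketch of how a screening domain score might map onto one of the two recommendations described above. The cutoff values and function name are hypothetical placeholders for illustration only; they are not the NAB's actual decision rules.

```python
# Hypothetical sketch of dual-screening recommendation logic.
# The cutoffs are illustrative placeholders, not the NAB's actual rules.

INTACT_CUTOFF = 115    # assumed: at or above this, performance is treated as fully intact
IMPAIRED_CUTOFF = 70   # assumed: at or below this, performance is treated as moderately-to-severely impaired

def screening_recommendation(screening_domain_score: float) -> str:
    """Return a recommendation for the analogous main module."""
    if screening_domain_score >= INTACT_CUTOFF:
        return "do not administer main module (performance fully intact)"
    if screening_domain_score <= IMPAIRED_CUTOFF:
        return "do not administer main module (performance likely similarly impaired)"
    return "administer related main module"

# Example: a Screening Memory Domain score in the middle of the range
print(screening_recommendation(92))  # -> administer related main module
```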
13
Comprehensive Coverage of Functional Domains
Reviews of the neuropsychological literature (e.g., Lezak, 1995; Mapou & Spector, 1995; Spreen & Strauss, 1998) have identified seven major functional domains:
Language and verbal communicative functions
Spatial/perceptual skills
Sensorimotor functions
Attention and related information processing tasks (including working memory)
Learning and memory
Executive functions and problem-solving abilities
Personality, emotional, and adaptive functions
This conceptual framework has been confirmed by factor analytic studies of various neuropsychological batteries (Larrabee & Curtiss, 1992; Leonberger, Nicks, Larrabee, & Goldfader, 1992). Many neuropsychologists also add to their evaluations measures of overall intellectual functioning and, especially in forensic settings, measures of malingering/symptom validity.
14
Comprehensive Coverage of Functional Domains
The NAB was developed with the overriding goal of providing a common set of core tests that serve as a reasonably comprehensive standard reference base suitable for most routine clinical applications. Thus, the NAB is specifically not a "screening battery," nor is it an exhaustive test battery that measures every conceivable neuropsychological skill and function.
15
Comprehensive Coverage of Functional Domains
The survey of neuropsychologists directly guided the final content composition of the NAB into the following six modules: Screening, Attention, Language, Memory, Spatial, and Executive Functions. Within each of the functional domains, results of the survey guided inclusion and exclusion of specific subdomains of assessment. For example, within the area of Language, respondents provided very little support for the inclusion of repetition tasks. Within the area of Memory, although there was a strong preference for the inclusion of multiple-choice recognition for a list-learning task, there was very little support for a multiple-choice recognition paradigm for a story-learning task. Survey respondents also reported a strong preference to continue using existing measures of sensorimotor functions, personality/emotional/adaptive functions, intelligence, and malingering/symptom validity; that is, the preference was not to create additional measures of these functions for a newly developed battery.
16
Combined Strengths of Flexible and Fixed Battery Approaches to Assessment
The flexible and fixed battery approaches to neuropsychological assessment each have strengths and limitations. In developing the NAB, we attempted to include as many strengths as possible, while avoiding as many weaknesses as possible.
17
Combined Strengths of Flexible and Fixed Battery Approaches to Assessment
Therefore, the NAB has the following characteristics:
Constant background of tests
Focused, patient-centered examination
Shorter administration times afforded by the efficient screening/test selection
Minimal reliance on clinical decision making in test selection
Standardized administration and scoring procedures across all tests
Quantitative summary indexes along with numerous measures of qualitative aspects of performance
This overall approach also allows for the accumulation of extensive validation research.
18
Avoidance of Floor and Ceiling Effects
Approximately 90% of survey respondents indicated that it would be moderately or very important for a new comprehensive test battery to be appropriate for high-functioning examinees and, therefore, to avoid ceiling effects. Approximately 73% of survey respondents indicated that a new battery should also be appropriate for severely impaired patients and, therefore, should avoid floor effects.
19
Avoidance of Floor and Ceiling Effects
Thus, a guiding principle in the development of the NAB was the avoidance of both ceiling and floor effects, when appropriate. For most tests in the NAB, a continuum of difficulty levels was included to provide a relatively normal distribution of test performance. Difficulty ratings were provided by the Advisory Council members and used in the initial creation and selection of individual test items. In addition, item difficulty statistics were calculated on field testing and standardization data to ensure the adequacy of distributions.
20
Reduced Administration Time
The NAB provides a reasonably comprehensive evaluation in a much briefer period than is currently available. Approximately 71% of the survey respondents indicated that a realistic and reimbursable neuropsychological evaluation can be completed within 3 to 4 or 4 to 5 hours (excluding record review, interviewing, and report writing). The entire NAB requires less than 4 hours to administer; in the standardization sample, the majority of participants completed it in 2.5 to 3 hours. In most situations, clinicians still have time to administer intelligence and personality tests, as well as to pursue idiographic testing (e.g., motor skills, effort testing) when clinically warranted.
21
NAB Administration Time
Screening Module = 45 min.
Attention Module = 45 min.
Language Module = 35 min.
Memory Module = 45 min.
Spatial Module = 25 min.
Executive Functions Module = 30 min.
Full NAB (5 main modules) = 180 min. (3 hrs.)
Screening Module and Full NAB = 220 min. (3 hrs., 40 min.)
22
Coordinated Norming Whereas much is known about the psychometric properties of individual neuropsychological tests (Franzen, 1989; Lezak, 1995; Mitrushina, Boone, & D’Elia, 1998; Spreen & Strauss, 1998), very little effort has been devoted to the examination of how individual instruments function within a battery (Russell, 1994). Given that 85% of the survey respondents reported using a customized battery, the lack of psychometric data on customized batteries represents a very large gap in the neuropsychological knowledge base and may lead to critical limitations in the overall validity of clinical decisions based on neuropsychological test data (Faust et al., 1991). In fact, this lack of coordinated norming of customized batteries may render forensic examination results based on these tests inadmissible as evidence in court under the Daubert ruling (Ziskin, 1995); 75% of the survey respondents rated "admissible as evidence in forensic cases" as either moderately or very important. In addition, because performance on specific neuropsychological tests can be associated with IQ level (Tremont, Hoffman, Scott, & Adams, 1998), it is important to understand and quantify the relationship between a battery of neuropsychological tests and a measure of overall IQ.
23
Coordinated Norming The NAB fills this critical gap by providing coordinated norms for all of the NAB tests and composite scores, collected on the same standardization sample, along with norms for a recently published measure of intelligence, the Reynolds Intellectual Screening Test (RIST; Reynolds & Kamphaus, 2003). The RIST is an excellent measure of general intelligence (g) and correlates highly with the Full Scale IQ of the Wechsler Adult Intelligence Scale, Third Edition (WAIS-III; Wechsler, 1997a). These coordinated norms allow for within- and between-patient score comparisons across the NAB and between these measures and estimated IQ level. Thus, the examiner can use a single set of normative tables (including the same age, education, and sex corrections) for the entire NAB, rather than dealing with the commonly used mixture of test-specific norms compiled, often uniquely, in each examiner's "norms book."
24
Coordinated Norming An important consideration in interpreting the performance of individual examinees is the magnitude of difference between planned comparisons of scores. The coordinated norming of the NAB allows users to interpret score differences with two types of comparisons:
Statistical significance of score differences
Base rate of score differences
The first aspect is statistical significance, or the probability that the scores are not essentially equal. The second is the frequency of occurrence of the score difference (also referred to as its base rate) in the standardization sample. These two aspects are often expressed as two questions: Is the score difference real and not due to measurement error? What is the incidence rate of this difference in the normal population? It is quite possible to obtain difference scores that are statistically significant but occur relatively frequently in the standardization sample and, by extrapolation, in the overall population.
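The two aspects of a score discrepancy described above can be expressed as a short calculation. The sketch below uses the conventional reliable-difference formula based on the standard errors of measurement of the two scores, plus an empirical base-rate count against a standardization data set; the reliabilities, SEMs, and data are invented for illustration and are not NAB values.

```python
import math

def reliable_difference(score_a, score_b, sem_a, sem_b, critical_z=1.96):
    """Test whether two standard scores differ by more than measurement error.

    The standard error of the difference is sqrt(SEM_a^2 + SEM_b^2); a difference
    larger than critical_z * SE_diff is unlikely to be due to error alone.
    """
    se_diff = math.sqrt(sem_a ** 2 + sem_b ** 2)
    return abs(score_a - score_b) > critical_z * se_diff

def base_rate(observed_diff, standardization_diffs):
    """Proportion of the standardization sample showing a difference at least this large."""
    n_at_least = sum(1 for d in standardization_diffs if abs(d) >= abs(observed_diff))
    return n_at_least / len(standardization_diffs)

# Illustration with made-up values: a 12-point Index difference that is
# statistically reliable yet not uncommon in the normative sample.
diffs_in_norm_sample = [0, 3, -5, 12, -14, 7, 9, -2, 16, -11]  # hypothetical
print(reliable_difference(104, 92, sem_a=3.5, sem_b=4.0))      # True
print(base_rate(12, diffs_in_norm_sample))                     # 0.3
```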
25
Demographically Corrected Norms
The need to interpret neuropsychological tests within the context of an individual's age, educational attainment, and sex has been well established in the field (Heaton, Grant, & Matthews, 1991). Given that over 95% of the survey respondents viewed the availability of demographically corrected norms as moderately (18%) or very important (77%), the norms provided for the NAB represent a unique and critical feature.
26
Demographically Corrected Norms
The NAB demographically corrected norm sample consists of 1,448 individuals. Separate normative tables are provided for all combinations of the following demographic variables:
Age (18-29, 30-39, 40-49, 50-59, 60-64, 65-69, 70-74, 75-79, 80-97 years)
Education (<=11 years, 12 years, 13-15 years, >=16 years)
Sex
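Assuming the age, education, and sex bands listed above, a demographically corrected norm lookup reduces to selecting the table for the examinee's demographic cell. The sketch below shows one way such a lookup might be organized; the band boundaries mirror the groupings above, while the function names and the example are invented for illustration.

```python
# Hypothetical sketch of selecting a demographically corrected norm table.
AGE_BANDS = [(18, 29), (30, 39), (40, 49), (50, 59), (60, 64),
             (65, 69), (70, 74), (75, 79), (80, 97)]

def age_band(age):
    for low, high in AGE_BANDS:
        if low <= age <= high:
            return f"{low}-{high}"
    raise ValueError("age outside normative range (18-97)")

def edu_band(years):
    if years <= 11:
        return "<=11"
    if years == 12:
        return "12"
    if years <= 15:
        return "13-15"
    return ">=16"

def norm_cell(age, education_years, sex):
    """Return the key identifying the normative table for this examinee."""
    return (age_band(age), edu_band(education_years), sex)

print(norm_cell(age=72, education_years=14, sex="F"))  # ('70-74', '13-15', 'F')
```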
27
Provision of Equivalent/Alternate Forms
An important aspect of neuropsychological assessment is the ability to monitor and document changes in functioning over time (Friedes, 1985; Matarazzo, Carmody, & Jacobs, 1980). Survey results indicated that 96% of all respondents viewed the detection of change over time as a moderately (33%) or very important (63%) characteristic of a new comprehensive neuropsychological test battery. Current neuropsychological instruments are poorly equipped to meet this goal because of a lack of equivalent, "repeatable" forms (Lezak, 1995) and a limited understanding of practice effects on neuropsychological testing (Sawrie, Chelune, Naugle, & Lueders, 1996).
28
Provision of Equivalent/Alternate Forms
These needs were addressed in two ways during development of the NAB:
Two parallel, equivalent forms were developed for each NAB module during the initial development phases. Unlike many tests with parallel forms, one original form was not created first with a secondary form developed after the fact; rather, beginning with a large item pool and using ratings by Advisory Council members and the results of pilot testing, the two equivalent NAB forms were created simultaneously, each with a distinct set of items.
Because many repeat neuropsychological testing sessions occur 6 months or more after the initial evaluation, a test-retest reliability study of the NAB was conducted using a 6-month retest interval. The resulting standard errors of measurement (SEMs) and expected practice effects help differentiate meaningful score differences from artifactual practice effects (Ivnik et al., 1999), as sketched below.
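The use of retest SEMs and expected practice effects to separate meaningful change from practice gains can be sketched as a reliable-change calculation in the general style of the work cited above. The particular SEMs, practice gain, and critical value below are assumptions for illustration, not NAB parameters.

```python
import math

def practice_adjusted_change(time1, time2, expected_practice_gain,
                             sem_time1, sem_time2, critical_z=1.645):
    """Flag retest change that exceeds expected practice effects plus measurement error.

    A one-tailed 90% criterion (z = 1.645) is used here purely for illustration.
    """
    se_diff = math.sqrt(sem_time1 ** 2 + sem_time2 ** 2)
    adjusted_change = (time2 - time1) - expected_practice_gain
    z = adjusted_change / se_diff
    if z <= -critical_z:
        return "reliable decline beyond expected practice effects"
    if z >= critical_z:
        return "reliable improvement beyond expected practice effects"
    return "change within expected practice effects and measurement error"

# Hypothetical values: a 2-point T-score gain when 3 points of practice gain are expected
print(practice_adjusted_change(time1=48, time2=50, expected_practice_gain=3,
                               sem_time1=3.0, sem_time2=3.0))
```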
29
Increased User-Friendliness
The NAB is more user-friendly than existing instruments with respect to:
Modularity
Portability
Face validity
30
Modularity Almost 90% of the survey respondents rated modularity as either moderately (29%) or very important (60%) for a new instrument. Each of the six NAB modules is "self-contained" and may be administered independently of the other modules.
31
Portability 76% of survey respondents rated portability as either moderately or very important. NAB materials are highly portable because a minimal number of manipulatives are required and all necessary visual stimuli are integrated into a single Stimulus Book for each module. All administration and scoring instructions are contained in the Record Forms, eliminating the need to juggle multiple forms and manuals during administration and the inconsistency that can arise when examiners attempt to administer from memory. All materials necessary for an entire NAB administration fit into the provided attaché case.
32
Portability Approximately 73% of the survey respondents rated computerized administration as only slightly important or not at all important. Although this finding is initially surprising, it is understandable because even laptop computers significantly reduce portability and raise design and psychometric problems. Thus, the NAB is administered entirely by an examiner (i.e., not by computer). However, there is a computerized scoring software package (NAB-SP).
33
Face Validity Face validity is an important and often overlooked aspect of neuropsychological validation (Lezak, 1995; Nevo, 1985). Face validity refers to whether a test appears to measure what it purports to measure, as perceived by:
Examinees who take it
Administrative personnel who decide upon its use
Other technically untrained observers, such as the examinee's family (Anastasi & Urbina, 1997)
Tests that lack face validity are more prone to rejection by patients with brain dysfunction, who are likely to be easily frustrated and fatigued.
34
Face Validity The face validity of the NAB was rated by the members of the Advisory Council, and items and tasks with poor face validity ratings were eliminated or modified. Although the attractiveness of test materials is not often discussed in the literature on face validity, the NAB includes modern, inviting, and colorful stimuli, materials, and artwork, including high-quality digital photography.
35
Focus on Ecological Validity
Ecological validity is the functional and predictive relationship between (a) performance on a set of neuropsychological tests during a highly structured, office-based test session and (b) behavior in a variety of real-world settings, such as home, work, or school (Long, 1996). Over 79% of survey respondents rated ecological validity as a moderately or highly important attribute of a new comprehensive neuropsychological test battery.
36
Focus on Ecological Validity
The development of the NAB specifically emphasized ecological validity. For example, each NAB module (with the exception of Screening) includes one Daily Living test that is designed to be highly congruent with an analogous real-world behavior. By definition, NAB Daily Living tests are multifactorial in nature.
37
NAB Materials Manuals:
NAB Administration, Scoring, and Interpretation Manual
NAB Psychometric and Technical Manual
NAB Demographically Corrected Norms Manual
NAB U.S. Census-Matched Norms Manual
38
NAB Materials NAB Software Portfolio (NAB-SP)
Automates many steps involved in calculating raw scores and in obtaining normative scores and profiles
Two score reports: Screening and Main Modules
Choice of two normative samples
Profile graphs, including overlays of multiple administrations
Reports exportable to word processing programs
Data exportable to spreadsheet and database programs
39
NAB Materials Test Administration Materials Record Forms:
One for each Module
One for each of the two NAB equivalent forms
All necessary instructions for administration and scoring
40
NAB Materials Test Administration Materials Response Booklets:
One each for the Screening, Attention, Language, and Executive Functions Modules
One for each of the two NAB equivalent forms
Used for tests that require the examinee to write, draw, or provide other similar responses
41
NAB Materials Test Administration Materials Stimulus Books:
One for each Module
One for each of the two NAB equivalent forms
Contain all visual stimuli presented to the examinee for all tests other than Map Reading
42
NAB Materials Test Administration Materials Manipulatives
Design Construction tests in both the Screening and Spatial Modules use a set of five flat, blue plastic geometric shapes (Tans) based on the ancient Chinese puzzle game.
Spatial Module Map (used for the Map Reading test)
43
NAB Materials Test Administration Materials Scoring Templates
Numbers & Letters tests in both Screening and Attention Modules require scoring templates to score omissions and commissions.
44
General Principles Guiding the Development of the NAB
Tests must be easy to administer and score
Stimuli must be attractive and face valid
Total administration time for the five Main Modules must be 3 hours or less
Start with a large pool of items that represents a wide range of difficulty
Meaningful relationship between analogous Screening Module and Main Module tests
Theoretical foundation must combine empiricism (prediction) and cognitivism (constructs)
Test names should describe the content and/or procedures involved ("Dots" versus "Working Memory Test")
Advisory Council ratings must inform development activities
45
Item Development/Reduction/Selection
For each test, at least two times the final number of items/stimuli were initially created (sometimes 10 times), using detailed development criteria, objective ratings (e.g., word frequency), and computerized manipulations. Results of Advisory Council Ratings AND numerous field testing studies guided both item reduction/selection and equating of forms.
46
Test/Item Characteristics Rated by Advisory Council
Verbal encodability
Clinical utility
Difficulty
Ecological validity
Education bias
Ethnic/racial/cultural bias
Sex bias
U.S. regional bias
Linguistic demands
Quality of stimuli, artwork
Stimulus satisfaction
Task appropriateness
Overall task satisfaction
47
Standardization of the NAB
48
Standardization Sites
Collection of the NAB standardization data started in September of 2001 and concluded in October of 2002. NAB standardization data were collected at five sites that were selected to provide representation in each of the four geographic regions of the U.S. Four of the sites were located at academic institutions with known expertise in neuropsychology; the publisher’s offices in Florida served as the fifth site.
49
NAB Normative Samples The total NAB standardization sample consisted of 1,448 healthy, community-dwelling participants, which formed the basis of the following normative samples:
Demographically corrected norms (N = 1,448)
Age-based, U.S. Census-matched norms (N = 950)
50
Age-based, U.S. Census-matched Norms
The Age-based, U.S. Census-matched sample (N = 950) was abstracted from the total standardization sample. Closely matches the characteristics of the current U.S. population with respect to education, sex, race/ethnicity, and geographic region. Purpose: For making inferences regarding the adequacy of the tested ability in more absolute terms, i.e., compared to the population as a whole.
51
Demographically Corrected Norms
Consists of 1,448 healthy, community-dwelling individuals ranging in age from 18 to 97 years. Of these 1,448 participants, 711 received Form 1 and 737 received Form 2 as part of the standardization study; no participant completed both NAB forms. Purpose: For diagnostic inferences and for interpreting brain-behavior relationships, i.e., compared to age-, education-, and gender-matched peers.
52
Demographically Corrected Norms
For inferring brain-behavior relationships, it has been well established in the neuropsychological literature that demographically corrected norms are the most appropriate normative standard (Heaton et al., 1993; Heaton et al., 1991; Lezak, 1995; Mitrushina et al., 1998; Spreen & Strauss, 1998). The research literature has clearly established that performance on a neuropsychological test can be significantly impacted by an individual's age, educational attainment, and sex, irrespective of potential brain dysfunction. Thus, interpretation of brain-behavior relationships should be based on normative data either categorized according to different groupings of these demographic variables, or "corrected for" the effect of these variables.
53
Demographically Corrected Norms
The demographically corrected norms are recommended for most situations encountered in clinical practice and, thus, they are the primary normative standard for the NAB. All normed scores presented in the psychometric analyses and tables in the NAB manuals are based on the demographically corrected norms.
54
Development of Norms
Selection of normative scores
Equating of Forms 1 and 2 using equipercentile equating methods
Verify the accuracy of the equating process
Conversion of raw scores to z scores
Examination of the effects of age, education, and sex
Continuous norming to create norm tables
Verify the accuracy of the demographic correction process
Calculate composite Index scores
55
Selection of Normative Scores
The NAB consists of 33 individual tests, most of which provide at least several indicators of quantitative and qualitative performance. Prior to beginning the norming process, all potential NAB scores were categorized into one of three types of scores: primary, secondary, or descriptive. Several sources of information were used to categorize scores, including their (a) reliability, (b) presumed interpretive importance, and (c) content and construct validity.

Chapter 5 presents information on the reliability of the NAB scores, including interrater reliability (where appropriate), internal consistency (where appropriate), generalizability, and test-retest reliability. In general, only scores with high reliability across most or all methods were selected as primary variables; those with weaker reliability were relegated to secondary or descriptive status.

In addition to reliability estimates, the distributional properties of each potential score were analyzed. The parametric statistical procedures used to convert raw scores to T scores are based on the assumption of approximate normality of the score distribution. Some NAB scores of potential interpretive interest have a restricted range of raw scores, and this limits the ability to use parametric methods. Therefore, NAB primary scores were selected to have both a relatively large range of possible raw scores and approximately normal score distributions. Most NAB secondary scores have skewed score distributions and/or limited score ranges. NAB descriptive scores also have highly skewed score distributions and/or limited score ranges, but to an even greater degree. It is very rare for a healthy participant, regardless of age and education level, to have less-than-perfect performance on most NAB descriptive scores.

Many NAB tests yield scores that are analogous to neuropsychological measures that have a rich clinical and research tradition, and users are familiar with interpreting such scores. Each test was reviewed from this perspective and scores were categorized on this basis. Primary scores are thought to be the most important indicators of performance on a NAB test. Secondary and descriptive scores are viewed as useful sources of qualitative interpretive information. Finally, NAB scores were also categorized based on their (a) content validity, (b) interrelationships with scores in the same NAB module, (c) interrelationships with scores in other NAB modules, and (d) relationships with external variables (i.e., concurrent neuropsychological measures).
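The categorization logic described above, in which primary scores need both strong reliability and an approximately normal, reasonably wide raw-score distribution, can be caricatured as a simple rule. The thresholds below are invented for illustration and are not the NAB's actual criteria, which also weighed interpretive importance and content/construct validity.

```python
def categorize_score(reliability, skewness, raw_score_range):
    """Assign a candidate score to primary, secondary, or descriptive status.

    Thresholds are hypothetical placeholders, not the NAB's actual decision values.
    """
    roughly_normal = abs(skewness) < 1.0
    wide_range = raw_score_range >= 20
    if reliability >= 0.70 and roughly_normal and wide_range:
        return "primary"
    if roughly_normal or wide_range:
        return "secondary"
    return "descriptive"

print(categorize_score(reliability=0.85, skewness=0.3, raw_score_range=40))  # primary
print(categorize_score(reliability=0.60, skewness=2.5, raw_score_range=6))   # descriptive
```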
56
Types of Normative Scores
Normative metric by score type:
Primary: T scores (M = 50, SD = 10)
Secondary: Percentiles by age group
Descriptive: Cumulative percentages for the overall sample
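The three normative metrics above correspond to simple transformations of the score distribution. The sketch below shows, under standard definitions, how a z score converts to a T score and how percentile and cumulative-percentage values can be read from sample data; the sample values are invented for illustration.

```python
from statistics import NormalDist

def z_to_t(z):
    """Convert a z score to the T-score metric (M = 50, SD = 10) used for primary scores."""
    return 50 + 10 * z

def percentile_rank(raw, age_group_scores):
    """Percentile of a raw score within an age group (secondary-score metric)."""
    below = sum(1 for s in age_group_scores if s < raw)
    return 100.0 * below / len(age_group_scores)

def cumulative_percentage(raw, overall_scores):
    """Percentage of the overall sample scoring at or below a raw score (descriptive-score metric)."""
    at_or_below = sum(1 for s in overall_scores if s <= raw)
    return 100.0 * at_or_below / len(overall_scores)

print(z_to_t(-1.0))                                # 40.0
print(round(NormalDist().cdf(-1.0) * 100, 1))      # ~15.9th percentile under normality
print(percentile_rank(7, [3, 5, 6, 7, 8, 9, 10]))  # hypothetical data: 42.9
```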
57
Equating of Forms 1 and 2 Test equating refers to a family of statistical concepts and procedures that have been developed to adjust for differences in difficulty level on alternate test forms, thus allowing the forms to be used interchangeably. Test equating adjusts for differences in difficulty between the two forms of a test, not for differences in content (Kolen & Brennan, 1995).
58
Equating of Forms 1 and 2 The equipercentile equating method was selected for use with the NAB because it is thought to have greater generalizability and applicability than mean and linear equating when test scores may deviate from a perfectly normal distribution (Kolen & Brennan, 1995), which is the case with many NAB scores. Note that only NAB primary scores are equated. Secondary and descriptive scores are not equated by form; therefore, normative data for these scores are provided separately by form.
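A minimal version of equipercentile equating maps each Form 2 raw score to the Form 1 raw score with the same percentile rank in the two standardization groups. The sketch below uses numpy quantiles on made-up score vectors; the full procedure described by Kolen and Brennan (1995) involves additional smoothing steps not shown here.

```python
import numpy as np

def equipercentile_equate(score_form2, form2_sample, form1_sample):
    """Map a Form 2 raw score onto the Form 1 metric via matched percentile ranks."""
    # Percentile rank of the score in the Form 2 standardization data
    pr = np.mean(np.asarray(form2_sample) <= score_form2)
    # Form 1 raw score at the same percentile rank
    return float(np.quantile(form1_sample, pr))

# Hypothetical standardization data: Form 2 slightly harder than Form 1
rng = np.random.default_rng(0)
form1 = rng.normal(20, 4, 500)
form2 = rng.normal(18, 4, 500)
print(round(equipercentile_equate(18, form2, form1), 1))  # maps near the Form 1 mean (~20)
```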
59
Influence of Demographic Variables
Regression techniques were used to evaluate the potential effects of age, education, and sex on NAB raw scores in the standardization sample. Age, education, and sex were entered into separate regression equations as predictors, with the NAB primary z score as the dependent variable, and the percentage of variance in z scores accounted for by each demographic variable (as reflected by the R2 value) was recorded. Next, the three demographic variables were entered into a stepwise regression equation to determine their combined impact on z scores. Tables 4.14 to 4.19 present these results: the first three columns list the percentage of variance accounted for by each individual demographic variable, the fourth column indicates the percentage of variance accounted for by the group of demographic variables, and the final column lists the combination of demographic variables in the final stepwise model and their relative predictive power.
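These analyses can be reproduced in outline with ordinary least squares: each demographic predictor is entered alone to obtain its R2, and then the set is entered together. The sketch below uses numpy's least-squares routine on made-up data (and omits the stepwise selection step); variable names and effect sizes are illustrative assumptions.

```python
import numpy as np

def r_squared(X, y):
    """Proportion of variance in y accounted for by the predictors in X (OLS)."""
    X = np.column_stack([np.ones(len(y)), X])      # add intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

# Hypothetical standardization-style data
rng = np.random.default_rng(1)
n = 1448
age = rng.uniform(18, 97, n)
education = rng.integers(6, 21, n).astype(float)
sex = rng.integers(0, 2, n).astype(float)
z = -0.02 * (age - 50) + 0.08 * (education - 13) + 0.05 * sex + rng.normal(0, 1, n)

for name, x in [("age", age), ("education", education), ("sex", sex)]:
    print(name, round(r_squared(x.reshape(-1, 1), z), 3))       # each predictor alone
print("all", round(r_squared(np.column_stack([age, education, sex]), z), 3))
```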
60
Continuous Norming The method of continuous norming (Gorsuch, 1983b) was used to derive both the NAB demographically corrected norms and the age-based, U.S. Census-matched norms. Continuous norming corrects for irregularities in (a) the distributions of scores within groupings of the norming variable and (b) trends in the means and standard deviations across groupings when group sample sizes are 200 or smaller (Angoff & Robertson, 1987). Ideally, individually administered tests such as the NAB would have very large samples at each age, education, and sex group; practical realities, however, result in smaller samples than is ideal from a purely statistical standpoint, so these groups provide only estimates of the underlying population parameters. Continuous norming was developed by Gorsuch (1983b) to mitigate the effects of relatively small sample sizes across groups. Continuous norms provide a more accurate estimation of population parameters such as means and standard deviations because they are based on an equation derived from all demographic groups, rather than only the one group for a particular table (Zachary & Gorsuch, 1985). Thus, information about the effects of age, education, and sex on NAB z scores derived from the entire sample of 1,448 participants is used in determining the normative performance for each age, education, and sex group (i.e., each normative table).
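Continuous norming replaces cell-by-cell means and SDs with smooth functions of the norming variables estimated from the whole sample. The sketch below fits low-order polynomial trends for the mean and spread of a score as a function of age and then standardizes against the fitted values; it is a simplified, single-variable illustration of the general Gorsuch/Zachary approach, not the NAB's actual procedure, and all data are invented.

```python
import numpy as np

def fit_continuous_norms(age, score, degree=2):
    """Fit smooth mean and SD trends across age using polynomial regression."""
    mean_coefs = np.polyfit(age, score, degree)
    # Model the spread by regressing absolute residuals on age (a simple SD proxy)
    abs_resid = np.abs(score - np.polyval(mean_coefs, age))
    sd_coefs = np.polyfit(age, abs_resid * np.sqrt(np.pi / 2), degree)  # E|x| -> SD for normal data
    return mean_coefs, sd_coefs

def continuous_z(raw, age, mean_coefs, sd_coefs):
    """z score relative to the fitted mean and SD at this exact age."""
    return (raw - np.polyval(mean_coefs, age)) / np.polyval(sd_coefs, age)

# Hypothetical data: performance declines gently with age
rng = np.random.default_rng(2)
ages = rng.uniform(18, 97, 1448)
scores = 30 - 0.1 * ages + rng.normal(0, 4, 1448)
mean_c, sd_c = fit_continuous_norms(ages, scores)
print(round(continuous_z(24, age=70, mean_coefs=mean_c, sd_coefs=sd_c), 2))
```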
61
Calculation of Index Scores
NAB Module Index scores were calculated in the following manner. For each participant in the demographically corrected standardization sample, the T scores on the tests that comprise the composite were summed, and the cumulative frequency distribution of this new score was calculated. The Module Indexes and Screening Domain scores were then scaled by converting the cumulative frequency distribution of the summed scores to a normalized standard score scale with a mean of 100 and a standard deviation of 15. The Total NAB Index was calculated as the sum of the five Module Indexes, which results in each module contributing equally to the Total NAB Index, regardless of the number of tests that comprise the individual Module Indexes. An analysis of variance (ANOVA) was used to confirm the similarity of mean performance across the normative table groupings. Table 4.38 presents the NAB test composition of the Module Index and Total NAB Index scores.
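The index construction described above (sum the component T scores, then map the cumulative frequency distribution of that sum onto a normalized standard-score scale with M = 100 and SD = 15) can be sketched as follows. The component sums are made up, and this simplified normalization omits the smoothing a published test would apply.

```python
import numpy as np
from statistics import NormalDist

def build_index_lookup(summed_t_scores):
    """Map each observed sum of T scores to a normalized standard score (M = 100, SD = 15)."""
    values, counts = np.unique(np.asarray(summed_t_scores), return_counts=True)
    # Cumulative proportion at the midpoint of each score's frequency band
    cum = np.cumsum(counts) - counts / 2.0
    proportions = cum / counts.sum()
    z = np.array([NormalDist().inv_cdf(p) for p in proportions])
    return dict(zip(values.tolist(), np.round(100 + 15 * z).astype(int).tolist()))

# Hypothetical norm-sample sums of component T scores for one module
rng = np.random.default_rng(3)
norm_sums = np.round(rng.normal(200, 25, 1448))
lookup = build_index_lookup(norm_sums)
example_sum = float(norm_sums[0])
print(example_sum, lookup[example_sum])  # maps this participant's sum onto the Index metric
```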
62
Composite Scores (M = 100, SD = 15)
Main Modules Attention Index Language Index Memory Index Spatial Index Executive Functions Index Total NAB Index Screening Module Screening Attention Domain Screening Language Domain Screening Memory Domain Screening Spatial Domain Screening Executive Functions Domain Total Screening Index
63
Reliability and Score Differences
64
Interrater Reliability
The consistency of agreement of test scores from rater to rater is also an important indication of a test’s reliability, especially for those subtests that require scorer judgment and decision making. Interrater reliability for the NAB was examined for the following subtests: Writing, Story Learning, Figure Drawing, Judgment, and Categories. Thirty Form 1 and 30 Form 2 standardization protocols were randomly selected and independently scored by experienced standardization examiners. For the Writing subtest, two experienced standardization examiners served as raters; because of reduced variability between raters, average percentage agreement coefficients were calculated. The percentage agreement for the NAB Form 1 and Form 2 Writing scores ranged from 95.0% for the primary Writing score (WRT) to 100.0% for Writing Spelling (WRT–spl), for an overall average of 98.1% interrater agreement across the Writing scores. For the remaining subtests (i.e., Story Learning, Figure Drawing, Judgment, and Categories), one-way single-measure intraclass correlation coefficients (ICC) were calculated (see Table 5.24).
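For readers who want to see what a one-way single-measure ICC involves, a compact sketch follows. It computes ICC(1,1) from one-way ANOVA mean squares; the input array is hypothetical and the code is illustrative rather than the analysis actually run for the manual.

# Minimal sketch of a one-way single-measure intraclass correlation,
# ICC(1,1), from one-way ANOVA mean squares; illustrative only.
import numpy as np

def icc_1_1(scores):
    """scores: 2-D array, rows = protocols (targets), columns = raters."""
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    row_means = x.mean(axis=1)
    # Between-target and within-target mean squares.
    ms_between = k * np.sum((row_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((x - row_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Example with two raters scoring 30 protocols:
# icc = icc_1_1(ratings_30_by_2)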
65
Equivalent Forms Reliability
The equivalent forms reliability of the NAB was evaluated by applying generalizability theory (Brennan, 2001; Cronbach, Gleser, Nanda, & Rajaratnam, 1972; Shavelson & Webb, 1991). In contrast to classical psychometric theory, which posits true scores and a unitary or undifferentiated source of error, generalizability theory allows various sources of variance to be partitioned using the familiar analysis of variance design; this study was designed to evaluate the reliability of NAB scores as a function of test form. Generalizability (G) coefficients are considered analogues to traditional reliability estimates. However, in contrast to the magnitude of traditional reliability estimates, generalizability coefficients of .60 or higher should be regarded as demonstrating very good reliability (Cicchetti & Sparrow, 1981; Mitchell, 1979).
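As an illustration of the variance-partitioning idea, the sketch below estimates variance components for a simple persons-by-forms crossed design and returns a relative G coefficient for a single form. It is a simplified example under those design assumptions, not the NAB's actual generalizability analysis.

# Minimal sketch of a persons x forms generalizability analysis using
# two-way ANOVA mean squares; simplified illustration only.
import numpy as np

def g_coefficient(scores):
    """scores: 2-D array, rows = persons, columns = test forms."""
    x = np.asarray(scores, dtype=float)
    n_p, n_f = x.shape
    grand = x.mean()
    p_means = x.mean(axis=1)
    f_means = x.mean(axis=0)
    ms_p = n_f * np.sum((p_means - grand) ** 2) / (n_p - 1)
    resid = x - p_means[:, None] - f_means[None, :] + grand
    ms_pf = np.sum(resid ** 2) / ((n_p - 1) * (n_f - 1))
    # Estimated variance components.
    var_p = (ms_p - ms_pf) / n_f   # persons (universe-score variance)
    var_pf = ms_pf                 # person x form interaction plus error
    # Relative G coefficient for a single form.
    return var_p / (var_p + var_pf)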
66
Equivalent Forms Reliability
Results showed good to excellent generalizability coefficients for most NAB primary scores. The average percentage of variance attributable to form was 2.1% (median = 0.4%); after the forms were equated, the average percentage of residual variance attributable to form was 0.2% (median = 0.0%).
67
Reliability of Composite Scores
Given that the alpha coefficient is an inappropriate estimate for many NAB scores, G coefficients were uniformly used as the reliability estimates of the individual tests when calculating the reliability of the Screening Domain, Total Screening Index, Module Index, and Total NAB Index scores. The reliability coefficients for all composite scores were calculated with the formula recommended by Guilford (1954) and Nunnally (1978).
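For orientation, one widely used expression for the reliability of a composite of weighted or summed component scores, consistent with the treatment in Nunnally (1978), sets the composite reliability equal to one minus the ratio of summed component error variances to the composite variance. Whether this is the exact formula applied to the NAB composites is an assumption here; the sketch below is illustrative only.

# Commonly cited composite-reliability formula (assumed, not verified
# against the NAB manual): 1 - (sum of component error variances) /
# (variance of the composite).
import numpy as np

def composite_reliability(sds, reliabilities, corr):
    """sds: component SDs; reliabilities: component reliability estimates;
    corr: component intercorrelation matrix (1s on the diagonal)."""
    sds = np.asarray(sds, dtype=float)
    rel = np.asarray(reliabilities, dtype=float)
    cov = np.outer(sds, sds) * np.asarray(corr, dtype=float)
    composite_var = cov.sum()                  # variance of the summed score
    error_var = np.sum(sds ** 2 * (1.0 - rel)) # summed component error variances
    return 1.0 - error_var / composite_var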
68
Standard Errors of Measurement and Confidence Intervals
Standard errors of measurement (SEMs) are provided for all NAB primary test scores and composite Domain and Index scores using the following formula: SEM = SD * SQRT(1 - rxx). The SEM provides an estimate of the amount of error in an individual’s observed test score. In this formula, SD is the standard deviation of the score’s normative metric: 10 for the T scores, or 15 for the Screening Domain, Total Screening Index, Module Index, and Total NAB Index scores. Because the Screening Domain, Total Screening Index, Module Index, and Total NAB Index scores will be the primary focus of interpretation for most users, 90% and 95% confidence intervals were developed for these composite scores; they are presented in the normative tables of the selected norms manual.
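A short worked example of the SEM and confidence-interval formulas follows. The score, SD, and reliability values are illustrative, not NAB norms, and the interval is centered on the observed score (some manuals instead center it on the estimated true score).

# Minimal worked example of SEM and confidence intervals; values illustrative.
from math import sqrt

def sem(sd, rxx):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * sqrt(1.0 - rxx)

def confidence_interval(observed, sd, rxx, z=1.96):
    """Symmetric interval around the observed score; z = 1.645 for 90%, 1.96 for 95%."""
    e = z * sem(sd, rxx)
    return observed - e, observed + e

# Example: an Index score of 92 with SD = 15 and reliability = .90
# print(confidence_interval(92, 15, 0.90))         # 95% interval
# print(confidence_interval(92, 15, 0.90, 1.645))  # 90% interval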
69
Score Differences An important consideration in interpreting the performance of individual examinees is the magnitude of difference between planned comparisons of scores. Score differences, or discrepancies, have at least two important aspects, and the coordinated norming of the NAB allows users to interpret them with two types of comparisons. The first is statistical significance, that is, the probability that the two scores are not essentially equal; the second is the frequency of occurrence of the score difference (also referred to as its base rate) in the standardization sample. These two aspects are often expressed as two questions: Is the score difference real and not due to measurement error? What is the incidence rate of this difference in the normal population? It is quite possible to obtain difference scores that are statistically significant but occur relatively frequently in the standardization sample and, by extrapolation, the overall population.
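The first question is commonly answered with a standard-error-of-the-difference approach, sketched below. The reliabilities and the SD of 15 in the example are illustrative; this is the general technique, not a reproduction of the NAB difference tables.

# Minimal sketch of the critical difference between two scores on the same
# metric, using the standard error of the difference; values illustrative.
from math import sqrt

def critical_difference(sd, rxx_1, rxx_2, z=1.96):
    """Smallest difference between two scores (same SD metric) that is
    statistically significant at the chosen z level."""
    se_diff = sd * sqrt((1.0 - rxx_1) + (1.0 - rxx_2))
    return z * se_diff

# Example: two Index scores (SD = 15) with reliabilities .92 and .88
# print(critical_difference(15, 0.92, 0.88))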
70
Base Rate of Score Differences
The base rate of score differences addresses the actual occurrence rates, expressed as cumulative percentages, of score discrepancies present in the standardization sample. It is quite possible to obtain difference scores that are statistically significant but that occur relatively frequently in the norm sample. Base rates of score differences are provided for all Screening Module composite score pairs and for all Main Module composite score pairs.
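The second question is answered empirically from the normative data, as sketched below: the examinee's discrepancy is compared against the cumulative distribution of discrepancies in the standardization sample. The variable names are hypothetical and the code is illustrative only.

# Minimal sketch of the base rate (cumulative percentage) of a score
# discrepancy in a standardization sample.
import numpy as np

def base_rate(sample_diffs, observed_diff):
    """Percentage of the normative sample showing a discrepancy at least as
    large (in the same direction) as the observed one."""
    sample_diffs = np.asarray(sample_diffs, dtype=float)
    return 100.0 * np.mean(sample_diffs >= observed_diff)

# Example: attention_minus_memory = attention_index - memory_index  (per person)
# print(base_rate(attention_minus_memory, 12))  # % of sample with a difference >= 12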
71
Validity of the NAB
72
Overview of Validity Studies
Evidence based on theory and test content (i.e., content validity) Evidence based on internal structure Intercorrelations of test and Index scores Factor analyses (EFA and CFA) Relationship between Screening Domain and Module Index scores Evidence based on relationships to external variables Evidence based on performance of clinical groups
73
Content Validity Reviews of the neuropsychological literature (e.g., Hebben & Milberg, 2002; Lezak, 1995; Mapou & Spector, 1995; Mitrushina et al., 1998; Spreen & Strauss, 1998; Williamson et al., 1996) have identified seven major functional domains that are typically assessed: (a) attention and information processing (including working memory); (b) language and verbal communicative functions; (c) spatial/perceptual skills; (d) learning and memory; (e) executive functions and problem-solving abilities; (f) sensorimotor functions; and (g) personality, emotional, and adaptive functions.
74
Content Validity This conceptual framework has been confirmed by factor analytic studies of various neuropsychological batteries (Ardila, Galeano, & Rosselli, 1998; Larrabee & Curtiss, 1992; Leonberger et al., 1992; Ponton, Gonzalez, Hernandez, Herrera, & Higareda, 2000), and it served as the underlying structure throughout the development of the NAB.
75
Content Validity As described previously, results of the publisher’s survey of neuropsychological needs and practices guided the content composition of the NAB. Those results provided strong support for organizing the NAB into a Screening Module and five main modules corresponding to functional domains: Attention Module, Language Module, Memory Module, Spatial Module, and Executive Functions Module. Survey respondents reported a strong preference to continue using existing measures of sensorimotor functions and personality/emotional functions; that is, the preference was to not create new measures of these functions for a newly developed battery.
76
Content Validity Content validity concerns how well a group of items or tests represents the previously defined domain or domains of interest. Evidence of content validity is typically obtained by having knowledgeable experts examine the test material and make judgments about the appropriateness of each item and/or test and the overall content coverage of the domain. In addition, content validity is often evaluated by examining the procedures and plans used in test construction.
77
Content Validity Content validity of the NAB was established through a variety of methods, including: a review of the neuropsychological assessment literature, including factor analytic work; a national survey of neuropsychologists; replicable development procedures for each NAB test; and extensive Advisory Council ratings and feedback.
78
Content Validity The detailed procedures used to develop each NAB test are beyond the scope of this presentation. However, they are discussed at length in Chapter 2 of the NAB Psychometric and Technical Manual. The methods used to create the NAB tests provide support for the content-related validity of each test and for the modular structure of the NAB.
79
Internal Structure of the NAB
Intercorrelations among NAB scores (see Manual) Exploratory factor analyses (EFA) Confirmatory factor analyses (CFA)
80
Exploratory Factor Analyses
Separate EFAs were performed for the primary scores of the Screening Module and the primary scores of the main modules, using the NAB standardization sample; several steps were taken to reduce the impact of method variance. Factors were extracted using principal axis factoring followed by Promax rotation of the retained factors. For both sets of analyses, three- to six-factor solutions were examined. All factor solutions were interpreted using traditional methods (e.g., evaluation of the scree plot and eigenvalues), and the theoretical underpinnings of the NAB and the meaningfulness of the constructs were also taken into consideration, following the recommendations of Gorsuch (1983a, 1996).
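The pipeline described above can be illustrated with the third-party factor_analyzer package, as in the sketch below. The data file and score names are hypothetical, and this is not the analysis code used for the NAB; it simply shows principal axis extraction with Promax rotation across 3- to 6-factor solutions.

# Illustrative EFA sketch using the factor_analyzer package.
import pandas as pd
from factor_analyzer import FactorAnalyzer

scores = pd.read_csv("nab_primary_scores.csv")  # hypothetical file of primary scores

# Inspect eigenvalues (scree) before settling on the number of factors.
fa0 = FactorAnalyzer(rotation=None)
fa0.fit(scores)
eigenvalues, _ = fa0.get_eigenvalues()
print("Eigenvalues:", eigenvalues.round(2))

# Extract 3- to 6-factor solutions with principal axis factoring + Promax rotation.
for k in range(3, 7):
    efa = FactorAnalyzer(n_factors=k, method="principal", rotation="promax")
    efa.fit(scores)
    loadings = pd.DataFrame(efa.loadings_, index=scores.columns)
    print(f"\n{k}-factor solution (rotated loadings):\n", loadings.round(2))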
81
Summary of EFA Models Although the NAB test content and resulting test score configurations were based on an extensive review of the neuropsychological literature and multiple iterations of refinement, the hypothesized internal structure was also examined empirically with exploratory factor analytic (EFA) techniques. These EFAs were conducted as a means of forming additional hypotheses regarding the number and composition of the latent factors that underlie the observed NAB data. Although there are a number of criticisms of EFA methodology (e.g., Mulaik, 1987; Nunnally, 1978), the obtained EFA models show a fair degree of concordance with the conceptual model of neuropsychological constructs that underlies the NAB. It was fully recognized from the inception of NAB development that there is considerable construct overlap among many of the conceptual domains assessed by the NAB and that many of the NAB tests are multifactorial in nature; furthermore, some tests are more dependent on speeded performance than others, and the modality of test stimulus presentation also affects the factor loadings. One consistent finding of the EFAs was a potential construct that can be conceptualized as psychomotor speed. Although the exploratory factor solutions vary somewhat from solution to solution, they do lend evidence that the NAB measures multiple conceptual domains and that the factor structure is highly consistent with the modular development and the related conceptual neuropsychological domains. The hypotheses generated by the EFAs, including the psychomotor speed factor, were evaluated in the construct-testing process of the subsequent confirmatory factor analyses (CFA). The reader should also be aware that factor solutions obtained from EFA often show inadequate fit when applied in CFA (Van Prooijen & Van der Kloot, 2001). The primary difference between the two is that EFA is typically used to explore or generate hypotheses, whereas CFA is intended as a theory or construct evaluation procedure (Stevens, 1996); as such, CFA results bear directly on establishing the validity of the NAB Domain and Index scores.
82
Confirmatory Factor Analyses
CFAs were performed with the AMOS Version 4.0 structural equation modeling software program. The NAB standardization sample was again used for the analyses. A variety of fit statistics were evaluated. The Comparative Fit Index (CFI) and the root mean square error of approximation (RMSEA) were given priority because they provide more stable and accurate estimates (Hu & Bentler, 1995). CFI values at or above .95 suggest good fit. RMSEA values at or below .06 suggest good fit.
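For reference, the two prioritized fit indices can be computed from the model and baseline chi-square statistics using their standard formulas, as sketched below. The chi-square values in the example are purely illustrative; the sample size of 1,448 echoes the standardization sample mentioned earlier.

# Standard formulas for the two prioritized fit indices; example values illustrative.
def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Comparative Fit Index; values at or above .95 suggest good fit."""
    d_model = max(chi2_model - df_model, 0.0)
    d_baseline = max(chi2_baseline - df_baseline, 0.0)
    return 1.0 - d_model / max(d_baseline, d_model, 1e-12)

def rmsea(chi2_model, df_model, n):
    """Root mean square error of approximation; values at or below .06 suggest good fit."""
    return (max(chi2_model - df_model, 0.0) / (df_model * (n - 1))) ** 0.5

# Example: print(cfi(310.2, 120, 4850.7, 136), rmsea(310.2, 120, 1448))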
83
Summary of CFA Results CFA of the Screening Module resulted in a five-factor model that mirrors the five functional neuropsychological domains purportedly measured by the Screening Domain scores. CFA of the Main Modules resulted in a six-factor model that mirrors the five functional domains purportedly measured by the Module Index Scores, plus the presence of an additional psychomotor speed factor that underlies performance on several NAB tests. CFA results support the construct validity of the Screening Domain and Module Index scores.
84
Validity Evidence Based on Relationships to External Variables
85
Healthy Validity Study
Fifty standardization sample participants received a “gold standard” battery of neuropsychological tests within 10 days of their initial NAB examination.
86
Clinical Validity Studies (~200 Patients)
Mild TBI (Full Clinical Battery): n = 31 Dementia (MMSE, DRS-2): n = 20 MS: n = 31 ADHD: n = 30 HIV/AIDS: n = 19 Aphasia (BNT, Token Test): n = 31 Inpatient Rehab (FIM): n = 39
87
Convergent Validity Results of “Healthy Validity” study and Clinical Group studies were used to examine the convergent and divergent validity of individual NAB scores and indexes. All validity coefficients are presented in the NAB Psychometric and Technical Manual. The following selected data are illustrative of the overall convergent validity findings.
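Convergent and divergent validity coefficients of this kind are typically Pearson correlations between NAB scores and external criterion measures, as in the brief sketch below. The data file and column names are hypothetical placeholders, not the actual study variables.

# Illustrative sketch of computing convergent and divergent validity coefficients.
import pandas as pd

data = pd.read_csv("healthy_validity_study.csv")  # hypothetical merged dataset

pairs = [
    ("nab_memory_index", "external_memory_test"),   # convergent comparison
    ("nab_memory_index", "external_naming_test"),   # divergent comparison
]
for nab_score, criterion in pairs:
    r = data[nab_score].corr(data[criterion])  # Pearson r by default
    print(f"{nab_score} vs {criterion}: r = {r:.2f}")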