Diagnostic Mathematics Assessments: Main Ideas

1

2 Diagnostic Mathematics Assessments: Main Ideas ▪ Now typically assess knowledge and skills on subsets of the 10 standards specified by the National Council of Teachers of Mathematics ▪ Designed to identify specific strengths and weaknesses in skill development ▪ Attempt to assess a wide variety of skills ▪ There are fewer diagnostic math assessments than reading assessments, since math is more clear-cut

3 Purpose for Assessing Math ▪ Provide detailed information so that teachers and interventionists can determine a student’s mastery of skills and plan individualized math instruction ▪ Provide teachers with specific information on the kinds of items that students pass or fail ▪ Gives insight into how curriculum and instruction are working in the class ▪ Also allows for modification of the curriculum

4 Purpose for Assessing Math ▪ Teachers need to know if students have mastered facts and concepts ▪ Occasionally used to make exceptionality and eligibility decisions ▪ Often used to establish special learning needs and eligibility for programs for children with learning disabilities in math

5 National Council of Teachers of Mathematics ▪ Suggests that a curriculum follow these standards in every grade, just at different levels ▪ Content Standards ▪ Process Standards

6 National Council of Teachers of Mathematics ▪ Content Standards (followed at all grades) ▪ Numbers and Operations ▪ Algebra ▪ Geometry ▪ Measurement ▪ Data Analysis and Probability

7 National Council of Teachers of Mathematics ▪ So, you ask, what would these look like in first grade? ▪ Numbers and Operations: 3 + 1+ ▪ Algebra: 3 + ☐ = 4 ▪ Geometry: What shape is ▪ + __________ ▪ Measurement: measure the temperature, time, etc. ▪ Data Analysis and Probability: Graph how many people have teddy bears and how many have teddy dogs or teddy rabbits

8 National Council of Teachers of Mathematics ▪ Process Standards ▪ Problem Solving ▪ Reasoning and Proof ▪ Communication ▪ Connections ▪ Representation

9 National Council of Teachers of Mathematics ▪ What do the Process Standards look like in first grade? ▪ Reasoning and Proof ▪ Complete the pattern ▪ …

10 Group Mathematics Assessment and Diagnostic Evaluation (G-MADE) ▪ Group-administered, norm-referenced, standards-based test for assessing the math skills of students in K-12 ▪ Purpose: to identify specific strengths and weaknesses in math skill development and to lead to teaching strategies ▪ Test materials include a CD that provides a cross-reference between specific math skills and teaching resources ▪ Diagnosis of skills is broad

11 G-MADE Subtests ▪ Concepts and Communication: measures student knowledge of the language, vocabulary, and representations of math ▪ Operation and Computation: measures skills in using the basic operations of addition, subtraction, multiplication, and division ▪ Process and Application: measures skill in taking in the language and concepts of math and applying the appropriate operations and computations to solve a word problem

12 G-MADE Scores ▪ Raw scores can be converted to standard scores with a mean of 100 and a standard deviation of 15 ▪ Growth Scale Values are provided to track growth of math skills ▪ Can track growth over one year or from year to year
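
To make the standard-score scale concrete, here is a minimal sketch of how a score on a mean-100, SD-15 scale relates to a z-score and a percentile rank. It assumes a simple linear conversion and a normal distribution; the actual G-MADE conversion uses the publisher's norm tables, and the norm-group mean and SD below are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist


def standard_score(raw, norm_mean, norm_sd, mean=100.0, sd=15.0):
    """Linear raw-to-standard conversion (an assumption; real tests use norm tables)."""
    z = (raw - norm_mean) / norm_sd
    return mean + sd * z


def percentile_rank(score, mean=100.0, sd=15.0):
    """Percent of a normal distribution falling at or below the given standard score."""
    return 100.0 * NormalDist(mean, sd).cdf(score)


# Hypothetical norm group: mean raw score 40, SD 8 (not from the G-MADE manual).
ss = standard_score(raw=48, norm_mean=40, norm_sd=8)
print(ss)                          # 115.0
print(round(percentile_rank(ss)))  # 84
```

A raw score one standard deviation above the norm-group mean lands at 115, which is roughly the 84th percentile under the normality assumption.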

13 Test Materials ▪ Teacher’s Manual ▪ Student Booklets ▪ Answer Sheets ▪ Hand-Scoring Template ▪ Technical Manual ▪ Age-Based Norms and Grade-Based Out of Level Norms Supplement ▪ Scoring and Reporting Software

14 Reliability ▪ All reliabilities exceed .74, with more than 90% exceeding .80 ▪ The only low reliabilities are for 7th-grade Concepts and Communication and for Process and Applications at all grades beyond 4th ▪ Internal consistency and stability are sufficient for using the test to make decisions about individuals
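
One way to see what these coefficients mean for an individual score is the standard error of measurement (SEM). The slides do not state this formula; the sketch below uses the textbook definition SEM = SD × √(1 − reliability) on the SD-15 standard-score scale, and the 1.96 multiplier for an approximate 95% band is likewise an assumption for illustration.

```python
import math


def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def band(score, sd, reliability, z=1.96):
    """Approximate confidence band around an observed score."""
    half_width = z * sem(sd, reliability)
    return (score - half_width, score + half_width)


# Compare the reliability levels mentioned on the slide, plus .90 for contrast.
for r in (0.74, 0.80, 0.90):
    low, high = band(100, 15, r)
    print(f"r={r:.2f}  SEM={sem(15, r):.2f}  band={low:.1f} to {high:.1f}")
```

The lower the reliability, the wider the band around any one student's score, which is why higher coefficients are expected before a test is used for decisions about individuals.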

15 Validity ▪ Content is based on NCTM standards ▪ Created from a year-long study of standards, curriculum benchmarks, the scope and sequence commonly used in math textbooks, and a review of research on best practices for teaching math concepts and skills ▪ Many studies support the criterion-related validity of the test ▪ In comparison with KeyMath, all correlations were in excess of .80, making the 2 tests highly comparable
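
Criterion-related validity figures like the ".80" above are correlation coefficients between scores on two tests taken by the same students. The following is a minimal sketch of a Pearson correlation; the two score lists are hypothetical and are not data from the G-MADE or KeyMath manuals.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical standard scores for seven students on two math tests.
test_a = [85, 92, 100, 104, 110, 118, 125]
test_b = [88, 90, 97, 108, 107, 121, 122]

r = pearson_r(test_a, test_b)
print(f"r = {r:.2f}; exceeds .80: {r > 0.80}")
```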

16 Other Information ▪ The test is not timed, since it is meant to test power, not speed ▪ Older students can complete the test in a single one-hour session; most students finish in about 45 minutes ▪ With younger students, multiple short testing sessions are recommended

17 KeyMath-3 Diagnostic Assessment (KeyMath-3 DA) ▪ An untimed, individually administered, norm-referenced test designed to provide a comprehensive assessment of essential math concepts and skills in individuals ages 4 years, 6 months through 21 years ▪ Time: 30-40 minutes in lower elementary and 70-90 minutes for older students ▪ Provides a means of monitoring an individual’s progress over time with 2 parallel forms that can be administered in alternating sequence every 3 months ▪ Also provides Growth Scale Values (GSVs), a type of developmental scale score

18 Uses for KeyMath-3 DA ▪ Assess math proficiency by providing comprehensive coverage of concepts and skills taught in regular math instruction ▪ Assess student progress in math ▪ Support instructional planning ▪ Support educational placement decisions

19 KeyMath-3 DA ▪ 2 parallel forms (A and B) of the test ▪ Each test has 372 items divided into the following subtests: ▪ Numeration ▪ Algebra ▪ Geometry ▪ Measurement ▪ Data Analysis and Probability ▪ Mental Computation and Estimation ▪ Addition and Subtraction ▪ Multiplication and Division ▪ Foundations of Problem Solving ▪ Applied Problem Solving

20 KeyMath-3 DA Resources ▪ Manual ▪ Two free-standing easels for either Form A or B ▪ 25 record forms with detachable Written Computation Examinee Booklets ▪ Two additional products are available: ▪ ASSIST Scoring and Reporting Software Program ▪ KeyMath-3 DA Essential Resources Instructional Program

21 KeyMath-3 DA Scores ▪ Can be scored by hand or with software ▪ Relative Standing: scale scores, standard scores, percentile rank ▪ Developmental Scores: grade and age equivalents, growth scale values ▪ Composite Scores: basic concepts, operations, application ▪ Software can produce progress reports, narrative summaries, and parent reports, and can export scores to Excel

22 Reliability ▪ Internal Consistency – low in K and 1st grade, but exceeds .80 at other ages ▪ Alternate Form – exceeds .80, with the exception of the alternate forms for Geometry and Data Analysis and Probability ▪ Adjusted Test-Retest – based on 103 students in grades K-12; generally exceeds .80, with the exception of the Foundations of Problem Solving (.70) and Geometry (.78) subtests ▪ Adequate for screening and diagnostic purposes

23 Validity ▪ Correlates very highly with scores on the KeyMath-Revised normative update and with scores on the Kaufman Test of Educational Achievement, Measures of Academic Progress (MAP), and G-MADE ▪ Evidence for content validity is good, based on alignment with state and NCTM standards

24 Weaknesses for Diagnostic Math Assessments ▪ Recurring issue of curriculum match ▪ Selecting the appropriate test for the type of decision to be made ▪ Do not test a sufficiently detailed sample of math concepts and facts – must generalize ▪ Due to these weaknesses, tests are not very useful in assessing readiness or strengths and weaknesses in order to plan instructional programs ▪ Preferred practice is for teachers to develop curriculum-based achievement tests that exactly parallel the curriculum being taught

25 Goal of Oral and Written Language Assessments “The assessment of language competence should include evaluation of a student’s ability to process, both in comprehension and in expression, language in a spoken or written format.”

26 Major Communication Processes 1. Oral Comprehension – listening and comprehending speech 2. Written Comprehension – reading 3. Oral Expression – speaking 4. Written Expression - writing

27 Related Terminology
Language Component | Reception/Comprehension | Expression/Production
Phonology | Hearing and discriminating speech sounds | Articulating speech sounds
Morphology and Syntax | Understanding the grammatical structure of language | Using the grammatical structure of language
Semantics | Understanding vocabulary, meaning, and concepts | Using vocabulary, meaning, and concepts
Pragmatics and Supralinguistics | Understanding a speaker’s or writer’s intentions | Using awareness of social aspects of language

28 Considerations in Assessing Oral Language Cultural Diversity Birth place, pronunciations, comparing with the same language community Developmental Considerations Sounds, linguistic structures, and some semantic elements are developmental

29 Considerations in Assessing Written Language Content – Production Formulating, elaborating, sequencing, clarifying, and precise word choice to convey meaning Form Penmanship, spelling, and style rules

30 Observing Language Behavior The following are the three main procedures for gathering a sample of a student’s language behavior. Spontaneous Language Imitation Elicited Language

31 Observing Language Behavior Advantages to Spontaneous Language Spontaneity is the best and most natural indicator of everyday language performance. Informality makes assessment easy, no formal testing atmosphere.

32 Observing Language Behavior Disadvantages of Spontaneous Language There is a non-standard nature to the data collected by this type of test. This test can take a very long time to collect data.

33 Observing Language Behavior Advantages of Imitation Overcomes many of the problems associated with the spontaneous approach. Assesses many different language elements to give a representative view of child’s language system Structure of the test allows examiner to know all elements of language being assessed. Test can be administered much more quickly than with spontaneous tests.

34 Observing Language Behavior Disadvantages of Imitation Children’s auditory memory may affect the results – a child can score well by imitation without demonstrating productive knowledge of the language structures being tested. A child can repeat exactly what is said if the utterance or sentence is too short to require any memory processing. Children become very bored and can’t sit still. There are no stimuli like pictures or toys present, just the repetition of 50 to 100 sentences after the examiner.

35 Observing Language Behavior Advantages to Elicited Language Pictures can be structured to test desired language elements while retaining some of the spontaneous language samples. Allows children to create language on their own. There is no time limit so results do not depend on child’s word retention ability.

36 Observing Language Behavior Disadvantages of Elicited Language Difficult to find pictures to guarantee exact word or sentence response. Child may not produce or attempt to produce the desired language structure.

37 Tests Test of Written Language – 4th edition (TOWL-4) Test of Language Development: Primary – 4th edition (TOLD-P:4) Test of Language Development: Intermediate – 4th edition (TOLD-I:4) Oral and Written Language Scales (OWLS) Test of Auditory Reasoning and Processing Skills (TARPS)

38 Six Subtests Sentence combining. The child is required to form one compound or complex sentence from two or more simple sentences spoken by the examiner. Picture vocabulary. The child points to the picture that best represents a series of two-word items. Word ordering. The child forms a complete, correct sentence from a randomly-ordered string of words, ranging from three to seven in length. Relational vocabulary. The child tells how three words, spoken by the examiner, are alike. Morphological comprehension. The child distinguishes between grammatically correct and incorrect sentences. Multiple meanings. The examiner says a word and the student responds by saying as many different meanings for that word as he/she can think of.

39 Reliability and Validity TOLD-I:4 appears to meet and often exceed the standards for reliability for making screening and diagnostic decisions. The reliability coefficients exceed 0.90. Unlike the TOLD-P:4, there is good evidence for the construct validity of this test: it is based on oral language ability, which is known to be related to literacy, and it correlates highly with reading and writing abilities.

40 Oral and Written Language Scales (OWLS) Individually administered assessment of receptive and expressive language. Test includes three scales: - Listening Comprehension - Oral Expression - Written Expression Recommended uses (ages 3 – 21): to determine broad levels of language skills and specific performance in listening, speaking, and writing; to create intervention plans; and to monitor student progress. Scores can be converted to obtain age equivalents, percentiles, etc.

41 Listening Comprehension Takes approx. 5 – 15 min. Measures understanding of spoken language. 111 items – examiner reads aloud a verbal stimulus. The student has to identify which of 4 pictures is the best response to the stimulus.

42 Oral Expression Takes approx. 5-15 min. Measures understanding of and use of spoken language. 96 items – examiner reads aloud a verbal stimulus and shows a picture. Student responds orally by either answering a question, completing a sentence, or generating one or more sentences.

43 Written Expression Timed response test. Measures the ability of students 5 – 21 years old to use spelling, punctuation, and syntax (sentence structure, phrases, etc.) and to communicate with appropriate content, coherence, organization, etc. The student responds to direct writing prompts given by the examiner.

44 Reliability and Validity There are wide ranges in reliability coefficients for this test. Results of this test are sufficient to use as a screening device but are not sufficient to use in making important decisions about individual students. Authors of this test report that the validity studies comparing these subtests to established criterion measured tests were similar in performance and within the expected range of validity.

45 Theory of multiple intelligences Heredity Learn through experiences Today most theorists recognize the importance of both heredity and experience.

46 Intelligence test results are used to determine eligibility for special services. School Psychologists are trained professionals who administer Intelligence Tests. IQ tests are helpful in providing general information as to how to pace instruction.

47 An inferred ability; to explain differences in present behavior and to predict differences in future behavior. It is a general ability that enables people to do many different things.

48 A child’s background experiences and learning opportunities that they already have. Culture Experiences available in one’s environment Age …..that may influence the psychological demands presented by the test. ***Failure is NOT due to an inability to comprehend or solve a problem, but a deficiency in background experience***

49 Discrimination: identify the item that is different from the others Generalization: given a stimulus, identify from a group the one that goes with the stimulus Motor Behavior: requires motor response in duplicating a geometric design using blocks, tracing a path through a maze, or reconstructing designs from memory. General Knowledge: factual questions Vocabulary: naming pictures or reading a definition and selecting a picture (depending on age)

50 Induction: state a rule or principle from a series of objects Comprehension: 3 types: those related to directions, to printed material, or to social customs and mores. Sequencing: identify the response that continues a series Detail Recognition: identify the missing parts of a picture Analogical Reasoning: how things are related to each other, “A : B :: C : _____?”

51 Pattern Completion: completing a pattern or identifying a missing part of a pattern Abstract Reasoning: identify the absurdity in a picture or verbal statement Memory: many different assessments are used to measure memory, ex. verbatim repetition of a sentence or series of numbers

52 Individual Tests: given one on one by a certified evaluator; most commonly used for educational placement decisions.

53 Three types of Intelligence Tests Group Tests: may be used as a screening tool for individual students, or to gain information about groups of students.

54 Nonverbal Intelligence Tests: Picture-Vocabulary test; administered to non-readers, ELLs, and hearing-impaired students. * This test measures only one aspect of intelligence (receptive vocabulary) and should not be used to determine eligibility for special services.

55 Developed by David Wechsler in 1949, it has since had several revisions. Wechsler states, “intelligence is the overall capacity of an individual to understand and cope with the world around him.” The test is a measure of the cognitive ability and problem-solving process of a person ages 6 years to 16 years, 11 months.

56

57 Subtests; Core and Supplemental*: Verbal Comprehension Index (VCI): Similarities, Vocabulary, Comprehension, Information*, Word Reasoning*

58 Wechsler Intelligence Scale for Children-IV (WISC-IV) Subtests; Core and Supplemental*: Perceptual Reasoning Index (PRI): Block Design, Picture Concepts, Matrix Reasoning, Picture Completion*

59 Wechsler Intelligence Scale for Children-IV (WISC-IV) Subtests; Core and Supplemental*: Working Memory Index (WMI): Digit Span, Letter-Number Sequencing, Arithmetic*

60 Wechsler Intelligence Scale for Children-IV (WISC-IV) Subtests; Core and Supplemental*: Processing Speed Index (PSI): Coding, Symbol Search, Cancellation*

61 The full-scale IQ (FSIQ) is reliable enough to make important educational decisions. There is not enough information gathered from the subtests alone to make the educational decisions.

62 When using the WISC-IV to determine educational needs for a student, examiners should only use the FSIQ.

63 timed test sample 2 minutes 9 blocks

64 Pick one picture from each row with common characteristics

65

66 Look at this picture. What part is missing?

67 Measures general intellectual ability, specific cognitive abilities, scholastic aptitudes, oral language and achievement. Individually administered and norm-referenced For ages 2-90+ Computer scored Each Test Record contains a seven-category Test Session Observation Checklist to rate a student’s conversational proficiency, cooperation, activity, attention and concentration, self-confidence, care in responding and response to difficult tasks.

68 20 subtests measuring broad and narrow abilities: comprehension-knowledge, long-term retrieval, visual-spatial thinking, auditory processing, fluid reasoning, processing speed, short-term memory. Subtests can be combined to create additional clusters for verbal ability, thinking ability, cognitive efficiency, phonemic awareness, and working memory. Additional supplemental subtests create more clusters: broad attention, cognitive fluency, and executive processes.

69 22 tests can be combined to form several clusters. Subtests and clusters from the standard battery can be combined to form scores for broad areas in reading, math and writing. Oral expression, listening comprehension, basic reading skills, reading comprehension, phoneme/grapheme knowledge, math calculation skills, math reasoning, written expression

70 Individual tests are combined to provide clusters for educational decision making. Cluster reliabilities for some age groups are less than .90, but all median reliabilities across age groups for the standard and broad cognitive and achievement clusters exceed .90.

71 Careful item selection is consistent with claims for the content validity of both tests. Studies using a broad range of individuals provide evidence for validity. For the Cognitive Ability Tests, the correlations between the WJ-III General Intellectual Ability score and the WISC-III Full-Scale IQ range from .69 to .73. For the Achievement Tests, the pattern and magnitude of correlations with the Wechsler Individual tests suggest that the WJ-III measures skills similar to those measured by other achievement tests.

72 A non-timed test primarily given to younger children and ELLs. Assesses the receptive (hearing) vocabulary of examinees. It consists of stimulus sets of 12 items, and examinees are tested at their ability or age level. As part of a broader assessment, it can be useful in evaluating language competence, selecting the level and content of instruction, and measuring learning. The assessment of vocabulary is also useful when evaluating the effects of injury or disease. It is individually administered using an easel. Available in Spanish.

73 Examinees earn a raw score based on the number of pictures correctly identified between basal and ceiling items Basal - the lowest set administered that contains one or no errors Ceiling – the highest set administered that contains eight or more errors Testing is discontinued once a ceiling is established
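
The basal and ceiling rules above can be sketched in a few lines. This is a minimal illustration rather than the publisher's scoring procedure: it assumes sets of 12 items numbered from 1, assumes items below the basal set are credited as correct (the usual convention for basal/ceiling tests, which the slide does not spell out), and uses a hypothetical record of errors per set.

```python
SET_SIZE = 12          # items per set, as stated on the previous slide
BASAL_MAX_ERRORS = 1   # basal: lowest set with one or no errors
CEILING_MIN_ERRORS = 8 # ceiling: highest set with eight or more errors


def raw_score(errors_by_set):
    """errors_by_set maps set number (1-based) to errors, for administered sets only."""
    for set_number, errors in errors_by_set.items():
        if not 0 <= errors <= SET_SIZE:
            raise ValueError(f"set {set_number}: errors must be 0..{SET_SIZE}")
    basal_sets = [s for s, e in errors_by_set.items() if e <= BASAL_MAX_ERRORS]
    ceiling_sets = [s for s, e in errors_by_set.items() if e >= CEILING_MIN_ERRORS]
    if not basal_sets or not ceiling_sets:
        raise ValueError("need both a basal set and a ceiling set")
    basal, ceiling = min(basal_sets), max(ceiling_sets)
    if basal >= ceiling:
        raise ValueError("basal set must come before the ceiling set")
    # Assumption: every item below the basal set counts as correct, so the
    # score is the last ceiling item number minus errors from basal to ceiling.
    ceiling_item = ceiling * SET_SIZE
    errors = sum(e for s, e in errors_by_set.items() if basal <= s <= ceiling)
    return ceiling_item - errors


# Hypothetical session: testing started at set 5 and stopped at set 8.
print(raw_score({5: 1, 6: 3, 7: 5, 8: 9}))  # 78
```

In this hypothetical session the ceiling item is 96 (set 8 × 12 items) and the student made 18 errors between basal and ceiling, giving a raw score of 78.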

74 Multiple kinds of reliability are reported The scores of a PPVT-4 test are very precise and consistent Data also included on the testing and performance of students with disabilities

75 Five studies were conducted and indicate that there is adequate validity Slightly lower correlations were found on assessments that measured broader areas of language than primarily vocabulary Data is also provided on how students with speech and language impairments, hearing impairments, specific learning disabilities, mental retardation, giftedness, emotional/behavioral disturbances and ADHD, perform in relation to the general population Results indicate the value of the PPVT-4 in assessing these special populations

76 Assessing children’s IQ is controversial Intelligence tests assess samples of behavior Different intelligence tests sample different behaviors Educators must always ask “IQ on what test?” Test authors have their own definitions of intelligence and therefore test those items/behaviors they feel represent their definition When interpreting intelligence scores, avoid making judgments that suggest that the score represents much more than the specific behaviors sampled The quality of measurement can be affected by several different types of student characteristics and therefore must be taken into consideration

77 “Many of the behaviors sampled on intelligence tests are more indicative of actual achievement than ability to achieve.” For example, “students who have had more opportunities to learn and achieve are likely to perform better than those who have had less exposure to information, even if they both have the same overall potential to learn.” “Intelligence tests are by no means a pure representation of a student’s ability to learn.”

