
1 Overview of New State Data Forensics Analysis March 2011

2 Outline of Presentation
Data Forensics (DF) Process
FDOE DF Goals
DF Tools & Methods
Spring 2011 DF Program
Conservative thresholds
Students & Schools
Summary
Q&A

3 Caveon Data Forensics™ Process
Analyses of test data
First, build a "model" of typical question responses
Then identify unusual behaviors that could signal an unfair advantage

4 Caveon Data Forensics Process (continued)
Examples of "unusual" behavior:
Very high agreement among pairs or groups of test takers
Very unusual number of erasures, particularly wrong-to-right
Very substantial gains or losses from one test occasion to another

5 Overview of the Use of Data Forensics
Many high-stakes testing programs now use Data Forensics
Reflected in testing standards, e.g., CCSSO's "Operational Best Practices for State Assessment Programs"
Essential to act on the results

6 FDOE Data Forensics Goals
Uphold fairness and validity of test results
Identify risks and irregularities
Take action based on data and analysis: "measure and manage"
Communicate zero tolerance for cheating

7 Testing Examiner’s Role
Ensure (and then certify) that the test administration is fair and proper
Declare scores invalid when fairness and validity are negatively impacted
Exercise absolute due diligence when proctoring a test
Administering or proctoring a test is not a passive activity!

8 Forensic Tools and Methods
Similarity: answer copying, collusion
Erasures: tampering
Gains: pre-knowledge, coaching
Aberrance: tampering, exposure
Identical tests: collusion
Perfect tests: answer key loss

9 Similarity Our most powerful & “credible” statistic
Measures the degree of similarity between 2 or more test instances
Each test instance is analyzed against all other test instances in the administration
Probable causes of extremely high similarity:
Answer copying
Test coaching
Proxy test taking
Collusion
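Caveon's actual similarity statistic is proprietary and not described in these slides. As a hedged illustration only, the sketch below estimates how improbable a given number of identical answers would be if two test takers had worked independently, using a simple binomial tail with a flat per-item match probability (a simplifying assumption; a real model would estimate this per item from response data).

```python
from math import comb

def match_tail_probability(n_items: int, n_matches: int, p_match: float) -> float:
    """P(at least n_matches identical answers across n_items) when two
    test takers answer independently and each item matches by chance
    with probability p_match (assumed flat here for simplicity)."""
    return sum(
        comb(n_items, k) * p_match**k * (1 - p_match) ** (n_items - k)
        for k in range(n_matches, n_items + 1)
    )

# Hypothetical pair: 58 of 60 answers identical, with a 35% chance of
# agreeing by coincidence on any single item.
p = match_tail_probability(60, 58, 0.35)
print(f"{p:.2e}")  # roughly 3e-24, far beyond a 1-in-a-trillion bar
```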

10 Erasures Based on estimated answer changing rates from:
Wrong-to-Right (WtR)
Anything-to-Wrong
Find answer sheets with an unusual number of WtR changes
Extreme statistical outliers could involve tampering, "panic cheating", etc.
Large numbers of wrong-to-right changes could indicate tampering or copying
Changes from anything to wrong are counter-evidence
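The slides do not specify the outlier test used. As one hedged illustration, a z-score of each answer sheet's wrong-to-right count against its group is a common way to surface extreme WtR outliers; the 2.5-standard-deviation cutoff below is an assumption for the example, not FDOE's actual threshold.

```python
from statistics import mean, stdev

def flag_wtr_outliers(wtr_counts: list[int], z_cutoff: float = 2.5) -> list[int]:
    """Return indices of answer sheets whose wrong-to-right erasure
    count is an extreme high outlier relative to the group."""
    mu, sigma = mean(wtr_counts), stdev(wtr_counts)
    return [
        i for i, count in enumerate(wtr_counts)
        if sigma > 0 and (count - mu) / sigma > z_cutoff
    ]

# Hypothetical classroom: most sheets show 0-3 WtR erasures; one shows 17.
print(flag_wtr_outliers([1, 0, 2, 3, 1, 0, 2, 17, 1, 2]))  # -> [7]
```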

11 Unusual Gains/Losses Predict score using prior year information
Measure large score increases/decreases against the predicted score
Which score truly reflects the student's actual ability or competence?
Extreme gains/losses may result from:
Pre-knowledge
Coaching
Genuine student development (e.g., improved visual acuity)
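The slides do not say how the predicted score is produced. A minimal sketch, assuming an ordinary least-squares fit of current scores on prior-year scores, with students flagged when their standardized residual exceeds an illustrative cutoff:

```python
def flag_unusual_gains(prior: list[float], current: list[float],
                       residual_cutoff: float = 2.5) -> list[int]:
    """Fit current ~ a + b * prior by least squares, then flag students
    whose residual (gain or loss versus prediction) is extreme. The
    model form and cutoff are assumptions for illustration."""
    n = len(prior)
    mx, my = sum(prior) / n, sum(current) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(prior, current))
         / sum((x - mx) ** 2 for x in prior))
    a = my - b * mx
    residuals = [y - (a + b * x) for x, y in zip(prior, current)]
    s = (sum(r * r for r in residuals) / (n - 2)) ** 0.5  # residual SE
    return [i for i, r in enumerate(residuals) if abs(r) / s > residual_cutoff]

# Hypothetical scores: most students track last year's trend, but
# student 7 gains far more than the model predicts.
prior   = [300, 305, 310, 315, 320, 325, 330, 335, 340, 345]
current = [302, 304, 312, 317, 319, 327, 331, 420, 341, 346]
print(flag_unusual_gains(prior, current))  # -> [7]
```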

12 Spring 2011 Data Forensics Focus on two groups
Student-level
School-level
Utilize VERY conservative thresholds

13 A quick discussion of conservative thresholds…
Chance of being hit by lightning = 1 in a million
Chance of winning the lottery = 1 in 10 million
Chance of a DNA false positive = 1 in 30 million
Chance that flagged tests were taken independently = 1 in a TRILLION
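To put these numbers side by side, a short sketch using only the probabilities quoted above shows how much stricter the flagging criterion is than the everyday odds it is compared against:

```python
# Odds quoted on the slide, expressed as probabilities.
odds = {
    "struck by lightning": 1 / 1_000_000,
    "winning the lottery": 1 / 10_000_000,
    "DNA false positive": 1 / 30_000_000,
    "flagged tests taken independently": 1 / 1_000_000_000_000,
}

threshold = odds["flagged tests taken independently"]
for event, p in odds.items():
    print(f"{event}: {p:.1e} ({p / threshold:,.0f}x the flagging threshold)")
```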

14 Student-Level Analysis
Similarity analysis only (our most credible statistic)
Chance of tests being so similar yet taken independently = 1 in a trillion
Invalidate test scores beyond the 1-in-10^12 threshold
Fairness and validity of the test instance must be questioned
An appeals process will be implemented

15 Example: 2010 9th Grade Cluster
Identifies apparent student collusion
Definitions:
"Dominant" = the answer selected by the majority of group members
"non-Dominant" = an answer different from the one the majority selected
Example: 2 students who passed, but not independently (i.e., they didn't do their own work)
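The slide's cluster table itself is not reproduced in this transcript. As a hedged sketch of the dominant/non-dominant idea, the code below finds the group's modal (dominant) answer on each item and counts how many of each student's responses fall outside it; the names and answer strings are hypothetical.

```python
from collections import Counter

def dominant_answers(responses: dict[str, str]) -> str:
    """Modal answer per item across the group (the 'dominant' answers)."""
    items = zip(*responses.values())  # one column of answers per item
    return "".join(Counter(col).most_common(1)[0][0] for col in items)

def non_dominant_counts(responses: dict[str, str]) -> dict[str, int]:
    """How many of each student's answers differ from the dominant one."""
    dom = dominant_answers(responses)
    return {
        name: sum(a != d for a, d in zip(answers, dom))
        for name, answers in responses.items()
    }

# Hypothetical group: students 1 and 2 share the same two departures
# from the dominant answers, at the same items -- the collusion signal.
group = {
    "student 1": "ACBDBCBBDA",
    "student 2": "ACBDBCBBDA",
    "student 3": "ACBDACBADA",
    "student 4": "ACBDACCADA",
    "student 5": "ACBDACBADC",
}
print(non_dominant_counts(group))
# -> {'student 1': 2, 'student 2': 2, 'student 3': 0,
#     'student 4': 1, 'student 5': 1}
```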

16 (image-only slide; no transcript text captured)

17 Impact of “1 in a Trillion” Threshold, Math & Reading 2010
Grade    N           Flagged Tests
3rd      408,317     144
4th      394,039     103
5th      390,714      92
6th      387,502     224
7th      393,401     245
8th      387,190      69
9th      401,046     622
10th     360,176      57
Totals   3,122,385   1,556
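Using only the slide's own figures, a short calculation shows how small a fraction of tests the conservative threshold flags (roughly 0.05% overall, with 9th grade the clear outlier):

```python
# (grade, tests administered, tests flagged) from the table above
table = [
    ("3rd", 408_317, 144), ("4th", 394_039, 103), ("5th", 390_714, 92),
    ("6th", 387_502, 224), ("7th", 393_401, 245), ("8th", 387_190, 69),
    ("9th", 401_046, 622), ("10th", 360_176, 57),
]
for grade, n, flagged in table:
    print(f"{grade}: {flagged / n:.4%} flagged")

total_n = sum(n for _, n, _ in table)        # 3,122,385
total_flagged = sum(f for _, _, f in table)  # 1,556
print(f"Overall: {total_flagged / total_n:.4%}")  # ~0.0498%
```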

18 School-Level Analysis
Similarity, gains, and erasures
Flagged schools conduct a local review
Extreme instances may prompt formal investigations and sanctions
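The slides do not detail how school-level flags combine the three statistics. One hedged sketch, assuming each student-level flag carries a school ID and the statistic that produced it, simply tallies flags per school and per statistic so that outlier schools can be queued for local review:

```python
from collections import Counter, defaultdict

def school_flag_summary(flags: list[tuple[str, str]]) -> dict[str, Counter]:
    """flags: (school_id, statistic) pairs, e.g., ('0231', 'erasures').
    Returns per-school flag counts broken out by statistic."""
    summary: dict[str, Counter] = defaultdict(Counter)
    for school, statistic in flags:
        summary[school][statistic] += 1
    return dict(summary)

# Hypothetical flags drawn from the three school-level statistics.
flags = [
    ("0231", "similarity"), ("0231", "erasures"), ("0231", "erasures"),
    ("0498", "gains"), ("1775", "similarity"),
]
for school, counts in school_flag_summary(flags).items():
    print(school, dict(counts))
```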

19 (image-only slide; no transcript text captured)

20 Benefits of Conservative Thresholds
Focus on most egregious instances
Provides results that are:
  Explainable
  Defensible
Can move later to different thresholds
Easier to manage
"Walk before we run"

21 Program Results Monitored behavior improves
Invalidations deter cheating

22 Summary Goal: Fair and valid testing for all students
FDOE will conduct Data Forensics on FCAT/FCAT 2.0/EOC test data
Focus on:
Individual students: extremely similar tests
Schools: similarity, gains, and erasures

