1 Data Linkage Strategies Shihfen Tu, Ph.D. University of Maine

Presentation transcript:

1 Data Linkage Strategies Shihfen Tu, Ph.D. University of Maine

2 Faculty Disclosure Information In the past 12 months, I have not had a significant financial interest or other relationship with the manufacturer(s) of the product(s) or provider(s) of the service(s) that will be discussed in my presentation. This presentation will not include discussion of pharmaceuticals or devices that have not been approved by the FDA.

3 Acknowledgements University of Maine –Quansheng Song –Cecilia Cobo-Lewis Maine Bureau of Health –Kim Church –Pat Day –Ellie Mulcahy –Toni Wall

4

5 Data Linkage

6

7

8 Data Linkage - Probabilistic

9

10 Data Linkage - Probabilistic

11 Data Linkage - Probabilistic

12 Data Linkage - Inconsistency

13 Data Linkage - Inconsistency Message: Inconsistency Detected. Correcting…

14 Inconsistencies A record in EHDI links to two records in the other database. The other source indicates the records belong to different people. How to address this depends on how the other database was processed. Example: EHDI_ID=394 Brad A. Graham; ID=4484 Brad A. Graham; ID=7354 Brad Graham
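
A minimal sketch of how this kind of inconsistency could be flagged, assuming links are stored as (EHDI_ID, other-database ID) pairs. The function name `find_multi_links` and the pair layout are illustrative assumptions, not something shown in the presentation; the IDs are the ones from the slide.

```python
from collections import defaultdict

def find_multi_links(links):
    """Return EHDI IDs that link to more than one ID in the other database.

    `links` is an iterable of (ehdi_id, other_id) pairs (assumed layout).
    """
    targets = defaultdict(set)
    for ehdi_id, other_id in links:
        targets[ehdi_id].add(other_id)
    # Keep only EHDI records whose links point at several distinct IDs.
    return {e: sorted(ids) for e, ids in targets.items() if len(ids) > 1}

# Links from the slide: EHDI_ID=394 matches both ID=4484 and ID=7354.
links = [(394, 4484), (394, 7354)]
conflicts = find_multi_links(links)
print(conflicts)  # {394: [4484, 7354]}
```

Flagging is only the first step; how each conflict is resolved still depends on whether the other source was de-duplicated.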

15 Inconsistencies Other source not de-duplicated? Other source de-duplicated, but with insufficient evidence to conclude that ID=4484 and ID=7354 are the same person? –BD may provide additional information so that these probabilities have changed. Example: ID=4484 Brad A. Graham; ID=7354 Brad Graham; EHDI_ID=394 Brad A. Graham

16 Inconsistencies Records shown: EHDI_ID=394 John A. Graham; ID=4048 John A. Graham; ID=4048 Jon A. Graham; EHDI_ID=948 Jon A. Graham; ID=9324 Jon Graham; EHDI_ID=948 Jon Graham

17 Inconsistencies How this "cross-over" is resolved depends on whether one or neither file is given precedence. It is also influenced by the probabilistic de-duplication process performed after a linkage.

18 Linkage Creep EHDI Database contributes an individual, Catherine A. Sampson

19 Linkage Creep Link the Electronic Birth Certificate –Name is Catherine A. Simpson –Are these the same person? –Perform probabilistic match. Require .85 probability of a match to conclude two similar records are the same (Critical p = .85). Probability is .90, so we conclude they're the same person

20 Linkage Creep Link Birth Defects Registry Data –Name is Kathy A. Simpson –Are these the same person? –Perform probabilistic match (require .85). P Match is .90, so we conclude they're the same person

21 Linkage Creep If we compare to Catherine A. Sampson –P Match = .81 –Conclude they are NOT the same individual –Would not assign same ID. Which is correct?
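
The three-record example above can be sketched in code. This is an illustration only: the lookup table simply hard-codes the probabilities quoted on the slides (assuming the .90 for Kathy A. Simpson was computed against the birth certificate record), and the two rule names, `joins_if_any` and `joins_if_all`, are hypothetical labels for "allow creep" and "forbid creep", not terms from the presentation.

```python
CRITICAL_P = 0.85  # critical probability from the slides

# Pairwise match probabilities quoted on the slides (order-independent keys).
P_MATCH = {
    frozenset({"Catherine A. Sampson", "Catherine A. Simpson"}): 0.90,
    frozenset({"Catherine A. Simpson", "Kathy A. Simpson"}): 0.90,
    frozenset({"Catherine A. Sampson", "Kathy A. Simpson"}): 0.81,
}

def p_match(a, b):
    """Look up the match probability for a pair of records."""
    return P_MATCH[frozenset({a, b})]

def joins_if_any(cluster, new):
    """Chained rule: link if the new record matches ANY record under the ID."""
    return any(p_match(new, r) >= CRITICAL_P for r in cluster)

def joins_if_all(cluster, new):
    """Strict rule: link only if the new record matches EVERY record under the ID."""
    return all(p_match(new, r) >= CRITICAL_P for r in cluster)

cluster = ["Catherine A. Sampson"]                    # EHDI record
assert joins_if_any(cluster, "Catherine A. Simpson")  # .90 >= .85
cluster.append("Catherine A. Simpson")                # birth certificate record

new = "Kathy A. Simpson"                              # birth defects registry record
print(joins_if_any(cluster, new))  # True: matches the Simpson record at .90
print(joins_if_all(cluster, new))  # False: only .81 against the Sampson record
```

Under the chained rule all three records share one ID even though the first and last fall below the critical probability when compared directly; that gap is the creep.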

22 Linkage Creep When is this a problem? –Over time, two distinct individuals may project tendrils composed of combinations of identifiers that statistically overlap in probabilistic space

23 Linkage Creep When is this a problem? –Linkage creep will result in the two distinct individuals being erroneously combined under a single ID

24 Linkage Creep When is this not a problem? –Over time, certain key identifiers for an individual are expected to change –This phenomenon will increase as a historical database grows, and as additional sources are input into a centralized system

25 Linkage Creep Complexity of creep in longitudinal datasets –Black records are related to all records –Yellow and Blue records are NOT related to White record –Yellow record is also not related to Red record at

26 Linkage Creep Forbidding creep will result in a single individual being divided into two IDs over time. Further challenge: where to divide records into additional IDs?

27 Tools for Evaluating Linkage Inconsistencies can occur in deterministic linkage, but are more common in probabilistic linkages. The probabilities that create the potential for problems also provide a valuable tool for evaluating linkages –Instead of a Yes/No answer to "are two records the same person?" –Estimates or indices of how likely it is that two records are the same person. Should be able to estimate the number of erroneous linkages. Possible to conduct a detailed examination of quality by ignoring very strong and very weak pairings and focusing only on pairings that are ambiguous
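
A rough sketch of both ideas, under stated assumptions. The presentation does not give a formula; the estimate below treats each match probability as well calibrated, so the expected number of wrong links is the sum of (1 - p) over the accepted pairs. The record labels, probabilities, and the .60 to .95 review band are made-up illustrative values, and the function names are hypothetical.

```python
CRITICAL_P = 0.85  # critical probability used earlier in the slides

def expected_errors(accepted_probs):
    """Expected number of erroneous links, assuming calibrated probabilities."""
    return sum(1.0 - p for p in accepted_probs)

def ambiguous(pairs, low, high):
    """Pairs whose match probability falls inside the review band [low, high]."""
    return [(a, b, p) for a, b, p in pairs if low <= p <= high]

# Hypothetical candidate pairs: (record in file A, record in file B, P Match).
pairs = [
    ("A1", "B1", 0.99),  # very strong pairing, safe to skip in review
    ("A2", "B2", 0.90),
    ("A3", "B3", 0.81),
    ("A4", "B4", 0.05),  # very weak pairing, safe to skip in review
]

accepted = [p for _, _, p in pairs if p >= CRITICAL_P]
print(round(expected_errors(accepted), 2))  # 0.11 expected wrong links among 2 accepted
print(ambiguous(pairs, 0.60, 0.95))         # the two pairs worth a manual look
```

The review band is a tuning choice: widening it catches more doubtful pairs at the cost of more manual checking.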