Misinterpretation of data, the importance of metadata and STC math. DLI Atlantic Training.

1 Misinterpretation of data, the importance of metadata and STC math
DLI Atlantic Training
April 2005

2 Data Misinterpretation: Crime Rates
Ebert & Roeper reviewed Michael Wilson's movie “Michael Moore Hates America”
Ebert doubted the claim that the Canadian crime rate is 2X the USA rate
Moorelies.com | News: Whoa; Stuart Didn't See That One Coming
Ebert conceded that the statistics supported the claim: the figures were right
BUT a comparison of the STC and US Bureau of Justice websites shows how the statistics were misinterpreted

Crimes per 100,000 population, 2003
                   Canada     USA
All crimes          8,530   4,267
Violent crimes        958     523
Property crimes     4,275   3,744

3 Comparative Crime Rates
Simplistic comparison
– Similar category titles for violent and property crimes, but different definitions
– Violent crime 2-3 times higher in the US; property crime rates close
– Bureau of Justice Statistics Crime & Justice Data Online
– Canadian Statistics - Crimes by type of offence

Crimes per 100,000 population, 2002
                                        Canada     USA
Violent crime
  Homicide                                 1.9     5.6
  Robbery                                   85     146
  (US rape and aggravated assault are difficult to compare with Cdn sexual assault and assaults)
Property crime
  B & E (Cdn) / Burglary (US)              879     746
  Theft (Cdn) / Larceny & Theft (US)     2,191   2,446
  Motor vehicle theft                      516     432
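The two comparisons can be checked with a little arithmetic. The following is a minimal Python sketch that uses only the 2002 and 2003 rates quoted on these slides; the variable and function names are illustrative and do not come from the presentation.

```python
# Rates per 100,000 population, as quoted on the slides: (Canada, USA).
ALL_CRIMES_2003 = (8530, 4267)

COMPARABLE_2002 = {
    "homicide": (1.9, 5.6),
    "robbery": (85, 146),
    "break and enter / burglary": (879, 746),
    "theft / larceny and theft": (2191, 2446),
    "motor vehicle theft": (516, 432),
}


def us_to_canada_ratio(canada_rate, us_rate):
    """Return the US rate expressed as a multiple of the Canadian rate."""
    return us_rate / canada_rate


# Headline comparison: the totals suggest Canada has about twice the US rate.
canada_total, us_total = ALL_CRIMES_2003
print(f"all crimes, Canada/USA: {canada_total / us_total:.2f}")

# Category by category, where the definitions are roughly comparable.
for offence, (canada_rate, us_rate) in COMPARABLE_2002.items():
    ratio = us_to_canada_ratio(canada_rate, us_rate)
    print(f"{offence}: USA/Canada = {ratio:.2f}")
```

The headline totals point one way while homicide and robbery point the other, which is the signal that the "all crimes" categories are not defined the same way in the two sources.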

4 US Crime Data

5 Canadian Crime Data

6 Data Misinterpretation: Drinking Habits of Canadians
Initial analysis of the 1990 Health Promotion Survey indicated that Canadians enjoyed an average of 60 drinks per day….

7 Data Misinterpretation: Importance of Metadata
The 1990 Health Promotion Survey included a series of questions about alcohol consumption. First it asked if the respondent EVER drank alcohol; if YES, it asked if they drank within the last 12 months; and if YES again, it asked for the number of drinks on each day of the past 7 days.
The code book showed number of drinks per day as:

81 F4MON 2 0096-0097 HOW MANY DRINKS DID YOU HAVE ON: MONDAY
                                      Raw   Weighted
   00     NONE                       4651    7334907
   01:40  NUMBER OF DRINKS            403    2585080
   41     MORE THAN 40 DRINKS           1        106
   98     QUESTION NOT ASKED         7648    0567910
   99     NOT STATED                   89     155377

82 F4TUE 2 0098-0099 HOW MANY DRINKS DID YOU HAVE ON: TUESDAY
                                      Raw   Weighted
   00     NONE                       4608    7306101
   01:40  NUMBER OF DRINKS           1447    2613991
   98     QUESTION NOT ASKED         7648   10567910
   99     NOT STATED                   89     155377
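A small sketch shows how an implausible average can arise when the code book is ignored. It assumes, hypothetically, that the analyst averaged the raw F4MON values without excluding code 98 (question not asked) and code 99 (not stated). The ten sample responses are invented for illustration and are not survey data.

```python
from statistics import mean

# Codes from the code book that are not counts of drinks.
NOT_ASKED = 98
NOT_STATED = 99
MISSING_CODES = {NOT_ASKED, NOT_STATED}

# Invented F4MON values for ten respondents (illustration only).
f4mon = [0, 2, 98, 98, 1, 99, 0, 98, 3, 98]


def naive_mean(values):
    """Average every value, treating the special codes as real counts."""
    return mean(values)


def valid_mean(values, missing=MISSING_CODES):
    """Average only the values that are actual numbers of drinks.

    Code 41 ("more than 40 drinks") is kept as 41 here, so the result
    is a lower bound whenever that top code appears.
    """
    valid = [v for v in values if v not in missing]
    return mean(valid)


print(naive_mean(f4mon))  # inflated by the 98 and 99 codes
print(valid_mean(f4mon))  # based on the five real answers only
```

Because most respondents in the code book fall under code 98, even a modest share of skipped questions is enough to push a naive average far above any plausible number of drinks.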

8 Metadata for PUMFs
With Public Use Microdata Files, the code book is very important
– Gives the questions asked and the codes used for responses
– “Missing values”, “refusals”, “don’t know” and “not applicable” are often assigned numeric codes
– The numeric codes used are not consistent
– To most software, these numeric codes look like valid responses

9 Metadata
STC Policy on Informing Users of Data Quality
– In place since 1978
– Tightened up in 2000 in response to the 1999 AG report
– Recognition that “All statistics are to some extent estimates”
– Statistics to be used with awareness of strengths and weaknesses: “fitness for use”
– Key tool is the Integrated Meta Database (Definitions, data sources and methods)

10 Metadata
Important to find STC metadata and use it
Definitions, Data Sources and Methods
– Questionnaire and reporting guides
– Survey Description
– Data sources and methodology
– Data Accuracy
– Documentation
– Contact us

11 Definitions, Data Sources and Methods

12 Online Catalogue
Canadian Community Health Survey: public use microdata file: Product main page

13 DLI Website
DLI - Canadian Community Health Survey Cycle 1.1
DLI listserv: Ask and we will find out from the Division!

14 Data Quality Symbols

15 Use metadata to avoid key pitfalls
Collection methodology
Questionnaire
Data quality: sample size, response rates
Definitions
Conceptual changes
Survey coverage
Reweighting/rebasing

16 STC Math
Random rounding
Percentages and percentage points
Central tendencies (mean, median and mode)
Current vs constant dollars
Raw vs seasonally adjusted
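The first three of these ideas are easy to illustrate with a few lines of Python. This is a sketch under stated assumptions: the random rounding function uses a common unbiased scheme with base 5 (round up with probability remainder/5), which may differ in detail from STC's actual procedure, and all of the numbers are invented for illustration.

```python
import random
from statistics import mean, median, mode


def random_round(count, base=5, rng=random):
    """Randomly round a count up or down to a multiple of `base`.

    Assumed scheme: round up with probability remainder/base, so the
    expected value of the rounded count equals the original count.
    """
    remainder = count % base
    if remainder == 0:
        return count
    lower = count - remainder
    if rng.random() < remainder / base:
        return lower + base
    return lower


def percent_change(old, new):
    """Relative change, expressed in percent of the old value."""
    return (new - old) / old * 100


def percentage_point_change(old_rate, new_rate):
    """Absolute difference between two rates that are already percentages."""
    return new_rate - old_rate


# A rate that moves from 5% to 7.5% rises by 2.5 percentage points,
# which is a 50 percent increase.
print(percentage_point_change(5.0, 7.5), percent_change(5.0, 7.5))

# Invented incomes: one large value pulls the mean well above the median.
incomes = [20_000, 20_000, 30_000, 40_000, 190_000]
print(mean(incomes), median(incomes), mode(incomes))

# Random rounding: each result is a multiple of 5 adjacent to the original.
rng = random.Random(0)
print([random_round(n, rng=rng) for n in (12, 13, 15, 1)])
```

One practical consequence of random rounding is that independently rounded cells need not add up to their independently rounded total, so small discrepancies in published tables are expected.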

