Statistics: Unlocking the Power of Data Lock 5 STAT 250 Dr. Kari Lock Morgan Describing Data: Categorical Variables SECTIONS 2.1 One categorical variable.

Presentation transcript:

Statistics: Unlocking the Power of Data Lock 5 STAT 250 Dr. Kari Lock Morgan Describing Data: Categorical Variables SECTIONS 2.1 One categorical variable Two categorical variables

Question of the Day
Is cat ownership related to schizophrenia?

Toxoplasmosis
Toxoplasmosis is a disease caused by the protozoan parasite Toxoplasma gondii (“toxo”), one of the world’s most common parasites.
Humans can get toxoplasmosis in several ways:
1. Food (undercooked and contaminated meat)
2. Animals (contact with infected cat feces)
3. Mother-to-child transmission (placenta)
4. Blood transfusion/organ transplantation (rare)
Toxoplasmosis may cause flu-like symptoms and is dangerous for people with compromised immune systems, but most infected people are (physically) asymptomatic.

Data
How prevalent is toxoplasmosis?
In data from NHANES, people aged 6 to 49 were tested for toxoplasmosis. Of these, 605 people were infected and 3629 were not.
This is one categorical variable.
Jones, J.L., Kruszon-Moran, D., Wilson, M. (2007). Toxoplasma gondii Prevalence, United States. Emerging Infectious Diseases, 13(4).

Frequency Table
A frequency table shows the number of cases that fall in each category:

Infected    Not Infected    TOTAL
605         3629            4234

For one categorical variable, this summarizes all the information in the data.
Minitab: Stat -> Tables -> Tally Individual Variables -> Counts
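As an alternative to the Minitab menu path, here is a minimal Python sketch of tallying a frequency table. The raw list `infection_status` is rebuilt from the two counts reported on the Data slide; it is an illustration, not the original NHANES file.

```python
from collections import Counter

# Rebuild one entry per person tested, using the counts from the Data slide.
infection_status = ["Infected"] * 605 + ["Not Infected"] * 3629

# A frequency table is the count of cases in each category.
frequency_table = Counter(infection_status)
total = sum(frequency_table.values())

for category, count in frequency_table.items():
    print(f"{category:<14}{count:>6}")
print(f"{'TOTAL':<14}{total:>6}")
```

The total is the sample size, which is the denominator for every proportion computed from this variable.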

Bar Chart/Plot/Graph
In a bar chart, the height of each bar is the number of cases falling in that category.
Minitab: Graph -> Bar chart
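To show how counts turn into bar sizes, the sketch below draws a rough text bar chart for the toxoplasmosis counts. The helper `text_bar_chart` and the scale of one `#` per 100 cases are choices made for this illustration; the bars run sideways here, so bar length stands in for bar height.

```python
counts = {"Infected": 605, "Not Infected": 3629}

def text_bar_chart(counts, cases_per_mark=100):
    """Return one line per category; bar length is proportional to the count."""
    lines = []
    for category, count in counts.items():
        bar = "#" * round(count / cases_per_mark)
        lines.append(f"{category:<14}| {bar} {count}")
    return lines

for line in text_bar_chart(counts):
    print(line)
```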

Histogram vs Bar Chart
- A bar chart is for categorical data, and the x-axis has no numeric scale.
- A histogram is for quantitative data, and the x-axis is numeric.
- For a categorical variable, the number of bars equals the number of categories, and the number in each category is fixed.
- For a quantitative variable, the number of bars in a histogram is up to you (or your software), and the appearance can differ with different numbers of bars.

Proportion
The proportion in a category is the number of cases in that category divided by the sample size.

Proportion

Infected    Not Infected    TOTAL
605         3629            4234
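Applying the definition of a proportion to the counts from the Data slide gives the worked calculation below. The symbol \hat{p} for a sample proportion is notation assumed here; it does not appear in the transcript.

```latex
\[
\hat{p}_{\text{infected}} = \frac{605}{605 + 3629} = \frac{605}{4234} \approx 0.143,
\qquad
\hat{p}_{\text{not infected}} = \frac{3629}{4234} \approx 0.857
\]
```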

Pie Chart
In a pie chart, the relative area of each slice of the pie corresponds to the proportion in that category.
Minitab: Graph -> Pie Chart

Relative Frequency Table
A relative frequency table shows the proportion of cases that fall in each category.
All the numbers in a relative frequency table sum to 1.

Infected    Not Infected
0.143       0.857

Minitab: Stat -> Tables -> Tally Individual Variables -> Percents
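A short Python sketch of the same idea, assuming the counts from the Data slide: each count is divided by the sample size, and the resulting proportions are checked to sum to 1.

```python
counts = {"Infected": 605, "Not Infected": 3629}
sample_size = sum(counts.values())

# Each relative frequency is the category count divided by the sample size.
relative_frequencies = {
    category: count / sample_size for category, count in counts.items()
}

for category, proportion in relative_frequencies.items():
    print(f"{category:<14}{proportion:.3f}")

# Every case falls in exactly one category, so the proportions add up to 1.
print(f"Sum: {sum(relative_frequencies.values()):.3f}")
```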

Toxoplasmosis
In the United States, the CDC estimates that 22.5% of the population 12 years and older have been infected with Toxoplasma. (CDC)
Why does this differ from the proportion infected in the NHANES sample (605/4234)?
- Different ages (12 and older, rather than 6 to 49)
- Different sample, different years
- Random chance
In other places in the world, prevalence is as high as 95% in some populations.

Summary: One Categorical Variable
Summary Statistics
- Frequency table
- Relative frequency table
- Proportion
Visualization
- Bar chart
- Pie chart

Mind Controlling Parasite?
- Normal rats are terrified of cat pee; toxo-infected rats are drawn to the smell of it
- Toxo-infected rats are more active, and “less wary of predators in exposed spaces”
- Toxo-infected humans like cats more
- Toxoplasmosis is linked to delayed reaction times, and infected people are 2.5 times more likely to get in a car accident
- Toxoplasmosis is linked to schizophrenia
Lots more: see this article or google the topic.

Question of the Day
Is cat ownership related to schizophrenia?

Case-Control Study
A case-control study is an observational study in which cases are matched with controls:
- Cases are people with a specific disease or trait
- Controls are people similar to the cases who do not have the disease or trait
Case-control studies are useful for studying/identifying risk factors for rare diseases.
Can a case-control study be used to make conclusions about causality?
a) Yes
b) No

Cat Ownership and Schizophrenia
Multiple case-control studies have been conducted to study the association between cat ownership and schizophrenia.
Cases were randomly selected from NAMI (National Alliance for the Mentally Ill), almost all of whom had schizophrenia.
Controls (from families without mental illness) were chosen as follows:
- 1992 data: a family friend
- 1997 data: matched for age, sex, and socioeconomic status
- 1982 data (analyzed 2015): similar families from a survey conducted by the American Veterinary Medical Association
Torrey, E.F., Simmons, W., Yolken, R.H. (2015). Is childhood cat ownership a risk factor for schizophrenia later in life? Schizophrenia Research, June 2015, 165(1):1-2.

Two Categorical Variables
Each of these studies recorded (among many others) two different variables:
1. Case or control
2. Whether or not the person had a cat in their house during childhood (birth to age 10 or 13)
Is there a relationship between these two variables?

Side-by-Side Bar Chart (1992)
Minitab: Graph -> Barchart -> Cluster

Side-by-Side Bar Chart (1997)

Side-by-Side Bar Chart (1982)

Segmented/Stacked Bar Chart (1982)
Minitab: Graph -> Barchart -> Stack

Two-Way Table
It doesn’t matter which variable is displayed in the rows and which in the columns.

1992 Data                   Case    Control    Total
Cat in house as child         84         65      149
No cat in house as child      81        100      181
Total                        165        165      330

Minitab: Stat -> Tables -> Tally Individual Variables -> Counts

Dataset Two-Way Table

1992      Case    Control    Total
Cat         84         65      149
No cat      81        100      181
Total      165        165      330
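Going from a case-by-case dataset to a two-way table amounts to counting each combination of the two variables. The sketch below is a minimal illustration in Python; the `dataset` list is synthetic, with cell counts assumed from the 1992 answer choices on the slides that follow (84 of 165 cases and 65 of 165 controls had a cat).

```python
from collections import Counter

# One (cat, group) pair per person; counts assumed from the 1992 answer choices.
dataset = (
    [("Cat", "Case")] * 84
    + [("Cat", "Control")] * 65
    + [("No cat", "Case")] * 81
    + [("No cat", "Control")] * 100
)

# Tally every combination of the two categorical variables.
cell_counts = Counter(dataset)
rows = ["Cat", "No cat"]
columns = ["Case", "Control"]

print(f"{'':<8}{'Case':>8}{'Control':>9}{'Total':>7}")
for row in rows:
    cells = [cell_counts[(row, col)] for col in columns]
    print(f"{row:<8}{cells[0]:>8}{cells[1]:>9}{sum(cells):>7}")

column_totals = [sum(cell_counts[(row, col)] for row in rows) for col in columns]
print(f"{'Total':<8}{column_totals[0]:>8}{column_totals[1]:>9}{sum(column_totals):>7}")
```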

Cats and Schizophrenia
What proportion of people in this sample had a cat in the house during their childhood?
a) 84/165 = 51%
b) 65/165 = 39%
c) 149/330 = 45%
d) 84/149 = 56%

1992 Data                   Case    Control    Total
Cat in house as child         84         65      149
No cat in house as child      81        100      181
Total                        165        165      330

Cats and Schizophrenia
What proportion of cases had a cat in the house during their childhood?
a) 84/165 = 51%
b) 65/165 = 39%
c) 149/330 = 45%
d) 84/149 = 56%

1992 Data                   Case    Control    Total
Cat in house as child         84         65      149
No cat in house as child      81        100      181
Total                        165        165      330

Cats and Schizophrenia
What proportion of controls had a cat in the house during their childhood?
a) 84/165 = 51%
b) 65/165 = 39%
c) 149/330 = 45%
d) 65/149 = 44%

1992 Data                   Case    Control    Total
Cat in house as child         84         65      149
No cat in house as child      81        100      181
Total                        165        165      330
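The three questions differ only in which total goes in the denominator. A small Python sketch, assuming the 1992 counts implied by the answer choices (84 of 165 cases and 65 of 165 controls had a cat in the house as a child):

```python
# 1992 counts of people with a cat in the house as a child (from the answer choices).
cat_cases, total_cases = 84, 165
cat_controls, total_controls = 65, 165

# Whole sample: cat owners out of everyone.
overall = (cat_cases + cat_controls) / (total_cases + total_controls)
# Within one group: cat owners out of that group's column total only.
among_cases = cat_cases / total_cases
among_controls = cat_controls / total_controls

print(f"Whole sample: {overall:.2f}")
print(f"Cases:        {among_cases:.2f}")
print(f"Controls:     {among_controls:.2f}")
```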

Difference in Proportions
A difference in proportions is the difference in proportions for one categorical variable calculated for different levels of another categorical variable.
Example: proportion of cases who had a cat in the house as children - proportion of controls who had a cat in the house as children
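Written out for the 1992 data, and assuming the counts from the earlier answer choices (84 of 165 cases and 65 of 165 controls), the example works out as follows. The \hat{p} notation is an assumption of this sketch.

```latex
\[
\hat{p}_{\text{case}} - \hat{p}_{\text{control}}
= \frac{84}{165} - \frac{65}{165}
= \frac{19}{165}
\approx 0.115
\]
```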

Difference in Proportions

1997 Data                   Case    Control    Total
Cat in house as child        136        220      356
No cat in house as child     126        302      428
Total                        262        522      784

What is the difference in proportions (cases - controls) of having a cat in the house as a child?
a) 136/126 - 220/302 = 0.35
b) 136/356 - 126/428 = 0.09
c) 136/262 - 220/522 = 0.10
d) 136/784 - 220/784 = -0.11

Creating a Two-Way Table
In the 2015 study (1982 data), 1075 out of 2125 cases owned cats as children, and 2065 out of 4847 controls owned cats as children.
1. Create the two-way table.
2. Calculate the difference in proportions.
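One way to work through both steps is sketched below in Python. The helper `two_way_table` and its nested-dictionary layout are hypothetical choices for this illustration; only the four input numbers come from the slide.

```python
def two_way_table(cat_cases, n_cases, cat_controls, n_controls):
    """Build a two-way table (nested dict) from 'cat owners out of group size' counts."""
    table = {
        "Cat": {"Case": cat_cases, "Control": cat_controls},
        "No cat": {"Case": n_cases - cat_cases, "Control": n_controls - cat_controls},
    }
    # Row totals.
    for row in table.values():
        row["Total"] = row["Case"] + row["Control"]
    # Column totals, including the grand total.
    table["Total"] = {
        col: table["Cat"][col] + table["No cat"][col]
        for col in ("Case", "Control", "Total")
    }
    return table

table_1982 = two_way_table(1075, 2125, 2065, 4847)
for label, row in table_1982.items():
    print(f"{label:<8}{row['Case']:>6}{row['Control']:>9}{row['Total']:>7}")

# Step 2: proportion of cases with a cat minus proportion of controls with a cat.
difference = (
    table_1982["Cat"]["Case"] / table_1982["Total"]["Case"]
    - table_1982["Cat"]["Control"] / table_1982["Total"]["Control"]
)
print(f"Difference in proportions (cases - controls): {difference:.3f}")
```

The "No cat" cells are not given on the slide; they follow by subtracting the cat owners from each group size.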

Odds
If p denotes the proportion, the odds are defined as
$\text{odds} = \dfrac{p}{1 - p}$
Interpreting odds:
- Odds of 1 indicate 50/50
- p < 0.5 yields odds < 1
- p > 0.5 yields odds > 1
Odds of 3, or 3:1, mean that out of 4 times, we would expect the variable to be in that category 3 times and out of that category 1 time.
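The interpretation rules can be checked with a few lines of Python. This is a minimal sketch that assumes the standard definition of odds, p / (1 - p); the function name `odds` is chosen for this example.

```python
def odds(p):
    """Convert a proportion p (0 <= p < 1) into odds, p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)

print(odds(0.5))   # 50/50
print(odds(0.75))  # 3 to 1: in the category 3 times out of 4
print(odds(0.25))  # p < 0.5, so the odds fall below 1
```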

Odds Ratio
The odds ratio (OR) is the ratio of the odds in one group to the odds in the other group:
$\text{OR} = \dfrac{\text{odds}_1}{\text{odds}_2} = \dfrac{p_1 / (1 - p_1)}{p_2 / (1 - p_2)}$
An odds ratio of 1 indicates no difference between the groups (no relationship between the two variables).

OR for Cats and Schizophrenia
Odds ratio for having a cat in the house as a child, comparing cases to controls (1982):
$\text{OR} = \dfrac{1075 / 1050}{2065 / 2782} \approx 1.38$
Here 1050 = 2125 - 1075 cases and 2782 = 4847 - 2065 controls did not have a cat.
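A minimal Python sketch of this odds ratio, using the 1982 counts given on the "Creating a Two-Way Table" slide (1075 of 2125 cases and 2065 of 4847 controls had a cat). The function `odds_ratio` and its argument names are assumptions made for illustration.

```python
def odds_ratio(yes_1, n_1, yes_2, n_2):
    """Odds of 'yes' in group 1 divided by the odds of 'yes' in group 2."""
    odds_1 = yes_1 / (n_1 - yes_1)  # count in category / count out of category
    odds_2 = yes_2 / (n_2 - yes_2)
    return odds_1 / odds_2

or_1982 = odds_ratio(1075, 2125, 2065, 4847)
print(f"Odds ratio (cases vs. controls): {or_1982:.2f}")
```

Dividing counts directly gives the same odds as p / (1 - p), because the group size cancels from the numerator and denominator.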

From the Paper

Summary: Two Categorical Variables
Visualization
- Side-by-side bar chart
- Segmented bar chart
Summary Statistics
- Two-way table
- Difference in proportions
- Odds ratio

Real-Life Takeaways
- Toxoplasmosis can have serious consequences
- Children and pregnant women should not be exposed to cat feces
- Only cats that hunt are susceptible, and they can only transmit for the first three weeks after infection
- Worried? Most people with toxoplasmosis never develop schizophrenia or mental illness

To Do
- Read Section 2.1
- Do HW 2.1 (due Friday, 9/25)