Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research.

1 Statistics – OR 155 Section 1 J. S. Marron, Professor Department of Statistics and Operations Research

2 Class Information Handouts http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassInfo/Stor155-09FirstHandout.pdf With: Blackboard Info Student Survey (please fill out & return after class)

3 Class Information Go to Blackboard (for class details): Website: http://blackboard.unc.edu/ Log-in with Onyen Choose this course Control Panel > Content Areas Course Information Choose Item “Course Information”

4 Relationship to Textbook Ordering of material in textbook is usual But I don’t like it (poorly motivated) So will change the order of the material (for better motivation) Will jump around a lot through the text

5 Reading In Textbook Approximate Reading for Today’s Material: Pages 1-5, 197-203, 203-208 Approximate Reading for Next Class: Pages 237-250

6 What is Statistics? Definition 1: Gaining Insight from Numbers (similar to text’s definition) Definition 2: The Science of Managing Uncertainty

7 What is Statistics? Subtopics: Gathering the Numbers –E.g. Statistician at a ball game –Will see: how this is done is critical Forming Conclusions –Will use math, etc. –Major focus of this course

8 Key Themes I. Uncertainty II. Variability (will get quantitative about these) Favorite Quote: “I was never good at math, but statistics is easy, since it is just common sense”

9 Motivating Examples 1. Political Polls –Try to predict outcome of election –Too expensive to ask everyone –So ask some (hope they are “representative”) 2. Measurement Error –No measurement is exact –Can improve by multiple measurements –How to model? Lessons of these are broadly applicable

10 Common Structure For both, find out about truth from a sample E.g. 1: % for Cand. in population vs. % for Cand. in sample E.g. 2: true size vs. observed measurement

11 Motivating Examples 1. Political Polls 2. Measurement Error Will study each using mathematical models Do E.g. 1 first, since easier Appropriate Models?

12 Political Polls Appropriate Mathematical Models? Depends on how data are gathered. See Text, pages 171-177 Seems easy??? “Just choose some”??? Take a look at history…

13 How to sample? History of Presidential Election Polls During Campaigns, constantly hear in news “polls say …” How good are these? Why?

14 How to sample? History of Presidential Election Polls During Campaigns, constantly hear in news “polls say …” How good are these? Why? 1936 Landon vs. Roosevelt Literary Digest Poll: 43% for R

15 How to sample? History of Presidential Election Polls During Campaigns, constantly hear in news “polls say …” How good are these? Why? 1936 Landon vs. Roosevelt Literary Digest Poll: 43% for R Result: 62% for R

16 How to sample? History of Presidential Election Polls During Campaigns, constantly hear in news “polls say …” How good are these? Why? 1936 Landon vs. Roosevelt Literary Digest Poll: 43% for R Result: 62% for R What happened? Sample size not big enough? 2.4 million Biggest Poll ever done (before or since)

17 Bias in Sampling Bias: Systematically favoring one outcome (need to think carefully) Selection Bias: Addresses from L. D. readers, phone books, club memberships (representative of population?) Non-Response Bias: Return-mail survey (who had time?)
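The selection-bias lesson can be illustrated with a small simulation. This is a minimal sketch under made-up assumptions: a population in which 62% favor R, and a hypothetical sampling frame (standing in for phone books and club lists) that reaches R supporters only half as often as everyone else. The reach probabilities, population size, and function name are illustrative, not historical data.

```python
import random

def simulate_polls(pop_size=200_000, true_share=0.62, seed=0):
    """Compare a huge biased sample with a small simple random sample."""
    rng = random.Random(seed)
    # 1 = favors R, 0 = favors the opponent
    population = [1 if rng.random() < true_share else 0 for _ in range(pop_size)]

    # ASSUMPTION: the biased frame reaches R supporters with probability 0.3
    # and everyone else with probability 0.6 (illustrative numbers only).
    frame = [v for v in population if rng.random() < (0.3 if v == 1 else 0.6)]
    biased_estimate = sum(frame) / len(frame)

    # A simple random sample of only 1,000 people from the whole population.
    srs = rng.sample(population, 1000)
    random_estimate = sum(srs) / len(srs)

    actual = sum(population) / pop_size
    return actual, biased_estimate, random_estimate, len(frame)

actual, biased, rand_est, frame_size = simulate_polls()
print(f"actual share for R:        {actual:.3f}")
print(f"biased sample (n={frame_size}): {biased:.3f}")
print(f"random sample (n=1000):    {rand_est:.3f}")
```

Under these assumptions the biased sample is tens of thousands of people yet lands far from the truth, while the much smaller random sample lands close to it. A larger sample does not repair a biased way of choosing it.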

18 How to sample? 1936 Presidential Election (cont.) Interesting Alternative Poll: Gallup: 56% for R (sample size ~ 50,000) Gallup’s prediction of the L. D. poll: 44% for R (sample size ~ 3,000)

19 How to sample? 1936 Presidential Election (cont.) Interesting Alternative Poll: Gallup: 56% for R (sample size ~ 50,000) Gallup’s prediction of the L. D. poll: 44% for R (sample size ~ 3,000) Predicted both correct result (62% for R), and L. D. error (43% for R)! (how was improvement done?)

20 Improved Sampling Gallup’s Improvements: (i) Personal Interviews (attacks non-response bias) (ii) Quota Sampling (attacks selection bias)

21 Quota Sampling Idea: make “sample like population” So surveyor chooses people to give: i. Right % male ii. Right % “young” iii. Right % “blue collar” iv. … This worked fairly well (~5% error), until …

22 How to sample? 1948 Dewey Truman sample size

23 How to sample? 1948
            Dewey   Truman   sample size
Crossley    50%     45%
Gallup      50%     44%      ~50,000
Roper       53%     38%      ~15,000

24 How to sample? 1948
            Dewey   Truman   sample size
Crossley    50%     45%
Gallup      50%     44%      ~50,000
Roper       53%     38%      ~15,000
Actual      45%     50%      -

25 How to sample? 1948
            Dewey   Truman   sample size
Crossley    50%     45%
Gallup      50%     44%      ~50,000
Roper       53%     38%      ~15,000
Actual      45%     50%      -
Note: Embarrassing for polls, famous photo of Truman + Headline “Dewey Defeats Truman”

26 How to sample? Note: Embarrassing for polls, famous photo of Truman + Headline “Dewey Defeats Truman”

27 What went wrong? Problem: Unintentional Bias (surveyors understood bias, but still made choices)

28 What went wrong? Problem: Unintentional Bias (surveyors understood bias, but still made choices) Lesson: Human Choice cannot give a Representative Sample

29 What went wrong? Problem: Unintentional Bias (surveyors understood bias, but still made choices) Lesson: Human Choice cannot give a Representative Sample Surprising Improvement: Random Sampling Now called “scientific sampling” Random = Scientific???

30 Random Sampling Key Idea: “random error” is smaller than “unintentional bias”, for large enough sample sizes

31 Random Sampling Key Idea: “random error” is smaller than “unintentional bias”, for large enough sample sizes How large? Current sample sizes: ~1,000 - 3,000

32 Random Sampling Key Idea: “random error” is smaller than “unintentional bias”, for large enough sample sizes How large? Current sample sizes: ~1,000 - 3,000 Note: now << 50,000 used in 1948. So surveys are much cheaper (thus many more done now….)

33 Random Sampling How Accurate? Can (& will) calculate using “probability” Justifies term “scientific sampling” 2nd improvement over quota sampling
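As a preview of that calculation, the sketch below uses the standard approximate 95% margin of error for a sample proportion under simple random sampling, 1.96 * sqrt(p(1 - p) / n), evaluated at the worst case p = 0.5. The formula is a textbook result rather than something stated on these slides, and the function name is illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

# Sample sizes mentioned in the slides: modern polls vs. the 1948 quota polls.
for n in (1_000, 3_000, 50_000):
    print(f"n = {n:>6}: about +/- {100 * margin_of_error(n):.1f} percentage points")
```

For n = 1,000 this gives roughly 3 percentage points, and for n = 3,000 a little under 2. The calculation covers random error only; it says nothing about bias in how the sample was chosen.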

34 Random Sampling What is random? Simple Random Sampling: Each member of population is equally likely to be in sample Key Idea: Different from “just choose some”

35 Random Sampling An old (but still fun?) experiment: Choose a number among 1,2,3,4

36 Random Sampling An old (but still fun?) experiment: Choose a number among 1,2,3,4 Old typical results: about 70% choose “3” (perhaps you have seen this before…)

37 Random Sampling An old (but still fun?) experiment: Choose a number among 1,2,3,4 Old typical results: about 70% choose “3” (perhaps you have seen this before…) Main lesson: human choice does not give “equally likely” (i.e. random sample)

38 Random Sampling How to choose a random sample? Old Approaches: –Random Number Table –Roll Dice Modern Approach: –Computer Generated
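A minimal sketch of the computer-generated approach, using Python's standard `random` module. The roster of 100 students is hypothetical, and the helper name is only for illustration.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k distinct members so that every member is equally likely to be chosen."""
    rng = random.Random(seed)  # pass a seed to make the draw repeatable
    return rng.sample(population, k)

# Hypothetical roster standing in for the population of interest.
students = [f"student_{i}" for i in range(1, 101)]
print(simple_random_sample(students, 5, seed=155))
```

Here the computer, not a person, decides who is in the sample, which is what separates this from “just choose some”.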

39 Random Sampling HW Interesting Question: What is the % of Male Students at UNC? (Your chance of a date, or take 100% minus this to get your chance) HW: C1: Class Handout http://stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/HWAsst/Stor155HWC1.pdf

40 Random Sampling HW Notes on HW C1: 3 dumb ways to sample, 1 good one Goal is to learn about sampling, Not “get right answer” Part 1, put symbol for yourself, Ms and Fs for others Put both count & % (% = 100 x count / 25) Part 2, “tally” is: Part 4, student phone directory available in Student Union?

41 Random Sampling HW Notes on HW C1, Hints on Part 4: –For each draw, first draw a “random page” –Tools > Data Analysis > Random Number Generation > Uniform is one way to do this –In “Uniform”, you need to set “Parameters”, to 0 and “number of pages” –This gives a random decimal; to get an integer, round up using CEILING –In CEILING, set “significance” to 1
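For readers without the spreadsheet at hand, the same uniform-then-round-up idea can be sketched in Python. This is an illustrative translation, not part of the assignment: the function name is made up, and the redraw guard for the rare draw of exactly 0 is an added assumption.

```python
import math
import random

def random_page(number_of_pages, rng=random):
    """Pick a page number from 1 to number_of_pages, each equally likely."""
    while True:
        u = rng.uniform(0, number_of_pages)  # random decimal between 0 and number_of_pages
        page = math.ceil(u)                  # round up to an integer, like CEILING with significance 1
        if 1 <= page <= number_of_pages:     # guard: a draw of exactly 0 would round to page 0
            return page

print(random_page(250))  # 250 is a hypothetical page count
```

Rounding up rather than to the nearest integer matters: each page then corresponds to an interval of length 1, so all pages are equally likely.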

42 Random Sampling HW Notes on HW C1, Hints on Part 4 (cont.): –Next Choose Random Column –Next Choose Random Name –Caution: Different numbers on each page. –Challenge: still make equally likely –Approach: choose larger number –Approach: when not there, just toss it out –Approach: then do a “redraw” –Also redraw if can’t tell gender
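The “choose larger number, toss out, redraw” approach can be sketched as follows. The tiny directory is hypothetical (a list of pages, each a list of columns, each a list of names), and the function name is illustrative. The gender-based redraw is not modeled here.

```python
import random

def draw_name(directory, rng):
    """Draw one name so that every name in the directory is equally likely."""
    # "Choose larger number": use the largest column count and column length anywhere.
    max_cols = max(len(page) for page in directory)
    max_rows = max(len(col) for page in directory for col in page)
    while True:
        p = rng.randrange(len(directory))  # random page
        c = rng.randrange(max_cols)        # random column position
        r = rng.randrange(max_rows)        # random name position
        # "When not there, just toss it out" and redraw from the start.
        if c < len(directory[p]) and r < len(directory[p][c]):
            return directory[p][c][r]

# Hypothetical directory with different numbers of columns and names per page.
directory = [
    [["Ann", "Bob", "Cal"], ["Dee", "Eve"]],
    [["Fay"]],
]
print(draw_name(directory, random.Random(42)))
```

Every (page, column, position) slot is drawn with the same probability, and only slots that hold a name are accepted, so “Fay” on the sparse page is no more likely than any name on the full page. Redrawing only the name while keeping the page would break this, since names on short pages would then be favored.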

