www.chrisbilder.com: Turning data into knowledge to solve real world problems. Christopher R. Bilder, Ph.D., Department of Statistics, University of.

Presentation transcript:

Turning data into knowledge to solve real world problems
Christopher R. Bilder, Ph.D.
Department of Statistics, University of Nebraska-Lincoln

Years ago…
• The year is 1990
• Music – U2
• George Bush is president
• TV – The Simpsons
• Millard South
  – Senior year
  – Big hair
  – In the middle of winning state titles in basketball for 3 out of 4 years (1988, 1989, 1991)
• What am I going to major in at college?
  – Calculus I
  – No AP Statistics!

Years ago…
• UNO (1990 – 1994)
  – Math undergraduate major – What can you do with a degree?
  – Planned to be an actuary
  – Hypothesis testing in a statistics course (junior year)
      Use for decision making!
      Scientifically prove a hypothesis or statement
• Kansas State University for graduate school (1994 – 2000)
  – Statistics graduate major in Department of Statistics
  – Master of Science (MS) and Doctor of Philosophy (PhD)
• Oklahoma State University faculty (2000 – 2003)
  – Department of Statistics
• UNL faculty (2003 – now)
  – NEW Department of Statistics

Purpose
• Tell you a little about the statistical science
• Turning data into knowledge to solve real world problems
  – 3 actual examples
• AP statistics exam
• Website for more information

Grocery store prices
• Undergraduate teaching example for a course like AP STATs
• How could you determine which grocery store, Super Wal-Mart or Baker's, has lower average prices?
  – Paired or dependent two sample hypothesis test for $\mu_{\text{Wal-Mart}} - \mu_{\text{Baker's}}$
  – Sample the same items at each store

Grocery store prices
• Undergraduate teaching example for a course like AP STATs
• How could you determine which grocery store, Dillon's or Food-4-Less in Manhattan, KS, has lower average prices?
  – Paired or dependent two sample hypothesis test for $\mu_{\text{Dillon's}} - \mu_{\text{Food-4-Less}}$
  – Sample the same items at each store
• Only cereals from Fall 1998

Grocery store prices
• Sample: [data table not included in the transcript]

Grocery store prices
• Do you think there are mean differences?
[Plot of the sample prices; labels: 25%, 50%, 75%]

Grocery store prices
• Paired two sample hypothesis test
  – $H_0: \mu_{\text{Dillon's}} - \mu_{\text{Food-4-Less}} = 0$ versus $H_a: \mu_{\text{Dillon's}} - \mu_{\text{Food-4-Less}} \neq 0$
  – t = 4.77, p-value = …, 95% C.I.: … < $\mu_{\text{Dillon's}} - \mu_{\text{Food-4-Less}}$ < …
  – Reject equal mean prices
• If price were the only consideration, which store should one shop at?
• Assumptions
  – Normal populations
  – The sample was taken in 1998; what about now?
  – Finite populations
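As a rough illustration of the paired test, the sketch below computes the t statistic from the differences between prices of the same items at two stores. The prices are hypothetical and are not the cereal data from the talk; the function name `paired_t_statistic` is also just an illustrative choice.

```python
import math
import statistics


def paired_t_statistic(prices_a, prices_b):
    """Return (mean difference, t statistic, degrees of freedom) for paired samples."""
    if len(prices_a) != len(prices_b):
        raise ValueError("paired samples must have the same length")
    # Work with the per-item differences, which is what makes the test "paired".
    diffs = [a - b for a, b in zip(prices_a, prices_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff, mean_diff / std_err, n - 1


# Hypothetical prices (in dollars) for the same five items at two stores.
store_a = [3.50, 4.00, 2.50, 3.00, 4.50]
store_b = [3.00, 3.75, 2.25, 2.50, 4.00]

mean_diff, t_stat, df = paired_t_statistic(store_a, store_b)
print(f"mean difference = {mean_diff:.2f}, t = {t_stat:.2f}, df = {df}")
```

The t statistic would then be compared with a t distribution on n - 1 degrees of freedom to obtain a p-value; that step is left out here to keep the sketch within the standard library.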

Placekicking
• The use of the statistical science in sports
• Find a model to estimate the probability of success for placekicks (field goals, PATs) in the NFL
• Video
  – January 7, 1996
  – Playoff game
  – Indianapolis Colts 10, Kansas City Chiefs 7
  – Lin Elliott of KC will attempt a 42 yard field goal to tie the game and send it into overtime
  – Field goal video

Placekicking
• What factors affect the probability of success for NFL placekicks?
  – Distance
  – Pressure – How do you quantitatively measure?
  – Wind
  – Grass vs. artificial turf
  – Dome vs. outdoor stadium
• Collect sample of >1,700 placekicks during the 1995 NFL season
• Find the best logistic regression model of the form

  $$p = \frac{e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}{1 + e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}$$

  where
  – p is the probability of success
  – $x_i$ for i=1,…,k are independent variables
  – $\beta_i$ measures the effect of $x_i$ on p for i=1,…,k
  – $e \approx 2.718$; ln(e) = 1
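To make the logistic form concrete, here is a minimal sketch that turns a set of coefficients and independent-variable values into a probability of success. The coefficients used below are made up for illustration; they are not the estimates from the placekicking model.

```python
import math


def success_probability(intercept, slopes, x):
    """Logistic model: p = e^(linear predictor) / (1 + e^(linear predictor))."""
    linear = intercept + sum(b * xi for b, xi in zip(slopes, x))
    return math.exp(linear) / (1.0 + math.exp(linear))


# Hypothetical coefficients: intercept 5.0 and a slope of -0.1 per yard of distance.
p_42_yards = success_probability(5.0, [-0.1], [42])
print(f"hypothetical probability of success at 42 yards: {p_42_yards:.3f}")
```

A negative slope for distance means the probability drops as the kick gets longer, which is the qualitative behaviour one would expect from such a model.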

Placekicking
• The $\beta_i$'s are parameters which are estimated using "iteratively reweighted least squares"
• Estimated model
  – Change: lead change = 1, non-lead change = 0
  – Distance: distance in yards
  – PAT: point after touchdown = 1, field goal = 0
  – Wind: windy (speed > 15 MPH) = 1, non-windy = 0
• What is the estimated probability of success for Elliott's field goal?
  – Conditions: …
  – Estimated probability of success: …
  – 90% confidence interval for probability of success: … < p < …
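The following is a minimal sketch of iteratively reweighted least squares for a logistic model with an intercept and a single independent variable. It uses a tiny made-up data set of distances and outcomes, not the 1995 NFL sample, and the function name `fit_logistic_irls` is hypothetical. For this model each IRLS update is the same as a Newton-Raphson step.

```python
import math


def fit_logistic_irls(x, y, max_iter=25, tol=1e-10):
    """Fit ln(p / (1 - p)) = b0 + b1 * x by iteratively reweighted least squares."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        # Weighted sums that make up X'WX and the score vector X'(y - p).
        s00 = s01 = s11 = g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)  # IRLS weight for this observation
            s00 += w
            s01 += w * xi
            s11 += w * xi * xi
            g0 += yi - p
            g1 += (yi - p) * xi
        # Solve the 2x2 system (X'WX) * step = X'(y - p).
        det = s00 * s11 - s01 * s01
        step0 = (s11 * g0 - s01 * g1) / det
        step1 = (s00 * g1 - s01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


# Made-up data: kick distance in yards and outcome (1 = success, 0 = failure).
toy_distance = [20, 25, 30, 35, 40, 45, 50, 55]
toy_success = [1, 1, 1, 0, 1, 0, 1, 0]

b0, b1 = fit_logistic_irls(toy_distance, toy_success)
print(f"intercept = {b0:.3f}, slope for distance = {b1:.4f}")
```

In practice a statistical package would handle several independent variables and report standard errors; the sketch only shows the shape of the iteration.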

Estimated probability of success for a field goal (PAT=0)

HCV prevalence
• Hepatitis C (HCV)
  – Viral infection that causes cirrhosis and cancer of the liver
• Questions:
  – How can people be tested in a cost effective and timely manner?
      Blood bank setting
  – What is the probability a person has HCV?
      What proportion of people in a population is afflicted with HCV?
      Prevalence in a population
• Individual testing
  – Each blood sample is tested individually
  – Problems:
      Costly
      Time
[Diagram: a single blood sample tested, result + or -]

HCV prevalence
• Group testing
  – Pool the blood samples together to form n groups of size s
  – If the GROUP sample is negative, then all s people do not have the disease
  – If the GROUP sample is positive, then at least ONE of the s people has the disease
      May want to determine who in the group has the disease
  – Strategy works well when prevalence of a disease is small
[Diagram: pooled samples for Group 1, Group 2, …, Group n, each tested once with result + or -]

HCV prevalence
• Notation
  – p = probability an INDIVIDUAL is HCV positive (prevalence)
  – $\theta$ = probability a GROUP is HCV positive
  – s = group size
  – n = number of groups
  – Let T be a random variable denoting the number of positive GROUPS
      T has a binomial distribution with "n trials" and "$\theta$ as the probability of success"

HCV prevalence
• How can we estimate p?
  – We observe information about the groups, not individuals!
  – Estimate $\theta$ with $\hat{\theta}$ = # positive groups / # of groups
  – $\theta$ = P(group is positive)
      = P(at least one individual is positive)
      = 1 – P(no individuals are positive), using the complement rule
      = 1 – P(all individuals are negative)
      = $1 - (1 - p)^s$, since p = P(individual is positive) and there are s individuals per group
  – Solving for p: $p = 1 - (1 - \theta)^{1/s}$
  – Then $\hat{p} = 1 - (1 - \hat{\theta})^{1/s}$
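The relationship between the individual and group probabilities, and the binomial distribution of the number of positive groups, can be sketched as follows. The values of p, s, and n are hypothetical, `theta` is simply the name used here for the probability that a group is positive, and the calculation assumes individuals in a group are independent.

```python
import math


def group_positive_probability(p, s):
    """P(group is positive) = 1 - (1 - p)^s for s independent individuals."""
    return 1.0 - (1.0 - p) ** s


def positive_groups_pmf(t, n, theta):
    """Binomial probability of observing exactly t positive groups out of n."""
    return math.comb(n, t) * theta ** t * (1.0 - theta) ** (n - t)


# Hypothetical setting: prevalence 0.02, groups of 5 people, 20 groups.
p, s, n = 0.02, 5, 20
theta = group_positive_probability(p, s)
probs = [positive_groups_pmf(t, n, theta) for t in range(n + 1)]

print(f"P(group positive) = {theta:.4f}")
print(f"P(no positive groups out of {n}) = {probs[0]:.4f}")
```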

HCV prevalence
• Estimation of HCV prevalence in Xuzhou City, China
  – Data from Liu et al. (Transfusion, 1997)
  – 1,875 blood donors screened for HCV
      There were 42 positives
  – In order to test the usefulness of group testing, blood samples were also pooled
      n = 375 groups
      s = 5 individuals per group
      t = 37 positive groups
  – Estimates of p, the probability an individual is positive
      Using individual data: 42/1875 = 0.0224
      Using group data: $\hat{p} = 1 - (1 - 37/375)^{1/5} \approx 0.0206$
  – Which is easier and more cost effective?
      1875 tests using individual testing
      375 tests using group testing
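The two estimates can be reproduced with a few lines of code. This is only a sketch of the arithmetic using the counts quoted from Liu et al.; the function name `prevalence_from_groups` is an illustrative choice.

```python
def prevalence_from_groups(positive_groups, n_groups, group_size):
    """Estimate individual prevalence from group results: 1 - (1 - t/n)^(1/s)."""
    group_rate = positive_groups / n_groups
    return 1.0 - (1.0 - group_rate) ** (1.0 / group_size)


# Counts from the Xuzhou City example: 42 of 1,875 individuals positive,
# and 37 of 375 pooled groups (5 individuals each) positive.
individual_estimate = 42 / 1875
group_estimate = prevalence_from_groups(37, 375, 5)

print(f"individual testing estimate: {individual_estimate:.4f}")
print(f"group testing estimate:      {group_estimate:.4f}")
```

The two estimates are close even though group testing used one fifth as many tests.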

HCV prevalence
• New research – MS/PhD research
  – What factors could affect p?
  – Include independent variables to help model p
  – Problem: Do not have the individual outcomes
  – After a group tests positive, how can you find which individuals have the disease?
      Use model to help decide who to retest if get a positive group
  – Multiple diseases
      HCV
      HIV
      Other disease
      Simultaneously model

HCV prevalence
• Multiple vector transfer designs
  – Swallow (Phytopathology, 1985)
  – Want to estimate the probability an insect vector transfers a pathogen (virus, bacteria, etc.) to a plant
[Photos: brown planthopper; whitebacked planthopper]

HCV prevalence
• Multiple vector transfer designs (continued)
[Diagram: greenhouse with enclosed test plants; each planthopper either transmits or does not transmit the virus; plants are labeled y=0 or y=1]
  – y = 0 if plant is negative, 1 if plant is positive
  – T = number of plants with disease

Why statistics?
• Statistics is used in many diverse areas!
  – Statistics is the "science of science"
  – Florence Nightingale quote: "the most important science in the whole world: for upon it depends the practical application of every other science and of every art: the one science essential to all political and social administration, all education, all organization based on experience, for it only gives results of our experience."
• Take statistics courses in college!
  – Of course, I want you to consider coming to UNL!
  – Statistics is mainly a graduate discipline, so there is no undergraduate major at UNL
  – Undergraduate minor in statistics can be useful for many majors
  – Most statisticians have an undergraduate degree (Bachelor of Science) in math

Why statistics?
• Where do statisticians work?
  – Pharmaceutical and medical research – Pfizer, Merck, medical centers
  – Marketing – Target, Hallmark
  – Government research labs – INEEL, Los Alamos, Sandia, Argonne
  – Agriculture – Pioneer Hi-Bred
  – Consulting firms – Quintiles
  – In Nebraska – ConAgra, Gallup, First National Bank, MDS Pharma, Experian, UNMC and Creighton medical center, various universities, Pfizer, Acton International, Nebraska state agencies, Union Pacific
• Everyone that I have known has had a job offer before they graduated!
• How many statisticians are there?
  – 20,000

Why statistics?
• Salaries
  – Non-academic starting (2003 American Statistical Association survey)
  – Survey response rate was 23.5%; see salary surveys at the American Statistical Association's website
• Background needed
  – Strong in mathematics and using computers
  – Majority of statisticians have Bachelor's degrees in mathematics
      Good with calculus
      Applied math courses
      Take at least one statistics course
      Comfortable with using software packages
  – To actually be a "statistician", usually need to go to graduate school to get an MS or PhD in statistics
      Financial support
      Graduate Teaching Assistantship

Why statistics?
• What courses to take next in college?
  – AP statistics is equivalent to a one semester introductory statistics course without calculus
      UNL: STAT 218 (Introduction to Statistics)
      UNO: MATH/STAT 3000 (Statistical Methods I); Business Administration 2130 (Principles of Business Statistics)
  – Theory – 2 semester sequence using calculus I-III
      UNL: STAT 462 (Distribution Theory) and STAT 463 (Statistical Inference)
      UNO: MATH 4740 and 4750 (Intro. to Probability and Statistics I and II)
  – Applications
      UNL: STAT 450 (Introduction to Regression Analysis) or STAT 412 (Introduction to Experimental Design)
      UNO: MATH/STAT 3010 (Statistical Methods II); Business Administration 3140 (Business Statistical Applications)

Why statistics?
• Other recommended UNL classes (undergraduate)
  – MATH 340 Numerical Analysis
  – MATH 314 Applied Linear Algebra
  – MATH 325 Elementary Analysis and MATH 425 Mathematical Analysis
      Helpful if you go on for a PhD
  – Computer science programming courses
• Other recommended UNO classes (undergraduate)
  – MATH 3300 Numerical Methods
  – MATH 4050 Linear Algebra
  – MATH 4760 Topics in Modeling
  – MATH 4230 and 4240 Mathematical Analysis I and II
      Helpful if you go on for a PhD
  – Computer science programming courses

AP Statistics
• Grading done in Lincoln!
  – State fair grounds
  – Grade the free response section of about 66,000 student exams (2004)
  – 250 AP statistics high school teachers and college professors
  – June 13 to June 19, 2005
  – 8:30AM – 4:45PM EVERY DAY

AP Statistics
• I graded in 2002
  – About 900 problems graded!
  – 16 graders in a room split into two groups
  – Each group has a leader
      Answers questions
      CHECKS some of your grading!
  – Paid $1,450
      Stay in dorms
      Free meals and snacks
• Grading is not fun
  – Evening activities
  – Discussions on how to teach introductory statistics better
• The grading rubric
  – An outline of how to grade a problem that must be followed!
  – These are put together before graders arrive by examining a sample set of tests

AP Statistics
• Question #6 in 2002
  – 4 parts – (a), (b), (c), (d)
  – Each part is graded as
      E = Essentially correct
      P = Partially correct
      I = Incomplete
  – Graders are given a "conversion" table to show how to convert the scores into a numerical score
      4 = Complete response
      3 = Substantial response
      2 = Developing response
      1 = Minimal response
      0 = No credit
  – 1 point given to an E, 0.5 points given to a P, 0 points given to an I
      Round up if (a) or (c) has the correct interpretation
  – Example given at end of PowerPoint file
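A small sketch of how the conversion might work in code is given below. It assumes that a half-point total is rounded down when the round-up condition is not met; the slide only states when to round up, so that part is an assumption, as are the names `POINTS` and `question_score`.

```python
import math

# Points per part as stated on the slide: E = 1, P = 0.5, I = 0.
POINTS = {"E": 1.0, "P": 0.5, "I": 0.0}


def question_score(part_grades, round_up_half):
    """Convert the E/P/I grades for parts (a)-(d) into a 0-4 score.

    round_up_half is the grader's judgment that (a) or (c) has the correct
    interpretation. Rounding down otherwise is an assumption of this sketch.
    """
    total = sum(POINTS[grade] for grade in part_grades)
    if total == int(total):
        return int(total)
    return math.ceil(total) if round_up_half else math.floor(total)


# Example: E, P, E, I adds up to 2.5 points.
print(question_score(["E", "P", "E", "I"], round_up_half=True))
print(question_score(["E", "P", "E", "I"], round_up_half=False))
```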

For more information…
• me at
• Website:
  – This PowerPoint presentation (including example question)
  – Links to
      Introductory information about being a statistician
      Jobs (including internships)
      Salary information
      List of all Departments of Statistics
      Professional societies
      Websites for courses that I and others teach
      Newspaper and magazine articles about statistical applications


Statistics at UNL
[Map; labels: 33rd St., Department of Statistics]

AP Statistics
[Five image-only slides; the only surviving text is the annotation "May actually be an E?"]
