Two-Sample Problems – Means 1.Comparing two (unpaired) populations 2.Assume: 2 SRSs, independent samples, Normal populations Make an inference for their.


1 Two-Sample Problems – Means
1. Comparing two (unpaired) populations
2. Assume: 2 SRSs, independent samples, Normal populations
Make an inference for their difference, μ1 − μ2.
Sample from population 1: x̄1, s1, n1
Sample from population 2: x̄2, s2, n2

2 S.E. – standard error in the two-sample process. Confidence Interval: Estimate ± margin of error. Significance Test:
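The formulas for this slide are not present in the transcript. As a minimal sketch, assuming the usual unpooled (Welch) two-sample t procedure is what was intended, the standard error, test statistic, and approximate degrees of freedom can be computed from summary statistics as below. The function name and the sample numbers are hypothetical, chosen only for illustration.

```python
from math import sqrt

def two_sample_t(xbar1, s1, n1, xbar2, s2, n2):
    """Unpooled two-sample t quantities from summary statistics.

    Assumption: unpooled (Welch) form; the slide's own formula is not shown.
    """
    v1 = s1 ** 2 / n1              # variance contribution of sample 1
    v2 = s2 ** 2 / n2              # variance contribution of sample 2
    se = sqrt(v1 + v2)             # standard error of (xbar1 - xbar2)
    t = (xbar1 - xbar2) / se       # test statistic for H0: mu1 - mu2 = 0
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, t, df

# Hypothetical summary statistics, not taken from the slides
se, t, df = two_sample_t(10.0, 2.0, 25, 9.0, 3.0, 36)
print(f"SE = {se:.4f}, t = {t:.4f}, df = {df:.1f}")
```

A confidence interval then takes the form (x̄1 − x̄2) ± t* × SE, with t* read from a t table or software for the chosen confidence level.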

3 Using the Calculator – Confidence Interval
On calculator: STAT, TESTS, 0:2-SampTInt…
Given data, need to enter: list locations, C-Level
Given stats, need to enter, for each sample: x̄, s, n, and then C-Level
Select input (Data or Stats), enter appropriate info, then Calculate

4 Using the Calculator – Significance Test
On calculator: STAT, TESTS, 4:2-SampTTest…
Given data, need to enter: list locations, Ha
Given stats, need to enter, for each sample: x̄, s, n, and then Ha
Select input (Data or Stats), enter appropriate info, then Calculate or Draw
Output: test statistic, p-value

5 Ex 1. Is one model of camp stove any different from another at boiling water, at the 5% significance level? Model 1: Model 2:

6 Ex 2. Is there evidence that children get more REM sleep than adults at the 1% significance level? Children: Adults:

7 Ex 3. Create a 98% C.I. for the difference in mean petal lengths (in cm) of two species of iris. Iris virginica: Iris setosa:

8 Ex 4. Is one species of iris different from the other in petal length at the 2% significance level? Iris virginica: Iris setosa:

9 Two-Sample Problems – Proportions. Make an inference for their difference, p1 − p2. Sample from population 1: Sample from population 2:

10 Using the Calculator – Confidence Interval
On calculator: STAT, TESTS, B:2-PropZInt…
Need to enter: C-Level
Enter appropriate info, then Calculate.
Estimate ± margin of error
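As a rough companion to the calculator steps, here is a minimal sketch of the usual large-sample interval for p1 − p2, assuming the unpooled standard error. The function name is hypothetical, and the counts are borrowed from the voter-reminder example (Ex 6) purely as illustrative input; that example itself asks for a test rather than an interval.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, c_level=0.95):
    """Large-sample confidence interval for p1 - p2 (unpooled SE assumed)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Critical value z* leaving (1 - c_level)/2 in each tail
    z_star = NormalDist().inv_cdf(1 - (1 - c_level) / 2)
    estimate = p1 - p2
    margin = z_star * se
    return estimate - margin, estimate + margin

# Counts taken from Ex 6: 332 of 600 with reminders, 248 of 500 without
low, high = two_prop_z_interval(332, 600, 248, 500)
print(f"95% CI for p1 - p2: ({low:.4f}, {high:.4f})")
```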

11 Using the Calculator – Significance Test
On calculator: STAT, TESTS, 6:2-PropZTest…
Need to enter: and then Ha
Enter appropriate info, then Calculate or Draw
Output: test statistic, p-value

12 Ex 5. Create a 95% C.I. for the difference in proportions of eggs hatched. Nesting boxes apart/hidden: Nesting boxes close/visible:

13 Ex 6. Split 1100 potential voters into two groups, those who get a reminder to register and those who do not. Of the 600 who got reminders, 332 registered. Of the 500 who got no reminders, 248 registered. Is there evidence at the 1% significance level that the proportion of potential voters who registered was greater in the group that received reminders? Group 1: Group 2:

14 Ex 6. (continued)
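The worked solution for Ex 6 is not included in the transcript. A minimal sketch of the standard pooled two-proportion z test, using the counts stated in the example and assuming Group 1 is the reminder group with Ha: p1 > p2, could look like this (the function name is hypothetical):

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2):
    """Right-tailed pooled z test of H0: p1 = p2 against Ha: p1 > p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                 # common proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 1 - NormalDist().cdf(z)              # area to the right of z
    return z, p_value

# Ex 6 counts: 332 of 600 registered with reminders, 248 of 500 without
z, p_value = two_prop_z_test(332, 600, 248, 500)
reject = p_value < 0.01                            # 1% significance level
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Under these assumptions the p-value is above 0.01, so the data would not give significant evidence at the 1% level.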

15 Ex 7. “Can people be trusted?” Among 250 people aged 18-25, 45 said “yes”. Among 280 people aged 35-45, 72 said “yes”. Does this indicate that the proportion of trusting people is higher in the older population? Use a significance level of α = .05. Group 1: Group 2:

16 Ex 7. (continued)

17 Scatterplots & Correlation. Two characteristics are recorded for each individual in the population/sample, instead of one. Goal: to make accurate predictions for one variable in terms of another, based on a data set of paired values.

18 Variables. The explanatory (independent) variable, x, is used to predict a response. The response (dependent) variable, y, is the outcome from a study or experiment. Examples: height vs. weight, age vs. memory, temperature vs. sales.

19 Scatterplots
A plot of paired values helps to determine if a relationship exists.
Ex: variables – height (in), weight (lb)

Height  Weight
72      171
65      150
68      180
70      180
72      185
66      165

[Scatterplot: height (in) vs. weight (lb)]

20 Scatterplots – Features
Direction: negative, positive
Form: line, parabola, wave (sine)
Strength: how close to following a pattern
[Scatterplot: height vs. weight]
Direction: Form: Strength:

21 Scatterplots – Temp vs Oil used
[Scatterplot: temperature vs. oil used]
Direction: Form: Strength:

22 Correlation. Correlation, r, measures the strength of the linear relationship between two variables. r > 0: positive direction. r < 0: negative direction. Close to +1: Close to -1: Close to 0:
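To make the definition concrete, here is a minimal sketch that computes r for the six height/weight pairs from the scatterplot slide, using the standard formula (the average product of standardized values, with n − 1 in the denominator). The function name is an illustrative choice.

```python
from statistics import mean, stdev

# Height (in) and weight (lb) pairs from the scatterplot example
heights = [72, 65, 68, 70, 72, 66]
weights = [171, 150, 180, 180, 185, 165]

def correlation(xs, ys):
    """Sample correlation: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)      # sample standard deviations
    return sum(((x - mx) / sx) * ((y - my) / sy)
               for x, y in zip(xs, ys)) / (n - 1)

r = correlation(heights, weights)
print(f"r = {r:.3f}")
```

For these six points r comes out at roughly 0.75: a positive, moderately strong linear relationship.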

23 .85, -.02, .13, -.79

24 Lines – Review: y = a + bx [Graph of a line] a: b:

25 Regression. Looking at a scatterplot, if the form seems linear, then use a linear model or regression line to describe how a response variable y changes as an explanatory variable x changes. Regression models are often used to predict the value of a response variable for a given value of the explanatory variable.

26 Least-Squares Regression Line. The line that best fits the data: ŷ = a + bx, where:
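The coefficient formulas for this slide are not present in the transcript. A minimal sketch, assuming the standard least-squares results b = r·(s_y/s_x) (equivalently S_xy/S_xx) and a = ȳ − b·x̄, applied to the height/weight pairs from the scatterplot example:

```python
from statistics import mean

# Height (in) and weight (lb) pairs from the scatterplot example
heights = [72, 65, 68, 70, 72, 66]
weights = [171, 150, 180, 180, 185, 165]

def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx          # slope; algebraically equal to r * s_y / s_x
    a = my - b * mx        # intercept; the line passes through (x-bar, y-bar)
    return a, b

a, b = least_squares(heights, weights)
print(f"predicted weight = {a:.2f} + {b:.3f} * height")
```

This is the same form, a + bx, that the calculator's LinReg(a+bx) command reports.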

27 Example: Fat and calories for 11 fast food chicken sandwiches. Fat: Calories:

28 Example: Fat and calories for 11 fast food chicken sandwiches. Fat: Calories: [Scatterplot: Fat vs. Calories]

29 Example – continued. What is the slope and what does it mean? What is the intercept and what does it mean? How many calories would you predict for a sandwich with 40 grams of fat?

30 Why “Least-squares”?
The least-squares line is the line that minimizes the sum of the squared residuals.
Residual: difference between actual and predicted (actual − predicted)
[Graph: data points and fitted line]

x   y    predicted   residual
1   10   14          -4
3   25   24           1
…   …    …            …
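The minimizing property can be checked numerically. The sketch below, which reuses the height/weight pairs from the scatterplot example and hypothetical helper names, fits the least-squares line, computes residuals as actual − predicted, and compares its sum of squared residuals with that of a few arbitrarily shifted lines.

```python
from statistics import mean

heights = [72, 65, 68, 70, 72, 66]
weights = [171, 150, 180, 180, 185, 165]

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y-hat = a + b*x."""
    mx, my = mean(xs), mean(ys)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def residuals(a, b, xs, ys):
    """Residual = actual - predicted, one per data point."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

def sse(a, b, xs, ys):
    """Sum of squared residuals for the line a + b*x."""
    return sum(e ** 2 for e in residuals(a, b, xs, ys))

a, b = fit_line(heights, weights)
best = sse(a, b, heights, weights)
# Arbitrary alternative lines, chosen only for comparison
others = [sse(a + 5, b, heights, weights),
          sse(a, b + 0.1, heights, weights),
          sse(a - 5, b - 0.05, heights, weights)]
print(f"least-squares SSE = {best:.2f}; alternatives = "
      f"{[round(o, 2) for o in others]}")
```

Any other line gives a larger sum of squared residuals, and the residuals of the least-squares line sum to zero (up to rounding).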

31 Scatterplots – Residuals. To double-check the appropriateness of a linear regression model, plot the residuals against the explanatory variable. No unusual pattern means a linear model is a good fit.

32 Other things to look for. Squared correlation, r², gives the percent of variation in y explained by the regression line. Chicken data:

33 Other things to look for. Influential observations: Prediction vs. Causation: x and y are linked (associated) somehow, but we don’t say “x causes y to occur”. Other forces may be causing the relationship (lurking variables).

34 Extrapolation: using the regression for a prediction outside the range of values of the explanatory variable.

age   weight
20    180
25    190
32    190
36    200
36    225
40    215
47    220
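One way to guard against extrapolation is to check whether a requested x value lies inside the observed range before trusting the prediction. The sketch below fits a least-squares line to the age/weight table on this slide; the helper names and the two query ages (30 and 80) are hypothetical choices for illustration.

```python
from statistics import mean

# Age (years) and weight pairs from the extrapolation slide
ages = [20, 25, 32, 36, 36, 40, 47]
weights = [180, 190, 190, 200, 225, 215, 220]

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y-hat = a + b*x."""
    mx, my = mean(xs), mean(ys)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

a, b = fit_line(ages, weights)

def predict(age):
    """Return (predicted weight, is_extrapolation) for a given age."""
    is_extrapolation = not (min(ages) <= age <= max(ages))
    return a + b * age, is_extrapolation

for age in (30, 80):
    value, flagged = predict(age)
    note = "extrapolation, treat with caution" if flagged else "within data range"
    print(f"age {age}: predicted weight {value:.1f} ({note})")
```

The line itself will happily return a number for age 80, but nothing in the data (ages 20 to 47) supports the linear pattern continuing that far.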

35 On calculator
Set up: 2nd 0 (Catalog), x⁻¹ (D), scroll down to “Diagnostic On”, Enter, Enter
Scatterplots: 2nd Y= (Stat Plot), 1, On, select Type and list locations for x values and y values; then ZOOM, 9 (ZoomStat)
Regression: STAT, CALC, 8: LinReg(a+bx), Enter, list location for x, list location for y, Enter
Graph: Y=, enter line into Y1

36 Examples:
Animals: Cat, Chick, Dog, Duck, Goat, Lion, Bird, Pig, Bunny, Squirrel
x (incubation, days): 63, 22, 63, 28, 151, 108, 18, 115, 31, 44
y (lifespan, years): 117.5111012108 79

x (age, years): 2, 5, 2, 5, 4, 5, 1, 1, 4, 2, 6, 1
y (resale, thousands $): 16, 11, 17, 10, 12, 11, 20, 19, 10, 16, 11, 20

37 Contingency Tables
Making comparisons between two categorical variables.
Contingency tables summarize all outcomes:
– Row variable: one row for each possible value
– Column variable: one column for each possible value
– Each cell (i, j) gives the number of individuals with those values for the respective variables.

Age\Income   <15   15-30   >30   Total
<21            5       3     1       9
21-25          4       9     6      19
>25            2       2     8      12
Total         11      14    15      40

38 Info from the table (same Age\Income table as slide 37; income in thousands of dollars)
– # who are over 25 and make under $15,000:
– % who are over 25 and make under $15,000:
– % who are over 25:
– % of the over 25 who make under $15,000:

39 Marginal Distributions
– Look to the margins of the table for each individual variable’s distribution
– Marginal distribution for age:
– Marginal distribution for income:

Age     Freq.   Rel. Freq.
<21       9
21-25    19
>25      12
Total    40

Income       <15   15-30   >30   Total
Freq.         11      14    15      40
Rel. Freq.

40 Conditional Distributions
– Look at one variable’s distribution given another
– How does income vary over the different age groups?
– Consider each age group as a separate population and compute relative frequencies:

Age\Income   <15   15-30   >30   Total
<21
21-25
>25
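The blank relative-frequency cells on the last two slides can be filled in mechanically. Here is a minimal sketch using the age/income counts from the contingency table; the dictionary layout and variable names are illustrative choices.

```python
# Observed counts: rows are age groups, columns are income (thousands of $)
income_levels = ["<15", "15-30", ">30"]
table = {
    "<21":   [5, 3, 1],
    "21-25": [4, 9, 6],
    ">25":   [2, 2, 8],
}

total = sum(sum(row) for row in table.values())

# Marginal distributions: row or column totals divided by the grand total
age_marginal = {age: sum(row) / total for age, row in table.items()}
income_marginal = {
    level: sum(row[j] for row in table.values()) / total
    for j, level in enumerate(income_levels)
}

# Conditional distributions of income within each age group:
# each row divided by its own row total
conditional = {
    age: [count / sum(row) for count in row] for age, row in table.items()
}

print("Age marginal:   ", {k: round(v, 3) for k, v in age_marginal.items()})
print("Income marginal:", {k: round(v, 3) for k, v in income_marginal.items()})
for age, dist in conditional.items():
    print(f"Income given age {age}:", [round(p, 3) for p in dist])
```

For instance, the share of the over-25 group earning under $15,000 is 2/12, while the share of everyone who is both over 25 and earning under $15,000 is 2/40.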

41 Independence Revisited. Two variables are independent if knowledge of one does not affect the chances of the other. In terms of contingency tables, this means that the conditional distribution of one variable is (almost) the same for all values of the other variable. In the age/income example, the conditionals are not even close. These variables are not independent. There is some association between age and income.

42 Test for Independence
Is there an association between two variables?
– H0: The variables are independent (the two variables are not associated)
– Ha: The variables are not independent (the two variables are associated)
Assuming independence:
– Expected number in each cell (i, j): (% of value i for variable 1) × (% of value j for variable 2) × (sample size) = (row i total) × (column j total) / (sample size)

43 Example of Computing Expected Values

Rh\Blood    A     B    AB     O    Total
+         176    28    22   198     424
-          30    12     4    30      76
Total     206    40    26   228     500

Expected number in cell (A, +):

Rh\Blood    A     B    AB        O         Total
+                      22.048    193.344     424
-                       3.952     34.656      76
Total     206    40    26        228         500
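The expected counts follow from the rule (row total × column total) / sample size. A minimal sketch applied to the Rh/blood-type counts on this slide (the function name is hypothetical):

```python
# Observed counts: rows are Rh (+, -); columns are blood type (A, B, AB, O)
blood_types = ["A", "B", "AB", "O"]
observed = [
    [176, 28, 22, 198],   # Rh +
    [30, 12, 4, 30],      # Rh -
]

def expected_counts(obs):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

expected = expected_counts(observed)
for label, row in zip(["+", "-"], expected):
    cells = ", ".join(f"{bt}: {e:.3f}" for bt, e in zip(blood_types, row))
    print(f"Rh {label}  {cells}")
```

The AB and O columns reproduce the values already shown in the expected table (22.048, 193.344, 3.952, 34.656); the A and B columns fill in the cells left blank.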

44 Chi-square statistic
To measure the difference between the observed table and the expected table, we use the chi-square test statistic: X² = Σ (observed − expected)² / expected, where the summation runs over every cell in the table.
1. Skewed right
2. df = (r – 1)(c – 1)
3. Right-tailed test

45 Test for Independence – Steps
– State variables being tested
– State hypotheses: H0, the null hypothesis, vars independent; Ha, the alternative, vars not independent
– Compute test statistic: if the null hypothesis is true, where does the sample fall? Test stat = X²-score
– Compute p-value: what is the probability of seeing a test stat as extreme (or more extreme) as that?
– Conclusion: small p-values lead to strong evidence against H0.

46 Significance Test – on the calculator
On calculator: STAT, TESTS, C:χ²-Test; Observed: [A], Expected: [B]
Enter observed info into matrix A, then perform the test with Calculate or Draw.
Output: test statistic, p-value, df
To enter observed info into matrix A: 2nd, x⁻¹ (Matrix), EDIT, 1: A, change dimensions, enter info in each cell.

47 Ex. Test whether blood type and Rh factor are independent at a 5% significance level.
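The worked solution is not included in the transcript. As a minimal sketch, the chi-square statistic and degrees of freedom for the Rh/blood-type counts from the expected-values slide can be computed as follows. The standard library has no chi-square distribution, so instead of a p-value this sketch compares the statistic with the tabulated 5% critical value for df = 3 (7.815); the function name is hypothetical.

```python
# Observed counts: rows are Rh (+, -); columns are blood type (A, B, AB, O)
observed = [
    [176, 28, 22, 198],
    [30, 12, 4, 30],
]

def chi_square_independence(obs):
    """Return (X^2 statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(obs):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected under independence
            chi2 += (o - e) ** 2 / e
    df = (len(obs) - 1) * (len(obs[0]) - 1)
    return chi2, df

chi2, df = chi_square_independence(observed)
CRITICAL_5PCT_DF3 = 7.815   # table value for a right-tailed test, df = 3
reject = chi2 > CRITICAL_5PCT_DF3
print(f"X^2 = {chi2:.3f}, df = {df}, reject H0 at 5%: {reject}")
```

With these counts the statistic falls just below the critical value, so under this sketch independence would not be rejected at the 5% level; the calculator's χ²-Test reports the exact p-value.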

48 Ex. Test whether age and stance on marijuana legalization are associated.

stance\age   18-29   30-49   50-   Total
for            172     313   258     743
against         52     103   119     274
Total          224     416   377    1017

49 Additional Examples

personality\college   Health   Science   Lib Arts   Educator
extrovert                 68        56         62         47
introvert                 94        81         45         66

Job grade\marital status   Single   Married   Divorced
15887415
2222345060
374120493

City size\practice status   Government   Judicial   Private   Salaried
<250,000                            30         44       258         36
250-500,000                         79        102       651         90
>500,000                            22         34       127         23

