LISA Short Course: A Tutorial in t-tests and ANOVA using JMP Laboratory for Interdisciplinary Statistical Analysis Anne Ryan Assistant Professor of Practice.

Presentation transcript:

LISA Short Course: A Tutorial in t-tests and ANOVA using JMP Laboratory for Interdisciplinary Statistical Analysis Anne Ryan Assistant Professor of Practice Department of Statistics, VT

Laboratory for Interdisciplinary Statistical Analysis. LISA helps VT researchers benefit from the use of statistics: designing experiments, analyzing data, interpreting results, grant proposals, and using software (R, SAS, JMP, Minitab...). Collaboration: From our website, request a meeting for personalized statistical advice. Great advice right now: meet with LISA before collecting your data. Short Courses: Designed to help graduate students apply statistics in their research. Walk-In Consulting: Available 1-3 PM, Mon-Fri, in the GLC Video Conference Room for questions requiring <30 mins. See our website for additional times and locations. All services are FREE for VT researchers. We assist with research, not class projects or homework.

What's the Assumed Conclusion? Defense: Represent the accused (defendant). Prosecution: Hold the burden of proof, the obligation to shift the assumed conclusion from an oppositional opinion to one's own position through evidence. ANSWER: The accused is innocent until proven guilty. The prosecution must convince the judge/jury that the defendant is guilty beyond a reasonable doubt.

Burden of Proof: Obligation to shift the conclusion using evidence. Trial: Innocent until proven guilty. Hypothesis Test: Accept the status quo (what is believed before) until the data suggest otherwise.

Decision Criteria. Trial: Evidence has to be convincing beyond a reasonable doubt. Hypothesis Test: The result occurs by chance less than 100α% of the time (ex: 5%).

… a procedure that allows us to make statements about a general population using the results of a random sample from that population. Two types of inferential statistics: hypothesis testing, and estimation (point estimates and confidence intervals).

Hypothesis testing is a detailed protocol for decision-making concerning a population by examining a sample from that population.

1. Test 2. Assumptions 3. Hypotheses 4. Mechanics 5. Conclusion

One sample t-test: Used to test whether the population mean is different from a specified value.

Example 1: In a glaucoma study, the following intraocular pressure (mm Hg) values were recorded from a sample of 21 elderly subjects. Based on this data, can we conclude that the mean intraocular pressure of the population from which the sample was drawn differs from 14 mm Hg?* (Data table: Intraocular Pressure.) *Wayne, D. Biostatistics: A Foundation for Analysis in the Health Sciences. 5th ed. New York: John Wiley & Sons,

1. Test: State the name of the testing method to be used. It is important not to be off track at the very beginning.

2. Assumptions: List all the assumptions required for your test to be valid. All tests have assumptions. Even if the assumptions are not met, you should still comment on how this affects your results. Example 1: 2. Assumptions. A simple random sample (SRS) was used to collect the data. The population distribution from which the sample is drawn is normal or approximately normal.

3. Hypotheses: For hypothesis testing there are three versions of the test, determined by the context of the research question: a left-tailed hypothesis test (less than), a right-tailed hypothesis test (greater than), and a two-tailed or two-sided hypothesis test (not equal to).

Example 1: 3. Hypotheses. What are the hypotheses for Example 1? The question asks whether the population mean intraocular pressure differs from 14 mm Hg, so the test is two-tailed: H0: μ = 14 versus Ha: μ ≠ 14.

4. Mechanics: The computational part of the test. Parts of the mechanics step: stating the significance level, finding the rejection rule, computing the test statistic, and computing the p-value.

Test statistic for a one sample t-test
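For reference, the standard one sample t statistic can be written as below. The symbols are notation assumed here rather than taken from the slide: $\bar{x}$ is the sample mean, $s$ the sample standard deviation, $n$ the sample size, and $\mu_0$ the hypothesized mean.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \text{df} = n - 1
```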

Example 1: 4. Mechanics Test Statistic:

p-value: After computing the test statistic, you can compute the p-value. A p-value is the probability, assuming the null hypothesis is true, of obtaining a point estimate at least as extreme as the current value, where the definition of "extreme" is taken from the alternative hypothesis. The p-value depends on the alternative hypothesis, so there are three ways to compute p-values. Put another way, the p-value is the chance of observing your sample results or more extreme results assuming that the null hypothesis is true. If this chance is small, then you may decide the claim in the null hypothesis is false.

Example 1: 4. Mechanics p-value: JMP will give the 3 p-values, and you must select the correct p-value based on your alternative hypothesis.
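To illustrate how the three p-values relate to one test statistic, here is a minimal sketch using only the Python standard library. The function names (`t_pdf`, `t_cdf`, `one_sample_t`) and the sample values are hypothetical illustrations; they are not the glaucoma data and not how JMP computes its output. The t distribution CDF is approximated with Simpson's rule.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) with Simpson's rule on [0, |t|] plus symmetry."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


def one_sample_t(sample, mu0):
    """Return the t statistic, degrees of freedom and the three p-values."""
    n = len(sample)
    df = n - 1
    t = (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))
    lower = t_cdf(t, df)
    return {
        "t": t,
        "df": df,
        "p_less": lower,                              # Ha: mu < mu0
        "p_greater": 1 - lower,                       # Ha: mu > mu0
        "p_two_sided": 2 * min(lower, 1 - lower),     # Ha: mu != mu0
    }


# Hypothetical measurements (NOT the study's data), tested against mu0 = 14.
result = one_sample_t([15.0, 16.5, 14.2, 17.1, 15.8, 16.0], 14.0)
print(result)
```

For a two-tailed question such as Example 1, the entry to read is `p_two_sided`; the other two correspond to the left-tailed and right-tailed alternatives.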

Note: The significance level can be thought of as a tolerance for things happening by chance. If we set α = .05, then we are saying that what we observe may be out of the ordinary, but unless it is something that occurs less than 5% of the time, we will attribute it to chance.

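2-Tailed Test: null hypothesis H0: μ = μ0, alternative hypothesis Ha: μ ≠ μ0. Right-Tailed: null hypothesis H0: μ = μ0, alternative hypothesis Ha: μ > μ0. Left-Tailed: null hypothesis H0: μ = μ0, alternative hypothesis Ha: μ < μ0.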

In a glaucoma study, the following intraocular pressure (mm Hg) values were recorded from a sample of 21 elderly subjects. Based on this data, can we conclude that the mean intraocular pressure of the population from which the sample was drawn differs from 14 mm Hg?*

JMP Demonstration: Open Pressure.jmp. Select Analyze -> Distribution. Complete the dialog box as shown and select OK. Select the red arrow next to Pressure and select Test Mean. Complete the dialog box as shown and select OK. Select the red arrow next to Pressure and select Confidence Interval->

The normal quantile plot may also be created in JMP to check the normality assumption. The assumption is met if the points fall close to the red line.

Two sample t-tests are used to determine whether the population mean of one group is equal to, larger than or smaller than the population mean of another group.

The major goal is to determine whether a difference exists between two populations. Examples: Compare blood pressure for males and females. Compare the proportion of smokers and nonsmokers with lung cancer. Compare weight before and after treatment. Is the mean cholesterol of people taking drug A lower than the mean cholesterol of people taking drug B?

The population means of the two groups are not equal. H0: μ1 = μ2, Ha: μ1 ≠ μ2. The population mean of group 1 is greater than the population mean of group 2. H0: μ1 = μ2, Ha: μ1 > μ2. The population mean of group 1 is less than the population mean of group 2. H0: μ1 = μ2, Ha: μ1 < μ2.

The two samples are random and independent. The populations from which the samples are drawn are either normal or the sample sizes are large. The populations have the same standard deviation.

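2-Tailed Test: null H0: μ1 = μ2, alternative Ha: μ1 ≠ μ2. Right-Tailed: null H0: μ1 = μ2, alternative Ha: μ1 > μ2. Left-Tailed: null H0: μ1 = μ2, alternative Ha: μ1 < μ2. Assumption: The populations from which both samples are drawn are normal or approximately normal.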
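Under the equal standard deviation assumption listed for the two sample t-test, the usual test statistic pools the two sample variances. The sketch below shows that calculation with the standard library; the function name `pooled_t` and the numbers are hypothetical and are not the iris measurements.

```python
import math
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Pooled two sample t statistic and its degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(sample1) - mean(sample2)) / se, df


# Hypothetical widths for two groups (illustration only).
group_a = [3.5, 3.0, 3.2, 3.1, 3.6]
group_b = [3.2, 2.8, 2.3, 2.8, 2.9]
t_stat, df = pooled_t(group_a, group_b)
print(t_stat, df)
```

The p-value then comes from a t distribution with `df` degrees of freedom, using the tail or tails that match the alternative hypothesis.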

A researcher would like to know whether the mean sepal width of setosa irises is different from the mean sepal width of versicolor irises. The researcher randomly selects 50 setosa irises and 50 versicolor irises and measures their sepal widths. Step 1 Hypotheses: H0: μ setosa = μ versicolor, Ha: μ setosa ≠ μ versicolor. (wiki/Iris_flower_data_set, wiki/Iris_versicolor)

Steps 2-4: JMP Demonstration: Analyze -> Fit Y By X. Y, Response: Sepal Width. X, Factor: Species. Means/ANOVA/Pooled t. Normal Quantile Plot -> Plot Actual by Quantile.

Step 5 Conclusion: There is strong evidence (p-value < ) that the mean sepal widths for the two varieties are different.

The paired t-test is used to compare the population means of two groups when the samples are dependent.

The objective of paired comparisons is to minimize sources of variation that are not of interest in the study by pairing observations with similar characteristics. Example: A researcher would like to determine if background noise causes people to take longer to complete math problems. The researcher gives 20 subjects two math tests, one with complete silence and one with background noise, and records the time each subject takes to complete each test.

The population mean difference is not equal to zero. H0: μ difference = 0, Ha: μ difference ≠ 0. The population mean difference is greater than zero. H0: μ difference = 0, Ha: μ difference > 0. The population mean difference is less than zero. H0: μ difference = 0, Ha: μ difference < 0.

The sample is random. The data are matched pairs. The differences have a normal distribution or the sample size is large.

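2-Tailed: null H0: μ difference = 0, alternative Ha: μ difference ≠ 0. Right-Tailed: null H0: μ difference = 0, alternative Ha: μ difference > 0. Left-Tailed: null H0: μ difference = 0, alternative Ha: μ difference < 0. Assumption: The population of differences is normal or approximately normal.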
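A paired t-test is a one sample t-test applied to the within-pair differences, tested against zero. The sketch below makes that explicit; the function name `paired_t` and the before/after values are hypothetical, not the flexibility data.

```python
import math
from statistics import mean, stdev


def paired_t(before, after):
    """t statistic and degrees of freedom for the differences after - before."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical measurements for five participants (illustration only).
before = [18.5, 21.5, 16.5, 21.0, 20.0]
after = [19.3, 21.8, 17.0, 20.5, 20.6]
t_stat, df = paired_t(before, after)
print(t_stat, df)
```

Working with the differences is what removes subject-to-subject variation from the comparison.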

A researcher would like to determine whether a fitness program increases flexibility. The researcher measures the flexibility (in inches) of 12 randomly selected participants before and after the fitness program. Step 1: Formulate a Hypothesis. H0: μ After - Before = 0, Ha: μ After - Before > 0.

Steps 2-4: JMP Analysis: Create a new column of After - Before. Analyze -> Distribution. Y, Columns: After - Before. Normal Quantile Plot. Test Mean. Specify Hypothesized Mean: 0.

Step 5 Conclusion: There is no evidence that the fitness program increases flexibility.

ANOVA is used to determine whether three or more populations have different distributions.

(Figure: distributions for medical treatments A, B and C.)

The first step is to use the ANOVA F test to determine if there are any significant differences among the population means. If the ANOVA F test shows that the population means are not all the same, then follow-up tests can be performed to see which pairs of population means differ.

In other words, for each group the observed value is the group mean plus some random variation.
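One common way to write this one-way ANOVA model is shown below. The notation is assumed here rather than taken from the slide: $y_{ij}$ is observation $j$ in group $i$, $\mu_i$ is the population mean of group $i$, and $\varepsilon_{ij}$ is the random variation.

```latex
y_{ij} = \mu_i + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma^2)
```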

Step 1: We test whether there is a difference in the population means.

The samples are random and independent of each other. The populations are normally distributed. The populations all have the same standard deviations. The ANOVA F test is robust to departures from the assumptions of normality and equal standard deviations.

Compare the variation within the samples to the variation between the samples. (Figure: samples for medical treatments A, B and C.)

Variation within groups small compared with variation between groups -> Large F. Variation within groups large compared with variation between groups -> Small F.

The mean square for groups, MSG, measures the variability of the sample averages. SSG stands for sum of squares for groups.

Mean square error, MSE, measures the variability within the groups. SSE stands for sum of squares for error.
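The F statistic is the ratio MSG / MSE, where MSG = SSG / (k - 1) and MSE = SSE / (N - k) for k groups and N observations in total. The following sketch computes these pieces with the standard library; the function name `anova_f` and the pain scores are hypothetical and are not the course data.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic with its numerator and denominator df."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group variation: group means around the grand mean.
    ssg = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: observations around their own group mean.
    sse = sum((y - mean(g)) ** 2 for g in groups for y in g)
    msg = ssg / (k - 1)
    mse = sse / (n_total - k)
    return msg / mse, k - 1, n_total - k


# Hypothetical pain scores for three drugs (illustration only).
drug_a = [4.0, 5.0, 4.5, 5.5]
drug_b = [6.0, 6.5, 7.0, 5.5]
drug_c = [8.0, 7.5, 9.0, 8.5]
f_stat, df_groups, df_error = anova_f([drug_a, drug_b, drug_c])
print(f_stat, df_groups, df_error)
```

A large F means the group means vary more than the within-group scatter would suggest; the p-value is the upper tail of the F distribution with `df_groups` and `df_error` degrees of freedom.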

Step 4: Calculate the p-value. Step 5: Write a conclusion.

A researcher would like to determine if three drugs provide the same relief from pain. 60 patients are randomly assigned to a treatment (20 people in each treatment). Step 1: Formulate the Hypotheses. H0: μ Drug A = μ Drug B = μ Drug C. Ha: The μ i are not all equal.

JMP demonstration: Analyze -> Fit Y By X. Y, Response: Pain. X, Factor: Drug. Normal Quantile Plot -> Plot Actual by Quantile. Means/ANOVA.

Step 5 Conclusion: There is strong evidence that the drugs are not all the same.

The p-value of the overall F test indicates that the level of pain is not the same for patients taking drugs A, B and C. We would like to know which pairs of treatments are different. One method is to use Tukey's HSD (honestly significant differences).

Tukey's test simultaneously tests H0: μ i = μ j versus Ha: μ i ≠ μ j for all pairs of factor levels. Tukey's HSD controls the overall type I error. JMP demonstration: Oneway Analysis of Pain By Drug -> Compare Means -> All Pairs, Tukey HSD.

The JMP output shows that drugs A and C are significantly different.

We are interested in the effect of two categorical factors on the response. We are interested in whether either of the two factors has an effect on the response and whether there is an interaction effect. An interaction effect means that the effect on the response of one factor depends on the level of the other factor.
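An interaction can be seen descriptively by comparing the effect of one factor at each level of the other. The sketch below does this with cell means; the function names and the strength values are hypothetical, with factor names borrowed from the alloy and temperature example only for illustration. It describes the pattern and is not a substitute for the F test of the interaction term.

```python
from statistics import mean


def cell_means(rows):
    """Average the response within each (alloy, temp) cell."""
    cells = {}
    for alloy, temp, strength in rows:
        cells.setdefault((alloy, temp), []).append(strength)
    return {cell: mean(values) for cell, values in cells.items()}


def alloy_effect_by_temp(means):
    """High-alloy cell mean minus low-alloy cell mean at each temperature."""
    temps = sorted({temp for _, temp in means})
    return {t: means[("high", t)] - means[("low", t)] for t in temps}


# Hypothetical wire strengths: (alloy level, cooling temperature, strength).
rows = [
    ("low", "low", 30.0), ("low", "low", 32.0),
    ("low", "medium", 34.0), ("low", "medium", 36.0),
    ("low", "high", 38.0), ("low", "high", 40.0),
    ("high", "low", 41.0), ("high", "low", 43.0),
    ("high", "medium", 39.0), ("high", "medium", 41.0),
    ("high", "high", 33.0), ("high", "high", 35.0),
]
effects = alloy_effect_by_temp(cell_means(rows))
print(effects)
```

If the alloy effect were roughly the same at every temperature, there would be little sign of interaction; effects that change size or sign across temperatures, as in these made-up numbers, are what an interaction looks like.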

We would like to determine the effect of two alloys (low, high) and three cooling temperatures (low, medium, high) on the strength of a wire. JMP demonstration: Analyze -> Fit Model. Y: Strength. Highlight Alloy and Temp and click Macros -> Factorial to Degree. Run Model.

Conclusion: There is strong evidence of an interaction between alloy and temperature.

The one sample t-test allows us to test whether the population mean of a group is equal to a specified value. The two-sample t-test and paired t-test allow us to determine if the population means of two groups are different. ANOVA allows us to determine whether the population means of several groups are different.

For information about using SAS, SPSS and R to do ANOVA: a.htm htm

Fisher's Irises Data (used in one sample and two sample t-test examples). Flexibility data (paired t-test example): Michael Sullivan III. Statistics: Informed Decisions Using Data. Upper Saddle River, New Jersey: Pearson Education, 2004: