Non-Parametric Statistics


Non-Parametric Statistics A Presentation by Rob McMullen for AP Statistics

What are Non-Parametric Statistics?
Non-parametric statistics are a branch of statistics that helps statisticians with a problem occurring in parametric statistics. To understand what non-parametric statistics are, it is first necessary to know what parametric statistics are.

What are Parametric Statistics?
In AP Statistics, when we refer to a distribution we often make certain assumptions about it that enable us to work with it. One thing that helps us with this is the Central Limit Theorem (CLT), which allows us to assume that many sampling distributions are approximately normal. The theorem tells us that for any distribution with a finite mean and variance, the sampling distribution of the sample mean is approximately normal once the sample size is large enough.
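As a rough illustration of the Central Limit Theorem, the sketch below simulates it. The population (an Exponential(1) distribution, which is strongly right-skewed), the sample size of 50, and the function name `simulate_sample_means` are all assumptions chosen for this example, not part of the original presentation.

```python
import random
import statistics

def simulate_sample_means(sample_size, num_samples, seed=0):
    """Draw repeated samples from a right-skewed Exponential(1) population
    and return the mean of each sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(num_samples)
    ]

# Exponential(1) has mean 1 and standard deviation 1.
means = simulate_sample_means(sample_size=50, num_samples=2000)
center = statistics.fmean(means)
spread = statistics.stdev(means)

print(f"mean of sample means: {center:.3f}")  # should be close to 1
print(f"SD of sample means:   {spread:.3f}")  # should be close to 1/sqrt(50), about 0.141
```

Even though the population itself is far from normal, a histogram of `means` would look roughly bell-shaped, which is the property the parametric tests rely on.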

When are Parametric Statistics not useful?
When we do significance tests, we rely on the assumption that the test statistic follows the t-distribution or the z-distribution, depending on the situation. When this assumption is not true, our tests, which are called “parametric statistical inference tests,” are not reliable. Everything we have done in AP stats has been in the field of “parametric statistics.”

Why does lack of normality cause problems?
When we calculate the p-value for an inference test, we find the probability of getting a result at least as extreme as the one observed through sampling variability alone, assuming the null hypothesis is true. Basically, we are trying to see if a recorded value could have occurred by chance and chance alone. When we look for a p-value, we are assuming that the means of all samples of the given sample size are normally distributed around the population mean. This is why we can use the test statistic, which is the number of standard deviations (of the sampling distribution) that the sample mean lies from the hypothesized population mean. Therefore, without normality, a p-value found this way cannot be trusted.
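To make the role of normality concrete, here is a minimal sketch of a one-sample z-test. The function name `z_test_p_value` and the numbers in the usage lines are hypothetical. The p-value step is only valid if the sample mean really is approximately normally distributed.

```python
import math

def z_test_p_value(sample_mean, pop_mean, pop_sd, n):
    """Return the z statistic and the two-sided p-value,
    assuming the sample mean is normally distributed."""
    standard_error = pop_sd / math.sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Two-sided normal tail area: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical numbers: sample mean 52 from n = 25, tested against mean 50, SD 10.
z, p = z_test_p_value(sample_mean=52, pop_mean=50, pop_sd=10, n=25)
print(f"z = {z:.2f}, p = {p:.4f}")  # z = 1.00, p = 0.3173
```

The conversion from z to a p-value uses the normal curve directly, so if the sampling distribution is not close to normal, the number it returns does not mean what we want it to mean.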

What are Non-Parametric Statistics?
The way in which statisticians deal with this problem of parametric statistics is the field of non-parametric statistics. These are tests that can be done without the assumption of normality or approximate normality, and in many cases without assuming symmetry. These tests do not require a mean and standard deviation. Since a standard deviation describes spread poorly when a distribution is strongly skewed or has outliers, it is not useful for many distributions anyway.

What is different about Non-Parametric Statistics?
Sometimes statisticians use what is called “ordinal” data. This data is obtained by taking the raw data and giving each observation a rank. These ranks are then used to create test statistics. In non-parametric statistics, one deals with the median rather than the mean. Since a mean can be easily influenced by outliers or skewness, and we are not assuming normality, a mean is often no longer the most sensible measure of center. The median is another measure of location, which makes more sense in a non-parametric test. The median is considered the center of a distribution.
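A quick way to see why the median is preferred is to compare it with the mean when one value is extreme. The two small data sets below are made up for illustration.

```python
import statistics

typical = [2, 3, 3, 4, 5]
with_outlier = [2, 3, 3, 4, 50]  # same data, but the largest value is an outlier

mean_typical = statistics.mean(typical)
median_typical = statistics.median(typical)
mean_outlier = statistics.mean(with_outlier)
median_outlier = statistics.median(with_outlier)

print(mean_typical, median_typical)  # 3.4 3
print(mean_outlier, median_outlier)  # 12.4 3
```

A single outlier moves the mean from 3.4 to 12.4, while the median stays at 3.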

Each non-parametric test is similar to one of the parametric tests covered in AP stats, but slightly different. The following table shows how some of the tests match up.

Parametric Test | Goal for Parametric Test | Non-Parametric Test | Goal for Non-Parametric Test
Two-Sample T-Test | To see if two samples come from populations with identical means | Wilcoxon Rank-Sum Test | To see if two samples come from populations with identical medians
One-Sample T-Test | To test a hypothesis about the mean of the population a sample was taken from | Wilcoxon Signed Ranks Test | To test a hypothesis about the median of the population a sample was taken from
Chi-Squared Test for Goodness of Fit | To see if a sample fits a theoretical distribution, such as the normal curve | Kolmogorov-Smirnov Test | To see if a sample could have come from a certain distribution
ANOVA | To see if two or more sample means are significantly different | Kruskal-Wallis Test | To test if two or more sample medians are significantly different

ANOVA
What is an ANOVA?
When are ANOVAs useful?
How does one carry out an ANOVA?

ANOVA: What is an ANOVA?
Since ANOVAs were not covered in AP stats, I will now explain them. An ANOVA is a way to compare multiple sample means to see if they are significantly different. The name comes from a phrase that describes what the test does: ANalysis Of VAriance = ANOVA. An ANOVA looks at the variance between the sample means and decides whether the differences among them are significant. This can be done to compare two or more samples.

ANOVA: When are ANOVAs useful?
An ANOVA can be used when one wants to compare any number of samples. This test can be done to see if many samples could have come from the same population. It can also tell you about the differences between two or more areas. For example, if a survey is conducted in many different towns, you can see if their average responses differ significantly. Similarly, you can take samples of plant growth in different climates, in different soils, or with different treatments. In all cases, an ANOVA can be used to see if the means vary significantly.

ANOVA: How does one carry out an ANOVA?
An ANOVA is conducted by first putting all the samples into one large sample. The standard deviation of this combined sample is then found, and called σ. Next, the range of variation in sample means is found as σ/√N, where N is the number of observations in each sample. The difference between each pair of sample means is then found, which is the variation of the means. If any one of these differences is greater than the range of variation, then those two means are significantly different from each other. Depending on your goal, this may cause you to reject your null hypothesis. (This is a simplified description; a full ANOVA compares the variance between the groups with the variance within the groups using an F statistic.)
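For comparison with the simplified description on this slide, the sketch below computes the standard one-way ANOVA F statistic, which is the textbook form of the test rather than the shortcut described here. The function name `one_way_anova_f` and the three small groups are hypothetical.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples."""
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical data: three groups of three observations each.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F = {f_stat:.1f}")  # F = 13.0
```

A large F means the group means differ by more than the spread within the groups would explain. Turning F into a p-value requires the F distribution with (k - 1, N - k) degrees of freedom, which is left out of this sketch.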

EXAMPLE
Now that I have explained the background principles of non-parametric statistics, I will carry out an example of one of the tests. I have chosen the Wilcoxon Rank-Sum Test (also called the Wilcoxon-Mann-Whitney Test) because it is one of the most commonly used tests.

The Wilcoxon Rank-Sum Test
The Wilcoxon Rank-Sum Test is used in place of the two-sample t-test when the sampling distributions of the variables being compared are not normal. This test requires two independent samples of sizes n1 and n2. The test is carried out as follows. Each step is followed by an example from a real test.

The Wilcoxon Rank-Sum Test
1: The first step in this procedure is to collect two samples.
Sample 1: {3,2,12,9,13,7,9,11,4,5,6} n1 = 11
Sample 2: {1,8,4,15,12,6,10,14,3,3} n2 = 10

The Wilcoxon Rank-Sum Test
2: The second step is to combine the two samples into one large sample. Simply take all the data values from each sample and make one large group. Keep track of which sample each value came from, as the data will have to be separated back into the original samples later.
Combined sample size: n1 + n2 = 11 + 10 = 21
{3,2,12,9,13,7,9,11,4,5,6} and {1,8,4,15,12,6,10,14,3,3} become:
{3,2,12,9,13,7,9,11,4,5,6,1,8,4,15,12,6,10,14,3,3}

The Wilcoxon Rank-Sum Test
3: Once all the data is in one sample, the data must be put into order by size. The data should go from smallest to largest.
{3,2,12,9,13,7,9,11,4,5,6,1,8,4,15,12,6,10,14,3,3}
In order is:
{1,2,3,3,3,4,4,5,6,6,7,8,9,9,10,11,12,12,13,14,15}

The Wilcoxon Rank-Sum Test
4: Each data value is given a rank based on size. If two or more data have the same value, their rank is the average of the ranks. This step is when the raw data becomes ordinal data, or ranked data.
Combined sample in order is (sample size 21):
{1,2,3,3,3,4,4,5,6,6,7,8,9,9,10,11,12,12,13,14,15}
Each data value is ranked 1-21:
RANK:     1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21
RAW DATA: 1  2  3  3  3  4  4  5  6  6  7  8  9  9  10 11 12 12 13 14 15

RANK:     1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21
RAW DATA: 1  2  3  3  3  4  4  5  6  6  7  8  9  9  10 11 12 12 13 14 15
When two or more data have the same value, their ranks are averaged. Therefore, the data becomes:
RANK:     1    2    4    4    4    6.5  6.5  8    9.5  9.5  11   12   13.5 13.5 15   16   17.5 17.5 19   20   21
RAW DATA: 1    2    3    3    3    4    4    5    6    6    7    8    9    9    10   11   12   12   13   14   15
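The tie-averaging rule in step 4 can be sketched in a few lines. The helper name `average_ranks` is hypothetical; the data are the combined sample from the example.

```python
def average_ranks(values):
    """Map each distinct value to its rank in the sorted data,
    averaging the ranks of tied values."""
    positions = {}
    for position, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(position)
    return {value: sum(p) / len(p) for value, p in positions.items()}

combined = [3, 2, 12, 9, 13, 7, 9, 11, 4, 5, 6,
            1, 8, 4, 15, 12, 6, 10, 14, 3, 3]
ranks = average_ranks(combined)

print(ranks[3])   # 4.0  (the three 3s occupy positions 3, 4 and 5)
print(ranks[4])   # 6.5  (the two 4s occupy positions 6 and 7)
print(ranks[12])  # 17.5 (the two 12s occupy positions 17 and 18)
```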

The Wilcoxon Rank-Sum Test
5: The data are then put back into their original sampling groups as ranked data.
RANK:     1    2    4    4    4    6.5  6.5  8    9.5  9.5  11   12   13.5 13.5 15   16   17.5 17.5 19   20   21
RAW DATA: 1    2    3    3    3    4    4    5    6    6    7    8    9    9    10   11   12   12   13   14   15
Original Sample 1: {3,2,12,9,13,7,9,11,4,5,6}
Original Sample 2: {1,8,4,15,12,6,10,14,3,3}
Ranked Sample 1: {4,2,17.5,13.5,19,11,13.5,16,6.5,8,9.5}
Ranked Sample 2: {1,12,6.5,21,17.5,9.5,15,20,4,4}

The Wilcoxon Rank-Sum Test
6: The sum of the ranks is taken for each sample. This is the test statistic.
Ranked Sample 1: {4,2,17.5,13.5,19,11,13.5,16,6.5,8,9.5}
Ranked Sample 2: {1,12,6.5,21,17.5,9.5,15,20,4,4}
Sum of sample 1: 120.5
Sum of sample 2: 110.5
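Steps 2 through 6 can be checked with a short script. This is only a sketch: the function name `rank_sums` is hypothetical, and the last two quantities go slightly beyond the slides. They are the rank sum expected for sample 1 if the two populations were identical, n1(n1 + n2 + 1)/2, and the equivalent Mann-Whitney U statistic, W1 - n1(n1 + 1)/2.

```python
from collections import Counter

def rank_sums(sample1, sample2):
    """Return the rank sum of each sample after ranking the combined data,
    giving tied values the average of the positions they occupy."""
    combined = sorted(list(sample1) + list(sample2))
    counts = Counter(combined)
    rank_of = {}
    for position, value in enumerate(combined, start=1):
        if value not in rank_of:
            # First position of the tie group plus half the width of the group
            rank_of[value] = position + (counts[value] - 1) / 2
    w1 = sum(rank_of[x] for x in sample1)
    w2 = sum(rank_of[x] for x in sample2)
    return w1, w2

sample1 = [3, 2, 12, 9, 13, 7, 9, 11, 4, 5, 6]
sample2 = [1, 8, 4, 15, 12, 6, 10, 14, 3, 3]
n1, n2 = len(sample1), len(sample2)

w1, w2 = rank_sums(sample1, sample2)
expected_w1 = n1 * (n1 + n2 + 1) / 2  # rank sum expected if there is no difference
u1 = w1 - n1 * (n1 + 1) / 2           # Mann-Whitney U for sample 1

print(w1, w2)       # 120.5 110.5
print(expected_w1)  # 121.0
print(u1)           # 54.5
```

The observed rank sum for sample 1 (120.5) is almost exactly what would be expected with no difference between the populations (121), so these two samples give little evidence that the medians differ. Finding an actual p-value needs a table of the rank-sum distribution or a normal approximation, neither of which is covered here.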

The Wilcoxon Rank-Sum Test
SUMMARY:
1: Two samples are taken.
2: The samples are combined to make one distribution of sample size (n1+n2).
3: The data are put into order, based on size.
4: Each data value is given a rank based on size. If two or more data have the same value, their rank is the average of the ranks.
5: The data are then put back into their original sampling groups as ranked data.
6: The sum of the ranks is taken for each sample. This is the test statistic.

Non-Parametric Statistics
This concludes my presentation. Are there any topics which have been covered that are not clear, which you would like to see again?
Wilcoxon Rank-Sum Test explanation/example
Explanation of an ANOVA
Introduction to Non-Parametric Statistics
Chart comparing Significance Tests

THANK YOU
I would like to thank you for taking the time to view this presentation. If you have any questions regarding this topic, you may email me at Robert_McMullen@BBNS.org. I hope that this has been informative and that you now clearly understand what non-parametric statistics are.