STATISTICAL ANALYSIS. Your introduction to statistics should not be like drinking water from a fire hose!!


What do you mean by data??

Nature of the data. Two main types: categorical or continuous.

1. Categorical:
   Nominal (unordered, unequal categories). E.g.: female = 1 and male = 2.
   Ordinal (ordered or ranked categories with unequal intervals). E.g.: 1 = SD, 2 = D, 3 = N, 4 = A, 5 = SA.
2. Continuous:
   Interval (ordered, equal intervals, no true zero). E.g.: a 5-point Likert scale treated as having equal intervals, or an IQ score.
   Ratio (ordered, equal intervals with an absolute zero). E.g.: raw scores; class attendance (in days); age (in years).

Descriptive statistics: procedures used for summarizing the data in both numerical and graphic form. Includes frequencies, distributions, percents, cumulative percents, pie charts, bar graphs, histograms and scatter plots. (Cross-tabulations summarize the relationship between two variables, like a scatter plot but in table form.)

Measures of central tendency:
   Mean: arithmetic average (interval and ratio data only).
   Mode: most frequent value; a distribution can be bimodal or multimodal (all types of data).
   Median: midpoint, with half of the values above and half below (ordinal, interval and ratio data).
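As a rough illustration of the three measures of central tendency, the sketch below uses Python's standard `statistics` module. The `responses` and `bimodal` lists are made-up values for illustration only, not data from the slides.

```python
import statistics

# Hypothetical responses on a 5-point scale (illustrative values only).
responses = [1, 2, 2, 3, 3, 3, 4, 5, 5]

# Mean: arithmetic average (meaningful for interval and ratio data).
mean_value = statistics.mean(responses)

# Median: the middle value once the data are sorted (ordinal data and up).
median_value = statistics.median(responses)

# Mode: the most frequent value(s); multimode returns every tied mode,
# so a bimodal distribution yields two values.
modes = statistics.multimode(responses)
bimodal = statistics.multimode([1, 1, 2, 2, 3])

print(mean_value, median_value, modes, bimodal)
```

`multimode` is used rather than `mode` so that bimodal or multimodal data report every mode instead of only the first one found.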

Statistics 101!!

Measures of location: mean vs. median, and why.
Measures of scale: range, interquartile range, standard deviation (and variance).
Measures of position: percentiles, deciles, quartiles, median.

Note. For categorical variables, we use proportions as the descriptive statistics.
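The measures of scale and position listed above can be sketched with the standard library as follows. The `scores` list is hypothetical, and the choice of the "inclusive" quartile method is an assumption; textbooks and software packages differ in how they interpolate quartiles.

```python
import statistics

# Hypothetical raw scores (illustrative values only).
scores = [2, 4, 4, 5, 7, 9, 10, 12]

# Measures of scale.
data_range = max(scores) - min(scores)
sample_sd = statistics.stdev(scores)      # sample standard deviation
sample_var = statistics.variance(scores)  # sample variance (sd squared)

# Measures of position: the three quartile cut points.
# method="inclusive" treats the data as the whole population of interest;
# other interpolation conventions give slightly different quartiles.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range

print(data_range, sample_sd, sample_var, (q1, q2, q3), iqr)
```

The second quartile is the median, which is why the median appears under both location and position.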

Why does lack of normality cause problems? When we calculate the p-value for an inference test, we find the probability of seeing a result at least as extreme as the one observed if the null hypothesis were true and only sampling variability were at work. Basically, we are trying to see whether a recorded value could have occurred by chance and chance alone. When we look up a p-value for a parametric test, we are assuming that the means of all samples of the given sample size are normally distributed around the population mean. This is why the test statistic, which is the number of standard errors the sample mean lies from the hypothesized population mean, can be used. Therefore, without normality (or a sample large enough for the sample means to be approximately normal), the p-value from a parametric test cannot be trusted.

There are non-parametric tests which are similar to the parametric tests. The following table shows how some of the tests match up.

Parametric test: Two-Sample T-Test
   Goal: to see if two samples have identical population means
Non-parametric test: Wilcoxon Rank-Sum Test
   Goal: to see if two samples have identical population medians

Parametric test: One-Sample T-Test
   Goal: to test a hypothesis about the mean of the population a sample was taken from
Non-parametric test: Wilcoxon Signed-Ranks Test
   Goal: to test a hypothesis about the median of the population a sample was taken from

Parametric test: Chi-Squared Test for Goodness of Fit
   Goal: to see if a sample fits a theoretical distribution, such as the normal curve
Non-parametric test: Kolmogorov-Smirnov Test
   Goal: to see if a sample could have come from a certain distribution

Parametric test: ANOVA
   Goal: to see if two or more sample means are significantly different
Non-parametric test: Kruskal-Wallis Test
   Goal: to test if two or more sample medians are significantly different

What is different about Non-Parametric Statistics? Sometimes statisticians use what is called “ordinal” data. This data is obtained by taking the raw data and giving each observation a rank. These ranks are then used to create test statistics. In non-parametric statistics, one deals with the median rather than the mean. Since a mean can be easily influenced by outliers or skewness, and we are not assuming normality, a mean no longer makes sense. The median is another measure of location, which makes more sense in a non-parametric test. The median is considered the center of a distribution.
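To make the ranking step concrete, here is a minimal sketch of how raw data might be converted to ranks and how a rank-based statistic can be built from them. The helper names `average_ranks` and `rank_sum` are hypothetical, ties are given the average of the ranks they span (a common convention, assumed here), and the sample values are made up.

```python
import statistics

def average_ranks(values):
    """Rank values from 1 upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value is tied.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def rank_sum(sample_a, sample_b):
    """Sum of the ranks of sample_a within the pooled data
    (the statistic used by the Wilcoxon rank-sum test)."""
    ranks = average_ranks(list(sample_a) + list(sample_b))
    return sum(ranks[:len(sample_a)])

print(average_ranks([10, 20, 20, 30]))
print(rank_sum([1, 2, 3], [4, 5, 6]))

# One outlier moves the mean a long way but leaves the median alone.
with_outlier = [1, 2, 3, 4, 100]
print(statistics.mean(with_outlier), statistics.median(with_outlier))
```

Because only the ordering of the observations enters the statistic, replacing 100 with any other value larger than 4 would leave the ranks unchanged.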

Drawing a histogram..the good the bad and the downright ugly!! Many modern introductory texts confuse frequency graphs, relative frequency graphs, and histograms.

[Figures: a bad example and a good example of a histogram]

What's the difference between a bar chart & a Histogram??

Critical Values

For a given number of degrees of freedom, by the property of the t-distribution, we know how large the t-statistic must be in order to reject the null hypothesis. We call that number the “critical value” of the t-statistic; it is typically looked up in a table of the t-distribution. If the value of the t-statistic calculated from the data is greater than this critical value, then we “reject the null hypothesis.” This is because, for t-statistics greater than this critical value, our probability of falsely rejecting the null hypothesis is smaller than the significance level we chose.

Example

Suppose our null hypothesis is that the mean of X is less than or equal to 0. The sample mean is 3; the sample standard deviation is 2; there are 121 observations.

Step 1. We need to establish our “critical value.” We wish to reject the null hypothesis if we are 95% certain that it is false. For 121 observations and a “one-tailed test,” the critical value is 1.66 (which we look up in the table; this corresponds to a significance level of .05 with 120 degrees of freedom).

Step 2. The t-statistic is $t = \frac{3 - 0}{2 / \sqrt{121}} = \frac{3}{0.182} \approx 16.5$.

Step 3. Compare the t-statistic with the critical value. If the t-statistic is greater than the critical value, then you can reject the null hypothesis. In this case, 16.5 is greater than 1.66, so we can reject the null hypothesis that the mean of X is less than or equal to zero.
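The three steps of this example can be reproduced in a few lines. This is only a sketch: the function name `one_sample_t` is hypothetical, and the critical value 1.66 is taken from the slide's table lookup rather than computed. Using the unrounded standard error 2/11, the statistic comes out at 16.5.

```python
import math

def one_sample_t(sample_mean, null_mean, sample_sd, n):
    """t = (sample mean - hypothesized mean) / (sd / sqrt(n))."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - null_mean) / standard_error

# Values from the example: mean 3, sd 2, 121 observations, null mean 0.
t_stat = one_sample_t(sample_mean=3, null_mean=0, sample_sd=2, n=121)

# One-tailed critical value at the .05 level with 120 degrees of freedom,
# as read from a t-table in the example (not computed here).
critical_value = 1.66

reject_null = t_stat > critical_value
print(t_stat, reject_null)
```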

Example

The table below is a sample “cross-tab.” Your research hypothesis is that dog ownership and gender are related. How do you test this hypothesis?

           Dog-Owners   No Pets   Totals
Men               100       400      500
Women              50       450      500
Totals            150       850    1,000

Hypothesis Tests about tables

Step 1. Define null and research hypotheses. The null hypothesis will usually be that there is no relationship between the rows and the columns.
Step 2. Determine your tolerance for falsely rejecting the null hypothesis of no relationship.
Step 3. Empirically analyse the data to determine if there is a relationship.

Example

To test for independence:
1) Identify the number of respondents in each internal cell of the table.
2) Calculate the number of respondents who would be in each cell if the rows and columns were independent (shown in parentheses in each cell), e.g. cell(1,1) = .5 * .15 * 1000 = 75 and cell(1,2) = .5 * .85 * 1000 = 425.
3) Compute the chi-squared test statistic (next slide).

           Dog-Owners    No Pets     Totals
Men          100 (75)   400 (425)      500
Women         50 (75)   450 (425)      500
Totals       150        850          1,000
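Step 2 can be sketched as follows: under independence, each expected count is the row total times the column total divided by the grand total, which is the same as multiplying the row proportion, the column proportion and the sample size. The function name `expected_counts` is a hypothetical helper; the observed counts come from the cross-tab.

```python
def expected_counts(observed):
    """Expected cell counts under independence of rows and columns."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [
        [row_total * col_total / grand_total for col_total in col_totals]
        for row_total in row_totals
    ]

# Rows: men, women. Columns: dog-owners, no pets.
observed = [[100, 400],
            [50, 450]]

expected = expected_counts(observed)
print(expected)
```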

The Chi-Square Test Statistic

To test for independence:
3) Compute the chi-squared test statistic. The chi-squared test statistic is simply:

$$\chi^2 = \sum_{\text{rows}} \sum_{\text{columns}} \frac{(\text{Observed}_{\text{row},\text{column}} - \text{Expected}_{\text{row},\text{column}})^2}{\text{Expected}_{\text{row},\text{column}}}$$

The chi-squared statistic follows a chi-squared distribution with degrees of freedom = (rows - 1)(columns - 1).

Example

If we look at our table of the $\chi^2$ distribution with 1 degree of freedom, the critical value for our test statistic is 3.84.

$$\chi^2 = \frac{(100 - 75)^2}{75} + \frac{(400 - 425)^2}{425} + \frac{(50 - 75)^2}{75} + \frac{(450 - 425)^2}{425} \approx 19.6$$

In this case, we reject the null hypothesis that gender and dog ownership are statistically independent, because our test statistic is greater than our critical value.

           Dog-Owners    No Pets     Totals
Men          100 (75)   400 (425)      500
Women         50 (75)   450 (425)      500
Totals       150        850          1,000
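The whole test can be sketched end to end in plain Python. The function name `chi_squared_statistic` is hypothetical, and the critical value 3.84 is the table value quoted in the example (the .05 level with 1 degree of freedom) rather than something computed here.

```python
def chi_squared_statistic(observed):
    """Return (chi-squared statistic, degrees of freedom) for a cross-tab."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, observed_count in enumerate(row):
            # Expected count if rows and columns were independent.
            expected_count = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed_count - expected_count) ** 2 / expected_count

    degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, degrees_of_freedom

# Rows: men, women. Columns: dog-owners, no pets.
observed = [[100, 400],
            [50, 450]]

chi2, df = chi_squared_statistic(observed)

critical_value = 3.84  # table value quoted in the example for 1 df
reject_independence = chi2 > critical_value
print(round(chi2, 1), df, reject_independence)
```

A table whose observed counts already match the expected counts, such as `[[75, 425], [75, 425]]`, gives a statistic of zero, so independence would not be rejected.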