32931 Technology Research Methods Autumn 2017 Quantitative Research Component Topic 4: Bivariate Analysis (Contingency Analysis and Regression Analysis)


32931 Technology Research Methods Autumn 2017 Quantitative Research Component Topic 4: Bivariate Analysis (Contingency Analysis and Regression Analysis) Lecturer: Mahrita Harahap Mahrita.Harahap@uts.edu.au B MathFin (Hons) M Stat (UNSW) PhD (UTS) mahritaharahap.wordpress.com/ teaching-areas Faculty of Engineering and Information Technology

Last Week: Hypothesis Testing Process
Hypothesis tests give us an objective way of assessing such questions. They are based on a proof-by-contradiction form of argument.
1. We formulate a null hypothesis (H0) and an alternative hypothesis (H1), and determine whether it is a 1-tailed or 2-tailed test.
2. State the assumptions of the test and its level of significance.
3. We calculate a test statistic, which measures the compatibility of the sample obtained with H0, assuming H0 is true.
4. Find its associated p-value, which represents the probability of observing this sample statistic, or one more extreme, assuming H0 is true.
5. Weigh up the conclusion based on the p-value: if p-value ≤ 0.05, we reject H0; if p-value > 0.05, we do not reject H0.
6. State the conclusion in context, so that people who don't understand statistics can still understand your conclusions.
(Mnemonic: H A T P C – Hypotheses, Assumptions, Test statistic, P-value, Conclusion.)

Last Week: Parametric Tests (Week 2)
1-Sample tests: 1-Sample T, 1-Sample Proportion
2-Sample tests: Paired T, 2-Sample T, Chi-Square Independence Test
K-Sample tests: Analysis of Variance, Chi-Square Goodness of Fit Test

Last Week: Nonparametric Tests
1-Sample tests: 1-Sample Wilcoxon
2-Sample tests: Wilcoxon test on differences, Mann-Whitney Test
K-Sample tests: Kruskal-Wallis Test

This Week: Bivariate Analysis
2 categorical variables: Contingency Analysis – chi-square test
2 quantitative variables: Regression Analysis – F-test

Contingency Analysis: Chi-Squared Test (Categorical × Categorical) – Week 4

Chi-Squared Test
In some situations, it is useful to look at the number of observations in each level of a categorical variable, or in each combination of levels of two or more categorical variables. This allows us to find relationships between two or more categorical variables. We can construct a contingency table (otherwise known as a two-way table or a crosstab) that contains this information.

Chi-Squared Test
Chi-squared tests test for relationships between two categorical variables.
H0: the variables are independent of each other
H1: the variables are not independent of each other
or, equivalently:
H0: there is no association between the variables
H1: there is an association between the variables
The test is based on the amount by which the observed cell counts in the crosstab differ from those we would expect if the variables were independent (i.e. not associated).

Chi-Squared Test
In order to test hypotheses on contingency tables, we need a table to test against. This table should reflect the case where there is no relationship between the variables. Since we wish to 'prove' a relationship between the variables if one exists, H0 will be that there is no relationship between the variables (i.e. that they are independent, or not associated).

Chi-Squared Test
If there is no relationship, then we can use the number of observations in each level of each individual variable to determine how many observations we would expect in each pair of levels for the two variables: the expected count in a cell is (row total × column total) / overall total.
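The expected counts under independence can be sketched in a few lines of Python; the 2×2 counts below are hypothetical, purely for illustration (they are not the lecture's data):

```python
# Expected count for cell (i, j) under independence:
# E_ij = (row_i total * col_j total) / overall total.
observed = [[20, 30],   # hypothetical counts, e.g. male: smokes / does not
            [25, 25]]   # hypothetical counts, e.g. female
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]
print(expected)
```

Note that the expected table always has the same row and column totals as the observed one.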

Chi-Squared Test: Assumptions
We require an adequate sample in each cell. As a rule of thumb, no more than 20% of the cells in the table should have an expected count of less than 5, and no cell should have an expected count of less than 1. If we do not have an adequate sample, then the test statistic (and hence the p-value) will be very sensitive to small changes in the sample counts. If this assumption is not valid, the alternative test to use is Fisher's Exact Test.
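The rule of thumb above can be checked mechanically. A minimal sketch (the helper name and the example tables are my own, not from the lecture):

```python
def chi_square_assumptions_ok(expected):
    """Rule of thumb: at most 20% of expected cells below 5, none below 1."""
    cells = [e for row in expected for e in row]
    frac_below_5 = sum(e < 5 for e in cells) / len(cells)
    return frac_below_5 <= 0.20 and min(cells) >= 1

print(chi_square_assumptions_ok([[22.5, 27.5], [22.5, 27.5]]))  # adequate
print(chi_square_assumptions_ok([[0.4, 9.6], [3.6, 86.4]]))     # too sparse
```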

Example: Pulse Data
In the lecture last week, we considered a data set based on the pulses of people who either ran for one minute or rested for one minute. We can use inferential techniques to test some of the theories that we may have formed from our exploration. The variables are:
Pulse 1: first pulse measurement
Pulse 2: second pulse measurement
Ran: 1 = yes, 0 = no
Smokes: 1 = yes, 0 = no
Sex: 1 = male, 2 = female
Height: in m
Weight: in kg
Activity: 1 = slight, 2 = moderate, 3 = high

Chi-Squared Test: Example
Suppose that we would like to test whether there are gender differences in whether a person smokes or not.
Step 1: Set up the hypotheses. H0: gender and smoking are not associated. H1: gender and smoking are associated.
Step 2: Choose an appropriate test – the chi-squared test.
Step 3: Execute the test in SPSS and obtain a p-value: use Analyze > Descriptive Statistics > Crosstabs, and under Statistics tick Chi-square.

Chi-Squared Test: Example
Step 4: Make a conclusion. P-value = 0.216 > α = 0.05 (the level of significance), so we do not reject H0.
Step 5: State the conclusion in context. There are no significant gender differences in whether a person smokes or not (i.e. gender and smoking are not associated).
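For a 2×2 table the whole test fits in a few lines of Python. The counts below are hypothetical (the lecture's actual crosstab lives in SPSS), and the p-value uses the fact that for 1 degree of freedom P(χ² > x) = erfc(√(x/2)):

```python
import math

observed = [[12, 28],   # hypothetical gender-by-smokes counts
            [8, 44]]
row_t = [sum(r) for r in observed]
col_t = [sum(c) for c in zip(*observed)]
n = sum(row_t)
# Chi-squared statistic: sum over cells of (O - E)^2 / E,
# where E = row total * column total / n.
chi2 = sum((observed[i][j] - row_t[i] * col_t[j] / n) ** 2
           / (row_t[i] * col_t[j] / n)
           for i in range(2) for j in range(2))
# A 2x2 table has (2-1)*(2-1) = 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi2 / 2))
print(chi2, p_value)
```

With these made-up counts the p-value exceeds 0.05, so, as in the lecture's example, we would not reject H0.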

Testing Assumptions Example: Gender × Smokes
SPSS will calculate expected cell counts. It will also warn us if the assumptions fail.

Regression Analysis (Quantitative × Quantitative)

Simple Linear Regression
Suppose that we would like to describe the relationship between height and weight in the pulse dataset. Both height and weight are quantitative variables, so we can use a scatterplot to visualise the relationship between them.

Simple Linear Regression
We notice that there is a linear relationship between height and weight, so we can add a line to the scatterplot to summarise the relationship. We call this line the regression line.

Simple Linear Regression
We can find the equation of the regression line by minimising the squared vertical distance between the points and the line: minimise Σᵢ (yᵢ − (β₀ + β₁xᵢ))².

Simple Linear Regression
The equation for a straight line is y = a + bx. Recall that in statistics, we prefer to use Greek letters to label parameters that relate to the population. The 'true' regression model for the population is then y = β₀ + β₁x + ε, and the estimates from our data give the fitted line ŷ = b₀ + b₁x. We call the y variable the dependent variable (i.e. the response) and the x variable the independent variable (i.e. the predictor).
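The least-squares estimates have closed forms, b₁ = Sxy/Sxx and b₀ = ȳ − b₁x̄. A minimal sketch, with made-up heights and weights (not the pulse data):

```python
x = [1.60, 1.68, 1.75, 1.80, 1.85]   # heights in m (illustrative)
y = [55.0, 62.0, 68.0, 74.0, 80.0]   # weights in kg (illustrative)
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)                       # spread of x
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))  # co-variation
b1 = sxy / sxx            # slope estimate
b0 = ybar - b1 * xbar     # intercept estimate
print(b0, b1)
```

The fitted line always passes through the point (x̄, ȳ), which is exactly what b₀ = ȳ − b₁x̄ encodes.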

Simple Linear Regression
There are two reasons why we may like to fit a regression:
1. Tests of significance – to determine whether two or more variables are related to each other.
2. Prediction – to use a known set of independent variables to predict the value of the dependent variable.

Using Regression to Test Significance – the F-test
We can run a hypothesis test on the coefficient of the independent variable to see whether the dependent variable changes when the independent variable does. If β₁ = 0, then it doesn't matter what value the independent variable takes: we will have the same estimate for the dependent variable. So we set up the hypotheses H0: β₁ = 0 versus H1: β₁ ≠ 0, and use the F-test statistic. If we reject H0, then the independent variable has an effect on the dependent variable. We can use a p-value to make our decision as usual.
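In simple regression the F statistic is the ratio of explained to unexplained variation, F = MSR/MSE with MSR = SSR/1 and MSE = SSE/(n − 2). A sketch with illustrative data (not the pulse data):

```python
x = [1.60, 1.68, 1.75, 1.80, 1.85]   # illustrative heights (m)
y = [55.0, 63.0, 67.0, 75.0, 79.0]   # illustrative weights (kg)
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
      / sum((xi - xbar) ** 2 for xi in x))
b0 = ybar - b1 * xbar
fitted = [b0 + b1 * xi for xi in x]
ssr = sum((f - ybar) ** 2 for f in fitted)            # explained variation
sse = sum((yi - f) ** 2 for yi, f in zip(y, fitted))  # residual variation
f_stat = (ssr / 1) / (sse / (n - 2))
print(f_stat)
```

A large F (equivalently, a small p-value from the F distribution with 1 and n − 2 degrees of freedom, which SPSS reports in its ANOVA table) leads us to reject H0: β₁ = 0.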

Simple Linear Regression
Returning to the pulse example, suppose that we would like to test whether there is a relationship between height (quantitative) and weight (quantitative). The appropriate analysis method is then regression. The fitted regression equation is weight = −91.147 + 90.008 × height. When we use SPSS to run the regression, one of the tables that we obtain is the coefficients table.

Simple Linear Regression: Example continued
We can also see that both of the coefficients are significantly different from 0. This means that height and weight are related.

Simple Linear Regression
How do we interpret the parameters?
Slope (β₁): for every 1-unit increase in the independent variable, the dependent variable will on average increase/decrease by β₁ units.
Intercept (β₀): when the independent variable is zero, the dependent variable is β₀ units. Caution: only interpret the intercept if we have data with the independent variable at 0.

Simple Linear Regression: Example continued
Slope (β₁): for every 1 m increase in height, weight will on average increase by 90.008 kg.
Intercept (β₀): it does not make sense to interpret the intercept, as 0 is outside the range of values we have for height. If we were to interpret it, we would be saying that a person who is 0 m tall would weigh −91.147 kg (which doesn't make sense!).

Using Regression for Prediction
Some researchers are interested in answering 'what if' questions, and we can use the regression equation to answer these. For example: if my height were 173 cm (1.73 m), what would I expect my weight to be?
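Plugging 1.73 m into the fitted equation from the previous slides answers the question directly:

```python
# Fitted model from the slides: weight = -91.147 + 90.008 * height (in m).
height = 1.73                      # 173 cm
predicted_weight = -91.147 + 90.008 * height
print(round(predicted_weight, 2))  # about 64.57 kg
```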

Using Regression for Prediction
However, we must be cautious. It is not a good idea to predict the weight of somebody whose height is not in the range of our participants; this is called extrapolation. The estimates that we obtain could be unstable outside the range of our sample. We also need to make sure that we only use models that predict well when making predictions.

Using Regression for Prediction
To see how well the model fits the data, we look at R², the proportion of the variation in the response that can be accounted for by the predictor. An R² > 75% is considered to give a good fit to the data. If we have a high R², then we will be able to give precise estimates when we use the model to predict. Returning to our example, SPSS also gives the model summary table. This suggests that while weight and height are related, a model that predicts weight based on height only will not be very good at predicting new observations (since R² < 75%).
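R² can be computed as 1 − SSE/SST, the fraction of total variation not left in the residuals. A quick sketch with illustrative numbers (not the lecture's SPSS output):

```python
y = [55.0, 63.0, 67.0, 75.0, 79.0]        # observed responses (illustrative)
fitted = [56.1, 61.9, 67.5, 73.9, 78.6]   # fitted values (illustrative)
ybar = sum(y) / len(y)
sst = sum((yi - ybar) ** 2 for yi in y)               # total variation
sse = sum((yi - f) ** 2 for yi, f in zip(y, fitted))  # unexplained variation
r2 = 1 - sse / sst
print(round(r2, 3))
```

An R² near 1 means the fitted values track the observations closely; the pulse-data model, by contrast, falls below the 75% rule of thumb quoted above.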

GOOD LUCK IN THE ASSIGNMENT!
