
1 Statistical Analysis (1/27/2018)

2 Meaning of Univariate, Bivariate & Multivariate Analysis of Data
Univariate Analysis – In univariate analysis, one variable is analysed at a time.
Bivariate Analysis – In bivariate analysis, two variables are analysed together and examined for any possible association between them.
Multivariate Analysis – In multivariate analysis, the concern is to analyse more than two variables at a time.
The type of statistical technique used for analysing univariate and bivariate data depends upon the level of measurement of the questions pertaining to those variables. Further, the data analysis could be of two types, namely, descriptive and inferential.

3 Descriptive vs. Inferential Analysis
Descriptive Analysis – Descriptive analysis deals with summary measures relating to the sample data. The common ways of summarizing data are calculating the average, range, standard deviation, frequency and percentage distribution. The first thing to do when data analysis is taken up is to describe the sample.
Examples of Descriptive Analysis:
What is the average income of the sample?
What is the average age of the sample?
What percentage of sample respondents are married?
Is the level of job satisfaction related to the age of the employees?
Which TV channel is viewed by the majority of viewers in the age group 20–30 years?

4 Descriptive vs. Inferential Analysis
Inferential Analysis – Under inferential statistics, inferences are drawn about population parameters based on sample results. The researcher tries to generalize the results to the population based on sample results.
Examples of Inferential Analysis:
Is the average income of the population significantly greater than 25,000 per month?
Is the job satisfaction of unskilled workers significantly related to their pay packet?
Is the growth in the sales of the company statistically significant?

5 Descriptive Analysis of Univariate Data
Measures of Central Tendency:
Arithmetic mean (appropriate for Interval and Ratio scale data)
Median (appropriate for Ordinal, Interval and Ratio scale data)
Mode (appropriate for Nominal, Ordinal, Interval and Ratio scale data)
Measures of Dispersion:
Range (appropriate for Interval and Ratio scale data)
Variance and Standard Deviation (appropriate for Interval and Ratio scale data)
Coefficient of variation (appropriate for Ratio scale data)
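As a minimal sketch of the univariate measures listed above, the following uses Python's standard `statistics` module. The income figures are hypothetical and chosen only for illustration; they are not data from the presentation.

```python
import statistics

# Hypothetical monthly incomes (in thousands) for a sample of eight respondents.
incomes = [18, 22, 25, 25, 27, 30, 34, 43]

# Measures of central tendency
mean = statistics.mean(incomes)
median = statistics.median(incomes)
mode = statistics.mode(incomes)

# Measures of dispersion
data_range = max(incomes) - min(incomes)
variance = statistics.variance(incomes)   # sample variance (n - 1 in the denominator)
std_dev = statistics.stdev(incomes)       # sample standard deviation
cv = std_dev / mean * 100                 # coefficient of variation, in percent

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={data_range}, variance={variance:.2f}, s.d.={std_dev:.2f}, CV={cv:.1f}%")
```

The coefficient of variation divides by the mean, which is why it is only meaningful for ratio-scale data with a true zero.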

6 Descriptive Analysis of Bivariate Data
Preparation of cross-tables – To interpret a cross-table, it is necessary to identify the dependent and the independent variable. Percentages should be computed in the direction of the independent variable. There is no hard and fast rule as to where the dependent or independent variable is placed; either can be taken in rows or in columns.

7 Inferential Analysis: Parametric vs. Nonparametric Statistics
Parametric statistics are statistical techniques based on assumptions about the population from which the sample data are collected. They assume that the data being analyzed are randomly selected from a normally distributed population, and they require quantitative measurements that yield interval or ratio level data.
Nonparametric statistics are based on fewer assumptions about the population and the parameters. They are sometimes called "distribution-free" statistics. A variety of nonparametric statistics are available for use with nominal or ordinal data.

8 The following points may be borne in mind before setting up the null hypothesis:
If we want to test the significance of the difference between a statistic and a parameter, or between two independent sample statistics, we set up the null hypothesis that the difference is not significant.
If we want to test any statement about the population, we set up the null hypothesis that it is true.

9 A Broad Classification of Hypothesis Tests
Hypothesis tests: tests of association and tests of differences (means, proportions).

10 Choosing the appropriate test:
What is the level of measurement of the data?
How many samples are involved?
What is the purpose of the study?

11 What is the level of measurement of the data?
Nominal/Ordinal: Categorical/Rank, Qualitative, Nonparametric
Summary statistics: frequency, % frequency or mode
Tests: test for proportion; difference of two proportions; chi-square for independence of attributes

12 What is the level of measurement of the data?
Interval/Ratio: Quantitative, Parametric
Summary statistics: mean
Tests: test for mean; difference of two independent sample means; difference of two paired sample means; regression

13 How many samples are involved?
One sample: test for proportion; test for single mean
Two samples: difference of two proportions; difference of two independent sample means
One sample with two variables: chi-square; regression; difference of two means (paired)

14 What is the purpose of the study?
Testing against a hypothesised value: test of proportion; test of mean; test of paired mean
Comparing two statistics: difference of two proportions; difference of two independent means
Looking for a relationship: chi-square (in case of Nominal data); regression (in case of Interval/Ratio data)

15 What type of test is required for the following?

16 What type of test is required for the following?
Hellen, who sells Choco bars, wants to know whether the amount of nuts in each bar is sufficient. She takes a sample of 20 packets, weighs them, and compares the result with the standard packet weight of nuts (i.e., 5 g).

17 Sol.
Data: weight of the bar, which is Interval/Ratio data
Sample: one sample of 20 packets
Purpose: comparing against a given value of 5 g
Test: test for single mean
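A minimal sketch of the single-mean test statistic for this kind of problem follows. The 20 weights are hypothetical (the slides give no raw data), and the helper name `one_sample_t` is introduced here only for illustration.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for the null hypothesis that the population mean equals mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)            # sample s.d. (n - 1 in the denominator)
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical nut weights (g) for 20 packets; NOT data from the presentation.
weights = [4.8, 5.1, 4.9, 5.0, 4.7, 5.2, 4.9, 4.8, 5.0, 4.6,
           4.9, 5.1, 4.8, 4.7, 5.0, 4.9, 4.8, 5.0, 4.7, 4.9]

t_stat, df = one_sample_t(weights, mu0=5.0)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

The resulting t value would then be compared with the critical value of the t distribution on 19 degrees of freedom.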

18 What type of test is required for the following?
In a promotional campaign, 20% of all packets of chocolate include a prize ticket. Hellen takes a sample of 50 packets and finds that 7 of the 50 contain a prize ticket.

19 Sol.
Data: whether or not the packet wins a ticket (yes or no), so it is nominal data
Sample: one sample of 50 packets
Purpose: comparing the sample value with a given value
Test: test for single proportion
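The figures in this example (7 prize tickets in 50 packets against a claimed 20%) are enough to sketch the single-proportion test. The code below assumes the large-sample normal approximation and a two-sided alternative; the slides state neither, so both are assumptions, as is the helper name.

```python
import math
from statistics import NormalDist

def one_proportion_z(successes, n, p0):
    """Return (z, two-sided p-value) for H0: population proportion == p0."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)        # standard error under H0
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = one_proportion_z(successes=7, n=50, p0=0.20)
print(f"z = {z:.3f}, two-sided p = {p_value:.3f}")
```

Under these assumptions z is about -1.06, so the sample proportion of 14% is not significantly different from 20% at the 5% level.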

20 What type of test is required for the following?
Hellen presumes that her chocolate (Chacobars) lasts longer than another chocolate (Nuttabars). She takes a sample of 36 people, with each bar recorded 18 times.

21 Sol.
Data: time is collected in seconds, so it is Interval/Ratio data
Sample: one sample of 36 people, with two scores each (one for each bar)
Purpose: comparing how long each bar lasts
Test: difference of two means (paired)
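A paired test works on the per-person differences between the two scores. The sketch below shows the statistic; the six pairs of lasting times are hypothetical, and `paired_t` is an illustrative name rather than anything from the slides.

```python
import math
import statistics

def paired_t(first, second):
    """Return (t, df) for H0: mean of the paired differences is zero."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical lasting times in seconds for the same six people; NOT slide data.
chacobar = [62, 70, 58, 65, 73, 60]
nuttabar = [59, 66, 60, 61, 70, 57]

t_stat, df = paired_t(chacobar, nuttabar)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```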

22 What type of test is required for the following?
Hellen thinks there is a difference between the two wrapping machines in her factory. She takes samples of 200 and 150 bars from the 1st and 2nd machines and finds that 10 and 9 bars, respectively, are badly wrapped.

23 Sol.
Data: whether the bar is OK or not, so it is nominal data
Sample: two independent samples, one from each machine
Purpose: comparing the proportions of the two samples
Test: difference of two independent proportions
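Using the counts from this example (10 of 200 and 9 of 150 badly wrapped), the two-proportion test can be sketched as below. The pooled standard error and the two-sided alternative are assumptions of this sketch, as is the helper name.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Return (z, two-sided p-value) for H0: the two population proportions are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)           # common proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = two_proportion_z(x1=10, n1=200, x2=9, n2=150)
print(f"z = {z:.3f}, two-sided p = {p_value:.3f}")
```

With these counts z is about -0.41, far from the usual critical values, so the sample gives no evidence that the machines differ.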

24 What type of test is required for the following?
Hellen wants to see whether the offer of prize tickets makes any difference to the sales of chocolate. She takes the sales figures for 13 days with prize tickets and 10 days without prize tickets.

25 Sol.
Data: number of sales on the corresponding days, so it is Interval/Ratio scale
Sample: two samples, one with prize tickets and the other without
Purpose: comparing the average sales under the two treatments
Test: difference of two means (independent samples)
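One common form of the independent-samples test pools the two sample variances. The sketch below assumes equal population variances, which the slides do not state, and uses hypothetical daily sales figures for 13 and 10 days.

```python
import math
import statistics

def pooled_t(x, y):
    """Return (t, df) for H0: equal population means, assuming equal variances."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(x) - statistics.mean(y)) / se
    return t, n1 + n2 - 2

# Hypothetical daily sales; NOT data from the presentation.
with_tickets = [52, 48, 55, 60, 47, 51, 58, 49, 53, 56, 50, 54, 57]   # 13 days
without_tickets = [45, 50, 43, 47, 52, 44, 46, 49, 41, 48]            # 10 days

t_stat, df = pooled_t(with_tickets, without_tickets)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```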

26 What type of test is required for the following?
Hellen sells three different types of chocolate: Dark, Milk and White. She thinks that men and women might have different chocolate preferences, so she collects data from 50 customers, both men and women, asking them which type of chocolate they prefer.

27 Sol.
Data: type of chocolate and gender (male or female), both on a nominal scale
Sample: one sample of 50 customers with two variables measured (gender and type of chocolate)
Purpose: relationship between the two variables
Test: chi-square test of independence

28 What type of test is required for the following?
Hellen takes 30 weeks of sales figures and tries to observe the impact of temperature on the sales of chocolate bars in a city.

29 Sol.
Data: sales and temperature, both on an Interval/Ratio scale
Sample: one sample
Purpose: relationship between sales and temperature
Test: regression
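For a relationship between two interval/ratio variables, a simple least-squares line is the usual starting point. The sketch below fits one by hand; the six weeks of temperature and sales figures are hypothetical stand-ins for the 30 weeks mentioned in the example, and `fit_line` is an illustrative name.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical weekly average temperature (deg C) and bars sold; NOT slide data.
temperature = [18, 21, 24, 27, 30, 33]
sales = [210, 198, 190, 176, 169, 155]

slope, intercept = fit_line(temperature, sales)
print(f"sales = {intercept:.1f} + ({slope:.2f}) * temperature")
```

A negative slope in data like these would suggest that sales fall as temperature rises; testing whether the slope differs significantly from zero is the inferential step.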

30 In a nutshell

| Sample / Scale | Nominal | Interval/Ratio |
|---|---|---|
| One sample | Test for single proportion | Test for single mean |
| Two samples | Test for two different proportions | Test for two independent means |
| One sample with two variables | Chi-square | Regression; difference of paired means |
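The summary table amounts to a lookup from scale and sample design to a test. A minimal sketch of that lookup is given below; the key strings, the `purpose` argument and the function name are all choices made for this illustration.

```python
# Mapping taken from the summary table: (scale, design) -> test.
TESTS = {
    ("nominal", "one sample"): "test for single proportion",
    ("interval/ratio", "one sample"): "test for single mean",
    ("nominal", "two samples"): "difference of two proportions",
    ("interval/ratio", "two samples"): "difference of two independent means",
    ("nominal", "one sample, two variables"): "chi-square test of independence",
}

def choose_test(scale, design, purpose="compare"):
    """Pick a test; `purpose` only matters for interval/ratio data with two variables."""
    key = (scale, design)
    if key == ("interval/ratio", "one sample, two variables"):
        # The table lists two options here; the purpose of the study decides.
        return "regression" if purpose == "relationship" else "difference of paired means"
    return TESTS[key]

print(choose_test("nominal", "one sample"))
print(choose_test("interval/ratio", "one sample, two variables", purpose="relationship"))
```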

31 Introduction to SPSS

32 Tests of Association (Chi-Square)
Research question type: association of two variables
What kind of variables: categorical (nominal, or ordinal with few categories)

33 Uses:
Testing the goodness of fit: uses frequency data from a sample to test hypotheses about population proportions. In these tests we assess how well the sample data fit the population proportions specified by the null hypothesis.
Testing the independence/association of attributes: whether there is a statistical relationship between two qualitative or discrete quantitative variables.
For a contingency table we use the following chi-square test statistic: $\chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$, where $O_{ij}$ and $E_{ij}$ are the observed and expected frequencies in row $i$ and column $j$.

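34 For example

| S.No. | Background | Grade |
|---|---|---|
| 1 | B.Com. | B |
| 2 | B.Com. | C |
| 3 | B.Com. | A |
| 4 | B.Com. | C |
| 5 | B.Com. | B |
| 6 | B.E. | A |
| 7 | B.E. | A |
| 8 | B.E. | A |
| 9 | B.E. | B |
| 10 | B.E. | A |
| 11 | B.Sc. | B |
| 12 | B.Sc. | B |
| 13 | B.Sc. | C |
| 14 | B.Sc. | C |
| 15 | B.Sc. | C |
| 16 | BBA | A |
| 17 | BBA | B |
| 18 | BBA | C |
| 19 | BBA | C |
| 20 | BBA | B |
| 21 | B.A. | C |
| 22 | B.A. | C |
| 23 | B.A. | C |
| 24 | B.A. | C |
| 25 | B.A. | B |

| Educational Background | Code |
|---|---|
| B.Com. | 1 |
| B.E. | 2 |
| B.Sc. | 3 |
| B.B.A. | 4 |
| B.A. | 5 |

| Grade Obtained | Grade Code |
|---|---|
| A | 1 |
| B | 2 |
| C | 3 |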

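35 Cross Tab

| Code | | Grade code 1 | Grade code 2 | Grade code 3 | Total |
|---|---|---|---|---|---|
| 1 | Count | 1 | 2 | 2 | 5 |
| 1 | % within Grade code | 16.7% | 25.0% | 18.2% | 20.0% |
| 2 | Count | 4 | 1 | 0 | 5 |
| 2 | % within Grade code | 66.7% | 12.5% | .0% | 20.0% |
| 3 | Count | 0 | 2 | 3 | 5 |
| 3 | % within Grade code | .0% | 25.0% | 27.3% | 20.0% |
| 4 | Count | 1 | 2 | 2 | 5 |
| 4 | % within Grade code | 16.7% | 25.0% | 18.2% | 20.0% |
| 5 | Count | 0 | 1 | 4 | 5 |
| 5 | % within Grade code | .0% | 12.5% | 36.4% | 20.0% |
| Total | Count | 6 | 8 | 11 | 25 |
| Total | % within Grade code | 100.0% | 100.0% | 100.0% | 100.0% |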
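The "% within Grade code" rows are column percentages: each count divided by its grade-code column total. A short sketch that reproduces them from the counts in the cross tab (variable names are illustrative):

```python
# Counts from the cross tab: background code -> counts for grade codes 1, 2, 3.
counts = {
    1: [1, 2, 2],
    2: [4, 1, 0],
    3: [0, 2, 3],
    4: [1, 2, 2],
    5: [0, 1, 4],
}

# Column totals, one per grade code.
col_totals = [sum(col) for col in zip(*counts.values())]

# Percentage of each grade-code column that falls in each background code.
pct = {
    code: [round(100 * c / total, 1) for c, total in zip(row, col_totals)]
    for code, row in counts.items()
}

for code, row in pct.items():
    print(code, row)
```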

36 Chi Square Test

| | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
| Pearson Chi-Square | 13.750 | 8 | .089 |
| Likelihood Ratio | 15.581 | 8 | .049 |
| Linear-by-Linear Association | 3.630 | 1 | .057 |
| N of Valid Cases | 25 | | |
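The Pearson chi-square value can be checked by hand from the cross-tab counts. The sketch below computes the statistic and degrees of freedom in plain Python; for the p-value it uses a closed-form upper-tail expression that holds only when the degrees of freedom are even, which happens to be the case here (df = 8). The function names are illustrative.

```python
import math

# Observed counts: rows are background codes 1-5, columns are grade codes 1-3.
observed = [
    [1, 2, 2],
    [4, 1, 0],
    [0, 2, 3],
    [1, 2, 2],
    [0, 1, 4],
]

def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable; this closed form is valid only for even df."""
    if df % 2 != 0:
        raise ValueError("closed form requires even degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

chi2, df = chi_square_independence(observed)
p_value = chi2_upper_tail_even_df(chi2, df)
print(f"chi-square = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

This agrees with the Pearson Chi-Square row of the table: 13.750 on 8 degrees of freedom with a significance of about .089.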

37 Nominal Symmetric Measures

| Nominal by Nominal | Value | Approx. Sig. |
|---|---|---|
| Phi | .742 | .089 |
| Cramer's V | .524 | .089 |
| Contingency Coefficient | .596 | .089 |
| N of Valid Cases | 25 | |

In the above case the contingency coefficient is 0.596, which is greater than 0.5; hence the variables are strongly associated. The square of the phi coefficient (0.742) is 0.5506, which indicates that 55.06% of the variation in grade performance is explained by the educational background of PGDM students. Also, the Cramer's V coefficient of 0.524 reflects a moderate relationship between the variables.
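All three symmetric measures are simple functions of the chi-square statistic. Assuming the standard definitions (phi as the square root of chi-square over n, Cramer's V dividing further by the smaller table dimension minus one, and the contingency coefficient as the square root of chi-square over chi-square plus n), the reported values can be reproduced as follows:

```python
import math

chi2 = 13.75      # Pearson chi-square from the test above
n = 25            # number of valid cases
rows, cols = 5, 3 # background codes x grade codes

phi = math.sqrt(chi2 / n)
cramers_v = math.sqrt(chi2 / (n * (min(rows, cols) - 1)))
contingency_c = math.sqrt(chi2 / (chi2 + n))

print(f"phi = {phi:.3f}, Cramer's V = {cramers_v:.3f}, C = {contingency_c:.3f}")
```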

38 Tests of Differences (Mean, Proportion)

| Sample size | Population s.d. known | Population s.d. not known |
|---|---|---|
| Large (n > 30) | Z | Z |
| Small (n <= 30) | Z | t |
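The rule in this table can be written as a one-line decision. The function name below is illustrative:

```python
def choose_statistic(n, population_sd_known):
    """Return 'Z' or 't' following the sample-size / known-s.d. table."""
    if n > 30 or population_sd_known:
        return "Z"
    return "t"   # small sample and population s.d. unknown

print(choose_statistic(50, False))   # large sample
print(choose_statistic(20, True))    # small sample, s.d. known
print(choose_statistic(20, False))   # small sample, s.d. unknown
```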

39 Test Concerning Means – Case of Single Population
With an increase in the sample size (and hence degrees of freedom), the t distribution loses its flatness and approaches the normal distribution; for n > 30 the two are close enough that the normal distribution is commonly used in its place.

40 Test of significance for two different means

41 Paired t-test for difference of means
Research question type: difference between (comparison of) two related (paired, repeated or matched) variables
What kind of variables: continuous (scale/interval/ratio)
Common applications: comparing the means of data from two related samples

42 Paired t-test for difference of means

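43 Test of significance for several different means (ANOVA)

One Way ANOVA

| Source of variation | Sum of squares | Degrees of freedom | Mean square | F-ratio |
|---|---|---|---|---|
| Between samples | S.S.B | c-1 | | |
| Within samples | S.S.W | n-c | | |
| Total | S.S.T | n-1 | | |

Two Way ANOVA

| Source of variation | Sum of squares | Degrees of freedom | Mean square | F-ratio |
|---|---|---|---|---|
| Columns | S.S.C | C-1 | | |
| Rows | S.S.R | R-1 | | |
| Error | S.S.E | (C-1)(R-1) | | |
| Total | T.S.S | n-1 | | |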
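To make the one-way table concrete, the sketch below computes S.S.B, S.S.W, S.S.T, their degrees of freedom (c-1, n-c, n-1) and the F-ratio as the between mean square over the within mean square. The three groups are hypothetical, and the function name and dictionary keys are chosen only for this illustration.

```python
def one_way_anova(groups):
    """Return the entries of a one-way ANOVA table for a list of samples."""
    n = sum(len(g) for g in groups)          # total number of observations
    c = len(groups)                          # number of samples (columns)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between, df_within = c - 1, n - c
    f_ratio = (ssb / df_between) / (ssw / df_within)
    return {
        "SSB": ssb, "SSW": ssw, "SST": ssb + ssw,
        "df_between": df_between, "df_within": df_within, "df_total": n - 1,
        "F": f_ratio,
    }

# Hypothetical samples; NOT data from the presentation.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(table)
```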

44 Parametric vs. Non-parametric

| | Parametric | Non-parametric |
|---|---|---|
| Assumed distribution | Normal | Any |
| Assumed variance | Homogeneous | Any |
| Typical data | Ratio or Interval | Ordinal or Nominal |
| Data set relationships | Independent | Any |
| Usual central measure | Mean | Median |
| Correlation test | Pearson | Spearman |
| Independent measures, 2 groups | Independent-measures t-test | Mann-Whitney test |
| Independent measures, >2 groups | One-way, independent-measures ANOVA | Kruskal-Wallis test |
| Repeated measures, 2 conditions | Paired t-test | Wilcoxon test |
| Repeated measures, >2 conditions | One-way, repeated-measures ANOVA | Friedman's test |

45 THANK YOU

