Introduction to Biostatistics with SPSS Applications. Osama A. Samarkandi, PhD, Supervisor of the Department of Statistics and Information, King Saud University
Session II
* Welcome Back * Day 2
Univariate Descriptive Statistics Univariate: examination of a single variable's frequency distribution, central tendency, and variability. Bivariate: examination of two variables simultaneously. Is SES related to intelligence? Do SAT scores have anything to do with how well one does in college? The question is: do these variables correlate, or covary?
The correlation coefficient (Pearson) The Pearson product-moment correlation coefficient is a bivariate statistic that measures the degree of linear association between two variables.
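As a compact reference, the definition above is usually written as the formula below. This is the standard textbook form rather than something stated on the slide; it uses X and Y for the two variables, n for the number of paired scores, and the n − 1 divisor that the worked example in this session also uses.

```latex
r_{XY} \;=\; \frac{\mathrm{Cov}(X,Y)}{SD_X \cdot SD_Y}
\;=\; \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}
           {\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\;\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}},
\qquad -1 \le r_{XY} \le 1
```

The n − 1 terms in the covariance and in the two standard deviations cancel, which is why the second form contains only sums of deviations.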
Scatterplot Reveals the presence of association between two variables. The stronger the relationship, the more the data points cluster along an imaginary line. Indicates the direction of the relationship. Reveals the presence of outliers (points that lie far from the line).
[Figure: scatterplot of SAT and GRE scores. Annotation: the point lies far from the line.]
Covariance Examining the scatterplot is not enough. A single number can represent the degree and direction of the linear relation between two variables.
The Logic of the Covariance What does it mean for two variables to be positively associated? Where there is a positive association between two variables, scores above the mean on X tend to be associated with scores above the mean on Y, and scores below the mean on X tend to be accompanied by scores below the mean on Y. (Note: for this reason the deviation score is an important part of the covariance.)
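The deviation-score logic can be sketched in a few lines of Python. This is only an illustration: it uses six of the (GPA, SAT) pairs from Example 1 (students A, B, C, F, I, and L), and the variable names are this sketch's own, not part of the course material.

```python
# Six (GPA, SAT) pairs taken from Example 1 (students A, B, C, F, I, L).
gpa = [1.6, 2.0, 2.2, 2.6, 2.4, 3.0]   # Y
sat = [400, 350, 500, 550, 650, 750]   # X

n = len(gpa)
mean_y = sum(gpa) / n
mean_x = sum(sat) / n

# Deviation scores: how far each score lies above (+) or below (-) its mean.
dev_y = [y - mean_y for y in gpa]
dev_x = [x - mean_x for x in sat]

# A product is positive when both scores sit on the same side of their means.
products = [dy * dx for dy, dx in zip(dev_y, dev_x)]

# Sample covariance: sum of the cross-products divided by n - 1.
cov = sum(products) / (n - 1)

print("cross-products:", [round(p, 2) for p in products])
print("covariance:", round(cov, 2))
```

For these six students every cross-product is positive, so the covariance is positive as well; a negative association would show up as mostly negative cross-products.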
Properties of the Pearson r: r is metric-independent; r reflects the direction of the relationship; r reflects the magnitude of the relationship. The strength of association is given by r², the coefficient of determination; 1 − r² is the coefficient of non-determination. r² relates to shared variance and to practical significance.
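A short Python sketch can make these properties concrete. It is an illustration under stated assumptions: `pearson_r` is a helper written for this sketch (not an SPSS or library function), and the data are six (GPA, SAT) pairs from Example 1.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from sums of deviation scores (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

sat = [400, 350, 500, 550, 650, 750]   # X
gpa = [1.6, 2.0, 2.2, 2.6, 2.4, 3.0]   # Y

r = pearson_r(sat, gpa)

# Metric independence: changing the units of X or Y leaves r unchanged.
r_rescaled = pearson_r([x / 100 + 5 for x in sat], [10 * y for y in gpa])

# Direction: reversing one variable flips the sign but not the magnitude.
r_flipped = pearson_r([-x for x in sat], gpa)

# Strength: coefficient of determination and of non-determination.
r_squared = r ** 2
non_determination = 1 - r_squared

print(r, r_rescaled, r_flipped, r_squared, non_determination)
```

The rescaled value matches r up to floating-point error, and the flipped value is its negative, which is what "metric-independent" and "reflects the direction" mean in practice.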
Example 1 The following data represent the GPA and SAT scores of a random selection of 12 students. Find the covariance and the correlation coefficient for the distribution and give a clear interpretation.
Student   Y (GPA)   X (SAT)
A         1.6       400
B         2.0       350
C         2.2       500
D         2.8       400
E         2.8       450
F         2.6       550
G         3.2       550
H         2.0       600
I         2.4       650
J         3.4       650
K         2.8       700
L         3.0       750
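Solution
$n = 12$
$\bar{Y} = \frac{\sum Y}{n} = \frac{30.8}{12} = 2.57$
$\bar{X} = \frac{\sum X}{n} = \frac{6{,}550}{12} = 545.8$
$SD_Y = \sqrt{\frac{\sum (Y_i - \bar{Y})^2}{n - 1}} = \sqrt{\frac{3.185}{11}} = 0.54$
$SD_X = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{182{,}291.68}{11}} = 128.73$
$Cov = \frac{\sum (Y_i - \bar{Y})(X_i - \bar{X})}{n - 1} = \frac{378.33}{11} = 34.39$
$r_{XY} = \frac{Cov}{SD_X \cdot SD_Y} = \frac{34.39}{0.54 \times 128.73} \approx 0.5$
$r^2 = (0.5)^2 = 0.25$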
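The hand calculation can be cross-checked with a short Python sketch. The twelve (GPA, SAT) pairs below are an assumption about the intended data: they are read from the Example 1 table together with the deviation table in the later Solution slide, and they reproduce the stated sums ΣY = 30.8 and ΣX = 6,550. The variable names are this sketch's own.

```python
import statistics

# Assumed data for students A..L (see the note above).
gpa = [1.6, 2.0, 2.2, 2.8, 2.8, 2.6, 3.2, 2.0, 2.4, 3.4, 2.8, 3.0]   # Y
sat = [400, 350, 500, 400, 450, 550, 550, 600, 650, 650, 700, 750]   # X

n = len(gpa)
mean_y = statistics.mean(gpa)
mean_x = statistics.mean(sat)

# Sample standard deviations (n - 1 in the denominator, as in the solution).
sd_y = statistics.stdev(gpa)
sd_x = statistics.stdev(sat)

# Sample covariance from the deviation cross-products.
sum_products = sum((y - mean_y) * (x - mean_x) for y, x in zip(gpa, sat))
cov = sum_products / (n - 1)

# Pearson r and the coefficient of determination.
r = cov / (sd_x * sd_y)
r_squared = r ** 2

print(f"mean Y = {mean_y:.2f}, mean X = {mean_x:.2f}")
print(f"SD Y = {sd_y:.2f}, SD X = {sd_x:.2f}")
print(f"Cov = {cov:.2f}, r = {r:.2f}, r^2 = {r_squared:.2f}")
```

Working from unrounded values, r comes out slightly below 0.5 before rounding to two decimals, so small differences from the slide's intermediate figures are rounding effects rather than errors.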
Coefficient of determination (r²) A: dependent variable (the outcome). B: independent variable (the predictor). The independent variable is used to predict the dependent variable. With r² = 0.25, 25% of the variability of A and B is shared; that is, 25% of the variability of A (dependent variable) is explained by B (independent variable), and the remaining 75% (1 − r²) is not.
SPSS Practice
SPSS Output
Example 2 What does it mean for two variables to be positively associated? Where there is a positive association between two variables, scores above the mean on X tend to be associated with scores above the mean on Y, and scores below the mean on X tend to be accompanied by scores below the mean on Y. (Note: for this reason the deviation score is an important part of the covariance.)
Solution
Student   Y (GPA)   X (SAT)   Y − Ȳ    (Y − Ȳ)²   X − X̄     (X − X̄)²      (Y − Ȳ)(X − X̄)
A         1.6       400       -0.97    0.94       -145.8    21,257.64     141.43
B         2.0       350       -0.57    0.32       -195.8    38,337.64     111.61
C         2.2       500       -0.37    0.14       -45.8     2,097.64      16.95
D         2.8       400       0.23     0.053      -145.8    21,257.64     -33.53
E         2.8       450       0.23     0.053      -95.8     9,177.64      -22.03
F         2.6       550       0.03     0.0009     4.2       17.64         0.13
G         3.2       550       0.63     0.40       4.2       17.64         2.65
H         2.0       600       -0.57    0.32       54.2      2,937.64      -30.89
I         2.4       650       -0.17    0.029      104.2     10,857.64     -17.71
J         3.4       650       0.83     0.69       104.2     10,857.64     86.49
K         2.8       700       0.23     0.053      154.2     23,777.64     35.47
L         3.0       750       0.43     0.185      204.2     41,697.64     87.8
Sum       30.8      6,550     -0.04    3.1849     0.4       182,291.68    378.37
Mean      2.57      545.80
SD (σ)    0.54      128.73
* Break *