11. Multivariate Analysis CSCI N207 Data Analysis Using Spreadsheet Lingma Acheson Department of Computer and Information Science, IUPUI.

1 11. Multivariate Analysis CSCI N207 Data Analysis Using Spreadsheet Lingma Acheson linglu@iupui.edu Department of Computer and Information Science, IUPUI

2 Multivariate Data Analysis
Univariate data analysis concerned itself with describing an entity using a single variable. Multivariate data analysis tries to establish a mathematical relationship between multiple data sets.
– smoking/cancer
– salary/productivity
– temperature/chirps in 15 seconds
Measure   Temperature (X)   Chirps (Y)
1         18                57
2         20                60
3         21                64
4         23                65
5         27                68
…         …                 …

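3 Correlation
Multivariate data analysis depends largely on correlation. Correlation is a mathematical tool used to measure how strongly two variables vary together; it does not by itself show that one causes the other. Researchers use Pearson's Correlation Coefficient to represent correlation, signified by R:
$R = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \, \sigma_Y}$
where $\mathrm{Cov}(X, Y)$ is the covariance of X and Y, and $\sigma_X$ and $\sigma_Y$ are their standard deviations.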

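4 Review
Variance: One measure of dispersion (deviation from the mean) of a data set. It is the average squared deviation of each datum from the average value, so the larger the variance, the more widely the data are spread around the mean.
Standard deviation: Square root of the variance. The magnitude of the number is more in line with the values in the data set.
Variance = $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^2$, where $\bar{x}$ = Average value of the data set and $N$ = number of values (population form; the sample form divides by $N - 1$)
Standard Deviation = $\sigma = \sqrt{\sigma^2}$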
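The review above can be sketched in a few lines of Python. This is only an illustration: it assumes the population form (dividing by N) and uses just the five temperature readings shown in the chirp table on slide 2; the variable names are made up for the example.

```python
import math
import statistics

# The five temperature readings shown on slide 2 (the full data set is elided there).
temperatures = [18, 20, 21, 23, 27]

n = len(temperatures)
mean = sum(temperatures) / n

# Population variance: average squared deviation from the mean.
variance = sum((t - mean) ** 2 for t in temperatures) / n

# Standard deviation: square root of the variance, in the same units as the data.
std_dev = math.sqrt(variance)

print("mean:", mean)
print("variance:", variance)
print("standard deviation:", std_dev)

# Cross-check against the standard library's population statistics.
print("pvariance:", statistics.pvariance(temperatures))
print("pstdev:", statistics.pstdev(temperatures))
```

The hand-written values should agree with `statistics.pvariance` and `statistics.pstdev`; switching to `statistics.variance` and `statistics.stdev` would give the sample (N − 1) versions instead.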

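5 Calculating Covariance
$\mathrm{Cov}(X, Y) = \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})(y_i - \bar{y})$, where $\bar{x}$ and $\bar{y}$ are the average values of X and Y and $N$ is the number of paired measurements. This is the population form, which is what Excel's COVAR function computes.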

6 Measuring Correlation
R: Values range between -1 (perfect negative or inverse correlation) and +1 (perfect positive correlation).
A positive correlation (+) reflects a situation where an increase in the value of one variable accompanies an increase in the value of the second variable. An R value of +1 is called "perfect positive."
A negative correlation (-) reflects a situation where an increase in the value of one variable accompanies a decrease in the value of the second variable (inverse correlation). An R value of -1 is called "perfect negative."
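To make the two extremes concrete, here is a minimal Python sketch. The helper `pearson_r` and the two made-up data sets (one exactly rising line, one exactly falling line) are illustrative assumptions, not part of the original slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson's R as covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]    # y increases whenever x increases
falling = [10 - 3 * x for x in xs]  # y decreases whenever x increases

r_rising = pearson_r(xs, rising)
r_falling = pearson_r(xs, falling)

print("rising line:", round(r_rising, 6))
print("falling line:", round(r_falling, 6))
```

Any exactly straight-line relationship gives R of +1 or -1 (up to floating-point rounding), regardless of how steep the line is; only the sign of the slope matters.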

7 Measuring Correlation
This measurement applies only to linear relationships: R can be close to zero even when two variables are strongly related in a non-linear way.
Excel Covariance Function: =COVAR(Range1, Range2)
Excel Correlation Function: =CORREL(Range1, Range2)
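For readers working outside Excel, the two functions can be approximated in Python as follows. This is a sketch under stated assumptions: `covar` and `correl` are hypothetical helpers named after the Excel functions, `covar` uses the population form (dividing by N), and the data are only the five temperature/chirp rows shown on slide 2.

```python
import math

def covar(xs, ys):
    """Population covariance of two equal-length lists (divides by N)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def correl(xs, ys):
    """Pearson's R: covariance scaled by both standard deviations."""
    return covar(xs, ys) / math.sqrt(covar(xs, xs) * covar(ys, ys))

# Five rows from the slide 2 table (the remaining rows are elided there).
temperature = [18, 20, 21, 23, 27]
chirps = [57, 60, 64, 65, 68]

cov_tc = covar(temperature, chirps)
r_tc = correl(temperature, chirps)
print("covariance:", round(cov_tc, 2))   # about 11.16
print("correlation:", round(r_tc, 2))    # about 0.94

# A non-linear relationship: y is fully determined by x, yet R is zero.
xs = [-2, -1, 0, 1, 2]
squares = [x * x for x in xs]
r_parabola = correl(xs, squares)
print("parabola correlation:", r_parabola)
```

The parabola case shows why R should be read as a measure of linear association only: the variables are perfectly related, but not along a straight line.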

8 Magnitude of Association
Although interpretation is discipline specific, we can generally draw the following strengths for |R|, where -1 <= R <= 1:
Correlation Strength   Value of |R|
Weak                   0.0 – 0.3
Moderate               0.3 – 0.6
Strong                 0.6 – 1.0
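The table can be turned into a small lookup, sketched here in Python. The function name `correlation_strength` is hypothetical, and because the table's ranges share their endpoints, this sketch assumes a boundary value (0.3 or 0.6) belongs to the stronger category; other disciplines may draw the lines differently.

```python
def correlation_strength(r):
    """Label the strength of a correlation coefficient using the slide's ranges."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("R must lie between -1 and 1")
    magnitude = abs(r)  # strength ignores the sign (direction) of the correlation
    if magnitude < 0.3:
        return "Weak"
    if magnitude < 0.6:  # assumption: 0.3 itself counts as Moderate
        return "Moderate"
    return "Strong"      # assumption: 0.6 itself counts as Strong

for r in (0.94, -0.45, 0.1, -0.8):
    print(r, correlation_strength(r))
```

Note that -0.8 is labeled "Strong": the sign tells the direction of the relationship, while the magnitude tells its strength.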

9 R is close to +1

10 R is close to 0.5

11 R is close to zero

