11. Multivariate Analysis CSCI N207 Data Analysis Using Spreadsheet Lingma Acheson email@example.com Department of Computer and Information Science, IUPUI
Multivariate Data Analysis
Univariate data analysis concerned itself with describing an entity using a single variable. Multivariate data analysis tries to establish a mathematical relationship between multiple data sets.
–smoking/cancer
–salary/productivity
–temperature/chirps in 15 seconds

Measure   Temperature (X)   Chirps (Y)
1         18                57
2         20                60
3         21                64
4         23                65
5         27                68
…         …                 …
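Correlation
Multivariate data analysis depends largely on correlation. Correlation is a mathematical tool used to measure how strongly two variables vary together. Researchers use Pearson's Correlation Coefficient to represent correlation, signified by R:
$$R = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \, \sigma_Y}$$
where $\mathrm{Cov}(X, Y)$ is the covariance of the two variables and $\sigma_X$, $\sigma_Y$ are their standard deviations.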
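Review
Variance: One measure of dispersion (deviation from the mean) of a data set. The larger the variance, the greater the average squared deviation of each datum from the average value.
$$\text{Variance} = \sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
where $\bar{x}$ = average value of the data set. (This is the population form, dividing by $n$; the sample form divides by $n - 1$ instead.)
Standard deviation: Square root of the variance. The magnitude of the number is more in line with the values in the data set.
$$\text{Standard Deviation} = \sigma = \sqrt{\text{Variance}}$$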
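The two review quantities can be sketched in a few lines of Python. This is only an illustration, assuming the population form (dividing by n); the function names `variance` and `standard_deviation` are made up for this example, and the data are the five temperature readings from the chirps table.

```python
import math

def variance(values):
    """Population variance: mean of the squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def standard_deviation(values):
    """Square root of the variance, in the same units as the data."""
    return math.sqrt(variance(values))

# Temperature (X) column from the temperature/chirps table
temperature = [18, 20, 21, 23, 27]

print("variance:", variance(temperature))
print("standard deviation:", standard_deviation(temperature))
```

For these five readings the variance is about 9.36, while the standard deviation is about 3.06, which is closer in scale to the temperatures themselves.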
Measuring Correlation
R: Values range between -1 (perfect negative or inverse correlation) and +1 (perfect positive correlation).
A positive correlation (+) reflects a situation where an increase in the value of one variable accompanies an increase in the value of the second variable. An R value of +1 is called "perfect positive."
A negative correlation (-) reflects a situation where an increase in the value of one variable accompanies a decrease in the value of the second variable (inverse correlation). An R value of -1 is called "perfect negative."
Measuring Correlation
This measurement applies only to linear relationships: R can be close to 0 even when two variables are strongly related in a non-linear way.
Excel Covariance Function: =COVAR(Range1, Range2)
Excel Correlation Function: =CORREL(Range1, Range2)
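To see what the two Excel functions compute, here is a minimal Python sketch of covariance and Pearson's R written from first principles. The helper names (`mean`, `covariance`, `pearson_r`) are hypothetical, and the covariance divides by n (the population form), which is what Excel's COVAR returns. The data are the five rows of the temperature/chirps table.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance (divides by n), comparable to =COVAR(Range1, Range2)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def pearson_r(xs, ys):
    """Pearson's R: covariance scaled by both standard deviations,
    comparable to =CORREL(Range1, Range2)."""
    sx = math.sqrt(covariance(xs, xs))  # covariance of a variable with itself is its variance
    sy = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sx * sy)

temperature = [18, 20, 21, 23, 27]  # X
chirps = [57, 60, 64, 65, 68]       # Y

print("covariance:", covariance(temperature, chirps))
print("R:", pearson_r(temperature, chirps))
```

For this small sample R comes out at roughly 0.94, so chirps rise almost linearly with temperature. Dividing by the two standard deviations is what confines R to the range -1 to +1 regardless of the units of X and Y.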
Magnitude of Association
Although interpretation is discipline specific, we can generally use the following strength categories for |R|, where -1 <= R <= 1:

Correlation Strength   Value of |R|
Weak                   0.0 – 0.3
Moderate               0.3 – 0.6
Strong                 0.6 – 1.0
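The strength table can be applied mechanically, as in the sketch below. The function name `correlation_strength` is hypothetical. Because the table's ranges share their endpoints, the code assumes that a value of exactly 0.3 counts as Moderate and exactly 0.6 as Strong; other conventions are equally defensible.

```python
def correlation_strength(r):
    """Map a Pearson R value to the Weak / Moderate / Strong labels.

    Assumption: boundary values fall into the stronger category.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("R must lie between -1 and 1")
    magnitude = abs(r)  # strength ignores the sign of the correlation
    if magnitude < 0.3:
        return "Weak"
    if magnitude < 0.6:
        return "Moderate"
    return "Strong"

for r in (0.1, -0.45, 0.94):
    print(r, correlation_strength(r))
```

The sign of R is dropped before the lookup, so -0.45 and +0.45 are both reported as Moderate; the sign only tells you the direction of the relationship.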