Multivariate Statistics
Harry R. Erwin, PhD
School of Computing and Technology, University of Sunderland

Resources
- Everitt, B. S., and G. Dunn (2001). Applied Multivariate Data Analysis. London: Arnold.
- Everitt, B. S. (2005). An R and S-PLUS® Companion to Multivariate Analysis. London: Springer.

Introduction
Most statistical data sets are multivariate. Sometimes it is useful to study a variable in isolation, but usually you need to examine all the variables together to understand the data. The next few lectures are the core of this module. We will examine the description, exploration, and analysis of multivariate data.

Multivariate Data
- The natural form of multivariate data is a table or data frame.
- Kinds of data
  - Unordered categorical variables (nominal data)
  - Ordinal data (ranked categories; ordered but not measured)
  - Interval data (measured on a scale with equal steps but an arbitrary zero)
  - Ratio data (measured on a scale with a true zero)
- Missing values are common.
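The four kinds of data map onto R column types roughly as sketched below. This is an illustration only: the data frame `patients` and every value in it are invented, and the choice of variables is an assumption, not taken from the lecture.

```r
# Hypothetical toy data: all names and values are invented for illustration.
patients <- data.frame(
  blood_group = factor(c("A", "O", "B", "O", "AB")),             # nominal
  pain        = factor(c("mild", "severe", "none", "mild", NA),
                       levels = c("none", "mild", "severe"),
                       ordered = TRUE),                          # ordinal
  temp_c      = c(36.8, 38.1, 37.0, NA, 36.5),                   # interval
  weight_kg   = c(70.2, 81.5, 64.0, 90.3, 58.7)                  # ratio
)

str(patients)              # one row per object, one column per variable
colSums(is.na(patients))   # number of missing values in each variable
```

Storing ordinal data as an ordered factor keeps the ranking without pretending that the gaps between levels are equal.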

Handling Missing Data
- Ignore it (for example, analyse only the complete cases).
  - Often gives biased results.
- Fill in plausible values.
  - Known as imputation
  - An advanced topic
- Be aware that this is a problem area.
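Both options can be tried on R's built-in `airquality` data set, which has missing values in `Ozone` and `Solar.R`. The data set and the mean-fill rule are chosen here for illustration and are not from the lecture. Filling with the column mean is a deliberately crude form of imputation that understates the variability of the data.

```r
data(airquality)   # built-in data set; Ozone and Solar.R contain NAs

# How much is missing, variable by variable?
colSums(is.na(airquality))

# Option 1: ignore it, i.e. keep only the complete cases
complete <- na.omit(airquality)
c(all_rows = nrow(airquality), complete_rows = nrow(complete))

# Option 2: crude imputation, filling each gap with the column mean
imputed <- airquality
for (v in names(imputed)) {
  gaps <- is.na(imputed[[v]])
  imputed[[v]][gaps] <- mean(imputed[[v]], na.rm = TRUE)
}

# The two approaches can give different answers
mean(complete$Ozone)
mean(imputed$Ozone)
```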

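Summary Statistics
- Means: generated by mean for a single variable (use colMeans for every column of a data frame)
- Variances: generated by var
- Covariances: also generated by var (applied to a data frame or matrix, it returns the covariance matrix)
- Correlation coefficients: generated by cor
- Distances: generated by dist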
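A short sketch of these functions, using the numeric columns of the built-in `iris` data set. The data set is an assumption made for illustration; the lecture's own examples may differ.

```r
data(iris)
x <- iris[, 1:4]          # the four numeric measurements

colMeans(x)               # mean of each variable
S <- var(x)               # covariance matrix; variances lie on the diagonal
R <- cor(x)               # correlation matrix
d <- dist(x[1:5, ])       # Euclidean distances between the first five rows

S
R
d

# A correlation is a covariance rescaled by the two standard deviations
all.equal(cov2cor(S), R)
```

Note that `dist` works on rows (objects), whereas `var` and `cor` work on columns (variables).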

Aims
- Data exploration (data mining)
  - Looking for non-random patterns and structures
  - Visual and graphical displays
- Confirmatory analysis (later in the module)
  - Statistical testing

Looking at Multivariate Data
- Scatterplots (demonstration)
- "The convex hull of bivariate data" (demonstration)
- Chiplot (demonstration)
- Bivariate boxplot (demonstration)
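The first two displays can be sketched in base R as follows. The built-in `USArrests` data set and the choice of variables are assumptions made for illustration; the lecture demonstrations may use other data. The chiplot and bivariate boxplot are not part of base R, so they are left out of this sketch.

```r
data(USArrests)
x <- USArrests$Murder
y <- USArrests$Assault

# Scatterplot of the two variables
plot(x, y, xlab = "Murder", ylab = "Assault")

# Convex hull: indices of the points that form the outer boundary
hull <- chull(x, y)
polygon(x[hull], y[hull])      # draw the hull outline on the scatterplot

# A common use of the hull: compare the correlation with and without
# the boundary points, which include the most extreme observations
cor(x, y)
cor(x[-hull], y[-hull])
```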

More Multivariate Graphics
- Bivariate densities (demonstration)
- Other variables in a scatterplot (demonstration)
- Scatterplot matrix (demonstration of pairs)
- 3-D plots (demonstration)
- Conditioning plots and trellis graphics (demonstration)
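A minimal sketch of these displays, again assuming the built-in `iris` data set rather than the lecture's own examples. The density estimate uses `kde2d` from the MASS package, which ships with standard R installations; the grid size of 50 and the viewing angles are arbitrary choices.

```r
library(MASS)   # provides kde2d

data(iris)

# Scatterplot matrix; colour brings a further variable (species) into each panel
pairs(iris[, 1:4], col = as.integer(iris$Species))

# Bivariate density estimate, shown as contours and as a 3-D surface
dens <- kde2d(iris$Sepal.Length, iris$Petal.Length, n = 50)
contour(dens, xlab = "Sepal.Length", ylab = "Petal.Length")
persp(dens, theta = 30, phi = 30,
      xlab = "Sepal.Length", ylab = "Petal.Length", zlab = "density")

# Conditioning plot: one scatterplot panel per species
coplot(Petal.Length ~ Sepal.Length | Species, data = iris)
```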

Summary
- Most statistical data are multivariate.
- Most multivariate data have structure.
- Detecting that structure is what data mining is all about.
- Most data mining involves data visualisation and graphing, nothing more.
- Most of your conclusions from data mining will be obvious, once you see them!
- And you really don't need to learn very much statistics to be good at multivariate data analysis.