Factor & Cluster Analyses. Factor Analysis Goals Data Process Results.

Presentation transcript:

Factor & Cluster Analyses

Factor Analysis: Goals, Data, Process, Results

Factor Analysis Goals
Reduce the number of variables, typically for further analysis.
Measure or understand an underlying construct, e.g. what is intelligence, beauty, or effectiveness?

Factor Analysis Data
Typically numeric.
Variables must have some intercorrelations; without them there is no shared structure for the factors to capture.
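One way to check the intercorrelation requirement is to inspect the correlation matrix before running a factor analysis. The sketch below is illustrative only: the data and the names `pearson` and `correlation_matrix` are assumptions, not part of the original slides.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Correlation of every variable (column) with every other variable."""
    return [[pearson(ci, cj) for cj in columns] for ci in columns]

# Hypothetical data: three variables measured on five observations.
columns = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],
    [5, 3, 4, 1, 2],
]
R = correlation_matrix(columns)
for row in R:
    print(" ".join(f"{r:6.2f}" for r in row))
```

Off-diagonal values near zero everywhere would suggest the variables share little common structure, so a factor analysis would have little to summarize.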

Factor Analysis Data

Factor Analysis Process
Each factor or component is a linear combination of the variables.
Types: Principal Components Analysis and Factor Analysis.
Each successive factor or component maximizes the variance it captures, with zero covariance between components.
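The idea of variance-maximizing, uncorrelated linear combinations can be sketched for the two-variable case, where the eigenvectors of the 2x2 covariance matrix have a closed form. This is a minimal principal components illustration under assumed data, not a factor analysis and not the procedure used in the slides; `pca_2d` and `sample_cov` are hypothetical names.

```python
import math

def sample_cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)

def pca_2d(xs, ys):
    """Principal components of two variables via the 2x2 covariance matrix."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cx = [x - mx for x in xs]  # centered variables
    cy = [y - my for y in ys]
    a, b, c = sample_cov(xs, xs), sample_cov(xs, ys), sample_cov(ys, ys)
    mid = (a + c) / 2
    half = math.sqrt(((a - c) / 2) ** 2 + b * b)
    eigenvalues = [mid + half, mid - half]  # component variances, largest first
    if abs(b) > 1e-12:
        raw = [(b, lam - a) for lam in eigenvalues]  # eigenvectors of [[a, b], [b, c]]
    else:
        # Variables already uncorrelated: the axes themselves are the components.
        raw = [(1.0, 0.0), (0.0, 1.0)] if a >= c else [(0.0, 1.0), (1.0, 0.0)]
    components = [(v[0] / math.hypot(*v), v[1] / math.hypot(*v)) for v in raw]
    # Each score is a linear combination of the centered variables.
    scores = [[u * w[0] + v * w[1] for u, v in zip(cx, cy)] for w in components]
    return eigenvalues, components, scores

# Hypothetical data: two correlated variables on ten observations.
xs = [2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1]
ys = [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]
eigenvalues, components, scores = pca_2d(xs, ys)
score_cov = sample_cov(scores[0], scores[1])
print("component variances:", eigenvalues)
print("covariance between components:", score_cov)
```

The covariance between the two score vectors comes out as zero up to rounding, and the first component's variance equals the larger eigenvalue, which is the property the slide describes.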

Factor Analysis Results
These numbers show the relative importance of each variable within a component.

Cluster Analysis: Goals, Data, Process, Results

Cluster Analysis Goals
Find subgroups within a larger group.
Create profiles of subgroups for further action (marketing, medical intervention, etc.).

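Cluster Analysis Data
Typically numeric.
Free of correlations: strongly correlated variables effectively count the same information more than once in a distance calculation.
Free of outliers: extreme observations can distort cluster means.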
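A simple screen for outliers before clustering is to flag observations whose z-score exceeds a cutoff. The slides do not specify a screening rule, so the cutoff of 3 standard deviations, the data, and the name `flag_outliers` below are all assumptions for illustration.

```python
import statistics

def flag_outliers(values, threshold=3.0):
    """Return the indices of values more than `threshold` sample SDs from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [i for i, v in enumerate(values) if abs(v - mean) / sd > threshold]

# Hypothetical variable with one extreme observation at the end.
values = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10, 11, 100]
outliers = flag_outliers(values)
print("outlier indices:", outliers)
```

Note that the extreme value itself inflates the mean and standard deviation, so in small samples a z-score screen can miss outliers; a robust rule based on the median may be preferable in practice.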

Cluster Analysis Data

Cluster Analysis Process
K-means clustering: the number of clusters is prespecified, and observations are assigned based on Euclidean distances.
Hierarchical tree: each observation starts as its own cluster, and the number of clusters is iteratively reduced by merging.
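Both procedures can be sketched in a few lines of plain Python. The slides do not say how k-means is initialized or which linkage rule the hierarchical tree uses, so the choices below (first k points as starting centroids, single linkage for merging) and the names `kmeans` and `agglomerate` are assumptions made for illustration.

```python
import math

def kmeans(points, k, max_iter=100):
    """K-means with a prespecified k and Euclidean distances."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assign each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

def agglomerate(points, n_clusters):
    """Hierarchical clustering: start with one cluster per observation and merge."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Assumption: single linkage (distance between closest members).
                d = min(
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return clusters

# Hypothetical data: two well-separated groups of three points each.
points = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
labels, centroids = kmeans(points, k=2)
tree_clusters = agglomerate(points, n_clusters=2)
print("k-means labels:", labels)
print("hierarchical clusters (point indices):", tree_clusters)
```

On well-separated data like this, both approaches recover the same two groups; they can disagree on less clean data, and the k-means result can depend on the starting centroids.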

Cluster Analysis Results
Cluster means: the mean of each variable over all the observations within the cluster is output. The combined set of these means for a cluster is called the cluster centroid.
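Given cluster assignments, the centroid table is a per-cluster mean of every variable. The following is a small sketch with made-up data; the variable names `age` and `spend` and the function `cluster_centroids` are hypothetical.

```python
from collections import defaultdict

def cluster_centroids(rows, labels, names):
    """Mean of each variable within each cluster, keyed by cluster label."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    result = {}
    for label, members in groups.items():
        means = [sum(col) / len(members) for col in zip(*members)]
        result[label] = dict(zip(names, means))
    return result

# Hypothetical observations (age, spend) with already-assigned cluster labels.
rows = [(25, 200.0), (30, 220.0), (60, 50.0), (64, 70.0)]
labels = [0, 0, 1, 1]
profiles = cluster_centroids(rows, labels, names=("age", "spend"))
for label, centroid in sorted(profiles.items()):
    print(label, centroid)
```

Reading the centroids side by side is what turns clusters into profiles: here one cluster would be described as younger with higher spend, the other as older with lower spend.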

Cluster Analysis Results