Benefits Key Features and Results. XLSTAT-ADA’s functions.

Benefits Key Features and Results

XLSTAT-ADA’s functions

Canonical correlation analysis
- Studies the correlation between two sets of variables.
- Extracts a set of canonical variables that are as closely correlated with both tables as possible and orthogonal to each other.
- Symmetrical method.
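As a rough illustration of the idea, the canonical correlations between two tables can be computed with NumPy from a QR decomposition of each centered table followed by a singular value decomposition. This is only a sketch on synthetic data: the function name `canonical_correlations` and the random tables are assumptions made for the example, not XLSTAT-ADA's implementation, and both tables are assumed to have full column rank.

```python
import numpy as np

def canonical_correlations(x, y):
    """Canonical correlations between the columns of x and the columns of y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Center each variable so that only co-variation matters.
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    # Orthonormal bases of the two centered column spaces.
    qx, _ = np.linalg.qr(xc)
    qy, _ = np.linalg.qr(yc)
    # Singular values of Qx' Qy are the canonical correlations (descending).
    s = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Synthetic data (hypothetical): 20 observations, 3 variables per table.
rng = np.random.default_rng(1)
x = rng.normal(size=(20, 3))
mixing = np.array([[1.0, 0.5, 0.0],
                   [0.0, 1.0, -1.0],
                   [2.0, 0.0, 3.0]])
y_linear = x @ mixing                              # exact linear function of x
y_noisy = y_linear + rng.normal(size=(20, 3))      # same relation plus noise

rho_linear = canonical_correlations(x, y_linear)
rho_noisy = canonical_correlations(x, y_noisy)
```

When one table is an exact invertible linear transformation of the other, every canonical correlation equals 1. Adding noise lowers them. The method is symmetric, so swapping the two tables gives the same values.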

Canonical correlation analysis
Data recorded on men in a training center, in two sets:
- Physiological data: Weight, Waist, Pulse
- Exercises the men did: Chin-ups, Sit-ups, Jumps

Canonical correlation analysis
- Men who do sit-ups or chin-ups usually have a smaller waist.
- In general, people who train more have a smaller waist and a lower weight.
- Jumping seems to have an impact on weight, but not as much on the waist.

Redundancy analysis
- Redundancy Analysis is an alternative to Canonical Correlation Analysis.
- Non-symmetric method.
- The components extracted from X are such that they are as closely correlated with the variables of Y as possible. Then, the components of Y are extracted so that they are as closely correlated with the components extracted from X as possible.

Redundancy analysis
Same example as for canonical correlation analysis: data recorded on men in a training center, in two sets:
- Physiological data: Weight, Waist, Pulse
- Exercises the men did: Chin-ups, Sit-ups, Jumps

Redundancy analysis
- The same relationships are observed:
  - Men who do more sit-ups or chin-ups usually have a smaller waist.
  - In general, people who train more have a smaller waist and a lower weight.
  - Jumping seems to have an impact on weight, but not as much on the waist.
  - Note that the first factor explains more variance than in canonical correlation analysis (93.30).
- The larger the waist, the lower the pulse.

Redundancy analysis
- It is possible to project the observations onto the same plot.
- It is easy to see which men do more exercise and which are fitter.

Canonical Correspondence Analysis
- Canonical Correspondence Analysis (CCA) was developed to allow ecologists to relate the abundance of species to environmental variables.
- CCA gives a simultaneous representation of the sites, the objects, and the variables describing the sites.
- Principles of Canonical Correspondence Analysis:
  - T1: contingency table, n sites by p species
  - T2: n sites by q descriptive variables

Canonical Correspondence Analysis
- Canonical Correspondence Analysis can be divided into two parts:
  - A constrained analysis in a space whose number of dimensions is equal to q: analysis of the relation between the two tables T1 and T2.
  - An unconstrained part: analysis of the residuals.
- XLSTAT-ADA also offers:
  - Partial CCA
  - PLS-CCA

Canonical Correspondence Analysis
- Contingency table: the counts of 10 species of insects on 12 different sites in a tropical region.
- A second table includes 3 quantitative variables that describe the 12 sites: altitude, humidity, and distance to the lake.

Canonical Correspondence Analysis
- Some insects (2, 4 and 5) prefer humid sites, such as sites 7 to 12, while others (1, 6, 8 and 10) prefer dry climates.
- Insect 9 prefers higher altitudes.

Principal coordinate analysis
- Principal Coordinate Analysis is aimed at graphically representing a resemblance matrix between p elements.
- The algorithm can be divided into three steps:
  1. Computation of a distance matrix for the p elements: the n by p data table (x_ij) yields a p by p distance matrix (d_ij) with zeros on the diagonal.
  2. Centering of the matrix by rows and columns: each entry d_ij becomes d_ij - r_i - c_j.
  3. Eigen-decomposition of the centered distance matrix.
- The rescaled eigenvectors correspond to the principal coordinates that can be used to display the p objects in a space with 1, 2, ..., p-1 dimensions.
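The three steps can be sketched in a few lines of NumPy. This sketch assumes the classical (Gower) formulation, in which the quantity that is double-centered is -1/2 times the squared distance rather than the raw distance; the slides do not spell out this detail, so treat it as an assumption. The function name `pcoa` and the random test points are hypothetical.

```python
import numpy as np

def pcoa(dist, n_axes=2):
    """Classical principal coordinate analysis of a symmetric distance matrix."""
    d = np.asarray(dist, dtype=float)
    p = d.shape[0]
    # Step 2: double-center -0.5 * d**2 by rows and columns.
    a = -0.5 * d ** 2
    centering = np.eye(p) - np.ones((p, p)) / p
    b = centering @ a @ centering
    # Step 3: eigen-decomposition of the centered (symmetric) matrix.
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Rescale eigenvectors by sqrt(eigenvalue) to get principal coordinates.
    scale = np.sqrt(np.clip(eigvals[:n_axes], 0.0, None))
    return eigvecs[:, :n_axes] * scale, eigvals

# Step 1: a Euclidean distance matrix between 5 synthetic elements in 3-D.
rng = np.random.default_rng(0)
points = rng.normal(size=(5, 3))
dist = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1))

coords, eigvals = pcoa(dist, n_axes=3)
# Distances between the principal coordinates reproduce the input distances.
recovered = np.sqrt(((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1))
```

Because the synthetic distances are Euclidean and come from three dimensions, three axes are enough to reproduce them exactly. With non-Euclidean dissimilarities some eigenvalues can be negative, which is why the sketch clips them at zero before taking the square root.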

Principal coordinate analysis
- 5 products are graded by 10 individuals.
- Note that product 4 is the preferred one.

Principal coordinate analysis
- The result is a map of the proximity of the 5 products.
- P1 and P3 are the most similar products.

Generalized Procrustes Analysis (GPA)
- GPA is a pretreatment used to reduce scale effects and obtain a consensus configuration on data where products have been graded by several judges.
- GPA compares the proximity between the terms that different experts use to describe products.

Generalized Procrustes Analysis (GPA)
- 10 experts graded 4 cheeses on 3 sensory attributes: Acidity, Strangeness, Hardness

Generalized Procrustes Analysis (GPA)
- The experts do not give each product exactly the same grade.

Generalized Procrustes Analysis (GPA)
- A consensus can be found for the position of each product.
- Cheeses 1 and 2 are the strangest.
- Cheese 3 is the hardest.

Generalized Procrustes Analysis (GPA)
- Strangeness is not graded in the same way by the different experts.
- Acidity and Hardness are quite reproducible.

Multiple Factor Analysis (MFA)
- MFA is a generalization of PCA (Principal Component Analysis) and MCA (Multiple Correspondence Analysis).
- MFA makes it possible to:
  - Analyze several tables of variables simultaneously.
  - Obtain results that allow studying the relationship between the observations, the variables and the tables.

Multiple Factor Analysis (MFA)
- 36 experts graded 21 wines on several criteria:
  - Olfactory (5 variables)
  - Visual (3 variables)
  - Taste (9 variables)
  - Quality (2 variables)

Multiple Factor Analysis (MFA)
- MFA groups the information on one chart.

Multiple Factor Analysis (MFA)
- Wine 13 lies in the direction of the two quality variables and is therefore the preferred wine.

Multiple Factor Analysis (MFA)
- The olfactory criteria often increase the distance between the wines.

Penalty analysis
- Identifies potential directions for the improvement of products, on the basis of surveys performed on consumers or experts.
- Two types of data are used:
  - Preference data (or liking scores) for a product or for a characteristic of a product
  - Data collected on a JAR (Just About Right) scale

Penalty analysis
A type of potato chips is evaluated:
- By 150 consumers
- On a JAR scale (1 to 5) for 4 attributes: Saltiness, Sweetness, Acidity, Crunchiness
- On an overall liking scale (1 to 10)

Penalty analysis
- Penalty: mean of liking for JAR minus mean of liking for too little and too much.
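The difference of means above can be sketched in plain Python for a single attribute. The sketch assumes that the 5-point JAR scale is collapsed into three groups (below 3 is "too little", 3 is "just about right", above 3 is "too much"), which the slides do not state explicitly. The function name `penalty_analysis` and the eight made-up responses are hypothetical.

```python
from statistics import mean

def penalty_analysis(jar_scores, liking_scores, jar_level=3):
    """Liking mean drops relative to the JAR group, for one attribute."""
    groups = {"too little": [], "JAR": [], "too much": []}
    for jar, liking in zip(jar_scores, liking_scores):
        if jar < jar_level:
            groups["too little"].append(liking)
        elif jar > jar_level:
            groups["too much"].append(liking)
        else:
            groups["JAR"].append(liking)
    if not groups["JAR"]:
        raise ValueError("no respondent rated the attribute as just about right")
    jar_mean = mean(groups["JAR"])
    # Pool both non-JAR groups for the overall penalty.
    groups["not JAR"] = groups["too little"] + groups["too much"]
    return {
        label: jar_mean - mean(values)
        for label, values in groups.items()
        if label != "JAR" and values
    }

# Made-up responses from 8 consumers for one attribute (illustration only).
saltiness = [1, 2, 3, 3, 3, 4, 5, 3]   # JAR scale, 1 to 5
liking = [4, 5, 8, 9, 7, 6, 4, 8]      # overall liking, 1 to 10

drops = penalty_analysis(saltiness, liking)
# drops == {"too little": 3.5, "too much": 3, "not JAR": 3.25}
```

A large drop for one side suggests the direction in which the attribute should move. In practice the size of each group also matters, since a large drop based on very few respondents is weak evidence.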

Semantic differential charts
- The semantic differential method is a visualization method to plot the differences between individuals' connotations for a given word.
- This method can be used for:
  - Analyzing experts’ agreement on the perceptions of a product described by a series of criteria on similar scales
  - Analyzing customer satisfaction surveys and segmentation
  - Profiling products

Semantic differential charts
- 1 yoghurt
- 5 experts
- 6 attributes: Color, Fruitiness, Sweetness, Unctuousness, Taste, Smell

Semantic differential charts

TURF analysis
- TURF = Total Unduplicated Reach and Frequency method.
- Highlights a line of products from a complete range of products in order to have the highest market share.
- XLSTAT offers three algorithms to find the best combination of products.

TURF analysis
- 27 possible dishes
- 185 customers
- "Would you buy this product?" (1: No, not at all to 5: Yes, quite sure)
- The goal is to obtain a product line of 5 dishes maximizing the reach.
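To make "maximizing the reach" concrete, here is a minimal greedy sketch in plain Python. The slides do not name the three XLSTAT algorithms, so this greedy heuristic is not necessarily one of them, and it is not guaranteed to find the optimal line. The rule that a customer counts as reached when at least one product in the line scores 4 or more is an assumption, and the function names and the small ratings table are hypothetical.

```python
def reach(ratings, line, threshold=4):
    """Number of customers who rate at least one product in the line >= threshold."""
    return sum(1 for row in ratings if any(row[j] >= threshold for j in line))

def greedy_turf(ratings, size, threshold=4):
    """Build a product line by adding, at each step, the product that raises reach most."""
    n_products = len(ratings[0])
    line = []
    for _ in range(size):
        candidates = [j for j in range(n_products) if j not in line]
        # Ties are broken in favor of the lowest product index.
        best = max(candidates, key=lambda j: reach(ratings, line + [j], threshold))
        line.append(best)
    return line, reach(ratings, line, threshold)

# Made-up purchase-intent scores: 6 customers (rows) by 4 products (columns).
ratings = [
    [5, 1, 1, 2],
    [4, 2, 1, 1],
    [1, 5, 2, 1],
    [2, 1, 4, 1],
    [5, 4, 1, 1],
    [1, 1, 1, 3],
]

line, reached = greedy_turf(ratings, size=2)
# line == [0, 1]; reached == 4 customers out of 6
```

For a small range of products, an exhaustive search over all combinations gives the exact optimum. The greedy approach trades that guarantee for speed when the number of combinations becomes large.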

TURF analysis

Product characterization
- Finds which descriptors discriminate well between the products of a set, and which characteristics are the most important for each product.
- All computations are based on the analysis of variance (ANOVA) model.
- Checks the influence on the attribute scores of:
  - Product
  - Judge
  - Session
  - Judge*Product
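As a simplified illustration of how ANOVA flags a discriminating descriptor, the sketch below computes a one-way F statistic with Product as the only factor. This is a deliberate reduction: the model described above also includes Judge, Session and Judge*Product effects, which are not covered here. The function name `one_way_f` and the scores are hypothetical.

```python
def one_way_f(groups):
    """One-way ANOVA F statistic; each group holds the scores given to one product."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation between product means, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own product mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Made-up scores for one descriptor: 3 products, 3 assessors each.
sweetness_by_product = [[1, 2, 3], [2, 3, 4], [7, 8, 9]]
f_stat = one_way_f(sweetness_by_product)
# f_stat is 31.0 up to rounding (between mean square 31, within mean square 1)
```

A large F value means the product means differ a lot compared with the spread of scores within each product, so the descriptor separates the products well. Comparing F to an F distribution with (k - 1, n - k) degrees of freedom would give the corresponding p-value.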

Product characterization
- 29 assessors
- 6 chocolate drinks
- 14 characteristics:
  - Cocoa and milk taste and flavor
  - Other flavors: vanilla, caramel
  - Tastes: bitterness, astringency, acidity, sweetness
  - Texture: granular, crunchy, sticky, melting

Product characterization

DOE for sensory data analysis
- Designing an experiment is a fundamental step to ensure that the collected data will be statistically usable in the best possible way.

DOE for sensory data analysis
- Prepare a sensory evaluation where judges (experts and/or consumers) evaluate a set of products, taking into account:
  - Number of judges to involve
  - Maximum number of products that a judge can evaluate during each session
  - Which products will be evaluated by each of the consumers in each session, and in what order (carry-over)
- Complete plans or incomplete block designs, balanced or not.
- Search for optimal designs with A- or D-efficiency.

DOE for sensory data analysis
- 60 judges
- 8 products
- Saturation: 3 products / judge

Let XLSTAT-ADA complete your advanced analytical needs