Blind Information Processing: Microarray Data
Hyejin Kim, Dukhee Kim, Seungjin Choi
Department of Computer Science and Engineering / Department of Chemical Engineering, POSTECH, Korea

Outline
- Blind Information Processing?
- Independent Component Analysis (ICA)
- Application of ICA to Microarray Data
  - Time courses: yeast cell cycle data

Information Processing vs. Blind Information Processing
Blind information processing works with little prior knowledge about how the observed data were generated.

Latent Variable Models
- Data space (observations) and latent variable space
- Generative models map the latent space to the data space: FA, PPCA, ICA, GTM
- Recognition models map the data space to the latent space: PCA, ICA, SOM

What is ICA?
ICA is a statistical method whose goal is to decompose given multivariate data into a linear sum of statistically independent components. For example, given a two-dimensional vector x = [x_1, x_2]^T, ICA aims at finding the decomposition
x = a_1 s_1 + a_2 s_2,
where a_1, a_2 are basis vectors and s_1, s_2 are basis coefficients.
Constraint: the basis coefficients s_1 and s_2 are statistically independent.
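A minimal sketch of this decomposition, assuming scikit-learn's FastICA as the implementation (the slides do not name one): two independent, non-Gaussian coefficient sequences are mixed through non-orthogonal basis vectors and then recovered from the observations alone.

```python
# Hedged sketch: recover the decomposition x = a1*s1 + a2*s2 with FastICA.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n = 2000

# Independent, non-Gaussian coefficients s1, s2 (independence is the ICA constraint).
S = np.column_stack([rng.laplace(size=n), rng.uniform(-1.0, 1.0, size=n)])

# Basis vectors a1, a2 are the columns of A; each observation is x = a1*s1 + a2*s2.
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
X = S @ A.T

ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)   # estimated coefficients (up to permutation and scale)
A_hat = ica.mixing_            # estimated basis vectors a1, a2 (as columns)
print(np.round(A_hat, 2))
```

The recovered basis vectors match the true ones only up to permutation and scaling, the usual ICA ambiguities.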

Information Geometry of ICA
(Diagram: the output distribution of y relative to the product manifold of factorized distributions, annotated with mutual information and marginal mismatch.)

PCA vs ICA
Both are linear transforms used for compression and classification.
- PCA: orthogonal transform; second-order statistics; optimal coding in the mean-square (MS) sense.
- ICA: non-orthogonal transform; higher-order statistics; related to projection pursuit; better than PCA in classification tasks?
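A small comparison sketch under the same assumptions as above (scikit-learn, synthetic data): on one non-Gaussian mixture, PCA returns orthogonal axes ranked by variance, while ICA returns non-orthogonal directions aligned with the independent sources.

```python
# Hedged sketch: contrast PCA's orthogonal axes with ICA's non-orthogonal basis.
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(1)
S = np.column_stack([rng.laplace(size=5000), rng.laplace(size=5000)])
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
X = S @ A.T

pca = PCA(n_components=2).fit(X)
ica = FastICA(n_components=2, random_state=0).fit(X)

# PCA axes are orthonormal (second-order statistics only), so their dot product is ~0.
print("PCA axis dot product:", pca.components_[0] @ pca.components_[1])

# ICA basis vectors need not be orthogonal (higher-order statistics).
a1, a2 = ica.mixing_[:, 0], ica.mixing_[:, 1]
cos_angle = (a1 @ a2) / (np.linalg.norm(a1) * np.linalg.norm(a2))
print("cosine between ICA basis vectors:", cos_angle)
```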

Example of PCA

PCA vs ICA
(Figure: PCA uses an orthogonal coordinate system, ICA a non-orthogonal one.)

PCA vs ICA
(Figure: scatter plot in the x1-x2 plane with the PCA and ICA directions overlaid.)

Microarray Data (1)

Microarray Data Analysis (1)
The expression data are modeled as a linear mixture: the gene expression profile of a sample, x, is a weighted sum of expression modes (profiles over genes), with the weights giving the influence of each mode on that sample. A toy illustration follows below.
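The dimensions below are invented for illustration; only the linear model itself comes from the slide.

```python
# Hedged sketch: expression matrix = (sample influences) x (expression modes).
import numpy as np

rng = np.random.default_rng(2)
n_samples, n_genes, n_modes = 4, 10, 2

modes = rng.standard_normal((n_modes, n_genes))        # expression modes over genes
influence = rng.standard_normal((n_samples, n_modes))  # influence of each mode on each sample
X = influence @ modes                                   # (n_samples x n_genes) expression data

# ICA estimates `modes` and `influence` from X alone, requiring the recovered
# components to be statistically independent.
print(X.shape)
```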

ICA: Time Courses (1)
- Time courses: yeast cell cycle data, 77 arrays by 6178 ORF expression values (Spellman et al.)
- Each mode shows a specific cell-cycle behavior
- ICA modes remain inactive within some of the experiments
- Dimension reduction improves the prediction of cell-cycle-regulated genes
A code sketch of this analysis is given below.
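A hedged sketch of this analysis; the file name, preprocessing, and the choice of 12 components are assumptions for illustration, not taken from the slides.

```python
# Hedged sketch: ICA of yeast cell-cycle time courses (77 arrays x 6178 ORFs).
import numpy as np
from sklearn.decomposition import FastICA

# Hypothetical file holding the Spellman et al. expression matrix, arrays as rows.
X = np.loadtxt("spellman_expression_77x6178.tsv")
X = X - X.mean(axis=0)                  # center each gene across the 77 arrays

k = 12                                  # illustrative number of components
ica = FastICA(n_components=k, random_state=0)
time_courses = ica.fit_transform(X)     # (77, k): activity of each mode over the experiments
expression_modes = ica.mixing_.T        # (k, 6178): gene loadings defining each mode

# Genes contributing most strongly to one mode, e.g. mode 0:
top_genes = np.argsort(np.abs(expression_modes[0]))[::-1][:50]
print(top_genes[:10])
```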

ICA: Time Courses (2), following Liebermeister
(Figures: Mode 1 and Mode 2 from the 76-component and 12-component runs, plotted across the alpha, elutriation, cdc15, and cdc28 experiments.)

PCA Results

ICA Results (I)

ICA Results (II)

Conclusion
- Linear models of gene expression: under the model assumptions, the matrix decomposition simultaneously interprets the expression patterns and clusters co-activated genes.
- ICA advantages: a more biologically meaningful analysis; no ordering or orthogonality constraints on the components; more sensitive in detecting expression patterns.