Observing the data: searching for correlated spectral regions.



6- Classification of spectra. Given a number of spectra, how can they be classified by "similarity"?

6- Classification of spectra: hierarchical clustering. Step 1: the Euclidean distance between each pair of spectra is calculated. Figure: 5 spectra represented in a 2-D space (say we recorded only the absorbance at two wavenumbers).
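Step 1 can be sketched in a few lines of Python. This is only an illustration: the five two-point "spectra" are made-up values, and the helper names `euclidean` and `pairwise_distances` are hypothetical.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two spectra sampled at the same wavenumbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pairwise_distances(spectra):
    """Return {(i, j): distance} for every pair of spectra with i < j."""
    return {(i, j): euclidean(spectra[i], spectra[j])
            for i, j in combinations(range(len(spectra)), 2)}

# Five made-up spectra, each reduced to the absorbance at two wavenumbers.
spectra = [(0.0, 0.0), (3.0, 4.0), (3.0, 5.0), (10.0, 0.0), (10.0, 1.0)]
distances = pairwise_distances(spectra)
print(distances[(0, 1)])  # 5.0
```

With n spectra there are n(n-1)/2 pairs, so the five spectra above give ten distances.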

6- Classification of spectra: hierarchical clustering. Step 2: grouping starts by linking the closest spectra. Figure: grouping of spectra (#1 to 5) and clusters (beyond #5).
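The linking step might look like the sketch below. The slide does not say which linkage rule was used, so single linkage (distance between the closest members of two clusters) is an assumption here, as are the made-up data and the function name `agglomerate`. Spectra are numbered from 1 and each new cluster receives the next free number, following the numbering convention of the figure.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(spectra):
    """Single-linkage agglomerative clustering (assumed linkage rule).

    Returns a list of merges (id_a, id_b, distance, new_id). Spectra are
    numbered 1..n; each new cluster gets the next free number n+1, n+2, ...
    """
    clusters = {i + 1: [s] for i, s in enumerate(spectra)}
    next_id = len(spectra) + 1
    merges = []
    while len(clusters) > 1:
        ids = sorted(clusters)
        best = None
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                # single linkage: distance between the closest members
                d = min(euclidean(p, q)
                        for p in clusters[a] for q in clusters[b])
                if best is None or d < best[2]:
                    best = (a, b, d)
        a, b, d = best
        # merge the closest pair into a new, higher-numbered cluster
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d, next_id))
        next_id += 1
    return merges

# Five made-up spectra, each reduced to the absorbance at two wavenumbers.
spectra = [(0.0, 0.0), (3.0, 4.0), (3.0, 5.0), (10.0, 0.0), (10.0, 1.0)]
merges = agglomerate(spectra)
for a, b, d, new_id in merges:
    print(f"#{a} + #{b} -> #{new_id} at distance {d:.2f}")
```

The list of merges and their distances is exactly the information a dendrogram displays: each merge is a junction, drawn at a height equal to its distance.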

6- Classification of spectra: hierarchical clustering. Step 3: dendrogram representation. Statistical significance of the distances.

50 proteins

6- Classification of spectra: hierarchical clustering. Figure label: donovani.

6- Classification of spectra: K-means clustering. kmeans treats each spectrum as an object having a location in space. It finds a partition in which spectra within each cluster are as close to each other as possible, and as far from spectra in other clusters as possible. kmeans uses an iterative algorithm that minimizes the sum of distances from each object to its cluster centroid, over all clusters. This algorithm moves objects between clusters until the sum cannot be decreased further. The result is a set of clusters that are as compact and well-separated as possible.
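The iteration described above can be illustrated with a minimal sketch of Lloyd's batch algorithm. Several choices here are assumptions and not taken from the slide: squared Euclidean distance, initial centroids supplied by the caller, and made-up two-point spectra. It is not necessarily the routine used to produce the results shown.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm; `centroids` holds the caller's initial guesses."""
    centroids = [tuple(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # assignment step: nearest centroid by squared Euclidean distance
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: sum((x - c) ** 2
                                  for x, c in zip(p, centroids[k])))
            for p in points
        ]
        if new_labels == labels:
            break  # no spectrum changed cluster: converged
        labels = new_labels
        # update step: each centroid becomes the mean of its members
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels, centroids

# Six made-up spectra reduced to two absorbance values each.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The result depends on the initial centroids, so in practice the procedure is usually repeated from several starting points and the most compact partition is kept.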

Identification of cell lines

Chemical differences. Band assignments: ν_as(CH₃), ν_s(CH₃), ν_as(CH₂), ν(C=O) ester, δ(CH₃), δ(CH₂), ν_as(PO₂⁻), ν(C-O), ν_s(PO₂⁻), ν_as(N-(CH₃)₃⁺), Amide I ν(C=O) amide, Amide II δ(N-H) amide, ν(C-OH). Figure traces: phospholipid (DMPC), glycoprotein, mucin, RNA, DNA. Correlated with growth, not with species.

INFRARED MEASUREMENTS X3

Infrared spectrum of a cell. IR spectrum = fingerprint of: the chemical nature of the components (glycosylations, DNA, RNA, proteins, lipids, ...); the conformation of the molecules, especially proteins.

Fingerprinting and cell classification

FTIR of bacteria. (a) Dendrogram of a hierarchical cluster analysis performed on 240 spectra of different strains of Gram-positive and Gram-negative bacteria, and of yeasts belonging to the genus Candida. (b) Dendrogram obtained when cluster analysis is performed on the yeast spectra only.

FTIR of bacteria. Spectral typing of closely related microorganisms. (a) Clinical isolates of E. coli (numbers in right column) belonging to different serogroups: O 25, O 18, and O 114 according to their O-antigenic structure.

Application to eukaryotic cells: 1. Identification of Leishmania sp. 2. K562 leukemia cells, wild type or resistant 3. Classification of glial cells 4. Mode of action of anticancer drugs 5. Microscopic studies of tissues

L. lainsoni versus L. brasiliensis (87 spectra)

Spectra recording and processing. Recording: 2 cm⁻¹, 256 scans; noise evaluation. Water vapor subtraction (when necessary); apodization at 4 cm⁻¹ final resolution. Baseline subtraction (typically …). Scaling to the same area under … cm⁻¹.
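The last two processing steps can be sketched as follows. The transcript does not preserve the baseline points or the wavenumber range used for scaling, so this example makes its own assumptions: a straight baseline drawn between the first and last points of the window, trapezoidal integration, scaling to unit area, and made-up absorbance values. All function names are hypothetical.

```python
def subtract_linear_baseline(wavenumbers, absorbance):
    """Subtract the straight line joining the first and last points (assumed baseline)."""
    x0, x1 = wavenumbers[0], wavenumbers[-1]
    y0, y1 = absorbance[0], absorbance[-1]
    slope = (y1 - y0) / (x1 - x0)
    return [y - (y0 + slope * (x - x0))
            for x, y in zip(wavenumbers, absorbance)]

def area(wavenumbers, absorbance):
    """Trapezoidal area under the curve; abs() makes axis direction irrelevant."""
    total = 0.0
    for i in range(1, len(wavenumbers)):
        width = wavenumbers[i] - wavenumbers[i - 1]
        total += 0.5 * (absorbance[i] + absorbance[i - 1]) * width
    return abs(total)

def scale_to_unit_area(wavenumbers, absorbance):
    """Divide by the area so that every spectrum integrates to 1."""
    a = area(wavenumbers, absorbance)
    return [y / a for y in absorbance]

# Made-up window of five points.
wavenumbers = [1000.0, 1002.0, 1004.0, 1006.0, 1008.0]
absorbance = [0.10, 0.30, 0.50, 0.30, 0.20]
corrected = subtract_linear_baseline(wavenumbers, absorbance)
scaled = scale_to_unit_area(wavenumbers, corrected)
```

Scaling every spectrum to the same area removes differences in sample thickness or quantity, so that the remaining differences reflect composition.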

Identification of eukaryotic cells: Leishmania lainsoni, Leishmania brasiliensis

Significant differences (Student's t-test). Figure traces: difference of the means; mean for L. lainsoni; mean for L. brasiliensis. * Student positive, alpha = 0.01
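A wavenumber-by-wavenumber Student test could be sketched as below. The pooled-variance (equal variance) form of the test is an assumption, the spectra are made-up values, and the function names are hypothetical. To stay within the standard library the critical value is passed in by the caller; 4.604 is the two-sided critical t for alpha = 0.01 with 4 degrees of freedom, which matches the toy data (3 + 3 spectra).

```python
import math
from statistics import mean, variance

def student_t(group_a, group_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    dof = na + nb - 2
    pooled = ((na - 1) * variance(group_a)
              + (nb - 1) * variance(group_b)) / dof
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, dof

def significant_wavenumbers(spectra_a, spectra_b, t_critical):
    """Indices of the wavenumbers where |t| exceeds the critical value."""
    flagged = []
    columns = zip(zip(*spectra_a), zip(*spectra_b))
    for idx, (col_a, col_b) in enumerate(columns):
        t, _ = student_t(col_a, col_b)
        if abs(t) > t_critical:
            flagged.append(idx)
    return flagged

# Made-up data: 3 spectra per species, 2 wavenumbers per spectrum.
species_a = [[0.50, 0.20], [0.52, 0.25], [0.51, 0.15]]
species_b = [[0.70, 0.21], [0.72, 0.18], [0.71, 0.24]]
flagged = significant_wavenumbers(species_a, species_b, t_critical=4.604)
print(flagged)  # [0]
```

Only the first wavenumber is flagged: the two means differ there by far more than the scatter within each group, whereas at the second wavenumber the groups overlap.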

Supervised / unsupervised classification. Unsupervised analysis: principal component decomposition (cross-validation). Supervised analysis: linear regression (cross-validation).

Distance between spectra. Model built after variable selection and principal component analysis. Figure labels: Leishmania lainsoni, Leishmania brasiliensis.

L. lainsoni versus L. brasiliensis: MANOVA using 1753, 1724, 3008 and 1430 cm⁻¹. Figure labels: L. brasiliensis, L. lainsoni.

Classification of four species: Leishmania peruviana, Leishmania lainsoni, Leishmania donovani, Leishmania brasiliensis

Distance analysis between species donovani

Chemical differences. * Student positive, alpha = 0.01. Trace: mean L. brasiliensis – mean L. lainsoni. Band assignments: ν_as(CH₃), ν_s(CH₃), ν_as(CH₂), ν(C=O) ester, δ(CH₃), δ(CH₂), ν_as(PO₂⁻), ν(C-O), ν_s(PO₂⁻), ν_as(N-(CH₃)₃⁺), Amide I ν(C=O) amide, Amide II δ(N-H) amide, ν(C-OH). Figure traces: phospholipid (DMPC), glycoprotein, mucin, RNA, DNA. Correlated with growth, not with species.

Effect of culture growth. Comparing two strains. Figure traces: strain 1, mean; strain 2, mean; difference of the means. Axis: culture day.

CONCLUSIONS 1. Some spectral regions describe the growth of the culture independently of the species. 2. Other regions describe the species independently of the state of the culture. 3. FTIR spectroscopy could become a fast and inexpensive tool for identifying Leishmania sp.

2) Resistant / sensitive K562. Trace A: representative infrared spectrum of resistant K562 cells. Trace B: representative infrared spectrum of sensitive K562 cells. Trace C: difference infrared spectrum between resistant and sensitive K562 cells, magnified 4 times. Trace D: result of the Student test performed at alpha level = 5%; the wavenumbers in blue are significantly different between the two cell lines. Example of information retrieved from the data: the intensity ratio between 2958 cm⁻¹ (CH₃ stretching) and 2923 cm⁻¹ (CH₂ stretching) is increased by 20% in resistant cells, suggesting a qualitative modification of the lipids in the cell membranes.
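The band ratio quoted above is simple arithmetic, sketched here on made-up absorbance values chosen so that the ratio rises by 20%; they are not the measured data. The helper names are hypothetical, and reading the absorbance at the sampled point nearest to each wavenumber is an assumption about how the ratio was obtained.

```python
def absorbance_at(wavenumbers, absorbance, target):
    """Absorbance at the sampled point closest to `target` (in cm^-1)."""
    idx = min(range(len(wavenumbers)),
              key=lambda i: abs(wavenumbers[i] - target))
    return absorbance[idx]

def band_ratio(wavenumbers, absorbance, numerator=2958.0, denominator=2923.0):
    """Intensity ratio between the CH3 (2958) and CH2 (2923) stretching bands."""
    return (absorbance_at(wavenumbers, absorbance, numerator)
            / absorbance_at(wavenumbers, absorbance, denominator))

# Made-up absorbances at the two bands of interest.
wavenumbers = [2958.0, 2923.0]
sensitive = [0.30, 0.60]
resistant = [0.36, 0.60]

ratio_sensitive = band_ratio(wavenumbers, sensitive)
ratio_resistant = band_ratio(wavenumbers, resistant)
change_percent = (ratio_resistant / ratio_sensitive - 1.0) * 100.0
print(f"{change_percent:.1f}%")
```

Using a ratio of two bands from the same spectrum makes the comparison independent of the amount of sample in the beam.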

Unsupervised classification. 2D plot of the spectra of sensitive (22 spectra, blue points) and resistant (26 spectra, red stars) K562 cells, reduced by PCA.
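Reducing spectra to two PCA coordinates can be sketched with the standard library alone. This example makes several assumptions of its own: the covariance matrix is diagonalized by power iteration with deflation (production code would normally call a linear algebra library), the four three-point spectra are made-up values, and the function names are hypothetical.

```python
import math

def power_iteration(matrix, n_iter=500):
    """Dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    m = len(matrix)
    v = [float(i + 1) for i in range(m)]  # arbitrary start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        w = [sum(matrix[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left to explain
            break
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(m)) for i in range(m)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))  # Rayleigh quotient
    return eigenvalue, v

def pca(spectra, n_components=2):
    """Return (variances, components, scores) for mean-centred spectra."""
    n = len(spectra)
    means = [sum(col) / n for col in zip(*spectra)]
    centred = [[x - mu for x, mu in zip(row, means)] for row in spectra]
    m = len(means)
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(m)]
           for i in range(m)]
    variances, components = [], []
    for _ in range(n_components):
        lam, v = power_iteration(cov)
        variances.append(lam)
        components.append(v)
        # deflation: remove the component just found from the covariance
        for i in range(m):
            for j in range(m):
                cov[i][j] -= lam * v[i] * v[j]
    scores = [[sum(x * c for x, c in zip(row, comp)) for comp in components]
              for row in centred]
    return variances, components, scores

# Made-up data: two spectra per group, three wavenumbers per spectrum.
spectra = [[1.0, 2.0, 0.0], [1.2, 2.1, 0.1],
           [3.0, 0.0, 0.1], [3.1, 0.2, 0.0]]
variances, components, scores = pca(spectra)
for pc1, pc2 in scores:
    print(f"{pc1:+.3f} {pc2:+.3f}")
```

Each row of `scores` gives the two coordinates of one spectrum in the 2D plot. In this toy data set the two groups fall on opposite sides of the first axis; the overall sign of a component is arbitrary.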

Identification of phenotypes: the case of gliomas. Belot et al., Glia 2001, 36.

Identification of phenotypes: the case of gliomas. In vitro parameters: motility (maximum relative distance from the origin); motility (average speed); growth (anchorage-dependent growth); growth (anchorage-independent growth, in semi-solid agar); invasion (percentage of cells invading a collagen matrix). In vivo parameter: aggressiveness (median survival time). Wavenumbers: 2852 cm⁻¹, … cm⁻¹, … cm⁻¹.

Identification of phenotypes: the case of gliomas. Sample preparation; FTIR of cells.

Identification of phenotypes: the case of gliomas. 16 cell lines used.

Identification of phenotypes: the case of gliomas. Multiple regression explaining the average speed: R = 0.96 (P = 0.003). Multiple regression explaining the median survival periods of the nude mice grafted with glioma cells: R = 0.97 (P = …). Panel titles: average speed; median survival time.

Mode of action of anticancer drugs

Studies on cells: PC-3 prostate cancer cells in culture; washed in 0.9% NaCl; deposited on a BaF₂ window.

Daunorubicin, doxorubicin, irinotecan, mercaptopurine, methotrexate, paclitaxel, vinblastine, vincristine, untreated

FTIR of drug signature on cancer cells. Hierarchical classification of "difference spectra". Dendrogram labels: daunorubicin, doxorubicin, mercaptopurine, methotrexate, paclitaxel, vinblastine, vincristine. Axis: distance.

FTIR of drug signature on cancer cells. Model: predicted class versus training class. Classes (both axes): control, topoisomerase inhibitors, antimetabolites, antimicrotubules.