Geometric Approaches to Reconstructing Time Series Data Project Update 29 March 2007 CSC/Math 870 Computational Discrete Geometry Connie Phong.


Objectives and Motivations
– To reconstruct a time ordering from data without explicit time indices
– Unordered or poorly ordered sets of observations are common in biological experiments, such as DNA microarray experiments

Implementing an MST-Based Algorithm
1. Input: weighted graph constructed from samples
2. Calculate the MST
3. Find the diameter path of the MST
4. Compute diameter path statistics
5. Low noise and high sampling intensity?
   – Yes: output the diameter path as the estimated ordering
   – No: create a PQ-tree from the diameter path and the MST, then output the PQ-tree showing uncertainties in the ordering
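The MST and diameter-path steps can be illustrated with a minimal sketch. It assumes the samples are points in Euclidean space and that edge weights are pairwise Euclidean distances; the slides do not state the distance measure, and the function names here are hypothetical. The diameter path statistics and the PQ-tree branch are not covered.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete graph over the points.

    Returns an adjacency map {node: [(neighbor, weight), ...]}.
    """
    n = len(points)
    adj = defaultdict(list)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known edge into the tree
    parent = [-1] * n
    if n:
        best[0] = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            adj[u].append((parent[u], best[u]))
            adj[parent[u]].append((u, best[u]))
        for v in range(n):
            if not in_tree[v]:
                d = euclidean(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return adj


def farthest(adj, start):
    """Walk the tree from start; return (farthest node, its distance, parent map)."""
    dist = {start: 0.0}
    parent = {start: None}
    stack = [start]
    while stack:
        u = stack.pop()
        for v, w in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + w
                parent[v] = u
                stack.append(v)
    end = max(dist, key=dist.get)
    return end, dist[end], parent


def diameter_path(adj, start=0):
    """Longest weighted path in the tree, found with two farthest-node sweeps."""
    a, _, _ = farthest(adj, start)
    b, length, parent = farthest(adj, a)
    path = []
    node = b
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1], length


# Shuffled samples from a slightly noisy one-dimensional trajectory.
points = [(3, 0.1), (0, 0), (1, 0.1), (4, 0), (2, -0.1)]
path, length = diameter_path(minimum_spanning_tree(points))
print(path, round(length, 3))
```

The diameter path fixes an ordering only up to reversal: nothing in the tree says which end of the path is the earliest sample.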

Artificial Dataset: Jelly roll

Yeast Microarray Dataset
– Rows: genes
– Columns: time points
– Magnitude of the ratio of induction to repression is indicated by color intensity: red indicates an increase in mRNA abundance and green indicates a decrease in mRNA abundance
– Spellman et al.’s original dataset contains 6177 open reading frames
– Synchronized by treatment with alpha factor
– 18 time points, 7 min intervals
– Reduced to 5541 genes
– Ran the algorithm on the 500 genes exhibiting the most sample variation
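Selecting the genes with the most sample variation can be sketched as follows. This assumes "variation" means the variance of each gene's expression values across the time points and that the matrix has no missing values; the slides do not specify either, and `top_variance_genes` is a hypothetical helper name.

```python
from statistics import pvariance


def top_variance_genes(matrix, k):
    """Return the row indices of the k genes with the largest variance across samples.

    matrix: list of rows, one row per gene, one column per time point.
    """
    ranked = sorted(range(len(matrix)),
                    key=lambda i: pvariance(matrix[i]),
                    reverse=True)
    return sorted(ranked[:k])


# Toy expression matrix: 4 genes by 3 time points.
expression = [
    [1, 1, 1],
    [0, 5, 10],
    [2, 3, 2],
    [-4, 0, 4],
]
print(top_variance_genes(expression, 2))
```

For the dataset described above, `k` would be 500 and the matrix would have 5541 rows and 18 columns.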

Yeast Microarray Dataset
– Figure 3a: sample points in the space of the three largest principal coordinates
– Figure 3f: known ordering and path
– Figure 3b: MST for the data with the diameter path shown in bold
– noise =
– intensity =

Yeast Microarray Dataset
Create PQ-tree:
[ {(1, 2, 3, 4, 5, 6, 7), 8, 9}, {17, 18, 10}, {16, 15, (14, 13, 12, 11)} ]
Costs of known ordering:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 17, 18, 10, 11, 12, 13, 14, 16, 15] =
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 18, 17, 16, 15, 14, 13, 12, 11] =
– No relationship between the cost of a particular ordering and the accuracy of the ordering
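The slides do not define the cost of an ordering. One plausible reading, shown here purely as an assumption, is the total length of the path that visits the samples in the given order. The function name and the toy coordinates are hypothetical.

```python
import math


def ordering_cost(points, ordering):
    """Sum of Euclidean distances between consecutive samples in the ordering.

    points: mapping from sample label to coordinate tuple.
    This cost definition is an assumption, not taken from the slides.
    """
    return sum(math.dist(points[a], points[b])
               for a, b in zip(ordering, ordering[1:]))


# Toy example with three labeled samples.
samples = {1: (0, 0), 2: (3, 4), 3: (3, 0)}
print(ordering_cost(samples, [1, 2, 3]))
print(ordering_cost(samples, [1, 3, 2]))
```

Under this definition, two orderings consistent with the same PQ-tree can be compared by a single number, although, as noted above, a lower cost did not indicate a more accurate ordering.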

Current Work
– Researching principal curves
– Researching the Kalman filter
– Compiling other datasets
– A little research on the implications of certain preprocessing steps
Overall objective: develop an algorithm for reconstructing time orderings that is more theoretically rigorous and addresses error and noise more directly