Supplementary material. Figure S1. Cumulative histogram of the fitness of the pairwise alignments of randomly generated ESSs.


Supplementary material

Figure S1. Cumulative histogram of the fitness of the pairwise alignments of randomly generated ESSs. To assess the statistical significance of the ESS pairwise alignments, 10 random sets were generated with the same composition and length as the original ESSs. The figure shows the mean ± standard deviation of the random alignments. The real data show an overrepresentation of scores below 0.4, which correspond to the fitness values of alignments of similar ESSs.

Figure S2. Histogram of the fitness (scores) of the pairwise alignments of ESSs, comparing the scores obtained by the GA with those obtained by a Dynamic Programming (DP) algorithm. To assess the consistency of the GA, we implemented a DP algorithm that creates and evaluates the ESS alignments using the same objective function as the GA. This algorithm was used to repeat the all-against-all pairwise alignments previously generated with the GA. The results show that both algorithms generate similar distributions.

Figure S1
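The randomization behind Figure S1 can be illustrated with a minimal sketch. It assumes that "same composition and length" means shuffling the symbols within each sequence, and that fitness values lie in [0, 1]; the caption confirms neither. The names `shuffled_copy`, `cumulative_hist`, `null_band`, and `toy_score` are hypothetical, and `toy_score` is only a stand-in for the actual objective function, which is not given here.

```python
import random
import numpy as np

def shuffled_copy(seqs, rng):
    """Return shuffled copies; each keeps its own symbols and length (assumption)."""
    out = []
    for s in seqs:
        items = list(s)
        rng.shuffle(items)
        out.append(items)
    return out

def toy_score(a, b):
    """Placeholder fitness in [0, 1]: fraction of mismatched positions (hypothetical)."""
    n = max(len(a), len(b))
    if n == 0:
        return 0.0
    mismatches = sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))
    return mismatches / n

def cumulative_hist(scores, bins):
    """Cumulative fraction of scores falling at or below each bin."""
    counts, _ = np.histogram(scores, bins=bins)
    return np.cumsum(counts) / max(len(scores), 1)

def null_band(seqs, score_fn, n_sets=10, n_bins=10, seed=0):
    """Mean and standard deviation of the cumulative histogram over random sets."""
    rng = random.Random(seed)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    curves = []
    for _ in range(n_sets):
        rand = shuffled_copy(seqs, rng)
        # All-against-all pairwise scores within one random set.
        scores = [score_fn(rand[i], rand[j])
                  for i in range(len(rand))
                  for j in range(i + 1, len(rand))]
        curves.append(cumulative_hist(scores, bins))
    curves = np.array(curves)
    return bins, curves.mean(axis=0), curves.std(axis=0)
```

Plotting the real cumulative histogram against the returned mean ± standard deviation band would show where the real data depart from the random expectation, as the caption describes for scores below 0.4.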

Figure S2
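The DP consistency check described for Figure S2 can be sketched as a global alignment that minimizes a cost and normalizes it by the longer sequence length. The unit mismatch and gap costs, the normalization, and the names `dp_fitness` and `all_pairs` are assumptions made for illustration; the objective function actually shared by the GA and the DP algorithm is not specified in this material.

```python
def dp_fitness(a, b, cost=lambda x, y: 0.0 if x == y else 1.0, gap=1.0):
    """Minimum global alignment cost of a and b, divided by the longer length.

    Hypothetical objective: lower is better, 0.0 means identical sequences.
    """
    n, m = len(a), len(b)
    if max(n, m) == 0:
        return 0.0
    # prev[j] holds the best cost of aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            cur[j] = min(
                prev[j - 1] + cost(a[i - 1], b[j - 1]),  # align a[i-1] with b[j-1]
                prev[j] + gap,                            # gap in b
                cur[j - 1] + gap,                         # gap in a
            )
        prev = cur
    return prev[m] / max(n, m)

def all_pairs(seqs, score_fn=dp_fitness):
    """All-against-all pairwise scores, one per unordered pair."""
    return [score_fn(seqs[i], seqs[j])
            for i in range(len(seqs))
            for j in range(i + 1, len(seqs))]
```

Because DP finds the optimum of whatever objective it is given, comparing its score distribution with the GA's is a reasonable way to check that the GA is not systematically missing good alignments.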

Table SI. Number of ESSs and mean length generated from KEGG E. coli K-12 metabolic maps. 452 ESSs were obtained from 47 maps. Columns are as follows: KEGG identification number, metabolic map, number of sequences generated per map, mean sequence length, and standard deviation.

Table SI (continued).