Ways to construct Protein Space


Construction of sequence space, from Eigen et al. (1988), illustrating how a high-dimensional sequence space is built up. Each additional sequence position adds another dimension, doubling the diagram for the shorter sequence. Shown is the progression from a single sequence position (line) to a tetramer (hypercube). A four- (or twenty-) letter code can be accommodated either by allowing four (or twenty) values for each dimension (Rechenberg 1973; Casari et al. 1995) or by adding dimensions (Eigen and Winkler-Oswatitsch 1992).

References:

Eigen, M. and R. Winkler-Oswatitsch (1992). Steps Towards Life: A Perspective on Evolution. Oxford; New York, Oxford University Press.

Eigen, M., R. Winkler-Oswatitsch and A. Dress (1988). "Statistical geometry in sequence space: a method of quantitative comparative sequence analysis." Proc Natl Acad Sci U S A 85(16).

Casari, G., C. Sander and A. Valencia (1995). "A method to predict functional residues in proteins." Nat Struct Biol 2(2).

Rechenberg, I. (1973). Evolutionsstrategie; Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Stuttgart-Bad Cannstatt, Frommann-Holzboog.
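The doubling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`build_sequence_space`, `hamming`, `edges`) and the binary alphabet "01" are hypothetical choices, not taken from the slides or from Eigen et al. Neighbours are taken to be sequences that differ at exactly one position.

```python
def build_sequence_space(length, alphabet="01"):
    """Return all sequences of the given length, built one position at a time."""
    sequences = [""]
    for _ in range(length):
        # Adding a position copies the existing diagram once per symbol,
        # so a two-letter alphabet doubles it at every step.
        sequences = [s + a for a in alphabet for s in sequences]
    return sequences


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def edges(sequences):
    """Pairs of sequences that differ at exactly one position."""
    return [(a, b) for i, a in enumerate(sequences)
            for b in sequences[i + 1:] if hamming(a, b) == 1]


# Line (1 position), square (2), cube (3), hypercube (4).
for n in range(1, 5):
    seqs = build_sequence_space(n)
    print(n, len(seqs), len(edges(seqs)))
```

For length 4 this gives the 16 corners and 32 edges of a four-dimensional hypercube.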

Diversion: From Multidimensional Sequence Space to Fractals

One symbol -> one 1D coordinate; number of dimensions = pattern length

Two symbols -> dimension = length of pattern. Length 1 = 1D:

Two symbols -> dimension = length of pattern. Length 2 = 2D: dimensions correspond to positions. For each dimension there are two possibilities. Note: here is a possible bifurcation: a larger alphabet could be represented as more choices along the axis of each position!

Two symbols -> dimension = length of pattern. Length 3 = 3D:

Two symbols -> dimension = length of pattern. Length 4 = 4D, aka hypercube

Two symbols -> Dimension = length of pattern

Three Symbols (the other fork)

Four symbols: with an alphabet of 4 we already have a hypercube (4D) at a pattern size of 2, provided we stick to a binary pattern in each dimension.

Hypercubes at 2- and 4-character alphabets: 2-character alphabet with pattern size 4; 4-character alphabet with pattern size 2
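The equivalence between the two hypercubes can be checked with a small sketch. The letters A, C, G, T and the particular 2-bit code in `BITS` are assumptions made for illustration; any one-to-one assignment of letters to bit pairs would do.

```python
from itertools import product

# Hypothetical 2-bit code: each letter of the 4-character alphabet
# occupies two binary dimensions.
BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def to_binary(seq):
    """Spell a 4-letter-alphabet pattern as a binary pattern twice as long."""
    return "".join(BITS[c] for c in seq)


dimers = ["".join(p) for p in product("ACGT", repeat=2)]
binary_tetramers = {"".join(p) for p in product("01", repeat=4)}
encoded = {to_binary(d) for d in dimers}

# The 16 dimers land on exactly the 16 corners of the 4D hypercube.
print(len(dimers), encoded == binary_tetramers)  # 16 True
```

One consequence of this encoding is that a single-letter change is not always a single step along a hypercube edge: under the code above, A to T flips two bits.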

Three-symbol alphabet suggests a fractal representation

3-fractal: enlarge and fill in; the outer pattern repeats the inner pattern = self-similar = fractal

3-character alphabet, pattern size 3: fractal

3-character alphabet, pattern size 4: fractal. Conjecture: for n -> infinity, the fractal might fill a 2D triangle. Note: check Mandelbrot
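One way to reproduce a self-similar layout like this is sketched below. The slides do not state the exact construction, so everything here is an assumption: the symbols A, B, C sit at the corners of an equilateral triangle, the first position chooses the large triangle, and each later position chooses a corner at half the previous scale (`ratio=0.5`). The names `CORNERS`, `place`, and `layout` are hypothetical.

```python
import math
from itertools import product

# One triangle corner per symbol (labels and coordinates are assumptions).
CORNERS = {
    "A": (0.0, 0.0),
    "B": (1.0, 0.0),
    "C": (0.5, math.sqrt(3) / 2),
}


def place(pattern, ratio=0.5):
    """Map a pattern to a 2D point; position i contributes at scale ratio**i."""
    x = y = 0.0
    scale = 1.0
    for symbol in pattern:
        cx, cy = CORNERS[symbol]
        x += scale * cx
        y += scale * cy
        scale *= ratio
    return (x, y)


def layout(length):
    """Point for every pattern of the given length over the alphabet ABC."""
    return {"".join(p): place(p) for p in product("ABC", repeat=length)}


for n in range(1, 5):
    points = layout(n)
    print(n, len(points), len(set(points.values())))
```

Self-similarity shows up directly: the points for patterns that start with A are the whole layout for the next shorter length, shrunk by one half. Under this particular halving rule the limit is the Sierpinski gasket, which has zero area, so whether the triangle fills up as n grows depends on the scaling the slides actually use.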

Same for 4-character alphabet: 1 position, 2 positions, 3 positions

4-character alphabet continued (with cheating: I didn’t actually add beads): 4 positions

4-character alphabet continued (with cheating: I didn’t actually add beads): 5 positions

4-character alphabet continued (with cheating: I didn’t actually add beads): 6 positions

4-character alphabet continued (with cheating: I didn’t actually add beads): 7 positions

Animated GIF: 1-12 positions
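For the 4-character alphabet, a comparable sketch places each symbol at a corner of a square and halves the scale at each position. Again this is an assumed construction, not necessarily the one drawn on the slides; the letters and the corner assignment in `CORNERS` are hypothetical. With this rule every pattern of n positions gets its own cell on a 2^n by 2^n grid.

```python
from itertools import product

# One square corner per symbol (assignment is an assumption).
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def cell(pattern):
    """Grid cell (column, row) of a pattern on a 2**n by 2**n grid."""
    col = row = 0
    for symbol in pattern:
        dx, dy = CORNERS[symbol]
        # Each position refines the cell by one binary digit per axis.
        col = 2 * col + dx
        row = 2 * row + dy
    return (col, row)


def grid(length):
    """Set of occupied cells for all patterns of the given length."""
    return {cell(p) for p in product("ACGT", repeat=length)}


for n in range(1, 8):
    side = 2 ** n
    print(n, len(grid(n)), side * side)
```

In this variant the number of occupied cells equals the number of cells, so the square is covered evenly at every pattern length, in contrast to the gaps left by a three-corner layout.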

Protein Space in Jalview

Alignment of V-, F-, and A-ATPase ATP-binding subunits (SU), both catalytic and non-catalytic

UPGMA tree of V-, F-, and A-ATPase ATP-binding SU, with a line dropped to partition (and colour) the 4 SU types (V/A catalytic and non-catalytic, F catalytic and non-catalytic). Note that details of the tree
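The tree-and-partition step can be illustrated with a minimal UPGMA sketch in plain Python. It is not Jalview's implementation. The function names (`upgma`, `cut`), the labels a1, a2, b1, b2, and the toy distances are all invented for illustration and do not come from the ATPase alignment.

```python
def upgma(names, dist):
    """Minimal UPGMA; returns merges as (cluster_a, cluster_b, height)."""
    # Clusters are tuples of leaf names; distances are keyed by cluster pairs.
    clusters = [(n,) for n in names]
    d = {frozenset(((a,), (b,))): dist[a][b]
         for i, a in enumerate(names) for b in names[i + 1:]}
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        a, b = min(((x, y) for i, x in enumerate(clusters)
                    for y in clusters[i + 1:]),
                   key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))] / 2
        merged = a + b
        clusters = [c for c in clusters if c not in (a, b)]
        for c in clusters:
            # Size-weighted average of the distances to the merged clusters.
            d[frozenset((merged, c))] = (
                len(a) * d[frozenset((a, c))] + len(b) * d[frozenset((b, c))]
            ) / len(merged)
        clusters.append(merged)
        merges.append((a, b, height))
    return merges


def cut(names, merges, threshold):
    """Groups obtained by dropping a line across the tree at a given height."""
    groups = {n: {n} for n in names}
    for a, b, height in merges:
        if height <= threshold:
            union = groups[a[0]] | groups[b[0]]
            for n in union:
                groups[n] = union
    return {frozenset(g) for g in groups.values()}


# Invented toy distances: two tight pairs that are far from each other.
names = ["a1", "a2", "b1", "b2"]
toy = {
    "a1": {"a2": 2.0, "b1": 10.0, "b2": 10.0},
    "a2": {"b1": 10.0, "b2": 10.0},
    "b1": {"b2": 2.0},
}
merges = upgma(names, toy)
print(cut(names, merges, threshold=2.0))
```

Moving the threshold up or down changes how many groups the line cuts out, which is the same choice made when positioning the line on the tree to separate the subunit types.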

PCA of V-, F-, and A-ATPase ATP-binding SU, using colours from the UPGMA tree

Same PCA of V-, F-, and A-ATPase ATP-binding SU, using colours from the UPGMA tree, but turned slightly. (Giardia A SU selected in grey.)

Same PCA of V-, F-, and A-ATPase ATP-binding SU, using colours from the UPGMA tree, but replacing the 1st with the 5th axis. (Eukaryotic A SU selected in grey.)

Same PCA of V-, F-, and A-ATPase ATP-binding SU, using colours from the UPGMA tree, but replacing the 1st with the 6th axis. (Eukaryotic B SU selected in grey; forgot rice.)
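To give a feel for how aligned sequences become points along a principal axis, here is a small sketch in plain Python. It is not Jalview's calculation: it swaps in a simple one-hot encoding of alignment columns and finds only the first axis by power iteration. The function names, the two-letter alphabet, and the toy sequences are assumptions made for illustration.

```python
def one_hot(seq, alphabet):
    """One 0/1 feature per (alignment column, letter) combination."""
    return [1.0 if c == a else 0.0 for c in seq for a in alphabet]


def first_component(rows, iterations=200):
    """Score of each row along the first principal axis (power iteration)."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    x = [[r[j] - means[j] for j in range(m)] for r in rows]
    v = [1.0 + j for j in range(m)]  # arbitrary non-zero starting direction
    for _ in range(iterations):
        # Multiply by X^T X, then renormalise.
        scores = [sum(xi[j] * v[j] for j in range(m)) for xi in x]
        v = [sum(x[i][j] * scores[i] for i in range(n)) for j in range(m)]
        norm = sum(c * c for c in v) ** 0.5
        v = [c / norm for c in v]
    return [sum(xi[j] * v[j] for j in range(m)) for xi in x]


# Invented toy alignment: two groups that differ at most columns.
aligned = ["AAAA", "AAAT", "TTTT", "TTTA"]
rows = [one_hot(s, "AT") for s in aligned]
scores = first_component(rows)
print([round(s, 3) for s in scores])
```

The first two sequences end up on one side of the axis and the last two on the other. Later axes, like the 5th and 6th used above, capture progressively smaller sources of variation and would need a full eigendecomposition rather than this single-axis shortcut.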

Problems: Jalview’s approach requires an alignment, so only homologous sequences can be depicted in the same space. Solution: one could use pattern absence/presence as coordinates.
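The proposed solution can be sketched as follows: each possible short pattern becomes one coordinate, set to 1 if the pattern occurs in the sequence and 0 otherwise, so no alignment is needed. The function name `presence_vector`, the pattern length k=2, the DNA-style alphabet, and the two example sequences are assumptions for illustration; for proteins the alphabet would be the 20 amino acids.

```python
from itertools import product


def presence_vector(seq, k=2, alphabet="ACGT"):
    """0/1 coordinates: one per possible pattern of length k."""
    patterns = ["".join(p) for p in product(alphabet, repeat=k)]
    present = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return [1 if p in present else 0 for p in patterns]


# Two unaligned sequences of different lengths share one coordinate system.
a = presence_vector("ACGTAC")
b = presence_vector("TTGACGT")
differing = sum(x != y for x, y in zip(a, b))
print(len(a), sum(a), sum(b), differing)
```

Each sequence is then a corner of a hypercube with one dimension per pattern, which ties this idea back to the binary sequence space at the start of the slides.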