Marcin Pacholczyk, Silesian University of Technology.



Physics-based
- Laws of physics: electrostatics, van der Waals, molecular flexibility, geometry of hydrogen bonds
- Computationally intensive; some effects are difficult to model (desolvation)

Knowledge-based
- Relatively simple, based on observation
- Training set!

Poisson-Boltzmann equation + Lennard-Jones potential
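
As a small illustration of the van der Waals term named on this slide, the sketch below evaluates the standard 12-6 Lennard-Jones potential, V(r) = 4ε[(σ/r)^12 - (σ/r)^6]. The function name and the default parameters (ε = 1, σ = 1, arbitrary units) are assumptions made for the example; they are not taken from the presentation, and the Poisson-Boltzmann part is not covered.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones energy of two atoms separated by distance r.

    epsilon is the well depth and sigma the distance at which the energy
    is zero; both defaults are illustrative, not fitted parameters.
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    sr6 = (sigma / r) ** 6
    # Repulsive r^-12 term minus attractive r^-6 term.
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# The energy is zero at r = sigma and reaches its minimum, -epsilon,
# at r = 2^(1/6) * sigma.
print(lennard_jones(1.0))
print(lennard_jones(2 ** (1 / 6)))
```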

Gibbs energy ⇔ probability of "correctness" (Robertson and Varani 2007)

Probability of an individual atomic contact. P(C), the Bayesian prior of observing a native-like protein-DNA complex, is set to 1 (Robertson and Varani 2007).

Probability function. The continuous distance $d_{ij}$ is mapped to a set of discrete distance bins $\langle b_0, b_1, \ldots, b_n \rangle$ with distance cutoffs $\langle d_{b_0}, d_{b_1}, \ldots, d_{b_n} \rangle$. A count is assigned to $b_i$ if $d_{b_{i-1}} \le d_{ij} < d_{b_i}$. Distance cutoffs: 3 Å, 4 Å, 5 Å, 6 Å, 7 Å, 8 Å, 9 Å, 10 Å (Robertson and Varani 2007).
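
The binning rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the names `CUTOFFS`, `bin_index` and `count_contacts` are hypothetical, the lower edge of the first bin is assumed to be 0 Å, and distances at or beyond the last cutoff are assumed to be ignored.

```python
from bisect import bisect_right

# Distance cutoffs in Å, as listed on the slide.
CUTOFFS = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]


def bin_index(d_ij, cutoffs=CUTOFFS):
    """Return i such that cutoffs[i-1] <= d_ij < cutoffs[i].

    Bin 0 collects every distance below the first cutoff (assumed lower
    edge of 0 Å). Distances at or beyond the last cutoff return None.
    """
    i = bisect_right(cutoffs, d_ij)
    return i if i < len(cutoffs) else None


def count_contacts(distances, cutoffs=CUTOFFS):
    """Histogram of atom-pair distances over the discrete bins."""
    counts = [0] * len(cutoffs)
    for d in distances:
        i = bin_index(d, cutoffs)
        if i is not None:
            counts[i] += 1
    return counts


print(count_contacts([2.5, 3.0, 3.9, 9.99, 10.0, 12.0]))
```

Using `bisect_right` makes each lower cutoff inclusive and each upper cutoff exclusive, which matches the half-open interval in the rule.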

Marginal distribution. $N_C$ is the total number of observed contacts between interface atoms of all types, at all distances, in the training set (Robertson and Varani 2007). Training set: Nucleic Acid Database, ndbserver.rutgers.edu

Three members of the NF-κB family of transcription factors, in complexes with DNA fragments (PDB) (Almanova et al. 2010):
- p50p50 homodimer (1NFK)
- p50RelB heterodimer (2V2T)
- p50p65 heterodimer (1VKX)

DNA chains were mutated with MMTSB (Multiscale Modeling Tools for Structural Biology), one base pair at each step (backbone fixed).

Figure panels: p50p50, p50p65, p50RelB

All weights $w(i, u)$ in the PWM are predicted by solving a linear equation in $A$, $X$ and $b$:
- $X$ is a vector of $4N$ dimensions holding the estimated weights.
- $A$ is a binary matrix of dimensions $(4N, 4N + R)$ built from all random DNA sequences whose binding free energy was computed.
- $b$ is the binding free energy vector; it consists of $4N + R$ values obtained with the protein-DNA scoring procedure.
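
To make the structure of this linear system concrete, the sketch below shows one plausible way to turn a DNA sequence of length N into a binary vector of length 4N, so that the energy under an additive PWM is a dot product with the weight vector. The one-hot layout, the base order `ACGT` and the names `encode` and `additive_energy` are assumptions made for illustration; the slides do not specify the encoding or the solver. Stacking one such vector per scored sequence would give the binary matrix, which could then be passed to any least-squares routine.

```python
BASES = "ACGT"  # assumed column order within each position


def encode(seq):
    """One-hot encode a DNA sequence of length N as a binary list of length 4N.

    Entry 4*i + k is 1 when position i holds base BASES[k], else 0.
    """
    row = [0] * (4 * len(seq))
    for i, base in enumerate(seq):
        row[4 * i + BASES.index(base)] = 1
    return row


def additive_energy(seq, weights):
    """Energy under an additive model: the sum of w(i, u) over positions i."""
    return sum(a * w for a, w in zip(encode(seq), weights))


# Toy weights for N = 2 (illustrative numbers only).
weights = [0.0, 1.0, 2.0, 3.0,    # position 0: A, C, G, T
           0.5, 0.0, -1.0, 0.25]  # position 1: A, C, G, T
print(encode("GA"))
print(additive_energy("GA", weights))
```

Because the four entries of each position always sum to 1, the columns of the stacked matrix are linearly dependent, so the weights are determined only up to a per-position offset unless a constraint or a minimum-norm solution is chosen.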

Almanova et al. 2010

Figure panels: p50p50, p50RelB, p50p65; TRANSFAC V$NFKAPPAB_01

Relative entropy (Almanova et al. 2010). Table columns: Almanova, DDNA2, TRANSFAC; rows: p50p50, p50RelB (2.84, -), p50p65.
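
The slide does not state how the relative entropy was defined. As a hedged illustration, the sketch below uses one common convention: the Kullback-Leibler divergence in bits between two position frequency matrices, summed over aligned positions. The function name and the matrix layout (a list of per-position probability lists) are assumptions for the example.

```python
from math import log2


def relative_entropy(p_matrix, q_matrix):
    """Sum over positions of KL(p || q) in bits.

    Each matrix is a list of columns, and each column is a list of four
    base probabilities. Terms with p == 0 contribute nothing; q must be
    positive wherever p is positive (for example after adding pseudocounts).
    """
    total = 0.0
    for p_col, q_col in zip(p_matrix, q_matrix):
        for p, q in zip(p_col, q_col):
            if p > 0:
                total += p * log2(p / q)
    return total


uniform = [[0.25, 0.25, 0.25, 0.25]]
certain = [[1.0, 0.0, 0.0, 0.0]]
print(relative_entropy(uniform, uniform))
print(relative_entropy(certain, uniform))
```

A value of zero means the two matrices are identical at every position; a fully conserved column compared against a uniform one contributes 2 bits.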

Binding site discovery (Almanova et al. 2010)
- 69 human genes regulated by NF-κB, with 124 promoter sequences (TRANSPRO)
- Experimentally confirmed: 31 out of 124 promoters, belonging to 25 genes
- Matrix scan with Match on 58 confirmed binding sites

          Almanova   TRANSFAC
p50p50    30 (5)     25         V$P50P50_Q3
p50p65    25 (5)     26 (6)     V$P50RELAP65_Q5_01
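
The general idea of a matrix scan can be sketched as a sliding window that sums the PWM weights of the bases under it and reports windows at or above a threshold. This is a generic illustration only: it does not reproduce the scoring used by Match, and the function name, the PWM layout (one dict per position) and the toy weights are assumptions.

```python
def scan(sequence, pwm, threshold):
    """Slide a PWM along a sequence and return (start, score) for each hit.

    pwm is a list with one dict per motif position, mapping each base in
    "ACGT" to its weight. A window is reported when the summed weight is
    at least `threshold`. Only one strand is scanned.
    """
    n = len(pwm)
    hits = []
    for start in range(len(sequence) - n + 1):
        window = sequence[start:start + n]
        score = sum(col[base] for col, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy two-position matrix that rewards "GG" (illustrative weights).
toy_pwm = [{"A": 0, "C": 0, "G": 1, "T": 0},
           {"A": 0, "C": 0, "G": 1, "T": 0}]
print(scan("ACGGT", toy_pwm, threshold=2))
```

A practical scan would also cover the reverse complement and handle ambiguous bases such as N, both of which are left out here for brevity.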

AUC (Almanova et al. 2010). Table columns: Almanova, DDNA2, TRANSFAC; rows: p50p50, p50p65.
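
For readers unfamiliar with the metric, the area under the ROC curve (AUC) can be computed directly as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. The sketch below is a generic pairwise implementation under that definition; how positives and negatives were chosen in the study is not stated on the slide, and the function name is hypothetical.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve from scores of positives and negatives.

    Counts the fraction of (positive, negative) pairs in which the
    positive scores higher; ties contribute 0.5.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


print(auc([3, 4], [1, 2]))  # every positive outranks every negative
print(auc([1, 3], [2, 4]))  # only one of four pairs is ordered correctly
```

The pairwise loop is quadratic in the number of examples, which is acceptable for a few hundred sites; a rank-based formulation would scale better.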

- Discovery of novel NF-κB binding sites
- Investigation of post-translational modifications such as RelA Ser 276 phosphorylation (Nowak et al. 2008)
- It is possible to compute PWMs that perform comparably to those derived from experimental data (TRANSFAC)
- Thermodynamics-based models of transcriptional regulation, including synergistic activation, cooperative binding and short-range repression (He et al. 2010)