Department of Biochemistry


Protein-DNA Interactions: From interactions to function prediction
Sue Jones, Department of Biochemistry, University of Sussex
20th Sept 2004, EMBL Lecture Course

Outline
Protein-DNA Interactions: importance
Structural Data
Predicting DNA Binding Function
Alternative Method & New Perspectives

Protein-DNA Interactions: Importance
Gene expression
  Transcription initiation (TATA binding protein)
  RNA synthesis (RNA polymerase)
  Transcription regulation (MAX protein)
DNA repair (DNA glycosylase: oxidative DNA damage)

Protein-DNA Interactions: Importance
DNA packaging (Histone H2A.e)
DNA replication (polymerases, ligases, single-stranded binding proteins)

Outline
Protein-DNA Interactions: importance
Structural Data
Predicting DNA Binding Function
Alternative Method & New Perspectives

DNA
DNA has structural flexibility
Structure described by Watson & Crick: B-form
[Figure: B-, A- and Z-form DNA]
Feature          B             A
Type of helix    RH            RH
Diameter         2.37          2.55
Rise per bp      0.34          0.29
# bp per turn    10            11
Major groove     Wide, deep    Narrow
Minor groove     shallow       Wide, shallow
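The rise per base pair and the number of base pairs per turn together give the helical pitch (the axial length of one full turn). A minimal sketch of that arithmetic, assuming the rise values are in nanometres (the slide does not state units); the function name `helical_pitch` is illustrative only.

```python
def helical_pitch(rise_per_bp, bp_per_turn):
    """Axial length of one full helical turn: rise per bp times bp per turn."""
    return rise_per_bp * bp_per_turn


# (rise per bp, bp per turn) taken from the slide's table; units assumed to be nm.
forms = {"B": (0.34, 10), "A": (0.29, 11)}

# Pitch for each DNA form, keyed by form name.
pitches = {name: helical_pitch(rise, bp) for name, (rise, bp) in forms.items()}
```

With the slide's values, B-form comes out at about 3.4 per turn and A-form at about 3.19, in the assumed units.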

Structural Data
NDB: assembles and distributes structural information about nucleic acids
2490 structures (25/08/04)
Protein-DNA Complex    Number
Double Helix           593
Single Strand          57
http://ndbserver.rutgers.edu
Berman et al., 1992. Biophys J 63 p751

Protein-DNA Interactions : Structure

Protein-DNA Interactions: Characteristics
Major and minor groove binding
DNA-binding motifs
Positively charged surface areas
Size (ASA): 618 Å² - 2833 Å²
Conformational changes
  DNA bending
  Domain movements, quaternary changes
Nadassy et al., 1999 Biochemistry 38 p1999
Jones et al., 1999 J.Mol.Biol. 287 p877

Outline
Protein-DNA Interactions: importance
Structural Data
Predicting DNA Binding Function
New Perspectives

Predicting DNA Binding Function
Knowing a protein’s function is essential in understanding
  cellular location
  interactions
  biochemical pathways
  potential as drug targets
Prediction of protein DNA binding site given unbound protein structure
  electrostatic patches
  motifs

Predicting Function from Structure
Structural genomics: filling in the gaps of protein structure space
Structures solved that have low sequence identity (< 30%)
Potentially little or no fold similarity to any currently in the PDB
Require algorithms to make fast & reliable function predictions

Predicting DNA Binding Function
Matches between globally homologous structures are easy to make
Method aims to identify remote matches based on local homology of a specific motif
Helix-Turn-Helix (HTH)
  C-terminal helix: major groove binding
  Found in about 1/3 of DNA-binding protein families (16/54)

HTH Motif Proteins
Hin Recombinase (1hcr)
Catabolite Activator Protein (1j59)

HTH Motif Dataflow
[Flowchart. Data sources: NDB, PDB, literature, PFAM, SMART. Tools: SAM-T99, Rasmol. Stage labels: PDB chains; 26 Hidden Markov Models; 349 HTH chains; 227 HTH proteins; 28 HMMs; 86 NI proteins; 3D templates; 7 HREPS; 29 SREPS; 84 NI proteins; 232 HTH chains; 30 SREPS]

HTH Template Library
[Figure: structures 1ais, 1hcr, 1b9m, 1eto, 1jhg, 1lmb, 1orc]
1hcrA160-181
1b9mA32-56
1etoA73-95
1aisB1267-1293
1jhgA68-91
1lmb331-53
1orc016-36

Template Scanning
Template library scanned against 3D structures
One template T (length n) is scanned against a protein P (length m): the optimal gapless superposition is calculated at each of the m-n+1 possible positions in P and scored by RMSD
Superposition based on Kabsch (1976) Acta Cryst A. 32 p922
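The scan described on this slide can be sketched as a sliding window with a Kabsch-style optimal superposition at each position. This is a minimal illustration under stated assumptions, not the authors' implementation: each structure is reduced to one 3D coordinate per residue (for example Cα atoms), NumPy is used for the linear algebra, and the names `kabsch_rmsd` and `scan_template` are hypothetical. The 1.6 Å default is the RMSD threshold quoted elsewhere in the slides.

```python
import numpy as np


def kabsch_rmsd(a, b):
    """RMSD between two equal-length coordinate sets after optimal
    rigid-body superposition (rotation + translation, no reflection)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Remove translation by centring both sets on their centroids.
    a0 = a - a.mean(axis=0)
    b0 = b - b.mean(axis=0)
    # Singular values of the covariance matrix give the best achievable overlap.
    u, s, vt = np.linalg.svd(a0.T @ b0)
    # If the best orthogonal transform is a reflection, flip the smallest
    # singular value so that only proper rotations are allowed.
    if np.linalg.det(vt.T @ u.T) < 0:
        s[-1] = -s[-1]
    msd = ((a0 ** 2).sum() + (b0 ** 2).sum() - 2.0 * s.sum()) / len(a)
    return float(np.sqrt(max(msd, 0.0)))  # clamp tiny negative round-off


def scan_template(template, protein, threshold=1.6):
    """Slide a template of length n along a protein of length m (gapless),
    returning (start_index, rmsd) for each of the m-n+1 windows whose
    RMSD falls below the threshold."""
    template = np.asarray(template, dtype=float)
    protein = np.asarray(protein, dtype=float)
    n, m = len(template), len(protein)
    hits = []
    for start in range(m - n + 1):
        rmsd = kabsch_rmsd(template, protein[start:start + n])
        if rmsd < threshold:
            hits.append((start, rmsd))
    return hits
```

Scanning a whole library would repeat `scan_template` for every template and keep the lowest-RMSD window per protein; how the original method combined hits across templates is not stated on the slide.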

RMSD Distributions
[Figure: RMSD frequency distributions, 1.6 Å marked]
368/8266 = 3.5% false positives
5/84 = 1.4% false negatives

Improving Template Specificity
Extending templates
Assessing motif accessible surface area (ASA)
+2 templates: 61/8264 = 0.7% false positives
ASA threshold (990 Å²): 38/8264 = 0.5% false positives
3 ‘false’ positives were actually real HTH proteins not previously annotated
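The two filters on this slide can be sketched as a simple predicate plus a rate calculation. A minimal sketch with assumptions stated: the slide gives the ASA threshold value but not its direction, so treating 990 Å² as a minimum (the motif must be exposed enough to contact DNA) is an assumption, and the names `is_hth_hit` and `percent` are hypothetical.

```python
RMSD_MAX = 1.6   # Å, RMSD threshold quoted on the slides
ASA_MIN = 990.0  # Å^2; used here as a lower bound, which is an assumption


def is_hth_hit(rmsd, motif_asa, rmsd_max=RMSD_MAX, asa_min=ASA_MIN):
    """A window counts as a predicted HTH motif only if it superposes
    closely on a template AND the matched motif is sufficiently exposed."""
    return rmsd < rmsd_max and motif_asa >= asa_min


def percent(count, total):
    """Rate as a percentage, rounded to one decimal place."""
    return round(100.0 * count / total, 1)
```

With the slide's counts, `percent(61, 8264)` gives 0.7 and `percent(38, 8264)` gives 0.5, matching the quoted false positive rates.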

‘New’ HTH Motif 1
1mgtA: DNA Methyltransferase (MGMT) 110-129
C-terminal domain
‘d’ and ‘e’ helices
Site directed mutagenesis

‘New’ HTH Motif 2
1fy7A: Histone acetyltransferase 368-388
C-terminal domain zinc finger N-terminal domain protein-protein interactions
SCOP: ‘winged helix’

‘New’ HTH Motif 3
1taq, 1tau: Polymerase I 673-700
‘fingers’ subdomain
DNA contacts ‘O’ helix
New HTH precedes ‘O’ helix

Generic Templates

Generic Templates
[Figure labels: Sequence; Structure RMSD < 1.6; Full sequence HMMs (0.001); Structure RMSD < 1.6]

Structural Genomics Targets
Scanned template library against 30 target structures from MCSG
MCSG Target    Template      Location    ASA     RMSD
APS048         1LMB331-53    21-49       1695    1.3
Isocitrate lyase regulator transcription factor (Zhang et al., J. Biol. Chem. 2002)

Summary
Method combined structural data from NDB and PDB with sequence data from PFAM and SMART
Structural template library of 7 HTH motifs
RMSD threshold from optimal superposition
Hit rate of 88% & false positive rate of 0.5%
Recognition across families
Template method independent of global fold similarity
Potential to identify new DNA binding HTH motifs

Online Function Prediction http://www.ebi.ac.uk/thornton-srv/databases/PDNA-pred

Outline
Protein-DNA Interactions: importance
Structural Data
Predicting DNA Binding Function
Alternative Method & New Perspectives

Alternative Statistical Model
Statistical Models for discerning protein structures containing the DNA-binding HTH motif. McLaughlin and Berman, J. Mol. Biol. 2003 p43.
Decision tree model to identify key structural features
  Geometric measurements of the recognition helix (RH) and of the helices & beta sheets preceding and following it
Key features
  High solvent accessibility of RH
  Hydrophobic interaction between RH & 2nd helix preceding
Predicting HTH motifs within the PDB
  98% accuracy & 0.7% false positive rate
  Predicted new HTH motifs

Future Perspectives
Extend method to other DNA binding motifs: HLH, HhH, β-ribbon
Using electrostatic potentials with motifs to improve method
Spatial templates for proteins that don’t use discrete motifs for DNA recognition

Acknowledgements
Mario Garcia, Carles Ferrer, Jonathan Barker, Janet Thornton, Hugh Shanahan, Helen Berman
Department of Energy: USA
European Bioinformatics Institute
Rutgers The State University