François Fages MPRI Bio-info 2006 Formal Biology of the Cell Modeling, Computing and Reasoning with Constraints François Fages, Constraint Programming.


François Fages, MPRI Bio-info 2006
Formal Biology of the Cell: Modeling, Computing and Reasoning with Constraints
François Fages, Constraint Programming Group, INRIA Rocquencourt
Idea: apply formal methods of program verification to systems biology, using constraint logic programming and constraint-based model checking.
In this course:
- learn bits of biology through computational models,
- study new formalisms, languages and … implementations.

Systems Biology
Multidisciplinary field that aims to get over the complexity walls and reason about biological processes at the system level.
Conferences: ICSB, CMSB, …; journal: TCSB.
Virtual cell: emulate high-level biological processes in terms of their biochemical basis at the molecular level (in silico experiments).
Bioinformatics: since the late 1990s, genomic sequences → post-genomic data (RNA expression, protein synthesis, protein-protein interactions, …).
Need for a strong effort on:
- the formal representation of biological processes,
- formal tools for modeling and reasoning about their global behavior.

Language Approach to Cell Systems Biology
Qualitative models: from diagrammatic notation to
- Boolean networks [Thomas 73]
- Petri nets [Reddy 93]
- Milner's π-calculus [Regev-Silverman-Shapiro 99-01, Nagasaki et al. 00]
- Bio-ambients [Regev-Panina-Silverman-Cardelli-Shapiro 03]
- Pathway logic [Eker-Knapp-Laderoute-Lincoln-Meseguer-Sonmez 02]
- Transition systems [Chabrier-Chiaverini-Danos-Fages-Schachter 04]
- Biochemical abstract machine BIOCHAM-1 [Chabrier-Fages 03]
Quantitative models: from differential equation systems to
- Hybrid Petri nets [Hofestadt-Thelen 98, Matsuno et al. 00]
- Hybrid automata [Alur et al. 01, Ghosh-Tomlin 01]
- Hybrid concurrent constraint languages [Bockmayr-Courtois 01]
- Rules with continuous dynamics BIOCHAM-2 [Chabrier-Fages-Soliman 04]

The Biochemical Abstract Machine BIOCHAM
Software environment based on two formal languages:
1. Biocham Rule Language for modeling biochemical systems
   1. Syntax of molecules, compartments and reactions
   2. Semantics at 3 abstraction levels: Boolean, Concentrations, Populations
2. Biocham Temporal Logic for formalizing biological properties
   1. CTL for Boolean semantics
   2. Constraint LTL for Concentration semantics
Machine learning of rules and parameters from temporal properties:
1. Learning reaction rules from a CTL specification
2. Learning kinetic parameter values from a Constraint-LTL specification
Internship topics:

Overview of the Lectures
1. Introduction. Formal molecules and reactions in BIOCHAM.
2. Formal biological properties in temporal logic. Symbolic model-checking.
3. Continuous dynamics. Kinetics and transport models.
4. Computational models of the cell cycle control.
5. Abstract interpretation and typing of biochemical networks.
6. Machine learning reaction rules from temporal properties.
7. Constraint-based model checking. Learning kinetic parameter values.
8. Constraint Logic Programming approach to protein structure prediction.

References
- A wonderful textbook: Molecular Cell Biology. 5th Edition, 1100 pages + CD. Lodish, Berk, Zipursky, Matsudaira, Baltimore, Darnell. Freeman Publ. Nov
- Modeling dynamic phenomena in molecular and cellular biology. Segel. Cambridge Univ. Press
- Modeling and querying bio-molecular interaction networks. Chabrier, Chiaverini, Danos, Fages, Schächter. Theoretical Computer Science 04
- The Biochemical Abstract Machine BIOCHAM. Chabrier, Fages, Soliman

Map of Course 1
1. Introduction
2. BIOCHAM syntax
   - Proteins: complexation and phosphorylation
   - DNA: replication and transcription
   - Reaction and transport rules
3. Boolean semantics: concurrent transition system, Kripke structure
   - States and transitions
   - Examples: RTK membrane receptors, MAPK signaling pathways

Syntax: a Simple Algebra of Cell Molecules
Small molecules: covalent bonds kcal/mol
- 70% water
- 1% ions
- 6% amino acids (20), nucleotides (5), fats, sugars, ATP, ADP, …
Macromolecules: hydrogen bonds, ionic, hydrophobic, van der Waals 1-5 kcal/mol
Stability and bindings determined by the number of weak bonds: 3D shape
- 20% proteins ( amino acids)
- RNA ( nucleotides AGCU)
- DNA ( nucleotides AGCT)

Structure Levels of Proteins
1) Primary structure: word of n amino acid residues (20^n possibilities) linked with C-N bonds
   Example: MPRI = Methionine-Proline-Arginine-Isoleucine
2) Secondary structure: word of m α-helices, β-strands, random coils, … (3^m to 10^m possibilities), stabilized by hydrogen bonds H---O
3) Tertiary 3D structure: spatial folding stabilized by hydrophobic interactions
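The "word over 20 letters" view of the primary structure can be made concrete with a minimal Python sketch. The function names are illustrative only, and the lookup table is a hypothetical fragment covering just the four residues of the MPRI example, not the full 20-letter code.

```python
# One-letter codes for the four residues used in the "MPRI" example.
# This is only a fragment of the standard 20-letter amino acid table.
ONE_LETTER = {
    "M": "Methionine",
    "P": "Proline",
    "R": "Arginine",
    "I": "Isoleucine",
}

def spell_out(word):
    """Expand a primary structure written in one-letter codes."""
    return "-".join(ONE_LETTER[letter] for letter in word)

def count_primary_structures(n, alphabet_size=20):
    """Number of distinct words of n residues over the amino acid alphabet."""
    return alphabet_size ** n

print(spell_out("MPRI"))
print(count_primary_structures(4))
```

Even for a word of length 4 the count is 20^4 = 160000, which illustrates why enumeration does not scale to real proteins.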

Formal proteins (description: BIOCHAM syntax)
- Cyclin-dependent kinase 1 (free, inactive): Cdk1
- Complex Cdk1-Cyclin B (low activity): Cdk1-CycB
- Phosphorylated form at site threonine 161 (high activity): Cdk1~{thr161}-CycB

Deoxyribonucleic Acid DNA
1) Primary structure: word over 4 nucleotides Adenine, Guanine, Cytosine, Thymine
2) Secondary structure: double helix of pairs A--T and C---G stabilized by hydrogen bonds
DNA replication: separation of the two strands of the helix and production of one complementary strand for each copy
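As a rough illustration of replication on words over {A, G, C, T}, the sketch below pairs each base with its partner and rebuilds one complementary strand per copy. It is a simplification under stated assumptions: strand orientation (5' to 3') is ignored, and the function names are hypothetical.

```python
# Base pairing used in the double helix: A with T, C with G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Position-by-position complementary strand (orientation is ignored)."""
    return "".join(PAIR[base] for base in strand)

def replicate(strand_a, strand_b):
    """Separate the two strands and complete each one into a new double strand."""
    if complement(strand_a) != strand_b:
        raise ValueError("the two strands are not complementary")
    copy_1 = (strand_a, complement(strand_a))  # old strand a + new partner
    copy_2 = (complement(strand_b), strand_b)  # new partner + old strand b
    return copy_1, copy_2

print(replicate("AGCT", "TCGA"))
```

Each copy keeps one old strand and receives one newly produced strand, so both copies are equal to the original pair.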

DNA: Genome Size
Species                  Genome size   Chromosomes   Coding DNA
E. coli (bacteria)       5 Mb          1 circular    100 %
S. cerevisiae (yeast)    12 Mb         16            70 %
Mouse, Human             3 Gb          20, 23        15 %
…                        15 Gb
…                        140 Gb
Human genome: 3,200,000,000 pairs of nucleotides; single nucleotide polymorphism 1 / 2 kb

Genome Size
Species                  Genome size   Chromosomes   Coding DNA
E. coli (bacteria)       4 Mb          1             100 %
S. cerevisiae (yeast)    12 Mb         16            70 %
Mouse, Human             3 Gb          20, 23        15 %
Onion                    15 Gb         8             1 %
Lungfish                 140 Gb                      0.7 %

Transcription: DNA → pre-mRNA → mRNA → Protein
Genes: parts of DNA
1. Activation: transcription factors bind to the regulatory region of the gene
2. Transcription: RNA polymerase copies the DNA from start to stop positions into a single-stranded premature messenger pRNA
3. (Alternative) splicing: non-coding regions of pRNA are removed, giving mature messenger mRNA
4. Protein synthesis: mRNA moves to the cytoplasm and binds to a ribosome to assemble a protein
_ =[#E2-E2F13-DP12]=> pRNAcycA

BIOCHAM Syntax of Objects
E == compound | E-E | E~{p1,…,pn}
- compound: molecule, #gene binding site
- "-": binding operator for protein complexes, gene binding sites, … Associative and commutative.
- "~{…}": modification operator for phosphorylated sites, … Set of modified sites (associative, commutative, idempotent).
O == E | E::location
- location: symbolic compartment (nucleus, cytoplasm, membrane, …)
S == _ | O+S
- "+": solution operator (associative, commutative, neutral element _)
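The algebraic laws of the binding and modification operators can be mimicked with a canonical representation. The following Python sketch is not BIOCHAM's implementation: it assumes, for simplicity, that modifications attach to individual compounds, and the helper names (`compound`, `bind`, `show`) are hypothetical.

```python
def compound(name, sites=()):
    """A complex made of one compound. The modified sites are stored as a
    sorted, duplicate-free tuple, so ~{...} behaves like a set
    (associative, commutative, idempotent)."""
    return ((name, tuple(sorted(set(sites)))),)

def bind(*complexes):
    """The '-' operator: flatten and sort the components, so the result does
    not depend on grouping (associativity) or order (commutativity)."""
    return tuple(sorted(part for cx in complexes for part in cx))

def show(cx):
    """Print a complex back in a BIOCHAM-like notation."""
    def one(name, sites):
        return name + ("~{" + ",".join(sites) + "}" if sites else "")
    return "-".join(one(name, sites) for name, sites in cx)

a = bind(compound("Cdk1", ["thr161"]), compound("CycB"))
b = bind(compound("CycB"), compound("Cdk1", ["thr161", "thr161"]))
print(show(a), a == b)
```

Because equal objects get equal representations, they can be compared with `==` or used as dictionary keys without any extra normalization step.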

Seven Fundamental Rule Schemas
Complexation: A + B => A-B
Decomplexation: A-B => A + B
   cdk1+cycB => cdk1-cycB
Phosphorylation: A =[C]=> A~{p}
Dephosphorylation: A~{p} =[C]=> A
   Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
   Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
Synthesis: _ =[C]=> A
Degradation: A =[C]=> _
   _=[#Ge2-E2f13-Dp12]=>cycA
   cycE _ (not for cycE-cdk2 which is stable)
Transport: A::L1 => A::L2
   Cdk1~{p}-CycB::cytoplasm=>Cdk1~{p}-CycB::nucleus

BIOCHAM Syntax of Reaction Rules
R ::= S=>S | S=[O]=>S | S<=>S | S<=[O]=>S
where A=[C]=>B stands for A+C=>B+C, A<=>B stands for A=>B and B=>A, etc.
N ::= kinetic for R (import/export SBML format)
Three abstraction levels:
1. Boolean semantics: presence-absence of molecules
   1. Concurrent transition system (asynchronous, non-deterministic)
2. Concentration semantics: number / volume of diffusion
   1. Ordinary differential equations or hybrid system (deterministic)
3. Stochastic semantics: number of molecules
   1. Continuous-time Markov chain
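The two abbreviations stated in the rule syntax (a catalyst C in A=[C]=>B is added to both sides, and a reversible rule between A and B stands for A=>B and B=>A) amount to a small desugaring step. The sketch below shows it in Python under simplifying assumptions: molecules are plain strings, solutions are lists, and `expand` is a hypothetical helper, not part of BIOCHAM.

```python
def expand(left, right, catalyst=None, reversible=False):
    """Return the plain rules (reactants, products) denoted by one rule.

    left, right: lists of molecule names on each side of the arrow.
    catalyst:    molecule written inside =[...]=>, added to both sides.
    reversible:  True for a rule that also stands for the opposite direction.
    """
    if catalyst is not None:
        left = left + [catalyst]
        right = right + [catalyst]
    rules = [(sorted(left), sorted(right))]
    if reversible:
        rules.append((sorted(right), sorted(left)))
    return rules

# RAS-GDP =[L-R]=> RAS-GTP  becomes  RAS-GDP + L-R => RAS-GTP + L-R
print(expand(["RAS-GDP"], ["RAS-GTP"], catalyst="L-R"))
# A reversible binding of L and R yields one rule per direction.
print(expand(["L", "R"], ["L-R"], reversible=True))
```

Sorting each side is only a convenient way to respect the commutativity of "+" when comparing rules.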

The Actin-Myosin two-stroke Engine with ATP fuel
Myosin + ATP => Myosin-ATP
Myosin-ATP => Myosin + ADP

Cell to Cell Signaling by Hormones and Receptors
Signals: insulin, adrenaline, steroids, EGF, …, Delta, …, nutrients, light, pressure, …
Receptors: tyrosine kinases, G-protein coupled, Notch, …
L + R <=> L-R
RAS-GDP =[L-R]=> RAS-GTP

Five MAP Kinase Pathways in Budding Yeast (Saccharomyces cerevisiae)

MAPK Signaling Pathways
Input: RAF, activated by the receptor
   RAF-p RAS-GTP => RAF + p RAS-GDP
Output: MAPK~{T183,Y185} moves to the nucleus and phosphorylates a transcription factor, which stimulates gene transcription

MAPK Signaling Pathway in BIOCHAM
RAF + RAFK <=> RAF-RAFK.
RAF-RAFK => RAFK + RAF~{p1}.
RAF~{p1} + RAFPH <=> RAF~{p1}-RAFPH.
RAF~{p1}-RAFPH => RAF + RAFPH.
MEK~$P + RAF~{p1} <=> MEK~$P-RAF~{p1} where p2 not in $P.
MEK~{p1}-RAF~{p1} => MEK~{p1,p2} + RAF~{p1}.
MEK-RAF~{p1} => MEK~{p1} + RAF~{p1}.
MEKPH + MEK~{p1}~$P <=> MEK~{p1}~$P-MEKPH.
MEK~{p1}-MEKPH => MEK + MEKPH.
MEK~{p1,p2}-MEKPH => MEK~{p1} + MEKPH.
MAPK~$P + MEK~{p1,p2} <=> MAPK~$P-MEK~{p1,p2} where p2 not in $P.
MAPKPH + MAPK~{p1}~$P <=> MAPK~{p1}~$P-MAPKPH.
MAPK~{p1}-MAPKPH => MAPK + MAPKPH.
MAPK~{p1,p2}-MAPKPH => MAPK~{p1} + MAPKPH.
MAPK-MEK~{p1,p2} => MAPK~{p1} + MEK~{p1,p2}.
MAPK~{p1}-MEK~{p1,p2} => MAPK~{p1,p2} + MEK~{p1,p2}.
Pattern variables $P for phosphorylation sites.
Molecules with constraints.
BIOCHAM rules are expanded into BIOCHAM-0 rules without patterns.
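The expansion of a pattern variable $P into pattern-free forms can be sketched as an enumeration of site sets that satisfy the "where" constraint. The Python below is only an illustration: it assumes that the sites of a molecule are known in advance (here p1 and p2, as they appear in the rules), and `instantiate` is a hypothetical helper rather than BIOCHAM's own expansion procedure.

```python
from itertools import combinations

def subsets(sites):
    """All subsets of a list of sites, smallest first."""
    for size in range(len(sites) + 1):
        for combo in combinations(sites, size):
            yield combo

def instantiate(name, sites, excluded=()):
    """All forms name~{...} obtained when $P ranges over the subsets of
    `sites` that contain none of the `excluded` sites."""
    forms = []
    for chosen in subsets(sorted(sites)):
        if set(chosen) & set(excluded):
            continue  # violates a constraint such as "p2 not in $P"
        suffix = "~{" + ",".join(chosen) + "}" if chosen else ""
        forms.append(name + suffix)
    return forms

# MEK~$P where p2 not in $P
print(instantiate("MEK", ["p1", "p2"], excluded=["p2"]))
```

With the constraint "p2 not in $P", MEK~$P stands for MEK and MEK~{p1}, the two forms that occur on the left of the MEK phosphorylation rules.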

Bipartite Proteins-Reactions Graph of MAPK (GraphViz)

Random Boolean Simulation of MAPK Signaling

Numerical Simulation of MAPK in BIOCHAM-2

Boolean Semantics
Associate:
- Boolean state variables to molecules, denoting the presence/absence of molecules in the cell or compartment
- a finite concurrent transition system [Shankar 93] to rules (asynchronous), over-approximating the set of all possible behaviors
A reaction A+B=>C+D is translated into 4 transition rules for the possibly complete consumption of reactants:
   A+B → A+B+C+D
   A+B → ¬A+B+C+D
   A+B → A+¬B+C+D
   A+B → ¬A+¬B+C+D
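The translation of one reaction into several Boolean transitions can be sketched by letting every subset of the reactants be consumed. In this Python illustration a state is the set of molecules that are present; the representation and the name `successors` are assumptions made for the example.

```python
from itertools import combinations

def successors(state, reactants, products):
    """Boolean successors of `state` under the reaction reactants => products.

    The reaction is enabled only if every reactant is present. Each reactant
    may or may not be completely consumed, which gives one successor per
    subset of consumed reactants. Products are always present afterwards.
    """
    state = frozenset(state)
    if not set(reactants) <= state:
        return set()  # reaction not enabled in this state
    result = set()
    ordered = sorted(reactants)
    for size in range(len(ordered) + 1):
        for consumed in combinations(ordered, size):
            result.add((state - set(consumed)) | set(products))
    return result

for nxt in sorted(successors({"A", "B"}, ["A", "B"], ["C", "D"]), key=sorted):
    print(sorted(nxt))
```

For A+B=>C+D this yields the four successor states of the slide; a reaction with n distinct reactants yields up to 2^n of them.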

Kripke Structure K=(S,R)
Given: V a set of state variables with domain D, and T a set of transition rules between states.
Associate: a Kripke structure (S,R) where
- S = D^V is the set of possible states, with variables ranging over domain D,
- R ⊆ S×S is the total relation induced by T, that is:
   (A,B) is in R if there exists a transition rule from state A to B,
   (A,A) is in R if there exists no transition from state A.
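The construction of the total relation R can be sketched directly from this definition. In the Python below, states are tuples of values (one per variable) and, as a simplifying assumption, the transitions T are already given as explicit pairs of states rather than as rules; `kripke_structure` is a hypothetical name.

```python
from itertools import product

def kripke_structure(variables, domain, transitions):
    """Build (S, R): S is the set of all assignments of `domain` values to
    `variables`, and R contains the given transitions plus a self-loop on
    every state without an outgoing transition, which makes R total."""
    states = set(product(domain, repeat=len(variables)))
    relation = set(transitions)
    with_successor = {source for source, _ in relation}
    relation |= {(s, s) for s in states - with_successor}
    return states, relation

# Two Boolean variables (A, B) and a single transition: A present -> B present.
S, R = kripke_structure(["A", "B"], (False, True),
                        {((True, False), (False, True))})
print(len(S), len(R))
```

Totality matters for temporal logics such as CTL, whose semantics is defined over infinite paths: the added self-loops guarantee that every state has at least one successor.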