François Fages Rennes March 2005 The Biochemical Abstract Machine BIOCHAM-2 François Fages, Contraintes project-team, Theme: symbolic systems, INRIA Rocquencourt.

The Biochemical Abstract Machine BIOCHAM-2
François Fages, Contraintes project-team, Theme: symbolic systems, INRIA Rocquencourt
Joint work with: Nathalie Chabrier-Rivier, Sylvain Soliman, Laurence Calzone
ARC CPBIO “Process Calculi and Biology of Molecular Networks”: A. Bockmayr (LORIA), V. Danos (CNRS PPS), V. Schächter (Genoscope Evry)

Systems Biology?
Multidisciplinary field aiming to get past the complexity walls and reason about biological processes at the system level.
Virtual cell: emulate high-level biological processes in terms of their biochemical basis at the molecular level (in silico experiments).
Beyond providing tools to biologists, Computer Science has much to offer in terms of concepts and methods.
Bioinformatics: in the late 1990s, genomic sequences → post-genomic data (RNA expression, protein synthesis, protein-protein interactions, …).
Need for a strong parallel effort on:
- the formal representation of biological processes,
- formal tools for modeling and reasoning about their global behavior.

Language Approach to (Cell) Systems Biology
Qualitative models: from diagrammatic notation to
- Boolean networks [Thomas 73]
- Milner’s π-calculus [Regev-Silverman-Shapiro 99-01, Nagasali et al. 00]
- Concurrent transition systems [Chabrier-Chiaverini-Danos-Fages-Schachter 03]
- Biochemical abstract machine BIOCHAM-1 [Chabrier-Fages 03]
- Pathway logic [Eker-Knapp-Laderoute-Lincoln-Meseguer-Sonmez 02]
- Bio-ambients [Regev-Panina-Silverman-Cardelli-Shapiro 03]
Quantitative models: from differential equation systems to
- Hybrid Petri nets [Hofestadt-Thelen 98, Matsuno et al. 00]
- Hybrid automata [Alur et al. 01, Ghosh-Tomlin 01]
- Hybrid concurrent constraint languages [Bockmayr-Courtois 01]
- Rule-based compositional language BIOCHAM-2 [Chabrier-Fages-Soliman 04]

Plan for today
1. Introduction
2. BIOCHAM Language for Modeling Biochemical Systems
   1. Syntax: molecules and reactions
   2. Semantics at 3 abstraction levels: molecule populations, concentrations, Boolean
3. BIOCHAM Language for Formalizing Biological Properties
   1. Computation Tree Logic for Boolean semantics
   2. Constraint Linear Time Logic for concentration semantics
4. Machine Learning from Temporal Properties
   1. Learning reaction rules
   2. Learning kinetic parameter values
5. Conclusion, collaborations and perspectives

Modeling Biochemical Systems: syntax of molecules
Small molecules: covalent bonds (outer electrons shared) kcal/mol
- 70% water
- 1% ions
- 6% amino acids (20), nucleotides (5), fats, sugars, ATP, ADP, …
Macromolecules: hydrogen bonds, ionic, hydrophobic, van der Waals: 1-5 kcal/mol
Stability and bindings determined by the number of weak bonds: 3D shape
- 20% proteins ( amino acids)
- RNA ( nucleotides AGCU)
- DNA ( nucleotides AGCT)

Formal proteins
Cyclin-dependent kinase 1: Cdk1 (free, inactive)
Complex Cdk1-Cyclin B: Cdk1-CycB (low activity)
Phosphorylated form at site threonine 161: Cdk1~{thr161}-CycB (high activity)
(BIOCHAM syntax)

Formal Genes and RNA
Genes = parts of DNA: #ERCC1
Gene transcription: RNA copying from a gene
RNA expression: protein synthesis from an RNA
#ERCC1-(PRB-JUN-CFOS)

BIOCHAM Syntax of Molecules
E ::= Name | E-E | E~{E,…,E} | (E)
S ::= _ | E+S
Names: molecules, proteins, #gene binding sites.
Binding operator - : for protein complexes, gene binding sites, … Associative and commutative.
Modification operator ~{…} : for phosphorylated sites, … Set of modified sites (associative, commutative, idempotent).
Solution operator + : “soup aspect”. Associative, commutative, idempotent, with neutral element _.
No membranes, no transport formalized. Bitonal calculi [Cardelli 03].

BIOCHAM Syntax of Reactions
N ::= name : expr for R | name : R | expr for R | R
R ::= S=>S | S=[E]=>S | S=[R]=>S | S<=>S | S<=[E]=>S
where A<=>B stands for A=>B and B=>A, A=[C]=>B stands for A+C=>B+C, etc.
Three abstraction levels:
1. Boolean abstraction: presence/absence of molecules (concurrent transition system)
2. Concentrations: number / volume (ODE)
3. Population of molecules: number of molecules (multiset rewriting, stochastic)

Boolean Semantics (BIOCHAM-1)
Associate:
- Boolean state variables to molecules, denoting the presence/absence of molecules in the cell or compartment;
- a finite concurrent transition system [Shankar 93] to rules (asynchronous), over-approximating the set of all possible behaviors.
A reaction A+B=>C+D is translated into 4 transition rules, taking into account the possible consumption of reactants:
A+B → A+B+C+D
A+B → ¬A+B+C+D
A+B → A+¬B+C+D
A+B → ¬A+¬B+C+D
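The translation of one reaction into several Boolean transitions can be sketched in a few lines of Python. This is only an illustration of the idea on the slide, not BIOCHAM's implementation: the function name `boolean_successors` and the state encoding (a frozenset of present molecules) are assumptions, as is the choice to keep a molecule present when it appears on both sides of the rule.

```python
from itertools import product

def boolean_successors(state, reactants, products):
    """Successor states of `state` under one reaction (Boolean abstraction).

    `state` is a frozenset of the molecules that are present. The rule is
    enabled only if all reactants are present. Products become present and
    each reactant may or may not be consumed, which gives up to
    2**len(reactants) successors.
    """
    if not set(reactants) <= state:
        return set()
    successors = set()
    for kept in product([True, False], repeat=len(reactants)):
        nxt = set(state) | set(products)
        for mol, keep in zip(reactants, kept):
            # Assumption: a molecule that is also a product stays present.
            if not keep and mol not in products:
                nxt.discard(mol)
        successors.add(frozenset(nxt))
    return successors

# The reaction A+B=>C+D fired from the state where only A and B are present.
succ = boolean_successors(frozenset({"A", "B"}), ["A", "B"], ["C", "D"])
for s in succ:
    print(sorted(s))
```

The four states printed correspond to the four transition rules listed above: both reactants kept, only A consumed, only B consumed, both consumed.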

Six Elementary Reaction Rule Schemas
Complexation: A + B => A-B
Decomplexation: A-B => A + B
  Cdk1+CycB => Cdk1-CycB
Phosphorylation: A =[C]=> A~{p}
Dephosphorylation: A~{p} =[C]=> A
  Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
  Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
Synthesis: _ =[C]=> A
  _ =[#Ge2-E2f13-Dp12]=> CycA
Degradation: A =[C]=> _
  CycE _ (not for CycE-Cdk2 which is stable)

MAPK Signaling Pathway
RAF + RAFK <=> RAF-RAFK.
RAF~{p1} + RAFPH <=> RAF~{p1}-RAFPH.
MEK~$P + RAF~{p1} <=> MEK~$P-RAF~{p1} where p2 not in $P.
MEKPH + MEK~{p1}~$P <=> MEK~{p1}~$P-MEKPH.
MAPK~$P + MEK~{p1,p2} <=> MAPK~$P-MEK~{p1,p2} where p2 not in $P.
MAPKPH + MAPK~{p1}~$P <=> MAPK~{p1}~$P-MAPKPH.
RAF-RAFK => RAFK + RAF~{p1}.
RAF~{p1}-RAFPH => RAF + RAFPH.
MEK~{p1}-RAF~{p1} => MEK~{p1,p2} + RAF~{p1}.
MEK-RAF~{p1} => MEK~{p1} + RAF~{p1}.
MEK~{p1}-MEKPH => MEK + MEKPH.
MEK~{p1,p2}-MEKPH => MEK~{p1} + MEKPH.
MAPK-MEK~{p1,p2} => MAPK~{p1} + MEK~{p1,p2}.
MAPK~{p1}-MEK~{p1,p2} => MAPK~{p1,p2} + MEK~{p1,p2}.
MAPK~{p1}-MAPKPH => MAPK + MAPKPH.
MAPK~{p1,p2}-MAPKPH => MAPK~{p1} + MAPKPH.


Concentration Semantics
Add kinetic expressions to BIOCHAM reaction rules: k*[A]*[B] for A + B => C
Associate real values to molecules: [A] is the concentration of A
Associate a system of ordinary differential equations (ODE) to a system of reaction rules (BIOCHAM model)

Physical Interpretation of Kinetic Expressions
1) Probability of collision
   Different diffusion speeds of molecules (small > substrates > enzymes …)
   Average travel in a random walk: 1 μm in 1 s, 2 μm in 4 s, 10 μm in 100 s
   random collisions per second for a substrate concentration of
   random collisions per second for a substrate concentration of
2) Probability of reaction upon collision
   Non-elastic collision determined by the shape and orientation of matching surfaces
3) Energy of bonds (for dissociation rates)

The Law of Mass Action is Compositional
Law: the number of reactions is proportional to the number of A's and B's.
A + B →k C: reaction rate = k*A*B = dC/dt, dA/dt = -k*A*B, dB/dt = -k*A*B
Diffusion assumption: each molecule moves independently of other molecules in a random walk (dilute solutions, low concentration).
The dynamics of a complex system is the composition of the dynamics of its reactions under the mass action law (at given temperature, pH, …):
E+S →k1 C →k2 E+P, E+S ←k3 C
dE/dt = -k1*E*S + (k2+k3)*C
dC/dt = k1*E*S - (k2+k3)*C
dS/dt = -k1*E*S + k3*C
dP/dt = k2*C
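Compositionality here means that the ODE system is obtained by summing, species by species, the contribution of each reaction. A minimal Python sketch of that summation follows; the function name `mass_action_rhs`, the tuple encoding of reactions and the numeric values of k1, k2, k3 are illustrative assumptions, not part of BIOCHAM.

```python
def mass_action_rhs(reactions, conc):
    """Sum the mass-action contribution of every reaction to d[X]/dt.

    `reactions` is a list of (rate_constant, reactants, products), where
    reactants and products are lists of species names (a name is repeated
    for a stoichiometry above 1). `conc` maps species names to
    concentrations.
    """
    deriv = {name: 0.0 for name in conc}
    for k, reactants, products in reactions:
        rate = k
        for r in reactants:
            rate *= conc[r]
        for r in reactants:
            deriv[r] -= rate
        for p in products:
            deriv[p] += rate
    return deriv

# Enzymatic system from the slide; the constants are arbitrary values.
k1, k2, k3 = 2.0, 3.0, 5.0
enzymatic = [
    (k1, ["E", "S"], ["C"]),   # E+S -> C
    (k2, ["C"], ["E", "P"]),   # C -> E+P
    (k3, ["C"], ["E", "S"]),   # C -> E+S
]
state = {"E": 1.0, "S": 2.0, "C": 0.5, "P": 0.0}
d = mass_action_rhs(enzymatic, state)
print(d)
```

Adding a reaction to the list only adds terms to the sums, which is the sense in which the mass action law composes.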

Multi-Scale Phenomena
Hydrolysis of benzoyl-L-arginine ethyl ester by trypsin:
present(En,1e-8). present(S,1e-5). absent(C). absent(P).
(k1*[En]*[S],km1*[C]) for En+S <=> C.
k2*[C] for C => En+P.
parameter(k1,4e6). parameter(km1,25). parameter(k2,15).
Complex formation: 5e-9 in 0.1 s
Product formation: 1e-5 in 1000 s
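The fast time scale of this model can be checked with a short simulation. The sketch below uses a fixed-step explicit Euler scheme, which is an assumption made for brevity and not necessarily the integrator BIOCHAM uses; the function name `simulate` and the step size are likewise illustrative.

```python
def simulate(t_end, dt=1e-4):
    """Explicit Euler integration of the trypsin model under mass action."""
    k1, km1, k2 = 4e6, 25.0, 15.0
    en, s, c, p = 1e-8, 1e-5, 0.0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        bind = k1 * en * s      # En + S => C
        unbind = km1 * c        # C => En + S
        cat = k2 * c            # C => En + P
        en += dt * (-bind + unbind + cat)
        s += dt * (-bind + unbind)
        c += dt * (bind - unbind - cat)
        p += dt * cat
    return en, s, c, p

en, s, c, p = simulate(0.1)
print("complex after 0.1 s:", c)
print("product after 0.1 s:", p)
```

After 0.1 s the complex has essentially reached its quasi-steady level near 5e-9, while almost no product has formed. Covering the 1000 s needed for product formation with the same step would take 10^7 iterations, which suggests why adaptive or stiff solvers are attractive for multi-scale systems.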

Michaelis-Menten, Hill, … kinetics are not compositional
They are derived from the mass action law by quasi-steady-state approximation for given simple systems:
- simple enzymatic reaction for Michaelis-Menten,
- simple cooperative n-dimeric enzymatic reaction for Hill of order n.
The quasi-steady-state approximation may no longer be valid after composition with other molecules and reactions.
In a compositional approach to Systems Biology (making models composable and re-usable in different contexts), Michaelis-Menten kinetics, Hill kinetics, etc. should be abandoned as reaction kinetics (no intrinsic value) and recovered after composition (as a property of the system).


Temporal Logic CTL as a Query Language
Computation Tree Logic [Clarke & al. 99]
Time \ Choice    E (exists)            A (always)
X next time      EX(φ)                 AX(φ)
F finally        EF(φ) ⇔ ¬AG(¬φ)       AF(φ) (liveness)
G globally       EG(φ) ⇔ ¬AF(¬φ)       AG(φ) (safety)
U until          E(φ1 U φ2)            A(φ1 U φ2)

Biological Queries (1/3)
About reachability:
Given an initial state init, can the cell produce some protein P? init ⇒ EF(P)
Which are the states from which a set of products P1, ..., Pn can be produced simultaneously? EF(P1 ^ … ^ Pn)
About pathways:
Can the cell reach a state s while passing by another state s2? init ⇒ EF(s2 ^ EF s)
Is state s2 a necessary checkpoint for reaching state s? ¬E(¬s2 U s)
Is it possible to produce P without using or creating Q? E(¬Q U P)
Can the cell reach a state s without violating some constraints c? init ⇒ E(c U s)

Biological Queries (2/3)
About stability:
Is a certain (partially described) state s a stable state? s ⇒ AG(s) (s denotes both the state and the formula describing it).
Is s a steady state (with possibility of escaping)? s ⇒ EG(s)
Can the cell reach a stable state? init ⇒ EF(AG(s)) (not an LTL formula).
Must the cell reach a stable state? init ⇒ AF(AG(s))
What are the stable states? Not expressible in CTL [Chan 00].
Can the system exhibit a cyclic behavior w.r.t. the presence of P? init ⇒ EG((P ⇒ EF ¬P) ^ (¬P ⇒ EF P))

Biological Queries (3/3)
About the correctness of the model:
Can one see the inaccuracies of the model and correct them?
- Exhibit a counterexample pathway or a witness.
- Suggest refinements of the model or biological experiments to validate/invalidate the property of the model.
About durations:
How long does it take for a molecule to become activated?
In a given time, how many Cyclins A can be accumulated?
What is the duration of a given cell cycle’s phase?
CTL operators abstract from durations. Time intervals can be modeled in first-order logic (FO) by adding numerical arguments for start times and durations.

MAPK Signaling Pathway
MEK~{p1} is a checkpoint for producing MAPK~{p1,p2}:
biocham: !E(!MEK~{p1} U MAPK~{p1,p2})
True
The PH complexes are not compulsory for the cascade:
biocham: !E(!MEK~{p1}-MEKPH U MAPK~{p1,p2})
false
Step 1 rule 15
Step 2 rule 1 RAF-RAFK present
Step 3 rule 21 RAF~{p1} present
Step 4 rule 5 MEK-RAF~{p1} present
Step 5 rule 24 MEK~{p1} present
Step 6 rule 7 MEK~{p1}-RAF~{p1} present
Step 7 rule 23 MEK~{p1,p2} present
Step 8 rule 13 MAPK-MEK~{p1,p2} present
Step 9 rule 27 MAPK~{p1} present
Step 10 rule 15 MAPK~{p1}-MEK~{p1,p2} present
Step 11 rule 28 MAPK~{p1,p2} present

Kripke Semantics
A Kripke structure K is a triple (S, R, L) where S is a set of states, R ⊆ S×S is a total relation, and L labels each state with the atomic propositions true in it.
s |= α if α is true in s,
s |= E ψ if there is a path π from s such that π |= ψ,
s |= A ψ if for every path π from s, π |= ψ,
π |= φ if s |= φ where s is the starting state of π,
π |= X ψ if π^1 |= ψ,
π |= F ψ if there exists k ≥ 0 such that π^k |= ψ,
π |= G ψ if for every k ≥ 0, π^k |= ψ,
π |= ψ U ψ' if there exists k ≥ 0 such that π^k |= ψ' and for all j < k, π^j |= ψ.
Following [Emerson 90] we identify a formula φ with the set of states which satisfy it: φ ~ {s ∈ S : s |= φ}.

Symbolic Model Checking
Model checking is an algorithm for computing, in a given finite Kripke structure, the set of states satisfying a CTL formula: {s ∈ S : s |= φ}.
Basic algorithm: represent K as a graph and iteratively label the nodes with the subformulas of φ which are true in that node.
- Add φ to the states satisfying φ
- Add EF φ (EX φ) to the (immediate) predecessors of states labeled by φ
- Add E(φ U ψ) to the predecessor states of ψ while they satisfy φ
- Add EG φ to the states for which there exists a path leading to a non-trivial strongly connected component of the subgraph of states satisfying φ
Symbolic model checking: use Boolean constraints (BDDs) to represent sets of states and transitions (S is finite).
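The labeling algorithm can be sketched on an explicit graph with plain Python sets. This is an explicit-state illustration under stated assumptions, not the BDD-based symbolic algorithm: formulas are identified with sets of states, EG is computed as a greatest fixpoint rather than through strongly connected components, and the four-state Kripke structure with its labels `p` and `q` is invented for the example.

```python
def pre(transitions, target):
    """States with at least one successor in `target` (the EX operator)."""
    return {s for s, succs in transitions.items() if succs & target}

def eu(transitions, phi, psi):
    """E(phi U psi): grow backwards from psi through states satisfying phi."""
    result = set(psi)
    while True:
        new = (pre(transitions, result) & phi) - result
        if not new:
            return result
        result |= new

def ef(transitions, phi):
    """EF(phi) is E(true U phi)."""
    return eu(transitions, set(transitions), phi)

def eg(transitions, phi):
    """EG(phi): repeatedly drop states with no successor left in the set."""
    result = set(phi)
    while True:
        keep = result & pre(transitions, result)
        if keep == result:
            return result
        result = keep

# Hypothetical Kripke structure: 0 -> 1 -> 2 (loop on 2) and 0 -> 3 (loop on 3).
transitions = {0: {1, 3}, 1: {2}, 2: {2}, 3: {3}}
states = set(transitions)
p, q = {2}, {1}

reach_p = ef(transitions, p)                          # EF p
no_bypass = states - eu(transitions, states - q, p)   # not E(not q U p)
avoid_p = eg(transitions, states - p)                 # EG not p
print(reach_p, no_bypass, avoid_p)
```

In this toy structure, state 0 satisfies EF p, q is a checkpoint for p from state 0, and p can be avoided forever from state 0 by taking the branch to state 3.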

Cell Cycle: G1 → DNA Synthesis → G2 → Mitosis
G1: Cdk4-CycD, Cdk6-CycD, Cdk2-CycE
S: Cdk2-CycA
G2
M: Cdk1-CycA, Cdk1-CycB

Mammalian Cell Cycle Control Map [Kohn 99]

Kohn’s map detail for Cdk2
Complexation with CycA and CycE
Phosphorylation sites PY15 and P
Biocham Rules:
cdk2~$P + cycA-$C => cdk2~$P-cycA-$C where $C in {_,cks1}.
cdk2~$P + cycE~$Q-$C => cdk2~$P-cycE~$Q-$C where $C in {_,cks1}.
p57 + cdk2~$P-cycA-$C => p57-cdk2~$P-cycA-$C where $C in {_, cks1}.
cycE-$C =[cdk2~{p2}-cycE-$S]=> cycE~{T380}-$C where $S in {_, cks1} and $C in {_, cdk2~?, cdk2~?-cks1}
rules, 165 proteins and genes, 500 variables, states.

Mammalian Cell Cycle Control Benchmark
rules, 165 proteins and genes, 500 variables, states.
BIOCHAM NuSMV model-checker, time in seconds. Initial state: G2.
Query                             CTL formula                                        Time (s)
Compiling                                                                            29
Reachability G1                   EF CycE                                            2
Reachability G1                   EF CycD                                            1.9
Reachability G1                   EF PCNA-CycD                                       1.7
Checkpoint for mitosis complex    ¬E(¬Cdc25~{Nterm} U Cdk1~{Thr161}-CycB)            2.2
Cycle                             EG((CycA ⇒ EF ¬CycA) ^ (¬CycA ⇒ EF CycA))          31.8


Learning by Theory Revision
Theory T: BIOCHAM model
- molecule declarations
- interaction rules: complexation, phosphorylation, …
Examples φ: CTL specification of biological properties
- Reachability
- Checkpoints
- Stable states
- Oscillations
Bias R: rule pattern
- Kind of reaction rules to learn
Find R such that T,R |= φ

Simple Ad-hoc Enumerative Algorithm
For learning one reaction rule:
1. Compute the list of candidate rules: all instances of the rule pattern (the bias).
2. Order the candidates by increasing complexity: sort the rules by size.
3. For each candidate, add it to the model and check the CTL specification in the augmented model. If the specification is satisfied, output the rule as an answer.
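The three steps above can be sketched as a generate-and-test loop in Python. Everything here is a simplification made for illustration: the specification is reduced to a single reachability property, reachability of one molecule is computed as a closure (valid in the Boolean semantics because reactants may be left unconsumed), rule size is approximated by name length, and the names `producible` and `learn_one_rule` are hypothetical. The model is the Tyson-style cell cycle model used later in the talk, with one rule erased; the sketch does not attempt to reproduce BIOCHAM's actual output.

```python
from itertools import permutations

def producible(rules, initial):
    """Molecules that can appear in some reachable Boolean state."""
    present = set(initial)
    changed = True
    while changed:
        changed = False
        for reactants, products in rules:
            if set(reactants) <= present and not set(products) <= present:
                present |= set(products)
                changed = True
    return present

def learn_one_rule(model, initial, molecules, target):
    # Step 1: all instances of the pattern $Q => $P over the given molecules.
    candidates = [((q,), (p,)) for q, p in permutations(molecules, 2)]
    # Step 2: order by a crude size measure (total name length).
    candidates.sort(key=lambda r: (len(r[0][0]) + len(r[1][0]), r))
    # Step 3: test each candidate in the augmented model.
    return [rule for rule in candidates
            if target in producible(model + [rule], initial)]

model = [
    ((), ("cyclin",)),                                       # _ => cyclin
    (("cdc2~{p}", "cyclin"), ("cdc2~{p}-cyclin~{p}",)),
    (("cdc2-cyclin~{p}",), ("cdc2", "cyclin~{p}")),
    (("cyclin~{p}",), ()),                                   # cyclin~{p} => _
    (("cdc2",), ("cdc2~{p}",)),
    (("cdc2~{p}",), ("cdc2",)),
]
molecules = ["cdc2", "cdc2~{p}", "cyclin", "cyclin~{p}",
             "cdc2~{p}-cyclin~{p}", "cdc2-cyclin~{p}"]
target = "cdc2-cyclin~{p}"
answers = learn_one_rule(model, ["cdc2"], molecules, target)
for reactants, products in answers:
    print(" + ".join(reactants), "=>", " + ".join(products))
```

With only a reachability property, several candidates qualify, including the erased rule. Richer specifications narrow the set, at the price of one model-checking run per candidate.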

Improved Theory Revision Algorithm
General idea of constraint programming: replace a generate-and-test algorithm by a constrain-and-generate algorithm.
Anticipate whether one has to add or remove a rule:
- Positive CTL formula: if false, remains false after removing a rule. Example: EF(φ) where φ is a Boolean formula (pure state description).
- Negative CTL formula: if false, remains false after adding a rule. Example: AG(φ) where φ is a Boolean formula. Remove a rule on the path given by the model checker (why command).
- Unclassified CTL formulae: checkpoint(a,b) = ¬E(¬a U b). Yet if EF(b) is true, then checkpoint(a,b) is a negative formula. loop(a) = EG((a ⇒ EF ¬a) ^ (¬a ⇒ EF a)).

Rule Inference in Cell Cycle Control
[Tyson et al. 91] model over 6 variables, initial state present(cdc2).
_ => cyclin.
cdc2~{p} + cyclin => cdc2~{p}-cyclin~{p}.
cdc2~{p}-cyclin~{p} => cdc2-cyclin~{p}. ERASED
cdc2-cyclin~{p} => cdc2 + cyclin~{p}.
cyclin~{p} => _.
cdc2 <=> cdc2~{p}.

Rule Inference in Cell Cycle Control (cont.)
CTL specification of biological properties:
Activation of the kinase-cyclin (MPF) complex: reachable(cdc2-cyclin~{p}).
Oscillation of the cycle’s phase: loop(cyclin & cyclin~{p} & !(cdc2-cyclin~{p})).

Rule Inference in Cell Cycle Control (cont.)
? learn([$Q=>$P where $P in complexes and $Q in complexes]).
_=>cdc2-cyclin~{p}
cyclin=>cdc2-cyclin~{p}
cdc2~{p}-cyclin~{p}=>cdc2-cyclin~{p}
? learn([$qp=>$q where $q in complexes and $qp modif $q]).
cdc2~{p}-cyclin~{p}=>cdc2-cyclin~{p}
Adding temporal specification checkpoint(cdc2~{p},cdc2-cyclin~{p}).
? learn([$Q=>$P where $P in complexes and $Q in complexes]).
cdc2~{p}-cyclin~{p}=>cdc2-cyclin~{p}

Process Inference in Cell Cycle Control
[Tyson et al. 91] model over 6 variables, initial state present(cdc2).
_ => cyclin.
cdc2~{p} + cyclin => cdc2~{p}-cyclin~{p}. ERASED
cdc2~{p}-cyclin~{p} => cdc2-cyclin~{p}. ERASED
cdc2-cyclin~{p} => cdc2 + cyclin~{p}.
cyclin~{p} => _.
cdc2 <=> cdc2~{p}.

Process Inference in Cell Cycle Control (cont.)
? learn([$qp =>$q where $q in complexes and $qp modif $q, $p+$q=>$p-$q where $q in complexes and $p in complexes]).
No rule
? learn(_ => $q where $q in complexes).
No rule
? learn([$R=> $P where $P in complexes and $R in complexes]).
cdc2=>cdc2-cyclin~{p}
cyclin=>cdc2-cyclin~{p}
cdc2~{p}=>cdc2-cyclin~{p}
? learn([$R+ $Q=> $Rp- $Qp where $Q in complexes and $R in complexes and $Rp modif $R and $Qp modif $Q]).
cdc2~{p}+cyclin=>cdc2-cyclin~{p}

Cell Cycle Control [Qu et al. 03]
_=[Cdk-CycB]=>APC.
APC=>_.
_=>Cdk.
CycB=[APC]=>_.
CycB-Cdk=[APC]=>_.
CycB~{p1}-Cdk=[APC]=>_.
Cdk+CycB => Cdk-CycB.
Cdk-CycB~{p1}=[C25~{p1,p2}]=>Cdk-CycB.
Cdk-CycB=[Wee1]=>Cdk-CycB~{p1}.
C25=[Cdk-CycB]=>C25~{p1}.
C25~{p1}=>C25.
C25~{p1}=[Cdk-CycB]=>C25~{p1,p2}.
C25~{p1,p2}=>C25~{p1}.
Wee1=[Cdk-CycB]=>Wee1~{p1}.
Wee1~{p1}=>Wee1.
CKI=[APC]=>_.
CKI+Cdk-CycB=>C.
C=[Cdk-CycB]=>C~{p1}.
C~{p1}=[APC]=>Cdk-CycB.

Constraint-Based Linear Time Logic
Constraints over concentrations and derivatives as FOL formulae over the reals:
[M] > 0.2
[M]+[P] > [Q]
d([M])/dt < 0
LTL operators for time: X, F, G, U (no non-determinism).
F([M]>0.2)
FG([M]>0.2)
F ([M]>2 & F (d([M])/dt 0 & F(d([M])/dt<0))))

Traces from Numerical Simulation
From a system of ordinary differential equations dX/dt = f(X), numerical integration produces a discretization of time (by Euler, Runge-Kutta, adaptive step size Runge-Kutta, or Rosenbrock methods).
The trace is a linear Kripke structure: (t0,X0), (t1,X1), …, (tn,Xn).
The derivatives can be added to the trace: (t0,X0,dX0/dt), (t1,X1,dX1/dt), …, (tn,Xn,dXn/dt).
Equality x=v is true if xi ≤ v & xi+1 ≥ v, or if xi ≥ v & xi+1 ≤ v (intermediate value theorem).

Constraint-Based LTL (Forward) Model Checking
Hypothesis 1: the initial state is completely known.
Hypothesis 2: the formula can be checked over a finite period of time [0,T].
Simple algorithm based on the trace of the numerical simulation:
1. Run the numerical simulation from 0 to T, producing values at a finite sequence of time points.
2. Iteratively label the time points with the sub-formulae of φ that are true:
   - Add φ to the time points where a FOL formula φ is true,
   - Add F φ (X φ) to the (immediate) previous time points labeled by φ,
   - Add φ U ψ to the predecessor time points of ψ while they satisfy φ,
   - (Add G φ to the states satisfying φ until T (optimistic abstraction…))
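The labeling of time points can be sketched as a backward pass over a finite trace. The Python below is an illustration under stated assumptions, not the Prolog implementation used by BIOCHAM: formulas are encoded as nested tuples, X is taken to be false at the last time point, G is read optimistically up to the end of the trace, and the small trace of [M] values is invented for the example.

```python
def check_trace(formula, trace):
    """Truth value of `formula` at each time point of a finite trace.

    Formulas are nested tuples:
      ("atom", predicate)           predicate(state) -> bool
      ("not", f), ("and", f, g), ("or", f, g)
      ("X", f), ("F", f), ("G", f), ("U", f, g)
    """
    n = len(trace)
    op = formula[0]
    if op == "atom":
        return [bool(formula[1](state)) for state in trace]
    if op == "not":
        return [not v for v in check_trace(formula[1], trace)]
    if op in ("and", "or"):
        a = check_trace(formula[1], trace)
        b = check_trace(formula[2], trace)
        return [(x and y) if op == "and" else (x or y) for x, y in zip(a, b)]
    a = check_trace(formula[1], trace)
    out = [False] * n
    if op == "X":
        for i in range(n - 1):          # assumption: X f is false at the end
            out[i] = a[i + 1]
    elif op == "F":
        seen = False
        for i in reversed(range(n)):
            seen = seen or a[i]
            out[i] = seen
    elif op == "G":
        ok = True                       # optimistic beyond the end of the trace
        for i in reversed(range(n)):
            ok = ok and a[i]
            out[i] = ok
    elif op == "U":
        b = check_trace(formula[2], trace)
        holds = False
        for i in reversed(range(n)):
            holds = b[i] or (a[i] and holds)
            out[i] = holds
    else:
        raise ValueError("unknown operator: %r" % (op,))
    return out

# Hypothetical trace of concentrations of M at four time points.
trace = [{"M": 0.1}, {"M": 0.3}, {"M": 0.25}, {"M": 0.15}]
high = ("atom", lambda s: s["M"] > 0.2)
low = ("atom", lambda s: s["M"] < 0.2)

f_high = check_trace(("F", high), trace)[0]          # F([M]>0.2)
fg_high = check_trace(("F", ("G", high)), trace)[0]  # FG([M]>0.2)
low_until_high = check_trace(("U", low, high), trace)[0]
print(f_high, fg_high, low_until_high)
```

Each operator costs one pass over the trace, so checking a formula is linear in the trace length times the formula size.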

Example of Parameter Estimation in the Brusselator
present(x,1). present(y,1.5).
parameter(a,1). parameter(b,1). %wrong parameter
a for _=>x.
[x]*[x]*[y] for 2*x+y=>3*x.
b*[x] for x=>y.
[x] for x=>_.
? trace_check(F(([y]>2) & F((d([y])/dt 0) & ([y]>2) & F(d([y])/dt<0))))).
false
? trace_get(b,0,2,F(([y]>2) & F((d([y])/dt 0) & ([y]>2) & F(d([y])/dt<0)))),20).
? trace_get(b,0,2,F(([y]>2) & F(([y] [x]) & ([y]>2) & F([y]<[x])))),20),plot.
No value found.
? trace_get(b,0,5,F(([y]>2) & F(([y] [x]) & ([y]>2) & F([y]<[x])))),20),plot.
parameter(b,2.1) makes F(([y]>2)&F(([y] [x])&([y]>2)&F([y]<[x])))) true.
? trace_get(b,0,5,F(([y]>4) & F(([y] [x]) & ([y]>4) & F([y]<[x])))),20),plot.
parameter(b,2.7) makes F(([y]>4)&F(([y] [x])&([y]>4)&F([y]<[x])))) true.
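The principle behind this kind of search, scanning a parameter interval and checking a temporal property on each simulated trace, can be sketched in Python. The sketch is not a reimplementation of the trace_get command: it uses a fixed-step Euler integrator, a much simpler property than the oscillation formulas above (only F([y] > threshold)), and hypothetical function names, so the values it finds are not expected to match the ones reported on the slide.

```python
def brusselator_trace(a, b, x0=1.0, y0=1.5, t_end=20.0, dt=0.002):
    """Explicit Euler trace (t, x, y) of the Brusselator rules of the slide."""
    x, y = x0, y0
    trace = [(0.0, x, y)]
    steps = int(round(t_end / dt))
    for i in range(1, steps + 1):
        r1 = a            # a for _=>x
        r2 = x * x * y    # [x]*[x]*[y] for 2*x+y=>3*x
        r3 = b * x        # b*[x] for x=>y
        r4 = x            # [x] for x=>_
        dx = r1 + r2 - r3 - r4
        dy = r3 - r2
        x, y = x + dt * dx, y + dt * dy
        trace.append((i * dt, x, y))
    return trace

def eventually_above(trace, threshold):
    """F([y] > threshold) evaluated at the start of a finite trace."""
    return any(y > threshold for _, _, y in trace)

def scan_parameter(lo, hi, n, threshold):
    """Try n evenly spaced values of b in [lo, hi]; return the first hit."""
    for i in range(n):
        b = lo + (hi - lo) * i / (n - 1)
        if eventually_above(brusselator_trace(1.0, b), threshold):
            return b
    return None

hit = scan_parameter(0.0, 5.0, 20, 2.0)
print("first value of b making F([y]>2) true:", hit)
```

With b = 1 the system settles to its steady state and y never exceeds 2, so the property fails, as for the "wrong parameter" of the slide; larger values of b eventually satisfy it.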

Conclusion
The biochemical abstract machine BIOCHAM offers:
- A simple rule-based language for modeling biochemical processes
  - Molecule concentration semantics (ODE)
  - Boolean semantics: presence/absence of molecules
- A powerful temporal logic language for formalizing biological properties
  - CTL (implemented with the NuSMV model checker)
  - Constraint LTL (implemented in Prolog)
- An original machine learning system
  - Rule discovery (from CTL specification)
  - Parameter estimation (from constraint LTL specification)
- A repository of models: cell-cycle control, signaling pathways… (SBML)

On-going Work and Perspectives
Molecule population semantics:
- Stochastic simulation
- Probabilistic model checking (currently using PRISM)
Space: representing compartments, transportation, and deformations
- Location algebra [Cardelli et al. 01, Plotkin 03]
- Partial differential equations
- Space deformation [Cardelli et al. 03, Danos et al. 03]

Collaborations
STREP APRIL 2: Applications of probabilistic inductive logic programming
- Luc de Raedt (Freiburg), Stephen Muggleton (Imperial College London), …
- Learning in a probabilistic logic setting
NoE REWERSE: Reasoning on the web with rules and semantics
- François Bry (Munich), Rolf Backofen (Jena), Mike Schroeder (Dresden), …
- Interfacing Biocham to the Web, gene and protein ontologies
INRIA Bang: Jean Clairambault, Benoît Perthame; INSERM, Villejuif: Francis Lévi
- “Cancer chronotherapies”
ULB: Albert Goldbeter, Bruxelles
- Coupled BIOCHAM models of cell cycle, circadian cycle, cytotoxic drugs.