Synthesis for Systems Biology Ras Bodík, Ali Sinan Köksal, Evan Pu, Saurabh Srivastava UC Berkeley Jasmin Fisher Microsoft Research Nir Piterman University.


1 Synthesis for Systems Biology Ras Bodík, Ali Sinan Köksal, Evan Pu, Saurabh Srivastava UC Berkeley Jasmin Fisher Microsoft Research Nir Piterman University of Leicester


3 Synthesis of Biological Models from Mutation Experiments Abstract. Executable biology presents new challenges to formal systems. This paper addresses two problems that cell biologists face when developing formally analyzable models. First, we show how to automatically synthesize a model of a stem cell given in vitro experiments of how particular gene mutations influence the cell fate. The problem of synthesis from mutations is unique because mutation experiments have non-deterministic outcomes (presumably due to races in the cell) and the synthesized model must be able to replay all these outcomes or else the model does not faithfully describe cellular processes. (In contrast, a “regular” concurrent program synthesized under a permissive specification need not exhibit all allowed behaviors.) We developed algorithms for this synthesis problem and synthesized a C. elegans VPC cell model that we had failed to develop manually and that took others months to develop. Second, we address the problem of underconstrained specifications, which arise due to missing mutation experiments and cause the biologist to doubt that a model is the sole explanation of the observed phenomena. This problem boils down to analyzing the space of plausible models, i.e., models that can be synthesized under given specifications. We develop algorithms for testing ambiguity of specifications, i.e., whether there exist alternative models that would produce a different fate on some as-yet-unperformed experiment. Using the algorithm, we prove that our C. elegans cell model is the sole model; all alternative models will behave identically on future experiments. We also examine models that are equivalent observationally (same fates) but distinct internally (fates are determined by different protein interactions). We identify an alternative model previously unknown to biologists.
In addition to posing the synthesis and ambiguity testing problems and developing algorithms, we develop a modeling language and embed it into Scala. We describe how this language design and embedding allow us to build a lightweight synthesizer.

4 Executable biology pushes our boundaries 4

5 Other lessons and results Design your own tools: to enable synthesis, design a domain language, then build a lightweight synthesizer. Synthesized a C. elegans VPC model: we failed to write this model manually; others took months. Beyond synthesis: showed that the available experiments are non-ambiguous; synthesized a new, internally different alternative model.

6 Systems biology 6

7 Understanding Diseases “Cancer is fundamentally a disease of failure of regulation of tissue growth. In order for a normal cell to transform into a cancer cell, the genes which regulate cell growth and differentiation must be altered.” – Wikipedia To understand cancer, investigate cell differentiation 7

8 How Are Cells Differentiated? Two ways of differentiation: –A single cell divides into cells of different type. –Multiple identical cells differentiate by communicating. To understand cell differentiation, investigate cell communication. 8

9 Studying Differentiation on Worms Cell differentiation in worms: similar to human but much simpler. [Figure labels: identical precursor cells; differentiated vulval cells]

10 The Research Goal What is the cell’s “algorithm” for robustly deciding cell fates through communication? 10

11 Mutation experiments are visually observable Biologists mutate cell genes and observe the outcome of differentiation. sqv mutants of Caenorhabditis elegans are defective in vulval epithelial invagination [Herman et al. 1999] 11

12 The results from wet-lab experiments 12

13 Mutation experiments give partial knowledge From gene mutation experiments, biologists infer a protein interaction. “In this assay, depletion of lst-2, lst-3, lst-4, or dpy-23, as well as ark-1, caused ectopic vulval induction, suggesting that they function as negative regulators of the EGFR- MAPK pathway.” [Yoo et al. 2004] 13

14 Making Sense of Experiments 14

15 Executable Systems biology 15

16 Executable Biology Computational models are needed to tackle the combinatorial complexity of cell communication. Verification of models can show their inconsistency with experimental data. New interactions can be discovered. [Fisher et al. 2007] 16

17 Semantics of models Time and protein concentrations are discrete: discrete is sufficient to show interesting behavior. Cells are concurrent communicating automata with bounded asynchrony (cells progress at roughly the same rate). Note: timing is modeled with state progression.
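One way to picture bounded asynchrony is as a constraint on schedules. The sketch below assumes one possible reading, namely that at every point in a schedule the step counts of any two cells differ by at most a fixed bound; the talk does not state the exact definition, and the names `isBounded`, `numCells`, and `bound` are hypothetical.

```scala
object BoundedAsynchrony {
  // A schedule is the sequence of cell indices that take a step, in order.
  // Assumed reading: a schedule is admissible if, after every prefix,
  // no cell is more than `bound` steps ahead of any other cell.
  // Requires numCells >= 1 and every index in the schedule < numCells.
  def isBounded(schedule: Seq[Int], numCells: Int, bound: Int): Boolean = {
    val start = Vector.fill(numCells)(0)
    // Step counts per cell after each prefix of the schedule.
    val prefixes = schedule.scanLeft(start) { (counts, cell) =>
      counts.updated(cell, counts(cell) + 1)
    }
    prefixes.forall(counts => counts.max - counts.min <= bound)
  }
}
```

With two cells and a bound of 1, the alternating schedule `Seq(0, 1, 0, 1)` is admissible, while `Seq(0, 0, 1)` is not, because cell 0 gets two steps ahead.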

18 Cells as a Reactive Modules (RM) program
atom Vul
  controls Vul
  reads go, Vul, IS, Muv_state, v_Vul
  awaits go, v_Vul, lst_state
init
  [] (true) & v_Vul' = ko -> Vul' := off0;
  [] (true) & v_Vul' ~= ko -> Vul' := Evaluate0;
update
  [] (~go & go') & Vul = Evaluate0 & Muv_state = ON & IS ~= high -> Vul' := off1;
  [] (~go & go') & Vul = Evaluate0 & IS = high -> Vul' := let23;
  [] (~go & go') & Vul = Evaluate0 & Muv_state = OFF & IS ~= high -> Vul' := Evaluate1;
  [] (~go & go') & Vul = off1 & IS = med -> Vul' := Before_Partial_On;
  [] (~go & go') & Vul = off1 & IS = high -> Vul' := let23;
  [] (~go & go') & Vul = off1 & IS ~= high & IS ~= med -> Vul' := off2;
  [] (~go & go') & Vul = Evaluate1 -> Vul' := let23;
  [] (~go & go') & Vul = Before_Partial_On -> Vul' := let23;
  [] (~go & go') & Vul = let23 & lst_state' = OFF -> Vul' := sem5;
  [] (~go & go') & Vul = sem5 & lst_state' = OFF -> Vul' := let60;
  [] (~go & go') & Vul = let60 & lst_state' = OFF -> Vul' := mpk1;
  [] (~go & go') & Vul = let23 & lst_state' = ON -> Vul' := Vul_counteracted;
  [] (~go & go') & Vul = sem5 & lst_state' = ON -> Vul' := Vul_counteracted;
  [] (~go & go') & Vul = let60 & lst_state' = ON -> Vul' := Vul_counteracted

19 RM models: laborious to develop and update. Months of tweaking to get the timing right; hard to understand; hard to debug. RM is too expressive (e.g., it has clairvoyance): it’s tempting to encode constructs that have no clear biological explanation (strange abstractions). Summary: modeling in executable biology is laborious. If only we could automate model development.

20 Synthesis and Analysis of Biology Models 20

21 Our contribution Automatically infer cell models (synthesis) –obtain executable models faster Enumerate alternative models (“distinct” synthesis) –find alternative explanations of observed phenomena Ask for more specifications (disambiguation) –suggest experiments to disambiguate between models 21

22 Lessons: Build your tools! Executable biology selects methods based on the availability of tools, e.g., model checkers. We did the same for synthesis of models. It failed. We argue here for building our own lightweight tools, including the modeling language and its synthesizer. We show how to DIY.

23 The language 23

24 Motivation for a high-level language (HLL) 24

25 Four levels of the language [Figure labels: schedule; concentration; update function]

26 Top-level semantics 26

27 Correctness 27

28 Level 2: Program is composed from cells 28

29 Level 3: In cells are proteins Each cell is composed from proteins. –protein state: discretized protein concentration –proteins read states of other proteins (potentially in other cells) –they update their own concentration in the next step Synchronous execution: –when a cell is scheduled, all of its proteins take one step –i.e., they update their concentration level [similar to the Synchronous/Reactive (SR) model, Edwards and Lee, 2002]
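The synchronous execution described on this slide can be sketched in a few lines of Scala. This is an illustrative sketch, not the talk's implementation: the types `State` and `Update`, the function `step`, and the protein names `a` and `b` are all assumptions. The point it shows is that every protein reads the same snapshot of the current levels, and all new levels take effect together.

```scala
object SynchronousCell {
  // Protein name -> discretized concentration level.
  type State = Map[String, Int]
  // An update function reads the current levels and returns this protein's next level.
  type Update = State => Int

  // One synchronous step of a scheduled cell: every protein is updated from
  // the same snapshot `state`; proteins without an update keep their level.
  def step(state: State, updates: Map[String, Update]): State =
    state.map { case (name, level) =>
      name -> updates.get(name).map(f => f(state)).getOrElse(level)
    }

  // Hypothetical example: each protein copies the other's current level.
  val swap: Map[String, Update] = Map(
    "a" -> ((s: State) => s("b")),
    "b" -> ((s: State) => s("a"))
  )
}
```

Starting from `a = 1, b = 0`, one step of `swap` gives `a = 0, b = 1`. A sequential update would instead overwrite `a` first and lose its old value, which is why the snapshot matters.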

30 Level 4: In proteins are update functions 30

31 The output fate 31

32 Example Assume a network of police cameras. When a gunshot happens, we want at least one nearby camera to take a picture. Synthesize a protocol for deciding which camera takes a picture. OK if multiple cameras do. Two types of communications: -sound from gunshot (“base station”) to cameras -radio transmission between camera nodes announcing “I took a picture, you don’t have to, save your battery” Nodes should decide who is closest on the basis of sound signal strength. No triangulation. 32

33 Example 33

34 Incomplete specification
signal from BS | take picture? | signal from BS | take picture? | cameras managed to communicate?
HYHNY
NY YY
HYLNY
HYHYN

35 Synthesized update functions for base receiver, delay node 35

36 Synthesis 36

37 Synthesis 37

38 Sketch of the cell model Describes what biologists already know -proteins in the cell -how proteins interact (activation vs. inhibition) -timing (update functions) of some proteins 38

39 Enforcing Biological Invariants Synthesized models must satisfy biological invariants. Biologist’s invariants specify whether one protein activates or inhibits another. Asserted as monotonicity constraints on state transitions 39
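A monotonicity constraint of this kind can be checked exhaustively when the domains are small. The following Scala sketch assumes an update function of the form `update(input, current)` over integer levels; the names `isActivating`, `isInhibiting`, `boost`, and `damp` are hypothetical and do not come from the talk's synthesizer, which asserts such constraints symbolically.

```scala
object Monotonicity {
  // update(input, current) gives the next discretized level of a protein.
  type Update = (Int, Int) => Int

  // Activation: raising the input level never lowers the next level.
  def isActivating(update: Update, inputLevels: Range, stateLevels: Range): Boolean =
    stateLevels.forall { s =>
      inputLevels.forall { lo =>
        inputLevels.forall { hi => lo > hi || update(lo, s) <= update(hi, s) }
      }
    }

  // Inhibition: raising the input level never raises the next level.
  def isInhibiting(update: Update, inputLevels: Range, stateLevels: Range): Boolean =
    stateLevels.forall { s =>
      inputLevels.forall { lo =>
        inputLevels.forall { hi => lo > hi || update(lo, s) >= update(hi, s) }
      }
    }

  // Hypothetical update functions over levels 0..2.
  val boost: Update = (in, s) => math.min(2, s + in) // input pushes the level up
  val damp: Update = (in, s) => math.max(0, s - in)  // input pushes the level down
}
```

Here `boost` passes the activation check and fails the inhibition check, and `damp` does the opposite, so a synthesizer could reject any candidate update function that violates the biologist's stated direction of interaction.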

40 The synthesizer 40

41 Architecture of synthesizer (3.5 KLOC) 41

42 Example of the embedded DSL
class BaseReceiver extends Node("BaseReceiver") {
  val base = input("off", "low", "high")
  val lateralReceiver = input("off", "on")
  val out = output("off", "on")

  // update functions implemented as a (more general) FSM
  val stateful = logic(new StatefulLogic {
    val off = state("off")      // two observable states
    val on = state("on")
    output(out)                 // link these states to output port
    init(off)                   // "off" is the start state
    nbStates(5)                 // this state machine will have five hidden states
    activating(base)            // biological invariants on inputs
    inhibiting(lateralReceiver)
  })
  register(stateful)            // necessitated by the DSL
}

43 How to deal with the 3QBF synthesis problem

44 Formula translator 44

45 Algorithms 45

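46 Synthesis Approach: CEGIS Assume we care only about the classical demonic correctness. [Diagram: CEGIS loop. Synthesize from an initial input set of (schedule, experiment) pairs; if SAT, pass the candidate model to verify; if verify is SAT, add the counterexample (schedule, experiment) and synthesize again; if verify is UNSAT, stop.]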
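The CEGIS loop can be illustrated with a small, generic Scala sketch. It is a toy under stated assumptions: brute-force search over finite candidate and input lists stands in for the solver calls, and the names `cegis`, `ok`, and `thresholdOk` are hypothetical. In the talk's setting an input would be a (schedule, experiment) pair and `ok` would mean that the model produces an allowed fate.

```scala
import scala.annotation.tailrec

object Cegis {
  // Returns a candidate that satisfies `ok` on every input, if one exists.
  def cegis[M, I](candidates: Seq[M], inputs: Seq[I], initial: List[I],
                  ok: (M, I) => Boolean): Option[M] = {
    @tailrec
    def loop(examples: List[I]): Option[M] =
      // Synthesize: find a candidate consistent with the examples seen so far.
      candidates.find(m => examples.forall(i => ok(m, i))) match {
        case None => None // no candidate fits even the examples
        case Some(m) =>
          // Verify: look for an input on which the candidate fails.
          inputs.find(i => !ok(m, i)) match {
            case None      => Some(m)              // no counterexample: m is correct
            case Some(cex) => loop(cex :: examples) // add counterexample and retry
          }
      }
    loop(initial)
  }

  // Hypothetical spec: a threshold t is correct on (level, expected)
  // when "level >= t" matches the expected Boolean.
  val thresholdOk: (Int, (Int, Boolean)) => Boolean =
    (t, in) => (in._1 >= t) == in._2
}
```

The loop terminates on finite input sets because each counterexample is an input the current candidate fails, so it cannot already be among the examples. For the inputs `(1, false)`, `(3, true)`, `(2, false)` and thresholds 0 to 5, the loop tries 0, then 2, and settles on 3.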

47 Synthesis algorithm 47

48 Three communicating solvers [Diagram labels: 3QBF, 2QBF, SAT; annotation on the 2QBF step: blasts (m,f), turns to SAT]

49 Supporting tools 49

50 Supporting tools Work would not be productive without these tools –execution visualizer –causal tracer –automaton minimizer We still need ideas on how to construct those quickly 50

51 Visualizing the Synthesized Model [Figure annotations: activated connections are colored; step through execution]

52 Results 52

53 Results (1): Automatic model inference Synthesized a model of VPC in C. elegans -the model expressed in our bio-inspired language -we believe it’s more readable than in RM Prior to synthesis –we failed to manually fix a bug in an equivalent model –collaborators took several months to make this model 53

54 Results (2): Are experiments complete? We concluded that the set of experiments is complete –this means there exists no alternative model that behaves differently on experiments not yet performed –this is under the assumptions described in the sketch provided by biologists, which encodes their knowledge about C. elegans Working on identifying a minimal set of experiments –if we want to validate these experiments, do we need to repeat all of them?
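The completeness question can be pictured as a search for a distinguishing experiment. The Scala sketch below makes strong simplifying assumptions: models are deterministic functions from an experiment id to a fate, and the model space is an explicit list, whereas the talk's models have non-deterministic outcomes and are searched symbolically. All names (`Model`, `distinguishingExperiment`, `m1`, `m2`, `m3`) are hypothetical.

```scala
object Ambiguity {
  // A model maps an experiment (identified by an Int) to a fate.
  type Model = Int => String

  // Returns a not-yet-performed experiment on which two models that agree
  // with every performed experiment would disagree, if there is one.
  def distinguishingExperiment(models: Seq[Model],
                               performed: Map[Int, String],
                               unperformed: Seq[Int]): Option[Int] = {
    // Plausible models reproduce every observed fate.
    val plausible = models.filter(m => performed.forall { case (e, fate) => m(e) == fate })
    unperformed.find(e => plausible.map(m => m(e)).distinct.size > 1)
  }

  // Hypothetical toy models.
  val m1: Model = e => if (e < 2) "A" else "B"
  val m2: Model = e => if (e < 3) "A" else "B"
  val m3: Model = _ => "A"
}
```

If experiments 0 and 3 were observed with fates A and B, then `m1` and `m2` both remain plausible and experiment 2 tells them apart, so the specification is ambiguous. Once experiment 2 is also observed, only `m1` remains and no distinguishing experiment exists, which corresponds to the "complete" verdict on this slide.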

55 Results (3) No behaviorally distinct models. But we synthesized a model that differs internally: the cell behavior is due to a different protein interaction. These models can’t be distinguished via mutation and fate observation (models have same fates, after all). Hence one must “instrument” the cell by tagging proteins with fluorescent genes. Here, our synthesis identifies which genes to instrument (the fewer the better).

56 Summary: Executable biology’s challenges Infer models that can replay all observed behavior … or else they don’t faithfully model cell phenomena. This semantics leads to a 3QBF synthesis problem. Analyze the space of plausible models Are specs ambiguous, minimal? Which experiments to perform to rule out a model? 56
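The "replay all observed behavior" requirement can be stated as a set equality: for each experiment, the fates a model can reach over all schedules must be exactly the fates observed. The Scala sketch below checks this by brute force for a given model. It assumes a finite, explicit list of schedules, and the names `replaysExactly` and `toyRun` are hypothetical; searching for a model that passes this check is what leads to the quantifier alternation mentioned in the summary.

```scala
object ReplayAll {
  // run(experiment, schedule) is the fate the model produces.
  // The model is faithful if, for every experiment, the set of fates reachable
  // over all schedules equals the set of observed fates: no extra fates,
  // and every observed fate reproduced by some schedule.
  def replaysExactly[E, S, F](run: (E, S) => F,
                              schedules: Seq[S],
                              observed: Map[E, Set[F]]): Boolean =
    observed.forall { case (experiment, fates) =>
      schedules.map(s => run(experiment, s)).toSet == fates
    }

  // Hypothetical toy model: the wild type always yields fateA,
  // while any other experiment yields fateA or fateB depending on the schedule.
  val toyRun: (String, Int) => String =
    (experiment, schedule) =>
      if (experiment == "wild-type" || schedule % 2 == 0) "fateA" else "fateB"
}
```

With schedules 0 and 1, the toy model replays a mutant observed with both fates, but it is rejected if the mutant was only ever observed with `fateA` (the model shows an unobserved fate) or if the wild type was observed with both fates (the model cannot reproduce `fateB`).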

