1 Systems and Synthetic Biology: A programming languages point of view Saurabh Srivastava Assistant Research Engineer Computer Science + Bioengineering.

Slides:



Advertisements
Similar presentations
Cell Communication Cells need to communicate with one another, whether they are located close to each other or far apart. Extracellular signaling molecules.
Advertisements

Recombinant DNA Technology
Control of Expression In Bacteria –Part 1
Greta YorshEran YahavMartin Vechev IBM Research. { ……………… …… …………………. ……………………. ………………………… } T1() Challenge: Correct and Efficient Synchronization { ……………………………
François Fages MPRI Bio-info 2007 Formal Biology of the Cell Inferring Reaction Rules from Temporal Properties François Fages, Constraint Programming Group,
Timed Automata.
Microbial Therapy (Steph, Alex, Sammy) Pathway Engineering - make product body needs (possibly sense deficiency) - Synthetic Symbiosis (E. coli natural.
Production of the Antimalarial Drug Precursor Artemisinic Acid in Engineered Yeast February 12, 2007 Patrick Gildea By J.D. Keasling et all.
CELL CONNECTIONS & COMMUNICATION AP Biology Ch.6.7; Ch. 11.
…Is how cells coordinate their physiological behaviors …Greater than the sum of their subcellular organelles
Gene Expression Viruses Biotechnology
1 Genetics The Study of Biological Information. 2 Chapter Outline DNA molecules encode the biological information fundamental to all life forms DNA molecules.
Models and methods in systems biology Daniel Kluesing Algorithms in Biology Spring 2009.
Marc Riedel Synthesizing Stochasticity in Biochemical Systems Electrical & Computer Engineering Jehoshua (Shuki) Bruck Caltech joint work with Brian Fett.
Development.
Model Checking. Used in studying behaviors of reactive systems Typically involves three steps: Create a finite state model (FSM) of the system design.
Computational Molecular Biology (Spring’03) Chitta Baral Professor of Computer Science & Engg.
Bioinformatics: a Multidisciplinary Challenge Ron Y. Pinter Dept. of Computer Science Technion March 12, 2003.
Mathematical Modelling of Phage Dynamics: Applications in STEC studies Tom Evans.
Genetic Technologies By: Brenda, Dale, John, and Brady.
Synthesis for Systems Biology Ras Bodík, Ali Sinan Köksal, Evan Pu, Saurabh Srivastava UC Berkeley Jasmin Fisher Microsoft Research Nir Piterman University.
Cheng/Dillon-Software Engineering: Formal Methods Model Checking.
21.1 – 1 As you learned in chapter 12, mitosis gives rise to two daughter cells that are genetically identical to the parent cell. Yet you, the product.
Biotechnology Packet #26 Chapter #9. Introduction Since the 1970’s, humans have been attempted to manipulate and modify genes in a way that was somewhat.
Gene Expression and Cell Differentiation CSCOPE Unit: 08 Lesson: 01.
AP Biology: Chapter 14 DNA Technologies
AP Biology Control of Eukaryotic Genes.
Transformation AP Big Idea #3: Genes and Information Transfer connected with AP Big Idea #1 (Evolution) & #2 (Cellular Processes)
Bioinformatics and it’s methods Prepared by: Petro Rogutskyi
Automated Explanation of Gene-Gene Relationships Wacek Kuśnierczyk.
Synthetic biology: New engineering rules for emerging discipline Andrianantoandro E; Basu S; Karig D K; Weiss R. Molecular Systems Biology 2006.
AP Biology Development. AP Biology Big Questions: 1. How does a multicellular organism develop from a zygote? 2. How is the position of the parts of an.
CS 790 – Bioinformatics Introduction and overview.
1865- Gregor Mendel studied inheritance patterns using pea plants and observed traits were inherited as separate units. These traits are now known as.
Section 2 Genetics and Biotechnology DNA Technology
Microbial Models I: Genetics of Viruses and Bacteria 7 November, 2005 Text Chapter 18.
Clustering of protein networks: Graph theory and terminology Scale-free architecture Modularity Robustness Reading: Barabasi and Oltvai 2004, Milo et al.
Gene Regulation. Regulation in Prokaryotes Gene Expression = gene to protein processing that functions within cells. Regulation = We are talking about.
Gene regulation results in differential gene expression, leading to cell specialization.
LECTURE CONNECTIONS 1 | Introduction to Genetics © 2009 W. H. Freeman and Company.
Regulation of Gene Expression. You Must Know The functions of the three parts of an operon. The role of repressor genes in operons. The impact of DNA.
Synthetic Biology Risks and opportunities of an emerging field Constructing Life.
NY Times Molecular Sciences Institute Started in 1996 by Dr. Syndey Brenner (2002 Nobel Prize winner). Opened in Berkeley in Roger Brent,
Researchers use genetic engineering to manipulate DNA. Section 2: DNA Technology K What I Know W What I Want to Find Out L What I Learned.
Statistical Mechanics of Sloppy Models Bacterial Cell-Cell Communication and More Josh Waterfall, Jim Sethna, Steve Winans.
Metabolism Dr. Samah Kotb Lecturer of Biochemistry 2015 Cellular Biochemistry and Metabolism (CLS 331)
Cell metabolism. Metabolism encompasses the integrated and controlled pathways of enzyme catalysed reactions within a cell Metabolism The word “metabolism”
Biotech Vocabulary. AMINO ACID  The building blocks of a protein molecule.  A protein is composed of a chain of hundreds or thousands of amino acids.
GAMES camp Synthetic Biology and iGEM (International Genetically Engineered Machines)
Bioinformatics: Cool stuff you can do with Computers and Biology Oded Magger Tel Aviv University / Autodesk inc. GIP course 2010.
Cloning DNA, Plasmids and Transformations By: Stephen Sullivan and Julie Ethier.
Genetic Engineering Chapter 13 Test on Friday 03/13/09 Reviewing Content Due 03/12/ and #28.
Ch 12 DNA and RNA 12-1DNA 12-2 Chromosomes and DNA Replication 12-3 RNA and Protein Synthesis 12-4 Mutations 12-5 Gene Regulation 12-1DNA 12-2 Chromosomes.
From the double helix to the genome
GENETICS.
4/26/2010 BIOTECHNOLOGY.
Biotechnology Terms.
Journal club Jun , Zhen.
GENETIC TECHNOLOGY Genetically engineered bollworm.
Biomedical Technology I
Genes and Development CVHS Chapter 16.
Section 2 Genetics and Biotechnology DNA Technology
What makes a mutant?.
GENETICS.
4/26/2010 BIOTECHNOLOGY.
Genetic Engineering II
2. 2 Life as a worm-- the nematode C. elegans.
The Study of Biological Information
BIO307- Bioengineering principles SPRING 2019
Computational Biology
Presentation transcript:

1 Systems and Synthetic Biology: A programming languages point of view Saurabh Srivastava Assistant Research Engineer Computer Science + Bioengineering University of California, Berkeley

Systems biology – Modeling complete biological processes Genomics – Reading DNA sequences got incredibly fast, and cheap – Algorithms for sequence data analytics within organisms Synthetic biology – DNA synthesis got incredibly cheap – Functional characterization on the way – Algorithms to predict tweaks to organisms 2 One view: biological specifications One view: syntax of components One view: semantics of components In perspective

3 Different scales in biology Systems Biology operates here Synthetic Biology operates here Intracellular processes Intercellular processes Protein interactions Protein function Organisms

4 Systems Biology operates here Results and techniques used Synthetic Biology operates here Intracellular processes Intercellular processes Protein interactions Protein function

5 Systems Biology operates here Results and techniques used Synthetic Biology operates here Intracellular processes Intercellular processes Protein interactions Protein function Results: Constructed a bacteria that produces Paracetamol/Depon Model of cell communication allows understanding cancer Techniques: Automatically generating concurrent programs using input-output examples Big data analysis and abstraction

PART I Synthesizing models for Systems Biology

Synthesizing Systems Biology “Programs” Synthesizing concurrent programs from examples Programs ≡ biological models Examples ≡ biological experiments We assist natural sciences with formal methods Given experiments, are there other models? If so, compute a new, disambiguating experiment Part I: how stem cells coordinate their fates 7

Understanding Diseases “Cancer is fundamentally a disease of failure of regulation of tissue growth. In order for a normal cell to transform into a cancer cell, the genes which regulate cell growth and differentiation must be altered.” – from Wikipedia Research on cell differentiation helps understanding diseases such as cancer. 8

C. elegans: A Model Organism Earthworm used in developmental biology. 959 cells; its organs found in other animals. Differentiation studied on vulval development. 9

Differentiation and then development into organ parts 10 Initial division of embryo Identical precursor cells collaborate to decide their fate

Modeling Goal 11 What is the mechanism (program) within each cell for deciding fates through communication?

Building Blocks of these Programs Cells contain communicating proteins. Protein interaction: a protein senses the concentration of other proteins. Interaction is either activation or inhibition. 12 A B A B

How the Vulval Cells Differentiate 13 let-23 lin-12 sem-5 let-60 mpk-1 lst let-23 lin-12 sem-5 let-60 mpk-1 lst let-23 lin-12 sem-5 let-60 mpk-1 lst Anchor Cell Anchor Cell high med low 1º 2º 3º If cells sense the same signal strength, data races occur....

How Biologists Discover Interactions Measuring protein levels over time is infeasible If such “cell tracing” is infeasible, infer protein interaction from end-to-end experiments That is, mutate cells  observe resulting fates 14

A Mutation Experiment 15 let-23 lin-12 let-23 lin-12 let-23 lin-12 Anchor Cell Anchor Cell high med low 1º 2º 3º 1º...

Putting Experiments Together 16 ExperimentAClin-12lin-15let-23lstFate decisions 1ON {332123} 2ONOFFON {331113} 3ON OFFON {112121,122121, } No protein is mutated. lin-12 is turned off. Multiple outcomes observed Fate of six neighboring cells Experiments over 35 years by 11 groups

How to Build Accurate Models? 17 let-23 lin-12 sem-5 let-60 mpk-1 lst let-23 lin-12 sem-5 let-60 mpk-1 lst let-23 lin-12 sem-5 let-60 mpk-1 lst Anchor Cell Anchor Cell high med low 1º 2º 3º...

Semantics of the Modeling Language 18 Cell 1Cell 2 Program has cells Non-deterministic outcomes via schedule interleaving let-23 lin-12 sem-5 let-60 mpk-1 lst Cell has proteins All proteins advance synchronously Proteins have discrete state and update functions.

Synthesizing Cellular Programs 19

Synthesis of Programs 20 specification biological insight biological insight synthesizer completed program completed program ExperimentAClin-12lin-15let-23lstFate decisions 1ON {332123} 2ONOFFON {331113} 3ON OFFON {112121,122121, }... Given as a partial program

Partial Programs Partial programs express biological insight: Which proteins are in the cell Which proteins may interact Update functions can be unknown. 21 let-23 lin-12 sem-5 let-60 mpk-1 lst ?

Synthesis Algorithm 22

Classical CEGIS synthesizer initial input/output set candidate solution SAT add input-output counterexample SAT UNSAT 23 verifier

Correctness Condition Safety: all schedules must lead the program to produce experiment outcomes observed in the wet lab. ∀ mutation m. ∀ schedule s. P(m, s) ∈ E(m) Completeness: each observed experiment outcome must be reproducible by the program for some schedule. ∀ mutation m. ∀ fate f ∈ E(m). ∃ schedule s. P(m, s) = f 24 ExperimentAClin-12lin-15let-23lstFate decisions 1ON {332123} 2ONOFFON {331113} 3ON OFFON {112121, , }...

Counterexample-Guided Inductive Synthesis 25 inductive synthesizer inductive synthesizer safety verifier safety verifier counterexample: execution P(m, s) with bad outcome counterexample: observation (m, f) not reproducible completeness verifier completeness verifier candidate completion initial set of input/output examples no candidate completion ok

Verifying for Safety Safety: ∀ mutation m. ∀ schedule s. P(m, s) ∈ E(m) Attempt to disprove by searching for a demonic schedule: ∃ mutation m. ∃ schedule s. P(m, s) ∉ E(m) 26 Unroll over the set of performed experiments Search symbolically for a demonic schedule

Verifying for Completeness Completeness: ∀ mutation m. ∀ fate f ∈ E(m). ∃ schedule s. P(m, s) = f Attempt to disprove by showing lack of an angelic schedule for some outcome: ∃ mutation m. ∃ fate f ∈ E(m). ∀ schedule s. P(m, s) ≠ f 27 Unroll over pairs of mutation and fate Query symbolically for an angelic schedule

Synthesized Models We synthesized two models of VPCs. Input: Partial model that specifies known, simple protein behaviors. Output: Synthesized update functions for two key proteins. 28 model 1 model 2

Additional Algorithms for Going Beyond Synthesis to Assist Scientists 29 Does there exist alternative model that differs on new experiment? For two or more models within the space, do there exist disambiguating experiments? What are the minimal number of experiments that constrain the space to current model?

PART II Predicting DNA insertions for Synthetic Biology

Microbial chemical factories – Sustainable bacterial production of chemicals. E.g., drugs, polymers Bacteria as a sensor – Agricultural apps: e.g., sense nitrogen depletion in soil, change color Tumor killing bacteria – Sense multiple environmental factors (lower oxygen, high lactic acid) – Invade cancerous cell – Release drug inside cell 31 Synthetic Biology inserts DNA Applications enabled

Does it actually work? Jay D. Keasling – Artemisinin: from Artemisia annua to Yeast – Amyris company: 219M incidence, 300M cure target – 8 years of work; manual insights Computationally predicted – Our tylenol E. coli strain Sugar Tyenol

Incoming chemicals How do cells manipulate chemicals Some proteins are enzymes Output chemicals

Changing the cellular chemical machinery What happens when we add unnatural/external enzymes?

35 Sugar Acetaminophen Tylenol Bio repositories Data dedup/correlation + Search DNA synthesis + Plasmid Transformation Abstractions or rules from data Polymers Nylon etc. Biofuels Sugar Opportunity Current status + Rule application to predict

36 Tylenol4-aminophenol4-aminobenzoate Chorismate pathway 4ABH gene from Mushroom Prediction for Tylenol

Tylenol 4AP 4ABH gene inserted

38 Abstractions or rules from data Polymers Nylon etc. Biofuels Sugar Opportunity Rule application to predict

Biochemical rules/abstractions?

f Graph transformation function Operator taking (chemical) graph and transforming it f’ f’’ f class ** **

Open problem 1: Language for chemicals and transforms C1=CC=CC=C1 3D2D1D XYZ, CML, PDB SMILES, SYBYL Good for crystallographers Good for biochemists Good for computational storage, retrieval

Alternative new representation Transformation representation – enc molecule – Be able to efficiently compute f class (enc molecule ) Conflicting objectives: – Trace-based encoding Difficulty at cycle cuts – True graph encoding Subgraph isomorphism Need midway encoding ** **

Applying reaction operators We can now predict new edges: I.e., external enzymes or even new constructed enzymes

Open problem 2: Deriving biochemical programs f class (enc molecule ) is a single step Function composition to get “pathway programs” – f class0 (f class1 (f class2 (f class3 (enc molecule )))) Complications – Path can be “acyclic” not just straight-line – Mixed concrete and rule instantiation

Crux of the research problem Intelligent rule instantiation: edges Given target chemical Phrase it as a model checking reachability problem: initial experiments point to a need for more efficient representation of search space. Need better search methods

Arbekacin (semi-synthetic) MRSA Drug $20,900/gram Amikacin (natural) Nothing interesting $118/gram Eventual targets

Arbekacin (semi-synthetic) MRSA Drug $20,900/gram Modular decomposition through function summaries Amikacin (natural) Nothing interesting $118/gram

Acknowledgements 48

49 Sugar Acetaminophen Tylenol Bio repositories Data dedup/correlation + Search DNA synthesis + Plasmid Transformation Chemical transformation functions from data Language for biochemistry Synthesis of biochemical program (i.e., DNA modifications) Synthesis of internals of transformation function Polymers Nylon etc. Biofuels Sugar Opportunity Current status +

Backup slides

51 Nylon precursor (polymer)

52 Butanol (fuel)

Architecture 55,000 entries Pubchem: 53M entries Uniprot: 23M entries Organisms: Pubchem names/Bren da names: 55k