BioSigNet: Reasoning and Hypothesizing about Signaling Networks Nam Tran.


Main points Biomedical databases: structured data and queries. Next step: knowledge bases and reasoning. Kinds of reasoning, incomplete knowledge How can existing knowledge be revised, expanded? Hypothesis formation Experimental verifications

Knowledge based reasoning Various kinds of reasoning Prediction – side effects Planning – designing therapies Explanation – reasoning about unobserved aspects Consistency checking – correctness of ontologies Additional facets/nuances Reasoning with incomplete knowledge. Reasoning with defaults. Ease of updating knowledge (elaboration tolerance)

Hypothesis formation If: our observations cannot be explained by our existing knowledge, or the explanations given by our existing knowledge are invalidated by experiments. Then: our knowledge needs to be augmented or revised. How? Can we use a reasoning system to propose hypotheses that can be verified through experimentation?

[Figure labels: Hypothesis space; Knowledge base; High UV; UV leads_to cancer; p53; Cancer; No cancer; (K,I) |= O]

Motivation -- summary Goal: To emulate the abstract reasoning done by biologists, medical researchers, and pharmacology researchers. Types of reasoning: prediction, explanation and planning. Current systems biology approaches: mostly prediction. Incomplete knowledge constantly needs to be updated -> Hypothesis formation

Overview of our approach Represent signal network as a knowledge base that describes actions/events (biological interactions, processes). effect of these actions/events. triggering conditions of the actions/events. To query using the knowledge base: Prediction; explanation; planning. Hypothesizing to discover new knowledge BioSigNet-RRH: Biological Signal Network – Representation, Reasoning and Hypothesizing

Foundation behind our approach Research on representing and reasoning about dynamic systems (space shuttles, mobile robots, software agents) causal relations between properties of the world effects of actions (when can they be executed) goal specification action-plans Research on knowledge representation, reasoning and declarative problem solving – the AnsProlog language.

Representing signal networks as a Knowledge Base
Alphabet:
  Actions/Events: bind(ligand,receptor)
  Fluents: high(ligand), high(receptor)
Statements:
  Effect axioms:
    bind(ligand,receptor) causes bound(ligand,receptor) if cond.
    high(other_ligand) inhibits bind(ligand,receptor) if cond.
  Trigger conditions:
    high(ligand), high(receptor) triggers bind(ligand,receptor)

Initial observations, Queries, Entailment Entailment: (K,I) |= Q Given K: the knowledge base of binding I: initially high(ligand), high(receptor) Conclude Q = eventually bound(ligand,receptor) Given K: the knowledge base of binding I: initially high(ligand), high(receptor), high(other_ligand) Conclude Q
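The entailment check (K,I) |= Q for the binding example can be sketched in Python as follows. This is a minimal sketch under a simplified, deterministic reading of the statements: every triggered action fires unless an inhibitor's conditions hold, and "eventually" means the fluent appears in some reachable state. The class names `Causes`, `Triggers`, `Inhibits` and the function `entails_eventually` are illustrative assumptions, not part of BioSigNet-RRH or AnsProlog, and the "if cond." qualifiers are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Causes:
    action: str
    effect: str


@dataclass(frozen=True)
class Triggers:
    conditions: frozenset
    action: str


@dataclass(frozen=True)
class Inhibits:
    conditions: frozenset
    action: str


def step(kb, state):
    """Fire every triggered, uninhibited action once and apply its effects."""
    fired = {r.action for r in kb
             if isinstance(r, Triggers) and r.conditions <= state}
    blocked = {r.action for r in kb
               if isinstance(r, Inhibits) and r.conditions <= state}
    fired -= blocked
    effects = {r.effect for r in kb
               if isinstance(r, Causes) and r.action in fired}
    return frozenset(state | effects)


def entails_eventually(kb, initial, query, max_steps=10):
    """Return True if the query fluent holds in some state reached from the initial one."""
    state = frozenset(initial)
    for _ in range(max_steps):
        if query in state:
            return True
        nxt = step(kb, state)
        if nxt == state:  # fixpoint reached without deriving the query
            return False
        state = nxt
    return query in state


# K: the knowledge base of binding, transcribed from the slide
K = [
    Causes("bind(ligand,receptor)", "bound(ligand,receptor)"),
    Inhibits(frozenset({"high(other_ligand)"}), "bind(ligand,receptor)"),
    Triggers(frozenset({"high(ligand)", "high(receptor)"}), "bind(ligand,receptor)"),
]
Q = "bound(ligand,receptor)"
I1 = {"high(ligand)", "high(receptor)"}
I2 = I1 | {"high(other_ligand)"}

print(entails_eventually(K, I1, Q), entails_eventually(K, I2, Q))  # True False
```

Under these assumptions Q follows from the first set of initial observations, while in the second the inhibitor blocks the binding action and Q is not derived.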

Importance of a formal semantics Besides defining prediction, explanation and planning, it is also useful in identifying: Under what restrictions the answer given by a given algorithm will be correct. (soundness!) Under what restrictions a given algorithm will find a correct answer if one exists. (completeness!)

bind(TNF-α,TNFR1) causes trimerized(TNFR1)
trimerized(TNFR1) triggers bind(TNFR1,TRADD)

Prediction Given some initial conditions and observations, to predict how the world would evolve or predict the outcome of (hypothetical) interventions.

Initial condition: bind(TNF-α,TNF-R1) occurs at 0
Query: predict eventually apoptosis
Answer: Unknown!
  – Incomplete knowledge about TRADD's bindings.
  – Depends on whether bind(TRADD,RIP) happened or not.

Initial condition: bind(TNF-α,TNF-R1) occurs at t0
Observation: TRADD's binding with TRAF2, FADD, RIP
Query: predict eventually apoptosis
Answer: Yes!
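One way to picture the "Unknown" versus "Yes" answers is to enumerate every completion of the facts that are not known and see whether the query comes out the same in all of them. The sketch below does this in Python. The rule that apoptosis follows exactly when TRADD is bound to TRAF2, FADD and RIP is a toy assumption made for illustration; it is not taken from the authors' TNF knowledge base, and `predict` is a hypothetical helper name.

```python
from itertools import product

# Toy assumption (illustration only): apoptosis is reached exactly when
# TRADD is bound to all three of TRAF2, FADD and RIP.
REQUIRED = {"bound(TRADD,TRAF2)", "bound(TRADD,FADD)", "bound(TRADD,RIP)"}


def apoptosis(facts):
    """Evaluate the toy rule on a complete set of true facts."""
    return REQUIRED <= facts


def predict(known_true, unknown):
    """Answer 'yes', 'no' or 'unknown' by checking every completion of the unknown facts."""
    unknown = sorted(unknown)
    outcomes = set()
    for values in product([False, True], repeat=len(unknown)):
        facts = set(known_true) | {f for f, v in zip(unknown, values) if v}
        outcomes.add(apoptosis(facts))
    if outcomes == {True}:
        return "yes"
    if outcomes == {False}:
        return "no"
    return "unknown"


# Nothing is known about TRADD's bindings: the answer depends on the completion.
print(predict(set(), REQUIRED))   # unknown
# All three bindings have been observed: every completion agrees.
print(predict(REQUIRED, set()))   # yes
```

The design choice here is brute-force enumeration, which is exponential in the number of unknown facts; it only serves to make the three-valued answer concrete.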

Explanation Given the initial condition and observations, to explain why the final outcome does not match expectation. Relation to diagnosis.

Initial condition: bound(TNF-α,TNFR1) at t0
Observation: bound(TRADD,TRAF2) at t1
Query: explain apoptosis
One explanation:
  – Binding of TRADD with RIP
  – Binding of TRADD with FADD
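Explanation can be read as abduction: look for a smallest set of unobserved facts that, added to the observations, makes the query hold. The Python sketch below illustrates that reading on the example above. It again relies on a toy rule (apoptosis requires TRADD bound to TRAF2, FADD and RIP) that is an assumption for illustration, and the candidate fact `interacted(TRAF2,NIK)` is a hypothetical distractor added only to show that irrelevant facts are left out of the explanation.

```python
from itertools import combinations

# Toy assumption (illustration only), not the authors' TNF knowledge base.
REQUIRED = frozenset({"bound(TRADD,TRAF2)", "bound(TRADD,FADD)", "bound(TRADD,RIP)"})


def apoptosis(facts):
    return REQUIRED <= facts


def minimal_explanations(observed, abducibles, goal_holds):
    """Return the subset-minimal sets of abducible facts that make the goal hold."""
    found = []
    candidates = sorted(abducibles)
    for size in range(len(candidates) + 1):
        for combo in combinations(candidates, size):
            candidate = frozenset(combo)
            if any(e <= candidate for e in found):
                continue  # a smaller explanation is already contained in this one
            if goal_holds(set(observed) | candidate):
                found.append(candidate)
    return found


OBSERVED = {"bound(TNF-α,TNFR1)", "bound(TRADD,TRAF2)"}
ABDUCIBLES = {"bound(TRADD,RIP)", "bound(TRADD,FADD)", "interacted(TRAF2,NIK)"}

explanations = minimal_explanations(OBSERVED, ABDUCIBLES, apoptosis)
for e in explanations:
    print(sorted(e))  # ['bound(TRADD,FADD)', 'bound(TRADD,RIP)']
```

Because candidates are tried in order of increasing size and supersets of earlier hits are skipped, only subset-minimal explanations are returned.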

Planning Given initial conditions, to plan interventions to achieve a goal. Application in drug and therapy design.

Planning requirements In addition to the knowledge about the pathway we need additional information about possible interventions such as: What proteins can be introduced What mutations can be forced.

Planning example
Defining possible interventions:
  intervention intro(DN-TRAF2)
  intro(DN-TRAF2) causes present(DN-TRAF2)
  present(DN-TRAF2) inhibits bind(TRAF2,TRADD)
  present(DN-TRAF2) inhibits interact(TRAF2,NIK)
Initial condition:
  bound(NFκB,IκB) at 0
  bind(TNF-α,TNF-R1) at 0
Goal: to keep NFκB inactive.
Query: plan always bound(NFκB,IκB) from 0
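The planning query can be illustrated with a small search over sets of interventions applied at time 0, keeping those under which the goal fluent holds in every simulated state. The Python sketch below is only that: the intervention and inhibition statements come from the slide, but the pathway rules marked "assumed" are a shortened toy chain invented for illustration, and `trajectory` and `find_plans` are hypothetical helper names. A leading "-" on an effect means the fluent becomes false.

```python
from itertools import combinations

NFKB_HELD = "bound(NFκB,IκB)"

CAUSES = {
    "intro(DN-TRAF2)": ["present(DN-TRAF2)"],      # from the slide
    "bind(TNF-α,TNF-R1)": ["trimerized(TNF-R1)"],  # from the slides
    "bind(TRAF2,TRADD)": ["bound(TRAF2,TRADD)"],   # assumed
    "interact(TRAF2,NIK)": ["-" + NFKB_HELD],      # assumed toy effect
}
TRIGGERS = [
    ({"trimerized(TNF-R1)"}, "bind(TRAF2,TRADD)"),    # assumed shortcut
    ({"bound(TRAF2,TRADD)"}, "interact(TRAF2,NIK)"),  # assumed
]
INHIBITS = [
    ({"present(DN-TRAF2)"}, "bind(TRAF2,TRADD)"),     # from the slide
    ({"present(DN-TRAF2)"}, "interact(TRAF2,NIK)"),   # from the slide
]


def trajectory(initial, actions_at_0, horizon=6):
    """Simulate the toy network for a fixed number of steps and return all states."""
    state = set(initial)
    states = [set(state)]
    occurring = set(actions_at_0)
    for _ in range(horizon):
        for action in occurring:
            for effect in CAUSES.get(action, []):
                if effect.startswith("-"):
                    state.discard(effect[1:])
                else:
                    state.add(effect)
        states.append(set(state))
        blocked = {a for cond, a in INHIBITS if cond <= state}
        occurring = {a for cond, a in TRIGGERS if cond <= state} - blocked
    return states


def find_plans(initial, exogenous, interventions, goal_fluent):
    """Return every set of interventions at time 0 that keeps the goal fluent true throughout."""
    plans = []
    options = sorted(interventions)
    for size in range(len(options) + 1):
        for combo in combinations(options, size):
            states = trajectory(initial, set(exogenous) | set(combo))
            if all(goal_fluent in s for s in states):
                plans.append(list(combo))
    return plans


plans = find_plans({NFKB_HELD}, {"bind(TNF-α,TNF-R1)"}, {"intro(DN-TRAF2)"}, NFKB_HELD)
print(plans)  # [['intro(DN-TRAF2)']]
```

In this toy model, doing nothing lets the cascade release NFκB after a few steps, whereas introducing DN-TRAF2 at time 0 blocks both downstream actions before they can fire.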

Future Works! Further development of the language To better approximate cellular systems Delay triggers Granularities of representation Continuous processes, hybrid systems Concurrency, durative actions Scaled-up implementation Kohn's map Networks in Reactome and other repositories Ontologies Integration with BioPax

[Figure labels: Hypothesis space; Knowledge base; High UV; UV leads_to cancer; p53; Cancer; No cancer; (K,I) |= O]

Issues in this tiny example Hypothesis formation: Theory: UV leads to cancer. Observation: wild-type p53 resists the UV effect. Hypothesis: p53 is a tumor-suppressor. Elaboration tolerance: How do we update/revise UV leads to cancer? Defaults and non-monotonic reasoning: Normally UV leads to cancer. UV does not lead to cancer if p53 is present.

Construction of hypothesis space
Present: manual construction, using research literature
Future: integration of multiple data sources
  Protein interactions
  Pathway databases
  Biological ontologies
  ...
Provide cues, hunches such as
  A may interact with B: action interact(A,B)
  A-B interaction may have effect C: interact(A,B) causes C

Generation of hypotheses Enumeration of hypotheses Search: computing with Smodels (an implementation of AnsProlog) Heuristics A trigger statement is selected only if it is the only cause of some action occurrence that is needed to explain the novel observations. An inhibition statement is selected only if it is the only blocker of some triggered action at some time. Maximizing preferences of selected statements

Generation … (cont): heuristics
Knowledge base K:
  a causes g
  b causes g
Initial condition I = { initially f }
Observation O = { eventually g }
(K,I) does not entail O
Hypothesis space: to expand K with rules among
  f triggers a
  f triggers b
Hypotheses: { f triggers a }, or { f triggers b }
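The small example above can be reproduced with a brute-force enumeration: try subsets of the candidate trigger statements in order of increasing size and keep the subset-minimal ones under which the observation becomes derivable. The Python sketch below does this with a simplified forward simulation. It stands in for, and does not reproduce, the Smodels-based search and the selection heuristics described on the slides; the function names are illustrative assumptions.

```python
from itertools import combinations


def eventually(causes, triggers, initial, goal, max_steps=10):
    """Return True if the goal fluent is reached by firing triggered actions."""
    state = set(initial)
    for _ in range(max_steps):
        if goal in state:
            return True
        fired = {action for conds, action in triggers if set(conds) <= state}
        nxt = state | {effect for action, effect in causes if action in fired}
        if nxt == state:  # nothing new can be derived
            return False
        state = nxt
    return goal in state


def minimal_hypotheses(causes, triggers, candidates, initial, observation):
    """Return the subset-minimal sets of candidate triggers that explain the observation."""
    found = []
    for size in range(len(candidates) + 1):
        for combo in combinations(candidates, size):
            if any(set(h) <= set(combo) for h in found):
                continue  # already covered by a smaller hypothesis
            if eventually(causes, triggers + list(combo), initial, observation):
                found.append(combo)
    return found


# K: a causes g, b causes g (K has no trigger statements yet)
K_CAUSES = [("a", "g"), ("b", "g")]
K_TRIGGERS = []
# Hypothesis space: f triggers a, f triggers b
CANDIDATES = [(("f",), "a"), (("f",), "b")]

hypotheses = minimal_hypotheses(K_CAUSES, K_TRIGGERS, CANDIDATES, {"f"}, "g")
for h in hypotheses:
    print([f"{', '.join(conds)} triggers {action}" for conds, action in h])
# ['f triggers a']
# ['f triggers b']
```

The empty hypothesis fails because (K,I) does not entail O, each single trigger statement succeeds, and the two-statement set is skipped as non-minimal, which matches the two hypotheses listed above.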

Case study: p53 network

Tumor suppression by p53 p53 has 3 main functional domains N terminal transactivator domain Central DNA-binding domain C terminal domain that recognizes DNA damage Appropriate binding of N terminal activates pathways that lead to protection of cell from cancer. Inappropriate binding (say to Mdm2) inhibits p53 induced tumor suppression.

p53 knowledge base Stress high(UV ) triggers upregulate(mRNA(p53)) Upregulation of p53 upregulate(mRNA(p53)) causes high(mRNA(p53)) high(mRNA(p53)) triggers translate(p53) translate(p53) causes high(p53)

p53 knowledge base (cont.) Tumor suppression by p53 high(p53) inhibits growth(tumor)

p53 knowledge base (cont.)
Interaction between Mdm2 and p53
  high(p53), high(mdm2) triggers bind(p53,mdm2)
  bind(p53,mdm2) causes bound(dom(p53,N))
  bind(p53,mdm2) causes high([p53 : mdm2])
  bind(p53,mdm2) causes ¬high(p53), ¬high(mdm2)

Hypothesis formation Experimental observation: I = { initially high(UV), high(mdm2), high(ARF) } O = { eventually ~ tumorous } (K,I) does not entail O Need to hypothesize the role of ARF.

Constructing hypothesis space Levels of ARF and p53 correlate high(ARF) triggers upregulate(mRNA(p53)) high(p53) triggers upregulate(mRNA(ARF))

Constructing … (cont.) Interactions of ARF with the known proteins: bind(p53,ARF) causes bound(dom(p53,N))

Constructing … (cont.)
Influence of X (=ARF) on other interactions
  high(ARF) triggers upreg(mRNA(p53))
  high(ARF) triggers translate(p53)
  high(ARF) triggers bind(p53,mdm2)

Hypothesis high(UV) triggers upregulate(mRNA(ARF)) high(ARF), high(mdm2) triggers bind(ARF,mdm2)

Future Works Automatic construction of hypothesis space Extraction of facts like protein interactions … Integration of knowledge from different sources Consistency-based integration (HyBrow) Ontologies Heuristics for hypothesis search Ranking of hypotheses Make use of numerical data such as microarray measurements?