Functional Genomic Hypothesis Generation and Experimentation by a Robot Scientist. King et al., Nature 2004, 427:247-252. Presented by Monica C. Sleumer, February 5, 2004.


Functional Genomic Hypothesis Generation and Experimentation by a Robot Scientist
King et al., Nature 2004, 427:247-252
Presented by Monica C. Sleumer
February 5, 2004

Scientific Discovery
“Branch of AI devoted to developing algorithms for acquiring scientific knowledge”
Current applications:
– Analysis of mass-spec data
– Discovering structure-activity relationships for compounds
– Making semantic connections in published literature
– Predicting mechanisms for chemical reactions
– Revising taxonomies to accommodate new data
Connect to laboratory instrumentation

Accomplishment
Automated the entire scientific process
Robotic system that uses AI to “carry out cycles of scientific experimentation”:
– Originates hypotheses
– Designs experiments
– Performs the experiments
– Interprets the results

Application: Functional genomics
Function unknown for 30% of yeast genes
Complete laboratory automation possible
Goal: connect genes to their function
Using:
– Logical model of aromatic amino acid synthesis pathway
– 8 deletion mutants
– 9 metabolites
– Auxotrophic growth experiments

Aromatic Amino Acid Pathway

Classical vs Robot Science
Classical method:
– Scientific expertise and imagination used to form hypotheses
– Consequences of hypotheses tested by experiment
Robot Scientist:
– Hypotheses formed by abduction
– Tested by deduction

Deduction and Abduction
Deduction
– Rule: P → Q, Fact: ~Q, Infer: ~P
– E.g. If a cell grows on minimal medium, then it can synthesise tryptophan
– Fact: Cell cannot synthesise tryptophan
– ∴ Cell cannot grow on minimal medium
Abduction
– Rule: P → Q, Fact: ~P, Hypothesize: ~Q
– E.g. If a cell grows on minimal medium, then it can synthesise tryptophan
– Fact: Cell cannot grow on minimal medium
– ∴ Cell cannot synthesise tryptophan
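The two inference patterns on this slide can be sketched in a few lines of Python. This is only an illustration of the slide's rules, not code from the paper: the function names `deduce` and `abduce` and the proposition names are hypothetical. Note that the deductive step yields a guaranteed conclusion, while the abductive step yields only a hypothesis that still has to be tested.

```python
# A rule is a pair (P, Q) meaning "P implies Q".
# Facts map a proposition name to True or False.

def deduce(rule, facts):
    """Deduction as on the slide: from P -> Q and not Q, infer not P."""
    p, q = rule
    if facts.get(q) is False:
        return {p: False}
    return {}

def abduce(rule, facts):
    """Abduction as on the slide: from P -> Q and not P, hypothesize not Q.

    The result is a candidate explanation, not a logical consequence.
    """
    p, q = rule
    if facts.get(p) is False:
        return {q: False}
    return {}

# Hypothetical proposition names taken from the slide's example.
rule = ("grows_on_minimal_medium", "can_synthesise_tryptophan")

conclusion = deduce(rule, {"can_synthesise_tryptophan": False})
hypothesis = abduce(rule, {"grows_on_minimal_medium": False})

print(conclusion)
print(hypothesis)
```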

Implementation
Software:
– Background knowledge
– Logical inference engine
– Hypothesis generation code
– Experiment selection code
– LIMS code
Hardware:
– Liquid-handling robot
– Plate reader
– CPU to do the scientific reasoning
No human intellectual input into:
– Experimental design
– Data interpretation

Robot Scientist

Logical Process
Prolog used to model data
Metabolic pathway represented as a directed graph
Deduction: a knockout mutant will grow if and only if a path can be found from the given metabolites to the 3 needed amino acids
Abduction: if a knockout mutant doesn’t grow using the given metabolites, hypothesize which enzyme is missing
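The deduction and abduction steps described on this slide can be sketched as graph reachability. The presentation says the system was modelled in Prolog; the Python below is only an illustrative stand-in. The pathway, the metabolite names (`S`, `A`, `B`) and the enzyme names (`E1` to `E5`) are hypothetical and far smaller than the real aromatic amino acid pathway.

```python
from collections import deque

# Hypothetical toy pathway: (substrate, product, enzyme). Not the real pathway.
REACTIONS = [
    ("S", "A", "E1"),
    ("A", "B", "E2"),
    ("B", "Trp", "E3"),
    ("B", "Phe", "E4"),
    ("B", "Tyr", "E5"),
]
TARGETS = {"Trp", "Phe", "Tyr"}  # the three amino acids the cell must reach

def reachable(medium, missing_enzyme=None):
    """Breadth-first search over reactions whose enzyme is still present."""
    seen = set(medium)
    queue = deque(medium)
    while queue:
        node = queue.popleft()
        for substrate, product, enzyme in REACTIONS:
            if substrate == node and enzyme != missing_enzyme and product not in seen:
                seen.add(product)
                queue.append(product)
    return seen

def predicts_growth(medium, missing_enzyme=None):
    """Deduction: growth is predicted iff every target is reachable."""
    return TARGETS <= reachable(medium, missing_enzyme)

def abduce_missing_enzymes(observations):
    """Abduction: enzymes whose absence is consistent with all observations.

    observations is a list of (medium, grew) pairs for one knockout mutant.
    """
    enzymes = sorted({enzyme for _, _, enzyme in REACTIONS})
    return [
        e for e in enzymes
        if all(predicts_growth(medium, e) == grew for medium, grew in observations)
    ]

# One failed growth experiment leaves every enzyme as a candidate;
# adding supplemented media narrows the hypotheses.
print(abduce_missing_enzymes([({"S"}, False)]))
print(abduce_missing_enzymes([({"S"}, False), ({"S", "A"}, False), ({"S", "B"}, True)]))
```

In this toy setting each extra growth experiment on a supplemented medium eliminates hypotheses, which is why the choice of the next experiment matters for cost.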

Machine Learning
Improves performance based on prior experience
Each hypothesis has:
– Cost of testing
– Probability of being correct
Goals:
– Find out which gene goes with which enzyme
– Use the fewest possible resources

Experiment Choosing
3 ways:
– Intelligent: “ASE”
– Cheapest Experiment: Naïve
– Random Experiment
Performance:
– Accuracy: # of correct predictions made
– Cost and number of experiments required
Both real experiments and simulations
Comparison to human
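The three ways of choosing an experiment can be contrasted with a small sketch. The slides do not give the ASE algorithm, so the "informative" chooser below is a simple stand-in heuristic (expected probability mass eliminated per unit cost), not the method from the paper. All hypothesis names, probabilities, costs and predicted outcomes are invented for illustration.

```python
import random

# Hypothetical prior probability of each hypothesis being correct.
HYPOTHESES = {"h1": 0.5, "h2": 0.3, "h3": 0.2}

# Hypothetical experiments: a cost, and the hypotheses that predict growth.
EXPERIMENTS = {
    "exp_a": {"cost": 1.0, "grows": {"h1", "h2", "h3"}},  # cheap, uninformative
    "exp_b": {"cost": 2.0, "grows": {"h1"}},
    "exp_c": {"cost": 5.0, "grows": {"h1", "h2"}},
}

def expected_eliminated(exp):
    """Expected probability mass ruled out by running the experiment."""
    p_grow = sum(p for h, p in HYPOTHESES.items() if h in exp["grows"])
    p_no_grow = 1.0 - p_grow
    # If growth is observed, mass p_grow survives; otherwise p_no_grow survives.
    expected_remaining = p_grow * p_grow + p_no_grow * p_no_grow
    return 1.0 - expected_remaining

def choose_informative():
    """Stand-in for the intelligent chooser: best elimination per unit cost."""
    return max(EXPERIMENTS, key=lambda n: expected_eliminated(EXPERIMENTS[n]) / EXPERIMENTS[n]["cost"])

def choose_cheapest():
    """Naive chooser: always pick the cheapest experiment."""
    return min(EXPERIMENTS, key=lambda n: EXPERIMENTS[n]["cost"])

def choose_random(rng):
    """Random chooser: pick any experiment uniformly."""
    return rng.choice(sorted(EXPERIMENTS))

print(choose_informative())
print(choose_cheapest())
print(choose_random(random.Random(0)))
```

In this invented example the cheapest experiment cannot distinguish any hypotheses, so a purely cost-driven chooser spends resources without learning anything, which is the kind of gap the comparison on this slide measures.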

[Figure: Accuracy of the Experiment Choosers. Series: ASE, Naive, Random.]

[Figure: Results of Computer Simulations. Series: ASE, Naive, Random. Panels: No noise, Noise.]

Conclusions
Scientific process can be automated
Experiment selection strategies have significant impact on cost
ASE outperforms, in terms of cost:
– Naïve by 3-fold
– Random by 100-fold
Performance is competitive with human
Cost-effectiveness of science can be improved

Future Work
Extend system to uncover function of other metabolic genes
Would need to:
– Extend model to entire biochemical pathway in KEGG
– Become more robust to possible errors in KEGG
– Include prediction of previously unknown enzymes

Criticisms
Paper de-emphasizes how little of the pathway was actually tested
Not clear how deletion mutants were chosen
No example of experiment cycle
Too large a jump from theory to results
Results graphs too crowded

Discussion Questions
Would computer-generated experiments and results be accepted?
How much would we have to understand about a computer-generated discovery process?
Compare this system to the currently common method of:
– Large-scale generation of data
– Extraction of knowledge by data-mining systems
What other aspects of genome analysis could scientific discovery be applied to?