Pat Langley School of Computing and Informatics Arizona State University Tempe, Arizona Computational Discovery.

Presentation transcript:

Computational Discovery of Explanatory Process Models
Pat Langley, School of Computing and Informatics, Arizona State University, Tempe, Arizona
Thanks to N. Asgharbeygi, K. Arrigo, D. Billman, S. Borrett, W. Bridewell, S. Dzeroski, O. Shiran, and L. Todorovski for their contributions to this research, which is funded by a grant from the National Science Foundation.

The Challenge of Systems Science
Disciplines like Earth science and computational biology differ from traditional fields in that they:
- focus on synthesis rather than analysis in their operation;
- develop system-level models with many variables / relations;
- rely on computational methods to aid in their construction.
However, the key challenge involves search through the model space, not running rapid simulations or handling large data sets.

Example: Explain Data from the Ross Sea

A Model of the Ross Sea Ecosystem
d[phyto,t,1] = phyto zoo phyto
d[zoo,t,1] = zoo zoo
d[detritus,t,1] = phyto zoo zoo detritus
d[nitro,t,1] = phyto detritus
Differential equation models of this sort are regularly used to explain observations and predict future behavior.

The Task of Model Construction
Environmental scientists are confronted with a challenging task:
- Given: A set of variables of interest to the scientist;
- Given: Observations of how these variables change over time;
- Find: A model that explains these variations in plausible terms and that generalizes well to future observations.
Automating such model construction is a natural task for artificial intelligence and machine learning. We can develop algorithms that search the space of differential equation models, but this space is huge, so we need constraints.

Another Account of the Ross Sea Ecosystem
d[phyto,t,1] = phyto zoo phyto
d[zoo,t,1] = zoo zoo
d[detritus,t,1] = phyto zoo zoo detritus
d[nitro,t,1] = phyto detritus
As phytoplankton takes up nitrogen, its concentration increases and the nitrogen concentration decreases. This continues until the nitrogen supply is exhausted, which leads to a phytoplankton die-off. This produces detritus, which gradually remineralizes to replenish the nitrogen. Zooplankton grazes on phytoplankton, which slows the latter's increase and also produces detritus.

Processes in the Ross Sea Ecosystem
d[phyto,t,1] = phyto zoo phyto
d[zoo,t,1] = zoo zoo
d[detritus,t,1] = phyto zoo zoo detritus
d[nitro,t,1] = phyto detritus
Here we highlight the terms related to phytoplankton loss, which decreases phyto concentration and increases detritus. Knowledge about candidate processes requires that some terms occur either together or not at all.

Processes in the Ross Sea Ecosystem
d[phyto,t,1] = phyto zoo phyto
d[zoo,t,1] = zoo zoo
d[detritus,t,1] = phyto zoo zoo detritus
d[nitro,t,1] = phyto detritus
We can use knowledge about processes to reorganize models and constrain search through the model space. Here we highlight terms related to zooplankton grazing, which decreases phyto but increases zoo and detritus.

A Process Model for the Ross Sea
model Ross_Sea_Ecosystem
  variables: phyto, zoo, nitro, detritus
  observables: phyto, nitro
  process phyto_loss
    equations: d[phyto,t,1] = phyto
               d[detritus,t,1] = phyto
  process zoo_loss
    equations: d[zoo,t,1] = zoo
               d[detritus,t,1] = zoo
  process zoo_phyto_grazing
    equations: d[zoo,t,1] = zoo
               d[detritus,t,1] = zoo
               d[phyto,t,1] = zoo
  process nitro_uptake
    equations: d[phyto,t,1] = phyto
               d[nitro,t,1] = phyto
  process nitro_remineralization
    equations: d[nitro,t,1] = detritus
               d[detritus,t,1] = detritus
This model is equivalent to a standard differential equation model, but it makes explicit assumptions about which processes are involved. For completeness, we must also make assumptions about how to combine influences from multiple processes.
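One simple combination assumption is that influences from different processes on the same variable are added. The following is a minimal sketch under that assumption, not the authors' implementation: the process names follow the slide, every rate constant is a hypothetical placeholder, and `derivatives` and `euler_step` are illustrative helper names.

```python
from collections import defaultdict

# Each process maps a state to its contributions to d[var,t,1].
# All rate constants are hypothetical placeholders, not fitted values.
def phyto_loss(s, rate=0.3):
    return {"phyto": -rate * s["phyto"], "detritus": rate * s["phyto"]}

def zoo_loss(s, rate=0.25):
    return {"zoo": -rate * s["zoo"], "detritus": rate * s["zoo"]}

def nitro_remineralization(s, rate=0.005):
    return {"nitro": rate * s["detritus"], "detritus": -rate * s["detritus"]}

PROCESSES = [phyto_loss, zoo_loss, nitro_remineralization]

def derivatives(state, processes=PROCESSES):
    """Sum the influences of all processes on each variable (additive combination)."""
    total = defaultdict(float)
    for process in processes:
        for var, contribution in process(state).items():
            total[var] += contribution
    return {var: total[var] for var in state}

def euler_step(state, dt=0.1, processes=PROCESSES):
    """Advance every variable by one forward-Euler step of size dt."""
    d = derivatives(state, processes)
    return {var: value + dt * d[var] for var, value in state.items()}

state = {"phyto": 1.0, "zoo": 0.5, "nitro": 2.0, "detritus": 0.0}
next_state = euler_step(state)
```

Because each of these three processes only moves material from one variable to another, the summed derivatives cancel to zero, which is a quick sanity check on the combination scheme.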

The Task of Inductive Process Modeling
We can use these ideas to reformulate the modeling problem:
- Given: A set of variables of interest to the scientist;
- Given: Observations of how these variables change over time;
- Given: Background knowledge about plausible processes;
- Find: A process model that explains these variations and that generalizes well to future observations.
We can use background knowledge about candidate processes to make search much more tractable. Moreover, the resulting model will be consistent with this domain knowledge, making it more comprehensible.

Generic Processes as Background Knowledge
We cast background knowledge as generic processes that specify:
- the variables involved in a process and their types;
- the parameters appearing in a process and their ranges;
- the forms of conditions on the process; and
- the forms of associated equations and their parameters.
Generic processes are building blocks from which one can compose a specific process model.

Generic Processes for Aquatic Ecosystems
generic process exponential_loss
  variables: S{species}, D{detritus}
  parameters: [0, 1]
  equations: d[S,t,1] = 1 S
             d[D,t,1] = S
generic process remineralization
  variables: N{nutrient}, D{detritus}
  parameters: [0, 1]
  equations: d[N,t,1] = D
             d[D,t,1] = 1 D
generic process grazing
  variables: S1{species}, S2{species}, D{detritus}
  parameters: [0, 1], [0, 1]
  equations: d[S1,t,1] = S1
             d[D,t,1] = (1 ) S1
             d[S2,t,1] = 1 S1
generic process constant_inflow
  variables: N{nutrient}
  parameters: [0, 1]
  equations: d[N,t,1] =
generic process nutrient_uptake
  variables: S{species}, N{nutrient}
  parameters: [0, ], [0, 1], [0, 1]
  conditions: N >
  equations: d[S,t,1] = S
             d[N,t,1] = 1 S
Our current library contains about 20 generic processes, including ones with alternative functional forms for loss and grazing processes.
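To illustrate how typed variable slots constrain instantiation, here is a minimal sketch. The `GenericProcess` class and `instantiate` function are hypothetical names introduced for illustration, and the variable types assigned to phyto, zoo, nitro, and detritus are assumptions chosen to mirror the {species}, {nutrient}, and {detritus} labels on the slide.

```python
from dataclasses import dataclass
from itertools import permutations

@dataclass(frozen=True)
class GenericProcess:
    name: str
    var_types: tuple      # required type for each variable slot, in order
    param_ranges: tuple   # (low, high) bounds for each parameter

def instantiate(generic, typed_vars):
    """Return every binding of distinct variables to slots that respects the slot types."""
    bindings = []
    for combo in permutations(typed_vars, len(generic.var_types)):
        if all(typed_vars[v] == t for v, t in zip(combo, generic.var_types)):
            bindings.append((generic.name, combo))
    return bindings

# Assumed typing of the Ross Sea variables (illustrative only).
typed_vars = {
    "phyto": "species",
    "zoo": "species",
    "nitro": "nutrient",
    "detritus": "detritus",
}

grazing = GenericProcess("grazing", ("species", "species", "detritus"), ((0, 1), (0, 1)))
exponential_loss = GenericProcess("exponential_loss", ("species", "detritus"), ((0, 1),))

grazing_instances = instantiate(grazing, typed_vars)
loss_instances = instantiate(exponential_loss, typed_vars)
```

In this sketch `grazing` yields two bindings, one for each ordering of phyto and zoo, so type constraints alone narrow the candidates without settling which instance belongs in the final model.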

Constructing Process Models
[Diagram labels: observations; generic processes; variables (phyto, nitro, zoo, nutrient_nitro, nutrient_phyto); heuristic search; process model]
generic processes:
  process exponential_growth
    variables: P {population}
    equations: d[P,t] = [0, 1, ] P
  process logistic_growth
    variables: P {population}
    equations: d[P,t] = [0, 1, ] P (1 P / [0, 1, ])
  process constant_inflow
    variables: I {inorganic_nutrient}
    equations: d[I,t] = [0, 1, ]
  process consumption
    variables: P1 {population}, P2 {population}, nutrient_P2
    equations: d[P1,t] = [0, 1, ] P1 nutrient_P2,
               d[P2,t] = [0, 1, ] P1 nutrient_P2
  process no_saturation
    variables: P {number}, nutrient_P {number}
    equations: nutrient_P = P
  process saturation
    variables: P {number}, nutrient_P {number}
    equations: nutrient_P = P / (P + [0, 1, ])
process model:
  model AquaticEcosystem
    variables: nitro, phyto, zoo, nutrient_nitro, nutrient_phyto
    observables: nitro, phyto, zoo
    process phyto_exponential_growth
      equations: d[phyto,t] = 0.1 phyto
    process zoo_logistic_growth
      equations: d[zoo,t] = 0.1 zoo / (1 zoo / 1.5)
    process phyto_nitro_consumption
      equations: d[nitro,t] = 1 phyto nutrient_nitro,
                 d[phyto,t] = 1 phyto nutrient_nitro
    process phyto_nitro_no_saturation
      equations: nutrient_nitro = nitro
    process zoo_phyto_consumption
      equations: d[phyto,t] = 1 zoo nutrient_phyto,
                 d[zoo,t] = 1 zoo nutrient_phyto
    process zoo_phyto_saturation
      equations: nutrient_phyto = phyto / (phyto + 0.5)

A Method for Process Model Construction
Our initial system, IPM, constructs process models from generic components in four stages:
1. Find all ways to instantiate known generic processes with specific variables, subject to type constraints;
2. Combine instantiated processes into candidate generic models subject to additional constraints (e.g., number of processes);
3. For each generic model, carry out search through parameter space to find good coefficients;
4. Return the parameterized model with the best overall score.
Our typical evaluation metric is squared error, but we have also explored other measures of explanatory adequacy.
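The four stages can be sketched on a toy problem with a single variable. This is an illustration under strong simplifying assumptions, not IPM itself: the `GENERIC` library, the exhaustive parameter grid, the forward-Euler simulator, and the function names are all hypothetical, and with only one variable the instantiation stage is trivial.

```python
from itertools import combinations, product

# Hypothetical generic processes for one variable x:
# each maps (x, parameter) to a contribution to dx/dt.
GENERIC = {
    "growth": lambda x, k: k * x,
    "loss": lambda x, k: -k * x,
    "inflow": lambda x, k: k,
}

def simulate(structure, params, x0, n_steps, dt):
    """Forward-Euler simulation that adds the influence of every process in the structure."""
    xs = [x0]
    for _ in range(n_steps):
        x = xs[-1]
        dx = sum(GENERIC[name](x, k) for name, k in zip(structure, params))
        xs.append(x + dt * dx)
    return xs

def squared_error(predicted, observed):
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))

def induce_model(observed, dt, max_processes=2, grid=(0.1, 0.2, 0.3, 0.4, 0.5)):
    best = None
    # Stages 1-2: enumerate combinations of processes up to a size limit.
    for size in range(1, max_processes + 1):
        for structure in combinations(sorted(GENERIC), size):
            # Stage 3: search parameter space (an exhaustive grid here, by assumption).
            for params in product(grid, repeat=size):
                predicted = simulate(structure, params, observed[0], len(observed) - 1, dt)
                score = squared_error(predicted, observed)
                if best is None or score < best[0]:
                    best = (score, structure, params)
    # Stage 4: return the best-scoring parameterized model.
    return best

# Synthetic observations generated from a known structure, then recovered by search.
observed = simulate(("inflow", "loss"), (0.5, 0.2), x0=1.0, n_steps=20, dt=0.5)
score, structure, params = induce_model(observed, dt=0.5)
```

A realistic system would replace the grid with a numerical optimizer and the single variable with typed instantiation over many variables, but the control flow of the four stages stays the same.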

Results on Observations from Ross Sea
We provided IPM with 188 samples of phytoplankton, nitrate, and ice measures taken from the Ross Sea. From 2035 distinct model structures, it found accurate models that limited phyto growth by the available nitrate and light. Some high-ranking models incorporated zooplankton, whereas others did not.

Results with Inductive Process Modeling
[Figure panels: population dynamics; battery behavior; hydrology; biochemical kinetics]

Extensions to Inductive Process Modeling
In recent work, we have extended our system to incorporate:
- heuristic beam search through the space of process models;
- hierarchical generic processes that further constrain search;
- an ensemble-like method that mitigates overfitting effects;
- an EM-like method that deals with missing observations.
This approach has great potential to speed the construction of scientific models, provided that domain users adopt it.
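The slides do not spell out the beam search, so the following is a generic sketch of the idea rather than the authors' procedure: model structures grow one process at a time and only the best few survive each level. The `toy_score` function is a hypothetical stand-in for the fitted error of a structure; in practice each score would come from parameter estimation against data.

```python
def beam_search(candidates, score, beam_width=2, max_size=3):
    """Grow structures one process at a time, keeping the beam_width best at each level."""
    beam = [frozenset()]
    best = beam[0]
    for _ in range(max_size):
        # Expand every structure in the beam by each process it does not yet contain.
        expanded = {s | {c} for s in beam for c in candidates if c not in s}
        if not expanded:
            break
        # Lower score is better; ties are broken alphabetically for determinism.
        beam = sorted(expanded, key=lambda s: (score(s), sorted(s)))[:beam_width]
        if score(beam[0]) < score(best):
            best = beam[0]
    return best

# Hypothetical target used only to make the toy score well defined.
TARGET = frozenset({"nitro_uptake", "phyto_loss", "remineralization"})

def toy_score(structure):
    """Stand-in for model error: number of processes missing or superfluous."""
    return len(structure ^ TARGET)

candidates = ["grazing", "nitro_uptake", "phyto_loss", "remineralization", "zoo_loss"]
best_structure = beam_search(candidates, toy_score)
```

With a beam width of 2 this search scores far fewer structures than exhaustive enumeration of all subsets, at the cost of possibly discarding a structure whose partial form scores poorly.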

Interfacing with Scientists
Because few scientists want to be replaced, we are developing an interactive environment, PROMETHEUS, that lets users:
- specify a quantitative process model of the target system;
- display and edit the model's structure and details graphically;
- simulate the model's behavior over time and situations;
- compare the model's predicted behavior to observations;
- invoke a revision module in response to detected anomalies.
The environment offers computational assistance in forming and evaluating models but lets the user retain control.

Viewing a Process Model Graphically

Viewing a Process Model as Equations

Adding a Process Manually

Requesting Automatic Model Revision

Results of Automatic Model Revision

Directions for Future Research
Despite our progress to date, we need further work in order to:
- provide better ways to visualize models, data, and their relation
- offer users more natural ways to define the space of models
  - specifying constraints on relations among entities and processes
  - characterizing subsystems that decompose complex models
- incorporate intuitive metrics like match to trajectory shape
- more generally improve the usability of PROMETHEUS
Taken together, these will make inductive process modeling a more robust approach to scientific model construction.

Intellectual Influences
Our approach to aiding scientific model construction incorporates ideas from many traditions:
- computational scientific discovery (e.g., Langley et al., 1983);
- theory revision in machine learning (e.g., Towell, 1991);
- qualitative physics and simulation (e.g., Forbus, 1984);
- languages for scientific simulation (e.g., STELLA, MATLAB);
- interactive tools for data analysis (e.g., Shneiderman, 2001).
Our work combines, in novel ways, insights from machine learning, AI, programming languages, and human-computer interaction.

Contributions of the Research
In summary, our work on computational model construction has produced an approach that:
- incorporates a formalism that is familiar to many scientists;
- takes into account background knowledge about the domain;
- produces meaningful results from small amounts of data;
- generates models that explain rather than describe observations;
- provides an interactive environment for model construction.
We need much more research in computational systems science that addresses these challenges.

End of Presentation