Pat Langley Institute for the Study of Learning and Expertise Javier Sanchez CSLI / Stanford University Ljupco Todorovski Saso Dzeroski Jozef Stefan Institute.


1 Inducing Process Models from Continuous Data
Pat Langley Institute for the Study of Learning and Expertise Javier Sanchez CSLI / Stanford University Ljupco Todorovski Saso Dzeroski Jozef Stefan Institute
Supported by NTT Communication Science Laboratories, by Grant NCC 2-1220 from NASA Ames Research Center, and by EU Grant IST-2000-26469.

2 Exploratory Research in Machine Learning
Dietterich (1990) claims an exploratory research report should:
- define a challenging new problem for machine learning;
- show that established methods cannot solve the problem;
- present an initial approach that addresses the new task; and
- outline an agenda for future research efforts in the area.
In this talk, we explore the problem of inducing process models from continuous data.

3 Inductive Process Modeling
training data + background knowledge -> Induction -> learned knowledge

background knowledge:
process exponential_growth
  variables: P {population}
  equations: d[P,t] = [0, 1, ∞] * P
process logistic_growth
  variables: P {population}
  equations: d[P,t] = [0, 1, ∞] * P * (1 - P / [0, 1, ∞])
process constant_inflow
  variables: I {inorganic_nutrient}
  equations: d[I,t] = [0, 1, ∞]
process consumption
  variables: P1 {population}, P2 {population}, nutrient_P2
  equations: d[P1,t] = [0, 1, ∞] * P1 * nutrient_P2,
             d[P2,t] = -[0, 1, ∞] * P1 * nutrient_P2
process no_saturation
  variables: P {number}, nutrient_P {number}
  equations: nutrient_P = P
process saturation
  variables: P {number}, nutrient_P {number}
  equations: nutrient_P = P / (P + [0, 1, ∞])

learned knowledge:
model AquaticEcosystem
  variables: nitro, phyto, zoo, nutrient_nitro, nutrient_phyto
  observables: nitro, phyto, zoo
process phyto_exponential_growth
  equations: d[phyto,t] = 0.1 * phyto
process zoo_logistic_growth
  equations: d[zoo,t] = 0.1 * zoo / (1 - zoo / 1.5)
process phyto_nitro_consumption
  equations: d[nitro,t] = -1 * phyto * nutrient_nitro,
             d[phyto,t] = 1 * phyto * nutrient_nitro
process phyto_nitro_no_saturation
  equations: nutrient_nitro = nitro
process zoo_phyto_consumption
  equations: d[phyto,t] = -1 * zoo * nutrient_phyto,
             d[zoo,t] = 1 * zoo * nutrient_phyto
process zoo_phyto_saturation
  equations: nutrient_phyto = phyto / (phyto + 0.5)

4 Inductive Process Modeling
training data + background knowledge -> Induction -> learned model
- training data: observed values for a set of continuous variables as they vary over time or situations;
- background knowledge: generic processes that characterize causal relationships among variables in terms of conditional equations;
- learned model: a specific process model that explains the observed values and predicts future data accurately.

5 A Process Model of an Ice-Water System
model WaterPhaseChange
  variables: temp, heat, ice_mass, water_mass
  observables: temp, heat, ice_mass, water_mass
process ice-warming
  conditions: ice_mass > 0, temp < 0
  equations: d[temp,t] = heat / (0.00206 * ice_mass)
process ice-melting
  conditions: ice_mass > 0, temp == 0
  equations: d[ice_mass,t] = -(18 * heat) / 6.02,
             d[water_mass,t] = (18 * heat) / 6.02
process water-warming
  conditions: ice_mass == 0, water_mass > 0, temp >= 0, temp < 100
  equations: d[temp,t] = heat / (0.004184 * water_mass)
(Plot: temp, ice_mass, and water_mass against Time.)
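The ice-water model on this slide can be stepped forward numerically. The following is a minimal Python sketch, not the authors' simulator. It assumes simple Euler integration, a constant positive heat input, and that ice mass decreases while water mass increases during melting. The function names, the starting values, and the two clamping rules (stop at temp == 0, never let ice_mass go negative) are illustrative assumptions added so that the equality conditions can fire under discrete time steps.

```python
def step(state, dt):
    """Advance the ice-water model by one Euler step of size dt."""
    temp, heat = state["temp"], state["heat"]
    ice, water = state["ice_mass"], state["water_mass"]
    d_temp = d_ice = 0.0

    # A process contributes only while all of its conditions hold.
    if ice > 0 and temp < 0:                        # ice-warming
        d_temp += heat / (0.00206 * ice)
    if ice > 0 and temp == 0:                       # ice-melting
        d_ice += -(18 * heat) / 6.02
    if ice == 0 and water > 0 and 0 <= temp < 100:  # water-warming
        d_temp += heat / (0.004184 * water)

    new_temp = temp + dt * d_temp
    if temp < 0 < new_temp:
        new_temp = 0.0                    # assumption: stop exactly at the melting point
    new_ice = max(0.0, ice + dt * d_ice)  # assumption: ice mass cannot go negative
    new_water = water + (ice - new_ice)   # whatever melted becomes water
    return {"temp": new_temp, "heat": heat,
            "ice_mass": new_ice, "water_mass": new_water}


def simulate(state, dt, steps):
    """Return the list of states visited, starting with the initial one."""
    trajectory = [state]
    for _ in range(steps):
        state = step(state, dt)
        trajectory.append(state)
    return trajectory


# Hypothetical starting point: 100 units of ice at -10 degrees, steady heating.
start = {"temp": -10.0, "heat": 0.5, "ice_mass": 100.0, "water_mass": 0.0}
trajectory = simulate(start, dt=0.1, steps=1000)
final = trajectory[-1]
```

With these settings the run passes through all three regimes in turn: the ice warms to 0, melts at constant temperature, and the resulting water then warms.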

6 Why Are Process Models Interesting?
Process models are a crucial target for machine learning because:
- they incorporate scientific formalisms rather than AI notations that are easily communicable to scientists and engineers;
- they move beyond descriptive generalization to explanation while retaining the modularity needed to support induction.
These reasons point to process models as an ideal representation for scientific and engineering knowledge.
Process models are an important alternative to formalisms used currently in machine learning.

7 Challenges of Inductive Process Modeling
Process model induction differs from typical learning tasks in that:
- process models characterize behavior of dynamical systems;
- variables are mainly continuous and data are unsupervised;
- observations are not independently and identically distributed;
- process models contain unobservable processes and variables;
- multiple processes can interact to produce complex behavior.
Compensating factors include a focus on deterministic systems and the availability of background knowledge.

8 Can Existing Methods Induce Process Models?
(Figure: example outputs from existing methods, one panel per method.)
- equation discovery: d[ice_mass,t] = -(18 * heat) / 6.02, d[water_mass,t] = (18 * heat) / 6.02
- regression trees (example tree not preserved)
- hidden Markov models (example not preserved)
- explanation-based learning
- inductive logic programming: gcd(X,X,X). gcd(X,Y,D) :- X<Y, Z is Y-X, gcd(X,Z,D). gcd(X,Y,D) :- Y<X, gcd(Y,X,D).

9 Facets of Inductive Process Modeling
To describe a system that learns process models, we must specify:
- characteristics of the data (observations to be explained);
- a representation for background knowledge (generic processes);
- a representation for learned knowledge (process models);
- a performance element that makes predictions (a simulator);
- a learning method that induces process models.
We will use an example from population dynamics to illustrate an initial approach to inductive process modeling.

10 Data for an Aquatic Ecosystem

11 Generic Processes for Population Dynamics
process exponential_growth
  variables: P {population}
  equations: d[P,t] = [0, 1, ∞] * P
process exponential_decay
  variables: P {population}
  equations: d[P,t] = -[0, 1, ∞] * P
process logistic_growth
  variables: P {population}
  equations: d[P,t] = [0, 1, ∞] * P * (1 - P / [0, 1, ∞])
process constant_inflow
  variables: I {inorganic_nutrient}
  equations: d[I,t] = [0, 1, ∞]
process consumption
  variables: P1 {population}, P2 {population}, nutrient_P2 {number}
  equations: d[P1,t] = [0, 1, ∞] * P1 * nutrient_P2,
             d[P2,t] = -[0, 1, ∞] * P1 * nutrient_P2
process no_saturation
  variables: P {number}, nutrient_P {number}
  equations: nutrient_P = P
process saturation
  variables: P {number}, nutrient_P {number}
  equations: nutrient_P = P / (P + [0, 1, ∞])

12 Process Model for an Aquatic Ecosystem
model AquaticEcosystem
  variables: nitro, phyto, zoo, nutrient_nitro, nutrient_phyto
  observables: nitro, phyto, zoo
process phyto_exponential_growth
  equations: d[phyto,t] = 0.1 * phyto
process zoo_logistic_growth
  equations: d[zoo,t] = 0.1 * zoo / (1 - zoo / 1.5)
process phyto_nitro_consumption
  equations: d[nitro,t] = -1 * phyto * nutrient_nitro,
             d[phyto,t] = 1 * phyto * nutrient_nitro
process phyto_nitro_no_saturation
  equations: nutrient_nitro = nitro
process zoo_phyto_consumption
  equations: d[phyto,t] = -1 * zoo * nutrient_phyto,
             d[zoo,t] = 1 * zoo * nutrient_phyto
process zoo_phyto_saturation
  equations: nutrient_phyto = phyto / (phyto + 0.5)

13 Making Predictions with Process Models
- Specify initial values for input variables and the size for time steps
- On each time step, check conditions to decide which processes are active
- Solve algebraic and differential equations with known values
- Propagate values and recurse to solve other equations
- Add the effects of different processes on each variable
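The prediction steps listed on this slide can be sketched as a small simulation loop. This is an illustrative Python sketch rather than the authors' performance element: it assumes Euler integration, represents each process as a dictionary with optional "condition", "algebraic", and "differential" entries (a hypothetical encoding), and uses only three processes from the aquatic ecosystem model, with the consumption signs assumed so that the consumed variable decreases.

```python
def simulate(processes, state, dt, steps):
    """Euler-simulate state variables under a set of processes."""
    trajectory = [dict(state)]
    for _ in range(steps):
        # Check conditions to decide which processes are active.
        active = [p for p in processes
                  if p.get("condition", lambda s: True)(state)]

        # Solve algebraic equations from known values, retrying those
        # whose inputs are not available yet (propagation).
        values = dict(state)
        pending = [(var, fn) for p in active
                   for var, fn in p.get("algebraic", {}).items()]
        while pending:
            unsolved = []
            for var, fn in pending:
                try:
                    values[var] = fn(values)
                except KeyError:
                    unsolved.append((var, fn))
            if len(unsolved) == len(pending):
                raise ValueError("algebraic equations cannot be solved")
            pending = unsolved

        # Add the effects of different processes on each variable.
        deriv = {}
        for p in active:
            for var, fn in p.get("differential", {}).items():
                deriv[var] = deriv.get(var, 0.0) + fn(values)

        state = {v: state[v] + dt * deriv.get(v, 0.0) for v in state}
        trajectory.append(dict(state))
    return trajectory


processes = [
    {"differential": {"phyto": lambda v: 0.1 * v["phyto"]}},   # growth
    {"differential": {                                         # consumption
        "phyto": lambda v: -1.0 * v["zoo"] * v["nutrient_phyto"],
        "zoo": lambda v: 1.0 * v["zoo"] * v["nutrient_phyto"]}},
    {"algebraic": {                                            # saturation
        "nutrient_phyto": lambda v: v["phyto"] / (v["phyto"] + 0.5)}},
]

trajectory = simulate(processes, {"phyto": 1.0, "zoo": 0.1}, dt=0.1, steps=50)
```

Keeping algebraic variables such as nutrient_phyto out of the state dictionary means they are recomputed on every step from the current state, which matches their role as unobserved intermediate quantities.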

14 The IPM Method for Process Model Induction
- Find all ways to instantiate known generic processes with specific variables
- Combine subsets of instantiated processes into generic models
- Remove candidates that are too complex or not connected graphs
- For each generic model, search for good parameter values
- Return parameterized model with the smallest error
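The first three steps of this method (instantiate, combine, filter) can be illustrated with a short Python sketch. It is not the IPM implementation. The type labels, including the hypothetical "consumable" role, are simplified assumptions rather than the type system used on the slides, the complexity limit is reduced to a cap on the number of processes, and "connected" is taken to mean that the processes link all variables into a single graph.

```python
from itertools import combinations, permutations

# Generic processes as (name -> required type for each variable role).
GENERIC = {
    "exponential_growth": ("population",),
    "exponential_decay": ("population",),
    "constant_inflow": ("inorganic_nutrient",),
    "consumption": ("population", "consumable"),
}

# Each variable carries a set of type labels (illustrative assumption).
VARIABLES = {
    "nitro": {"inorganic_nutrient", "consumable"},
    "phyto": {"population", "consumable"},
    "zoo": {"population", "consumable"},
}


def instantiate(generic, variables):
    """Bind every generic process to every type-compatible variable tuple."""
    instances = []
    for name, roles in generic.items():
        for binding in permutations(variables, len(roles)):
            if all(role in variables[v] for role, v in zip(roles, binding)):
                instances.append((name, binding))
    return instances


def is_connected(structure, variables):
    """True if the processes link all variables into one connected graph."""
    adjacency = {v: set() for v in variables}
    for _, binding in structure:
        for a, b in combinations(binding, 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    start = next(iter(variables))
    seen, frontier = {start}, [start]
    while frontier:
        node = frontier.pop()
        for neighbor in adjacency[node] - seen:
            seen.add(neighbor)
            frontier.append(neighbor)
    return seen == set(variables)


def candidate_structures(instances, variables, max_processes):
    """Yield subsets of instantiated processes that pass both filters."""
    for size in range(1, max_processes + 1):
        for structure in combinations(instances, size):
            if is_connected(structure, variables):
                yield structure


instances = instantiate(GENERIC, VARIABLES)
structures = list(candidate_structures(instances, VARIABLES, max_processes=3))
```

Each surviving structure would then be handed to a parameter search, and the best-scoring parameterized model returned.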

15 Initial Evaluation of IPM Algorithm
To demonstrate IPM's functionality at inducing process models, we ran it on synthetic data for a known system.
1. We used the aquatic ecosystem model to generate data for 100 time steps, setting nitrogen = 1.0, phyto = 0.01, zoo = 0.01;
2. We replaced each true value x with x * (1 + r * 0.05), where r came from a Gaussian distribution (μ = 0 and σ = 1);
3. We ran IPM on these noisy data, giving it type constraints and generic processes as background knowledge.
The IPM algorithm examined a space of 2196 generic models, each with an embedded parameter optimization.
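The noise step in item 2 is simple enough to write out. The sketch below assumes the missing distribution parameters are mean 0 and standard deviation 1, so each value is perturbed by roughly 5% multiplicative Gaussian noise. The function name add_noise and the sample values are hypothetical.

```python
import random


def add_noise(values, scale=0.05, rng=None):
    """Replace each x with x * (1 + r * scale), r drawn from N(0, 1)."""
    rng = rng or random.Random()
    return [x * (1 + rng.gauss(0.0, 1.0) * scale) for x in values]


# Seeded generator so that a run is repeatable.
clean = [1.0, 0.01, 0.01]
noisy = add_noise(clean, rng=random.Random(0))
```

Because the noise is multiplicative, small values such as the initial phyto and zoo concentrations receive proportionally small perturbations, and a value of exactly zero stays zero.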

16 Predictions from IPM's Induced Model

17 Process Model Generated by IPM
model AquaticEcosystem
  variables: nitro, phyto, zoo, nutrient_nitro_1, nutrient_nitro_2, nutrient_phyto
  observables: nitro, phyto, zoo
process phyto_exponential_growth
  equations: d[phyto,t] = 0.089 * phyto
process zoo_logistic_growth
  equations: d[zoo,t] = 0.013 * zoo / (1 - zoo / 0.469)
process phyto_nitro_consumption
  equations: d[nitro,t] = -1.174 * phyto * nutrient_nitro_1,
             d[phyto,t] = 1.058 * phyto * nutrient_nitro_1
process phyto_nitro_no_saturation
  equations: nutrient_nitro_1 = nitro
process zoo_phyto_consumption
  equations: d[phyto,t] = -0.986 * zoo * nutrient_phyto,
             d[zoo,t] = 1.089 * zoo * nutrient_phyto
process zoo_phyto_saturation
  equations: nutrient_phyto = phyto / (phyto + 0.487)

18 Process Model Generated by IPM (continued)
process nitro_constant_inflow
  equations: d[nitro,t] = 0.067
process zoo_nitro_consumption
  equations: d[nitro,t] = -0.470 * zoo * nutrient_nitro_2,
             d[zoo,t] = 1.089 * zoo * nutrient_nitro_2
process zoo_nitro_saturation
  equations: nutrient_nitro_2 = nitro / (nitro + 0.020)
These extra processes complicate the model but have little effect on its behavior or its predictive accuracy.

19 A Proposed Research Agenda
Future research on process modeling should explore methods that:
- reduce variance and overfitting (e.g., through pruning);
- determine the conditions on processes from training data;
- associate variables with physical entities to constrain search;
- use a taxonomy of process types to organize and limit search;
- use knowledge of dimensions and conservation to limit search;
- support the induction of qualitative process models; and
- revise existing process models rather than construct them.
This work should draw on traditional induction methods, which have many relevant ideas.

20 Evaluation of Process Models
Research on this new class of problems should follow the accepted standards; thus, papers should:
- make explicit claims about an induction method's abilities;
- support these claims with experimental or theoretical evidence;
- study behavior on natural data sets to ensure relevance;
- utilize synthetic data sets to vary dimensions of interest; and
- incorporate ideas from other tasks and utilize existing methods whenever sensible.
In addition, the focus on communicability and use of background knowledge suggests collaborations with domain experts.

21 Concluding Remarks
In this exploratory research contribution, we have:
- proposed a new problem that involves induction of process models from components to explain observations;
- argued that this task does not lend itself to established methods;
- proposed a formalism for models and background knowledge;
- presented an initial system that induces such process models;
- demonstrated its functionality in a population dynamics domain;
- outlined an agenda for future research in this new area.
Process model induction has great potential to aid development of models in science and engineering.

22 In Memoriam
Early last year, computational scientific discovery lost two of its founding fathers:
- Herbert A. Simon (1916 – 2001)
- Jan M. Zytkow (1945 – 2001)
Both contributed to the field in many ways: posing new problems, inventing methods, training students, and organizing meetings.
Moreover, both were interdisciplinary researchers who contributed to computer science, psychology, philosophy, and statistics.
Herb Simon and Jan Zytkow were excellent role models that we should all aim to emulate.


24 The LaGramge Discovery System
Our approach to inductive process modeling builds on LaGramge (Todorovski & Dzeroski, 1997), a discovery system that:
- specifies a space of abstract numeric equations in terms of a context-free grammar;
- searches exhaustively through this space, to a given depth, to generate candidate abstract equations;
- calls on established optimization techniques to determine the parameters for each equation; and
- uses either squared error or minimum description length to select its final equations.
LaGramge has rediscovered an impressive class of differential and algebraic equations from noisy data.
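To make the idea of a grammar-defined equation space concrete, here is a small Python sketch of depth-limited exhaustive enumeration. It does not reproduce LaGramge's grammar format or search code. The toy grammar, the symbol names, and the expand function are all hypothetical, and "const" stands for a parameter that a later optimization step would fit.

```python
from itertools import product

# Toy context-free grammar: nonterminals map to lists of productions.
GRAMMAR = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["const", "*", "V"]],
    "V": [["x"], ["y"]],
}


def expand(symbol, grammar, depth):
    """Return every string derivable from symbol within the depth limit."""
    if symbol not in grammar:
        return {symbol}        # terminals expand to themselves
    if depth == 0:
        return set()           # depth budget spent on a nonterminal
    results = set()
    for production in grammar[symbol]:
        parts = [expand(s, grammar, depth - 1) for s in production]
        # product() yields nothing if any part has no expansion.
        for combo in product(*parts):
            results.add(" ".join(combo))
    return results


shallow = expand("E", GRAMMAR, 3)
deeper = expand("E", GRAMMAR, 4)
```

Raising the depth limit admits longer sums of terms, which is how a depth bound controls the size of the candidate space that has to be searched.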

25 Making Predictions with Process Models
To simulate a given process model's behavior over time, we can:
- specify initial values for input variables and time step size;
- on each time step, determine which processes are active;
- solve active algebraic/differential equations with known values;
- propagate values and recursively solve other active equations;
- when multiple processes influence the same variable, assume their effects are additive.
This performance element makes specific predictions that we can compare to observations.

26 A Method for Process Model Induction
We have implemented IPM, an algorithm that constructs process models from generic components in four stages:
1. Find all ways to instantiate known generic processes with specific variables;
2. Combine subsets of instantiated processes into generic models, each specifying an explanatory structure;
   2a. Ensure that each candidate consists of a connected graph;
   2b. Limit the maximum number of processes that can connect any two variables and the total number of processes;
3. Translate the candidate into a context-free grammar and invoke LaGramge to search for good parameter values;
4. Return the model with the least error produced by LaGramge.
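The final two stages, parameter search followed by selection on error, can be illustrated on the simplest generic process, exponential growth. The Python sketch below substitutes a plain grid search over candidate rates for LaGramge's optimizer, which the slides do not detail. The function names, the candidate grid, and the Euler simulation are assumptions made for illustration.

```python
def simulate_growth(rate, p0, dt, steps):
    """Euler-simulate d[P,t] = rate * P starting from p0."""
    values = [p0]
    for _ in range(steps):
        values.append(values[-1] + dt * rate * values[-1])
    return values


def squared_error(predicted, observed):
    """Sum of squared differences between two equal-length series."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def fit_rate(observed, dt, candidates):
    """Return the candidate rate whose simulation best matches the data."""
    p0 = observed[0]
    steps = len(observed) - 1
    return min(
        candidates,
        key=lambda r: squared_error(simulate_growth(r, p0, dt, steps), observed),
    )


# Synthetic observations from a known rate, then recover that rate.
observed = simulate_growth(0.1, 0.01, dt=1.0, steps=50)
candidates = [i / 100 for i in range(1, 31)]
best = fit_rate(observed, dt=1.0, candidates=candidates)
```

Scoring each candidate by simulating it and comparing against the observations is the same error measure a structure-level search would use to rank whole models.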

