Javier Sanchez Pat Langley Computational Learning Laboratory Center for the Study of Language and Information Stanford University, Stanford, California.


1 An Interactive Environment for Scientific Model Construction
Javier Sanchez and Pat Langley, Computational Learning Laboratory, Center for the Study of Language and Information, Stanford University, Stanford, California. http://cll.stanford.edu/ Thanks to N. Asgharbeygi, K. Arrigo, S. Bay, J. Fitzgerald, D. George, S. Klooster, C. Potter, K. Saito, and T. Shinar.

2 Lessons about Scientific Knowledge Discovery Our research collaborations in Earth science and microbiology have revealed some important lessons: 1. Scientists are more comfortable with their own notations than ones from machine learning and data mining. 2. Scientific data are often rare and difficult, indicating a need for additional constraints. 3. Scientists often have initial models and knowledge that should influence the discovery process. 4. Scientists typically want computational assistance rather than automated discovery systems. These observations suggest a need for alternative computational approaches to scientific model construction.

3 Data, Knowledge, and the Scientist
Diagram labels: Scientist; Initial model; Observations; Model Revision; Revised model.

Generic processes:
process exponential_growth
  variables: P {population}
  equations: d[P,t] = [0, 1, ∞] * P
process logistic_growth
  variables: P {population}
  equations: d[P,t] = [0, 1, ∞] * P * (1 - P / [0, 1, ∞])
process constant_inflow
  variables: I {inorganic_nutrient}
  equations: d[I,t] = [0, 1, ∞]
process consumption
  variables: P1 {population}, P2 {population}, nutrient_P2
  equations: d[P1,t] = [0, 1, ∞] * P1 * nutrient_P2,
             d[P2,t] = -[0, 1, ∞] * P1 * nutrient_P2
process no_saturation
  variables: P {number}, nutrient_P {number}
  equations: nutrient_P = P
process saturation
  variables: P {number}, nutrient_P {number}
  equations: nutrient_P = P / (P + [0, 1, ∞])

Specific model:
model AquaticEcosystem
  variables: nitro, phyto, zoo, nutrient_nitro, nutrient_phyto
  observables: nitro, phyto, zoo
process phyto_exponential_growth
  equations: d[phyto,t] = 0.1 * phyto
process zoo_logistic_growth
  equations: d[zoo,t] = 0.1 * zoo * (1 - zoo / 1.5)
process phyto_nitro_consumption
  equations: d[nitro,t] = -1 * phyto * nutrient_nitro,
             d[phyto,t] = 1 * phyto * nutrient_nitro
process phyto_nitro_no_saturation
  equations: nutrient_nitro = nitro
process zoo_phyto_consumption
  equations: d[phyto,t] = -1 * zoo * nutrient_phyto,
             d[zoo,t] = 1 * zoo * nutrient_phyto
process zoo_phyto_saturation
  equations: nutrient_phyto = phyto / (phyto + 0.5)

4 The PROMETHEUS Modeling Environment
PROMETHEUS is an interactive environment that lets its users: specify process models of static and dynamic systems; display and edit a model's structure and details graphically; utilize a model to simulate a system's behavior over time; incorporate background knowledge cast as generic processes; indicate which processes to consider during model revision; invoke a revision module that improves a model's fit to data. Our initial results focused on static models, but in this talk we illustrate the system's use on dynamic models.

5 A Process Model for an Aquatic Ecosystem
model AquaticEcosystem;
  variables phyto, zoo, nitro, residue;
  observables phyto, nitro;
process zoo_exponential_decay;
  equations d[zoo,t,1] = -0.251 * zoo;
            d[residue,t,1] = 0.251 * zoo;
process zoo_phyto_predation;
  equations d[zoo,t,1] = 0.615 * 0.495 * zoo;
            d[residue,t,1] = 0.385 * 0.495 * zoo;
            d[phyto,t,1] = -0.495 * zoo;
process nitro_uptake;
  conditions nitro > 1.25;
  equations d[phyto,t,1] = 0.411 * phyto;
            d[nitro,t,1] = -0.098 * 0.411 * phyto;
process nitro_remineralization;
  equations d[nitro,t,1] = 0.005 * residue;
            d[residue,t,1] = -0.005 * residue;

6 Advantages of Quantitative Process Models
Process models are a good target for modeling systems because: they embed quantitative relations within qualitative structure that refer to notations and mechanisms familiar to scientists; they support both algebraic and dynamical relationships; they offer causal and explanatory accounts of phenomena while retaining the modularity needed to support induction. Quantitative process models provide an important alternative to formalisms used currently in scientific modeling.

7 Viewing a Process Model Graphically

8 Simulation and Prediction in PROMETHEUS
To utilize a given process model, PROMETHEUS simulates its behavior over time or samples by: accepting initial values for input variables and a time step size; on each time step, determining which processes are active; solving active static/differential equations with known values; propagating values and solving other active equations; when multiple processes influence the same variable, assuming their effects are additive. This module makes specific predictions that the user can compare to observations.
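The simulation loop described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PROMETHEUS implementation: it uses forward Euler integration (the slides do not name an integration scheme), it encodes the aquatic ecosystem model from slide 5 with decay and consumption terms read as negative, and the names `PROCESSES`, `step`, and `simulate` are hypothetical. The model has no algebraic equations, so the propagation step is omitted.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]
# A process pairs an activation condition with one rate function per
# variable it influences (hypothetical representation).
Process = Tuple[Callable[[State], bool], Dict[str, Callable[[State], float]]]


def always(_: State) -> bool:
    """Condition for processes that are active on every time step."""
    return True


# Aquatic ecosystem processes from slide 5; the signs are an assumed reading.
PROCESSES: List[Process] = [
    # zoo_exponential_decay
    (always, {"zoo": lambda s: -0.251 * s["zoo"],
              "residue": lambda s: 0.251 * s["zoo"]}),
    # zoo_phyto_predation
    (always, {"zoo": lambda s: 0.615 * 0.495 * s["zoo"],
              "residue": lambda s: 0.385 * 0.495 * s["zoo"],
              "phyto": lambda s: -0.495 * s["zoo"]}),
    # nitro_uptake, active only while nitro > 1.25
    (lambda s: s["nitro"] > 1.25,
     {"phyto": lambda s: 0.411 * s["phyto"],
      "nitro": lambda s: -0.098 * 0.411 * s["phyto"]}),
    # nitro_remineralization
    (always, {"nitro": lambda s: 0.005 * s["residue"],
              "residue": lambda s: -0.005 * s["residue"]}),
]


def step(state: State, processes: List[Process], dt: float) -> State:
    """Advance one time step, summing the effects of all active processes."""
    derivs = {name: 0.0 for name in state}
    for condition, rates in processes:
        if condition(state):                 # which processes are active?
            for var, rate in rates.items():
                derivs[var] += rate(state)   # effects are additive
    return {name: value + dt * derivs[name] for name, value in state.items()}


def simulate(initial: State, processes: List[Process],
             dt: float, n_steps: int) -> List[State]:
    """Return the trajectory, starting with the initial state."""
    trajectory = [dict(initial)]
    for _ in range(n_steps):
        trajectory.append(step(trajectory[-1], processes, dt))
    return trajectory
```

A call such as `simulate({"phyto": 1.0, "zoo": 1.0, "nitro": 2.0, "residue": 0.0}, PROCESSES, dt=0.1, n_steps=100)` would produce a trajectory to compare against observations; these initial values are made up for illustration.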

9 Predictions from Aquatic Ecosystem Model

10 A User-Guided Method for Model Revision
The PROMETHEUS system revises a process model in five stages: 1. Specify all ways to alter the initial model, in terms of revising the parameters of processes, removing processes, and adding processes; 2. Find all ways to instantiate candidate additions with specific variables, subject to type constraints; 3. Generate candidate model structures by removing and adding the indicated processes, with limits on the total number of processes; 4. For each model structure, search for parameter values that provide a good fit to the data; 5. Return a list of the best N parameterized models, ranked by their mean squared error. The user can inspect these revisions and select the one they find most plausible.
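Stages 3 to 5 of this procedure can be illustrated with a toy one-variable system. The sketch below is not the PROMETHEUS revision module: the process library `LIBRARY`, the function names, the forward Euler simulator, and the exhaustive grid search over parameter values are all assumptions chosen for brevity (the slides do not say how parameter space is searched), and every parameter is refit rather than only those the user marked.

```python
import itertools
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical library of one-variable processes: name -> rate(x, k).
LIBRARY: Dict[str, Callable[[float, float], float]] = {
    "growth": lambda x, k: k * x,
    "decay": lambda x, k: -k * x,
    "inflow": lambda x, k: k,
}


def simulate(structure: Sequence[str], params: Sequence[float],
             x0: float, dt: float, n_steps: int) -> List[float]:
    """Forward Euler simulation; process effects are summed."""
    xs = [x0]
    for _ in range(n_steps):
        x = xs[-1]
        dx = sum(LIBRARY[p](x, k) for p, k in zip(structure, params))
        xs.append(x + dt * dx)
    return xs


def mse(pred: Sequence[float], obs: Sequence[float]) -> float:
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)


def candidate_structures(initial: Sequence[str], removable: Sequence[str],
                         addable: Sequence[str],
                         max_processes: int) -> List[Tuple[str, ...]]:
    """Stage 3: drop any subset of removable processes, add any subset of
    addable ones, and keep non-empty structures within the size limit."""
    fixed = [p for p in initial if p not in removable]
    found = set()
    for r in range(len(removable) + 1):
        for removed in itertools.combinations(removable, r):
            base = fixed + [p for p in removable if p not in removed]
            for a in range(len(addable) + 1):
                for added in itertools.combinations(addable, a):
                    s = tuple(sorted(set(base) | set(added)))
                    if 0 < len(s) <= max_processes:
                        found.add(s)
    return sorted(found)


def fit(structure: Sequence[str], obs: Sequence[float], dt: float,
        grid: Sequence[float]) -> Tuple[float, Tuple[float, ...]]:
    """Stage 4: exhaustive grid search for the lowest-error parameters."""
    best = None
    for params in itertools.product(grid, repeat=len(structure)):
        err = mse(simulate(structure, params, obs[0], dt, len(obs) - 1), obs)
        if best is None or err < best[0]:
            best = (err, params)
    return best


def revise(initial, removable, addable, obs, dt, grid,
           max_processes=2, top_n=3):
    """Stage 5: return the top_n (error, structure, params), best first."""
    results = []
    for s in candidate_structures(initial, removable, addable, max_processes):
        err, params = fit(s, obs, dt, grid)
        results.append((err, s, params))
    results.sort(key=lambda r: r[0])
    return results[:top_n]
```

Given data generated by pure decay and an initial model containing only `growth`, `revise(["growth"], ["growth"], ["decay", "inflow"], obs, 0.1, grid)` ranks a decay-only structure first, which the user could then inspect alongside the runners-up.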

11 Marking Processes to Revise or Remove

12 Indicating Processes to Consider Adding

13 Generic Processes as Background Knowledge
PROMETHEUS casts background knowledge about a domain as generic processes that specify: the variables involved in a process and their types; the parameters appearing in a process and their ranges; the forms of conditions on the process; and the forms of associated equations and their parameters. Generic processes are the building blocks that PROMETHEUS uses in its revision of specific process models.

14 Generic Processes for Aquatic Ecosystems
generic process exponential_decay;
  variables S{species}, D{detritus};
  parameters [0, 1];
  equations d[S,t,1] = 1 S;
            d[D,t,1] = S;
generic process remineralization;
  variables N{nutrient}, D{detritus};
  parameters [0, 1];
  equations d[N,t,1] = D;
            d[D,t,1] = 1 D;
generic process predation;
  variables S1{species}, S2{species}, D{detritus};
  parameters [0, 1], [0, 1];
  equations d[S1,t,1] = S1;
            d[D,t,1] = (1 ) S1;
            d[S2,t,1] = 1 S1;
generic process constant_inflow;
  variables N{nutrient};
  parameters [0, 1];
  equations d[N,t,1] = ;
generic process nutrient_uptake;
  variables S{species}, N{nutrient};
  parameters [0, ], [0, 1], [0, 1];
  conditions N > ;
  equations d[S,t,1] = S ;
            d[N,t,1] = 1 S;
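The typed variable slots of these generic processes are what keep instantiation tractable. As a rough sketch of that idea, the Python below enumerates every way to bind model variables to slots while respecting types. The slot lists come from the slide; the typing of phyto, zoo, nitro, and residue, the rule that one variable cannot fill two slots of the same process, and the names `GENERIC_PROCESSES`, `VARIABLE_TYPES`, and `instantiate` are assumptions made for illustration.

```python
import itertools
from typing import Dict, List, Tuple

# Variable slots and required types for each generic process (from the slide).
GENERIC_PROCESSES: Dict[str, List[Tuple[str, str]]] = {
    "exponential_decay": [("S", "species"), ("D", "detritus")],
    "remineralization": [("N", "nutrient"), ("D", "detritus")],
    "predation": [("S1", "species"), ("S2", "species"), ("D", "detritus")],
    "constant_inflow": [("N", "nutrient")],
    "nutrient_uptake": [("S", "species"), ("N", "nutrient")],
}

# Assumed types for the aquatic ecosystem variables.
VARIABLE_TYPES: Dict[str, str] = {
    "phyto": "species",
    "zoo": "species",
    "nitro": "nutrient",
    "residue": "detritus",
}


def instantiate(slots: List[Tuple[str, str]],
                variable_types: Dict[str, str]) -> List[Dict[str, str]]:
    """Return every slot -> variable binding that respects the slot types."""
    choices = [
        [var for var, vtype in variable_types.items() if vtype == slot_type]
        for _, slot_type in slots
    ]
    bindings = []
    for combo in itertools.product(*choices):
        if len(set(combo)) == len(combo):  # a variable fills at most one slot
            bindings.append({slot: var for (slot, _), var in zip(slots, combo)})
    return bindings
```

With four variables, untyped binding of the three predation slots would allow 24 ordered assignments; the type constraints cut this to two (zoo eats phyto, or phyto eats zoo).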

15 Specifying Data and Search Parameters

16 Inspecting Revised Process Models

17 Best Fit to Nitrate Data from Ross Sea

18 Best Fit to Phytoplankton Data from Ross Sea

19 Intellectual Influences
Our approach to scientific model construction incorporates ideas from many traditions: computational scientific discovery (e.g., Langley et al., 1983); theory revision in machine learning (e.g., Towell, 1991); qualitative physics and simulation (e.g., Forbus, 1984); languages for scientific simulation (e.g., STELLA, MATLAB); interactive tools for data analysis (e.g., Shneiderman, 2001). The PROMETHEUS environment combines insights from machine learning, AI, programming languages, and HCI in novel ways.

20 Specific Precursors
A few earlier systems support the interactive discovery of scientific models, including: Valdes-Perez's (1995) MECHEM [chemistry]; Rickel and Porter's (1997) TRIPEL [biology]; Sleeman et al.'s (1997) DAVICCAND [metallurgy]; Mahidadia and Compton's (2001) JUSTAID [endocrinology]. PROMETHEUS adapts their ideas to scientific domains that involve quantitative explanatory models, such as Earth science.

21 Directions for Future Research
Despite our progress to date, we need further work in order to: produce additional results on other scientific data sets; develop more robust methods for fitting model parameters; implement interactive methods for searching the model space; introduce models with subsystems to handle complexity; and carry out usability studies with the modeling environment. Interactive environments for model construction and revision have great potential to speed progress in science and engineering.

22 Contributions of the Research
In summary, our work on the PROMETHEUS system has led to: a new formalism for representing scientific process models; a graphical interface for displaying and editing these models; a computational method for simulating these models' behavior; an encoding for background knowledge as generic processes; an interactive method for revising process models given data. We have demonstrated this approach to model construction and revision on Earth science problems with encouraging results.

23 Hierarchical Model of a Power Grid

24 End of Presentation

25 Steps in Applying Computational Scientific Discovery
Diagram labels: problem formulation; representation engineering; data collection/manipulation; algorithm manipulation; filtering and interpretation; algorithm invocation.

26 A Process Model for Carbon Production
model npp;
  variables NPPc, E, IPAR, T1, T2, W, Topt, tempc, eet, PET, PETTWM, ahi, A, FPARFAS, monthlySolar, SolConver, MONFASNDVI, umd_veg;
  observable ahi, eet, tempc, Topt, MONFASNDVI, monthlySolar, PETTWM, umd_veg;
process CarbonProd;
  equations NPPc = E * IPAR;
process PhotoEfficiency;
  equations E = (0.389 * (T1 * (T2 * W)));
process TempStress1;
  equations T1 = (0.8 + ((0.02 * Topt) - (0.0005 * (Topt ^ 2))));
process TempStress2;
  equations T2 = ((1.1814 / (1 + (2.718281828 ^ (0.2 * (Topt - 10 - tempc))))) / (1 + (2.718281828 ^ (0.3 * (tempc - 10 - Topt)))));
process WaterStress;
  conditions PET!=0;
  equations W = (0.5 + (0.5 * (eet / PET)));
process WSNoEvapoTrans;
  conditions PET==0;
  equations W = 0.5;
process EvapoTrans;
  conditions tempc>0;
  equations PET = 1.6 * (10 * tempc / ahi) ^ A * PETTWM;
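For readers who want to evaluate these equations numerically, the processes above translate almost directly into Python. The following is a sketch rather than the authors' code: the function names are hypothetical, `math.exp` replaces the literal constant 2.718281828, and `evapo_trans` returns 0 when `tempc` is not positive, which follows the CASA definition given later in the talk rather than anything stated in this model (where no process sets PET in that case).

```python
import math


def temp_stress1(topt: float) -> float:
    """TempStress1: T1 as a quadratic in the optimal temperature."""
    return 0.8 + 0.02 * topt - 0.0005 * topt ** 2


def temp_stress2(topt: float, tempc: float) -> float:
    """TempStress2: product of two logistic terms around Topt."""
    rising = 1 + math.exp(0.2 * (topt - 10 - tempc))
    falling = 1 + math.exp(0.3 * (tempc - 10 - topt))
    return (1.1814 / rising) / falling


def evapo_trans(tempc: float, ahi: float, a: float, pettwm: float) -> float:
    """EvapoTrans: PET when tempc > 0; assumed to be 0 otherwise."""
    if tempc > 0:
        return 1.6 * (10 * tempc / ahi) ** a * pettwm
    return 0.0


def water_stress(eet: float, pet: float) -> float:
    """WaterStress when PET != 0, WSNoEvapoTrans when PET == 0."""
    if pet == 0:
        return 0.5
    return 0.5 + 0.5 * (eet / pet)


def photo_efficiency(t1: float, t2: float, w: float) -> float:
    """PhotoEfficiency: E = 0.389 * T1 * T2 * W."""
    return 0.389 * t1 * t2 * w


def carbon_prod(e: float, ipar: float) -> float:
    """CarbonProd: NPPc = E * IPAR."""
    return e * ipar
```

Writing the two water-stress processes as one function with a branch mirrors how their mutually exclusive conditions (`PET!=0` and `PET==0`) select exactly one active process.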

27 Viewing a Process Model Graphically

28 Results of Revising the NPP Model
Initial model:
  E = 0.56 · T1 · T2 · W
  T2 = 1.18 / [(1 + e^(0.2 · (Topt – Tempc – 10))) · (1 + e^(0.3 · (Tempc – Topt – 10)))]
  PET = 1.6 · (10 · Tempc / AHI)^A · PET-TW-M
  SR {3.06, 4.35, 4.35, 4.05, 5.09, 3.06, 4.05, 4.05, 4.05, 5.09, 4.05}
  RMSE on training data = 465.212 and r^2 = 0.799
Revised model:
  E = 0.353 · T1^0.00 · T2^0.08 · W^0.00
  T2 = 0.83 / [(1 + e^(1.0 · (Topt – Tempc – 6.34))) · (1 + e^(1.0 · (Tempc – Topt – 11.52)))]
  PET = 1.6 · (10 · Tempc / AHI)^A · PET-TW-M
  SR {0.61, 3.99, 2.44, 10.0, 2.21, 2.13, 2.04, 0.43, 1.35, 1.85, 1.61}
  Cross-validated RMSE = 397.306 and r^2 = 0.853 [15% reduction]
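The quoted 15% figure can be checked from the two RMSE values on the slide. The short calculation below assumes the reduction is measured relative to the initial model's RMSE; the function name `relative_reduction` is hypothetical. Note that, as on the slide, this compares a training-set RMSE with a cross-validated one.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


initial_rmse = 465.212   # initial model, training data
revised_rmse = 397.306   # revised model, cross-validated

reduction_percent = 100 * relative_reduction(initial_rmse, revised_rmse)
print(f"{reduction_percent:.1f}% reduction")
```

The result is about 14.6%, which the slide rounds to 15%.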

29 The Challenge of Systems Science
Disciplines like Earth science and computational biology differ from traditional fields in that they: focus on synthesis rather than analysis in their operation; rely on computer modeling as one of their central methods; develop system-level models with many variables and relations; evaluate their models on observational, not experimental, data. Developing and testing such models are complex tasks that would benefit from computational aids. Our research goal is to design, construct, evaluate, and understand such computational tools for systems science.

30 Fields Contributing to the Proposed Research
Diagram labels: computational scientific discovery; qualitative reasoning; simulation languages, numerical analysis; human-computer interaction; biology, physiology, Earth science.

31 Issues in Process Model Induction
Inductive process modeling raises a number of issues that have clear analogues in other paradigms: identifying conditions on processes (parameter optimization); inferring initial values of unobservables (parameter optimization); keeping the search space tractable (typing on variables); reducing variance to mitigate overfitting (minimum description length). We have demonstrated promising responses to these four problems within the IPM framework.

32 Best Model Fit to Data from Ross Sea

33 Best Model Fit to Data on Protozoan Predation

34 Collecting Data on Photosynthetic Processes
Diagram labels: Continuous Culture (Chemostat); External stimuli (e.g., light); Adaptation Period; Equilibrium Period; Sampling; mRNA/cDNA; Microarray Trace; Health of Culture; Time. Image sources: /wwwscience.murdoch.edu.au/teach www.affymetrix.com/

35 Gene Expressions for Cyanobacteria

36 Generic Processes for Photosynthesis Regulation
generic process translation
  variables: P{protein}, M{mRNA}
  parameters: [0, 1]
  equations: d[P,t,1] = M
generic process transcription
  variables: M{mRNA}, R{rate}
  parameters:
  equations: d[M,t,1] = R
generic process regulate_one
  variables: R{rate}, S{signal}
  parameters: [-1, 1]
  equations: R = S
generic process regulate_two
  variables: R{rate}, S{signal}
  parameters: [-1, 1], [0, 1]
  equations: R = S
             d[S,t,1] = 1 S
generic process automatic_degradation
  variables: C{concentration}
  conditions: C > 0
  parameters: [0, 1]
  equations: d[C,t,1] = 1 C
generic process controlled_degradation
  variables: D{concentration}, E{concentration}
  conditions: D > 0, E > 0
  parameters: [0, 1]
  equations: d[D,t,1] = 1 E
             d[E,t,1] = 1 E
generic process photosynthesis
  variables: L{light}, P{protein}, R{redox}, S{ROS}
  parameters: [0, 1], [0, 1]
  equations: d[R,t,1] = L P
             d[S,t,1] = L P

37 A Process Model for Photosynthetic Regulation
model photo_regulation
  variables: light, mRNA, protein, ROS, redox, transcription_rate
  observables: light, mRNA
process photosynthesis
  equations: d[redox,t,1] = 0.0155 * light * protein
             d[ROS,t,1] = 0.019 * light * protein
process protein_translation
  equations: d[protein,t,1] = 7.54 * mRNA
process mRNA_transcription
  equations: d[mRNA,t,1] = transcription_rate
process regulate_one_1
  equations: transcription_rate = 0.99 * light
process regulate_two_2
  equations: transcription_rate = 1.203 * redox
             d[redox,t,1] = 0.0002 * redox
process automatic_degradation_1
  conditions: protein > 0
  equations: d[protein,t,1] = -1.91 * protein
process controlled_degradation_1
  conditions: redox > 0, ROS > 0
  equations: d[redox,t,1] = -0.0003 * ROS
             d[ROS,t,1] = -0.0003 * ROS

38 Predictions from Best Parameterized Model

39 Electric Power on the International Space Station

40 Telemetry Data from Space Station Batteries

41 Induced Process Model for Battery Behavior
model Battery
  variables: Rs, Vcb, soc, Vt, i, temperature
  observable: soc, Vt, i, temperature
process voltage_charge
  conditions: i ≥ 0
  equations: Vt = Vcb + 6.105 Rs i
process voltage_discharge
  conditions: i < 0
  equations: Vt = Vcb 1.0 / (Rs + 1.0)
process charge_transfer
  equations: d[soc,t,1] = i Vcb/179.38
process quadratic_influence_Vcb_soc
  equations: Vcb = 41.32 soc soc
process linear_influence_Vcb_temp
  equations: Vcb = 0.2592 temperature
process linear_influence_Rs_soc
  equations: Rs = 0.03894 soc

42 Results on Battery Test Data

43 Hierarchical Model of a Power Grid

44 Challenge 5: Interfacing with Scientists
Because scientists do not want to be replaced, we are developing an interactive environment that lets users: specify a quantitative process model of the target system; display and edit the model's structure and details graphically; simulate the model's behavior over time and situations; compare the model's predicted behavior to observations; invoke a revision module in response to detected anomalies. The environment offers computational assistance in forming and evaluating models but lets the user retain control.

45 In Memoriam
Two years ago, computational scientific discovery lost two of its founding fathers: Herbert A. Simon (1916 – 2001) and Jan M. Zytkow (1945 – 2001). Both contributed to the field in many ways: posing new problems, inventing methods, training students, and organizing meetings. Moreover, both were interdisciplinary researchers who contributed to computer science, psychology, philosophy, and statistics. Herb Simon and Jan Zytkow were excellent role models that we should all aim to emulate.

46 Data Mining vs. Scientific Discovery
There exist two computational paradigms for discovering explicit knowledge from data: Data mining generates knowledge cast as decision trees, logical rules, or other notations invented by AI researchers; Computational scientific discovery instead uses equations, structural models, reaction pathways, or other formalisms invented by scientists and engineers. Both approaches draw on heuristic search to find regularities in data, but they differ considerably in their emphases.

47 Time Line for Research on Computational Scientific Discovery
Timeline figure covering 1979 to 2000. Legend: numeric laws; qualitative laws; structural models; process models. Systems shown: Bacon.1–Bacon.5, Abacus, Coper, Fahrenheit, E*, Tetrad, IDS N, Hume, ARC, DST, GP N, LaGrange, SDS, SSF, RF5, LaGramge, Dalton, Stahl, RL, Progol, Gell-Mann, BR-3, Mendel, Pauli, Stahlp, Revolver, Dendral, AM, Glauber, NGlauber, IDS Q, Live, IE, Coast, Phineas, AbE, Kekada, Mechem, CDP, Astra, GP M, HR, BR-4.

48 Why Are Process Models Interesting?
Process models are a crucial target for machine learning because: they incorporate scientific formalisms rather than AI notations that are easily communicable to scientists and engineers; they move beyond descriptive generalization to explanation while retaining the modularity needed to support induction. These reasons point to process models as an ideal representation for scientific and engineering knowledge. Process models are an important alternative to formalisms used currently in machine learning.

49 Challenges of Inductive Process Modeling
Process model induction differs from typical learning tasks in that: process models characterize behavior of dynamical systems; variables are mainly continuous and data are unsupervised; observations are not independently and identically distributed; process models contain unobservable processes and variables; multiple processes can interact to produce complex behavior. Compensating factors include a focus on deterministic systems and the availability of background knowledge.

50 Making Predictions with Process Models
1. Specify initial values for input variables and the size for time steps; 2. On each time step, check conditions to decide which processes are active; 3. Solve algebraic and differential equations with known values; 4. Propagate values and recurse to solve other equations; 5. Add the effects of different processes on each variable.

51 Predictions from IPM's Induced Model

52 Inductive Process Modeling
Training data: observed values for a set of continuous variables as they vary over time or situations. Background knowledge: generic processes that characterize causal relationships among variables in terms of conditional equations. Induction yields the learned model: a specific process model that explains the observed values and predicts future data accurately.

53 Inductive Process Modeling as Search
To construct a quantitative process model, we need an algorithm to search the space of models that assumes: an initial state from which to start search; some operators that generate new states; an evaluation function that selects among states; an overall control regime for the search; and a halting criterion for ending the search. We have implemented a four-stage method that takes positions on these design decisions.

54 The IPM Method for Process Model Induction
1. Find all ways to instantiate known generic processes with specific variables; 2. Combine subsets of instantiated processes into generic models; 3. Remove candidates that are too complex or not connected graphs; 4. For each generic model, search for good parameter values; 5. Return the parameterized model with the smallest error.
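The structure-generation part of this method (combining subsets of instantiated processes, then discarding candidates that are too complex or disconnected) can be sketched as follows. The slide does not define "connected", so this sketch assumes a model is connected when its processes are linked through shared variables into a single component, and it measures complexity by the number of processes. The process names in `INSTANTIATED` and the function names are hypothetical; parameter fitting and final selection are left out.

```python
import itertools
from typing import Dict, List, Set, Tuple

# Hypothetical instantiated processes: name -> variables the process mentions.
INSTANTIATED: Dict[str, Set[str]] = {
    "zoo_decay": {"zoo", "residue"},
    "zoo_phyto_predation": {"zoo", "phyto", "residue"},
    "nitro_uptake": {"phyto", "nitro"},
    "nitro_inflow": {"nitro"},
}


def is_connected(process_vars: List[Set[str]]) -> bool:
    """True if the processes are linked by shared variables into one component."""
    if not process_vars:
        return False
    reached = set(process_vars[0])
    remaining = list(process_vars[1:])
    changed = True
    while remaining and changed:
        changed = False
        for variables in list(remaining):
            if reached & variables:      # shares a variable with the component
                reached |= variables
                remaining.remove(variables)
                changed = True
    return not remaining


def candidate_structures(instantiated: Dict[str, Set[str]],
                         max_processes: int) -> List[Tuple[str, ...]]:
    """Enumerate process subsets up to max_processes, keeping connected ones."""
    names = sorted(instantiated)
    kept = []
    for size in range(1, max_processes + 1):
        for subset in itertools.combinations(names, size):
            if is_connected([instantiated[name] for name in subset]):
                kept.append(subset)
    return kept
```

With these four processes and a limit of two per model, ten subsets are enumerated and seven survive; for example, `nitro_inflow` paired with `zoo_decay` is dropped because the two share no variable.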

55 A Biologist's Depiction of Photosynthesis
http://www.bio.ic.ac.uk/research/barber/photosystemII.html

56 Predictions from Best Parameterized Model

57 The NPPc Portion of CASA
NPPc = Σ_month max (E · IPAR, 0)
E = 0.56 · T1 · T2 · W
T1 = 0.8 + 0.02 · Topt – 0.0005 · Topt^2
T2 = 1.18 / [(1 + e^(0.2 · (Topt – Tempc – 10))) · (1 + e^(0.3 · (Tempc – Topt – 10)))]
W = 0.5 + 0.5 · EET / PET
PET = 1.6 · (10 · Tempc / AHI)^A · PET-TW-M if Tempc > 0
PET = 0 if Tempc < 0
A = 0.00000068 · AHI^3 – 0.000077 · AHI^2 + 0.018 · AHI + 0.49
IPAR = 0.5 · FPAR-FAS · Monthly-Solar · Sol-Conver
FPAR-FAS = min [(SR-FAS – 1.08) / SR (UMD-VEG), 0.95]
SR-FAS = (Mon-FAS-NDVI + 1000) / (Mon-FAS-NDVI – 1000)

58 A Process Model for an Aquatic Ecosystem
model AquaticEcosystem;
  variables phyto, zoo, nitro, residue;
  observables phyto, nitro;
process phyto_exponential_decay;
  equations d[phyto,t,1] = -0.307 * phyto;
            d[residue,t,1] = 0.307 * phyto;
process zoo_exponential_decay;
  equations d[zoo,t,1] = -0.251 * zoo;
            d[residue,t,1] = 0.251 * zoo;
process zoo_phyto_predation;
  equations d[zoo,t,1] = 0.615 * 0.495 * zoo;
            d[residue,t,1] = 0.385 * 0.495 * zoo;
            d[phyto,t,1] = -0.495 * zoo;
process nitro_uptake;
  conditions nitro > 1.25;
  equations d[phyto,t,1] = 0.411 * phyto;
            d[nitro,t,1] = -0.098 * 0.411 * phyto;
process nitro_remineralization;
  equations d[nitro,t,1] = 0.005 * residue;
            d[residue,t,1] = -0.005 * residue;

