Pat Langley Computational Learning Laboratory Center for the Study of Language and Information Stanford University, Stanford, CA 94304


1 The Computational Discovery of Communicable Knowledge
Pat Langley, Computational Learning Laboratory, Center for the Study of Language and Information, Stanford University, Stanford, CA
Also affiliated with the DaimlerChrysler Research & Technology Center and the Institute for the Study of Learning and Expertise.

2 The Problem and the Potential
Our society is collecting increasing amounts of data in commercial and scientific domains. These include complex spatial/temporal data sets like:
- traces of traffic behavior from GPS and cell phones
- prices of stocks and currencies from exchanges
- measurements of climate and ecosystem variables
Computational techniques should let us find relations in these data that are useful for business and society.

3 Drawbacks of Current Approaches
The fields of machine learning and data mining have developed methods to find regularities in data. Despite many successful applications, most techniques:
- assume attribute-value representations that cannot handle time or space
- cannot tell interesting discoveries from mundane ones
- state the discovered knowledge in some opaque form
This indicates the need for alternative methods that can address these issues.

4 Paradigms for Machine Learning
- decision-tree induction
- case-based learning
- induction of logical rules
- probabilistic induction
- neural networks

5 Paradigms for Scientific Discovery
- taxonomy formation
- equation discovery
- qualitative law discovery
- process model formation
- structural model construction

6 Discovering Numeric Laws
Statement of the task:
- Given: Quantitative measurements about objects or events in the world.
- Find: Numeric relations that hold among variables that describe these items and that predict future behavior.
Historical examples:
- Kepler's three laws of planetary motion
- Archimedes' principle of displacement in water
- Black's law relating specific heat, mass, and temperature
- Proust's and Gay-Lussac's laws of definite proportions

7 BACON on Kepler's Third Law
[Table: moons A, B, C, D; columns $d$, $p$, $d/p$, $d^2/p$, $d^3/p^2$]
BACON carries out heuristic search through a space of numeric terms, looking for constant values and linear relations. This example shows the system's progression from primitive variables (distance and period of Jupiter's moons) to a complex term that has a nearly constant value.
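The progression on this slide can be illustrated with a minimal sketch. It is not BACON's actual implementation: it assumes only two of the system's heuristics (form a ratio when two terms rise together, form a product when one rises as the other falls), represents each term as a pair of exponents on d and p, and uses synthetic data generated from p = d^1.5 rather than real measurements of Jupiter's moons. All function names are hypothetical.

```python
def trend(xs, ys):
    """Return +1 if ys rises monotonically with xs, -1 if it falls, else 0."""
    ordered = [y for _, y in sorted(zip(xs, ys))]
    steps = [b - a for a, b in zip(ordered, ordered[1:])]
    if all(s > 0 for s in steps):
        return 1
    if all(s < 0 for s in steps):
        return -1
    return 0


def is_constant(values, tol=0.01):
    """True when every value lies within tol (relative) of the mean."""
    mean = sum(values) / len(values)
    return all(abs(v - mean) <= tol * abs(mean) for v in values)


def evaluate(exps, d, p):
    """Evaluate the term d**a * p**b for each observation."""
    a, b = exps
    return [di ** a * pi ** b for di, pi in zip(d, p)]


def bacon_kepler(d, p, max_terms=10):
    """Search for a constant term; terms are exponent pairs (a, b)."""
    terms = [(1, 0), (0, 1)]  # the primitive variables d and p
    while len(terms) < max_terms:
        newest = terms[-1]
        new_vals = evaluate(newest, d, p)
        if is_constant(new_vals):
            return newest, terms
        # Compare the newest term with earlier ones, most recent first.
        for older in reversed(terms[:-1]):
            direction = trend(evaluate(older, d, p), new_vals)
            if direction == 0:
                continue
            if direction > 0:   # rise together: consider the ratio
                candidate = (older[0] - newest[0], older[1] - newest[1])
            else:               # one rises, one falls: consider the product
                candidate = (older[0] + newest[0], older[1] + newest[1])
            if candidate != (0, 0) and candidate not in terms:
                terms.append(candidate)
                break
        else:
            return None, terms  # no new term could be defined
    return None, terms


# Synthetic observations (an assumption, not data from the slide).
d = [1.0, 1.6, 2.5, 4.4]
p = [x ** 1.5 for x in d]
law, trace = bacon_kepler(d, p)
```

On this synthetic data the search defines d/p, then d^2/p, then d^3/p^2, and stops because the last term is constant, mirroring the column order of the slide's table.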

8 Some Laws Discovered by BACON
Basic numeric relations:
- Ideal gas law: $PV = aNT + bN$
- Kepler's third law: $D^3 [(A - k) / t]^2 = j$
- Coulomb's law: $F D^2 / (Q_1 Q_2) = c$
- Ohm's law: $T D^2 / (LI - rI) = r$
Relations with intrinsic properties:
- Snell's law of refraction: $\sin I / \sin R = n_1 / n_2$
- Archimedes' law: $C = V + i$
- Momentum conservation: $m_1 V_1 = m_2 V_2$
- Black's specific heat law: $c_1 m_1 T_1 + c_2 m_2 T_2 = (c_1 m_1 + c_2 m_2) T_f$

9 Temporal Laws of Ecological Behavior (Todorovski & Dzeroski, 1997)
Input: a table of m time-stamped measurements with columns time, phyt, zoo, phosp, temp (rows 1 through m)
Input: a context-free grammar of domain constraints
Output: $\dot{\mathit{phyt}} = c_1 \, \mathit{phyt} \, \frac{\mathit{phosp}}{c_2 + \mathit{phosp}} - c_3 \, \mathit{phyt}$
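To illustrate how a context-free grammar can constrain the space of candidate equations, here is a minimal sketch. The grammar below is hypothetical and is not the one used by Todorovski and Dzeroski; it was chosen only so that the structure of the slide's output equation is derivable from it. The token `const` stands for a numeric constant that a real system would fit to the data, a step this sketch omits.

```python
from itertools import product

# Hypothetical grammar: an expression is one term or a difference of two
# terms; a term may include a saturating factor V / (const + V).
GRAMMAR = {
    "E": [["T"], ["T", "-", "T"]],
    "T": [["const", "*", "V"], ["const", "*", "V", "*", "M"]],
    "M": [["V", "/", "(", "const", "+", "V", ")"]],
    "V": [["phyt"], ["zoo"], ["phosp"], ["temp"]],
}


def expand(symbol):
    """Return every token sequence derivable from symbol."""
    if symbol not in GRAMMAR:
        return [[symbol]]  # terminal symbol
    results = []
    for production in GRAMMAR[symbol]:
        parts = [expand(s) for s in production]
        for combo in product(*parts):
            results.append([tok for part in combo for tok in part])
    return results


candidates = [" ".join(tokens) for tokens in expand("E")]
target = "const * phyt * phosp / ( const + phosp ) - const * phyt"
found = target in candidates
```

Because the grammar has no recursion, the candidate set is finite and can be enumerated exhaustively; a grammar with recursive productions would need a depth bound.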

10 Formulating Structural Models
Statement of the task:
- Given: Qualitative or numeric empirical laws that describe observed phenomena.
- Find: Explanatory models of these phenomena in terms of component objects and their relations.
Historical examples:
- Dalton's and Avogadro's molecular models of chemicals
- Mendel's genetic model of inherited traits
- Quark models of elementary particles
- Structural models of planets, comets, and stars

11 DALTON on Chemical Reactions
Initial state:
(reacts in {hydrogen oxygen} out {water})
(reacts in {hydrogen nitrogen} out {ammonia})
(reacts in {oxygen nitrogen} out {nitrous oxide})
...
Final state:
2 hydrogen + 1 oxygen → 2 water
3 hydrogen + 1 nitrogen → 2 ammonia
2 oxygen + 1 nitrogen → 2 nitrous oxide
hydrogen {h h}   water {h h o}
oxygen {o o}     ammonia {h h h n}
nitrogen {n n}   nitrous oxide {n o o}
...
DALTON finds these structural models through a depth-first search process constrained by conservation assumptions.
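A conservation-constrained depth-first search of this kind can be sketched as follows. This is a simplification, not DALTON's actual procedure: it assumes the reaction coefficients from the final state are already known, that each element's molecule consists of identical atoms, and that atoms must divide evenly among the product molecules. The names `REACTIONS`, `conserves`, `search`, and `compound_models` are hypothetical.

```python
# Each reaction: ({reactant: coefficient}, (product, coefficient)),
# taken from the final state shown on the slide.
REACTIONS = [
    ({"hydrogen": 2, "oxygen": 1}, ("water", 2)),
    ({"hydrogen": 3, "nitrogen": 1}, ("ammonia", 2)),
    ({"oxygen": 2, "nitrogen": 1}, ("nitrous oxide", 2)),
]
ELEMENTS = ["hydrogen", "oxygen", "nitrogen"]


def conserves(element, atoms):
    """Check that the element's atoms split evenly among product molecules."""
    return all(
        (inputs[element] * atoms) % out_count == 0
        for inputs, (_, out_count) in REACTIONS
        if element in inputs
    )


def search(assigned, remaining, max_atoms=4):
    """Depth-first search for atoms-per-molecule, smallest values first."""
    if not remaining:
        return assigned
    element, rest = remaining[0], remaining[1:]
    for atoms in range(1, max_atoms + 1):
        if conserves(element, atoms):  # prune non-conserving choices
            result = search({**assigned, element: atoms}, rest, max_atoms)
            if result is not None:
                return result
    return None


def compound_models(molecules):
    """Derive each compound's atoms from conservation (keys: first letter)."""
    return {
        name: {el[0]: coeff * molecules[el] // out_count
               for el, coeff in inputs.items()}
        for inputs, (name, out_count) in REACTIONS
    }


molecules = search({}, ELEMENTS)
models = compound_models(molecules)
```

With these assumptions the search rejects monatomic hydrogen, oxygen, and nitrogen and settles on two atoms per molecule for each, which yields the compound models listed on the slide. In this reduced setting the constraints decouple by element, so the search never has to backtrack across elements; a fuller version that also searched over reaction coefficients would.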

12 Constructing Process Models
Statement of the task:
- Given: Qualitative or numeric empirical laws that describe temporal phenomena.
- Find: Explanatory models of these phenomena in terms of processes among component objects.
Historical examples:
- Caloric and kinetic theories of heat phenomena
- Reaction pathways in chemistry and nucleosynthesis
- Models of continental drift and plate tectonics
- Process models of stellar evolution and destruction

13 ASTRA on Nucleosynthesis
Inputs:
- quantum properties for elements and isotopes
- conservation relations among these properties
- an element to be explained (e.g., O or C)
- elements to be assumed (e.g., H or He)
Outputs:
- elementary reactions that obey conservation laws
- reaction pathways that explain the element's evolution
ASTRA uses depth-first search to find reaction pathways for:
- proton and neutron captures
- neutron and deuteron production
- generation of helium (He) from hydrogen (H)
- generation of carbon (C) and oxygen (O)
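The first output, elementary reactions that obey conservation laws, can be sketched as a filter over candidate reactions. This is an illustrative assumption rather than ASTRA's code: it conserves only mass number and charge, whereas the slide refers to quantum properties more generally, and it considers only two-body fusions with a single product. The species table and function names are hypothetical.

```python
from itertools import combinations_with_replacement

# (mass number, charge) for a few particles and isotopes.
SPECIES = {
    "n": (1, 0), "H": (1, 1), "D": (2, 1),
    "3He": (3, 2), "4He": (4, 2), "6Li": (6, 3),
    "8Be": (8, 4), "12C": (12, 6), "16O": (16, 8),
}


def conserved(reactants, products):
    """True when mass number and charge balance across the reaction."""
    def total(names, index):
        return sum(SPECIES[name][index] for name in names)
    return all(total(reactants, i) == total(products, i) for i in (0, 1))


def elementary_reactions():
    """Enumerate two-body fusions X + Y -> Z that satisfy conservation."""
    names = sorted(SPECIES)
    found = []
    for a, b in combinations_with_replacement(names, 2):
        for c in names:
            if conserved([a, b], [c]):
                found.append(((a, b), (c,)))
    return found


reactions = elementary_reactions()
```

The resulting list is the kind of rule set a pathway search could then chain together.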

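14 Three Pathways for Carbon Synthesis
Standard pathway:
$^{4}\mathrm{He} + {}^{4}\mathrm{He} \rightarrow {}^{8}\mathrm{Be}$
$^{4}\mathrm{He} + {}^{8}\mathrm{Be} \rightarrow {}^{12}\mathrm{C}$
Alternative pathways:
$^{4}\mathrm{He} + \mathrm{D} \rightarrow {}^{6}\mathrm{Li}$
$^{3}\mathrm{He} + {}^{6}\mathrm{Li} \rightarrow {}^{9}\mathrm{Be}$
$^{4}\mathrm{He} + {}^{9}\mathrm{Be} \rightarrow {}^{12}\mathrm{C} + n$

$^{4}\mathrm{He} + \mathrm{D} \rightarrow {}^{6}\mathrm{Li}$
$^{4}\mathrm{He} + {}^{6}\mathrm{Li} \rightarrow {}^{10}\mathrm{Be}$
$^{4}\mathrm{He} + {}^{10}\mathrm{Be} \rightarrow {}^{12}\mathrm{C} + \mathrm{D}$
ASTRA generates many pathways novel to astrophysics, some of which have viable reaction rates.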
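A depth-first search for a pathway such as the standard one can be sketched as below. This is a toy under stated assumptions, not ASTRA's algorithm: the reaction list is fixed in advance (two steps of the standard pathway plus the shared first step of the alternatives), quantities and reaction rates are ignored, and a species counts as available once any reaction has produced it. The function name `find_pathway` is hypothetical.

```python
# Reactions as (reactants, products), written with plain-text isotope names.
REACTIONS = [
    (("4He", "D"), ("6Li",)),
    (("4He", "4He"), ("8Be",)),
    (("4He", "8Be"), ("12C",)),
]


def find_pathway(available, target, path=(), max_depth=5):
    """Depth-first search for a reaction sequence that yields target."""
    if target in available:
        return list(path)
    if len(path) >= max_depth:
        return None
    for reactants, products in REACTIONS:
        usable = set(reactants) <= available
        adds_something = not set(products) <= available
        if usable and adds_something:
            result = find_pathway(
                available | set(products),
                target,
                path + ((reactants, products),),
                max_depth,
            )
            if result is not None:
                return result
    return None


# Assume only helium-4 is given; deuterium is deliberately left out.
pathway = find_pathway(frozenset({"4He"}), "12C")
```

Starting from helium-4 alone, the search skips the lithium reaction because deuterium is unavailable and returns the two-step standard pathway. Depth-first search returns the first pathway it finds, so with more starting species it may include steps that are not strictly needed.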

15 Proposed Research
We plan to develop and evaluate discovery methods that:
- are designed to process temporal and structured data
- use techniques from computational scientific discovery
- describe new knowledge in a communicable form
Likely notations for the discovered knowledge include:
- structural models of relations among entities
- process models of change over time
- sets of simultaneous differential equations
We will apply our methods to domains that benefit from such communicable representations.

16 Benefits of the Approach
Unlike most previous work on data mining and knowledge discovery, our methods will:
- support discoveries in domains that involve complex spatial, temporal, or relational data
- use domain knowledge to filter only discoveries that are interesting and novel to the domain user
- present the new knowledge in some understandable notation that can be communicated among humans
Such techniques will improve the way we manipulate and understand complex data.

