Knowledge Modeling, use of information sources in the study of domains and inter-domain relationships - A Learning Paradigm by Sanjeev Thacker.


1 Knowledge Modeling, use of information sources in the study of domains and inter-domain relationships - A Learning Paradigm by Sanjeev Thacker

2 Introduction –Background –ADEPT –Problems –Contributions

3 Background Web is an ever-increasing source of information Information of interest to user is distributed across multiple heterogeneous sources Need for integration to provide a one point access for querying

4 ADEPT Besides querying, use the data sources to extract useful knowledge Provide an environment for studying domains Provide means to study and explore complex inter-domain relationships Ability to pose complex information requests across multiple domains

5 Problems Diverse and distributed sources Web sources are unlike databases –Unstructured or semi-structured –Inconsistencies and overlapping information Heterogeneities –Semantic –Structural –Syntactic

6 Problems Representation of complex relationships Use of Knowledge Model for complex information request capability with embedded semantic information

7 Contribution Knowledge Model Information Scape Model Learning Paradigm Visual Interfaces

8 Outline Knowledge Modeling Information Scapes Learning Paradigm Visual Interfaces Related Work Future Work Demo

9 Knowledge Modeling Approach to source modeling –Global model and source model –Source centric / query centric

10 Source Centric Advantages –Global model independent of source model –Modeling a source is independent of other sources –Dynamic addition, removal and modification of sources –Global view remains unaffected –No source mapping required during information integration –More suitable for sources other than database sources (e.g., web sources)

11 Knowledge Base Comprises –Ontologies (Domain model) –Resources –Relationships –Operations

12 Domain Hierarchy

13 Ontology Standardize meaning, description, representation of involved attributes Capture the semantics involved via domain characteristics Allow knowledge sharing and reuse Resolve resource model differences by mapping them to the global model of the ontology they represent Global interface

14 Ontology Description includes –Attributes –Domain Rules –Functional Dependencies

15 Resource Desirable characteristics: –Add, modify and delete resources for an ontology dynamically without affecting the system's knowledge –Specify the sources in a manner such that one can declaratively query them –Since the number of resources is large, there is a need to identify the exact usefulness of resources from the query viewpoint and prune the others

16 Resource Description includes –Attributes –Binding Patterns –Data Characteristics –Local Completeness

17 Relationships Simple relationships: –equals, less-than, like, is-a, is-part-of Are hierarchical or similarity based Complex relationships –“Earthquakes cause tsunamis”, “Nuclear explosions cause earthquakes”, “Air pollution affects vegetation”

18 Relationships Characteristics –Involves multiple ontologies –Requires understanding the semantics involved in their interaction –Cannot be expressed by simple relational and logical operators alone –Involves use of complex operations like functions and simulations

19 Relationship Example –“Nuclear explosion causes Earthquakes” NuclearTest Causes Earthquake: dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30 AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
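The relationship on this slide can be read as a Boolean predicate over a pair of events. Below is a minimal Python sketch of that reading, not ADEPT's implementation. The slide gives only the function names and thresholds, so several things here are assumptions: that dateDifference returns an absolute number of days, that distance is a great-circle distance in kilometres (computed with the haversine formula), and the `Event` class with its field names.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Event:
    """Hypothetical record holding the attributes used on the slide."""
    event_date: date
    latitude: float
    longitude: float


def date_difference(first: date, second: date) -> int:
    # Assumption: the slide's dateDifference is an absolute gap in days.
    return abs((second - first).days)


def distance(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    # Assumption: great-circle distance in kilometres (haversine formula).
    earth_radius_km = 6371.0
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


def nuclear_test_causes_earthquake(test: Event, quake: Event,
                                   max_days: int = 30,
                                   max_distance: float = 10000) -> bool:
    # Both conditions from the slide must hold (logical AND).
    return (
        date_difference(test.event_date, quake.event_date) < max_days
        and distance(test.latitude, test.longitude,
                     quake.latitude, quake.longitude) < max_distance
    )


# Made-up events for illustration only; not real observations.
test_event = Event(date(2000, 1, 1), 0.0, 0.0)
near_quake = Event(date(2000, 1, 15), 0.0, 1.0)
late_quake = Event(date(2000, 3, 1), 0.0, 1.0)
```

With these made-up values, `nuclear_test_causes_earthquake(test_event, near_quake)` holds, while `late_quake` fails the date condition.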

20 Operations Functions, Simulations Functions –user defined –used to model the semantics involved in the relationships –used in post processing of result data –examples: distance, dateDifference Simulations –independent programs –used for post processing of result data –example: Clarke urban growth model

21 Information Scape (IScape) Representation of an information request across multiple domains Can be deployed and executed Sources not explicitly specified as in a query System is aware of the sources and is able to identify the useful sources Semantic correlation across domains is embedded within the information request

22 Information Scape Definition –An IScape may be defined as an information request over distributed heterogeneous sources of information involving multiple ontologies and the relationships between them that contains meta-information constructed to facilitate the bridging of semantic relationships between individual sources.

23 Information Scape Ontologies Relationships Constraint –Conjunctive Boolean expression Runtime configurable constraint –Conceptually different Grouping and group constraint –Similar to the HAVING clause in SQL Projection list
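The components listed on this slide can be pictured as one record. The Python sketch below is an illustration only: the slides say IScapes are stored as XML but do not show the schema, so the class, its field names, the constraint strings and the `Earthquake.magnitude` attribute are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class IScape:
    """Hypothetical in-memory form of the IScape parts named on the slide."""
    ontologies: Tuple[str, ...]
    relationships: Tuple[str, ...]
    constraint: str = ""                       # conjunctive Boolean expression
    runtime_constraint: Optional[str] = None   # supplied when the IScape is executed
    group_by: Tuple[str, ...] = ()
    group_constraint: Optional[str] = None     # plays the role of SQL's HAVING
    projection: Tuple[str, ...] = ()

    def with_runtime_constraint(self, expression: str) -> "IScape":
        # Return a copy so the deployed IScape itself stays unchanged.
        return replace(self, runtime_constraint=expression)


# Illustrative request; attribute names are assumptions, not from the slides.
quakes_after_tests = IScape(
    ontologies=("NuclearTest", "Earthquake"),
    relationships=("NuclearTest Causes Earthquake",),
    constraint="Earthquake.magnitude > 5.8",
    projection=("NuclearTest.eventDate", "Earthquake.eventDate"),
)
```

Keeping the runtime constraint separate from the fixed constraint mirrors the slide's note that the two are conceptually different: one is part of the deployed request, the other is set per execution.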

24 Learning Paradigm Study of domain Use IScapes to study the domain interaction by using relationships Relationships could lead to transitive findings Explore the hypothetical relationships to validate and establish them or invalidate them

25 Learning Paradigm Data mining –Age and breast cancer Relationships –Nuclear Explosion causes Earthquakes Post processing –Functions –Simulations –Charting tool

26 Learning Paradigm Find the earliest recorded Nuclear test conducted Plot a graph of the average number of Earthquakes of magnitude greater than 5.8 per year starting from 1900 Find the average number of Earthquakes of magnitude greater than 5.8 between 1900-1949 and between 1950-present

27 Learning Paradigm Find the average number of Earthquakes of magnitude greater than 7 between 1900-1949 and between 1950-present Find pairs of Nuclear tests and Earthquakes that occurred within a certain radius and a certain time period of the explosion
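The averaging requests on these two slides amount to counting qualifying earthquakes in a year range and dividing by the number of years. A minimal Python sketch of that arithmetic follows. The function name, the `(year, magnitude)` tuple layout and the sample values are assumptions made for illustration; the slides do not show how ADEPT computes or stores these.

```python
from typing import Iterable, Tuple


def average_quakes_per_year(quakes: Iterable[Tuple[int, float]],
                            min_magnitude: float,
                            start_year: int,
                            end_year: int) -> float:
    """Average yearly count of quakes with magnitude > min_magnitude
    over the inclusive range start_year..end_year."""
    if end_year < start_year:
        raise ValueError("end_year must not precede start_year")
    count = sum(
        1
        for year, magnitude in quakes
        if start_year <= year <= end_year and magnitude > min_magnitude
    )
    return count / (end_year - start_year + 1)


# Made-up (year, magnitude) records for illustration only.
sample = [(1900, 6.0), (1901, 5.0), (1901, 7.5), (1950, 6.1)]
```

Calling the function once for 1900-1949 and once for 1950 up to the current year gives the two figures the slides ask to compare.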

28 Visual Interfaces Knowledge Builder IScape Builder Web Interface IScape Processing Monitor

29 Knowledge Builder GUI to build the knowledge base –fast and easy to use –Manually creating the knowledge could be arduous and error prone Knowledge is stored in the standard XML format Abstraction from the underlying format and other technical details

30 Knowledge Builder Assists in the creation, deletion and modification of the knowledge base Automatically creates a knowledge tree that assists in relating the knowledge in a better manner

31 Knowledge Builder

32 Knowledge Hierarchy

33 IScape Builder GUI to create, deploy and execute IScapes in a step-by-step manner IScape stored in XML format Abstracts the underlying structure from the user Validity checks implemented Integrated tools –the charting tool to plot charts with the result data

34 IScape Builder

35 Web Interface Web accessible –Knowledge Base –Existing IScapes Set the runtime configurable constraint Execute existing IScapes View the tabulated results Cannot create new IScapes

36 Web Interface Result Screen

37 IScape Processing Monitor Color coded log entries describing the IScape processing are generated –Brief message along with agent name –Time stamp –detailed description and associated data, if any –IScape plan for the existing sources –Intermediate results High level debugging tool –Understand execution, locate failures Not available with the web interface

38 Monitor GUI

39 Related Work State of the art –SIMS, TSIMMIS, Information Manifold, Observer, Infosleuth Mainly focused on one point access for querying of integrated data of a domain What makes ADEPT unique –Relationships, IScapes and the learning paradigm distinguish our system from prior work

40 Future Work Support rules of type “if-then” and use of induction learning to speed up the processing Recursive query capability required IScape over IScape support required Simulations currently supported as specialized functions in our framework Statistical analysis tools like SAS for time series analysis, logistic regression

