
2nd Semantic Web Mining Workshop at ECML/PKDD-2002, August 2002, Helsinki, Finland Data Fusion and Semantic Web: Meta-Models of Distributed Data and Decision.




1 Data Fusion and Semantic Web: Meta-Models of Distributed Data and Decision Fusion. Project Report
2nd Semantic Web Mining Workshop at ECML/PKDD-2002, August 2002, Helsinki, Finland
Vladimir Gorodetski, Oleg Karsaev, Vladimir Samoilov
Intelligent System Laboratory of the St. Petersburg Institute for Informatics and Automation
E-mail: {gor, ok, samovl}@mail.iias.spb.su
http://space.iias.spb.su/ai/english/gorodetski.htm

2 Title of the Project
“Autonomous Information Collection, Knowledge Discovery Techniques and Software Tool Prototype for Knowledge-Based Data Fusion”
Project from European Office of Aerospace Research and Development (EOARD), AFRL/IF (USA) (December 2000 - December 2003)

3 Outline of the Project Presentation
1. Outline of the Data and Information Fusion problems
2. Project research objectives
3. Examples of case studies and applications used
4. Ontology-centered meta-model of data sources
5. Meta-model of decision fusion
6. Multi-agent architecture
7. Conclusion

4 Tasks and Applications of Data and Information Fusion
Application fields: critical areas of human society: security, life support, security of critical state infrastructures, large-scale logistics, natural and man-made disasters, etc.
Examples of applications:
- Assessment and prediction of situations,
- Resource management and rescue operation planning in large-scale natural and man-made disasters,
- Decision making and planning of rescue operations in systems like US 911,
- Situational awareness and prediction of terrorist intents, and anti-terrorist activity planning,
- Military situation assessment,
- Safeguarding of critical plants like nuclear power stations, electrical power grids, etc.

5 Information Fusion: Definition
“…data fusion is a formal framework in which means and tools for the alliance of data originating from different sources are expressed. It aims at obtaining information of greater quality; the exact definition of ‘greater quality’ will depend on the application” (JDL, Joint Directors of Laboratories model, USAF)
[Figure: levels of the fusion model (Erik Blasch, Fusion-2002, July 2002, Annapolis, USA). Inputs: Sensor 1, Sensor 2, ..., Sensor N; distributed data sources; distributed information sources.]
- Level 0: Pre-processing of sensor data
- Level 1: Object assessment
- Level 2: Situation assessment
- Level 3: Impact assessment
- Level 4: Process refinement
- Level 5: User refinement
Other components: Data Base Management System (Support DB, Fusion DB); human-computer interface; sensor management, resource management.
Areas of the current and future research projects are highlighted in yellow on the slide.

6 Project Research Objectives
Development of a DF software tool supporting the design (first of all, learning!) and implementation of a broad spectrum of DF applications, in particular providing support for:
- Development of ontology-based meta-models of data sources, a meta-model of decision fusion and a conceptual model of the DF software tool,
- Development of a multi-agent architecture, and
- Design and implementation of a broad spectrum of applications.

7 Examples of Case Studies and Applications Used in the Project
Case studies:
- KDD Cup99 dataset: preprocessed relational data specifying an intrusion detection task, http://kdd.ics.uci.edu/databases/kddcup99.html
- Landsat Multi-Spectral Scanner image dataset, http://www.dfc-grss.org/data/grss_dfc_0010.zip
- STULONG dataset: Longitudinal Study of Atherosclerosis Risk Factors, http://euromise.vse.cz/challenge/en/projekt/index.php
Application:
- Application to be used in debugging and validation of MAS DK-DF: intrusion detection learning system (project also funded by EOARD/AFRL)

8 Subtasks of the Project Matching the Semantic Web Mining Area
1. Design and implementation of a meta-model of data sources, required by the heterogeneity and distribution of the data to be fused.
2. Design and implementation of a meta-model of distributed learning.

9 Multiplicity of Data Sources Presenting User’s Activity in Intrusion Detection System

10 Interrelation of Semantic Web and Ontology-oriented Research within the Project
Semantic Web research considers the development and standardization of ontology specification languages (XML, RDF, DAML+OIL), ontology-based query languages, ontology editors, etc.
Semantic Web Mining considers specific problems of ontology design technology for (Web-based) Data Mining systems.
Any DF system technology presupposes (Web-based) distributed Data Mining and KDD, which is why it is a sub-area of Semantic Web Mining.
Ontology-based Data and Information Fusion system design poses a number of specific technological problems. The most important of them is a technology for the distributed design of a distributed ontology.

11 What Is Distributed Design of a Distributed Ontology?
[Figure: Data Sources Meta-model. Each data source (sensor) has a Data Source Manager and a data source management agent; the meta-data manager (“KDD Master” agent) holds the ontology-based meta-model of data sources.]
Meta-model = Ontology + data source models at the meta-level supporting a unified view of the data of particular sources

12 Tower of DF Application Ontology Components
[Figure: ontology components as labeled on the slide.]
- DF system ontology
- DF problem ontology
- Shared component of application ontology
- Private components of application ontology of data source 1, data source 2, ..., data source k

13 Distributed Ontology and Protocols for Distributed Ontology Design
[Figure: agents 1, 2, 3, ..., k connected through protocols and functions.]
- Meta-level: “KDD Master” agent and meta-level KDD agent, with the problem and shared components of application ontology.
- Data sources 1, 2, 3, ..., k: each has a DS management agent and a KDD agent of the source, together with the shared component of application ontology and a private component of application ontology.

14 Particular Tasks to Be Solved on the Basis of the Meta-model of Data Sources
- Providing a monosemantic (unambiguous) understanding of the terminology used in data specification by distributed analysts;
- Solving the entity identification problem;
- Providing consistency of data representation (in case the same attributes are represented differently in different data sources);
- Providing a gateway between the ontology and the distributed databases, making their interaction possible;
- Several other tasks.

15 Meta-model of Data Sources: Ontology + Protocols => Monosemantic Understanding of Terminology
Monosemantic understanding of terminology among DF system components is provided by a shared vocabulary that the distributed entities of the DF system use for communication. This excludes different names for the same entities and their properties in different sources, as well as identical names for different entities in different data sources, thus providing integrity and consistency of the shared vocabulary.
Protocols support the distributed collaborative design of a coherent ontology by distributed analysts.
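The two naming problems that a shared vocabulary is meant to exclude can be illustrated with a small check. This is only a sketch under assumptions: the source names, local attribute names and shared concept names below are hypothetical, and the slides do not describe how the project detects such conflicts.

```python
from collections import defaultdict

def find_naming_conflicts(source_terms):
    """source_terms maps source -> {local_name: shared_concept}.

    Returns (synonyms, homonyms):
      synonyms: shared concept -> different local names used for it
      homonyms: local name -> different shared concepts it denotes
    """
    names_by_concept = defaultdict(set)
    concepts_by_name = defaultdict(set)
    for terms in source_terms.values():
        for local_name, concept in terms.items():
            names_by_concept[concept].add(local_name)
            concepts_by_name[local_name].add(concept)
    synonyms = {c: sorted(n) for c, n in names_by_concept.items() if len(n) > 1}
    homonyms = {n: sorted(c) for n, c in concepts_by_name.items() if len(c) > 1}
    return synonyms, homonyms

# Hypothetical vocabularies of two data sources.
sources = {
    "DS1": {"src_ip": "SourceAddress", "duration": "ConnectionDuration"},
    "DS2": {"source_addr": "SourceAddress", "duration": "SessionLength"},
}
synonyms, homonyms = find_naming_conflicts(sources)
print(synonyms)  # same entity, different names
print(homonyms)  # same name, different entities
```

Both dictionaries are empty when every source uses the shared vocabulary consistently.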

16 Example of Application Ontology: High-level Part of Intrusion Detection Domain Ontology
[Figure: hierarchy of notions connected by "part of" and "subclass of" relationships; abbreviations as labeled on the slide.]
- Top-level notions: Network attack (A); Reconnaissance (R); Implantation and threat realization (I).
- Upper-level notions: Collection of Information (CI); Identification of hosts (IH); Identification of services (IS); Identification of OS (IO); Resource Enumeration (RE); Users and Groups Enumeration (UE); Applications and Banners Enumeration (ABE); Getting Access to Resources (GAR); Escalating Privilege (ER); Gaining Additional Data (GAD); Threat Realization (TR); Covering Tracks (CT); Creating Back Doors (CBD).
- Threat realization notions: Confidentiality destruction (CD); Integrity destruction (ID); Denial of Service (DOS).
- Notions of lower levels and of the micro-layer: Network Ping Sweeps and Port Scanning (slide labels: DC, SPIH); TCP connect scan (ST); TCP SYN scan (SS); TCP FIN scan (SF); TCP Xmas Tree scan (SX); TCP Null scan (SN); Half scan (HS); UDP scan (SU); Proxy scanning (PS); Dumb host scan (DHS); Scanning 'FTP Bounce' (SFB).

17 The Simplest ("top-down") Meta-protocol for Collaborative Ontology Design
Participants: application domain expert; meta-data description agent; local source experts of sources 1, ..., N; data preparation agents of sources 1, ..., N.
Steps shown on the slide:
1. Forming the basic variant of the ontology
2. Sending the basic variant
3. Analysis of the suggested basic variant
4. Modifying and expanding the ontology
5. Synchronization of modifications by the basic protocol
Steps 4 and 5 appear repeatedly on the slide.

18 Ontology Synchronization Protocol Represented in Terms of UML Sequence Diagram
Legend:
1. Local source expert
2. Local source data managing agent
3. Local source ontology
4. Local source: buffer of temporary changes
5. KDD master (meta-data description agent)
6. Shared ontology
7. Meta-level agent: buffer of temporary changes
8. Application expert (meta-level)
9. Local source determining the modified ontology part

19 Meta-model of Data Sources: Entity Identification Problem
Explanation of the entity identification problem (case numbers for which each data source holds attributes):
- Data Source 1: cases 1, 3, 4, 7, 9, 11, 15, 19
- Data Source 2: cases 1, 4, 5, 9, 11, 12, 14, 15, 17, 19
- Data Source 3: cases 1, 2, 4, 8, 9, 11, 14, 15

20 Demonstration of Entity Identification Problem: Intrusion Detection Application

21 A Technique for the Entity Identification Problem
In the DF problem ontology, the notion of an entity identifier ("ID entity") is introduced for each instance of an object to be classified. This entity identifier plays the role of the primary key of the instance (by analogy with the primary key of a table).
For each such identifier, a rule is defined as a component of the shared part of the application ontology; it is used to calculate the value of the instance key. A rule is a function whose arguments are chosen from the set of attributes of this entity.
For each local data source, a rule is defined that uniquely connects the entity identifier and the local primary key in this source. This rule specifies:
- how to derive the local primary key of an instance from the entity identifier value;
- how to derive the entity identifier value from the local primary key of an instance of the source.
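The pair of per-source rules can be sketched as two inverse functions per data source. Everything concrete here is an assumption made for illustration: the attribute names (`user`, `start`), the key formats of DS1 and DS2, and the helper names `KeyRule`, `entity_id` and `join_on_entity` do not come from the project.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass(frozen=True)
class KeyRule:
    """Per-source rule linking the entity identifier and the local primary key."""
    to_local: Callable[[str], object]   # entity identifier -> local primary key
    to_entity: Callable[[object], str]  # local primary key -> entity identifier

def entity_id(attributes: Dict[str, str]) -> str:
    """Shared rule (hypothetical): the key is computed from entity attributes."""
    return f"{attributes['user']}@{attributes['start']}"

PREFIX = "sess:"  # hypothetical key format of DS2
RULES = {
    # DS1 is assumed to use a composite key (user, start).
    "DS1": KeyRule(
        to_local=lambda eid: tuple(eid.split("@", 1)),
        to_entity=lambda key: f"{key[0]}@{key[1]}",
    ),
    # DS2 is assumed to use a prefixed string key.
    "DS2": KeyRule(
        to_local=lambda eid: PREFIX + eid,
        to_entity=lambda key: key[len(PREFIX):],
    ),
}

def join_on_entity(tables):
    """tables: {source: {local_key: record}} -> {entity_id: {source: record}}."""
    fused = {}
    for source, rows in tables.items():
        for local_key, record in rows.items():
            eid = RULES[source].to_entity(local_key)
            fused.setdefault(eid, {})[source] = record
    return fused

tables = {
    "DS1": {("alice", "1001"): {"failed_logins": 3}},
    "DS2": {"sess:alice@1001": {"bytes_sent": 512},
            "sess:bob@1002": {"bytes_sent": 64}},
}
fused = join_on_entity(tables)
print(fused["alice@1001"])
```

Records of the same entity end up under one identifier even though each source keys them differently; an entity missing from a source simply has no entry for that source.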

22 Meta-model of Data Sources: Diversity of Measurement Scales of the Same Attributes in Different Data Sources
Let X be an attribute in the application ontology that is measured differently in different sources.
1. In the shared component of the application ontology, the type and the measurement unit of attribute X are determined. The specification of X within the shared part of the application ontology is selected by experts during negotiations according to a synchronization protocol.
2. In every source where X is present, an expression is defined for this attribute that converts its values into the same scale in all the sources.
This allows using attribute values at the meta-level regardless of the data source from which they originated.
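A minimal sketch of the two steps, assuming a hypothetical attribute `duration` whose shared specification is a float in seconds, with DS1 recording milliseconds and DS2 recording minutes. The names `SHARED_SPEC`, `CONVERSIONS` and `to_shared_scale` are illustrative only.

```python
# Step 1: type and unit fixed in the shared component (hypothetical values).
SHARED_SPEC = {"duration": {"type": float, "unit": "seconds"}}

# Step 2: one conversion expression per source where the attribute is present.
CONVERSIONS = {
    "DS1": {"duration": lambda ms: ms / 1000.0},        # milliseconds -> seconds
    "DS2": {"duration": lambda minutes: minutes * 60.0},  # minutes -> seconds
}

def to_shared_scale(source, attribute, value):
    """Convert a source-local value into the scale of the shared ontology."""
    convert = CONVERSIONS[source][attribute]
    return SHARED_SPEC[attribute]["type"](convert(value))

print(to_shared_scale("DS1", "duration", 1500))  # 1.5
print(to_shared_scale("DS2", "duration", 0.5))   # 30.0
```

After conversion, meta-level code can compare or combine the values without knowing which source produced them.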

23 Meta-model of Data Sources: Interaction of Ontology and Databases of Sources
The task arises because application ontology entities are specified in terms of ontology notions, whereas their instances are represented in terms of the database language. To provide interaction between the ontology and the databases of sources (access to data requested in ontology terms), a special gateway is developed.
[Figure: three-level hierarchy of access to the database objects. Labels: DF problem ontology; DF application ontology; local source data properties; client-gateway; application access via VIEW objects; database objects of the local data source.]
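One way such a gateway could work is to map each ontology concept to a database VIEW, so that requests phrased in ontology terms never touch the underlying tables directly. The sketch below uses Python's standard `sqlite3` module; the table, view, concept and attribute names are hypothetical, and the project's actual gateway is not described at this level of detail.

```python
import sqlite3

# Local data source with its own naming and units (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE conn_log (id INTEGER PRIMARY KEY, dur_ms INTEGER, svc TEXT);
    INSERT INTO conn_log VALUES (1, 1500, 'http'), (2, 250, 'ftp');
    -- The VIEW exposes database objects under ontology attribute names.
    CREATE VIEW connection_view AS
        SELECT id AS entity_id, dur_ms / 1000.0 AS duration, svc AS service
        FROM conn_log;
""")

# Ontology concept -> (view name, attributes the view provides).
ONTOLOGY_TO_VIEW = {
    "Connection": ("connection_view", {"entity_id", "duration", "service"}),
}

def fetch_instances(concept, attributes):
    """Return instances of an ontology concept as dicts of ontology attributes."""
    view, known = ONTOLOGY_TO_VIEW[concept]
    unknown = set(attributes) - known
    if unknown:
        raise KeyError(f"not in ontology mapping: {sorted(unknown)}")
    columns = ", ".join(attributes)  # safe: names were checked against the mapping
    rows = conn.execute(
        f"SELECT {columns} FROM {view} ORDER BY entity_id"
    ).fetchall()
    return [dict(zip(attributes, row)) for row in rows]

instances = fetch_instances("Connection", ["entity_id", "duration"])
print(instances)
```

Only names present in the mapping reach the SQL text, so the caller works purely with ontology terms while the view hides the local column names and units.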

24 Meta-model of Distributed Learning
Components of the meta-model of distributed learning:
- Meta-model of decision making and of combining decisions of multiple base-level classifiers;
- Model of distributed data management (allocation of training and testing data sets for learning particular classifiers; management of the computation of meta-data for upper-level example-based learning, etc.);
- Approaches and formal techniques used for combining decisions.

25 Meta-model of Data Fusion: Hierarchy of Classifiers and Combining Decisions
[Figure: three variants (Variant 1, Variant 2, Variant 3) of the classifier hierarchy at a source. In each variant, base classifiers 1, 2, ..., k work on the local database (database of the source). A meta-level classifier of the source appears in two of the variants, and an output "to DF system meta-level classifier" appears in two of the variants.]

26 Meta-model of Data Fusion: Distributed Data Management
Distributed data management covers the allocation of training and testing data sets for learning particular classifiers, management of the computation of meta-data for upper-level example-based learning, etc.
These tasks are solved by special agents of the DF system operating on the source-located components and on the meta-level component. The agents solve them through a special negotiation protocol under the management of the local source analysts and the meta-level analysts.

27 Meta-models of Training and Testing Data: An Example
Results of distributed data management for the KDDCup-99 case study with two data sources, DS1 and DS2 (values as given on the slide):
- BC1: attributes 16, examples 146
- BC2: attributes 17, examples 146
- BC3: attributes 15, examples 202
- BC4: attributes 16, examples 215
- BC5: attributes 7, examples 469
- BC6: attributes 7, examples 469
- Testing data of the local source DS2: examples 1739
- Testing data of the local source DS2: examples 178

28 Meta-model of Data Fusion: Approaches for Combining Decisions-1
1. Meta-classification scheme for combining decisions (based on “stacked generalization”)
[Figure; legend: data, KDD algorithms, resulting classifiers.]
- Training and testing data sets of sources 1, 2, ..., k are used by the algorithms for base classifier learning to learn base classifiers 1, 2, ..., k.
- Meta-learning level: testing data and the base classifiers yield the meta-classifier's training and testing data ("meta-data"), which are used by the algorithm for learning the meta-classifier.
- Result: meta-classifier.
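The stacked generalization idea can be sketched with the standard library only. The learners below are stand-ins chosen for brevity, not the project's algorithms: each base classifier is a one-feature threshold rule, and the meta-classifier is a lookup table from the tuple of base decisions to the majority class observed in the meta-data. The feature names and data are made up.

```python
from collections import Counter, defaultdict

def train_stump(examples, feature):
    """Base learner: threshold at the midpoint of the two class means."""
    mean = {}
    for label in (0, 1):
        values = [x[feature] for x, y in examples if y == label]
        mean[label] = sum(values) / len(values)
    threshold = (mean[0] + mean[1]) / 2
    high_label = 1 if mean[1] > mean[0] else 0
    return lambda x: high_label if x[feature] > threshold else 1 - high_label

def train_meta(base_classifiers, meta_examples):
    """Meta learner: majority class for each combination of base decisions."""
    votes = defaultdict(Counter)
    for x, y in meta_examples:
        decisions = tuple(clf(x) for clf in base_classifiers)  # the "meta-data"
        votes[decisions][y] += 1
    table = {d: c.most_common(1)[0][0] for d, c in votes.items()}
    default = Counter(y for _, y in meta_examples).most_common(1)[0][0]
    return lambda x: table.get(tuple(clf(x) for clf in base_classifiers), default)

# Base-level training data (feature "a" is informative, "b" is noisy).
train = [({"a": 1, "b": 5}, 0), ({"a": 2, "b": 1}, 0),
         ({"a": 8, "b": 6}, 1), ({"a": 9, "b": 2}, 1)]
# Held-out data used only to build meta-data for the meta-classifier.
held_out = [({"a": 1, "b": 6}, 0), ({"a": 9, "b": 1}, 1),
            ({"a": 2, "b": 2}, 0), ({"a": 8, "b": 7}, 1)]

base = [train_stump(train, "a"), train_stump(train, "b")]
meta = train_meta(base, held_out)
print(meta({"a": 7, "b": 1}), meta({"a": 3, "b": 9}))
```

The meta-classifier learns from held-out examples which combinations of base decisions can be trusted, so a noisy base classifier does not outvote a reliable one.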

29 Meta-model of Data Fusion: Approaches for Combining Decisions-2
Competence-based approach
[Figure: classifiers 1, 2, ..., K, each paired with a referee 1, 2, ..., K; input: training and testing data.]
- Partition of learning data for classifier training: examples of class 1 and examples of class 2.
- Partition of learning data for referee training: correctly classified examples and erroneously classified examples.
- Selection of the most competent classifier and its decision: the output is the decision of the most competent classifier.
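A minimal sketch of the competence-based scheme, under simplifying assumptions: examples are single numbers, the two classifiers are hand-written rules rather than learned ones, each referee is a 1-nearest-neighbour memory of where its classifier was right or wrong, and competence is binary, so "most competent" reduces to "first classifier whose referee accepts the example". None of these choices comes from the slides.

```python
def train_referee(classifier, referee_examples):
    """Referee: remembers on which examples its classifier was correct."""
    memory = [(x, classifier(x) == y) for x, y in referee_examples]

    def is_competent(x):
        # Competence of the classifier at x is judged by the nearest
        # remembered example (1-nearest neighbour on the number line).
        _, was_correct = min(memory, key=lambda item: abs(item[0] - x))
        return was_correct

    return is_competent

def combine(classifiers, referees, x):
    """Return the decision of the first classifier judged competent at x."""
    for clf, referee in zip(classifiers, referees):
        if referee(x):
            return clf(x)
    return classifiers[0](x)  # fallback when no referee accepts x

# Two hypothetical classifiers that are each reliable in a different region.
def clf_a(x):
    return 1 if x > 5 else 0

def clf_b(x):
    return 1 if x < 3 else 0

# Partition of the learning data reserved for referee training.
referee_data = [(1, 1), (4, 0), (6, 0), (9, 1)]

classifiers = [clf_a, clf_b]
referees = [train_referee(c, referee_data) for c in classifiers]
decisions = [combine(classifiers, referees, x) for x in (0.5, 6.2, 9.5)]
print(decisions)
```

For each input, only the classifier whose referee vouches for that region contributes the final decision, unlike voting or stacking, where all classifiers contribute.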

30 Architecture of DF Software Tool
Architecture of the source-based component of the DF software tool
[Figure: components as labeled on the slide.]
- Local data source
- Data source managing agent (link to the KDD Master)
- KDD agent: training, testing
- Local classification agents of DF system: base classifier, meta-classifier, referee (link to the meta-classification agent)
- Server (library) of learning methods
- User interface

31 Architecture of DF System
Architecture of the meta-level component of the DF software tool
[Figure: components as labeled on the slide.]
- KDD Master agent (link to the data source managing agent)
- Meta-level KDD agent (link to the KDD agent)
- Agent-classifier of meta-level: meta-classifier, referee, inference engine
- Local classification agents
- Server of learning methods
- User interface

32 Conclusion: Future Work
1. Development of a sophisticated ontology editor supporting distributed design of a distributed ontology.
2. Further design and implementation of the Data Fusion System software tool for the development and implementation of particular distributed applications in the Data Fusion area.

33 For more information and related publications please contact
E-mail: gor@mail.iias.spb.su
http://space.iias.spb.su/ai/english/gorodetski.htm
Acknowledgement: This research is funded by AFRL/IF (EOARD), 1999-2003.
Thank you!

