A First Attempt towards a Logical Model for the PBMS PANDA Meeting, Milano, 18 April 2002 National Technical University of Athens Patterns for Next-Generation.

Presentation transcript:

A First Attempt towards a Logical Model for the PBMS
PANDA Meeting, Milano, 18 April 2002
National Technical University of Athens
Patterns for Next-Generation Database Systems (PANDA)

P. Vassiliadis. PANDA Meeting, Milano, 18 April 2002

Overview
- General Understanding of the PBMS
- Mathematical Background
- MetaModel: Entities and Language
- The Software Engineering Perspective
- Conclusions


General Framework
[Figure: the layers of the PBMS]
- Meta-Pattern Type Layer: Meta_Pattern Type, together with the Language
- Pattern Type Layer: Association Rule Type, DBSCAN Cluster Type, Decision Tree Type (each belongs to Meta_Pattern Type)
- Pattern Layer: Assoc. Rule 1, Assoc. Rule 2, ..., Assoc. Rule n; Cluster 1, Cluster 2, Cluster 3; Decision Tree 1 (each belongs to its pattern type)
- Raw Data, with the Ass. Rule Algorithm, the DBSCAN Cluster Algorithm and the Dec. Tree Algorithm
Meta-Pattern Type + Pattern Types = PBMS Catalog
Pattern Layer = PBMS Content

General Idea

  PBMS                                      Relational model
  ----------------------------------------  ----------------------------------------
  Meta-Pattern Type + Language              Relation + Language
  a Name, a Condensed Expression,           a Name, a Schema,
  an Extension, and Language                an Extension, and Relational Calculus
  Pattern Type                              Relational Table
  AssociationRuleType                       Buys
  head :- body                              session_id, date, item, price
  ext(AssociationRuleType)                  ext(Buys)
  Pattern                                   Tuple
  Buys(x,_,beer,_) :- Buys(x,_,pampers,_)   Buys(34,4/4/2002,beer,2)


Mathematical Background
Assumptions from the definition:
- There exist a data space and a pattern space.
- There always exist M:N relationships among data and patterns.
[Figure: Data Space and Pattern Space]

Characteristics of data and pattern space
- Each data item is characterized by a finite number of features N; dom(x) is the domain of each feature x.
- Data space: $D^N \subseteq dom(A_1) \times \dots \times dom(A_N)$
- Proposal: all dom(x) are countably infinite; consider separately the cases where $D^N$ is finite and where it is not.
- Each pattern is characterized by a finite number of features M.
- Pattern space: $D^M \subseteq dom(A_1) \times \dots \times dom(A_M)$
- Proposal: all dom(x) are countably infinite; $D^M$ is clearly finite.

Statistical Measures
The data-pattern relationship $f_{DP}$ has:
- participation measures for the relationship;
- importance measures for a data item;
- importance measures for a pattern.
[Figure: Data Space and Pattern Space]

Statistical Measures
$\text{Richness of representation} = \dfrac{\text{relationships captured by the condensed representation}}{\text{total number of relationships}}$
$\text{Compactness of the representation} = \dfrac{size(D^M) \cdot M}{size(D^N) \cdot N}$
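The two ratios on this slide can be computed directly. The following is only a minimal sketch: it assumes that size(D^M) and size(D^N) mean the number of stored patterns and data items, which the slide does not state, and the function names and sample numbers are hypothetical.

```python
from fractions import Fraction


def richness(captured: int, total: int) -> Fraction:
    """Relationships captured by the condensed representation / all relationships."""
    if total <= 0:
        raise ValueError("total number of relationships must be positive")
    return Fraction(captured, total)


def compactness(num_patterns: int, m: int, num_data: int, n: int) -> Fraction:
    """(size(D^M) * M) / (size(D^N) * N), reading size() as a cardinality (assumption)."""
    if num_data <= 0 or n <= 0:
        raise ValueError("data space size and N must be positive")
    return Fraction(num_patterns * m, num_data * n)


# Hypothetical figures: 50 patterns with 3 features summarising
# 10,000 data items with 4 features, capturing 800 of 1,000 relationships.
r = richness(800, 1000)
c = compactness(50, 3, 10_000, 4)
print(r, c)
```

Under this reading, a smaller compactness value means a more condensed pattern space relative to the raw data.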



Pattern Types
Intensional Description of a Pattern Type, as follows:
- PID
- Explicit Relationship: $f_{DP_i} : D^N \to D_i^M$
- Relationship Expression
- Statistical Measures
Extensional Description (or Pattern Extension) of a Pattern Type: a finite set of patterns
Data extension of a Pattern Type: a countable? set of data items

Example
Pattern Type Intentional Description, with [small part of] the Pattern Type Extensional Description:

  PID                      PID123
  Explicit Relationship    $f_{DP_i} : D^N \to D_i^M$ = {(PID123,RID124),…}
  Relationship Expression  Buys(x,_,beer,_) :- Buys(x,_,pampers,_)
  Statistical Measures     Coverage=80%, Confidence=90%
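To make the example concrete, the rule Buys(x,_,beer,_) :- Buys(x,_,pampers,_) can be evaluated over a small Buys table. This is a sketch under stated assumptions: the rows are invented, confidence is computed in the usual association-rule sense, and since the slide does not define Coverage, the sketch reports support (share of all sessions satisfying the rule) as a stand-in.

```python
from typing import NamedTuple


class Buys(NamedTuple):
    session_id: int
    date: str
    item: str
    price: float


# Hypothetical raw data; only the first row appears on the slides.
BUYS = [
    Buys(34, "4/4/2002", "beer", 2),
    Buys(34, "4/4/2002", "pampers", 9),
    Buys(35, "4/4/2002", "pampers", 9),
    Buys(36, "5/4/2002", "beer", 2),
    Buys(37, "5/4/2002", "pampers", 9),
    Buys(37, "5/4/2002", "beer", 2),
]


def sessions_with(rows, item):
    """Session ids in which the given item was bought."""
    return {r.session_id for r in rows if r.item == item}


def evaluate_rule(rows, head_item, body_item):
    """Return (sessions satisfying body and head, confidence, support)."""
    body = sessions_with(rows, body_item)
    head = sessions_with(rows, head_item)
    matching = body & head
    all_sessions = {r.session_id for r in rows}
    confidence = len(matching) / len(body) if body else 0.0
    support = len(matching) / len(all_sessions) if all_sessions else 0.0
    return matching, confidence, support


matching, confidence, support = evaluate_rule(BUYS, "beer", "pampers")
# The explicit relationship pairs the pattern id with the matching data items.
explicit_relationship = {("PID123", s) for s in matching}
print(sorted(explicit_relationship), confidence, support)
```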


Meta-Pattern Types
Intensional Description of a Meta-Pattern Type, as follows:
- Name
- Condensed Expression
- [Meta]Statistical Measures
- ?? Schema Attributes ??
Extensional Description of a Meta-Pattern Type: a finite set of pattern types

Example
Meta-Pattern Type Intentional Description, with [small part of] the Meta-Pattern Type Extensional Description:

  Name                         AssociationRuleType
  Condensed Expression         head :- body
  [Meta]Statistical Measures   Coverage: Float[0..1], Confidence: Float[0..1]
  Schema Attributes ??         PID, Head, Body ??

Pattern Type Intentional Description, with [small part of] the Pattern Type Extensional Description:

  PID                      PID123
  Explicit Relationship    $f_{DP_i} : D^N \to D_i^M$ = {(PID123,RID124),…}
  Relationship Expression  Buys(x,_,beer,_) :- Buys(x,_,pampers,_)
  Statistical Measures     Coverage=80%, Confidence=90%
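The two levels in this example can be pictured as plain data structures: a meta-pattern type declares the condensed expression and the typed measures, and each pattern type in its extension must respect those declarations. The sketch below is illustrative only; the class and method names (MeasureSpec, MetaPatternType, add_pattern_type) are hypothetical and not taken from the PANDA proposal.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MeasureSpec:
    """A declared statistical measure, e.g. Confidence: Float[0..1]."""
    name: str
    low: float = 0.0
    high: float = 1.0

    def accepts(self, value: float) -> bool:
        return self.low <= value <= self.high


@dataclass
class PatternType:
    pid: str
    relationship_expression: str
    statistical_measures: Dict[str, float]


@dataclass
class MetaPatternType:
    name: str
    condensed_expression: str
    measure_specs: Tuple[MeasureSpec, ...]
    extension: List[PatternType] = field(default_factory=list)

    def add_pattern_type(self, pid: str, expression: str, **measures: float) -> PatternType:
        """Validate the measures against the declared specs, then store the pattern type."""
        specs = {s.name: s for s in self.measure_specs}
        for name, value in measures.items():
            if name not in specs:
                raise ValueError(f"undeclared measure: {name}")
            if not specs[name].accepts(value):
                raise ValueError(f"{name}={value} is outside its declared range")
        pattern_type = PatternType(pid, expression, dict(measures))
        self.extension.append(pattern_type)
        return pattern_type


assoc = MetaPatternType(
    "AssociationRuleType",
    "head :- body",
    (MeasureSpec("Coverage"), MeasureSpec("Confidence")),
)
p = assoc.add_pattern_type(
    "PID123", "Buys(x,_,beer,_) :- Buys(x,_,pampers,_)", Coverage=0.8, Confidence=0.9
)
```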

Which language to choose?
- Relational Calculus, Datalog and Stratified Datalog?
  - Powerful, but not elegant for all the patterns that we might want to express…
- Constraint database approach?
  - We cannot guarantee a finite representation of the result for non-linear constraints…


Which language to choose?
- Remove recursion?
  - Cannot express interesting patterns like transitive closure…
- Only linear constraints?
  - Cannot express interesting patterns like cyclic clusters…
  - Approximation of polynomials through sets of linear constraints? Not elegant…
- Forget constraints and describe every pattern type as a simple predicate?
  - Loss of all the declarative information on the nature of the pattern type…
So, what to do? Possible dead-end due to the paradigm?
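The remark on recursion can be illustrated with the textbook Datalog program for transitive closure, path(x,y) :- edge(x,y) and path(x,z) :- path(x,y), edge(y,z). The sketch below evaluates it by naive bottom-up iteration to a fixpoint; it is a generic illustration and not part of the proposed PBMS language. A single non-recursive join reaches only paths of a fixed length, whereas the loop runs as many rounds as the data requires.

```python
def transitive_closure(edges):
    """Naive bottom-up evaluation of the recursive path/2 rules."""
    path = set(edges)  # path(x,y) :- edge(x,y).
    while True:
        # path(x,z) :- path(x,y), edge(y,z).
        derived = {(x, z) for (x, y) in path for (y2, z) in edges if y == y2}
        new_facts = derived - path
        if not new_facts:  # fixpoint reached: nothing new can be derived
            return path
        path |= new_facts


edges = {("a", "b"), ("b", "c"), ("c", "d")}
closure = transitive_closure(edges)
# One application of the join without iteration misses ("a", "d").
one_join = edges | {(x, z) for (x, y) in edges for (y2, z) in edges if y == y2}
print(sorted(closure - one_join))
```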


How to build it?
- Each of the pattern types is implemented as a Class.
- The different pattern types are defined as specializations of a Generic Pattern Class.
- Treat pattern types as predicates, with semantics computed by a computationally complete procedural language [e.g., PL/SQL, C++, …]?
  - Instead of fundamental research we turn to feasibility issues…
- What about behavior?
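A rough sketch of the class-based option follows: a generic pattern class with specializations whose semantics are given procedurally by a method. Everything here is an assumption made for illustration, including the names GenericPattern and matches and, in particular, the representation of a cluster as a centre and radius, which is far simpler than a real DBSCAN cluster.

```python
import math
from abc import ABC, abstractmethod
from typing import Dict, Sequence, Set


class GenericPattern(ABC):
    """Common state for all pattern types: an id and statistical measures."""

    def __init__(self, pid: str, measures: Dict[str, float]) -> None:
        self.pid = pid
        self.measures = measures

    @abstractmethod
    def matches(self, datum) -> bool:
        """Procedural semantics: does this datum belong to the pattern?"""


class AssociationRule(GenericPattern):
    def __init__(self, pid: str, head: str, body: str, measures: Dict[str, float]) -> None:
        super().__init__(pid, measures)
        self.head = head
        self.body = body

    def matches(self, datum: Set[str]) -> bool:
        # A session (set of items) supports the rule if it holds body and head.
        return self.body in datum and self.head in datum


class Cluster(GenericPattern):
    def __init__(self, pid: str, center: Sequence[float], radius: float,
                 measures: Dict[str, float]) -> None:
        super().__init__(pid, measures)
        self.center = tuple(center)
        self.radius = radius

    def matches(self, datum: Sequence[float]) -> bool:
        # Simplifying assumption: a cluster is a ball around its centre.
        return math.dist(self.center, datum) <= self.radius


rule = AssociationRule("PID123", "beer", "pampers", {"Confidence": 0.9})
cluster = Cluster("PID200", (0.0, 0.0), 2.0, {})
print(rule.matches({"beer", "pampers"}), cluster.matches((1.0, 1.0)))
```

The matches method is one possible answer to the slide's question about behavior: the declarative description is replaced by code, which is exactly the trade-off the slide raises.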

How to build it?
[Figure: the General Framework, redrawn with classes]
- Generic Class, together with a Set of DDL/DML Languages
- Association Rule Class, Cluster Class, Decision Tree Class (each ISA Generic Class)
- Assoc. Rule 1, Assoc. Rule 2, ..., Assoc. Rule n; Cluster 1, Cluster 2, Cluster 3; Decision Tree 1 (each IN its class)
- PBMS
Meta-Pattern Type + Pattern Types = PBMS Catalog
Pattern Layer = PBMS Content


Conclusions
- Followed the Datalog paradigm (need for deductive capabilities), enhanced with constraints (need for elegance).
- Reduced the problem to the specification of a proper language for the description of pattern types.
- Fundamental language limitations arise when constraints are considered.
- Dilemma:
  - Change paradigm?
  - Stick with this paradigm and focus on engineering issues?
  - …Any other suggestions?…

Thank you …

Definitions from the minutes of the Athens meeting
- A pattern is a compact and semantically rich representation of raw data.
- A Pattern-Based Management System (PBMS) is a system for handling (storing / processing / retrieving) patterns extracted from raw data, in order to efficiently support pattern matching and to exploit pattern-related operations generating intensional information.

Issues around the pattern definition
- The mapping from the original raw data space to a less populated (≈ compact) pattern space is always possible, preserving (or documenting) as much knowledge as possible from the raw data space (≈ rich in semantics).
- An M:N mapping between raw data space and pattern space is permitted.
- Perhaps several levels of representation / abstraction exist (different levels of granularity, multi-dimensionality, recursion, hierarchies, etc.).

Issues around the PBMS definition
- A PBMS will cooperate with a DBMS storing raw data.
- A PBMS processes different kinds of queries (because of different user needs) on raw data and returns more intuitive results to users.
- A PBMS is useful in order to process those queries more efficiently than a normal DBMS would.
- A PBMS will have its own mechanisms for representing and storing its entries (patterns), posing and processing queries, and efficiently retrieving its entries.

Query Language Issues
- Given a datum, which pattern does it refer to?
- Which are the data that correspond to this pattern?
- Zoom-in, zoom-out a pattern.
- Pattern union, difference.
- Composition of patterns (i.e., if A → B and B → C, then derive A → C).
- What are the values of the statistical measures for this pattern?
- Which patterns fulfill a certain constraint on a statistical measure?
- Which are the patterns in the PBMS catalog?
- Which are the attributes or the statistical measures for this pattern type?
- Which pattern types relate to a certain statistical measure?
- Closed Form of the Language.
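Several of the queries listed above can be sketched over a toy in-memory store of the M:N data-pattern relationship. This is only an illustration of what the queries ask for, not a proposal for the query language itself; PatternBase, its methods, compose and the sample ids are all hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


class PatternBase:
    """Toy store for an M:N relationship between patterns and data items."""

    def __init__(self) -> None:
        self.links: Set[Tuple[str, int]] = set()          # (pattern id, datum id)
        self.measures: Dict[str, Dict[str, float]] = {}   # pattern id -> measures

    def add(self, pid: str, datum_ids: Iterable[int], **measures: float) -> None:
        self.links |= {(pid, d) for d in datum_ids}
        self.measures[pid] = dict(measures)

    def patterns_of(self, datum_id: int) -> Set[str]:
        """Given a datum, which patterns does it refer to?"""
        return {p for (p, d) in self.links if d == datum_id}

    def data_of(self, pid: str) -> Set[int]:
        """Which are the data that correspond to this pattern?"""
        return {d for (p, d) in self.links if p == pid}

    def select(self, measure: str, minimum: float) -> Set[str]:
        """Which patterns fulfil a lower bound on a statistical measure?"""
        return {p for p, m in self.measures.items()
                if m.get(measure, float("-inf")) >= minimum}


def compose(rules: Set[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """One composition step: from A -> B and B -> C derive A -> C."""
    return {(a, c) for (a, b) in rules for (b2, c) in rules if b == b2 and a != c}


pb = PatternBase()
pb.add("PID123", {34, 37}, confidence=0.9)
pb.add("PID124", {34, 36}, confidence=0.6)
print(pb.patterns_of(34), pb.data_of("PID123"), pb.select("confidence", 0.8))
print(compose({("A", "B"), ("B", "C")}))
```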