
K.Subieta. SBA and SBQL, slide 1 Sept. 2006 SBA (Stack-Based Approach) and SBQL (Stack-Based Query Language) Presentation prepared for OMG Object Database.


1 K.Subieta. SBA and SBQL, slide 1 Sept. 2006
SBA (Stack-Based Approach) and SBQL (Stack-Based Query Language)
Presentation prepared for the OMG Object Database Technology Working Group
OMG TECHNICAL MEETING, Anaheim, CA USA, September 25-29, 2006
by Prof. Kazimierz Subieta, Polish-Japanese Institute of Information Technology, Warsaw, Poland
subieta@pjwstk.edu.pl
http://www.ipipan.waw.pl/~subieta
SBA/SBQL pages: http://www.sbql.pl

2 Agenda
Motivation for SBA and SBQL, short history
Major topics and architectural issues
Syntax, semantics and pragmatics of QLs
Foundation of abstract implementation of SBQL
–Abstract syntax, state, environment stack, query result stack, semantic rules.
M0 store model: objects and pointer links, SBQL for M0
M1 store model: classes and static inheritance, SBQL for M1
M2 store model: roles and dynamic inheritance, SBQL for M2
Imperative constructs, procedures, functions and methods
SBQL virtual updatable object views
SBQL strong typing system
Query optimization for SBQL

3 What are SBA and SBQL?
SBA is a conceptual framework for developing object-oriented database query/programming languages.
SBQL is a model query language designed according to SBA.
–It has the same role and meaning as object algebras, but it is formally sound and much more universal.
SBA/SBQL deal with various data models and all imaginable and reasonable query constructs.
Abstract implementation is the basic paradigm for the formal specification of semantics.

4 Why the Stack-Based Approach?
Main motivations:
–Inadequacy of current database theories to practice;
–Lack of a clean and sound approach to QL semantics ⇒ query optimization, strong typing, database views, etc. become too challenging;
–Chaotic design of current QLs, non-orthogonality, gluing independent operators into big syntactic constructs;
–Annoying false stereotypes (e.g. the "contradiction" between encapsulation and query languages), wishful thinking about advantages or disadvantages of particular models.
Main conclusion: query languages are programming languages.
–They should be developed according to the same methods and principles.
–Attempts to establish a clear border line between querying and programming have failed.
–We abandon database theories such as the relational algebra, relational calculus, their object-oriented counterparts, F-logic, etc.

5 Short history of SBA/SBQL
1989: first implementation (NETUL – an expert system shell)
1990: advanced prototype implementation (LOQIS)
2002: SBQL for XML DOM
2003: YAOD – a prototype OODBMS
2004: SBQL for the European Project ICONS (commercialized)
2004: SBQL prototype for Objectivity/DB
2004: Book on SBA/SBQL, 522 pages
2005: BPQL for OfficeObjects® Workflows (commercial)
2006… (pending): SBQL for the European project eGov Bus
2006… (planned): SBQL for the European project VIDE
6 finished PhDs, 7 pending PhDs, many MSc theses, many papers, …

6 Major topics that SBA deals with (1)
General architecture of query processing
Abstract models of object stores
Syntax, semantics and pragmatics of query languages
SBQL algebraic and non-algebraic operators – syntax and semantics
Classes, methods and static inheritance in query languages
Dynamic object roles and dynamic inheritance in query languages
Processing of irregular data structures (semi-structured data)
Transitive closures and fixed-point equations in SBQL
Extension of SBQL with imperative (updating) constructs
Procedures, functions and methods in SBQL

7 Major topics that SBA deals with (2)
Parameter passing for procedures / functions / methods
Encapsulation in SBQL
Virtual updatable views for SBQL
Types, interfaces, schemas and metamodels
Static (semi-) strong type checking of SBQL queries and programs
Query optimization (rewriting, indices, caching, …)
Query processing and optimization in distributed systems
Data-intensive grids and P2P networks: integration of distributed, heterogeneous, fragmented and redundant resources
Aspect-oriented databases
SBQL in OMG MDA and executable UML

8 General architecture of query processing
Actually, we do not fix the architecture.
–It can be similar to SQL (server-side processing of queries, in the ODBC, ADO or JDBC style).
–It can be similar to the ODMG architecture (queries as strings embedded in a popular programming language, e.g. Java or C#).
–It can be similar to Oracle PL/SQL (programs integrated with queries, performed on the client side).
Our goal is to shift as much query processing and optimization as possible to the client side (in contrast to SQL).
–Lower workload for the server ⇒ better overall performance.
–More flexibility for query optimization (e.g. parallel execution on many servers, the possibility to optimize queries referencing local objects).

9 Internal architecture of the query processor
Classical run-time architecture of popular programming languages, with necessary modifications and generalizations.
In contrast to PLs, we separate ENVS from the object store.
[Diagram] Components: client object store with volatile (non-shared) objects; server object store with persistent (shared) objects; environment (call) stack ENVS; query result stack QRES; query evaluation.
[Diagram] Other labels: binding; query/program operators; references (OIDs) and values of objects.

10 Detailed client-server architecture
[Diagram] Labels: Client; Server; Network; software development environment (editor, debugger, etc.); parser of queries and programs; syntactic tree of a query/program; strong type checker; static ENVS; static QRES; optimization by rewriting; optimization by indices; interpreter of queries & programs; ENVS; QRES; volatile (non-shared) objects; local metabase; persistent (shared) objects; object manager; processing of persistent abstractions (views, stored procedures, triggers); register of indices; register of views; metabase of persistent objects; administration; transactions.

11 Data model
SBA and SBQL are neutral to data models.
–This is a firm and basic assumption.
–Query languages address data structures rather than data models.
All imaginable data models are to be acceptable, starting from the relational model, through XML/RDF models, up to very advanced object-oriented models with classes, roles, static and dynamic inheritance, encapsulation, polymorphism, etc.
–I am not aware of any data model that cannot be served by SBQL.
–For any data model we are not interested in its ideological assumptions, advantages and disadvantages, but only in the abstract formal properties of the data structures that it implies.
–All such structures are to be formally addressed by SBQL.

12 Syntax and semantics of formal languages
We all know what syntax is, including formal syntax.
What is semantics, especially formal semantics?
–For query languages the specification of formal semantics is a must.
–It is a strong guideline for implementation and standardization.
–It is required by query optimization and strong typing.
–There are many approaches, especially mathematical ones, but no approach covers all the issues of QLs integrated with PLs.
SBA is a universal formal framework for specifying the semantics of QLs/PLs.
–"Formal" does not mean "mathematical".
–"Formal" means "expressed through precise abstract implementation".

13 Pragmatics of formal languages
Pragmatics determines when, what for, and how to use the language, and the consequences of its use.
–Syntax and semantics are servants of pragmatics.
–Pragmatics is the major subject of language manuals.
–Pragmatics is explained by examples of use, the intuitive meaning of queries, descriptions of query results, good patterns, bad patterns, etc.
Popular manuals and standards explain semantics through syntax, intuitive description and pragmatics.
–For instance, the ODMG standard.
However, pragmatics is a bad way to specify semantics.
–It is unable to explain and specify all recursive dependencies and relationships between various language constructs, operators, data structures and the results of the language's statements.

14 Object model and database schema
… are inevitable parts of the pragmatics of a query language.
–The application programmer must be aware of what the database contains and how it is organized, frequently before the database is filled in.
Usually, an object model and a database schema language are presented at the beginning of a given specification, cf. ODMG.
The model involves such concepts as types, classes and interfaces, joined into a coherent whole as a schema language, cf. ODL.
–However, these concepts are difficult, especially types.
–Introducing them at the beginning, without first understanding the semantics of the query language, usually results in inconsistencies.
Hence, we must first understand the semantics of a query language on the basis of an abstract object store model.

15 Abstract syntax and syntax-driven semantics
Concrete syntax usually involves a lot of syntactic sugar.
Abstract syntax is free of syntactic sugar and ambiguities.
–SQL, concrete syntax: select Name, DeptName from Employee, Dept where some_predicate
–Abstract syntax: ((Employee × Dept) σ some_predicate) π (Name ° DeptName)
In computer languages semantics is syntax-driven.
–First, the designers of a language define its syntactic rules.
–Then, each syntactic rule is associated with a semantic rule.
–To simplify the association, the syntax should be abstract.
–Syntactic rules are recursive ⇒ semantic rules must be recursive too.

16 SBA semantics of QLs – general point of view
Let Query be the set of all syntactically correct queries.
Let State be the set of all states (database states, but not only).
Let Result be the set of all possible query results.
The semantics of any query q belonging to Query is a function that maps State → Result.
If q has side effects, then it maps State → Result and State → State.
–In some theories, a state is a set of objects and a result is a set of objects too. This is known as the closure property.
In SBA a state contains not only objects, and a result never contains objects.
The closure property is inconsistent; it is conceptual nonsense.
–It leads to further nonsense, such as subdividing queries into object-preserving and object-generating.

17 Summing up: what do we need to define semantics?
1. Abstract syntax of queries, domain Query. It is to be defined by a set of context-free rules.
2. A formal and abstract concept of all possible states, domain State.
–It is missing from the ODMG standard, thus the standard is not prepared to specify the formal semantics of OQL.
3. A formal and abstract concept of all possible query results, domain Result.
4. A formal (recursive) mapping of each context-free rule into a semantic rule, which maps each state into a result.

18 Abstract implementation
Actually, all formal approaches to query languages propose some method of specifying semantics.
–E.g. the relational algebra, relational calculus, object algebras, etc.
–However, as a rule they are very limited or inconsistent.
In programming languages the best known methods are denotational semantics and operational semantics.
The best is a variant of operational semantics that is referred to as abstract implementation.
–The method is simple (but not simpler than necessary) and universal.
In operational semantics we have to define a machine or a procedure that will execute all the semantic rules.
–An abstract implementation can easily be mapped into an interpreter of the language written in our favorite programming language.

19 What is State?
–Usually the concept is understood as object state or database state. We extend this concept considerably.
A state includes all data or programming features that can influence the result of some (any) query, in particular:
–Database state on the server side.
–Local state (local objects used in queries) on the client side.
–Global and local computer and software environment (e.g. date, time).
–Available libraries, procedures, functions, classes, views, etc.
A state also includes structures (invisible to the programmer) that determine the run-time environment of computations.
–In SBA there is one such structure: the environment stack (ENVS).
In SBA a state consists of two elements: state = object store + environment stack

20 Is ENVS a purely implementation notion?
No. The environment stack is a conceptual notion.
–Without ENVS a formal specification of the semantics of QLs and PLs is impossible or limited.
–Without ENVS it is impossible to explain formally and precisely the mechanisms of classes, roles, static and dynamic inheritance, etc.
–ENVS makes it possible to explain precisely (recursive) procedures and methods, methods of parameter passing, database views, etc.
In SBA we present ENVS on an abstract level. We are not interested in its physical implementation.
–Implementations can differ and introduce many optimizations.
–Usually ENVS is a client-side data structure stored in main memory.
The main roles of ENVS: determining scopes for names and binding names occurring in queries.

21 What is Result?
Almost any value that can be stored in the object store or can be computed from other values can be the result of a query.
–For instance, the query 2+2 returns 4.
–Multimedia values of many megabytes cannot be returned as query results.
–In such cases a query returns a reference to the value (e.g. a file name) rather than the value itself. Reference is a fundamental concept of SBA.
–Queries can return complex data structures which include values, references, names, structure constructors and collection constructors.
In SBA queries never return objects, but references to objects (OIDs), perhaps within some complex structures.
–Objects are stored within the object store only.

22 Query result stack, QRES
For nested queries, temporary and final results are accumulated on the query result stack (abbreviated QRES).
–This is quite an easy notion, known from many student textbooks.
–QRES is a client-side structure, usually stored in main memory.
QRES must be prepared to store any query result in a single section (including nested collections, arrays, etc.).
QRES is not a component of State
–… because the result of a new query does not depend on the previous QRES state.
–In denotational semantics this notion is not necessary (it is hidden within recursive function calls).
In abstract implementation a precise specification of the QRES mechanism is fundamental.
–Thus it is introduced in SBA explicitly, on the abstract level.

23 Example of QRES state
[Figure] A stack with four sections:
15
i17
struct{ x(i61), y(i93) }
bag{ struct{ n("Doe"), s(i9) }, struct{ n("Poe"), s(i14) }, struct{ n("Lee"), s(i18) } }
The top section is the only visible stack section; the sections between it and the bottom are invisible.

24 What is a semantic rule?
In the operational semantics method we have to define a machine that associates each syntactic rule with a semantic rule, in the form of actions of the machine.
We define this machine as a recursive procedure eval that takes a query q written in abstract syntax as its argument.
The procedure eval evaluates a query and returns its result.
–Inside the procedure we have syntactic cases and, for each case, the actions of the machine.
–A case is just a semantic rule subordinated to the corresponding syntactic rule.
The next slide gives a skeleton of the procedure eval as self-explanatory pseudo-code.

25 Procedure eval – the idea of semantic rules
procedure eval( q : query ) {
  parse( q ); // the parser recognizes top-level subqueries in the query q
  // Rule 1
  case q is recognized as literal l: // a query consisting of a single literal, e.g. 2
    push the value denoted by l on QRES;
  // Rule 2
  case q is recognized as name n: // a query being a single name n, e.g. Person
    bind the name n on ENVS;
    push the result of the binding on QRES;
  // Rule 3
  case q is recognized as ∆ q1: // q consists of a unary operator ∆ applied to q1
    eval( q1 );
    apply the operator denoted by ∆ to the result returned by q1 on QRES;
    push the result of the application on QRES;
  // Rule 4
  case q is recognized as q1 ∆ q2: // q involves a binary algebraic operator ∆
    eval( q1 ); eval( q2 );
    apply the operator denoted by ∆ to the results returned by q1 and q2 on QRES;
    push the result of the application on QRES;
  …
}
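The four rules above can be sketched in Python. This is only an illustration under stated assumptions, not the author's implementation: the tuple encoding of abstract syntax, the operator tables UNARY and BINARY, and the simplified bind (ENVS sections as dictionaries) are all hypothetical.

```python
import operator

# Hypothetical abstract syntax, one tuple shape per syntactic case:
#   ("lit", value) | ("name", n) | ("un", op, q1) | ("bin", op, q1, q2)
UNARY = {"-": operator.neg, "not": operator.not_}
BINARY = {"+": operator.add, "*": operator.mul, ">": operator.gt}


def bind(name, envs):
    """Simplified binding: search ENVS sections from the top of the stack down."""
    for section in reversed(envs):
        if name in section:
            return section[name]
    raise NameError(name)


def eval_query(q, envs, qres):
    """Evaluate q and leave exactly one new result on top of QRES."""
    tag = q[0]
    if tag == "lit":                       # Rule 1: literal
        qres.append(q[1])
    elif tag == "name":                    # Rule 2: name bound on ENVS
        qres.append(bind(q[1], envs))
    elif tag == "un":                      # Rule 3: unary operator
        eval_query(q[2], envs, qres)
        qres.append(UNARY[q[1]](qres.pop()))
    elif tag == "bin":                     # Rule 4: binary algebraic operator
        eval_query(q[2], envs, qres)
        eval_query(q[3], envs, qres)
        right = qres.pop()                 # result of q2 is on top
        left = qres.pop()
        qres.append(BINARY[q[1]](left, right))
    else:
        raise ValueError(f"unknown syntactic case: {tag}")


envs = [{"x": 40}]                         # one ENVS section binding x to 40
qres = []
eval_query(("bin", "+", ("name", "x"), ("lit", 2)), envs, qres)
result = qres.pop()
```

Each recursive call pushes one result, so after the final pop QRES is back in the state it had before the query started.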

26 The compositionality principle
… requires that the semantics of a compound statement is a function of the semantics of its components.
–For instance, if we have a compound query q = q1 θ q2, then the semantics of q, |q|, is the result of some function fun_θ( |q1|, |q2| ).
–The function fun_θ depends on the operator θ.
Compositionality allows for orthogonal combination of operators and unlimited recursive nesting of queries.
–The semantics of a complex query is built recursively from the semantics of its components.
–Compositionality is better if syntactic rules are as short as possible.
–Good compositionality ⇒ easier implementation, shorter manuals.
–SQL, OQL – poor compositionality (big, heavy syntactic monsters).
In SBA compositionality concerns all constructs of queries, imperative statements, programming abstractions, etc.

27 Total internal identification
Each database or program entity that could be separately retrieved, updated, inserted, deleted, authorized, indexed, protected or locked should possess a unique internal identifier.
–A unique internal identifier should be assigned not only to objects on the top hierarchy level, but to all sub-objects, including atomic ones.
–We are not interested in the form, structure and meaning of internal identifiers.
–The principle makes it possible to create references and pointers to all possible entities, and thus to avoid conceptual problems with binding, scoping, updating, deleting, parameter passing, and other functionalities that require references as query primitives.
ODMG does not follow this idea.
–ODMG "literals" (components of objects) have no identifiers.
–Thus, for example, it is impossible to extend OQL with updating constructs.

28 Object relativism
If some object O1 can be defined, then an object O2 having O1 as a component can also be defined.
–No limitations on the number of hierarchy levels of objects.
–Objects on any hierarchy level should be treated uniformly.
–An atomic object (having no attributes) should be allowed as a regular data structure.
Object relativism implies the relativism of the corresponding query capabilities.
–There is no need for attributes, sub-attributes, etc.: all of them are objects too.
The idea radically reduces the database model and cuts the size of the query language specification, the implementation and the documentation.
–It greatly supports query optimization and strong typing.

29 Abstract Object Store Models
A component of State is an object store.
–To define the semantics of a query language we have to define an object store precisely, but on the abstract level.
Because various object models introduce a lot of incompatible notions, SBA assumes a family of object store models, which are named M0, M1, M2 and M3.
–M0 covers relational, nested-relational and XML-oriented databases. M0 assumes hierarchical objects and binary links between objects.
–Advanced store models introduce classes and static inheritance (M1), object roles and dynamic inheritance (M2), and encapsulation (M3).
–All the models are served by SBQL.
These store models are pivots: they can be extended and modified, depending on the features that one would like to cover.

30 Notions common to store models
1. Internal object identifier (OID)
–Uniquely identifies an object in the store.
–Assigned automatically, no external meaning.
–Used as a reference or a pointer to an object.
2. External object name
–Usually bears some external semantics of an object, e.g. Person, Customer.
–Explicitly assigned by a database designer, programmer, etc.
–It is usually not unique, e.g. many objects named Person.
3. Atomic object value
–Cannot be subdivided into smaller parts.
–E.g. 2, 3.14, "Doe", "Hello, World!".
–The size is not constrained: from 1 bit to gigabytes.
–So far we are not interested in types (I'll return to this issue later).

31 M0: Complex Objects and Pointer Links
No record, tuple, array, set, etc. constructors in the model: essentially all of them are collections of objects.
External names are not unique: this models collections (bags).
Uniform treatment of relational, nested relational, etc. databases.
I - a set of internal identifiers
N - a set of external names
V - a set of atomic values
<i, n, v> - atomic object (object ID, object name, object value)
<i1, n, i2> - pointer object
<i, n, T> - complex object, T is a set of objects
R ⊆ I – start identifiers

32 M0 object store - example
[Object triples garbled in extraction; the same store is shown on the slide "M0 object store – graphical view".]
Start identifiers: i1, i5, i9, i17, i22

33 M0 object store – graphical view
i1 Emp
  i2 name "Doe"
  i3 sal 2500
  i4 worksIn (pointer)
i5 Emp
  i6 name "Poe"
  i7 sal 2000
  i8 worksIn (pointer)
i9 Emp
  i10 name "Lee"
  i11 sal 900
  i12 address
    i13 city "Rome"
    i14 street "Boogie"
    i15 house# 13
  i16 worksIn (pointer)
i17 Dept
  i18 dname "Trade"
  i19 loc "Paris"
  i20 loc "Rome"
  i21 employs (pointer)
i22 Dept
  i23 dname "Ads"
  i24 loc "Rome"
  i25 employs (pointer)
  i26 employs (pointer)
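One possible in-memory encoding of part of this store, as a minimal Python sketch. The dictionary layout, the Pointer class and the helper names are assumptions made for illustration. The target of the worksIn pointer (i22) is also an assumption, because the transcript does not preserve pointer targets; the employs pointers are left out for the same reason.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pointer:
    """Value of a pointer object: the OID of the object it leads to."""
    target: str


# oid -> (external name, value). The value is an atomic value, a Pointer,
# or, for a complex object, the list of OIDs of its direct sub-objects.
store = {
    "i9": ("Emp", ["i10", "i11", "i12", "i16"]),
    "i10": ("name", "Lee"),
    "i11": ("sal", 900),
    "i12": ("address", ["i13", "i14", "i15"]),
    "i13": ("city", "Rome"),
    "i14": ("street", "Boogie"),
    "i15": ("house#", 13),
    "i16": ("worksIn", Pointer("i22")),   # assumed target, see lead-in
    "i22": ("Dept", ["i23", "i24"]),      # employs pointers omitted
    "i23": ("dname", "Ads"),
    "i24": ("loc", "Rome"),
}
start_identifiers = ["i9", "i22"]


def children(oid):
    """Return (oid, name) pairs of the direct sub-objects of a complex object."""
    _, value = store[oid]
    if isinstance(value, list):
        return [(sub, store[sub][0]) for sub in value]
    return []                              # atomic and pointer objects have none


def database_section(roots):
    """Binders n(i) for the start identifiers, encoded as (name, oid) pairs."""
    return [(store[oid][0], oid) for oid in roots]
```

Every sub-object has its own OID, so a reference can point at any level of the hierarchy, which is what total internal identification asks for.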

34 A relational database in M0
Relational schema: Emp( name, sal, worksIn )
Relation Emp (columns as preserved in the transcript):
name: Doe, Poe, Lee
sal: 2500, 2000
worksIn: Production, Sales
Model M0:
Objects: [object triples garbled in extraction]
Start identifiers: i1, i5, i9
A similar mapping can be applied to hierarchical DB, nested relational DB, XML, RDF, …

35 Environment Stack, ENVS
ENVS is also known as the call stack.
For query processing we have modified and generalized it:
–ENVS is used for binding objects that are stored at a server, hence ENVS contains references to objects rather than object values.
–The same object can be referenced from different stack sections.
–For collections the binding is macroscopic: for instance, if Emp is bound, the binding returns many references.
In PLs the stack usually has two incarnations: static (compile time) and dynamic (run time). Because database objects are always dynamically bound, some properties of the static stack must be shifted to the dynamic stack.
–We will return to the static stack when we consider strong typing.
Besides the classical roles of the stack, in SBA it plays many new roles, in particular in processing non-algebraic operators.

36 PLs: What is ENVS for?
Abstraction and encapsulation: local properties of a procedure are hidden from the programmers who use it. The procedure is seen only through its interface (signature).
Isolation: programmers writing different procedures need not know about each other.
Semantic independence and reuse: a procedure can be invoked from many places in an application.
Unlimited invocations of procedures from other procedures, including recursive calls.
Management of name spaces used in programs: no naming conflicts between the local environments of procedures.
Implementation of parameter passing methods: call-by-value, call-by-reference, strict-call-by-value, etc.

37 Naming, scoping, binding
SBA is based on the naming, scoping and binding paradigm:
Every name occurring in a query is bound to run-time program or database entities, according to the actual scope for the name.
Binding is substituting a name occurring in a query by a run-time program entity (or entities).
This concerns all names, in particular:
–Names of persistent or volatile objects, subobjects (attributes), sub-subobjects, pointers, etc.
–Names of procedures, functions, methods, views, parameters.
–Names of entities from the computer or software environment.
–Any auxiliary names that are defined and used in queries.
ENVS provides a universal scoping and binding mechanism.
–No name occurring in a query can be bound otherwise.

38 New, important concept: binder
A binder is an internal structure used to determine (dynamic) bindings.
A binder consists of two parts: an external name and an internal run-time program entity.
A binder is a pair (n, r), where n belongs to N and r belongs to the domain Result (e.g. a reference to an object). Such a pair is written n(r).
Binders are the basis for binding names occurring in queries.
–Roughly, if n(r) is present on ENVS, then the binding of n returns r.
–If the binder n(r) is not present on ENVS, then the binding of n fails.
–Binders play other important roles as well.
Binders can be nested, for instance Emp( name(i2), sal(2500) ).

39 ENVS in SBA
It consists of sections. Each section is a set of binders.
The stack grows and shrinks according to the nesting of query operators.
The most local data are at the top. The most global data are at the bottom.
[Figure] Stack sections, from the top:
Binders to local entities of the currently executed method
…
The section of the currently processed object: name(i2) sal(i3) worksIn(i4)
Binders to global entities of the user session
The database section: Emp(i1) Emp(i5) Emp(i9) Dept(i17) Dept(i22)
Binders to entities of the global environment

40 Binding through ENVS – function bind
[Figure] Stack sections, from the top (one of the elided sections is marked as omitted):
Emp(i1) G("Mary") X(i221)
….
name(i2) sal(i3) worksIn(i4)
…..
Emp(i1) Emp(i5) Emp(i9) Dept(i17) Dept(i22)
….
The order of visiting stack sections:
–Searching goes from the top section to the bottom section.
–If a proper binder is found, the search is terminated.
–All binders with the given name from the final section are taken.
–Some sections are omitted due to static scoping (as usual in PLs).
bind( G ) = "Mary"
bind( X ) = i221
bind( sal ) = i3
bind( Emp ) = i1
bind( Dept ) = {i17, i22}
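The search order can be sketched as follows, assuming ENVS is a Python list of sections (index 0 is the bottom) and a binder n(r) is a (name, value) pair. The `skipped` parameter and the content of the omitted section are hypothetical and only illustrate static scoping. The sketch always returns a list, whereas the slide writes a single hit without braces.

```python
def bind(name, envs, skipped=()):
    """Return the values of all binders called `name` found in the topmost
    visible section that contains at least one such binder."""
    for index in range(len(envs) - 1, -1, -1):   # from the top section down
        if index in skipped:                     # hidden by static scoping
            continue
        hits = [value for (n, value) in envs[index] if n == name]
        if hits:                                 # first section with a hit wins
            return hits
    raise NameError(f"cannot bind {name!r}")


envs = [
    # bottom: the database section
    [("Emp", "i1"), ("Emp", "i5"), ("Emp", "i9"), ("Dept", "i17"), ("Dept", "i22")],
    [("name", "i2"), ("sal", "i3"), ("worksIn", "i4")],
    [("sal", "i77")],                            # hypothetical omitted section
    # top
    [("Emp", "i1"), ("G", "Mary"), ("X", "i221")],
]
omitted = {2}

dept = bind("Dept", envs, omitted)
emp = bind("Emp", envs, omitted)
sal = bind("sal", envs, omitted)
```

Here `emp` holds only i1 because the top section already contains a binder named Emp, so the database section is never reached.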

41 Opening a new section of ENVS (1)
In PLs opening a new scope on ENVS is caused by entering a new procedure (function, method) or entering a new block.
–Respectively, the scope is removed when control leaves the body of the procedure/block.
To these classical situations we add a new one.
–It is the essence of SBA.
The idea is that some query operators (called non-algebraic) behave on the stack similarly to program blocks.
–For instance, in the SBQL query Emp where ( name = "Poe" and sal > 1000 ) the part ( name = "Poe" and sal > 1000 ) behaves as a program block executed in an environment consisting of the interior of an Emp object. Binding also concerns the names name and sal.
–Hence, we push on ENVS a section with the interior of the currently processed Emp object (next slide).

42 Opening a new section of ENVS (2)
Query: Emp where (name = "Poe" and sal > 1000)
Initial ENVS state (one section): Emp(i1) Emp(i5) Emp(i9) Dept(i17) Dept(i22)
Binding: bind( Emp ) = {i1, i5, i9}
ENVS during evaluation of the condition for the third Emp object:
–top section (interior of the third Emp object): name(i10) sal(i11) address(i12) worksIn(i16)
–bottom section: Emp(i1) Emp(i5) Emp(i9) Dept(i17) Dept(i22)
Binding in the condition: bind( name ) = i10; bind( sal ) = i11
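A minimal Python sketch of this stack discipline for the where operator. The store encoding, the helper names and the way the condition is passed (a Python function instead of a parsed subquery) are assumptions for illustration; address and worksIn sub-objects are left out to keep the store short.

```python
# oid -> (name, value); a complex object's value is the list of its sub-object OIDs
store = {
    "i1": ("Emp", ["i2", "i3"]), "i2": ("name", "Doe"), "i3": ("sal", 2500),
    "i5": ("Emp", ["i6", "i7"]), "i6": ("name", "Poe"), "i7": ("sal", 2000),
    "i9": ("Emp", ["i10", "i11"]), "i10": ("name", "Lee"), "i11": ("sal", 900),
}


def nested(oid):
    """Interior of an object as a list of binders, encoded as (name, oid) pairs."""
    _, value = store[oid]
    if isinstance(value, list):
        return [(store[sub][0], sub) for sub in value]
    return []


def bind(name, envs):
    """Search sections from the top; take all hits from the first matching one."""
    for section in reversed(envs):
        hits = [oid for (n, oid) in section if n == name]
        if hits:
            return hits
    raise NameError(name)


def deref(oid):
    """Replace a reference to an atomic object by the stored value."""
    return store[oid][1]


def where(left_name, condition, envs):
    """Evaluate `left_name where condition` and return the selected references."""
    selected = []
    for oid in bind(left_name, envs):
        envs.append(nested(oid))      # open a section with the object's interior
        try:
            if condition(envs):
                selected.append(oid)
        finally:
            envs.pop()                # close the section again
    return selected


def condition(envs):
    """name = "Poe" and sal > 1000, with names bound in the current ENVS."""
    (name_ref,) = bind("name", envs)
    (sal_ref,) = bind("sal", envs)
    return deref(name_ref) == "Poe" and deref(sal_ref) > 1000


envs = [[("Emp", "i1"), ("Emp", "i5"), ("Emp", "i9")]]
result = where("Emp", condition, envs)
```

The result is a collection of references, not objects, and ENVS returns to its initial single section once the operator finishes.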

43 Function nested – computing an object's interior
Function nested acts on an object reference and returns its interior as a set of binders.
The result of nested is then pushed onto ENVS.
For instance, for the object
i9 Emp
  i10 name "Lee"
  i11 sal 900
  i12 address
    i13 city "Rome"
    i14 street "Boogie"
    i15 house# 13
  i16 worksIn (pointer)
nested( i9 ) = { name( i10 ), sal( i11 ), address( i12 ), worksIn( i16 ) }

44 Generalization of function nested
In general, nested can be applied to any element of Result.
–For a complex object <i, n, {<i1, n1, ...>, <i2, n2, ...>, ..., <ik, nk, ...>}> it holds: nested( i ) = { n1(i1), n2(i2), ..., nk(ik) }. The case is illustrated on the previous slide.
–If i is an identifier of a pointer object <i, n, i1>, and the object store contains the object <i1, n1, ...>, then nested( i ) = { n1(i1) }. This accomplishes navigation along a pointer.
–For a binder n(x) it holds: nested( n(x) ) = { n(x) }. As will be shown, this semantics is consistent with the typical understanding of auxiliary names introduced in queries.
–For a structure, nested returns the union of the results of nested applied to the elements of the structure: nested( struct{ x1, x2, ... } ) = nested(x1) ∪ nested(x2) ∪ ...
For other arguments nested returns the empty set.
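The case analysis can be sketched in Python. The classes Ref, Binder and Struct and the store layout are hypothetical encodings chosen for this example; the target of the worksIn pointer (i22) is an assumption, since the transcript does not preserve it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """A reference (OID); also used as the value of a pointer object."""
    oid: str


@dataclass(frozen=True)
class Binder:
    name: str
    value: object


@dataclass(frozen=True)
class Struct:
    items: tuple


# oid -> (name, value): atomic value, Ref (pointer object) or list of sub-object OIDs
store = {
    "i9": ("Emp", ["i10", "i11", "i12", "i16"]),
    "i10": ("name", "Lee"),
    "i11": ("sal", 900),
    "i12": ("address", ["i13", "i14", "i15"]),
    "i13": ("city", "Rome"),
    "i14": ("street", "Boogie"),
    "i15": ("house#", 13),
    "i16": ("worksIn", Ref("i22")),   # assumed pointer target
    "i22": ("Dept", []),              # interior omitted in this sketch
}


def nested(x, store):
    """Return the binders that `x` contributes to a new ENVS section."""
    if isinstance(x, Ref):
        _, value = store[x.oid]
        if isinstance(value, list):               # complex object: its interior
            return [Binder(store[sub][0], Ref(sub)) for sub in value]
        if isinstance(value, Ref):                # pointer object: follow it
            return [Binder(store[value.oid][0], value)]
        return []                                 # atomic object
    if isinstance(x, Binder):                     # a binder yields itself
        return [x]
    if isinstance(x, Struct):                     # union over the elements
        return [b for item in x.items for b in nested(item, store)]
    return []                                     # any other argument


interior = nested(Ref("i9"), store)
via_pointer = nested(Ref("i16"), store)
```

With this encoding `via_pointer` contains the single binder Dept(i22), which is how a path such as worksIn.dname would reach the department's sub-objects.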

45 Definition of Result for SBQL
Any atomic value belongs to Result.
Any reference (OID) belongs to Result.
If x belongs to Result, then any binder n(x) belongs to Result.
If x1, x2, x3, ... belong to Result, then struct{ x1, x2, x3, ... } belongs to Result.
–The order of elements in a structure can be significant.
–In contrast to typical structures, we do not assume that all elements of a structure must be named (elements need not be binders).
–Implicitly, we assume that for a single element struct{ x1 } = x1.
–Empty structures are not allowed.
If x1, x2, x3, ... belong to Result, then bag{ x1, x2, x3, ... } and sequence{ x1, x2, x3, ... } belong to Result.
–bag and sequence are collection constructors.
Reminder: so far we are not dealing with types.
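The recursive definition translates into a membership check. The sketch below is an assumption-laden illustration: the class names, the choice of Python types standing for atomic values, and the acceptance of empty bags and sequences are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    oid: str


@dataclass(frozen=True)
class Binder:
    name: str
    value: object


@dataclass(frozen=True)
class Struct:
    items: tuple


@dataclass(frozen=True)
class Bag:
    items: tuple


@dataclass(frozen=True)
class Seq:
    items: tuple


def make_struct(*items):
    """Build struct{...}; struct{x1} collapses to x1, empty structures are rejected."""
    if not items:
        raise ValueError("empty structures are not allowed")
    return items[0] if len(items) == 1 else Struct(tuple(items))


def is_result(x):
    """Check membership in the domain Result, following the recursive definition."""
    if isinstance(x, (bool, int, float, str, Ref)):   # atomic values and references
        return True
    if isinstance(x, Binder):
        return is_result(x.value)
    if isinstance(x, Struct):
        return len(x.items) > 0 and all(is_result(i) for i in x.items)
    if isinstance(x, (Bag, Seq)):                      # empty collections accepted (assumption)
        return all(is_result(i) for i in x.items)
    return False


row = make_struct(Binder("n", "Doe"), Binder("s", Ref("i9")))
table = Bag((row,))
```

Objects themselves never appear in this domain; only references to them do.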

46 Summing up: what have we defined so far?
We know precisely what an object store, an atomic object, a complex object, a pointer object and a collection are.
We know precisely how the environment stack ENVS is constructed, what it is for, what binding is, and how a new section on the stack is constructed (binders, function nested).
–Hence, we know precisely what a state is and how it behaves.
We know precisely what a query result and the result stack QRES are.
We understand the idea of abstract implementation in the form of the recursive procedure eval (evaluation of a query).
Now we have all the semantic equipment needed to define SBQL and its abstract implementation for the M0 store model.

47 SBQL atomic queries
Syntax:
–Any literal l is an SBQL query. E.g. 2, 3.14, "Doe", true. A literal l is an external (source code) representation of a value v_l.
–Any name n is an SBQL query. E.g. Emp, sal, worksIn, e, d.
Semantics:
procedure eval( q : query ) {
  …..
  case q is recognized as literal l:
    push( v_l, QRES );
  case q is recognized as name n:
    push( bind(n), QRES );
  …..
}

48 SBQL algebraic operators
Algebraic operators do not use ENVS.
Syntax:
–If ∆ is a symbol denoting a unary algebraic operator and q1 is a query, then ∆ q1 is a query.
–If ∆ is a symbol denoting a binary algebraic operator and q1 and q2 are queries, then q1 ∆ q2 is a query.
Semantics:
procedure eval( q : query ) {
  …..
  case q is recognized as ∆ q1:
    eval( q1 );
    apply the operator denoted by ∆ to the top of QRES;
  case q is recognized as q1 ∆ q2:
    eval( q1 ); eval( q2 );
    apply the operator denoted by ∆ to the two top elements of QRES,
    pop QRES two times, then push the result of ∆ on QRES;
  …..
}

49 K.Subieta. SBA and SBQL, slide 49 Sept. 2006 Examples of algebraic operators A lot of them. We assume that SBQL accepts any operator if some designer wants to introduce it. –Unary algebraic operators: count, sum, avg, max, median, -, log, sqrt, not,... –Binary algebraic operators: Operators and comparisons for primitive types: +, -, *, /, =, >, <, and, or, concatenation of strings, …. Structure constructor Operators and comparisons on collections: sum of bags, equality of bags, intersect, contains, in, concatenation of sequences, … Coercions (changing types or representations) and dereferencing ….. There is a lot of discussions and semantic details concerning particular kinds of operators. In this presentation I skip them.

50 K.Subieta. SBA and SBQL, slide 50 Sept. 2006 Auxiliary naming operators Syntax: If q is a query, then q as n and q group as n are queries. Semantics: Both operators are considered unary algebraic operators parameterized by a name. –Operator as (changing bag/sequence elements into binders): Let q return bag{a1, a2, a3, ...}. Then q as n returns bag{n(a1), n(a2), n(a3), ...}. Similarly for sequences and individual elements. –Operator group as (naming a query result): Let q return some result r. Then q group as n returns a single binder n(r). These simple operators cover all the naming contexts in QLs: –Iteration variables in SQL and OQL –A variable bound by a quantifier –Naming virtual attributes in views –…
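The difference between the two naming operators is easy to see in a few lines of Python. In this sketch a bag is modelled as a Python list and a binder n(x) as the pair (n, x); both representations, and the function names as_ and group_as, are assumptions of the example.

```python
def as_(result, name):
    """q as n: turn every element of a collection (or a single element) into a binder."""
    if isinstance(result, list):
        return [(name, element) for element in result]
    return (name, result)


def group_as(result, name):
    """q group as n: wrap the whole result in one binder."""
    return (name, result)


emps = ["i1", "i5", "i9"]            # hypothetical result of the query Emp
per_element = as_(emps, "e")         # bag{ e(i1), e(i5), e(i9) }
whole = group_as(emps, "g")          # g( bag{ i1, i5, i9 } )
```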

51 K.Subieta. SBA and SBQL, slide 51 Sept. 2006 Examples of auxiliary naming in SBQL Navigational (dependent) join: ((Student as x join (x.takes.Lecture) as y join ( y.taught_by.Professor) as z) where z.rank = "full professor”). (x.name, z.name) Quantifier:  Emp as e (e.sal > 10000) Structure constructor: ( ”Lee” as name, 900 as sal, (“Rome” as city, “Boogie” as street, 13 as house) as address ) as Emp Iteration variable: for each Emp as e do e.sal := e.sal +100;

52 K.Subieta. SBA and SBQL, slide 52 Sept. 2006 SBQL non-algebraic operators Non-algebraic operators use ENVS. They cannot be reduced to any algebra. –SBQL is based on different foundations than the relational algebra. Non-algebraic operators introduced in SBQL: –where (selection), dot (projection, navigation, path expressions), join (dependent or navigational join), quantifiers (universal and existential), order by (sorting) and transitive closures. –All non-algebraic operators are binary. –All have a common semantic core based on the ENVS mechanism. Syntax: q1 where q2, q1 . q2, q1 join q2, ∀ q1 ( q2 ), ∃ q1 ( q2 ), q1 order by q2, q1 close by q2, q1 leaves by q2

53 K.Subieta. SBA and SBQL, slide 53 Sept. 2006 SBQL non-algebraic operators - semantics Consider query q1 θ q2, where θ is a non-algebraic operator. 1. Evaluate query q1. 2. For each e ∈ result(q1) do the following steps: –Push nested(e) as the top section on ENVS. –Evaluate query q2 in this new environment. –Calculate a partial query result through some function partialResultOf_θ( e, result(q2) ); the function depends on θ. –Pop (remove) the top section from ENVS. 3. Merge all partial results into the final result. –It is done by some function mergePartialResults_θ( partialRes_1, partialRes_2, ..., partialRes_k ), which depends on θ.

54 K.Subieta. SBA and SBQL, slide 54 Sept. 2006 Evaluation of a non-algebraic operator [Figure: evaluation of query q1 θ q2 over time, for result(q1) = bag{ e1, e2, e3 }. Starting from the previous state of ENVS, the sections nested(e1), nested(e2) and nested(e3) are in turn pushed on top of it and removed again; each step yields result(q2), and the partial results are combined into result(q1 θ q2).]

55 K.Subieta. SBA and SBQL, slide 55 Sept. 2006 Formal semantics (pseudocode)
procedure eval( q : query ) {
  .......
  case q is recognized as q1 θ q2 : {
    partialResults: bag of Result;
    partialResult, finalResult, e: Result;
    partialResults := ∅;
    eval( q1 );
    for each e in top( QRES ) do {
      push( nested(e), ENVS );
      eval( q2 );
      partialResult := partialResultOf_θ( e, top( QRES ) );
      partialResults := partialResults ∪ { partialResult };
      pop( QRES );
      pop( ENVS );
    };
    finalResult := mergePartialResults_θ( partialResults );
    pop( QRES ); // removing result(q1) from QRES
    push( finalResult, QRES );
  }
  .......
}
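The pseudocode above translates almost line by line into Python. The following is a minimal sketch, not the reference implementation: the tiny store, the tuple encoding of queries, the helper names and the restriction to the operator where plus a single comparison are all assumptions made to keep the example self-contained. The object identifiers and salaries loosely mirror the ones used in the selection example of the slides.

```python
from collections import namedtuple

Ref = namedtuple("Ref", "oid")

# Toy store: oid -> (name, value); a complex object holds references to subobjects.
STORE = {
    "i1": ("Emp", [Ref("i2"), Ref("i3")]),
    "i2": ("name", "Doe"), "i3": ("sal", 2500),
    "i5": ("Emp", [Ref("i6"), Ref("i7")]),
    "i6": ("name", "Poe"), "i7": ("sal", 2000),
    "i9": ("Emp", [Ref("i10"), Ref("i11")]),
    "i10": ("name", "Lee"), "i11": ("sal", 900),
}
ENVS = [{"Emp": [Ref("i1"), Ref("i5"), Ref("i9")]}]   # base section: database binders
QRES = []


def nested(e):
    """Binders to the subobjects of a complex object; empty for anything else."""
    section = {}
    if isinstance(e, Ref):
        _, value = STORE[e.oid]
        if isinstance(value, list):
            for child in value:
                section.setdefault(STORE[child.oid][0], []).append(child)
    return section


def bind(name):
    for section in reversed(ENVS):
        if name in section:
            return list(section[name])
    return []


def deref(x):
    return STORE[x.oid][1] if isinstance(x, Ref) else x


def where_partial(e, result_q2):
    return [e] if result_q2 == [True] else []


def merge_bags(partial_results):
    return [x for partial in partial_results for x in partial]


NON_ALGEBRAIC = {"where": (where_partial, merge_bags)}


def eval_query(q):
    kind = q[0]
    if kind == "lit":
        QRES.append([q[1]])
    elif kind == "name":
        QRES.append(bind(q[1]))
    elif kind == "gt":                       # algebraic: does not touch ENVS
        eval_query(q[1])
        eval_query(q[2])
        (right,), (left,) = QRES.pop(), QRES.pop()
        QRES.append([deref(left) > deref(right)])
    elif kind in NON_ALGEBRAIC:
        partial_result_of, merge_partial_results = NON_ALGEBRAIC[kind]
        partial_results = []
        eval_query(q[1])
        for e in QRES[-1]:
            ENVS.append(nested(e))           # push( nested(e), ENVS )
            eval_query(q[2])
            partial_results.append(partial_result_of(e, QRES[-1]))
            QRES.pop()
            ENVS.pop()
        final_result = merge_partial_results(partial_results)
        QRES.pop()                           # remove result(q1)
        QRES.append(final_result)
    else:
        raise ValueError(kind)


# Emp where (sal > 1000)
eval_query(("where", ("name", "Emp"), ("gt", ("name", "sal"), ("lit", 1000))))
```

Adding another non-algebraic operator to this sketch only requires a new pair of functions in NON_ALGEBRAIC; the loop over result(q1) stays the same, which is the common semantic core the slides refer to.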

56 K.Subieta. SBA and SBQL, slide 56 Sept. 2006 SBQL: Selection q1 where q2 For each element e returned by q1, query q2 is evaluated with ENVS augmented by nested( e ). e belongs to the final result iff q2 returns true for it. Example: Emp where ( sal > 1000 ) [Figure: evaluation of the example. ENVS before evaluation holds the database binders Emp(i1) Emp(i5) Emp(i9) Dept(i17) Dept(i22). Query Emp returns i1, i5, i9. Iterating over this result, a section is pushed on top of ENVS for each element: name(i2) sal(i3) worksIn(i4) for i1; name(i6) sal(i7) worksIn(i4) for i5; name(i10) sal(i11) address(i22) worksIn(i16) for i9. Query sal returns i3, i7, i11; the dereference forced by > yields 2500, 2000, 900; query 1000 returns 1000; query sal > 1000 returns true, true, false. Final result of the query: i1, i5.]

57 K.Subieta. SBA and SBQL, slide 57 Sept. 2006 SBQL – other non-algebraic operators Projection, navigation q 1. q 2 –For each element e returned by q 1, query q 2 is evaluated with ENVS augmented by nested( e ). –The result is the sum of all partial results returned by q 2. –Path expressions are a side effect of the definitions. Dependent join q 1 join q 2 –For each element e returned by q 1, query q 2 is evaluated with ENVS augmented by nested( e ). –The result is the sum of struct{e, v}, where v is an element returned by q 2. Definition of quantifiers, order by, close by, leaves by, etc. exactly in the same style.
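Dot and join differ only in how a partial result is built from e and result(q2). The sketch below makes that visible by passing the partial-result function as a parameter. It is a simplification under stated assumptions: the callable eval_q2_for stands in for "evaluate q2 with nested(e) on top of ENVS", and the department and employee data are hypothetical.

```python
def apply_non_algebraic(result_q1, eval_q2_for, partial_result_of):
    """Common core: iterate over result(q1), evaluate q2 per element, sum the partials."""
    final_result = []
    for e in result_q1:
        final_result.extend(partial_result_of(e, eval_q2_for(e)))
    return final_result


def dot_partial(e, result_q2):
    # q1 . q2 keeps only what q2 returned for e
    return list(result_q2)


def join_partial(e, result_q2):
    # q1 join q2 pairs e with every element v returned by q2: struct{e, v}
    return [(e, v) for v in result_q2]


# Hypothetical data: which employees each department employs.
EMPLOYS = {"d1": ["Doe", "Poe"], "d2": ["Lee"]}
depts = ["d1", "d2"]


def employees_of(dept):
    return EMPLOYS[dept]


navigated = apply_non_algebraic(depts, employees_of, dot_partial)
joined = apply_non_algebraic(depts, employees_of, join_partial)
```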

58 K.Subieta. SBA and SBQL, slide 58 Sept. 2006 Object schema used in examples [Schema diagram: Emp[0..*] with e#, name, job, sal, address[0..1] (city, street, house#). Dept[0..*] with d#, dname, loc[1..*]. Associations between Emp and Dept: worksIn / employs[1..*] and manages[0..1] / boss.]

59 K.Subieta. SBA and SBQL, slide 59 Sept. 2006 Examples of SBQL queries for M0 –Get references of departments for the employee named Doe: (Emp where name = "Doe").worksIn.Dept –Get names of departments together with their average salaries: (Dept join avg(employs.Emp.sal) as avgsal). (dname, avgsal) –Names and cities for employees working in the department managed by Kim: (Dept where (boss.Emp.name) = "Kim").employs.Emp. (name, if exists(Address) then Address.city else "No address") –Get departments employing a professional for any job in the company: Dept where ∀ distinct(Emp.job) as j ( ∃ employs.Emp (j = job)) –Names and salaries of employees earning more than their bosses: (Emp where sal > (worksIn.Dept.boss.Emp.sal)).(name, sal)

60 K.Subieta. SBA and SBQL, slide 60 Sept. 2006 M1 : Classes and static inheritance Classes, methods and inheritance require extension of M0. Classes have two incarnations: as pieces of source code and as run-time database entities. –Usually programming languages deal with classes as second-class citizens, i.e. in the source code only. –In our model we are (so far) not interested in this point of view. We deal with it when we consider static binding and strong typing. –In the M1 store model classes are first-class entities storing invariant properties of their objects, i.e. methods (but not only). Hence in our model classes are objects too, connected with their member objects by a special relationship. Classes are also connected with classes by another relationship known as inheritance.

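61 K.Subieta. SBA and SBQL, slide 61 Sept. 2006 Classes as objects in M1 [Figure: a store with the objects i1 Person (i2 name "Doe", ...), i5 Emp (i6 name "Poe", i7 sal 2000, i8 worksIn, ...) and i9 Emp (i10 name "Lee", i11 sal 900, i16 worksIn, ...), and the class objects i40 PersonClass (i41 age (...code...), ...) and i50 EmpClass (i51 changeSal (...code...), i52 netSal (...code...), ...). The identifiers i22 and i33 also appear in the figure. Arrows labelled "member of" connect objects with their classes; an arrow labelled "inherits from" connects EmpClass with PersonClass.]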

62 K.Subieta. SBA and SBQL, slide 62 Sept. 2006 SBQL semantics for M1 Changes concern only ENVS and non-algebraic operators. –When a non-algebraic operator processes an object i that is a member of a class C1, which inherits from a class C2, etc., ENVS is augmented (starting from the top) by nested(i), nested(i_C1), nested(i_C2), … up to the most general class. –When the non-algebraic operator finishes processing the object, all these sections are removed from ENVS. [Figure: ENVS before, during and after processing the object. During processing, the sections nested(i), nested(i_C1), nested(i_C2), ... lie on top of the previous ENVS state; before and after, only the previous ENVS state is present.]
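The M1 rule can be sketched in Python as a function that pushes one section per object and per class on the inheritance chain. The dictionaries CLASS_OF, SUPERCLASS and NESTED are hypothetical stand-ins for the store; the identifiers follow the Poe example of the slides, with i50 as EmpClass and i40 as PersonClass.

```python
CLASS_OF = {"i5": "i50"}           # object -> its class ("member of")
SUPERCLASS = {"i50": "i40"}        # class -> superclass ("inherits from")
NESTED = {                         # precomputed nested(...) for each entity
    "i5": {"name": "i6", "sal": "i7", "worksIn": "i8"},
    "i50": {"changeSal": "i51", "netSal": "i52"},
    "i40": {"age": "i41"},
}


def sections_for(oid):
    """Sections for an object, listed from the top of ENVS downwards."""
    chain = [oid]
    cls = CLASS_OF.get(oid)
    while cls is not None:
        chain.append(cls)
        cls = SUPERCLASS.get(cls)
    return [NESTED[entity] for entity in chain]


def push_object(envs, oid):
    """Push all sections; the most general class goes in first, the object on top."""
    sections = sections_for(oid)
    for section in reversed(sections):
        envs.append(section)
    return len(sections)


def pop_object(envs, count):
    """Remove the sections pushed for one object."""
    del envs[len(envs) - count:]


def bind(envs, name):
    for section in reversed(envs):
        if name in section:
            return section[name]
    return None


envs = [{"Emp": "i5"}]             # previous ENVS state (database binders)
pushed = push_object(envs, "i5")
found = [bind(envs, n) for n in ("name", "netSal", "age")]
pop_object(envs, pushed)
```

Because binding searches from the top, a property of the object itself shadows a method of its class, and a method of a subclass shadows one of a superclass; this ordering is what realizes overriding in the sketch.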

63 K.Subieta. SBA and SBQL, slide 63 Sept. 2006 Example: Processing an object in M1 Query: (Emp where name = "Poe"). (name, netSal, age) [Figure: ENVS during processing of the subquery after the dot, from top to bottom: nested(i5), internals of the currently processed Poe's object: name(i6) sal(i7) worksIn(i8) …; nested(i50), internals of EmpClass: changeSal(i51) netSal(i52) ...; nested(i40), internals of PersonClass: age(i41) ...; …; binders to database objects: Person(i1) ... Emp(i5) Emp(i9) .... The three nested sections are pushed by the dot.]

64 K.Subieta. SBA and SBQL, slide 64 Sept. 2006 Some peculiarities of M1 Binding and processing methods: –Invocation of a method means that a new section (activation record) is additionally pushed on top of ENVS. –The section contains parameters of the method (evaluated previously), its local environment and a return track. –Rather minor semantic peculiarities connected with encapsulation. A problem - multiple inheritance: –M1 allows for multiple inheritance, but in case of a name conflict there is no solution. This is a general problem, not specific to M1. Next big problem - collections: –They violate object-oriented principles such as substitutability and the open-closed principle (reuse, conceptual continuation). –Possible solutions require specific extensions of M1.

65 K.Subieta. SBA and SBQL, slide 65 Sept. 2006 Examples of SBQL queries for M1 - schema [Schema diagram: Person[0..*] with name, birthYear, age(), Address[0..1] (city, street, house#). Emp[0..*], a subclass of Person, with e#, job[1..*], sal[0..1], changeSal(newSal), netSal(). Dept[0..*] with d#, dname, loc[1..*], budget(). Associations between Emp and Dept: worksIn / employs[1..*] and manages[0..1] / boss.]

66 K.Subieta. SBA and SBQL, slide 66 Sept. 2006 Examples of SBQL queries for M1 –Get names of departments and the average age of their employees (inheritance of the method age). Dept. (dname, avg(employs.Emp.age)) –Get employees that for sure live in the cities where their departments are located (inheritance of Address). Emp where  Address as a (  (worksIn.Dept.loc) as l (a.city = l)) –For each employee get name and the percent of the annual budget of his/her department that is consumed by his/her sal. Emp. (name, (((if exists(sal) then sal else 0) as s). ((s * 12 * 100)/(worksIn.Dept.budget))) –For each person having no salary give the minimal salary in his/her department. for each (Emp where not exists(sal)) as e do e.changeSal( min(e.works_in.Dept.employs.Emp.sal) );

67 K.Subieta. SBA and SBQL, slide 67 Sept. 2006 M2: Dynamic roles and dynamic inheritance The object model with dynamic object roles removes essential conceptual drawbacks of the classical static inheritance. –The idea is that an object during its life can acquire and lose its roles without changing its identity. –Object's business semantics depends on the currently considered role. SBQL is the first (and only) QL dealing with dynamic roles. –Dynamic object roles and dynamic inheritance require extension of M1 and extension of the semantics of non-algebraic operators. [Figure: a Person object surrounded by its roles: Student, Employee, Club-member, Student, Tax-payer, Dog-owner, Patient.]

68 K.Subieta. SBA and SBQL, slide 68 Sept. 2006 Example of the M2 store model [Figure: Person objects i1 (i2 name "Doe", i3 born 1948), i4 (i5 name "Poe", i6 born 1975) and i7 (i8 name "Lee", i9 born 1951); role objects i13 Emp (i14 sal 2500, i15 worksIn), i16 Emp (i17 sal 1500, i18 worksIn) and i19 Student (i20 studentNo 223344, i21 faculty "Physics"); class objects i40 PersonClass (i41 age (...code...), ...), i50 EmpClass (i51 changeSal (...code...), i52 netSal (...code...), ...) and i60 StudentClass (i61 avgScore (...code...), ...). The identifiers i127 and i128 also appear in the figure. Arrows are labelled "is member of", "inherits from" and "dynamically inherits from".]

69 K.Subieta. SBA and SBQL, slide 69 Sept. 2006 SBQL semantics for M2 Changes concern only ENVS and non-algebraic operators. –The order of sections of roles and classes on ENVS is determined by a simple rule (cf. the full description of SBA/SBQL). –Some new operators dealing with roles (dynamic cast, has role). Query: (Emp where name = "Lee"). (sal, born, age) [Figure: ENVS during processing of the subquery after the dot, from top to bottom: properties of the currently processed Emp role: sal(i17) worksIn(i18); properties of EmpClass: changeSal(i51) netSal(i52) ...; properties of the Person super-role of the Emp role: name(i8) born(i9); properties of PersonClass: age(i41) ...; ...; database section: Person(i1) Person(i4) Person(i7) Emp(i13) Emp(i16) Student(i19) .... The role and class sections are pushed by the dot.]

70 K.Subieta. SBA and SBQL, slide 70 Sept. 2006 Examples of SBQL queries for M2 - schema

71 K.Subieta. SBA and SBQL, slide 71 Sept. 2006 Examples of SBQL queries for M2 –Get employees older than 60 who live in Warsaw (dynamic inheritance of the attribute Address and static inheritance of the method age). Emp where age > 60 and ∃ Address (city = "Warsaw") –For each person get the name and the sum of all incomes (salary and scholarships). (Person as p). (p.name, sum(bag(0, ((Student)p).scholarship, ((Emp)p).sal))) –Get students who live in the same city as the city of their school. Student where ∃ Address (city = (studiesAt.School.city)) –Get name, faculty and school name for each person studying at two or more faculties. (((Person as p) join ((((Student)p) group as s))) where count(s) ≥ 2). (p.name, s.(faculty, (studiesAt.School.name)))

72 K.Subieta. SBA and SBQL, slide 72 Sept. 2006 Some qualities of dynamic object roles Multiple inheritance. Because roles are encapsulated there is no name conflict even if the super classes would have different properties with the same name. Repeating inheritance. An object can have two or more roles with the same name. Multiple-aspect inheritance. A class can be specialized according to many aspects. UML covers this feature, but it is neglected in tools. Object migration: An object can change its classes without changing its identity. Temporal properties: Roles can represent any past facts concerning objects. Overlapping collections: an object is included into as many collections as the types of roles it contains. Aspect-Oriented Programming. Dynamic object roles can be considered as a technical facility supporting AOP.

73 K.Subieta. SBA and SBQL, slide 73 Sept. 2006 Imperative constructs of SBQL Once the stack-based machine of SBQL is implemented, adding imperative constructs becomes a fairly easy extension. –For instance, create, update, insert, delete, for each and other control statements. We accept the tradition of classical imperative and object-oriented languages, but provide queries as basic constructs. –Obviously, there are many choices and options. –Classical dilemma between built-in and added-on operators.

74 K.Subieta. SBA and SBQL, slide 74 Sept. 2006 Procedures, functions and methods A procedure call opens a new section on the environment stack. The section contains binders to local procedure objects (transient) and binders related to the actual parameters of the procedure. Local procedure objects are invisible from outside. Scoping rules assume skipping irrelevant stack sections. Queries are used as actual parameters of procedures. A query determines an output from a functional procedure. A call of a functional procedure is considered a query. Procedure p 1 calls p 2. Then, procedure p 2 calls p 3. When p 3 is executed, sections of p 1 and p 2 are irrelevant for binding.
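The scoping rule for nested procedure calls can be illustrated with a small binding function. This sketch assumes that every ENVS section carries a kind tag ("global", "call" for an activation record, "local" for sections pushed by non-algebraic operators); the tags and the data are inventions of this example, not part of the slides.

```python
def bind(envs, name):
    """Bind a name, skipping the sections of callers once the current call is passed."""
    skipping = False
    for section in reversed(envs):
        if skipping and section["kind"] != "global":
            continue                      # irrelevant section of a calling procedure
        if name in section["binders"]:
            return section["binders"][name]
        if section["kind"] == "call":
            skipping = True               # below this record only globals are visible
    return None


# p1 calls p2, p2 calls p3; we are inside p3.
envs = [
    {"kind": "global", "binders": {"Emp": "i1"}},
    {"kind": "call", "binders": {"x": 1}},             # activation record of p1
    {"kind": "call", "binders": {"x": 2, "y": 20}},    # activation record of p2
    {"kind": "call", "binders": {"z": 3}},             # activation record of p3
]

visible_z = bind(envs, "z")       # local object of p3
hidden_y = bind(envs, "y")        # local object of p2, invisible from p3
global_emp = bind(envs, "Emp")    # database binders stay visible
```

Sections tagged "local" that lie above the current activation record are searched normally, so queries inside the procedure body still see the environments opened by their own non-algebraic operators.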

75 K.Subieta. SBA and SBQL, slide 75 Sept. 2006 SBQL: Example of a procedure Procedure ChangeDept moves the specified employees to the specified department. procedure ChangeDept( E: EmpType[0..*]; D: DeptType ) { delete ( Dept. employs ) where Emp in E; for each E as e do { create  e as employs; insert employs into D; e. worksIn :=  D  e. D# := D. D#; }; ChangeDept( Emp where job = “designer” and (worksIn.Dept.boss.Emp.name) = “Lee”; Dept where (boss.Emp.name) = “Kim” ); Let Kim become the manager of all designers working so far for Lee:

76 K.Subieta. SBA and SBQL, slide 76 Sept. 2006 SBQL updatable object views 30 years of R&D on views have brought only minor results. –Very restricted view updating in Oracle and DB2. –Proposals concerning object-oriented views: limited and immature. Essentially, all previous solutions were based on the assumption that a view definition is a function determined by a single query. –View updating through side effects of the definition. The first good solution for RDBMS: "instead of" trigger views. –Based on overloading of an updating operation on a virtual table by a trigger with the code accomplishing the intention of the updating. SBQL views are based on a similar idea, but it is incomparably more general and efficient. –The idea works for any data model, including XML and O-O ones. –It assumes that each operation on a virtual object is overloaded by a special procedure written by the view definer. –The procedure expresses the definer's intention of the operation.

77 K.Subieta. SBA and SBQL, slide 77 Sept. 2006 General scenario of view processing [Figure: three cooperating parts. User program: a query invoking the view, and a consumer of the query result (e.g. the operator "update"). View definition: it supplies virtual identifiers and contains the procedure overloading the given consumer. Interpreter of queries and updating statements: it contains a piece of the interpreter code implementing the given consumer.]

78 K.Subieta. SBA and SBQL, slide 78 Sept. 2006 Overloaded operations on virtual objects Dereference, i.e. taking the value of a virtual object (on_retrieve). –Unavailable in "instead of" trigger views. Assigning a new value to a virtual object (on_update). Deleting a virtual object (on_delete). Inserting a (material or virtual) object into a virtual object (on_insert). Creating a new virtual object (on_create). If any of the procedures on_retrieve, …, on_create is not defined by the view definer, the corresponding operation on virtual objects is not allowed.

79 K.Subieta. SBA and SBQL, slide 79 Sept. 2006 Example of a virtual updatable view
create view bestSellingBookDef {
  virtual objects bestSellingBook { return (Book where sold > 1000) as b; }
  on_delete do { delete b; }
  create view vtitleDef {
    virtual objects vtitle { return (b.title) as t; }
    on_retrieve do { return deref( t ); } }
  create view vauthorDef {
    virtual objects vauthor { return (b.author) as a; }
    on_retrieve do { return deref( a ); } }
  create view vpriceDef {
    virtual objects vprice { return (b.price) as p; }
    on_retrieve do { return convertToEuro( b.currency, p ); }
    on_update (newPrice) do { p := convertFromEuro( b.currency, newPrice ); } } }
Stored objects: Book (title, author, price, currency, sold). Virtual objects: bestSellingBook (vtitle, vauthor, vprice). vprice is always in euro; an update of vprice is converted to the proper currency of the book. Example of use:
for each (bestSellingBook where vtitle = "MDA") do vprice := vprice - 10;
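The overloading mechanism behind such views can be imitated in Python: a virtual object carries a seed (here a stored book) and a table of handler procedures, and an operation without a handler is refused. Everything in this sketch is an assumption made for illustration, including the class VirtualObject, the handler table and the conversion rates, which are chosen only to keep the arithmetic simple.

```python
RATES_TO_EURO = {"EUR": 1.0, "USD": 0.5}    # hypothetical rates, not real ones


class VirtualObject:
    """A virtual identifier: a seed plus the procedures overloading operations on it."""

    def __init__(self, seed, handlers):
        self.seed = seed
        self.handlers = handlers

    def _call(self, operation, *args):
        if operation not in self.handlers:
            # The view definer did not provide this procedure: operation not allowed.
            raise PermissionError(f"{operation} is not defined for this view")
        return self.handlers[operation](self.seed, *args)

    def retrieve(self):
        return self._call("on_retrieve")

    def update(self, new_value):
        return self._call("on_update", new_value)

    def delete(self):
        return self._call("on_delete")


def vprice(book):
    """Virtual price of a stored book: always seen in euro, stored in the book's currency."""
    return VirtualObject(book, {
        "on_retrieve": lambda b: b["price"] * RATES_TO_EURO[b["currency"]],
        "on_update": lambda b, new: b.update(price=new / RATES_TO_EURO[b["currency"]]),
    })


book = {"title": "MDA", "price": 100.0, "currency": "USD", "sold": 2000}
price = vprice(book)
before = price.retrieve()          # price seen through the view, in euro
price.update(before - 10)          # vprice := vprice - 10
```

The caller never sees the stored currency: the retrieve and update handlers are the only places where the conversion happens, which is the transparency property the view example is meant to show.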

80 K.Subieta. SBA and SBQL, slide 80 Sept. 2006 Strong static type checking in SBQL In our approach and implementation we have adopted the following tenets: –Don't trust intuitions: they easily lead to inconsistency (the ODMG case). –Don't trust type theories: they are too idealistic and address mathematical models that are very limited in practice. –Distinguish internal and external type systems. The internal type system reflects the behavior of the type checking mechanism. The external type system is seen and used by the programmer. The internal type system is much more sophisticated than the external one. Both must coincide, but the internal type system should dominate. ODMG defined an external type system only. This is like defining the construction of a building by describing its front elevation. –Trust only an abstract implementation of the internal type system.

81 K.Subieta. SBA and SBQL, slide 81 Sept. 2006 Static type checking mechanism for QLs Any static strong type checking mechanism must simulate run-time computations during compile time… –…by reflecting run-time semantics with the precision that is available at compile time. New semantic properties of query languages make known strong typing systems and their theories totally useless. Current OO models and XML models introduce many peculiarities that make strong typing very challenging: –Ellipses, automatic coercions, automatic dereferences. –Mutability, collection cardinality constraints, collection types (set, bag, sequence, etc.), type names, multimedia types, .... –Irregularities in data structures (semi-structured data). We call the SBQL type system "semi-strong" to underline its liberal attitude to strong typing.

82 K.Subieta. SBA and SBQL, slide 82 Sept. 2006 Roles and functions of the SBQL typing system Compile-time type checking of query operators, imperative constructs, procedures, functions, methods, views and modules. User-friendly, context-dependent reporting of type errors. Resolving ambiguities with automatic type coercions, ellipses, dereferences, literals and binding irregular data structures. Shifting a type check to run-time if it is impossible to do it during compile time. Restoring the type checking process after a type error. –To discover more than one type error in one run. Preparing information for query optimization by properly decorating the query syntax tree. –Decorations allow for automatic decisions concerning query rewriting, the use of indices, etc.

83 K.Subieta. SBA and SBQL, slide 83 Sept. 2006 Internal SBQL type system Three basic data structures are compile-time counterparts of run time structures: –Metabase – a counterpart of an object store. –Static environment stack S_ENVS – a counterpart of ENVS –Static result stack S_QRES – a counterpart of QRES Static stacks contain and process type signatures – internal typing counterparts of corresponding run time entities. –Signatures are atomic types, references to metabase nodes, static binders n(s), where s is a signature, struct and bag signatures, etc. –Signatures are additionally associated with attributes, such as mutability, cardinality, collection kind, type name, multimedia, etc. For each query/program operator a decision table is provided: –It determines allowed combinations of signatures and attributes, the resulting signature and its attributes, and additional actions. Then, the type checking engine simulates run-time semantics.
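The idea that the type checker simulates run-time evaluation on signatures can be sketched with a static counterpart of eval. The metabase contents, the signature encoding (strings for atomic types, tuples such as ("bag", ("ref", "Emp")) for the rest) and the two supported operators are assumptions of this example; automatic dereference of attribute references is folded into the metabase lookup for brevity, and decision tables are reduced to hard-coded checks.

```python
METABASE = {"Emp": {"name": "string", "sal": "integer"}}   # hypothetical schema


class TypeCheckError(Exception):
    pass


def static_nested(signature):
    """Static counterpart of nested(): binders to the attributes of a class."""
    if isinstance(signature, tuple) and signature[0] == "ref":
        return dict(METABASE[signature[1]])
    return {}


def static_eval(q, s_envs, s_qres):
    kind = q[0]
    if kind == "lit":
        s_qres.append("integer" if isinstance(q[1], int) else "string")
    elif kind == "name":
        for section in reversed(s_envs):
            if q[1] in section:
                s_qres.append(section[q[1]])
                return
        raise TypeCheckError(f"name {q[1]!r} cannot be bound")
    elif kind == "gt":
        static_eval(q[1], s_envs, s_qres)
        static_eval(q[2], s_envs, s_qres)
        right, left = s_qres.pop(), s_qres.pop()
        if not (left == right == "integer"):
            raise TypeCheckError(f"cannot compare {left} > {right}")
        s_qres.append("boolean")
    elif kind == "where":
        static_eval(q[1], s_envs, s_qres)
        _, element = s_qres[-1]                 # signature of one element of q1
        s_envs.append(static_nested(element))   # processed once, not per element
        static_eval(q[2], s_envs, s_qres)
        condition = s_qres.pop()
        s_envs.pop()
        if condition != "boolean":
            raise TypeCheckError("where needs a boolean condition")
        # the signature of q1 stays on S_QRES as the signature of the whole query
    else:
        raise ValueError(kind)


def check(q):
    s_envs = [{"Emp": ("bag", ("ref", "Emp"))}]   # static base section
    s_qres = []
    static_eval(q, s_envs, s_qres)
    return s_qres.pop()


ok = check(("where", ("name", "Emp"), ("gt", ("name", "sal"), ("lit", 1000))))
```

The main difference from the run-time procedure is visible in the where case: the static stack is augmented once with the signature of an element, instead of once per element of a concrete result.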

84 K.Subieta. SBA and SBQL, slide 84 Sept. 2006 Example decision table for dot Syntax: q_L . q_R E1, E2 are signatures of individual elements (not collections). T1, T2 are signatures of any types. Other cases (not included in this table) are type errors. Table columns: signature of q_L; signature of q_R; signature of q_L . q_R; additional action. Rows as far as they are preserved in this transcript:
q_L: E1; q_R: E2; q_L . q_R: E2
q_L: E1; q_R: bag{T2}
q_L: E1; q_R: sequence{T2}
q_L: sequence{T1}; q_R: E2; q_L . q_R: sequence{E2}
q_L: sequence{T1}; q_R: sequence{T2}
q_L: bag{T1}; q_R: E2; q_L . q_R: bag{E2}
q_L: bag{T1}; q_R: bag{T2}

85 K.Subieta. SBA and SBQL, slide 85 Sept. 2006 Query optimization Query optimization is not directly addressed by a database standard, but it is closely related to it. –Lack of query optimization undermines the goals of the standard. Optimization requires discipline in the QL's development: –Occam's razor: minimizing the data model, minimizing the number of features of the QL, avoiding irregular treatment and special cases. –Orthogonality of constructs, avoiding big syntactic monsters. –Precise formal semantics of all query operators, allowing one to reason about semantically equivalent queries and their performance. SBA, as a formal methodology of building OO query languages, is exceptionally well prepared for query optimization. –I believe that it is much better prepared than the relational model and SQL.

86 K.Subieta. SBA and SBQL, slide 86 Sept. 2006 Optimization methods for SBQL Methods based on rewriting: –Factoring out independent subqueries, rules based on the distributivity property, removing dead subqueries, query modification for processing stateless functions and views, query tail absorption, … Methods based on indices: –Involving dense and range indices, index management utilities. Methods based on query caching: –Storing results of queries in order to reuse them. Pipelining, parallel execution of queries: –Splitting a complex query processing into many parallel processes. Query optimization for distributed databases with horizontal and vertical fragmentations. Heuristics and cost models concerning a query execution plan. Query optimization is the major topic of the research on SBA.

87 K.Subieta. SBA and SBQL, slide 87 Sept. 2006 Conclusions To make a high quality standard for object-oriented databases, the specification of semantics is a must, … –…to avoid the fate of the SQL-99 and ODMG standards, perceived as loose recommendations rather than technical specifications. SBA offers a unique method of constructing query languages and specifying their semantics. –SBA is a holistic database theory; it does not give up any feature, even the most advanced, of current practical O-O database query and programming languages. –Even the smallest semantic problem is considered very important. –Efficiency has been proven by several implementations. Alternatively, the new standard can rely on the current well-known theories concerning object-oriented databases. –In such a case many of the standard's intended qualities will remain mere wishes.

88 K.Subieta. SBA and SBQL, slide 88 Sept. 2006 10 unique qualities of SBA/SBQL for a new O-O database standard 1. Orthogonal syntax, full compositionality of queries. 2. Universal formal semantics based on abstract implementation. 3. Computational universality, advanced data structures, integration with PL constructs. 4. Strong typing of advanced O-O queries and programs. 5. Several advanced implementations; further ones are pending. 6. Fully transparent O-O virtual updatable views. 7. Strong potential for query optimization. 8. All O-O notions treated formally and uniformly. 9. Sound and manageable metamodel. 10. The potential for distributed query processing.

89 K.Subieta. SBA and SBQL, slide 89 Sept. 2006 Appendix – current SBQL developments European Project eGov Bus –A dynamically adaptable information system supporting life events experienced by the citizen or business serviced by European government organizations. –Integration of distributed, heterogeneous, redundant and fragmented resources for eGov applications. –8 European partners, Jan 2006 - Dec 2007, budget 4 M Euro European Project VIDE –The UML-compliant action language VIDE to be researched, developed, evaluated and disseminated during the project will enable fully visual prototyping, programming, debugging and documenting of future applications. –Implementation of OMG MDA, visual programming. –10 European partners, July 2006 – Dec 2008, budget 4 M Euro

90 K.Subieta. SBA and SBQL, slide 90 Sept. 2006 SBQL in the eGov Bus project SBQL as an embedded QL for application programming in Java. SBQL as a self-contained DBPL for application programming. SBQL updatable views will play the following roles: –As mediators that virtually convert a local data and service resource to the shape required by the canonical model. –As integrators that virtually fuse fragmented collections residing on different servers, resolve heterogeneities, remove redundancies, join fragmented remote services into the form of a procedure (cf. life events) and (if necessary) equip some remote objects with new classes and methods. –As customizers that adapt the data seen through the canonical model to the needs of particular end-user applications and/or particular users.

91 K.Subieta. SBA and SBQL, slide 91 Sept. 2006 SBQL in the VIDE project SBQL queries will occur explicitly as VIDE language constructs Moreover, SBQL queries will be depicted visually in the visual version of the language code. SBQL queries will address directly UML-like class diagrams. SBQL queries will be components of visual programming graphical metaphors. The VIDE language will have both visual and textual syntax with no losses in the automated conversion between the two. The programmer will have the freedom of choice to edit either the visual or the textual code.

