Presentation is loading. Please wait.

Presentation is loading. Please wait.

Dependency Problems for Future Database Systems: Ingres Project D Adrian Hudnott Warwick Postgraduate Colloquium in Computer Science 2008.

Similar presentations


Presentation on theme: "Dependency Problems for Future Database Systems: Ingres Project D Adrian Hudnott Warwick Postgraduate Colloquium in Computer Science 2008."— Presentation transcript:

1 Dependency Problems for Future Database Systems: Ingres Project D Adrian Hudnott Warwick Postgraduate Colloquium in Computer Science 2008

2 Research Problems The Third Manifesto is the subtitle of a book written by Chris Date and Hugh Darwen that defines what a relational database system should and should not do The Third Manifesto and related publications do not include any implementation techniques There is no scalable implementation of any part of The Manifesto currently in existence There are concerns that some of the demands of The Manifesto cannot be met using extensions of techniques in use by SQL DBMS When these were catalogued all of the difficult ones had a common theme: constraints All the problems involve the DBMS efficiently deducing a fact using known properties about the data source Little or no access to the actual data

3 Research Problems Constraint Enforcement Semantic Query Optimization Multiple Simultaneous Assignment Updates to Views Interface with SQL Databases Determination of Type (compile- and run- time)

4 Database Representation Use Datalog (like Prolog) E.g. supplier(s1, smith, 20, london). supplier(s2, jones, 10, paris). supplier(s3, blake, 30, paris). supplier(s4, clarke, 20, london). supplier(s5, adams, 30, athens).

5 Views Expressed using Datalog rules E.g. supplier_city(C) :- supplier(SN, N, ST, C).

6 Notation Extensional Database: EDB Intensional Database: IDB Statistics: Stat Constraints (invariant): Inv Assignments: R i := U i Queries: Q i (t) Boolean expressions: B i (t) Cost function: c(exp)

7 Constraint Enforcement Or, alternatively: (Prove Safety) (Prove Failure) N.B. Minimize access to the database (EDB)

8 Updates in the Presence of Views Example insertion: supplier(s6, carter, 30, oxford). R is “supplier” U is OR supplier_city(C) :- supplier(SN, N, ST, C); (SN=s6, N=carter, ST=30, C=oxford).

9 Materialized Views Materialized View: contents of a view cached to avoid re-calculation. Can be modelled using constraints CREATE TABLE supplier (City PRIMARY KEY REFERENCES supplier_city,... CREATE TABLE supplier_city (City PRIMARY KEY REFERENCES supplier); And vice versa: A constraint is a view that is always empty.

10 Multiple Assignment VAR Customers RELATION{Customer#,... VAR Accounts RELATION{Account#, Customer#,... CONSTRAINT Customers{Customer#} = Accounts{Customer#} INSERT Customers RELATION{TUPLE{Customer# 10,...}}, INSERT Accounts RELATION{TUPLE{Customer# 10,...}} ;

11 Multiple Assignment: Naive Implementation Example: X := Y, Y := X; Internal representation: atomic { Z := X; X := Y; Y := Z; }

12 Multiple Assignment IIF U’ is the first assignment:.... R’:= U’, R1 := U1, …, Rn := Un;

13 Multiple Assignment Idea 1)Semantically optimize RHS expressions 2)Place RHS expressions into canonical form and find subexpressions equal to other RHS expressions 3)Build a dependency graph 4)Depth first search to find cycles 5)Break cycles by creating simulated copies 6)Schedule and execute assignments

14 Transaction Repair Transaction repair is modifying an update request so that it conforms with the constraints Compensating actions –E.g. ON DELETE CASCADE Research on more advanced repairs –Generates more than one repair

15 View Update = Transaction Repair + Disambiguation DELETE FROM supplier_city WHERE City = ‘London’; Transaction repair needed: DELETE FROM supplier WHERE city=‘London’; For more complex cases, disambiguation is needed –Default values, rules, etc.

16 Semantic Optimization No access to the data and: (Equivalence) (Lower Cost)

17 NULLs are flawed for many reasons –Simple example: B OR NOT B is NULL if B is NULL Can consider missing data as free variables Problem is to efficiently determine if a formula is true for all possible values of the free variable Missing Information

18 Current Progress Catalogued all known issues requiring investigation Outlined solutions or literature for the lesser problems –E.g. comparison of large values Devised formal semantics for type checking Tutorial D programs Elaborated on the problems requiring original research in the context of dependency Resolved perceived ambiguities in The Manifesto Detailed ideas for future development into a solution to the multiple assignment problem Completed a literature review on constraint checking techniques

19 Project D Roadmap Further literature review & code familiarization Summer 2008 Address multiple assignment problemOct 2008 – Jan 2009 Development Phase 1Feb – Aug 2009 Development Phase 2Aug 2009 – Mar 2010 Thesis write-upApr – Sep 2010 First complete buildMid – Late 2011 Product releasedEarly 2012

20 Summing up Project D Essentially about relieving the industry of the burdens that have been imposed by SQL implementations in the past 30 years We think that the relational model can be properly implemented without SQL’s restrictions and ad hoc additions –… but requires innovative research Project D is still in very early development However, some small successes already


Download ppt "Dependency Problems for Future Database Systems: Ingres Project D Adrian Hudnott Warwick Postgraduate Colloquium in Computer Science 2008."

Similar presentations


Ads by Google