The Forest and the Trees Julia Stoyanovich Candidacy Exam in Database Systems Fall 2005.

1 The Forest and the Trees Julia Stoyanovich Candidacy Exam in Database Systems Fall 2005

2 October 31, 2005 Preparing to Study for the Exam: reading list

3 Preparing to Study for the Exam: reading list; length (short, long, > 50 pages)

4 Preparing to Study for the Exam: reading list; length (short, long, > 50 pages); topic (Query Processing, Views, XML, …)

5 Preparing to Study for the Exam: reading list; length (short, long, > 50 pages); topic (Query Processing, Views, XML, …); interesting (yes, so-so)

6 Preparing to Study for the Exam: reading list; length (short, long, > 50 pages); topic (Query Processing, Views, XML, …); interesting (yes, so-so); age (same as I, 1980-1990, after 1990)

7 Preparing to Study for the Exam: reading list; length (short, long, > 50 pages); topic (Query Processing, Views, XML, …); interesting (yes, so-so); age (same as I, 1980-1990, after 1990); by IBM Research or mentions System R (yes, no)

8 The Enchanted Forest: query processing; transaction systems; semi-structured, web; integration, OLAP, views

9 Goals: database system, general, extensible, elegant, efficient, expressive, system design, data model, simple, processing capabilities

10 Goals: database system, general, extensible, elegant, efficient, expressive, system design, data model, simple, processing capabilities

11 Goals: database system, general, extensible, elegant, efficient, expressive, system design, data model, simple, processing capabilities

12 Trade-offs: system or component, extensible, simple, efficient, expressive, general, elegant

13 Trade-offs: system or component, simple, efficient, general, elegant, extensible, expressive, design

14 Trade-offs: system or component, simple, efficient, general, elegant, extensible, expressive, design, careful design, compromise, adaptability, theoretical foundations

15 Roadmap: Introduction, Theoretical Foundations, Careful Design, Compromise, Adaptability

16 Materialized Views: query rewriting

17 Materialized Views: query rewriting, domain, query optimization, data integration

18 Materialized Views: query rewriting, domain, assumptions, closed-world, open-world, query optimization, data integration

19 Materialized Views: query rewriting, properties, domain, assumptions, closed-world, open-world, contained, equivalent, maximally-contained, query optimization, data integration

20 Materialized Views: query rewriting, properties, domain, assumptions, closed-world, open-world, complexity, algorithm, output, bucket, inverse rules, MiniCon, query reformulation, execution plan, transformational, System R style, contained, equivalent, maximally-contained, query optimization, data integration

21 Roadmap: Introduction, Theoretical Foundations, Careful Design, Compromise, Adaptability

22 System R: system design

23 System R: system design, cost-based optimization, access path selection, plan enumeration, cost metric, CPU, I/O, table scan, index scan, interesting orders, dynamic programming, stats catalogue

24 System R: system design, cost-based optimization, access path selection, plan enumeration, cost metric, CPU, I/O, table scan, index scan, interesting orders, dynamic programming, code generation, stats catalogue

25 System R: system design, cost-based optimization, access path selection, plan enumeration, cost metric, CPU, I/O, table scan, index scan, interesting orders, dynamic programming, code generation, stats catalogue, locking, isolation levels, hierarchy of locks

26 System R: system design, cost-based optimization, access path selection, plan enumeration, cost metric, CPU, I/O, table scan, index scan, interesting orders, dynamic programming, code generation, stats catalogue, locking, logging & recovery, in-place recovery, logical logging, isolation levels, hierarchy of locks
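
The System R slides name dynamic programming under plan enumeration without showing it. The following is a minimal sketch of that idea, assuming left-deep join orders only and a toy cost equal to the sum of intermediate result sizes. The function name, the `card` and `sel` inputs, and the cost model are illustrative assumptions, not System R's actual CPU and I/O metric, and interesting orders are ignored.

```python
from itertools import combinations


def best_left_deep_plan(card, sel):
    """Pick the cheapest left-deep join order with dynamic programming.

    card: {table name: row count}                      (hypothetical input)
    sel:  {frozenset({a, b}): join selectivity}        (hypothetical input)
    Returns (cost, estimated rows, join order as a tuple).
    """
    tables = sorted(card)
    # best[S] holds the cheapest plan found for joining exactly the tables in S.
    best = {frozenset([t]): (0.0, float(card[t]), (t,)) for t in tables}
    for size in range(2, len(tables) + 1):
        for subset in combinations(tables, size):
            s = frozenset(subset)
            for t in subset:  # t is the table joined last
                rest = s - {t}
                cost, rows, order = best[rest]
                out = rows * card[t]
                # Apply every join predicate between t and the tables already joined;
                # a missing predicate means a cross product (selectivity 1.0).
                for r in rest:
                    out *= sel.get(frozenset((r, t)), 1.0)
                cand = (cost + out, out, order + (t,))
                if s not in best or cand[0] < best[s][0]:
                    best[s] = cand
    return best[frozenset(tables)]
```

Calling `best_left_deep_plan({"A": 1000, "B": 100, "C": 10}, {frozenset("AB"): 0.01, frozenset("BC"): 0.1})` joins the two small tables first and brings in `A` last, because each subset's best plan is reused when larger subsets are built.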

27 Google: system design

28 Google: system design, ranking metric, IR, PageRank, term frequency, inverse document frequency, link structure

29 Google: system design, ranking metric, distributed crawlers, IR, PageRank, term frequency, inverse document frequency, link structure

30 Google: system design, indexing infrastructure, ranking metric, distributed crawlers, IR, PageRank, term frequency, inverse document frequency, link structure, compression, inverted files, custom file system

31 Google: system design, indexing infrastructure, ranking metric, distributed crawlers, IR, PageRank, term frequency, inverse document frequency, link structure, compression, inverted files, custom file system, parallelism
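
The Google slides list PageRank as the link-structure part of the ranking metric. As a rough illustration, PageRank can be approximated by power iteration over a link graph. The sketch below assumes a small in-memory graph, a damping factor of 0.85, and a fixed iteration count; the function name and parameters are illustrative and say nothing about how the production system computes ranks.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links: {page: [pages it links to]}; every page must appear as a key.
    Returns {page: rank}, with ranks summing to roughly 1.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the same small "random jump" share.
        new = {p: (1.0 - damping) / n for p in pages}
        for p, outs in links.items():
            # A page with no outgoing links spreads its rank over all pages.
            targets = outs if outs else pages
            share = damping * rank[p] / len(targets)
            for q in targets:
                new[q] += share
        rank = new
    return rank
```

For a graph such as `{"a": ["b", "c"], "b": ["c"], "c": ["a"]}`, page `c` collects rank from both `a` and `b` and therefore ends up ranked above `b`.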

32 Roadmap: Introduction, Theoretical Foundations, Careful Design, Compromise, Adaptability

33 Common Trade-offs: resource, space, hardware, disk, memory, time, CPU

34 Common Trade-offs: resource, space, hardware, indexes, views, disk, memory, time, CPU

35 Common Trade-offs: resource, space, hardware, indexes, views, inversion algorithms, disk, memory, time, hashing, sorting, CPU

36 Common Trade-offs: resource, space, hardware, indexes, views, inversion algorithms, disk, memory, time, hashing, sorting, CPU, compression

37 Common Trade-offs: resource, space, hardware, indexes, views, inversion algorithms, disk, memory, time, hashing, sorting, parallel algorithms, CPU, compression

38 XML Storage Strategies: clustering, storage, strategy

39 XML Storage Strategies: clustering, document order, by tag, by real-world object, storage, text file, RDBMS, object manager, strategy

40 XML Storage Strategies: clustering, document order, by tag, by real-world object, storage, text file, RDBMS, object manager, file, strategy

41 XML Storage Strategies: clustering, document order, by tag, by real-world object, storage, text file, RDBMS, object manager, DTD, file, edge, attribute, strategy

42 XML Storage Strategies: clustering, document order, by tag, by real-world object, storage, text file, RDBMS, object manager, DTD, file, edge, attribute, strategy

43 XML Storage Strategies: clustering, document order, by tag, by real-world object, storage, text file, RDBMS, object manager, DTD, file, edge, attribute, light-weight objects, strategy
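
The XML storage slides list "edge" among the strategies for storing XML in an RDBMS. One way to picture it is a single table with one row per parent-child edge of the document tree. The sketch below is an assumption-laden illustration: the column layout `(parent_id, ordinal, tag, node_id, text)` and the function name are hypothetical, attributes are ignored, and the rows are returned as Python tuples rather than written to a database.

```python
import xml.etree.ElementTree as ET


def shred_to_edges(xml_text):
    """Flatten an XML document into edge rows, one per element.

    Each row is (parent_id, ordinal, tag, node_id, text):
    parent_id is None for the root, and ordinal preserves document order
    among siblings. This column layout is an illustrative assumption.
    """
    rows = []
    next_id = 0

    def visit(elem, parent_id, ordinal):
        nonlocal next_id
        node_id = next_id
        next_id += 1
        # Keep leaf text; whitespace-only text becomes None.
        text = (elem.text or "").strip() or None
        rows.append((parent_id, ordinal, elem.tag, node_id, text))
        for i, child in enumerate(elem):
            visit(child, node_id, i)

    visit(ET.fromstring(xml_text), None, 0)
    return rows
```

With every edge in one table, a path such as `book/title` turns into a self-join on `parent_id = node_id`, which hints at the trade-off between a simple schema and join-heavy queries.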

44 Roadmap: Introduction, Theoretical Foundations, Careful Design, Compromise, Adaptability

45 Garlic: system architecture

46 Garlic: wrapper, system architecture, middleware

47 Garlic: wrapper, repository, modeling, method invocation, query processing, query planning, plan translation, query execution, execution plan, cost info, system architecture, middleware

48 Memory-Conscious Algorithms: calibrator

49 Memory-Conscious Algorithms: calibrator, measure, TLB, cache, size, cost of a miss

50 Memory-Conscious Algorithms: calibrator, measure, TLB, optimize, locality, CPU parallelism, cache, size, cost of a miss

51 Adaptive Query Optimization: run-time activities, statistics maintenance, query re-optimization, adjust selectivities, discover correlations, re-plan, choose among concurrent plans

52 Conclusion: database systems, problems, large datasets, complex environments, trade-offs

53 Conclusion: database systems, problems, solutions, large datasets, complex environments, trade-offs, theoretical foundations, careful design, compromise, adaptability

54 Thank you!

55 Data Models: data, structured, semi-structured, unstructured, homogeneous, heterogeneous, structure, consistency, size, central, distributed, location

56 Query Optimization: access path selection, table scan, index lookup, cost estimation, CPU, I/O, communication, plan enumeration, operator selection, sort-based, hash-based, tasks of a query optimizer, physical properties, order, delegation of processing, to all nodes, to a subset of nodes

