DBMSs on a Modern Processor: Where Does Time Go? Anastassia Ailamaki Joint work with David DeWitt, Mark Hill, and David Wood at the University of Wisconsin-Madison.

DBMSs on a Modern Processor: Where Does Time Go? Anastassia Ailamaki Joint work with David DeWitt, Mark Hill, and David Wood at the University of Wisconsin-Madison

© 1999 Anastassia Ailamaki2 Higher DBMS Performance Sophisticated, powerful new processors + Compute, memory intensive DB apps = Suboptimal performance for DBMSs Where is query execution time spent? Look for performance bottlenecks in processor and memory components

© 1999 Anastassia Ailamaki5 The DBMS New Bottleneck Earlier bottleneck was I/O, now memory and compute intensive apps Modern platforms: 3sophisticated execution hardware 3fast, non-blocking caches and memory still... DBMSs hardware behavior is suboptimal, compared to scientific workloads.

© 1999 Anastassia Ailamaki6 An Execution Pipeline FETCH/ DECODE UNIT DISPATCH EXECUTE UNIT RETIRE UNIT INSTRUCTION POOL L1 I-CACHEL1 D-CACHE L2 CACHE : Branch prediction, non-blocking caches, out-of-order MAIN MEMORY

© 1999 Anastassia Ailamaki8 Setup and Methodology Four commercial DBMSs: A, B, C, D 6400 PII Xeon/MT running Windows NT 4 Used processor counters Range Selection (sequential, indexed) select avg (a3) from R where a2 > Lo and a2 < Hi Equijoin (sequential) select avg (a3) from R, S where R.a2 = S.a1 WHY ME?

© 1999 Anastassia Ailamaki9 Why Simple Queries? Easy to setup and run Fully controllable parameters Enable iterative hypotheses Allow to isolate behavior of basic loops (workload not good for comparing speed) Building blocks for complex workloads?

© 1999 Anastassia Ailamaki14 Memory Bottlenecks Memory is important -Increasing memory-processor performance gap -Deeper memory hierarchies expected Stalls due to L2 cache data misses -Compulsory or repeated -L2 grows (8MB), but will be slower Stalls due to L1 I-cache misses -Buffer pool code is expensive -L1 I-cache not likely to grow as much as L2

© 1999 Anastassia Ailamaki17 Resource-related Stalls High T DEP for all systems : Low ILP opportunity A’s sequential scan: Memory unit load buffers? Dependency-related stalls (T DEP )Functional Unit-related stalls (T FU )

© 1999 Anastassia Ailamaki19 Conclusions Execution time breakdown shows trends L1I and L2D are major memory bottlenecks We need to: 3 reduce page crossing costs 3 optimize instruction stream 3 optimize data placement in L2 cache 3 reduce stalls at all levels TPC may not be necessary to locate bottlenecks

DBMSs on a Modern Processor: Where Does Time Go? Anastassia Ailamaki Joint work with David DeWitt, Mark Hill, and David Wood at the University of Wisconsin-Madison.

Similar presentations

Presentation on theme: "DBMSs on a Modern Processor: Where Does Time Go? Anastassia Ailamaki Joint work with David DeWitt, Mark Hill, and David Wood at the University of Wisconsin-Madison."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

DBMSs on a Modern Processor: Where Does Time Go? Anastassia Ailamaki Joint work with David DeWitt, Mark Hill, and David Wood at the University of Wisconsin-Madison.

Similar presentations

Presentation on theme: "DBMSs on a Modern Processor: Where Does Time Go? Anastassia Ailamaki Joint work with David DeWitt, Mark Hill, and David Wood at the University of Wisconsin-Madison."— Presentation transcript:

Similar presentations

About project

Feedback