MEMORY SYSTEM CHARACTERIZATION OF COMMERCIAL WORKLOADS


1 MEMORY SYSTEM CHARACTERIZATION OF COMMERCIAL WORKLOADS
Authors:
• Luiz André Barroso (Google, DEC; worked on Piranha)
• Kourosh Gharachorloo (Compaq, DEC; worked on Dash and Flash)
• Edouard Bugnion (one of the original founders of VMware; also worked on SimOS)
Presented by: David Eitel, March 31, 2010

2 Types of Commercial Applications
• Online Transaction Processing (OLTP)
• Decision Support Systems (DSS)
• Web Index Search (WIS)
Source: S. Brin and L. Page. “The Anatomy of a Large-Scale Hypertextual Web Search Engine.”

3 Benchmarks
• Oracle Database Engine
• TPC-B Banking Benchmark for OLTP
• TPC-D Benchmark for DSS (read-only queries)
• AltaVista
Sources: http://georgiaconsortium.org/images/Banking-Coins.jpg, http://greencanada.files.wordpress.com/2009/04/databases.jpg, http://sixrevisions.com/web_design/popular-search-engines-in-the-90s-then-and-now/

4 Monitoring Results
• OLTP has more complex queries than DSS/AV.
• Low-latency access to non-primary caches is important because the OLTP working set is very large.
• DSS cache miss rates are low; the misses that occur are on large database tables.
Legend: Icache = instruction cache; Dcache = data cache; Scache = secondary cache; Bcache = board-level cache
Figure annotations: "Big CPI!"; "Lots of Bcache misses"; "Breakdown of the execution time misses"; "Sum of single- and dual-issue cycles"; "Pipeline and address translation related stalls"; ">75% mem stalls"
Source: Fig. 4
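The execution-time breakdown on this slide can be read as an additive CPI model: a base CPI plus, for each kind of memory event, its rate per instruction times its stall penalty. The following is a minimal sketch of that arithmetic, not the paper's methodology. The function names, event names, rates, and penalties are all hypothetical and chosen only for illustration.

```python
def cpi_breakdown(base_cpi, events):
    """Return (total CPI, per-event stall CPI).

    events maps an event name to (events per instruction, penalty in cycles).
    """
    stalls = {name: rate * penalty for name, (rate, penalty) in events.items()}
    total_cpi = base_cpi + sum(stalls.values())
    return total_cpi, stalls


def memory_stall_fraction(base_cpi, events):
    """Fraction of execution time spent in the modeled memory stalls."""
    total_cpi, stalls = cpi_breakdown(base_cpi, events)
    return sum(stalls.values()) / total_cpi


# Hypothetical numbers (NOT measurements from the paper).
example_events = {
    "scache_hit": (0.05, 10),    # on-chip miss served by the secondary cache
    "bcache_hit": (0.02, 30),    # served by the board-level cache
    "bcache_miss": (0.03, 100),  # served by main memory
}

total, stalls = cpi_breakdown(1.0, example_events)
fraction = memory_stall_fraction(1.0, example_events)
print(f"CPI = {total:.2f}, memory stall fraction = {fraction:.0%}")
```

With numbers like these, a small rate of Bcache misses dominates the total because each one carries the largest penalty.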

5 Simulation Results for OLTP
Figure labels: Associativity; Cache Size; Data capacity/conflict misses
Legend: INST = instruction execution; CACHE = stalls within cache hierarchy; MEM = memory system stalls
• Idle time increases with bigger caches.
• The I/O latency cannot be hidden with faster processing rates.
• Faster processing rates with a more efficient memory system = more commits ready for the log writer (I/O).
• OLTP benefits from larger Bcaches.
Source: Fig. 5
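To make the cache size and associativity axes concrete, here is a minimal sketch of a set-associative cache with LRU replacement that counts misses on an address trace. It is a toy model, not the simulator used in the paper. The function name, the trace, and the cache parameters are assumptions made for illustration.

```python
from collections import OrderedDict


def count_misses(addresses, cache_bytes, line_bytes, ways):
    """Count misses for a set-associative LRU cache on a byte-address trace."""
    num_sets = cache_bytes // (line_bytes * ways)
    # One OrderedDict per set; insertion order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        line = addr // line_bytes
        cache_set = sets[line % num_sets]
        if line in cache_set:
            cache_set.move_to_end(line)  # hit: mark as most recently used
        else:
            misses += 1
            if len(cache_set) >= ways:
                cache_set.popitem(last=False)  # evict the least recently used line
            cache_set[line] = True
    return misses


# Hypothetical trace: two addresses that collide in a 1 KB direct-mapped cache.
trace = [0, 1024] * 4

direct_mapped = count_misses(trace, cache_bytes=1024, line_bytes=64, ways=1)
two_way = count_misses(trace, cache_bytes=1024, line_bytes=64, ways=2)
larger = count_misses(trace, cache_bytes=2048, line_bytes=64, ways=1)
print(direct_mapped, two_way, larger)
```

In this trace every access misses in the direct-mapped 1 KB cache (conflict misses), while either doubling the associativity or doubling the size leaves only the two cold misses.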

6 More Simulation Results (OLTP and DSS)
• DSS works well with current-sized caches because its working sets are small (few misses in on-chip caches).
• Replacement and instruction miss rates are not affected by line size → good for larger cache sizes.
• False sharing increases with cache line size.
• What would be different if increased latency and bandwidth were accounted for when line size increases?
• Are the results NOT valid because size(database) = size(main memory)?
Sources: Fig. 7 and Fig. 8
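The point about false sharing and line size can be illustrated with a small sketch: each processor writes only its own addresses, yet as lines grow, more of those addresses land in a line that another processor also writes. This is an illustrative model only. The function name and the address layout are hypothetical and are not taken from the paper.

```python
def falsely_shared_writes(writes, line_bytes):
    """Count written addresses that share a cache line with another CPU's writes.

    writes maps a CPU id to the byte addresses it writes. The address sets are
    assumed to be disjoint across CPUs, so any shared line is false sharing.
    """
    owners = {}
    for cpu, addrs in writes.items():
        for addr in addrs:
            owners.setdefault(addr // line_bytes, set()).add(cpu)
    return sum(
        1
        for addrs in writes.values()
        for addr in addrs
        if len(owners[addr // line_bytes]) > 1
    )


# Hypothetical layout: two CPUs writing interleaved, non-overlapping addresses.
writes = {0: [0, 128], 1: [32, 192]}

for line_bytes in (32, 64, 128):
    print(line_bytes, falsely_shared_writes(writes, line_bytes))
```

For this layout the count grows from 0 with 32-byte lines to 2 with 64-byte lines and 4 with 128-byte lines, even though no address is written by both processors.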

7 Important Things to Remember
• As the number of processors increases, communication stalls increase (see Fig. 6).
• O/S activity and I/O latencies do not greatly affect the behavior of database engines.
• OLTP has instruction and data locality → helped by off-chip caches.
• DSS and WIS have working sets that fit in memory → sensitive to on-chip caches.
Source: http://www.stress-treatment-21.com/wp-content/uploads/2009/05/thinking-monkey.bmp

8 Discussion Questions
• What are some new commercial applications that have developed since this paper was written?
• How much have the issues in this paper been addressed in recent architecture designs?
• What should we focus on in the “parallel” future to increase performance for commercial applications?
• Could we change commercial workloads to function more like scientific workloads to obtain performance gains?
Source: http://www.vosibilities.com/wp-content/uploads/2009/05/bpm-questions-you-should-ask-your-bpms-vendor1.jpg

