
1 Advancement of Buffer Management Research and Development in Computer and Data Systems Xiaodong Zhang The Ohio State University

2 Numbers Everyone Should Know (Jeff Dean, Google)
- L1 cache reference: 0.5 ns
- Branch mispredict: 5 ns
- L2 cache reference: 7 ns
- Mutex lock/unlock: 25 ns
- Main memory reference: 100 ns
- Compress 1 KB with Zippy: 3,000 ns
- Send 2 KB over a 1 Gbps network: 20,000 ns
- Read 1 MB sequentially from memory: 250,000 ns
- Round trip within a data center: 500,000 ns
- Disk seek: 10,000,000 ns
- Read 1 MB sequentially from disk: 20,000,000 ns
- Send one packet from CA to Europe and back: 150,000,000 ns

3 Replacement Algorithms in Data Storage Management
A replacement algorithm decides:
- which data entry to evict when the data storage is full;
- objective: keep data that will be reused, replace data that will not;
- a critical decision: every miss means an increasingly long delay.
Widely used in all memory-capable digital systems:
- small buffers: cell phones, Web browsers, e-mail boxes, ...
- large buffers: virtual memory, I/O buffers, databases, ...
A simple concept, but hard to optimize:
- more than 40 years of tireless algorithmic and system efforts;
- LRU-like algorithms and implementations have serious limitations.

4 Least Recently Used (LRU) Replacement
LRU is the most commonly used replacement policy for data management.
- Blocks are ordered in an LRU stack (from bottom to top).
- Blocks enter at the top (MRU) and leave from the bottom (LRU).
- Recency: the distance from a block to the top of the LRU stack.
- Upon a hit: move the block to the top.
[Figure: an LRU stack; the stack is long and the bottom is the only exit. On a hit to block 2, whose recency is 2, block 2 is moved to the top of the stack.]

5 Least Recently Used (LRU) Replacement
- Recency: the distance from a block to the top of the LRU stack.
- Upon a hit: move the block to the top.
- Upon a miss: evict the block at the bottom, load the missed block from disk, and place it on the stack top.
[Figure: on a miss to block 6, block 1 at the stack bottom is evicted; block 6 is loaded from disk and placed on the stack top.]
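Not part of the original slides: a minimal LRU sketch in Python, using an ordered dictionary as the stack, to make the hit and miss operations above concrete. The block IDs, the capacity, and the `load` placeholder are arbitrary.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU sketch: the right end of the ordered dict is the stack top (MRU),
    the left end is the stack bottom (the LRU victim)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.stack = OrderedDict()               # block id -> data

    def access(self, block, load=lambda b: f"data:{b}"):
        if block in self.stack:                  # hit: move the block to the stack top
            self.stack.move_to_end(block)
            return self.stack[block], True
        if len(self.stack) >= self.capacity:     # miss with a full cache: evict the bottom
            self.stack.popitem(last=False)
        self.stack[block] = load(block)          # fetch the missed block, place it on top
        return self.stack[block], False

cache = LRUCache(capacity=3)
for b in [3, 2, 5, 2, 6]:                        # the final access to 6 evicts 3
    cache.access(b)
print(list(cache.stack))                         # [5, 2, 6]
```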

6 LRU is a Classical Problem in Theory and Systems
First LRU paper:
- L. Belady, IBM Systems Journal, 1966.
Analysis of LRU algorithms:
- Aho, Denning & Ullman, JACM, 1971
- Rivest, CACM, 1976
- Sleator & Tarjan, CACM, 1985
- Knuth, Journal of Algorithms, 1985
- Karp et al., Journal of Algorithms, 1991
Many papers in systems and databases:
- ASPLOS, ISCA, SIGMETRICS, SIGMOD, VLDB, USENIX, ...

7 The Problem of LRU: Inability to Deal with Certain Access Patterns
File scanning:
- one-time accessed data evict to-be-reused data (cache pollution);
- a common access pattern (50% of the data in NCAR traces is accessed only once);
- the LRU stack holds one-time-accessed blocks until they sink to the bottom.
Loop-like accesses:
- a loop over k+1 blocks misses on every access with an LRU stack of size k.
Accesses with different frequencies (mixed workloads):
- frequently accessed data can be replaced by infrequently accessed data.
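To make the loop-like pattern concrete, here is a small, self-contained simulation (illustrative only, not from the talk): replaying a cyclic access pattern over k+1 blocks through an LRU cache of k slots gives a 0% hit ratio.

```python
from collections import OrderedDict

def lru_hit_ratio(trace, capacity):
    """Replay a reference trace through a simple LRU cache and report the hit ratio."""
    stack, hits = OrderedDict(), 0
    for block in trace:
        if block in stack:
            hits += 1
            stack.move_to_end(block)            # hit: move the block to the stack top
        else:
            if len(stack) >= capacity:
                stack.popitem(last=False)       # miss: evict the block at the stack bottom
            stack[block] = True
    return hits / len(trace)

k = 4
loop = list(range(k + 1)) * 100                 # cyclic access to k+1 distinct blocks
print(lru_hit_ratio(loop, capacity=k))          # 0.0 -- every access misses under LRU
```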

8 Why Flawed LRU is so Powerful in Practice
What is the major flaw?
- The assumption that recently used data will be reused is not always right.
- The prediction is based on a single, simple metric: recency.
- Some blocks are cached too long, some are evicted too early.
Why is it so widely used?
- It works well for data accesses that follow the LRU assumption.
- It needs only a simple data structure to implement.

9 Challenges of Addressing the LRU Problem
Two types of efforts have been made:
- detect specific access patterns and handle each case by case;
- learn insights into accesses with complex algorithms;
- most published papers could not be turned into reality.
Two critical goals:
- fundamentally address the LRU problem;
- retain the merits of LRU: low overhead and its basic assumption.
The goals are achieved by a set of three papers:
- the LIRS algorithm (SIGMETRICS'02);
- CLOCK-Pro: a system implementation (USENIX'05);
- BP-Wrapper: lock-contention-free assurance (ICDE'09).

10 Outline
The LIRS algorithm:
- how the LRU problem is fundamentally addressed;
- how a data structure with low complexity is built.
CLOCK-Pro:
- turning the LIRS algorithm into system reality.
BP-Wrapper:
- removing lock contention so that LIRS and other algorithms can be implemented without approximation.
What would we do for multicore processors?
Research impact in daily computing operations.

11 Recency vs. Reuse Distance
- Recency: the distance from a block's last reference to the current time, i.e., its distance to the top of the LRU stack.
- Reuse distance (inter-reference recency): the distance between two consecutive references to the same block (deeper and more useful information).
[Figure: an LRU stack illustrating the recency values of two blocks.]

12 Recency vs. Reuse Distance
Inter-Reference Recency (IRR): the number of other unique blocks accessed between two consecutive references to a block.
- IRR is the recency the block had when it was last accessed; tracking it needs an extra stack, which increases complexity.
- Recency: the distance from a block's last reference to the current time.
- Reuse distance (inter-reference recency): the distance between two consecutive references to the block.
[Figure: an LRU stack plus a small LRU stack for HIR blocks, illustrating a block with recency 0 and IRR 2.]
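An illustrative sketch (not from the paper) of how IRR can be computed from a reference trace; the trace values are made up.

```python
def irr_trace(trace):
    """IRR of each access: the number of other unique blocks referenced between
    this access and the previous access to the same block ('inf' on a first access)."""
    last_pos, out = {}, []
    for i, block in enumerate(trace):
        if block in last_pos:
            out.append((block, len(set(trace[last_pos[block] + 1 : i]))))
        else:
            out.append((block, float("inf")))
        last_pos[block] = i
    return out

print(irr_trace([1, 2, 3, 2, 1, 4, 1]))
# [(1, inf), (2, inf), (3, inf), (2, 1), (1, 2), (4, inf), (1, 1)]
# e.g., block 1's second access has IRR 2: blocks 2 and 3 were touched in between.
```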

13 Diverse Locality Patterns on an Access Map
[Figure: access map of logical block number vs. virtual time (reference stream), showing regions of strong locality, loops, and one-time accesses.]

14 What Blocks Does LRU Cache (Measured by IRR)?
[Figure: IRR (reuse distance in blocks) vs. virtual time (reference stream) for the MULTI2 workload, with the cache size marked.]
LRU holds frequently accessed blocks with absolutely strong locality, but it also holds one-time-accessed blocks (zero locality) and is likely to replace other blocks with relatively strong locality.

15 LIRS: Only Cache Blocks with Low Reuse Distances
[Figure: the same IRR map for MULTI2; LIRS holds the strong-locality blocks, ranked by reuse distance, that fit in the cache.]

16 Basic Ideas of LIRS (SIGMETRICS'02)
LIRS: Low Inter-reference Recency Set.
- Low-IRR blocks are kept in the buffer cache.
- High-IRR blocks are the candidates for replacement.
Two stacks are maintained:
- a large LRU stack contains low-IRR resident blocks;
- a small LRU stack contains high-IRR resident blocks;
- the large stack also records resident and non-resident high-IRR blocks.
IRRs are measured by the two stacks:
- upon a hit to a resident high-IRR block: if the block is also found in the large stack, its new IRR is low, so it becomes a low-IRR block and moves to the top of the large stack; otherwise it stays high-IRR and moves to the top of its own stack;
- when the large stack is full, the low-IRR block at its bottom becomes a high-IRR block and moves to the small stack.

17 Low Complexity of LIRS
Both recencies and IRRs are recorded in the stacks:
- the block at the bottom of the LIRS stack has the maximum recency;
- a block is low-IRR if it can be found in both stacks;
- no explicit comparisons or measurements are needed.
Complexity of LIRS = LRU = O(1), despite:
- additional block movements between the two stacks;
- pruning operations in the stacks.

18 Data Structure: Keep LIR Blocks in Cache
Low-IRR (LIR) blocks and high-IRR (HIR) blocks.
- LIR block set: size L_lirs.
- HIR block set: size L_hirs.
- Cache size L = L_lirs + L_hirs.
[Figure: the physical cache divided into the LIR block set (L_lirs) and the HIR block set (L_hirs).]

19 LIRS Operations
Cache size L = 5, L_lir = 3, L_hir = 2.
Initialization: all referenced blocks are given LIR status until the LIR block set is full; resident HIR blocks are placed in a small LRU stack.
Three cases upon an access:
- accessing an LIR block (a hit);
- accessing a resident HIR block (a hit);
- accessing a non-resident HIR block (a miss).
[Figure: the LIRS stack and the small LRU stack for HIR blocks, with LIR blocks, HIR blocks, and residency marked.]

20 Access an LIR Block (a Hit)
[Figure: walkthrough on the LIRS stack and the small HIR stack (cache size L = 5, L_lir = 3, L_hir = 2); the accessed LIR block is moved to the top of the LIRS stack, and the stack is pruned if needed.]

21 Access an LIR Block (a Hit)
[Figure: walkthrough continued.]

22 Access an LIR Block (a Hit)
[Figure: walkthrough continued.]

23 Access a Resident HIR Block (a Hit)
[Figure: walkthrough; the accessed block, found in the LIRS stack, is promoted to LIR; the LIR block at the stack bottom is demoted to the HIR stack; the LIRS stack is pruned.]

24 Access a Resident HIR Block (a Hit)
[Figure: walkthrough continued.]

25 Access a Resident HIR Block (a Hit)
[Figure: walkthrough continued.]

26 Access a Resident HIR Block (a Hit)
[Figure: walkthrough continued.]

27 Access a Non-Resident HIR Block (a Miss)
[Figure: walkthrough; the resident HIR block at the front of the small stack is evicted, and the missed block enters the top of the LIRS stack.]

28 Access a Non-Resident HIR Block (a Miss)
[Figure: walkthrough continued.]

29 Access a Non-Resident HIR Block (a Miss)
[Figure: walkthrough continued.]

30 Access a Non-Resident HIR Block (a Miss)
[Figure: walkthrough continued.]

31 A Simplified Finite Automaton for LRU
Operations on the LRU stack upon a block access:
- hit: place the block on the top;
- miss: fetch the data and place the block on the top; evict the block at the bottom.

32 A Simplified Finite Automaton for LIRS
[Figure: state-transition diagram. Upon a block access, operations on the LIR stack: hit, pruning, demotion to the HIR stack; operations on the HIR stack: hit (promotion to the LIR stack), miss with a record (promotion to the LIR stack), miss with no record (add a record on the resident HIR block), block eviction.]

33 A Simplified Finite Automaton for LIRS
[Figure: the state-transition diagram of the previous slide, repeated.]

34 How LIRS Addresses the LRU Problem
- File scanning: one-time-accessed blocks are replaced in a timely way (because of their high IRRs).
- Loop-like accesses: a section of the loop data is protected in the low-IRR stack (misses only happen in the high-IRR stack).
- Accesses with distinct frequencies: frequently accessed blocks with short reuse distances are NOT replaced (block status changes dynamically).
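The following is an illustrative, simplified LIRS sketch in Python (not the authors' code): S plays the role of the large LIRS stack (LIR blocks plus resident and non-resident HIR records), Q holds the resident HIR blocks, and the three access cases from the earlier slides are handled explicitly. A production version also bounds the size of S and handles more corner cases.

```python
from collections import OrderedDict

class LIRSCache:
    """Simplified LIRS sketch. S: the large LIRS stack (LIR blocks plus resident and
    non-resident HIR records); Q: resident HIR blocks. Right end = stack top."""

    def __init__(self, l_lir, l_hir):
        self.l_lir, self.l_hir = l_lir, l_hir
        self.S = OrderedDict()      # block -> 'LIR' or 'HIR' record
        self.Q = OrderedDict()      # resident HIR blocks, oldest at the left end
        self.lir = set()            # blocks currently holding LIR status
        self.resident = set()       # all blocks currently in the cache

    def _prune(self):
        # Remove HIR records from the bottom of S until an LIR block is at the bottom.
        while self.S and next(iter(self.S)) not in self.lir:
            self.S.popitem(last=False)

    def _demote_bottom_lir(self):
        # The LIR block at the bottom of S becomes a resident HIR block (moves to Q).
        bottom, _ = self.S.popitem(last=False)
        self.lir.discard(bottom)
        self.Q[bottom] = True
        self._prune()

    def access(self, block):
        hit = block in self.resident
        if block in self.lir:                         # hit on an LIR block
            self.S.pop(block)
            self.S[block] = 'LIR'                     # move to the top of S
            self._prune()
        elif hit:                                     # hit on a resident HIR block
            if block in self.S:                       # its new IRR is low: promote to LIR
                self.S.pop(block)
                self.S[block] = 'LIR'
                self.lir.add(block)
                self.Q.pop(block, None)
                self._demote_bottom_lir()
            else:                                     # stays HIR: refresh position in S and Q
                self.S[block] = 'HIR'
                self.Q.pop(block, None)
                self.Q[block] = True
        else:                                         # miss
            if len(self.lir) < self.l_lir:            # warm-up: fill the LIR set first
                self.S[block] = 'LIR'
                self.lir.add(block)
            else:
                if len(self.Q) >= self.l_hir:         # cache full: evict the oldest resident HIR
                    victim, _ = self.Q.popitem(last=False)
                    self.resident.discard(victim)     # its record may stay in S (non-resident HIR)
                if block in self.S:                   # non-resident HIR with a record: promote
                    self.S.pop(block)
                    self.S[block] = 'LIR'
                    self.lir.add(block)
                    self._demote_bottom_lir()
                else:                                 # brand-new block: resident HIR
                    self.S[block] = 'HIR'
                    self.Q[block] = True
            self.resident.add(block)
        return hit

c = LIRSCache(l_lir=3, l_hir=2)
for b in [1, 4, 2, 3, 5, 6, 5]:
    c.access(b)
print(sorted(c.resident), sorted(c.lir))  # [1, 2, 4, 5, 6] [2, 4, 5]
```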

35 Performance Evaluation
Trace-driven simulation on different access patterns shows:
- LIRS outperforms existing replacement algorithms in almost all cases;
- the performance of LIRS is not sensitive to its only parameter, L_hirs;
- performance is not affected even when the LIRS stack size is bounded;
- the time and space overhead is as low as LRU's;
- LRU is a special case of LIRS (without recording resident and non-resident HIR blocks in the large stack).

36 Looping Pattern: postgres (Time-Space Map)
[Figure: time-space access map of the postgres trace.]

37 Looping Pattern: postgres (IRR Map)
[Figure: IRR (reuse distance in blocks) vs. virtual time (reference stream) for the postgres trace, with the blocks cached by LRU and by LIRS marked.]

38 Looping Pattern: postgres (Hit Rates)
[Figure: hit rates of the postgres looping trace.]

39 Two Technical Issues in Turning It into Reality
High overhead in implementation:
- for each data access, a set of operations defined by the replacement algorithm (e.g., LRU or LIRS) must be performed;
- this is not affordable for real systems such as OS kernels and buffer caches;
- an approximation with reduced operations is required in practice.
High lock-contention cost:
- for concurrent accesses, the stack(s) must be locked for each operation;
- lock contention limits the scalability of the system.
CLOCK-Pro and BP-Wrapper address these two issues.

40 Only Approximations Can Be Implemented in an OS
- The dynamic changes in LRU and LIRS cause computing overhead on every access, so OS kernels cannot adopt them directly.
- An approximation reduces the overhead at the cost of lower accuracy.
- The CLOCK algorithm, an LRU approximation, was first implemented in the Multics system at MIT in 1968 by Corbató (1990 Turing Award laureate).
- Objective: an LIRS approximation for OS kernels.

41 Basic Operations of CLOCK Replacement
- All resident pages are placed around a circular list, like a clock.
- Each page is associated with a reference bit indicating whether the page has been accessed.
- Upon a hit: set the reference bit to 1; no algorithm operations are needed.
[Figure: a circular list of pages with reference bits and the clock hand.]

42 Basic CLOCK Replacement
On a miss:
- start from the page currently pointed to by the clock hand;
- if the page's reference bit is 0, evict it and insert the new page there;
- if the reference bit is 1, give the page a second chance: reset the bit to 0 and move the hand forward, until a page with reference bit 0 is reached.
[Figure: the clock hand sweeping over pages on a sequence of two misses.]
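A minimal CLOCK sketch (illustrative, not kernel code); whether a newly inserted page starts with its reference bit set varies between variants, so the choice below is an assumption.

```python
class ClockCache:
    """Minimal CLOCK sketch: a circular array of (page, reference_bit) slots."""
    def __init__(self, capacity):
        self.slots = [None] * capacity      # each slot: [page, ref_bit] or None
        self.hand = 0
        self.index = {}                     # page -> slot position

    def access(self, page):
        if page in self.index:              # hit: set the reference bit, hand does not move
            self.slots[self.index[page]][1] = 1
            return True
        while True:                         # miss: sweep until a slot with ref bit 0 is found
            slot = self.slots[self.hand]
            if slot is None or slot[1] == 0:
                if slot is not None:
                    del self.index[slot[0]]             # evict the old page
                self.slots[self.hand] = [page, 0]       # insert the new page (bit 0 assumed)
                self.index[page] = self.hand
                self.hand = (self.hand + 1) % len(self.slots)
                return False
            slot[1] = 0                     # second chance: clear the bit and move on
            self.hand = (self.hand + 1) % len(self.slots)

clock = ClockCache(capacity=3)
for p in ["a", "b", "c", "a", "d"]:         # "d" evicts "b": "a" keeps its second chance
    clock.access(p)
```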

43 Unbalanced R&D on LRU versus CLOCK
LRU-related work:
- FBR (SIGMETRICS 1990)
- LRU-2 (SIGMOD 1993)
- 2Q (VLDB 1994)
- SEQ (SIGMETRICS 1997)
- LRFU (OSDI 1999)
- EELRU (SIGMETRICS 1999)
- MQ (USENIX 2001)
- LIRS (SIGMETRICS 2002)
- ARC (FAST 2003, IBM patent)
CLOCK-related work:
- CLOCK (Corbató, 1968)
- GCLOCK (ACM TODS 1978)
- CAR (FAST 2004, IBM patent)
- CLOCK-Pro (USENIX 2005)

44 Basic Ideas of CLOCK-Pro
- CLOCK-Pro is an approximation of LIRS built on the CLOCK infrastructure.
- Pages are categorized into two groups, cold pages and hot pages, based on their reuse distances (IRRs).
- There are three hands: HAND-hot for hot pages, HAND-cold for cold pages, and HAND-test for running a reuse-distance test on a page.
- The memory allocation between hot pages (M_hot) and cold pages (M_cold) is adjusted adaptively (M = M_hot + M_cold).
- All hot pages are resident (= LIR blocks); some cold pages are also resident (= HIR blocks); recently replaced pages are tracked (= non-resident HIR blocks).

45 CLOCK-Pro (USENIX'05)
Two reasons for a page to be a resident cold page:
1. a fresh replacement after a first access;
2. it was demoted from a hot page.
All hands move in the clockwise direction:
- HAND-cold finds a page for replacement;
- HAND-test determines whether a cold page should be promoted to hot, and removes non-resident cold pages from the clock;
- HAND-hot finds a hot page to be demoted to a cold page.
[Figure: a clock of 25 pages showing hot pages, resident cold pages, non-resident cold pages, and the three hands.]

46

47

48 Concurrency Management in Buffer Management
- The buffer cache (pool) in DRAM keeps hot pages between page accesses and the hard disk; maximizing the hit ratio is the key.
- The hit ratio is largely determined by the effectiveness of the replacement algorithm (LRU-k, 2Q, LIRS, ARC, ...), which decides which pages to keep and which to evict.
- Concurrent accesses to the buffer cache require a critical section: a lock (latch) serializes the update of the replacement data structures after each page request.
[Figure: threads issuing page accesses to a buffer pool in DRAM, with the replacement management inside a lock, backed by the hard disk.]
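An illustrative sketch (not from the paper) of why this serializes requests: every access, hit or miss, must take the same lock before it can update the replacement metadata. The class and parameter names are made up.

```python
import threading
from collections import OrderedDict

class LockedLRUPool:
    """Every page request updates the LRU order under one global lock,
    so concurrent readers serialize on the replacement metadata."""
    def __init__(self, capacity):
        self.lock = threading.Lock()
        self.pages = OrderedDict()
        self.capacity = capacity

    def get(self, page_id, read_from_disk=lambda p: f"page-{p}"):
        with self.lock:                           # the contended critical section
            if page_id in self.pages:
                self.pages.move_to_end(page_id)   # even a hit mutates shared state
                return self.pages[page_id]
            if len(self.pages) >= self.capacity:
                self.pages.popitem(last=False)    # evict the LRU page
            self.pages[page_id] = read_from_disk(page_id)
            return self.pages[page_id]
```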

49 Accurate Algorithms and Their Approximations
- Accurate algorithms: LRU, LIRS, ARC, ...
- Approximations: CLOCK (for LRU), CLOCK-Pro (for LIRS), CAR (for ARC).
- On a page hit, CLOCK only sets the reference bit to 1, without taking the lock; lock synchronization is used only for misses.
- Clock-based approximation reduces lock contention at the price of a lower hit ratio.
[Figure: the clock structure with reference bits.]

50 History of Buffer Pool Caching Management in PostgreSQL
- 1996-2000: LRU (suffered lock contention moderately due to low concurrency).
- 2000-2003: LRU-k (hit ratio outperforms LRU, but lock contention became more serious).
- 2004: ARC/CAR were implemented, but quickly removed due to IBM patent protection.
- 2005: 2Q was implemented (hit ratios were further improved, but lock contention was high).
- 2006 to now: CLOCK (an approximation of LRU; lock contention is reduced, but the hit ratio is the lowest of all the previous ones).

51 Trade-offs between High Hit Ratios and Low Lock Contention
- LRU-k, 2Q, LIRS, ARC, SEQ, ... update page metadata under lock synchronization to achieve high hit ratios.
- CLOCK, CLOCK-Pro, and CAR modify data structures with little lock synchronization to achieve high scalability, but:
  - clock-based approximations lower the hit ratios compared with the original algorithms;
  - the transformation can be difficult and demands great effort;
  - some algorithms do not have clock-based approximations.
- Our goal: to have both!

52 Reducing Lock Contention by Batching Requests
- On a page hit, fulfill the request and fetch the page directly, without invoking the replacement algorithm.
- Record the access in a batch queue (one per thread).
- When a batch is full, commit the access history to the replacement algorithm as a set of replacement operations under a single lock acquisition.
[Figure: threads with per-thread batch queues in front of the buffer pool and the replacement algorithm.]
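An illustrative sketch of the batching idea (the class name, batch size, and the LRU stand-in for the replacement algorithm are assumptions, not BP-Wrapper's actual code): accesses are buffered per thread without a lock, and the lock is taken only once per batch to replay the history against the replacement metadata.

```python
import threading
from collections import OrderedDict

class BatchedLRUPool:
    """Each thread buffers its accesses locally and takes the global lock
    only once per BATCH accesses to replay them against the replacement state."""
    BATCH = 64

    def __init__(self, capacity):
        self.lock = threading.Lock()
        self.pages = OrderedDict()               # shared replacement state (LRU as a stand-in)
        self.capacity = capacity
        self.local = threading.local()           # per-thread batch queue

    def record_access(self, page_id):
        if not hasattr(self.local, "batch"):
            self.local.batch = []
        batch = self.local.batch
        batch.append(page_id)                    # no lock on the common path
        if len(batch) >= self.BATCH:
            with self.lock:                      # one lock acquisition per batch
                for p in batch:
                    if p in self.pages:
                        self.pages.move_to_end(p)
                    else:
                        if len(self.pages) >= self.capacity:
                            self.pages.popitem(last=False)
                        self.pages[p] = True
            batch.clear()
```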

53 Reducing Lock Holding Time by Prefetching
- Without prefetching, a thread may stall on a data cache miss while it holds the lock, lengthening the critical section.
- Pre-reading the data that will be accessed in the critical section, before acquiring the lock, shortens the lock holding time.
[Figure: timelines of two threads with and without prefetching.]

54 Lock Contention Reduction by BP-Wrapper (ICDE'09)
- Lock contention: a lock cannot be obtained without blocking.
- Measured as the number of lock acquisitions (contentions) per million page accesses.
- Reduced by over 7000 times!

55 Impact of LIRS in the Academic Community
- LIRS is a benchmark against which replacement algorithms are compared.
- Reuse distance was first used in replacement algorithm design in LIRS.
- A SIGMETRICS'05 paper confirmed that LIRS outperforms all the other replacement algorithms compared.
- LIRS has become a topic taught in graduate and undergraduate classes on OS, performance evaluation, and databases at many US universities.
- The LIRS paper (SIGMETRICS'02) is highly and continuously cited.
- The Linux memory management group has established an Internet forum on advanced replacement for LIRS.

56 LIRS Has Been Adopted in MySQL
- MySQL is the most widely used relational database, with 11 million installations in the world.
- The busiest Internet services use MySQL to maintain databases for high-volume Web sites: Google, YouTube, Wikipedia, Facebook, Taobao, ...
- LIRS manages the buffer pool of MySQL.
- The adoption is in the most recent version (5.1), November 2008.

57

58

59 Infinispan (a Java-based Open-Source Data Grid)
- The data grid forms a huge in-memory cache that is managed using LIRS.
- BP-Wrapper is used to keep the cache free of lock contention.

60 ConcurrentLinkedHashMap as a Software Cache
- A linked-list-based Java class: http://code.google.com/p/concurrentlinkedhashmap/wiki/Design
- Elements are linked and managed using the LIRS replacement policy.
- BP-Wrapper keeps it free of lock contention.

61 LIRS in Management of Big Data
LIRS has been adopted in GridGain software:
- a Java-based open-source middleware for real-time big data processing and analytics (www.gridgain.com);
- LIRS makes replacement decisions for the in-memory data grid;
- over 500 products and organizations use GridGain software daily: Sony, Cisco, Canon, Johnson & Johnson, Deutsche Bank, ...
LIRS has been adopted in SYSTAP's storage management:
- big data scale-out storage systems (www.bigdata.com).

62 LIRS in a Functional Programming Language: Clojure
- Clojure is a dynamic programming language that targets the Java Virtual Machine (http://clojure.org).
- It is a dialect of Lisp, supports functional programming, and is designed for concurrency.
- It has been used by many organizations.
- LIRS is a member of the Clojure cache library: LIRSCache.

63 The LIRS Principle in Hardware Caches
- A hardware cache replacement design based on Re-Reference Interval Prediction (RRIP).
- Presented at ISCA'10 by Intel.
- Two bits are added to each cache line to measure reuse distance in a static and a dynamic way.
- Performance gains are up to 4-10%.
- The hardware cost may not be affordable in practice.

64 Impact of CLOCK-Pro in OS and Other Systems
- CLOCK-Pro has been adopted in FreeBSD/NetBSD (open-source Unix).
- Two patches in the Linux kernel for users:
  - CLOCK-Pro patches in 2.6.12 by Rik van Riel;
  - PeterZClockPro2 in 2.6.15-17 by Peter Zijlstra.
- CLOCK-Pro is patched into Apache Derby (a relational DB).
- CLOCK-Pro is patched into OpenLDAP (directory accesses).

65 Impact of Multicore Processors in Computer Systems
[Figure: a Dell Precision GX620 purchased in 2004 (single core with L1 and a 2 MB L2 cache, 256 MB memory, disk) versus a Dell Precision 1500 purchased in 2009 at a similar price (four cores, each with L1 and L2, a shared 8 MB L3 cache, 8 GB memory, disk).]

66 Performance Issues with the Multicore Architecture
- Slow data accesses to memory and disks continue to be major bottlenecks.
- Almost all the CPUs in Top-500 supercomputers are multicores.
- Cache contention and pollution: conflict cache misses among multiple threads can significantly degrade performance.
- Memory bus congestion: bandwidth is limited as the number of cores increases.
- Disk wall: data-intensive applications also demand high throughput from disks.
[Figure: the 2009 Dell Precision 1500 again, with four cores sharing an 8 MB L3 cache, 8 GB memory, and disk.]

67 Multicore Cannot Deliver the Expected Performance as It Scales
Performance throughput = concurrency / latency:
- exploiting parallelism;
- exploiting locality.
[Figure: ideal vs. real throughput as the number of cores grows.]
References:
- "The Trouble with Multicore", David Patterson, IEEE Spectrum, July 2010.
- "Finding the Door in the Memory Wall", Erik Hagersten, HPCwire, March 2009.
- "Multicore Is Bad News for Supercomputers", Samuel K. Moore, IEEE Spectrum, November 2008.

68 Challenges of Managing the LLC in Multicores
Recent theoretical results about the LLC in multicores:
- single core: an optimal offline algorithm exists, and online LRU is k-competitive (k is the cache size);
- multicore: finding an optimal offline replacement schedule is NP-complete; cache partitioning among threads admits an optimal solution in theory.
System challenges in practice:
- the LLC lacks the hardware mechanisms needed to control inter-thread cache contention (it shares the same design as single-core caches);
- system software has limited information and methods to effectively control cache contention.

69 OS Cache Partitioning in Multicores (HPCA'08)
- Physically indexed caches are divided into multiple regions (colors).
- All cache lines in a physical page are cached in one of those regions (colors).
- The OS can control the page color of a virtual page through address mapping, by selecting a physical page with a specific value in its page-color bits.
[Figure: address translation from virtual page number + page offset to physical page number + page offset; the cache address is split into cache tag, set index, and block offset, and the page-color bits are the overlap between the physical page number and the set index, which the OS controls.]
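A sketch (with made-up cache parameters, not the values of any specific platform) of how a page color is derived from a physical address: the color is the overlap between the physical page number bits and the cache set-index bits.

```python
# Illustrative parameters only; a real platform's values come from its cache geometry.
PAGE_SIZE  = 4096                                     # 4 KB pages -> 12 page-offset bits
LINE_SIZE  = 64                                       # 64 B lines -> 6 block-offset bits
CACHE_SIZE = 8 * 1024 * 1024                          # 8 MB shared LLC
WAYS       = 16
SETS       = CACHE_SIZE // (LINE_SIZE * WAYS)         # 8192 sets -> 13 set-index bits

PAGE_SHIFT = PAGE_SIZE.bit_length() - 1               # 12
SET_BITS   = SETS.bit_length() - 1                    # 13
LINE_BITS  = LINE_SIZE.bit_length() - 1               # 6
COLOR_BITS = LINE_BITS + SET_BITS - PAGE_SHIFT        # set-index bits above the page offset: 7
NUM_COLORS = 1 << COLOR_BITS                          # 128 colors

def page_color(phys_addr):
    """Page color = the low bits of the physical page number that also index the
    cache set; pages with the same color compete for the same cache region."""
    return (phys_addr >> PAGE_SHIFT) & (NUM_COLORS - 1)

print(NUM_COLORS, page_color(0x12345000))             # 128 69
```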

70 Shared LLC Can Be Partitioned into Multiple Regions
- Physical pages are grouped into different bins based on their page colors.
- The shared, physically indexed cache is partitioned between two processes through OS address mapping.
- Main memory space needs to be partitioned too (co-partitioning).
[Figure: the page-color bins of two processes mapped by the OS onto disjoint regions of the physically indexed cache.]

71 Implementations in Linux and Their Impact
Static partitioning:
- predetermines the amount of cache allocated to each running process at the beginning of its execution.
Dynamic cache partitioning:
- adjusts cache allocations among processes dynamically;
- changes a process's cache usage through OS page address re-mapping (page re-coloring).
Current status of the system facility:
- open source in Linux kernels;
- adopted as a software solution by Intel SSG in May 2010;
- used in applications on Intel platforms, e.g., automation.

72 Final Remarks: Why Do the LIRS-related Efforts Make the Difference?
- Caching the most deserving data blocks: using reuse distance as the ruler, approaching the optimal; 2Q, LRU-k, ARC, and others can still cache undeserving blocks.
- LIRS with its two stacks yields constant-time operations, O(1): consistent with LRU, while recording much more useful information.
- CLOCK-Pro turns LIRS into reality in production systems: none of the other algorithms except ARC have approximation versions.
- BP-Wrapper keeps DBMS buffer management free of lock contention.
- OS partitioning applies the LIRS principle to the LLC in multicores: protect strong-locality data and control weak-locality data.

73 Acknowledgement to Co-authors and Sponsors
- Song Jiang: Ph.D. '04 at William & Mary, faculty at Wayne State
- Feng Chen: Ph.D. '10, Intel Labs (Oregon)
- Xiaoning Ding: Ph.D. '10, Intel Labs (Pittsburgh)
- Qingda Lu: Ph.D. '09, Intel (Oregon)
- Jiang Lin: Ph.D. '08 at Iowa State, AMD
- Zhao Zhang: Ph.D. '02 at William & Mary, faculty at Iowa State
- P. Sadayappan, Ohio State
- Continuous support from the National Science Foundation

74 CSE 788: Winter Quarter 2011
Principle of Locality in Design and Implementation of Computer and Distributed Systems:
- exploiting locality at different levels of computer systems;
- challenges of algorithm design and implementation;
- readings of both classical and new papers;
- a proposal- and project-based class;
- much high-quality research started in this class and was published in FAST, HPCA, MICRO, PODC, PACT, SIGMETRICS, USENIX, and VLDB.
You are welcome to take the class next quarter.

75 Xiaodong Zhang: zhang@cse.ohio-state.edu
http://www.cse.ohio-state.edu/~zhang

