Performance Evaluation of Cache Replacement Policies for the SPEC CPU2000 Benchmark Suite
Hussein Al-Zoubi


Overview
- Introduction
- Common cache replacement policies
- Experimental methodology
- Evaluating cache replacement policies: questions and answers
- Conclusion

Introduction
- The speed gap between processor and memory keeps increasing
- Modern processors include multiple levels of caches, and cache associativity is increasing
- Replacement policy: which block to discard when the cache is full

Introduction ... cont.
- Optimal Replacement (OPT) algorithm: replace the cache block whose next reference is farthest away in the future; infeasible, since it requires knowledge of future references
- State-of-the-art processors employ various practical policies
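OPT as described above can be sketched in a few lines. This is an illustrative model for a single fully associative set, not the simulator used in the study; the function name and trace format are chosen for this example.

```python
# Sketch of Belady's OPT replacement for one fully associative set.
def opt_misses(trace, capacity):
    """Count misses when the victim is the resident block whose next
    reference lies farthest in the future (or never occurs again)."""
    cache, misses = set(), 0
    for i, block in enumerate(trace):
        if block in cache:
            continue                      # hit: OPT keeps no extra state
        misses += 1
        if len(cache) < capacity:
            cache.add(block)              # free way available
            continue

        def next_use(b):
            # Position of b's next reference; infinity if never reused.
            try:
                return trace.index(b, i + 1)
            except ValueError:
                return float("inf")

        # Evicting the farthest-referenced block is what makes OPT optimal;
        # needing the whole future trace is what makes it infeasible.
        cache.remove(max(cache, key=next_use))
        cache.add(block)
    return misses
```

For example, on the trace a, b, c, a, b with a 2-block set, OPT takes 4 misses where true LRU would miss on all 5 references, which is the kind of gap the study measures.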

Introduction ... cont.
- Random
- LRU (Least Recently Used)
- Round-robin (FIFO, First-In-First-Out)
- PLRU (Pseudo Least Recently Used): reduces hardware cost by approximating the LRU mechanism

Introduction ... cont.
Our goal: explore and evaluate common cache replacement policies
- How do existing policies relate to OPT?
- What is their effect on instruction and data caches?
- How good are pseudo-LRU techniques at approximating true LRU?

Common cache replacement policies: LRU
LRU replaces the block in the set that has gone unused for the longest time; it must track the full recency order of the ways.
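True LRU for one set can be modeled with an ordered map whose insertion order encodes recency. A minimal sketch (the class name `LRUSet` is mine, not from the slides):

```python
from collections import OrderedDict

class LRUSet:
    """One set of an N-way cache under true LRU replacement."""
    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()        # order = least ... most recent

    def access(self, tag):
        """Return True on a hit; update recency state either way."""
        if tag in self.blocks:
            self.blocks.move_to_end(tag)   # mark as most recently used
            return True
        if len(self.blocks) == self.ways:
            self.blocks.popitem(last=False)  # evict least recently used
        self.blocks[tag] = None
        return False
```

Hardware LRU needs the equivalent of this full ordering per set, which is why the cheaper pseudo-LRU approximations on the next slides exist.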

Common cache replacement policies ... cont.
- Random policy: simpler, but at the expense of performance; the victim way is typically selected by a Linear Feedback Shift Register (LFSR)
- Round-robin (FIFO) replacement: replaces the oldest block in the set, using a circular counter per set

Common cache replacement policies ... cont.
PLRUt: tree-based pseudo-LRU, which approximates recency with one bit per node of a binary tree over the ways
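One common realization of PLRUt for a 4-way set uses three tree bits. The sketch below is illustrative only; the particular bit encoding (each bit points toward the half of the set to evict next) is an assumption, not taken from the slides.

```python
class PLRUtSet:
    """4-way tree pseudo-LRU: 3 bits form a binary tree, each bit
    pointing toward the less recently used half of its subtree."""
    def __init__(self):
        self.slots = [None] * 4
        self.bits = [0, 0, 0]   # [root, left-pair bit, right-pair bit]

    def _touch(self, way):
        # Flip the bits on the accessed way's path to point away from it.
        if way < 2:
            self.bits[0], self.bits[1] = 1, 1 - way
        else:
            self.bits[0], self.bits[2] = 0, 1 - (way - 2)

    def _victim(self):
        # Follow the bits from the root to find the pseudo-LRU way.
        return self.bits[1] if self.bits[0] == 0 else 2 + self.bits[2]

    def access(self, tag):
        if tag in self.slots:
            self._touch(self.slots.index(tag))
            return True
        way = self.slots.index(None) if None in self.slots else self._victim()
        self.slots[way] = tag
        self._touch(way)
        return False
```

Three bits per 4-way set replace the full recency ordering that true LRU needs, which is the hardware saving the slide refers to.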

Common cache replacement policies ... cont.
PLRUm: MRU-based pseudo-LRU, which keeps one most-recently-used bit per way
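PLRUm can be sketched with one MRU bit per way: a way's bit is set on access, all other bits are cleared when the last zero bit would disappear, and the victim is the first way whose bit is still zero. This encoding is an assumption for illustration, not taken from the slides.

```python
class PLRUmSet:
    """MRU-bit pseudo-LRU: one bit per way, reset en masse when all
    ways have been marked as recently used."""
    def __init__(self, ways=4):
        self.slots = [None] * ways
        self.mru = [0] * ways

    def _touch(self, way):
        self.mru[way] = 1
        if all(self.mru):                  # every way marked: start over,
            self.mru = [0] * len(self.mru)
            self.mru[way] = 1              # keeping only the latest access

    def access(self, tag):
        if tag in self.slots:
            self._touch(self.slots.index(tag))
            return True
        # Victim: a free way if any, else the first way with a clear bit.
        way = self.slots.index(None) if None in self.slots else self.mru.index(0)
        self.slots[way] = tag
        self._touch(way)
        return False
```

The mass reset is where the approximation loses information: recency order among the cleared ways is forgotten, yet the study finds this costs very little in practice.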

Experimental methodology
- sim-cache and sim-cheetah simulators from the Alpha version of SimpleScalar
- Original simulators modified to support the additional pseudo-LRU replacement policies
- sim-cache further modified to print interval statistics per specified number of instructions

Evaluating cache replacement policies: questions and answers
Q: How much associativity is enough for state-of-the-art benchmarks?
A: For the data cache, the main performance gain comes in the transition from a direct-mapped to a two-way set-associative cache. For the instruction cache, the OPT replacement policy benefits from increased associativity, but realistic policies do not exploit more than 8 ways, or in some cases even more than 2 ways.

Evaluating cache replacement policies: questions and answers ... cont.
Q: How much room for improvement is there for each specific benchmark and cache configuration?

Evaluating cache replacement policies: questions and answers ... cont.
Q: Do replacement policies behave differently for different types of memory references, such as instructions and data?
A: In general, the LRU policy performs better than FIFO and Random, with some exceptions

Evaluating cache replacement policies: questions and answers ... cont.
Q: Can dynamically changing the replacement policy reduce the total number of cache misses?
A: When one policy is better than another, it stays consistently better throughout execution, so dynamic switching offers little benefit

Evaluating cache replacement policies: questions and answers ... cont.
Q: Can we use most-recently-used information for cache way prediction?

Evaluating cache replacement policies: questions and answers ... cont.
Q: How good are pseudo-LRU techniques at approximating true LRU?
A: PLRUm and PLRUt approximate the LRU policy very effectively and stay close to LRU throughout the whole program execution

Conclusion
- Eliminating cache misses is extremely important for improving overall processor performance
- Cache replacement policies gain more significance in set-associative caches
- The gap between the LRU and OPT replacement policies can be up to 50%; new research is needed to close this gap