Compressed Memory Hierarchy Dongrui SHE, Jianhua HUI

The research paper:  "A Compressed Memory Hierarchy Using an Indirect Index Cache," by Erik G. Hallnor and Steven K. Reinhardt, Advanced Computer Architecture Laboratory, EECS Department, University of Michigan

Outline  Introduction  Memory eXpansion Technology  Cache-compression  IIC & IIC-C  Evaluation  Summary

Introduction: memory capacity and memory bandwidth  Cache capacity cannot be increased without bound  Memory bandwidth is a scarce resource

Applications of data compression  First, adding a compressed main memory system (Memory eXpansion Technology, MXT)  Second, storing compressed data in the cache, so that data is transmitted in compressed form between main memory and the cache

A key challenge  Managing variable-sized data blocks: a 128-byte block that compresses to 70 bytes leaves 58 bytes of its fixed-size frame unused
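To make the wasted space concrete, here is a small sketch (not from the paper) that compresses a hypothetical 128-byte block with zlib, standing in for a hardware compressor, and reports how much of the fixed-size frame would go unused:

```python
import zlib

BLOCK_SIZE = 128  # fixed cache block size, in bytes

# A hypothetical 128-byte block with repetitive content (compresses well).
block = bytes(range(16)) * 8
assert len(block) == BLOCK_SIZE

compressed = zlib.compress(block)
unused = BLOCK_SIZE - len(compressed)

# With a fixed-size frame, the saved bytes are simply dead space
# unless the cache structure can reuse them.
print(f"compressed size: {len(compressed)} bytes, unused: {unused} bytes")
```

The exact compressed size depends on the data and compressor; the point is only that a fixed-size frame cannot exploit the slack.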

Outline  Introduction  Memory eXpansion Technology(MXT)  Cache-compression  IIC & IIC-C  Evaluation  Summary

Memory eXpansion Technology  A server-class system with hardware-compressed main memory  Uses the LZSS compression algorithm; for most applications, roughly two-to-one (2:1) compression  Hardware compression of memory has a negligible performance penalty
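LZSS itself is not in Python's standard library; as an illustrative stand-in, the LZ77-family compressor in zlib shows how the achievable ratio depends on the data (the sample inputs below are invented, not MXT workloads):

```python
import zlib

samples = {
    "all zeros": bytes(1024),
    "repetitive text": b"the quick brown fox jumps over the lazy dog " * 23,
}

for name, data in samples.items():
    # Compression ratio = original size / compressed size.
    ratio = len(data) / len(zlib.compress(data))
    print(f"{name}: {ratio:.1f}:1 compression")
```

Highly regular data easily beats the 2:1 average MXT reports; incompressible data would fall below it, which is why MXT sizes physical memory conservatively.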

Hardware organization  Sector translation table: each entry holds four physical addresses, each pointing to a 256B sector
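A minimal software model of one translation-table entry, assuming a 1KB real-address block mapped onto up to four 256B physical sectors (field names are illustrative; the real MXT table also carries flags and special-case encodings):

```python
from dataclasses import dataclass, field
from typing import List, Optional

SECTOR_SIZE = 256  # bytes per physical sector

@dataclass
class SectorTableEntry:
    """One hypothetical sector-translation-table entry: up to four
    physical sector addresses for one compressed block."""
    sector_addrs: List[Optional[int]] = field(
        default_factory=lambda: [None] * 4)

    def sectors_used(self) -> int:
        return sum(a is not None for a in self.sector_addrs)

# A block that compressed to 600 bytes needs ceil(600/256) = 3 sectors.
entry = SectorTableEntry()
needed = -(-600 // SECTOR_SIZE)  # ceiling division
for i in range(needed):
    entry.sector_addrs[i] = 0x1000 + i * SECTOR_SIZE  # made-up addresses
print(entry.sectors_used())  # prints 3
```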

Outline  Introduction  Memory eXpansion Technology(MXT)  Cache-compression  IIC & IIC-C  Evaluation  Summary

Cache compression  Most prior designs target power savings and keep conventional cache structures: the storage freed by compression benefits only by not consuming power  To actually use the space freed by compression, a new cache structure is needed

Outline  Introduction  Memory eXpansion Technology(MXT)  Cache-compression  IIC & IIC-C  Evaluation  Summary

Conventional cache structure  Each tag is statically associated with one data block  When data is compressed, the freed space in that block cannot be reused

Solution: Indirect Index Cache (IIC)  A tag entry is not associated with a particular data block  Instead, a tag entry contains a pointer to a data block

IIC structure  The cache can be fully associative

Extending the IIC to compressed data (IIC-C)  Each tag contains multiple pointers to smaller data sub-blocks
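A toy model of the indirection, assuming an invented 32-byte sub-block granularity and simplified allocation (not the paper's parameters or policies):

```python
from dataclasses import dataclass, field
from typing import Dict, List

SUBBLOCK = 32  # hypothetical sub-block size, in bytes

@dataclass
class TagEntry:
    """IIC-C-style tag: holds pointers to data sub-blocks instead of
    being statically bound to one block frame."""
    tag: int
    ptrs: List[int] = field(default_factory=list)  # sub-block frame indices

class IndirectCache:
    def __init__(self, n_subblocks: int):
        self.free: List[int] = list(range(n_subblocks))  # free frame list
        self.tags: Dict[int, TagEntry] = {}  # fully associative lookup

    def insert(self, tag: int, compressed_size: int) -> bool:
        needed = -(-compressed_size // SUBBLOCK)  # ceiling division
        if len(self.free) < needed:
            return False  # a real design would run replacement here
        self.tags[tag] = TagEntry(
            tag, [self.free.pop() for _ in range(needed)])
        return True

cache = IndirectCache(n_subblocks=8)
cache.insert(0xA, 60)   # 60B compressed -> 2 sub-blocks
cache.insert(0xB, 128)  # 128B (incompressible) -> 4 sub-blocks
print(len(cache.free))  # prints 2
```

The design choice this illustrates: a well-compressed block consumes fewer frames, so the same data array holds more blocks.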

Generational Replacement  Software-managed  Blocks are grouped into prioritized pools based on access frequency  The victim is chosen from the lowest-priority non-empty pool
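The victim-selection step can be sketched in a few lines (a simplification: the real scheme also promotes and demotes blocks between pools over time):

```python
def pick_victim(pools):
    """pools: list of pools ordered from lowest to highest priority;
    each pool is a list of block ids. Return a victim from the
    lowest-priority non-empty pool, or None if the cache is empty."""
    for pool in pools:
        if pool:
            return pool.pop(0)
    return None

# Lowest-priority pool is empty, so the victim comes from the next one.
pools = [[], ["b3"], ["b1", "b2"]]
print(pick_victim(pools))  # prints b3
```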

Additional cost  Compression/decompression engine  More space for the tag entries  Extra resources for the replacement algorithm  Area is roughly 13% larger

Outline  Introduction  Memory eXpansion Technology(MXT)  Cache-compression  IIC & IIC-C  Evaluation  Summary

Evaluation  Method: SPEC CPU2000 benchmarks  Main memory: 150-cycle latency, bus width 32, with MXT  L1: 1-cycle latency, split 16KB, 4-way, 64B block size  L2: 12-cycle latency, unified 256KB, 8-way, 128B block size  L3: 26-cycle latency, unified 1MB, 8-way, 128B block size, with IIC-C
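With the latencies above and purely hypothetical miss rates (the slide does not give them), the standard average-memory-access-time recurrence shows how the levels combine:

```python
# Latencies from the evaluation setup, in cycles.
l1_lat, l2_lat, l3_lat, mem_lat = 1, 12, 26, 150

# Illustrative, assumed miss rates for each level (not measured data).
l1_miss, l2_miss, l3_miss = 0.05, 0.30, 0.40

# AMAT = L1 + m1*(L2 + m2*(L3 + m3*Mem))
amat = l1_lat + l1_miss * (l2_lat + l2_miss * (l3_lat + l3_miss * mem_lat))
print(f"AMAT = {amat:.2f} cycles")  # prints "AMAT = 2.89 cycles"
```

Compression helps on both terms it touches: a larger effective L3 lowers `l3_miss`, and compressed transfers reduce the pressure behind `mem_lat`.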

Evaluation  Over 50% gain with only 10% area overhead

Evaluation

Summary  Advantages:  Increased effective capacity and bandwidth  Power savings from fewer memory accesses  Drawbacks:  Increased hardware complexity  Power consumption of the additional hardware

Future work  Overall power-consumption study  Applying the approach to embedded systems

END  Thank you! Question time.