
1 © 2004 Morgan Kaufmann Publishers
Multilevel cache
Used to reduce the miss penalty to main memory.
First level designed:
– to reduce hit time
– to be of small size, allowing a higher miss rate
– usually implemented on the same die as the processor
Second level designed:
– to reduce the miss rate (and hence the miss penalty)
– to be larger in size
– can be on- or off-chip (built from SRAMs)

2  2004 Morgan Kaufmann Publishers Multilevel cache: example (page 505) Processor with base CPI = 1.0 (assuming all references hit in primary cache) Clock rate 5 GHz Main memory access time of 100 ns (including miss handling) Miss rate per instruction at primary cache is 2% How much faster will be the processor if we add a secondary level cache with access time 5 ns and large enough to reduce miss rate to main memory to 0.5%?

3  2004 Morgan Kaufmann Publishers Solution Using total execution time

4  2004 Morgan Kaufmann Publishers Cache Complexities Not always easy to understand implications of caches: Theoretical behavior of Radix sort vs. Quicksort Observed behavior of Radix sort vs. Quicksort

5  2004 Morgan Kaufmann Publishers Cache Complexities Here is why: Memory system performance is often critical factor –multilevel caches, pipelined processors, make it harder to predict outcomes –Compiler optimizations to increase locality sometimes hurt ILP Think of putting instructions that access same data near each other in code leading to data hazards Difficult to predict best algorithm: need experimental data

6  2004 Morgan Kaufmann Publishers Virtual memory

Memory Hierarchy
Cache (SRAM)
Main Memory (DRAM)
Disk Storage (Magnetic media)

Issues
DRAM is too expensive to buy gigabytes of:
– yet we want our programs to work even if they require more DRAM than we bought
– we also don't want a program that works on a machine with 128 MB of DRAM to stop working on a machine with only 64 MB of main memory
We run more than one program on the machine:
– the sum of the memory needed by all of them usually exceeds the amount of available memory
– we need to protect the programs from each other

Virtual Memory
The virtual memory technique: main memory acts as a cache for the secondary storage (disk).
Virtual memory is responsible for mapping blocks of memory (called pages) from one set of addresses (called virtual addresses) to another set (called physical addresses).

Virtual memory advantages
Illusion of having more physical memory
– keep only the active portions of a program in RAM
Program relocation
– maps a virtual address of a program to a physical address in memory
– the program can be put anywhere in memory: no need for a single contiguous block of main memory, since the program is relocated as a set of fixed-size pages
Protection of code and data between simultaneously running programs
– each program has its own address space
– virtual memory translates a program's address space into physical addresses while enforcing protection

Virtual memory terminology
Blocks are called pages
– a virtual address consists of a virtual page number and a page offset field (the low-order bits of the address)
Misses are called page faults
– they are generally handled as an exception
32-bit virtual address layout:
| Virtual page number (bits 31–12) | Page offset (bits 11–0) |
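The address split above can be sketched in a few lines (a minimal illustration assuming the 12-bit page offset shown; the function name is ours):

```python
PAGE_OFFSET_BITS = 12              # 4 KB pages, as in the figure above
PAGE_SIZE = 1 << PAGE_OFFSET_BITS

def split_virtual_address(va):
    """Split a 32-bit virtual address into (virtual page number, page offset)."""
    vpn = va >> PAGE_OFFSET_BITS       # high-order bits 31..12
    offset = va & (PAGE_SIZE - 1)      # low-order bits 11..0
    return vpn, offset

vpn, off = split_virtual_address(0x12345ABC)
# vpn = 0x12345, off = 0xABC
```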

Page faults
A page fault: the data is not in memory, so it must be retrieved from disk
– huge miss penalty (main memory is about 100,000 times faster than disk), thus pages should be fairly large (4 KB to 16 KB)
– reducing page faults is important (LRU replacement is worth the price)
– the faults can be handled in software instead of hardware (the overhead is small compared to the disk access time)
– using write-through is too expensive, so we use write-back
The structure that records the state of each page (in memory or on disk) is called the page table
– the page table is stored in memory

[Figure: CPU translating a virtual address to an address in main memory]
Here the page size is 2^12 = 4 KB (determined by the number of bits in the page offset).
The number of allowed physical pages is 2^18.
Thus main memory is at most 1 GB (2^30 bytes) while the virtual address space is 4 GB (2^32 bytes).

Placing a page and finding it again
The high penalty of a page fault necessitates optimizing page placement
– fully associative placement is attractive, as it allows the OS to replace any page using sophisticated LRU algorithms
A full search of main memory is impractical
– use a page table that indexes the memory and resides in memory
– the page table is indexed by the page number from the virtual address (no tags needed)
– there is a page table for each program
– a page table register indicates the page table's location in memory
– the page table may contain entries for pages that are not in main memory but on disk
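A minimal sketch of the lookup described above (our own illustration, not the book's; entry fields and names are assumptions):

```python
PAGE_OFFSET_BITS = 12                         # 4 KB pages

class PageTableEntry:
    def __init__(self, valid=False, physical_page=None):
        self.valid = valid                    # is the page resident in memory?
        self.physical_page = physical_page    # physical page number if valid

def translate(page_table, virtual_address):
    # The table is indexed directly by the virtual page number: no tags needed.
    vpn = virtual_address >> PAGE_OFFSET_BITS
    offset = virtual_address & ((1 << PAGE_OFFSET_BITS) - 1)
    entry = page_table[vpn]
    if not entry.valid:                       # page is on disk:
        raise RuntimeError("page fault")      # the OS must bring it into memory
    return (entry.physical_page << PAGE_OFFSET_BITS) | offset
```

Because the table is a direct index, finding a page costs one memory access; the price is one entry per virtual page, which motivates the size calculation later in this lecture.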

[Figure: page table; the page table register holds a pointer to the starting address of the page table in memory]
Here the page size is 2^12 = 4 KB (determined by the number of bits in the page offset).
The number of allowed physical pages is 2^18.
Thus main memory is at most 1 GB (2^30 bytes) while the virtual address space is 4 GB (2^32 bytes).
The number of entries in the page table is 2^20 (very large).

The OS Role
The OS builds and indexes the page table
The OS moves pages in and out of memory
When a process is created, the OS tries to reserve enough space on disk for all its pages; this space is called the swap area
The page table maps each page in virtual memory to either a page in main memory or a page on disk

Problem
Consider a virtual memory system with the following properties:
– 40-bit virtual address
– 36-bit physical address
– 16 KB page size
What is the total size of the page table for each process on this processor, assuming that the valid, protection, dirty, and use bits take a total of 4 bits, and that all the virtual pages are in use?

Solution
The total size equals the number of entries times the size of each entry.
Each page is 16 KB, so 14 bits of the virtual and physical addresses are used as the page offset.
The remaining 40 − 14 = 26 bits of the virtual address constitute the virtual page number; there are thus 2^26 entries in the page table, one for each virtual page number.
Each entry requires 36 − 14 = 22 bits to store the physical page number, plus 4 bits for the valid, protection, dirty, and use bits, i.e. 26 bits, which we round up to a full 32-bit word per entry.
This gives a total size of 2^26 × 32 bits = 256 MB.
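The same arithmetic, step by step (a sketch of the calculation above; variable names are ours):

```python
virtual_bits, physical_bits, page_size = 40, 36, 16 * 1024

offset_bits = page_size.bit_length() - 1      # log2(16 KB) = 14 offset bits
vpn_bits = virtual_bits - offset_bits         # 40 - 14 = 26-bit virtual page number
entries = 1 << vpn_bits                       # 2**26 page-table entries

ppn_bits = physical_bits - offset_bits        # 36 - 14 = 22-bit physical page number
entry_bits = ppn_bits + 4                     # + valid/protection/dirty/use = 26 bits
entry_bits_rounded = 32                       # round up to a full word per entry

total_bytes = entries * entry_bits_rounded // 8
print(total_bytes // 2**20, "MB")             # 256 MB
```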

Performance of virtual memory
We must access physical memory to read the page table and translate a virtual address into a physical one, and then access physical memory again to get (or store) the data:
– a load instruction performs at least 2 memory reads
– a store instruction performs at least 1 read and then a write

Translation lookaside buffer
We fix this performance problem by avoiding main memory during the translation from virtual to physical pages.
We buffer the common translations in a translation lookaside buffer (TLB): a fast cache memory dedicated to storing a small subset of valid virtual-to-physical translations.
It is usually placed before the cache.
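A tiny fully associative TLB sketch (our own illustration of the idea above, not the book's hardware; the capacity and the crude replacement policy are assumptions). On a hit, translation skips the in-memory page table entirely:

```python
PAGE_OFFSET_BITS = 12

class TLB:
    def __init__(self, capacity=16):
        self.capacity = capacity
        self.entries = {}                     # vpn -> physical page number

    def lookup(self, vpn):
        return self.entries.get(vpn)          # None means a TLB miss

    def insert(self, vpn, ppn):
        if len(self.entries) >= self.capacity:            # crude replacement:
            self.entries.pop(next(iter(self.entries)))    # evict oldest-inserted
        self.entries[vpn] = ppn

def translate(tlb, page_table, va):
    vpn = va >> PAGE_OFFSET_BITS
    offset = va & ((1 << PAGE_OFFSET_BITS) - 1)
    ppn = tlb.lookup(vpn)
    if ppn is None:                           # TLB miss: walk the page table
        ppn = page_table[vpn]                 # (this is the extra memory access)
        tlb.insert(vpn, ppn)                  # cache the translation for next time
    return (ppn << PAGE_OFFSET_BITS) | offset
```

After the first access to a page, subsequent accesses to it translate without touching the in-memory page table, removing the extra memory read described on the previous slide.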

21  2004 Morgan Kaufmann Publishers Main design questions for memory hierarchy Where can a block be placed? How is a block found? Which block should be replaced on a miss? What happens on a write?

22  2004 Morgan Kaufmann Publishers HW HW: 7.10, 7.14, 7.20, 7.32 –Due Dec 23 Section problems: 7.9, 7.12, 7.29, 7.33

Chapter 8

Interfacing Processors and Peripherals (peripheral/external devices)
I/O design is affected by many factors (expandability, resilience, dependability)
Performance:
– measured by access latency and throughput
– depends on the connection between the devices and the system, the memory hierarchy, and the operating system
A variety of different users (e.g., banks, supercomputers, engineers)

Which performance measure is important?
It depends on the application:
– multimedia applications: most I/O requests are long streams, so bandwidth is important
– tax/file processing: lots of small I/O requests; the system must handle a large number of small I/O requests simultaneously
– ATM transactions: need both high throughput and short response time

I/O Devices
Very diverse devices, differing in:
– behavior (i.e., input vs. output)
– partner (who is at the other end? human or machine)
– data rate

I/O Example: Disk Drives
To access data:
– seek: position the head over the proper track (3 to 14 ms average)
– rotational latency: wait for the desired sector (on average half a rotation, i.e. 0.5 rotation / RPM)
– transfer: read the data (one or more sectors) at 30 to 80 MB/sec
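The worked example from the slide is not in the transcript; the sketch below shows the usual average-access-time calculation built from the three components above. The parameter values are illustrative assumptions, not necessarily those of the book's example:

```python
# Average time to read one 512-byte sector (illustrative numbers).
avg_seek_ms = 6.0                      # average seek time (assumed)
rpm = 10_000                           # spindle speed (assumed)
transfer_mb_s = 50.0                   # sustained transfer rate (assumed)
controller_ms = 0.2                    # controller overhead (assumed)

# Average rotational latency: half a rotation.
rotation_ms = 0.5 * 60_000 / rpm       # 0.5 rev * (60,000 ms/min / rpm) = 3.0 ms

# Transfer time for 512 bytes = 0.5 KB.
transfer_ms = 0.5 / (transfer_mb_s * 1024) * 1000   # about 0.01 ms

total_ms = avg_seek_ms + rotation_ms + transfer_ms + controller_ms
print(round(total_ms, 2), "ms")        # 9.21 ms
```

Note that seek and rotational latency dominate; the actual data transfer is almost negligible for small requests.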

Example (page 570)

Solution

Dependability, reliability and availability
Dependability is a fuzzy concept
– it needs a reference specification
The system alternates between two states:
1. Service accomplishment: service delivered as specified
2. Service interruption: delivered service differs from the specification
A transition from 1 to 2 is a failure; a transition from 2 to 1 is a restoration.
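A common way to quantify the alternation between these two states is availability, the fraction of time the service meets its specification. The formula is standard; the numbers below are illustrative assumptions:

```python
# Availability = MTTF / (MTTF + MTTR), where
#   MTTF = mean time to failure (average time spent in state 1 before a failure)
#   MTTR = mean time to repair  (average time spent in state 2 before restoration)
def availability(mttf_hours, mttr_hours):
    return mttf_hours / (mttf_hours + mttr_hours)

# Example: a disk with a 1,000,000-hour MTTF and a 24-hour repair time.
a = availability(1_000_000, 24)
print(f"{a:.6f}")     # prints 0.999976
```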