Presentation is loading. Please wait.

Presentation is loading. Please wait.

EEE-445 Review: Major Components of a Computer Processor Control Datapath Memory Devices Input Output Cache Main Memory Secondary Memory (Disk)

Similar presentations


Presentation on theme: "EEE-445 Review: Major Components of a Computer Processor Control Datapath Memory Devices Input Output Cache Main Memory Secondary Memory (Disk)"— Presentation transcript:

1 EEE-445 Review: Major Components of a Computer Processor Control Datapath Memory Devices Input Output Cache Main Memory Secondary Memory (Disk)

2 EEE-445 Second Level Cache (SRAM) A Typical Memory Hierarchy Control Datapath Secondary Memory (Disk) On-Chip Components RegFile Main Memory (DRAM) Data Cache Instr Cache ITLB DTLB Speed (ns):.1’s 1’s 10’s 100’s 1,000’s Size (bytes): 100’s K’s 10K’s M’s T’s Cost: highest lowest  By taking advantage of the principle of locality: Present the user with as much memory as is available in the cheapest technology. Provide access at the speed offered by the fastest technology.

3 EEE-445 Memory Hierarchy Technologies  Random Access Memories (RAMs) “Random” is good: access time is the same for all locations DRAM: Dynamic Random Access Memory -High density (1 transistor cells), low power, cheap, slow -Dynamic: need to be “refreshed” regularly (~ every 4 ms) SRAM: Static Random Access Memory -Low density (6 transistor cells), high power, expensive, fast -Static: content will last “forever” (until power turned off) Size: DRAM/SRAM ­ ratio of 4 to 8 Cost/Cycle time: SRAM/DRAM ­ ratio of 8 to 16  “Non-so-random” Access Technology Access time varies from location to location and from time to time (e.g., disk, CDROM)

4 EEE-445 RAM Memory Uses and Performance Metrics  Caches use SRAM for speed  Main Memory is DRAM for density Addresses divided into 2 halves (row and column) -RAS or Row Access Strobe triggering row decoder -CAS or Column Access Strobe triggering column selector  Memory performance metrics Latency: Time to access one word -Access Time: time between request and when word is read or written (read access and write access times can be different) -Cycle Time: time between successive (read or write) requests -Usually cycle time > access time Bandwidth: How much data can be supplied per unit time -width of the data channel * the rate at which it can be used

5 EEE-445 Classical SRAM Organization (~Square) rowdecoderrowdecoder row address 4-bit data word RAM Cell Array word (row) select line bit (data) lines Each intersection represents a 6-T SRAM cell Column Selector & I/O Circuits column address  One memory row holds a block of data, so the column address selects the requested word from that block

6 EEE-445 data bit Classical DRAM Organization (~Square Planes) rowdecoderrowdecoder row address Column Selector & I/O Circuits column address data bit bit (data) lines Each intersection represents a 1-T DRAM cell  The column address selects the requested bit from the row in each plane data word... RAM Cell Array word (row) select line n planes (where n is the # of bits in the word)

7 EEE-445 Classical DRAM Operation  DRAM Organization: N rows x N column x M-bit (planes) Read or Write M-bit at a time Each M-bit access requires a RAS / CAS cycle Row Address CAS RAS Col AddressRow AddressCol Address Access Time N rows N cols DRAM M bits Row Address Column Address M-bit Output Cycle Time

8 EEE-445 Synchronous RAMs  Synchronous RAMs have the ability to transfer a burst of data from a series of sequential addresses that are contained in the same row  Have speed advantages since don’t have to provide the complete (row and column) addresses for words in the same burst Page mode DRAMs – Specify the starting (row) address and then the successive column addresses SDRAMs – Specify starting (row+column) address and burst length (number of words in the row). Data in the burst is transferred under control of a clock signal.  DDR SDRAMs (Double Data Rate SDRAMs) Transfer burst data on both the rising and falling edge of the clock (twice as much bandwidth)

9 EEE-445 N rows N cols DRAM Column Address M-bit Output M bits N x M SRAM Row Address (Fast) Page Mode DRAM Operation Row Address CAS RAS Col Address 1st M-bit Access Col Address 2nd M-bit3rd M-bit4th M-bit  A row is read into an N x M SRAM buffer Only CAS is needed to access other M-bit blocks on that row (burst size) RAS remains asserted while CAS is changed Cycle Time

10 EEE-445 N rows N cols DRAM Column Address M-bit Output M bit planes N x M SRAM Row Address Synchronous DRAM (SDRAM) Operation  After a row is read into the SRAM register Input CAS as the starting “burst” address along with a burst length Transfers a burst of data from a series of sequential addresses within that row - A clock controls transfer of successive words in the burst +1 Row Address CAS RAS Col Address 1 st M-bit Access2 nd M-bit3 rd M-bit4 th M-bit Cycle Time Row Add

11 EEE-445 DRAM Memory Latency & Bandwidth Milestones  In the time that the memory to processor bandwidth has doubled the memory latency has improved by a factor of only 1.2 to 1.4  To deliver such high bandwidth, the internal DRAM has to be organized as interleaved memory banks 526275125170225Latency (nsec) 16006402671604013BWidth (MB/s) 6654201816 Pins/chip 204170130704535Die size (mm 2 ) 256641610.250.06Mb/chip 200019971993198619831980Year 64b 32b16b Module Width DDR SDRAM SDRAMPage DRAM DRAM Patterson, CACM Vol 47, #10, 2004

12 EEE-445 The Memory Hierarchy L1$ L2$ Main Memory Secondary Memory Increasing distance from the processor in access time Processor (Relative) size of the memory at each level Inclusive– what is in L1$ is a subset of what is in L2$ is a subset of what is in MM that is a subset of is in SM 4-8 bytes (word) 1 to 4 blocks 1,024+ bytes (disk sector = page) 8-32 bytes (block)  Take advantage of the principle of locality to present the user with as much memory as is available in the cheapest technology at the speed offered by the fastest technology

13 EEE-445 The Memory Hierarchy: Why Does it Work?  Temporal Locality (Locality in Time):  Keep most recently accessed instructions/datum closer to the processor  Spatial Locality (Locality in Space):  Move blocks consisting of contiguous words to the upper levels Lower Level Memory Upper Level Memory To Processor From Processor Blk X Blk Y

14 EEE-445 The Memory Hierarchy: Terminology  Hit: data is in some block in the upper level (Blk X) Hit Rate: the fraction of memory accesses found in the upper level Hit Time: Time to access the upper level which consists of -RAM access time + Time to determine hit/miss  Miss: data is not in the upper level so needs to be retrieve from a block in the lower level (Blk Y) Miss Rate = 1 - (Hit Rate) Miss Penalty: Time to replace a block in the upper level + Time to deliver the block the processor Hit Time << Miss Penalty Lower Level Memory Upper Level Memory To Processor From Processor Blk X Blk Y

15 EEE-445 How is the Hierarchy Managed?  registers  memory by compiler (programmer?)  cache  main memory by the cache controller hardware  main memory  disks by the operating system (virtual memory) virtual to physical address mapping assisted by the hardware (TLB) by the programmer (files)

16 EEE-445 Why? The “Memory Wall” The Processor vs DRAM speed “gap” continues to grow Clocks per instruction Clocks per DRAM access Good cache design is increasingly important to overall performance

17 EEE-445  Two questions to answer (in hardware): Q1: How do we know if a data item is in the cache? Q2: If it is, how do we find it?  Direct mapped For each item of data at the lower level, there is exactly one location in the cache where it might be - so lots of items at the lower level must share locations in the upper level Address mapping: (block address) modulo (# of blocks in the cache) First consider block sizes of one word The Cache

18 EEE-445 Caching: A Simple First Example 00 01 10 11 Cache 0000xx 0001xx 0010xx 0011xx 0100xx 0101xx 0110xx 0111xx 1000xx 1001xx 1010xx 1011xx 1100xx 1101xx 1110xx 1111xx Main Memory TagData Q1: Is it there? Valid Two low order bits define the byte in the word (32-b words) Q2: How do we find it? (block address) modulo (# of blocks in the cache) Index

19 EEE-445 Caching: A Simple First Example 00 01 10 11 Cache Main Memory Q2: How do we find it? Use next 2 low order memory address bits – the index – to determine which cache block (i.e., modulo the number of blocks in the cache) TagData Q1: Is it there? Compare the cache tag to the high order 2 memory address bits to tell if the memory block is in the cache Valid 0000xx 0001xx 0010xx 0011xx 0100xx 0101xx 0110xx 0111xx 1000xx 1001xx 1010xx 1011xx 1100xx 1101xx 1110xx 1111xx Two low order bits define the byte in the word (32b words) (block address) modulo (# of blocks in the cache) Index

20 EEE-445 Direct Mapped Cache 0123 4 3415  Consider the main memory word reference string 0 1 2 3 4 3 4 15 Start with an empty cache - all blocks initially marked as not valid

21 EEE-445 Direct Mapped Cache 0123 4 3415  Consider the main memory word reference string 0 1 2 3 4 3 4 15 00 Mem(0) 00 Mem(1) 00 Mem(0) 00 Mem(1) 00 Mem(2) miss hit 00 Mem(0) 00 Mem(1) 00 Mem(2) 00 Mem(3) 01 Mem(4) 00 Mem(1) 00 Mem(2) 00 Mem(3) 01 Mem(4) 00 Mem(1) 00 Mem(2) 00 Mem(3) 01 Mem(4) 00 Mem(1) 00 Mem(2) 00 Mem(3) 014 11 15 00 Mem(1) 00 Mem(2) 00 Mem(3) Start with an empty cache - all blocks initially marked as not valid 8 requests, 6 misses

22 EEE-445  One word/block, cache size = 1K words MIPS Direct Mapped Cache Example 20 Tag 10 Index Data IndexTagValid 0 1 2. 1021 1022 1023 31 30... 13 12 11... 2 1 0 Byte offset What kind of locality are we taking advantage of? 20 Data 32 Hit


Download ppt "EEE-445 Review: Major Components of a Computer Processor Control Datapath Memory Devices Input Output Cache Main Memory Secondary Memory (Disk)"

Similar presentations


Ads by Google