+ Principle of Memory Hierarchy To optimize memory performance for a given cost: A set of technologies is arranged in a hierarchy that contains a relatively small amount of fast memory and larger amounts of less expensive but slower memory.
+ Memory Hierarchy Importance 1980: no cache on CPU 1989: first Intel CPU with cache on chip 1995: 2-level cache on CPU
+ Memory Storage Characteristics Location Capacity Unit of transfer Access method Performance Physical type Physical characteristics Organization
+ Memory Storage - Location CPU Registers L1, L2, L3, L4 Cache Internal Main Memory (System RAM) BIOS (EEPROM) External Magnetic Disk (HDD) Non Volatile Solid State (SSD) Optical Magnetic Tape
+ Memory Storage - Capacity Word size The natural unit of organization Expected size of most data and instructions Typically 32 bits or 64 bits Past: 16 bits Typical Storage L1 Cache: 32 – 64 KB per core L2 Cache: 128 – 512 KB per core L3 Cache: 2 – 8 MB (shared) L4 Cache: 0 – 128 MB (video memory) Main Memory (RAM): 4 – 32 GB (Typical Desktop) HDD Cache: 16 – 64 MB SSD: 64 – 512 GB HDD: 200 – 2000 GB (Inexpensive, but extremely slow) Optical: DVD: 4.7 – 17.08 GB Blu-ray: 25 – 100 GB Magnetic Tape: 10 – 35 TB per cartridge (uncompressed)
+ Memory Storage – Unit of Transfer Internal Usually governed by data bus width External Usually a block which is much larger than a word Addressable unit Smallest unit which can be uniquely addressed Byte internally (typically)
+ Memory Storage – Access Methods Sequential (tape): Shared read/write mechanism Start at the beginning and read through in order Access time depends on location of data and previous location of magnetic tape Direct (disk) Shared read/write mechanism Individual blocks have unique address Access data by jumping to vicinity plus sequential search Access time depends on location of data and previous location of read/write mechanism
+ Memory Storage – Access Methods Random (RAM): Individual addresses identify locations exactly Access time is independent of location or previous access Associative (Cache): Data is located by a comparison with contents of a portion of the store Access time is independent of location or previous access
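The difference between random and associative access can be sketched in Python. This is only a toy model: the names `ram`, `cache_entries`, `random_read` and `associative_read`, and all stored values, are made up for illustration and do not come from the slides.

```python
# Random access: the address itself selects the location.
ram = [0x10, 0x20, 0x30, 0x40]


def random_read(address: int) -> int:
    """Return the cell selected directly by its address."""
    return ram[address]


# Associative access: each entry stores a tag next to its data, and a
# lookup compares the requested tag against the stored tags.
cache_entries = [(0x7A, "alpha"), (0x13, "beta"), (0x5C, "gamma")]


def associative_read(tag: int):
    """Return the data whose stored tag matches, or None on a miss."""
    # Hardware compares all tags in parallel; this loop is a serial stand-in.
    for stored_tag, data in cache_entries:
        if stored_tag == tag:
            return data     # hit
    return None             # miss


print(random_read(2))
print(associative_read(0x13))
print(associative_read(0xFF))
```

The serial loop is the main simplification here. In an associative memory the comparison happens across all entries at once, which is why access time does not depend on where the data sits.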
+ Memory Storage – Performance Latency/Access time Time between presenting the address and getting the valid data Memory Cycle time Time may be required for the memory to “recover” before next access Cycle time is latency + recovery Transfer Rate Rate at which data can be moved # of bits * (1/cycle time)
+ Performance – Transfer Rate Example Problem Assume we have a 32-Mbit SDRAM memory with 8 bits simultaneously read and a cycle time of 250 ns. How fast can data be moved out of memory? 8b * (1/250ns) = 8b * (4 x 10^6 /s) = 32 Mbps = 4 MBps
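The arithmetic in this example can be checked with a short Python sketch. The function name `transfer_rate_bps` is an illustrative choice, not something from the slides. It applies the formula # of bits * (1/cycle time) to the values given in the problem.

```python
def transfer_rate_bps(bits_per_access: int, cycle_time_s: float) -> float:
    """Transfer rate in bits per second: bits moved per cycle / cycle time."""
    return bits_per_access / cycle_time_s


# Values from the example problem: 8 bits per access, 250 ns cycle time.
rate_bps = transfer_rate_bps(8, 250e-9)
rate_mbps = rate_bps / 1e6          # megabits per second
rate_mbytes = rate_bps / 8 / 1e6    # megabytes per second

print(f"{rate_mbps:.0f} Mbps = {rate_mbytes:.0f} MBps")
```

The total chip capacity (32 Mbit) does not enter the calculation. Only the number of bits read per cycle and the cycle time matter.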
+ Memory Storage – Physical Types Semiconductor Cache Main Memory (RAM) SSD Magnetic HDD Tape Optical CD DVD Blu-Ray Others Bubble Hologram
+ Memory Storage – Organization Physical arrangement of bits into words Not always obvious Striped across multiple disks
+ Memory Storage – RAID RAID: Redundant Array of Inexpensive Disks Combines multiple disks into a logical unit for the purposes of Data redundancy Performance Improvement Or both Can be implemented by software or hardware Software: OS controlled Hardware: Physical RAID controller
+ Memory Storage – RAID Levels RAID 0 Striped data without parity or mirroring. Performance increase. RAID 1 Mirrored data without parity or striping. Fault tolerance. RAID 5 (most common) Striped data with distributed parity. Requires at least three disks. Performance increase with Striped data and Fault tolerance. Can still operate with one failed drive.
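How RAID 5 keeps operating with one failed drive can be sketched with XOR parity. The slide does not say how parity is computed, so XOR is an assumption here. The helper `xor_blocks` and the sample data are hypothetical. A real RAID 5 array also rotates the parity block across disks from stripe to stripe, which this single-stripe sketch leaves out.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe across three disks: two data blocks plus one parity block.
data_a = b"MEMO"
data_b = b"RY!!"
parity = xor_blocks([data_a, data_b])

# Suppose the disk holding data_b fails: rebuild it from the survivors.
rebuilt_b = xor_blocks([data_a, parity])
print(rebuilt_b == data_b)
```

Because XOR is its own inverse, any single missing block in a stripe equals the XOR of all remaining blocks. This is also why at least three disks are required.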
+ Memory Storage – Bottom Line How much? Capacity How fast? Performance How expensive?
+ Memory Storage – Hierarchy List Registers L1 Cache L2 Cache L3 Cache Main Memory Disk Cache SSD HDD Optical Tape
+ Main Memory Basics Memory: Where computer stores programs and data Bit (binary digit): basic unit (8 bits = 1 byte) Each memory cell (location) has an address numbered 0,…,n-1 (for n memory cells) Possible address range limited by address size m bits in address means 2^m addresses Memory cell size (typically 1 byte) grouped together into words (typically 32 or 64 bits) 32 bit computer will typically have 32 bit registers and instructions for manipulating 32 bit words 64 bit computer will be similar
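The relation between address size and address range can be worked through in Python. The function names `address_count` and `address_bits_needed` are illustrative, and the example assumes the typical 1-byte cell mentioned on the slide.

```python
def address_count(m: int) -> int:
    """Number of distinct addresses expressible with an m-bit address."""
    return 2 ** m


def address_bits_needed(n_cells: int) -> int:
    """Smallest m such that 2**m >= n_cells (for n_cells >= 1)."""
    return max(n_cells - 1, 0).bit_length()


# With 1-byte cells, a 32-bit address reaches 2**32 bytes = 4 GiB.
print(address_count(32), "addresses")
print(address_count(32) // 2**30, "GiB with 1-byte cells")
print(address_bits_needed(1_000_000), "bits to address 1,000,000 cells")
```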
+ Semiconductor Memory Random Access Memory (RAM): All semiconductor memory is random access Directly accessed by address logic Read/Write Volatile Requires constant power supply Temporary storage Static Holds data Dynamic Periodically refreshes charge
+ Static RAM Bits stored as on/off switches (transistors) No charges to leak Does not need refresh circuits No refreshing needed when powered Larger per bit More expensive Faster Example: Cache Memory
+ SRAM Illustration When write enable is high, output is the same as input. Otherwise, output holds previous input value
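The write-enable behavior described here can be modeled as a one-bit cell in Python. This is a behavioral sketch only, not a circuit-level model, and the class name `GatedLatch` and its `step` method are hypothetical.

```python
class GatedLatch:
    """One-bit storage cell: transparent when write enable is high."""

    def __init__(self, initial: int = 0) -> None:
        self.stored = initial

    def step(self, data_in: int, write_enable: bool) -> int:
        """Apply one set of inputs and return the resulting output bit."""
        if write_enable:
            self.stored = data_in   # output follows the input
        return self.stored          # otherwise the previous value is held


cell = GatedLatch()
outputs = [
    cell.step(1, write_enable=True),    # write a 1
    cell.step(0, write_enable=False),   # input changes, value is held
    cell.step(0, write_enable=True),    # write a 0
    cell.step(1, write_enable=False),   # held again
]
print(outputs)
```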
+ Dynamic RAM Bits stored as charge in capacitors (also uses transistors) Charges leak from capacitors Needs refreshing, even when powered Needs refresh circuits Smaller per bit Less expensive Slower Asynchronous and Synchronous DRAMs Example: Main memory
+ DRAM Illustration When write enable is high, output is the same as input. Otherwise, output holds previous input value Includes capacitor refresh circuits
+ Read Only Memory (ROM) Permanent storage Microprogramming Library subroutines Systems programs Function tables
+ Measures of Memory Technology Density Latency and cycle time
+ Memory Density Refers to memory cells per unit area of silicon Usually stated as number of bits on standard chip size Examples: 1 Mb chip 4 Mb chip Memory cells typically structured in arrays 1Mb x 1 chip 256 Kb x 4 chip Note: higher density chip generates more heat
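The two array organizations listed here hold the same number of bits but differ in address width and data width. A small Python sketch makes the comparison explicit. The function `chip_organization` is an illustrative helper, and it assumes the number of words is a power of two.

```python
def chip_organization(words: int, bits_per_word: int) -> dict:
    """Total capacity and address width for a 'words x bits' memory chip."""
    return {
        "total_bits": words * bits_per_word,
        "address_bits": words.bit_length() - 1,  # log2(words) for powers of two
        "data_lines": bits_per_word,
    }


MBIT = 2 ** 20
KBIT = 2 ** 10

print(chip_organization(1 * MBIT, 1))      # 1Mb x 1
print(chip_organization(256 * KBIT, 4))    # 256 Kb x 4
```

Both calls report the same total capacity. The x 4 organization needs two fewer address bits and delivers four bits per access instead of one.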
+ Memory Packaging: Chips 16 Mbit chip (4M x 4) WE = Write Enable OE = Output Enable RAS = Row Address Select CAS = Column Address Select A0 – A10 = 11 address bits D1 – D4 = Data to be read/written NC = No connect, for even # of pins Vcc = Power Supply Vss = Ground Pin
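The pin list shows how 11 address pins can reach 4M locations: the address is sent in two halves, a row selected with RAS and a column selected with CAS, giving 22 address bits in total. The Python sketch below illustrates that split. The function `split_address` is hypothetical, and putting the row in the upper 11 bits is an assumption made for illustration.

```python
ADDRESS_PINS = 11   # A0 - A10 on the 4M x 4 chip
DATA_PINS = 4       # D1 - D4


def split_address(cell_address: int) -> tuple[int, int]:
    """Split a 22-bit cell address into (row, column) halves of 11 bits each."""
    if not 0 <= cell_address < 2 ** (2 * ADDRESS_PINS):
        raise ValueError("address out of range for this chip")
    row = cell_address >> ADDRESS_PINS                  # sent while RAS is asserted
    column = cell_address & ((1 << ADDRESS_PINS) - 1)   # sent while CAS is asserted
    return row, column


locations = 2 ** (2 * ADDRESS_PINS)      # 4M addressable locations
capacity_bits = locations * DATA_PINS    # 16 Mbit in total
print(locations // 2**20, "M locations,", capacity_bits // 2**20, "Mbit")
print(split_address(0x2ABCDE))
```

Multiplexing the address this way halves the number of address pins on the package, at the cost of sending the address in two steps.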
+ Read-Write Performance In many memory technologies, the time required to fetch information from memory differs from the time required to store information in memory. This difference can be dramatic. Performance is then characterized by separate read and write times.