1 COMP541 Memories - I Montek Singh Mar 21, 2016.

1 COMP541 Memories - I
Montek Singh
Mar 21, 2016

2 Topics
- Overview of Memory Types
  - Read-Only Memory (ROM): PROMs, FLASH, etc.
  - Random-Access Memory (RAM)
    - Static today
    - Dynamic next
- Verilog descriptions of memories

3 Types of Memory
- Many dimensions
  - Read Only vs. Read/Write (or write seldom)
  - Volatile vs. Non-Volatile
  - Requires refresh or not
- Look at ROM first to examine interface

4 Non-Volatile Memory Technologies
- Mask (old) → ROM
  - read-only memory
- Fuses (old) → PROM
  - programmable read-only memory
- Erasable → EPROM
  - erasable programmable read-only memory
- Electrically erasable → EEPROM
  - electrically-erasable programmable read-only memory
    - today called FLASH!
    - used everywhere!

5 Details of ROM
- Memory that is permanent
  - k address lines
  - 2^k items
  - n bits

6 Notional View of Internals
- Main components:
  - decoder for address decoding → select one row
  - “wired-OR” per bit → OR’s together minterms
    - ORing done by connecting outputs of effectively tristate buffers
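
The decoder-plus-OR structure can be sketched in SystemVerilog. This is only a behavioral illustration, not the circuit from the slide: the module name, the 4-word by 4-bit size, and the programmed contents in LINK are all invented for the example.

```systemverilog
// Notional ROM: address decoder + per-bit OR of the selected row.
// The module name, sizes, and contents are illustrative assumptions.
module rom_notional (
    input  logic [1:0] addr,   // k = 2 address lines -> 2^k = 4 rows
    output logic [3:0] dout    // n = 4 bits per word
);
    // Hypothetical programming: bit b of LINK[r] = 1 means row r drives bit b.
    localparam logic [3:0] LINK [4] = '{4'b0101, 4'b1100, 4'b0011, 4'b1111};

    logic [3:0] row;           // one-hot decoder outputs

    always_comb begin
        row  = 4'b0001 << addr;            // decoder: select exactly one row
        dout = '0;
        for (int r = 0; r < 4; r++)
            if (row[r]) dout |= LINK[r];   // OR together the selected row's connections
    end
endmodule
```

Because only one row is selected at a time, the OR simply passes through that row's pattern; every unselected row contributes 0.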

7 Programmed Truth Table

8 ROM after programming
- Remember:
  - OR is a “wired OR”
    - output is 1 if any of the rows with an intact fuse is 1
    - 0 otherwise

9 Mask ROMs
- Oldest technology
- Originally “mask” used as last step in manufacturing
  - Specify metal layer (connections)
  - Used for volume applications
  - Long turnaround
  - Used for applications such as embedded systems and, in the old days, boot ROM
  - but cheap to mass produce!

10 Programmable ROM (PROM)
- Early ones had fusible links
  - High voltage would blow out links
  - Fast to program
  - Single use

11 UV EPROM
- Erasable PROM
  - Common technologies used UV light to erase complete device
  - Took about 10 minutes
  - Holds state as charge in very well insulated areas of the chip
  - Nonvolatile for several (10?) years

12 EEPROM
- Electrically Erasable PROM
  - Similar technology to UV EPROM
  - Erased in blocks by higher voltage
  - Programming is slower than reading
- Today’s flavor is called “flash memory”
  - Digital cameras, MP3 players, BIOS
  - Limited life
  - Some support individual word write, some block
- Our boards have it:
  - A flash memory chip on our Nexys boards
  - Has a “boot block” that is carefully protected
  - We will learn to use it in upcoming labs

13 How Flash Works
- Special transistor with floating gate
- This is part of device surrounded by insulation
  - So charge placed there can stay for years
  - Aside: some newer devices store multiple bits of info in a cell
- Interested in this?
  - Let’s cover briefly

14 Flash
- Add an extra gate to an nMOS transistor
  - a “float gate” below the actual control gate
    - float gate is isolated from everything else
    - can hold electrons for a while
  - charge on float gate determines bit value stored
    - electrons deposited → negative charge does not allow transistor to turn on
    - if no electrons on float gate → transistor can be turned on by the control gate
https://en.wikipedia.org/wiki/Flash_memory

15 Flash
- Add an extra gate to an nMOS transistor
  - charge on float gate determines bit value stored
  - float gate can be cleared using high voltage
    - erased → ‘1’ value
    - cannot erase individual bits: must clear an entire “block” or “page”
    - can write individual bits
  - for fast write speeds:
    - must have empty blocks available
    - speed slows down as memory fills
    - thus, garbage collection is important
    - overprovisioning used in SSDs
https://en.wikipedia.org/wiki/Flash_memory
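
The erase and write rules above can be captured in a small behavioral sketch. It is not a model of any real flash part: the block size, word width, and task names are assumptions made for illustration. It relies on the standard flash property, implied by the slide, that erasing leaves every bit at 1 and programming can only move bits from 1 to 0.

```systemverilog
// Behavioral sketch of flash erase/program rules (not a real device model).
// Block size, word width, and task names are illustrative assumptions.
module flash_block_model;
    localparam int WORDS = 4;          // words per erase block (hypothetical)
    logic [7:0] blk [WORDS];           // one erase block

    // Erase works only on the whole block and leaves every bit at '1'.
    task automatic erase_block();
        foreach (blk[i]) blk[i] = 8'hFF;
    endtask

    // Programming deposits charge, so it can only change bits from 1 to 0.
    task automatic program_word(input int idx, input logic [7:0] data);
        blk[idx] = blk[idx] & data;
    endtask

    initial begin
        erase_block();
        program_word(1, 8'hA5);        // erased word now holds A5
        program_word(1, 8'h0F);        // no erase in between: holds A5 & 0F
        $display("word 1 = %h", blk[1]);
        erase_block();                 // the only way to get the 1s back
        program_word(1, 8'h0F);
        $display("word 1 = %h", blk[1]);
    end
endmodule
```

This is why a rewrite needs a freshly erased block: overwriting in place would corrupt the data.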

16 Read/Write Memories
- Flash is obviously writeable
  - But not meant to be written rapidly (say at CPU rates)
  - And often writing needs erasure of entire blocks
- For frequent writing, use RAM

17 Random Access Memories
- So called because it takes same amount of time to address any particular location
  - Not entirely true for modern DRAMs, but somewhat true…
- First look at asynchronous static RAM
  - reading and writing typically controlled by “handshakes”
    - clock may still be present, but actions controlled by handshake signals

18 Simple View of RAM
- Typical parameters:
  - some word size n
  - some capacity 2^k
  - k bits of address line
- Need a line to specify reading or writing
  - typically only one wire needed
    - sometimes two separate ones

19 Example: 1K x 16 memory
- RAM comes in variety of sizes
  - from 1-bit wide
  - main issue is no. of pins available on chip
- Memory size often specified in bytes
  - This would be 2KB memory
  - 10 address lines (=1K locations)
  - 16 data lines (=2 bytes/location)
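
The 2KB figure follows directly from the address and data widths:

```latex
\[
2^{10}\ \text{locations} \times 16\ \tfrac{\text{bits}}{\text{location}}
  = 16{,}384\ \text{bits}
  = 2{,}048\ \text{bytes}
  = 2\ \text{KB}
\]
```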

20 Writing
- Sequence of steps
  - Set up address lines
  - Set up data lines
  - Activate write line (e.g., maybe a positive edge)

21 Reading
- Steps
  - Set up address lines
  - Activate read line
  - Data available soon
    - for asynchronous memory: simply after a specified amount of time
    - for synchronous memory: after a clock edge

22 Chip Select
- Enable:
  - Usually a line to enable the chip
  - Why?
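
The slide leaves the "Why?" open. One common reason is that several chips can then share the same address and data wires, with only the selected chip responding. A minimal sketch of such a chip follows; the module name, the parameter defaults, and the active-high polarity of cs are assumptions, not taken from the slides.

```systemverilog
// Sketch of a RAM with a chip-select input (names and sizes are assumptions).
// When cs is low the chip ignores writes and releases the shared output bus.
module ram_with_cs #(
    parameter int Abits = 4,
    parameter int Dbits = 8
)(
    input  logic             clock,
    input  logic             cs,     // chip select, assumed active high
    input  logic             wr,     // write enable
    input  logic [Abits-1:0] addr,
    input  logic [Dbits-1:0] din,
    output tri   [Dbits-1:0] dout    // tri-state so chips can share a bus
);
    logic [Dbits-1:0] mem [2**Abits];

    always_ff @(posedge clock)
        if (cs && wr) mem[addr] <= din;

    // Drive the bus only when selected; otherwise high impedance.
    assign dout = cs ? mem[addr] : 'z;
endmodule
```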

23 Timing: Writing

24 Timing: Reading

25 Static vs. Dynamic RAM
- Different internal implementations: SRAM vs. DRAM
  - DRAM:
    - DRAM stores charge in capacitor
    - Disappears after short period of time
    - Must be refreshed
    - Small size
    - Higher storage density → larger capacities
  - SRAM:
    - SRAM easier to use
    - Uses transistors (think of it as latch)
    - Faster
    - More expensive per bit
    - Smaller sizes

26 Structure of SRAM
- Internally, each bit stored in a “latch”
  - One memory cell per bit
    - Cell consists of a few transistors
    - Not really a latch made of NANDs/NORs, but logically equivalent
    - Behaves like an SR latch
  - Control logic
    - also need extra logic around the latch to make it work like a memory cell

27 Structure of SRAM
- Several optimized circuits often used
  - replace a full-fledged SR latch with something simpler, smaller, faster…
    - Not really a latch made of NANDs/NORs, but logically equivalent
    - Behaves like an SR latch
  - e.g., a simpler 6-transistor memory cell
    - wordline → Select
    - (bitline, bitline’) → (B, B’) as well as (C, C’)

28 Example: A Simple Organization
- Note:
  - In reality, more complex
  - Only one word-line is “on” at a time

29 Zoom in: A single bit slice
- Operation:
  - Cells connected to form 1 bit position (column)
  - Word Select enables one latch from address lines
    - only this cell is writable
    - only this cell is read
  - B (and B’) set by:
    - Read/Write’
    - Data In
    - Bit Select

30 Let’s look at a single bit cell
- Example: (figure; signal values shown are 0, 1, Z, Z)

31 Bit Slices and Modules
- Entire column of cells
  - called a bit slice
    - basically a 1-bit wide memory!
- Module
  - module refers to a single chip of memory
  - 1-bit wide memory chips are quite common!

32 Inside an SRAM Bit Cell
- Actual implementation does not use a real SR latch!
  - a tinier approximation is used
  - logically behaves very much like an SR latch
  - but much smaller and faster!

33 16 X 1 RAM “Chip”
- Now shows address decoder
  - selects appropriate location

34 Row/Column Layout
- For larger RAMs:
  - decoder becomes pretty big
  - also run into chip layout issues
- Typically:
  - larger memories use “2D” matrix layout
  - see next slide

35 16 X 1 RAM as 4 X 4 Array
- Two decoders
  - Row
  - Column
- Address just broken up
- Not visible from outside on SRAMs
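
A behavioral sketch of "address just broken up": the interface is still a plain 16 x 1 RAM, and only the internal storage is arranged as rows and columns. Which address bits go to the row decoder and which to the column decoder is an assumption here, as are the module and signal names.

```systemverilog
// 16 x 1 RAM stored as a 4 x 4 array of cells (illustrative sketch).
// The 4-bit address is split: high bits pick the row, low bits the column
// (this particular split is an assumption, not taken from the slide).
module ram16x1_2d (
    input  logic       clock,
    input  logic       wr,
    input  logic [3:0] addr,
    input  logic       din,
    output logic       dout
);
    logic cell_array [4][4];          // [row][column]

    logic [1:0] row, col;
    assign row = addr[3:2];           // feeds the row decoder
    assign col = addr[1:0];           // feeds the column decoder

    always_ff @(posedge clock)
        if (wr) cell_array[row][col] <= din;

    assign dout = cell_array[row][col];
endmodule
```

Two 2-to-4 decoders replace one 4-to-16 decoder, and nothing about the split is visible at the ports.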

36 Not the same as 8 X 2 RAM!
- Minor change in logic and pins
  - Spot the difference!

37 Spot the difference!

38 Realistic Sizes
- Example: 256Kb memory organized 32K X 8
  - Single-column layout would need 15-bit decoder with 32K outputs!
- Better organization:
  - A 2D (i.e., square) layout with:
    - 9-bit row and 6-bit column decoders
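
The 9-bit and 6-bit decoder widths can be checked by arranging the bits as a square array and counting how many 8-bit words fit in one row:

```latex
\begin{align*}
32\text{K} \times 8\ \text{bits} &= 2^{15} \times 2^{3} = 2^{18}\ \text{bits}
   = 2^{9}\ \text{rows} \times 2^{9}\ \text{bits per row} \\
\text{row decoder} &: \log_2 2^{9} = 9\ \text{bits} \\
\text{words per row} &= 2^{9} / 2^{3} = 2^{6} = 64
   \;\Rightarrow\; \text{column decoder: } 6\ \text{bits} \\
\text{total address} &= 9 + 6 = 15\ \text{bits}
\end{align*}
```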

39 SRAM Performance
- Latency and Throughput important
  - Current ones have cycle times in low nanoseconds
    - say 1-2ns (top-end ones even lower)
  - Used as cache (typically on-chip or off-chip secondary cache)
- Sizes up to 8Mbit or so for fast chips
  - Expensive ones can go a bit bigger
- Energy/power
  - SRAMs also better for low power vs. DRAMs

40 Wider Memory
- What if you don’t have enough bit width?
  - use multiple chips side-by-side
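
A sketch of the side-by-side arrangement, assuming a hypothetical 8-bit-wide building block: both chips see the same address and write enable, and each stores one byte of the 16-bit word. Module names and the default address width are illustrative.

```systemverilog
// Building block: an 8-bit-wide RAM "chip" (sizes are illustrative assumptions).
module ram_chip8 #(parameter int Abits = 16)(
    input  logic             clock, wr,
    input  logic [Abits-1:0] addr,
    input  logic [7:0]       din,
    output logic [7:0]       dout
);
    logic [7:0] mem [2**Abits];
    always_ff @(posedge clock) if (wr) mem[addr] <= din;
    assign dout = mem[addr];
endmodule

// Two chips side by side: same address and write enable, each holds one byte.
module ram_wide16 #(parameter int Abits = 16)(
    input  logic             clock, wr,
    input  logic [Abits-1:0] addr,
    input  logic [15:0]      din,
    output logic [15:0]      dout
);
    ram_chip8 #(Abits) lo (.clock, .wr, .addr, .din(din[7:0]),  .dout(dout[7:0]));
    ram_chip8 #(Abits) hi (.clock, .wr, .addr, .din(din[15:8]), .dout(dout[15:8]));
endmodule
```

The number of locations is unchanged; only the word width doubles.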

41 Larger/Wider Memories
- Made up from sets of chips
- Consider a 64K by 8 RAM
  - our building block

42 Larger
- Let’s build a larger memory
  - 256K X 8
  - Decoder for high-order 2 bits
    - Selects chip
    - Look at selection logic
    - Address ranges
  - Tri-state outputs
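
The selection logic can be sketched as follows, assuming a 64K x 8 building block with an active-high chip select and a tri-state output. Port names, module names, and the cs polarity are assumptions; the slide's actual schematic may differ.

```systemverilog
// Building block: 64K x 8 RAM chip with chip select and tri-state output.
// Port names and the active-high cs polarity are assumptions for this sketch.
module ram64kx8 (
    input  logic        clock, cs, wr,
    input  logic [15:0] addr,
    input  logic [7:0]  din,
    output tri   [7:0]  dout
);
    logic [7:0] mem [2**16];
    always_ff @(posedge clock) if (cs && wr) mem[addr] <= din;
    assign dout = cs ? mem[addr] : 'z;     // unselected chips float their outputs
endmodule

// 256K x 8 memory: the high-order 2 address bits pick one of four chips.
module ram256kx8 (
    input  logic        clock, wr,
    input  logic [17:0] addr,
    input  logic [7:0]  din,
    output tri   [7:0]  dout
);
    logic [3:0] cs;
    assign cs = 4'b0001 << addr[17:16];    // 2-to-4 decoder, one-hot

    // Chip i covers addresses i*64K .. (i+1)*64K - 1; all outputs share one bus.
    for (genvar i = 0; i < 4; i++) begin : chip
        ram64kx8 u (.clock, .cs(cs[i]), .wr, .addr(addr[15:0]), .din, .dout);
    end
endmodule
```

Because exactly one cs bit is high, only one chip drives dout at a time, which is what makes it safe to tie the tri-state outputs together.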

43 SystemVerilog
- Behavioral descriptions of: ROM, single-ported RAM, dual-ported RAM, etc.

44 SystemVerilog: 1-port RAM
- RAM example
  - single-ported → one address (for reading and writing)
  - whether read or written is determined by “write enable”
  - clock
    - all writes take place on clock tick
    - reads are asynchronous, i.e., output after a propagation delay without waiting for a clock tick
- Figure: 1-port RAM block with ports clock, addr, din, dout, wr

45 SystemVerilog: 1-port RAM
    logic [Dbits-1:0] mem [Nloc-1:0];   // The actual storage where data resides
    always_ff @(posedge clock)          // Write operation on clock tick if write enabled
      if(wr) mem[addr] <= din;
    assign dout = mem[addr];            // Reading is asynchronous, no clock involved
- Figure: 1-port RAM block with ports clock, addr, din, dout, wr
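
The three lines above are only the body of a module. A complete version might look like the sketch below, which also exercises the write-then-read sequence from the earlier slides. The module name, the default parameter values, and the $clog2-based address width are assumptions, not part of the course template.

```systemverilog
// Complete module around the slide's three lines. The module name, default
// parameter values, and the $clog2-based address width are assumptions.
module ram1port #(
    parameter int Nloc  = 16,   // number of memory locations
    parameter int Dbits = 8     // bits per location
)(
    input  logic                    clock,
    input  logic                    wr,     // write enable
    input  logic [$clog2(Nloc)-1:0] addr,
    input  logic [Dbits-1:0]        din,
    output logic [Dbits-1:0]        dout
);
    logic [Dbits-1:0] mem [Nloc-1:0];       // storage

    always_ff @(posedge clock)              // synchronous write
        if (wr) mem[addr] <= din;

    assign dout = mem[addr];                // asynchronous read
endmodule

// Tiny testbench: write one word, then read it back.
module ram1port_tb;
    logic clock = 0, wr = 0;
    logic [3:0] addr;
    logic [7:0] din, dout;

    ram1port dut (.*);
    always #5 clock = ~clock;

    initial begin
        addr = 4'd3; din = 8'h5A; wr = 1;   // set up address and data, enable write
        @(posedge clock); #1;               // the write happens on the clock tick
        wr = 0;                             // read: data appears without a clock
        #1 assert (dout == 8'h5A) else $error("readback failed");
        $finish;
    end
endmodule
```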

46 SystemVerilog: 2-port RAM
- RAM example
  - 2 ports
    - one read-write port (using addr1)
    - one read-only port (using addr2)
  - 2 outputs: dout1 and dout2
  - only one data input: din
- Figure: 2-port RAM block with ports clock, wr; read-write: addr1, din, dout1; read-only: addr2, dout2

47 SystemVerilog: 2-port RAM
    logic [Dbits-1:0] mem [Nloc-1:0];   // The actual storage where data resides
    always_ff @(posedge clock)          // Write operation on clock tick if write enabled
      if(wr) mem[addr1] <= din;
    assign dout1 = mem[addr1];          // Reading is asynchronous, no clock involved
    assign dout2 = mem[addr2];
- Figure: 2-port RAM block with ports clock, wr; read-write: addr1, din, dout1; read-only: addr2, dout2

48 SystemVerilog: register file
- Register file
  - 3 ports
    - two read-only ports (using ReadAddr1 and ReadAddr2)
    - one write-only port (using WriteAddr)
  - 2 outputs: ReadData1 and ReadData2
  - one data input: WriteData
- Figure: 3-port register file block with ports clock, wr, ReadAddr1, ReadAddr2, WriteAddr, WriteData, ReadData1, ReadData2

49 SystemVerilog: register file
    logic [Dbits-1:0] rf [Nloc-1:0];    // The actual storage where data resides
    always_ff @(posedge clock)          // Write operation on clock tick if write enabled
      if(wr) rf[…] <= …;
    assign ReadData1 = rf[…];           // Reading is asynchronous, no clock involved
- Skeleton only. You fill in the details (Lab 9).
- Figure: 3-port register file block with ports clock, wr, ReadAddr1, ReadAddr2, WriteAddr, WriteData, ReadData1, ReadData2

50 SystemVerilog: memory initialization
- Specify a file that contains initial values
  - one value per line:
    - hex or binary
    - use $readmemh for hex
    - use $readmemb for binary
    logic [Dbits-1:0] mem[Nloc-1:0];
    initial $readmemh("mem_data.txt", mem, 0, Nloc-1);   // Specifies the file that contains initial values
    always_ff @(posedge clock) …
    assign …

51 SystemVerilog: ROM example
- ROM example
  - single-ported
  - read-only, no writing
  - no clock needed
    - reads are asynchronous
    - i.e., output appears after a propagation delay without waiting for a clock tick
    logic [Dbits-1:0] mem [Nloc-1:0];
    initial $readmemh("mem_data.txt", mem, 0, Nloc-1);
    assign dout = mem[addr];            // Read operation only, no writes
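
For reference, a complete ROM module built from these lines might look like the sketch below. The module name, the parameter defaults, and the sample file contents shown in the comment are invented for illustration; only the file name mem_data.txt and the body statements come from the slide.

```systemverilog
// Complete ROM module around the slide's lines. Module name, parameter
// defaults, and the sample file contents below are illustrative assumptions.
//
// Example mem_data.txt for Nloc = 4, Dbits = 8 (one hex value per line):
//   1F
//   A0
//   00
//   FF
module rom #(
    parameter int Nloc  = 4,
    parameter int Dbits = 8
)(
    input  logic [$clog2(Nloc)-1:0] addr,
    output logic [Dbits-1:0]        dout
);
    logic [Dbits-1:0] mem [Nloc-1:0];

    // Load initial contents; plain ASCII double quotes are required here.
    initial $readmemh("mem_data.txt", mem, 0, Nloc-1);

    assign dout = mem[addr];   // asynchronous read, no clock and no write port
endmodule
```

With the sample file, address 0 would read back 1F and address 3 would read back FF, since $readmemh fills locations from the start address upward.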

52 Summary
- Today we looked at:
  - Quick look at non-volatile memory
  - Static RAM
  - SystemVerilog templates for memories
- Next topic:
  - Dynamic RAM
    - Complex, largest, cheap
    - Much more design effort to use

