Flash Memory Jian-Jia Chen (Slides are based on Yuan-Hao Chang)


Flash Memory. Jian-Jia Chen (slides are based on Yuan-Hao Chang). TU Dortmund, Informatik 12, Germany. © Springer, 2010. January 27, 2015. These slides use Microsoft clip art; Microsoft copyright restrictions apply.

Why Flash Memory: Diversified Application Domains
Portable storage devices, consumer electronics, industrial applications, and critical system components.

Layout of Flash Memory
Page: basic write-operation unit (2 KB data + 64 B spare area).
Block: basic erase-operation unit.
[Figure: a 128 MB flash memory organized as blocks 0-1023, each containing pages 0-63.]

Characteristics of Flash Memory
Write-Once: no writing on the same page unless its residing block is erased! Pages are classified into valid, invalid, and free pages.
Bulk-Erasing: pages are erased in block units to recycle used but invalid pages.
Wear-Leveling: each block has a limited lifetime in erase cycles, e.g., 10,000-100,000 erase cycles per block.

Terminology
Valid data: the latest version of data stored in flash.
Invalid data: not the latest version of data stored in flash.
Live page: a page that stores valid data.
Dead page: a page that stores invalid data.
Free page: a page that is erased and ready to store data.
Free block: a block that is erased and not allocated to store any data.
Hot data: frequently updated data; valid hot data might become invalid in the near future.
Cold data: infrequently updated data; valid cold data might stay in the same place for a long time.

Management Issues – Flash-Memory Characteristics
Example 1: Out-place Update
Because flash memory is write-once, we do not overwrite data on update. Instead, the new versions of data A and data B are written to free pages, and the old versions are considered dead (dead pages).
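The out-of-place update above can be sketched in a few lines of Python. This is a minimal illustrative model, not any real FTL: the class name, page states, and `location` map are assumptions made for the sketch.

```python
FREE, LIVE, DEAD = "free", "live", "dead"

class Flash:
    """Toy flash model: out-of-place update marks the old copy dead."""
    def __init__(self, num_blocks=4, pages_per_block=8):
        # state[b][p] is the state of page p in block b
        self.state = [[FREE] * pages_per_block for _ in range(num_blocks)]
        self.location = {}  # logical address -> (block, page) of the live copy

    def write(self, lba):
        # Write-once: invalidate the old copy instead of overwriting it.
        if lba in self.location:
            b, p = self.location[lba]
            self.state[b][p] = DEAD
        # Place the new version on any free page.
        for b, block in enumerate(self.state):
            for p, st in enumerate(block):
                if st == FREE:
                    block[p] = LIVE
                    self.location[lba] = (b, p)
                    return b, p
        raise RuntimeError("no free page: garbage collection needed")

flash = Flash()
flash.write("A"); flash.write("B")
flash.write("A")   # update A out of place: the old page of A becomes dead
dead_pages = sum(row.count(DEAD) for row in flash.state)
print(dead_pages)  # 1
```

Updating "A" does not touch its old page except to mark it dead; the new copy lands on the next free page, exactly as in the figure.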

Management Issues – Flash-Memory Characteristics
Example 2: Garbage Collection
After some usage, the amount of free space becomes low. Garbage collection reclaims free pages by erasing blocks. In this example, there are 4 blocks of 8 pages each, containing live (L), dead (D), and free (F) pages. Suppose we want to recycle the dead pages of the first block, which holds 3 live pages and 5 dead pages.

Management Issues – Flash-Memory Characteristics
Example 2: Garbage Collection (Cont.)
Because there are live pages in the block, the useful (live) data are first copied somewhere else. This copying is called live-page copying.

Management Issues – Flash-Memory Characteristics
Example 2: Garbage Collection (Cont.)
Finally, we perform a block erase to erase all pages of the block. A net of 5 free pages is reclaimed, because 3 free pages were consumed by the live-page copyings before the block erase. As we can see, garbage collection has overheads: live-data copying and block erasing.
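The whole garbage-collection example can be checked with a short sketch. This is an illustrative model only, using 'L'/'D'/'F' for live, dead, and free pages as in the figures; the function name and victim-selection are assumptions.

```python
def garbage_collect(blocks, victim):
    """Copy live pages of the victim block elsewhere, then erase the victim.
    Returns (number of live-page copies, net free pages reclaimed)."""
    free_before = sum(blk.count("F") for blk in blocks)
    copies = 0
    for st in blocks[victim]:
        if st == "L":
            # find a free page in some other block for the live copy
            for b, blk in enumerate(blocks):
                if b != victim and "F" in blk:
                    blk[blk.index("F")] = "L"
                    copies += 1
                    break
    blocks[victim] = ["F"] * len(blocks[victim])   # block erase
    free_after = sum(blk.count("F") for blk in blocks)
    return copies, free_after - free_before

blocks = [list("LDDLDDLD"),            # victim: 3 live, 5 dead pages
          list("LLDLLLFD"),
          list("LFLLLLDF"),
          list("FLLFLLFD")]
copies, reclaimed = garbage_collect(blocks, 0)
print(copies, reclaimed)   # 3 live-page copies, net gain of 5 free pages
```

Three free pages are consumed by the copies and eight are freed by the erase, so the net gain is five, matching the slide.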

Management Issues – Flash-Memory Characteristics
Example 3: Wear-Leveling
Each flash block can endure only a limited number of erase cycles; beyond that limit, writes to the block suffer frequent errors. Wear-leveling intends to erase all blocks evenly. In this example, the numbers denote the erase cycle counts of blocks A-D (100, 10, 20, and 15, respectively). It is obviously not a good idea to recycle block A, since it has already been erased 100 times. Alternatively, we could erase block B, C, or D, but that is inefficient because those blocks do not contain many dead pages. We can see that wear-leveling might interfere with the decisions of the block-recycling policy.

Single-Level Cell (SLC) vs. Multi-Level Cell (MLC)
A limited bound on erase cycles: SLC: 100,000; MLCx2: 10,000.
Bit error probability: SLC: 10^-9; MLC: 10^-6.
As flash-memory capacity grows, large-capacity devices adopt MLC flash for its low cost: each MLC cell stores more than one bit of information, but endures fewer erase cycles. By comparison, an SLC cell stores only one bit, endures more erase cycles, and is more expensive.

MLC vs. SLC (Cont.)
Source: Electronic Engineering Times, July 2005.

Price and Read/Write Performance

         NOR            NAND SLC        NAND MLCx2
Price    34.55 $/GB     6.79 $/GB       2.48 $/GB
Read     23.84 MB/sec   15.33 MB/sec    13.5 MB/sec
Write    0.07 MB/sec    4.57 MB/sec     2.34 MB/sec
Erase    0.22 MB/sec    85.33 MB/sec    170.66 MB/sec

*NOR: Silicon Storage Technology (SST). NAND SLC: Samsung Electronics K9F1G08Q0M. NAND MLCx2: STMicroelectronics [1,2].
1. Jian-Hong Lin, Yuan-Hao Chang, Jen-Wei Hsieh, Tei-Wei Kuo, and Cheng-Chih Yang, "A NOR Emulation Strategy over NAND Flash Memory," 13th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA), Daegu, Korea, August 21-24, 2007.
2. Yuan-Hao Chang and Tei-Wei Kuo, "A Log-based Management Scheme for Low-cost Flash-memory Storage Systems of Embedded Systems."

Flash Memory Management
Management issues: write constraints imposed by flash memory; scalability; garbage collection; performance considerations.
Reliability issues: the cell error rate problem imposed by MLC flash memory; error correction coding vs. wear leveling; read/write disturbance; data retention.

Typical System Architecture

Flash Translation Layer (FTL)
FTL adopts a page-level address-translation mechanism. Its main problem is the large memory space required to store the address-translation information.

NAND Flash Translation Layer (NFTL)
A logical address under NFTL is divided into a virtual block address and a block offset, e.g., with 8 pages per block, LBA = 1019 => virtual block address (VBA) = 1019 / 8 = 127 and block offset = 1019 % 8 = 3.
The NFTL address-translation table (in main memory) maps each VBA to a primary block and a replacement block, e.g., VBA 127 => (primary block address = 9, replacement block address = 23). A write to LBA = 1019 goes to the page at block offset 3 of the primary block; if that page has already been used, the data are written to the first free page of the replacement block.
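The address split and the primary/replacement write rule can be sketched as follows. This assumes 8 pages per block as in the slide's example; the function names are made up for illustration.

```python
def nftl_translate(lba, pages_per_block=8):
    """Split a logical address into (virtual block address, block offset)."""
    return lba // pages_per_block, lba % pages_per_block

def nftl_write(primary, replacement, offset):
    """Write to the page at `offset` in the primary block if it is free;
    otherwise to the first free page of the replacement block."""
    if primary[offset] == "free":
        primary[offset] = "used"
        return "primary", offset
    i = replacement.index("free")
    replacement[i] = "used"
    return "replacement", i

print(nftl_translate(1019))                 # (127, 3)
primary, replacement = ["free"] * 8, ["free"] * 8
print(nftl_write(primary, replacement, 3))  # ('primary', 3)
print(nftl_write(primary, replacement, 3))  # ('replacement', 0)
```

The second write to the same LBA cannot reuse the primary page (write-once), so it falls through to the replacement block.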

FTL vs. NFTL

                              FTL      NFTL
Memory Space Requirements     Large    Small
Address Translation Time      Short    Long
Garbage Collection Overhead   Less     More
Space Utilization             High     Low

Memory-space requirements for 1 GB NAND (2 KB/page, 4 B/table entry, 128 pages/block):
FTL: 2 MB (= 4 * (1024*1024*1024) / 2K)
NFTL: 32 KB (= 2 * 4 * (1024*1024*1024) / (2K * 128))

Size of Translation Tables

        1GB     32GB    1TB     32TB
FTL     2MB     64MB    2GB     64GB
NFTL    32KB    1MB     32MB    1GB

No matter which granularity of address translation is adopted, the fast-growing flash-memory capacity will eventually make the translation table too large to fit in RAM.
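The table entries above follow directly from the stated parameters (2 KB pages, 4 B entries, 128 pages per block); here is a quick arithmetic check, with helper names invented for the sketch.

```python
PAGE, ENTRY, PPB = 2 * 1024, 4, 128   # 2 KB page, 4 B entry, 128 pages/block
GB = 1024 ** 3

def ftl_table_bytes(capacity):
    # FTL keeps one translation entry per page
    return ENTRY * capacity // PAGE

def nftl_table_bytes(capacity):
    # NFTL keeps two entries (primary + replacement block) per virtual block
    return 2 * ENTRY * capacity // (PAGE * PPB)

print(ftl_table_bytes(1 * GB) // 1024)    # 2048 KB, i.e. 2 MB
print(nftl_table_bytes(1 * GB) // 1024)   # 32 KB
print(ftl_table_bytes(1024 * GB) // GB)   # 2 GB for a 1 TB flash
```

Scaling capacity by 32 scales both tables by 32, which reproduces every column of the table.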

Adaptive Two-Level Mapping Mechanism
The coarse-grained hash table maintains each primary block and its replacement block; it is mainly used to identify the primary and replacement blocks.
The fine-grained hash table has a limited number of entries to store the excessive live pages in replacement blocks. When the fine-grained table is full, a replacement policy must move some pages back to the coarse-grained table.
This improves space utilization: the delayed recycling of a primary block makes its free pages likely to be used in the future.
Chin-Hsien Wu and Tei-Wei Kuo, "An Adaptive Two-Level Management for the Flash Translation Layer in Embedded Systems," IEEE/ACM International Conference on Computer-Aided Design (ICCAD), November 5-9, 2006.
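A rough sketch of the two-level lookup path described above. All structures and names here are illustrative assumptions; the actual ICCAD'06 design differs in its details (replacement policy, table organization).

```python
# coarse-grained table: VBA -> (primary block, replacement block)
coarse = {127: (9, 23)}
# fine-grained table: LBA -> exact (block, page) of "excessive" live pages
fine = {1019: (23, 2)}

def lookup(lba, pages_per_block=8):
    """Two-level lookup: exact location if cached fine-grained,
    otherwise fall back to the coarse-grained block pair."""
    if lba in fine:                       # fast path: exact page known
        return ("direct",) + fine[lba]
    vba, off = divmod(lba, pages_per_block)
    primary, replacement = coarse[vba]
    # without a fine-grained entry, the replacement block may need scanning
    return ("via-blocks", primary, replacement, off)

print(lookup(1019))   # ('direct', 23, 2)
print(lookup(1016))   # ('via-blocks', 9, 23, 0)
```

The fine-grained table trades a little RAM for avoiding replacement-block scans on the hottest pages.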

New Write Constraints of MLC Flash
Pages can only be written sequentially within a block, and partial page writes/programming are prohibited.
Impact on NFTL: data cannot be written to an arbitrary free page of a primary block, so the space utilization of primary blocks is even lower and most writes are forced into the replacement block. Pages of invalid data cannot be marked as dead, so each read operation must scan the pages of the replacement block.

Intuitive and Practical Solution
Paged translation tables: pages of the translation tables are stored in flash and cached in RAM.
Problems: the hit ratio of cached table pages; extra page reads and writes for translation information; crash recovery for the translation table and lost data.

Performance Considerations
MLC flash has a growing market share. Reason: low cost and high density. Drawbacks: low speed, low endurance, and low reliability.
Solutions: hardware multi-channel programming; software multi-bank or multi-channel programming.

Wear Leveling
[Figure: erase-cycle counts (1000-5000) over physical block addresses (PBA), comparing no wear leveling, dynamic wear leveling, intuitive static wear leveling, and perfect static wear leveling.]

Static Wear Leveling
Random policy: it randomly selects a block to reclaim after a fixed number of block erases or write requests. It does not track the locality of data accesses, so it might move hot data that would turn dead in the near future.

Static Wear Leveling (Cont.)
Random policy with a block-erasing table: each bit flag of the table indicates whether the corresponding blocks have been erased. Whenever the block erases are not even enough, select blocks whose corresponding bit flags are not set.
Pros: it can identify the locality of data accesses. Cons: the block-erasing table needs extra RAM space, even though it is comparatively small.
Yuan-Hao Chang, Jen-Wei Hsieh, Tei-Wei Kuo: Endurance Enhancement of Flash-Memory Storage Systems: An Efficient Static Wear Leveling Design. DAC 2007: 212-217.

The Block Erasing Table (BET)
A bit array in which each bit covers 2^k consecutive blocks. A small k favors hot-cold data separation; a large k favors small RAM space.
fcnt: the number of 1's in the BET. ecnt: the total number of block erases done since the BET was reset.
[Figure: flash blocks 1-15 with per-block-set ecnt/fcnt values and the corresponding BET flags, shown for k=0 and k=2; marked blocks have been erased in the current resetting interval or are indexed by the Cleaner for erasure.]

A Simple Static Wear Leveler
An unevenness level (ecnt / fcnt) >= T triggers the SW Leveler; the BET is reset when all its flags are set. (T is a threshold, T = 1000 in this example.)
When triggered, the Cleaner (1) copies the valid data of the selected block set to a free area, (2) erases the blocks in the selected block set, and (3) informs the Allocator to update the address mapping between LBA and PBA. For instance, with ecnt = 2000 and fcnt = 2, 2000 / 2 = 1000 >= T, so the leveler is triggered. Later, when ecnt / fcnt >= T again but all flags in the BET are already 1 (e.g., 4000 / 4 = 1000 >= T), the BET is reset, starting from a randomly selected block set (flag).
Note that sequentially scanning blocks when selecting block sets for static wear leveling is very effective in the implementation. We surmise that this design behaves close to a random selection policy in reality, because cold data could virtually exist in any block of the flash memory's physical address space.
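A sketch of the BET bookkeeping and the unevenness trigger, loosely after the Chang/Hsieh/Kuo DAC'07 design. The class name and method names are invented for the sketch, and the Cleaner/Allocator actions are omitted.

```python
class BET:
    """Block Erasing Table: one flag per 2^k consecutive blocks."""
    def __init__(self, num_blocks, k):
        self.k = k
        self.flags = [0] * (num_blocks >> k)
        self.ecnt = 0   # total block erases since the BET was reset
        self.fcnt = 0   # number of 1's in the BET

    def record_erase(self, block):
        self.ecnt += 1
        i = block >> self.k          # which block set this block belongs to
        if not self.flags[i]:
            self.flags[i] = 1
            self.fcnt += 1

    def should_level(self, T):
        """Trigger static wear leveling when unevenness ecnt/fcnt >= T."""
        return self.fcnt > 0 and self.ecnt / self.fcnt >= T

    def reset_if_full(self):
        """Reset the BET once every flag has been set."""
        if all(self.flags):
            self.flags = [0] * len(self.flags)
            self.ecnt = self.fcnt = 0
            return True
        return False

bet = BET(num_blocks=16, k=2)      # 4 flags, each covering 4 blocks
for _ in range(2000):
    bet.record_erase(5)            # hammer a single block set
print(bet.should_level(T=1000))    # True: 2000 erases over 1 set flag
for b in range(16):
    bet.record_erase(b)            # now every flag is set
print(bet.reset_if_full())         # True: the BET is reset
```

With k=2 the table needs only num_blocks/4 bits, at the cost of coarser hot-cold separation, which is the trade-off the slide describes.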

Main-Memory Requirements (MLCx2, 1 page = 2 KB, 1 block = 128 pages)

       512MB   1GB     2GB     4GB     8GB
k=0    256B    512B    1024B   2048B   4096B
k=1    128B    256B    512B    1024B   2048B
k=2    64B     128B    256B    512B    1024B
k=3    32B     64B     128B    256B    512B

File-System Considerations - Observations
Archive: Linux-2.6.17 source. Number of files: 19,535. Number of directories: 1,200. Average file size: 11 KB. Archive size: 215,666 KB. FAT is the default file system for removable storage devices.

Writing the files to a removable storage device:
File system                NTFS     FAT
Number of write requests   24,513   179,670
Time taken (min:sec)       4:33     54:21

Reading the files from a removable storage device:
File system                NTFS     FAT
Number of read requests    14,568   23,528
Time taken (min:sec)       2:53     3:19

FAT file systems introduce excessive write requests! Why?

File-System Considerations - Observations (Cont.)
[Figure: layout of a FAT file system starting at LBA 0. Writing each file touches FAT #1, FAT #2, the directory entry, and the file content, and the directory entry is updated again afterward.]
https://www.pjrc.com/tech/8051/ide/fat32.html

File-System Considerations
[Figure: (a) a host with card readers or USB flash drives, layered as applications, operating-system API, file-system drivers, filter drivers, disk drivers, device drivers, and bus drivers; (b) flash-memory cards, containing a device controller and the storage media.]

Overview: Cohesive-Caching Algorithm
The filter driver sits between the USB mass-storage device driver and the USB bus driver and contains five components: the Transport Unit, the Filesystem Identifier, the Cache, the Dispatch Unit, and the Debug Unit. The cohesive-caching policy is implemented in the Cache component. When a device is plugged in, the Transport Unit first accesses the device's boot sector and passes the information (partition table and boot sectors) to the Filesystem Identifier, which detects the device's partitions and file-system layout and forwards this information to the Cache. Read and write I/O packets from the upper layer are sent to the Cache first, while other I/O packets (control commands) bypass it to the Transport Unit. The Dispatch Unit notifies the Cache when to flush cached data to the device, to preserve data integrity, and the Debug Unit sends debug messages to the console for the debug viewer. The question is: when should the Dispatch Unit notify the Cache?

The "Allow/Prevent Medium Removal" Operation Code in the USB Bulk-Only Transport
(USB Mass Storage Class, Bulk-Only Transport protocol. CBW: Command Block Wrapper; CSW: Command Status Wrapper.)
When the FAT file system writes a file, it issues a series of write requests to the disk driver (to FAT #1, FAT #2, the directory entry, and the file content). Each write command passing through the USB mass-storage device driver requires several USB transfers (a <CBW>, the <DATA>, and a <CSW>), so the write amplification ratio is very high.
However, the USB mass-storage protocol issues a "Prevent Medium Removal" command before such a series of commands and an "Allow Medium Removal" command right after it, to signal that the drive may be removed. This means the driver can start caching data when "Prevent Medium Removal" is received and flush the cached data when "Allow Medium Removal" is received, which takes only a few milliseconds. Since users normally do not unplug a USB drive while a copy operation is still in progress, the probability of data loss is very low. Without this property, we would not know when to cache and when to flush; a caching system inside the file system cannot exploit it.

Worsening Reliability - Narrow Threshold Voltage Window
MLC/TLC/QLC technology must squeeze the available threshold-voltage window for each logical state (SLC vs. MLC), leading to a higher bit error rate and lower endurance.
Source: http://liobaashlyritchie.blogspot.tw/2013/05/benefits-of-robust-error-correction-for.html

Reliability Issues
Low-cost MLC flash has fewer erase cycles and an increasingly high bit error rate.
Ways to improve reliability: error correction coding (ECC), which acts passively, using non-erasure codes such as BCH and RS; and wear leveling, which acts proactively by distributing block erases as evenly as possible.
Note: with a non-erasure code, errors in a received packet can be detected/corrected; with an erasure code, lost packets can be recovered from other received packets. Erasure codes are used in communication; non-erasure codes are used in storage systems.

Error Correction Coding
The error correction code of a page is stored in its spare area. The time spent on error correction might be affected by the location or the number of error bits, and the area of the ECC hardware increases with the number of supported error bits.
Trend of MLC flash: the page size is getting larger, and the bit error rate is getting higher.
[Figure: threshold-voltage (Vt) distribution showing fast-erasing bits, i.e., fast worn-out flash cells.]

Data Retention Problem
The guaranteed data retention is 10 years. As cell sizes shrink, the number of electrons in the floating gate of a flash cell decreases. For example, a programmed cell may store 10,000 electrons; a loss of only 10% of this charge can lead to a wrong read. Thus, a loss of less than 2 electrons per week can be tolerated.
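The "less than 2 electrons per week" figure is simple arithmetic over the 10-year retention period; a quick check:

```python
electrons = 10_000                    # charge of a programmed cell
tolerable_loss = 0.10 * electrons     # a 10% loss causes a wrong read
weeks = 10 * 365.25 / 7               # ~521.8 weeks in 10 years
per_week = tolerable_loss / weeks
print(round(per_week, 2))             # ~1.92 electrons per week
```

So the tolerable leakage is under 2 electrons per week, as stated above.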