Myoungsoo Jung (UT Dallas) Mahmut Kandemir (PSU)

1 Revisiting Widely Held SSD Expectations and Rethinking System-Level Implications
Myoungsoo Jung (UT Dallas) Mahmut Kandemir (PSU) The University of Texas at Dallas Computer Architecture and Memory Systems Lab.

2 Outline
Motivation
Evaluation Setup
Testing Expectations on: Reads, Writes, Advanced Schemes

3 Outline
Motivation
Evaluation Setup
Testing Expectations on: Reads, Writes, Advanced Schemes

4 We know SSDs!
Reads: 10x~100x better than writes; reliable (no erase); fast random accesses; less overhead
Writes: GC impacts; DRAM buffer; still faster than HDD

5 We are carefully using them!!
Reads: read cache, memory extension, read-only storage, virtual memory
Writes: burst buffer, checkpointing, swap/hibernation management

6 Then, why do we need to rethink?
NAND Core: has shrunk from 5x nm to 2x nm; less reliable / extra operations; longer latency
Architecture: multiple channels and pipelining; queue/IO rescheduling methods; internal DRAM buffer cache
Packaging: multiple dies and planes; package-level queue; ECC engines; fast data movement interfaces
Firmware/OS: advanced mapping algorithms; TRIM; background task management

7 [Roadmap diagram: expectations of the OS, Admin, and HPC App — Read Performance, Reliability on Reads, Write Performance/Background Tasks, OS Supports — mapped across Firmware/OS, Architecture, Packaging, NAND Core]

8 Outline
Motivation
Evaluation Setup
Testing Expectations on: Reads, Writes, Advanced Schemes

9 SSD Test-Beds
Multi-core SSD
DRAM-less SSD
High-reliability SSD
Over-provisioned SSD

10 Tools
Intel Iometer
LeCroy Sierra M6-1 SATA protocol analyzer
In-house AHCI miniport driver

11 [Roadmap diagram: Read Performance, Reliability on Reads, Write Performance/Background Tasks, OS Supports — mapped across Firmware/OS, Architecture, Packaging, NAND Core]

12 Are SSDs good for applications that exhibit mostly random reads?
Observation: performance with random read accesses is worse than with other types of access patterns and operations

13 Are SSDs good for applications that exhibit mostly random reads?
39% 59% 23% [SSD-C] [SSD-L] [SSD-Z]

14 Are SSDs good for applications that exhibit mostly random reads?
Latency with random reads is 23% ~ 59% worse [SSD-C] [SSD-L] [SSD-Z]

15 Are SSDs good for applications that exhibit mostly random reads?
No! Why? On the host path, the FTL performs address translation for reads and address remapping for writes

16 Are SSDs good for applications that exhibit mostly random reads?
No! Why?
Random writes → sequential writes (by remapping addresses on writes)
Lack of internal parallelism on random reads
Resource conflicts on random accesses
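The remapping point above can be sketched with a toy page-level, log-structured FTL (an illustrative model, not any vendor's actual firmware): writes append to a sequential frontier regardless of their logical address, while reads must chase the mapping back to physically scattered pages.

```python
class LogStructuredFTL:
    """Toy page-mapping FTL: random logical writes become sequential
    physical programs; random reads land on scattered physical pages."""

    def __init__(self, num_pages):
        self.mapping = {}        # logical page number -> physical page number
        self.next_free = 0       # sequential write frontier
        self.num_pages = num_pages

    def write(self, lpn):
        ppn = self.next_free % self.num_pages
        self.next_free += 1
        self.mapping[lpn] = ppn  # remap; the old physical page becomes invalid
        return ppn

    def read(self, lpn):
        return self.mapping.get(lpn)  # None if never written

ftl = LogStructuredFTL(1024)
# Logical writes arrive in random order...
physical = [ftl.write(lpn) for lpn in (73, 5, 911, 2)]
# ...but land on consecutive physical pages 0, 1, 2, 3.
```

This is why the slide's observation holds: the write path hides randomness behind remapping, but the read path has no such trick, so random reads expose the scattered layout directly.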

17 Can we achieve sustained read performance with seq. accesses?
Observation: sequential read performance degrades as the SSD ages and as more I/O requests are processed

18 Can we achieve sustained read performance with seq. accesses?
Most I/O requests are served within 200 us [SSD-C] [SSD-L] [SSD-Z]

19 Can we achieve sustained read performance with seq. accesses?
2x ~ 3x worse than pristine-state SSDs [SSD-C] [SSD-L] [SSD-Z]

20 Can we achieve sustained read performance with seq. accesses?
No! Why? We believe this read performance degradation is mainly caused by a fragmented physical data layout

21 [Roadmap diagram: Read Performance, Reliability on Reads, Write Performance/Background Tasks, OS Supports — mapped across Firmware/OS, Architecture, Packaging, NAND Core]

22 Do program/erase (PE) cycles of SSDs increase during read only operations?
Observation: read requests can shorten an SSD's lifespan; PE cycles induced by reads are not well managed by the underlying firmware

23 Do program/erase (PE) cycles of SSDs increase during read only operations?
PE cycles on reads reach 1% ~ 50% of PE cycles on writes: 247x [PE cycles on seq. access pattern], 12x [PE cycles on rand. access pattern]; 1 hour of I/O service per evaluation round

24 Do program/erase (PE) cycles of SSDs increase during read only operations?
Unfortunately, yes. Why? Read disturbance: during a read, the selected wordline is biased at 0V while unselected wordlines see Vpass; cells on the Vpass wordlines can gain charge over many reads, eventually forcing the firmware to perform an erase
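A hedged sketch of why read-only workloads still consume P/E cycles: firmware typically tracks a per-block read count and, past a read-disturb threshold, migrates the data and erases the block. The threshold value below is illustrative, not a figure from the paper or any datasheet.

```python
READ_DISTURB_LIMIT = 100_000  # assumed threshold, for illustration only

class Block:
    def __init__(self):
        self.read_count = 0   # reads since the last erase
        self.pe_cycles = 0    # erases incurred so far

    def read_page(self):
        self.read_count += 1
        if self.read_count >= READ_DISTURB_LIMIT:
            self.refresh()

    def refresh(self):
        # Copy valid data elsewhere, then erase this block:
        # one extra P/E cycle incurred purely by reads.
        self.pe_cycles += 1
        self.read_count = 0

blk = Block()
for _ in range(250_000):
    blk.read_page()
# 250k reads cross the threshold twice -> 2 read-induced erases.
```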

25 [Roadmap diagram: Read Performance, Reliability on Reads, Write Performance/Background Tasks, OS Supports — mapped across Firmware/OS, Architecture, Packaging, NAND Core]

26 TRIM
[Diagram: after FILE A and FILE B are deleted, TRIM marks their pages INVALID so their blocks can be wiped out; remaining pages stay VALID]

27 Can TRIM command reduce GC overheads?
Observation: SSDs do not trim all the data, and SSD performance under TRIM depends strongly on the TRIM command submission pattern (SEQ-TRIM vs. RND-TRIM)

28 Can TRIM command reduce GC overheads?
SEQ-TRIM ≈ pristine-state SSD (trimmed SSD = pristine-state SSD??); RND-TRIM ≈ NON-TRIM [SSD-C] [SSD-Z]
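One plausible reason for the SEQ-TRIM vs. RND-TRIM gap, sketched as a toy model (an assumption about firmware behavior, not the paper's measured mechanism): a block is reclaimable for free only when every page in it has been trimmed. Sequential TRIMs invalidate whole blocks; the same number of scattered TRIMs leaves partially valid blocks that GC must still copy-and-erase.

```python
PAGES_PER_BLOCK = 4  # illustrative geometry

def free_blocks(trimmed_pages, total_pages):
    """Count blocks whose pages are ALL trimmed (erasable without copying)."""
    count = 0
    for b in range(total_pages // PAGES_PER_BLOCK):
        pages = range(b * PAGES_PER_BLOCK, (b + 1) * PAGES_PER_BLOCK)
        if all(p in trimmed_pages for p in pages):
            count += 1
    return count

total = 16
seq_trim = set(range(8))               # pages 0..7: two whole blocks trimmed
rnd_trim = {0, 2, 3, 5, 7, 9, 11, 14}  # same page count, scattered
# SEQ-TRIM frees 2 blocks outright; RND-TRIM frees none.
```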

29 Can TRIM command reduce GC overheads?
[SSD-C] [SSD-Z]

30 Please take a look!!
We tested 25 questions. The paper includes 59 different types of empirical evaluation, including:
Runtime bad block management and ECC overheads
Physical data layout performance impact
DRAM caching impact
Background tasks, etc.

31 Thank you!

32 Backup

33 [Roadmap diagram: Read Performance, Reliability on Reads, Write Performance, OS Supports — mapped across Firmware/OS, Architecture, Packaging, NAND Core]

34 How much impact does the worst-case latency have on modern SSDs?
The worst-case latencies of fully-utilized SSDs are much worse than those of HDDs

35 How much impact does the worst-case latency have on modern SSDs?
Average latency: 2x ~ 173x better than an enterprise-scale HDD [Average latency -- SSDs vs. enterprise HDD]
Worst-case latency: 12x ~ 17x worse than a 10K RPM HDD [Worst-case latency -- SSDs vs. enterprise HDD]

36 What is the correlation between the worst-case latency and throughput?
Observation: in the worst case, SSD latency and bandwidth become 11x and 3x worse, respectively, than for normal writes; this degradation is not recovered even after many GCs have executed

37 What is the correlation between the worst-case latency and throughput?
Recovered immediately [SSD-C] [SSD-L]

38 What is the correlation between the worst-case latency and throughput?
Write-cliff Performance is not recovered [SSD-C] [SSD-L]

39 What is the correlation between the worst-case latency and throughput?
Write cliff. Why? [GC diagram: data block, update block, new block, and free block pool, with VALID/INVALID pages] The range of random access addresses is not covered by the reclaimed block, so reclaimed blocks still contain valid pages that must be copied
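That effect can be illustrated with a deterministic toy model (my own sketch, not the paper's experiment): GC evicts the block with the fewest valid pages, so its cost is the number of valid pages copied out of that victim. Updates concentrated on a few blocks empty them out, making GC nearly free; updates spread over the whole drive leave every block mostly valid, so every GC pays a heavy copy cost.

```python
def gc_copy_cost(update_span, blocks=64, pages_per_block=32, updates=1024):
    """Return the copy cost (valid pages) of the cheapest GC victim
    after `updates` writes spread round-robin over `update_span` blocks."""
    valid = [pages_per_block] * blocks   # valid-page count per block
    for i in range(updates):
        b = i % update_span              # which block this update hits
        if valid[b] > 0:
            valid[b] -= 1                # an update invalidates one old page
    return min(valid)                    # pages GC must copy from the victim

narrow = gc_copy_cost(update_span=4)     # updates concentrated on 4 blocks
wide = gc_copy_cost(update_span=64)      # updates spread over all 64 blocks
# narrow -> 0 (victims are fully invalid); wide -> 16 (every victim half-valid)
```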

40 Could DRAM buffer help the firmware to reduce GC overheads?
Observation: DRAM buffers offer 4x shorter latency before the write cliff, but introduce 2x ~ 16x worse latency after the write cliff kicks in

41 Could DRAM buffer help the firmware to reduce GC overheads?
[Latency charts, SSD-C and SSD-L: 4x better before the write cliff, 16x worse after]

42 Could DRAM buffer help the firmware to reduce GC overheads?
No! Why? Flushing buffered data introduces a large number of random accesses, which can in turn accelerate GC invocation and the onset of the write cliff
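A minimal sketch of that flushing behavior (assuming an LRU write buffer; the capacity and addresses are illustrative, not from the paper): overwrites are absorbed while the buffer has room, but once it fills, each new write evicts a dirty entry whose address is unrelated to the incoming stream, so the flash medium sees scattered writes that feed GC.

```python
from collections import OrderedDict

class WriteBuffer:
    """Toy LRU write buffer in front of the flash medium."""

    def __init__(self, capacity):
        self.buf = OrderedDict()   # LBA -> dirty entry, oldest first
        self.capacity = capacity
        self.flushed = []          # addresses actually sent to flash

    def write(self, lba):
        if lba in self.buf:
            self.buf.move_to_end(lba)    # overwrite absorbed in DRAM
            return
        if len(self.buf) >= self.capacity:
            victim, _ = self.buf.popitem(last=False)
            self.flushed.append(victim)  # eviction: scattered write to flash
        self.buf[lba] = True

wb = WriteBuffer(capacity=4)
for lba in [10, 20, 30, 40, 10, 50, 60]:
    wb.write(lba)
# The flush stream [20, 30] bears no relation to the incoming order.
```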

43 Can background tasks of current SSDs guarantee sustainable performance?
0.1% (30 idle secs) [SSD-C BFLUSH]; 7% (1 idle hour) [SSD-L BGC]

44 Why can’t BGC help with the foreground tasks?
Endurance characteristics
Accelerated block erasure
Power consumption during idle periods

45 Does TRIM command incur any overheads?
Modern SSDs require much longer latencies to trim data than a normal I/O operation would take
I-TRIM: address-based data invalidation with a prompt response
E-TRIM: block erasure in real time
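The two TRIM styles named above can be contrasted with a toy latency model (the constants below are assumptions for illustration, not measurements from the paper): I-TRIM only updates the mapping and replies promptly, so its latency is flat, while E-TRIM erases blocks synchronously, so its latency scales with the number of blocks touched.

```python
INVALIDATE_US = 50   # assumed cost of mapping-table bookkeeping
ERASE_US = 2000      # assumed per-block NAND erase time (ballpark)

def trim_latency_us(blocks, mode):
    """Modeled latency of trimming `blocks` flash blocks."""
    if mode == "I-TRIM":
        return INVALIDATE_US        # invalidate and respond; erase deferred
    if mode == "E-TRIM":
        return blocks * ERASE_US    # real-time erasure of every block
    raise ValueError(f"unknown TRIM mode: {mode}")

# Trimming 10 blocks: I-TRIM stays at 50 us; E-TRIM climbs to 20,000 us.
```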

46 Does TRIM command incur any overheads?
[SSD-Z] [SSD-L]

47 Does TRIM command incur any overheads?
[SSD-C] [SSD-A]

48 What types of background tasks exist in modern SSDs?
BFLUSH: flushes in-memory data to the flash medium, creating extra room to buffer new incoming I/O requests
BGC: performs GC in the background

49 What types of background tasks exist in modern SSDs?
[Cache-on SSD] [Cache-off SSD]

50 What types of background tasks exist in modern SSDs?
[Cache-on SSD] [Cache-off SSD]

51 What types of background tasks exist in modern SSDs?
Excluding BFLUSH, only one SSD (SSD-L) performs BGC. Yet several published benchmark results assume BGC, and SSD makers have indicated that they alleviate GC overheads by exploiting idle times

52 Read Overheads: ECC recovery, runtime bad block management

