Better I/O Through Byte-Addressable, Persistent Memory


Better I/O Through Byte-Addressable, Persistent Memory. Authors: Jeremy Condit, Ed Nightingale, Chris Frost, Engin Ipek, Ben Lee, Doug Burger, Derrick Coetzee.

A New World of Storage. DRAM: + Fast, + Byte-addressable, - Volatile. Disk / Flash: + Non-volatile, - Slow, - Block-addressable.

A New World of Storage. BPRAM (byte-addressable, persistent RAM): + Fast, + Byte-addressable, + Non-volatile. How do we build fast, reliable systems with BPRAM?

Phase Change Memory: the most promising form of BPRAM. “Melting memory chips in mass production” (Nature, 9/25/09).

Phase Change Memory. Slow cooling -> crystalline state (1); fast cooling -> amorphous state (0). Properties: reads 2-4x DRAM; writes 5-10x DRAM; endurance 10^8+. (Diagram labels: phase change material (chalcogenide), electrode.)

A New World of Storage. BPRAM (byte-addressable, persistent RAM): + Fast, + Byte-addressable, + Non-volatile. Disk / Flash: + Non-volatile, - Slow, - Block-addressable. How do we build fast, reliable systems with BPRAM? This talk: BPFS, a file system for BPRAM. Result: improved performance and reliability.

Goal: new guarantees for applications. File system operations will commit atomically and in program order. Your data is durable as soon as the cache is flushed. New mechanism: short-circuit shadow paging.

Design Principles. 1. Eliminate the DRAM buffer cache; use the L1/L2 cache instead. 2. Put BPRAM on the memory bus. 3. Provide atomicity and ordering in hardware. (Diagram labels: Write A, Write B.)

Outline: Intro, File System, Hardware Support, Evaluation, Conclusion.

BPRAM in the PC. BPRAM and DRAM are both addressable by the CPU. The physical address space is partitioned. BPRAM data may be cached in L1/L2. (Diagram labels: L1, L2, memory bus, DRAM, BPRAM, HD / Flash, PCI/IDE bus.)

BPFS: A BPRAM File System. Guarantees that all file operations execute atomically and in program order. Despite these guarantees, it achieves significant performance improvements over NTFS on the same media. Short-circuit shadow paging often allows atomic, in-place updates.

BPFS: A BPRAM File System. (Diagram labels: root pointer, indirect blocks, inode file, inodes, file, directory, file.)

Enforcing FS Consistency Guarantees. What happens if we crash during an update? Disk: use journaling or shadow paging. BPRAM: use short-circuit shadow paging.

Review 1: Journaling. Write to the journal, then write to the file system. Reliable, but all data is written twice. (Diagram labels: B, B’, journal, A’, B’.)
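The double write in the journaling slide can be made concrete with a toy model. This is a minimal sketch, not the file system from the talk: the `JournaledStore` class, its block names, and the byte counter are all hypothetical, and a real journal would also write a commit record and checksums.

```python
class JournaledStore:
    """Toy block store that applies updates through a write-ahead journal."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # home locations (the "file system")
        self.journal = []           # pending (block_id, data) records
        self.bytes_written = 0      # total bytes sent to the medium

    def update(self, changes):
        # Step 1: write every new block to the journal.
        for block_id, data in changes.items():
            self.journal.append((block_id, data))
            self.bytes_written += len(data)
        # Step 2: copy the journaled blocks to their home locations.
        self.replay()

    def replay(self):
        # Recovery after a crash between steps 1 and 2 would run this too.
        for block_id, data in self.journal:
            self.blocks[block_id] = data
            self.bytes_written += len(data)
        self.journal.clear()


store = JournaledStore({"A": b"aaaa", "B": b"bbbb"})
store.update({"A": b"AAAA", "B": b"BBBB"})
print(store.bytes_written)  # 8 bytes of payload cost 16 bytes of writes
```

Every payload byte reaches the medium twice, which is the cost the slide points at.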

Review 2: Shadow Paging. Use copy-on-write up to the root of the file system. Any change requires bubbling to the FS root. Small writes require large copying overhead. (Diagram labels: file’s root pointer, A, B, A’, B’.)
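A small sketch of why shadow paging copies so much for a small write. It assumes an immutable tree in which every node stands for one block; `Node` and `cow_update` are illustrative names, not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Interior block: an immutable tuple of child blocks."""
    children: tuple


def cow_update(tree, path, data):
    """Return (new_tree, blocks_copied) without modifying `tree`."""
    if not path:
        return data, 1  # fresh copy of the data block
    index, rest = path[0], path[1:]
    new_child, copied = cow_update(tree.children[index], rest, data)
    children = tree.children[:index] + (new_child,) + tree.children[index + 1:]
    return Node(children), copied + 1  # fresh copy of this interior block


old_root = Node((Node((b"A", b"B")), Node((b"C", b"D"))))
new_root, copied = cow_update(old_root, (0, 1), b"B'")
print(copied)  # one leaf changed, yet every block on the path to the root was copied
```

The old tree stays intact until the root pointer is switched to `new_root`, which is what makes the update atomic. The price is one copied block per level of the tree.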

Short-Circuit Shadow Paging. Inspired by shadow paging. Optimization: update in place when possible. Uses byte-addressability and atomic 64-bit writes. (Diagram labels: file’s root pointer, A, B, A’, B’.)
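One way to picture the short circuit: copy only the changed block, then commit by overwriting a single pointer in its parent. The sketch below assumes that each child slot models one 64-bit pointer that hardware can update atomically; `Node` and `short_circuit_update` are hypothetical names, and Python itself provides no such atomicity guarantee.

```python
class Node:
    """Interior block whose child slots each model one 64-bit pointer."""

    def __init__(self, children):
        self.children = list(children)


def short_circuit_update(root, path, data):
    """Replace the leaf at `path`; return the number of blocks copied."""
    parent = root
    for index in path[:-1]:
        parent = parent.children[index]
    new_leaf = bytes(data)                 # copy-on-write: a fresh data block
    parent.children[path[-1]] = new_leaf   # commit: one in-place pointer write
    return 1


root = Node([Node([b"A", b"B"]), Node([b"C", b"D"])])
copied = short_circuit_update(root, (0, 1), b"B'")
print(copied, root.children[0].children[1])
```

Unlike full shadow paging, the copy stops at the first level where a single pointer write can commit the change, so the root and the untouched interior blocks are never copied.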

Opt. 1: In-Place Writes. Aligned 64-bit writes are performed in place, for both data and metadata. (Diagram labels: file’s root pointer, in-place write.)
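The decision on this slide can be sketched as a simple check. This assumes a block modeled as a `bytearray` and treats an aligned 8-byte store as atomic, which is a property of the hardware described in the talk, not of Python; the `write` helper is hypothetical.

```python
WORD = 8  # bytes in one 64-bit word


def write(block, offset, payload):
    """Apply `payload` at `offset`; return (resulting block, strategy used)."""
    if offset % WORD == 0 and len(payload) == WORD:
        # One aligned 64-bit store: safe to perform directly on the live block.
        block[offset:offset + WORD] = payload
        return block, "in-place"
    # Anything else goes to a shadow copy that must be committed separately.
    shadow = bytearray(block)
    shadow[offset:offset + len(payload)] = payload
    return shadow, "copy-on-write"


block = bytearray(32)
same, how_aligned = write(block, 8, (42).to_bytes(8, "little"))
shadow, how_unaligned = write(block, 3, b"xyz")
print(how_aligned, how_unaligned)
```

The aligned write returns the original block, while the unaligned one leaves the original untouched and returns a copy.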

Opt. 2: Exploit Data-Metadata Invariants. Appends are committed by updating the file size. (Diagram labels: file’s root pointer + size, file size update, in-place append.)
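A sketch of the invariant at work: bytes beyond the recorded file size are never visible to readers, so an append can be written in place and only becomes real when the size field is updated. The `File` class and the `crash_before_commit` flag are hypothetical, included only to simulate a crash between the two steps.

```python
class File:
    """Toy file: a preallocated block plus a size field that commits appends."""

    def __init__(self, capacity):
        self.data = bytearray(capacity)
        self.size = 0  # models a 64-bit field updated with one atomic write

    def read(self):
        return bytes(self.data[:self.size])

    def append(self, payload, crash_before_commit=False):
        end = self.size + len(payload)
        if end > len(self.data):
            raise ValueError("append exceeds allocated space")
        # Step 1: write past the current size; readers cannot see this yet.
        self.data[self.size:end] = payload
        if crash_before_commit:
            return
        # Step 2: commit by updating the size.
        self.size = end


f = File(16)
f.append(b"hello")
f.append(b" world", crash_before_commit=True)  # lost, but harmless
f.append(b"!")
print(f.read())
```

After the simulated crash the file still reads as its last committed contents, and the next append simply overwrites the uncommitted bytes.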

BPFS Example. A cross-directory rename bubbles to the common ancestor. (Diagram labels: root pointer, indirect blocks, inode file, inodes, directory, directory, file; operations: add entry, remove entry.)
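To illustrate how far such a rename has to bubble, the sketch below finds the deepest directory shared by the source and destination. It works on path strings for simplicity, whereas the file system in the talk operates on its tree of blocks; the function name and the example paths are hypothetical.

```python
def common_ancestor(src_dir, dst_dir):
    """Return the deepest directory that contains both arguments."""
    src = [part for part in src_dir.split("/") if part]
    dst = [part for part in dst_dir.split("/") if part]
    shared = []
    for a, b in zip(src, dst):
        if a != b:
            break
        shared.append(a)
    return "/" + "/".join(shared)


# Removing an entry from one directory and adding it to another must commit
# together, so the copy-on-write bubbles up to this point.
print(common_ancestor("/home/alice/docs", "/home/bob"))
```

The closer the two directories are in the tree, the less has to be copied before a single pointer update can commit both changes.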

Outline: Intro, File System, Hardware Support, Evaluation, Conclusion.

Problem 1: Ordering. (Diagram labels: L1 / L2, CoW, Commit, BPRAM.)

Problem 2: Atomicity. (Diagram labels: L1 / L2, CoW, Commit, BPRAM.)

Enforcing Ordering and Atomicity. Ordering solution: epoch barriers to declare constraints. Faster than write-through. Important hardware primitive (cf. SCSI TCQ). Atomicity solution: capacitor on DIMM. Simple and cheap!

Ordering and Atomicity. Ineligible for eviction! Multiprocessor (MP) systems work too (see paper). (Diagram labels: L1 / L2, CoW, Barrier, Commit, epochs 1 and 2, BPRAM.)
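The eviction rule on this slide can be sketched as a small simulation. It is only a software model under stated assumptions: `EpochCache` and its methods are hypothetical names, each dirty line carries the epoch in which it was written, and a line may leave the cache only when no older epoch still has dirty lines. The talk describes this as a hardware mechanism.

```python
class EpochCache:
    """Toy write-back cache that persists epochs to BPRAM in order."""

    def __init__(self):
        self.epoch = 0
        self.dirty = {}  # address -> (epoch, value)
        self.bpram = {}  # persistent contents
        self.log = []    # order in which addresses reached BPRAM

    def barrier(self):
        self.epoch += 1  # later writes belong to a new epoch

    def write(self, addr, value):
        # Rewriting a line from an older epoch first forces that epoch out.
        if addr in self.dirty and self.dirty[addr][0] < self.epoch:
            self.flush_through(self.dirty[addr][0])
        self.dirty[addr] = (self.epoch, value)

    def can_evict(self, addr):
        oldest = min(epoch for epoch, _ in self.dirty.values())
        return self.dirty[addr][0] == oldest

    def evict(self, addr):
        if not self.can_evict(addr):
            raise RuntimeError("an earlier epoch is still dirty")
        _, value = self.dirty.pop(addr)
        self.bpram[addr] = value
        self.log.append(addr)

    def flush_through(self, epoch):
        # Evict every line up to `epoch`, oldest epoch first.
        for addr in sorted(self.dirty, key=lambda a: self.dirty[a][0]):
            if self.dirty[addr][0] <= epoch:
                self.evict(addr)


cache = EpochCache()
cache.write("cow_block", "B'")
cache.barrier()
cache.write("commit_pointer", "-> B'")
early = cache.can_evict("commit_pointer")  # ineligible while epoch 0 is dirty
cache.evict("cow_block")
cache.evict("commit_pointer")
print(early, cache.log)
```

The commit pointer cannot reach BPRAM before the copied block it points to, yet nothing is forced out of the cache early, which is the advantage over write-through.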

Outline: Intro, File System, Hardware Support, Evaluation, Conclusion.

Methodology. Built and evaluated BPFS in Windows. Three parts. Experimental: BPFS vs. NTFS on DRAM. Simulation: epoch barrier evaluation. Analytical: BPFS on PCM.

Microbenchmarks. (Chart series: NTFS - Disk, NTFS - RAM, BPFS - RAM. Chart annotation: NOT DURABLE!)

BPFS Throughput on PCM. (Chart: BPFS - PCM projected throughput. Series: NTFS Disk, NTFS RAM, BPFS RAM, BPFS PCM (Proj). Axis label: sustained throughput of PCM (MB/s).)

Conclusions. BPRAM changes the trade-offs for storage. Use a consistency technique designed for the medium. Short-circuit shadow paging improves performance and improves reliability. Bonus: PCM chips on display at poster session!