Accurate and Efficient Replaying of File System Traces. Nikolai Joukov, Timothy Wong, and Erez Zadok, Stony Brook University (FAST 2005), USENIX Conference.


1 Accurate and Efficient Replaying of File System Traces. Nikolai Joukov, Timothy Wong, and Erez Zadok, Stony Brook University (FAST 2005), USENIX Conference on File and Storage Technologies. Presented by Hsu Hao Chen

2 Outline Introduction Design Architecture Reproduce original timing problem Replayfs trace Threads and their scheduling Zero copying of data File system caches Implementation Evaluation Conclusions

3 Introduction Trace replaying is useful for file system benchmarking, stress-testing, debugging, and forensics. File system traces can be captured and replayed at different logical levels: system calls; Virtual File System (VFS); network level, for network file systems; device driver.

4 Design Architecture (1/2) Tracefs: a stackable file system that captures the traces to be replayed

5 Design Architecture (2/2) Replayfs: VFS-level replayer

6 Design Reproduce original timing problem If t_replayer > t_user, then the timing and I/O rate cannot be reproduced correctly.
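The timing condition on this slide can be illustrated with a small simulation. This is a minimal sketch under a simplifying assumption that is not in the slides: the replayer needs a fixed preparation time before each operation, and the operations themselves take no time. The function name `replay_lag` and its parameters are hypothetical.

```python
def replay_lag(issue_times, replayer_overhead):
    """Return how late (in seconds) each operation is issued during replay.

    issue_times: original issue times from the trace, in non-decreasing order.
    replayer_overhead: assumed fixed time the replayer needs between two
        consecutive operations (plays the role of t_replayer).
    """
    lags = []
    prev_issued = None
    for target in issue_times:
        if prev_issued is None:
            # Assume the first operation is issued exactly on time.
            issued = target
        else:
            # Never issue early; issue late if the replayer is still busy.
            issued = max(target, prev_issued + replayer_overhead)
        lags.append(issued - target)
        prev_issued = issued
    return lags
```

With operations one second apart (t_user = 1.0), an overhead of 0.5 s keeps every lag at zero, while an overhead of 1.5 s makes the lag grow with every operation, so neither the timing nor the I/O rate of the original run is preserved.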

7 Design Problems with system-call replayers running in user mode: redundant data copying between user and kernel buffers; page eviction is not completely controlled from the user level; replaying processes can be preempted by other tasks; some kernels are not preemptive and have long execution paths.

8 Design Replayfs trace (1/4)

9 Design Replayfs trace (2/4) Tracefs: a trace captured by a tracer is often portable, descriptive, and verbose, to offer as much information as possible. Trace compiler: a user-mode program for conversion and optimization of the raw Tracefs traces. It splits the raw Tracefs trace into three components: Commands, Resource Allocation Table (RAT), and Buffers.
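The three-way split performed by the trace compiler can be sketched as follows. This is only an illustration: the record layout (`op`, `path`, `offset`, `data`), the tuple format of a command, and the use of a path-to-slot dictionary as the RAT are all assumptions made for the example, not the actual Tracefs or Replayfs formats.

```python
def compile_trace(raw_records):
    """Split verbose raw records into commands, a RAT, and a data buffer.

    raw_records: iterable of dicts with hypothetical keys
        "op", "path", and optionally "offset" and "data" (bytes).
    Returns (commands, rat, buffer).
    """
    commands = []         # compact per-operation tuples
    rat = {}              # resource name -> slot index (stand-in for the RAT)
    buffer = bytearray()  # all payload bytes, stored back to back

    for rec in raw_records:
        # Each distinct resource gets one slot; commands refer to it by index.
        slot = rat.setdefault(rec["path"], len(rat))
        data = rec.get("data", b"")
        buf_offset = len(buffer)
        buffer += data
        commands.append(
            (rec["op"], slot, rec.get("offset", 0), buf_offset, len(data))
        )
    return commands, rat, bytes(buffer)
```

Keeping commands small and index-based means the replayer only follows slot numbers and buffer offsets at replay time instead of parsing verbose records.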

10 Design Replayfs trace (3/4)

11 Design Replayfs trace (4/4) Memory buffers are accessed for reading only, because the information read from the disk is discarded.

12 Design Threads and their scheduling Replayfs issues requests to the lower file system on behalf of different threads; resource contention (disk head repositioning, locks, etc.); Replayfs reuses threads if possible. Pre-spin: increases event precision (standard event timers: 1 ms); clock thread; CPU cycle counters.
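The pre-spin idea can be sketched at user level: sleep on a coarse timer until shortly before the deadline, then busy-wait on a high-resolution clock. This is only an analogy, since Replayfs does this inside the kernel; the function name `wait_until` and the 2 ms `spin_window` default are assumptions chosen for the example.

```python
import time


def wait_until(deadline, spin_window=0.002):
    """Block until time.perf_counter() reaches `deadline`.

    deadline: target time on the time.perf_counter() clock, in seconds.
    spin_window: assumed length of the final busy-wait phase, in seconds.
    Returns the clock reading taken right after the wait ends.
    """
    remaining = deadline - time.perf_counter()
    if remaining > spin_window:
        # Coarse phase: let the scheduler wake us a little before the deadline.
        time.sleep(remaining - spin_window)
    # Fine phase: spin on the high-resolution clock until the deadline.
    while time.perf_counter() < deadline:
        pass
    return time.perf_counter()
```

The spin phase trades CPU time for precision, so the window should stay close to the resolution of the coarse timer.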

13 Design Zero copying of data There is no easy way for a user-mode program to read data without copying it to user space. Running in kernel mode helps: a data page that belongs to the trace file can be moved to the target file by just changing several pointers.

14 Design File system caches Replayfs supports three replaying modes for dealing with read operations: (1) Current cache state: Replayfs calls all the captured buffer read operations. (2) Original cache state: reads are invoked at the page level, only for the pages that were not found in the cache during tracing. (3) Reads are not replayed at all.
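The three read-replaying modes amount to a filter over the captured reads. A minimal sketch, assuming each captured read is a per-page record carrying a hypothetical `was_cached` flag recorded at trace time; the mode names are also invented for the example.

```python
def reads_to_replay(read_ops, mode):
    """Select which captured reads to issue under a given replaying mode.

    read_ops: list of dicts, each with a hypothetical boolean "was_cached"
        telling whether the page was found in the cache during tracing.
    mode: "current_cache", "original_cache", or "no_reads" (assumed names).
    """
    if mode == "current_cache":
        # Issue every captured read; the current cache decides what hits disk.
        return list(read_ops)
    if mode == "original_cache":
        # Issue only the reads that missed the cache when the trace was taken.
        return [op for op in read_ops if not op["was_cached"]]
    if mode == "no_reads":
        # Reads are skipped entirely.
        return []
    raise ValueError(f"unknown replaying mode: {mode!r}")
```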

15 Implementation Linux kernel: both Tracefs and Replayfs can now be used on either 2.4 or 2.6 Linux kernels. [Figure labels: Kernel module, Application program, Kernel module]

16 Evaluation Test environment: a 1.7GHz Pentium 4 machine with 1GB of RAM; the system disk was a 30GB 7200 RPM IDE disk formatted with Ext3; the machine had two Maxtor Atlas 15,000 RPM 18.4GB Ultra320 SCSI disks formatted with Ext2, storing the traces and the Replayfs traces.

17 Evaluation Tools and Workloads Am-utils build: building Am-utils is a CPU-intensive benchmark. Postmark: simulates the operation of electronic mail servers. Pread: used to evaluate Replayfs's CPU time consumption. It spawns two threads that concurrently read 1KB buffers of cached data using the pread system call. Pread performed 100 million read operations.

18 Evaluation Memory Overheads [Figure; values shown: 56%, 70%, 45%]

19 Evaluation Timing Precision of Replaying (1/2) [Figure; axis labels: Time (seconds), Number of operations]

20 Evaluation Timing Precision of Replaying (2/2) [Figure; axis labels: Time (seconds), Number of operations]

21 Evaluation CPU Time Consumption [Figure; values shown: 32%, 61%] User-level replayers cannot replay traces like Pread at the same rate as the original.

22 Conclusions Trace replaying offers a number of advantages for file system benchmarking, debugging, and forensics. Replayfs has three distinct benefits: (1) It captures and replays all file system operations, including important memory-mapped operations. (2) As a kernel module, it avoids unnecessary data copying, reduces the number of context switches, and optimizes trace data prefetching. (3) It has precise control over thread scheduling (pre-spin).

