A File is Not a File: Understanding the I/O Behavior of Apple Desktop Applications Tyler Harter, Chris Dragga, Michael Vaughn, Andrea C. Arpaci-Dusseau,

Presentation transcript:

A File is Not a File: Understanding the I/O Behavior of Apple Desktop Applications
Tyler Harter, Chris Dragga, Michael Vaughn, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
Department of Computer Sciences, University of Wisconsin-Madison

Why study desktop applications?
- Measurement drives file-system design
  - File systems must decide how to optimize
- Great history: many past I/O studies
  - SOSP ’81: M. Satyanarayanan. A Study of File Sizes and Functional Lifetimes.
  - SOSP ’85: Ousterhout et al. A Trace-Driven Analysis of the UNIX 4.2 BSD File System.
  - SOSP ’91: M. Baker et al. Measurements of a Distributed File System.
  - SOSP ’99: W. Vogels. File system usage in Windows NT 4.0.
- There is still uncharted territory
  - Little focus on home users
  - Little focus on individual applications
- More study can inform the design of the next generation of file systems

Outline
- Why study desktop applications?
- Case study: saving a document
  - The big picture
  - The DOC file
- General findings
- Conclusion

A case study: saving a document
- Application: Pages
  - From Apple’s iWork suite
  - Document processor (like MS Word)
- One simple task (from the user’s perspective):
  1. Create a new document
  2. Insert 15 JPEG images (each ~2.5 MB)
  3. Save to the Microsoft DOC format

[Figure: I/O during the save task. Axis: Files. Legend: small I/O, big I/O.]

Case study observations
- Auxiliary files dominate
  - Task’s purpose: create 1 file; observed I/O: 385 files are touched
  - 218 KV store files + 2 SQLite files: personalized behavior (recently used lists, settings, etc.)
  - 118 multimedia files: rich graphical experience
  - 25 Strings files: language localization
  - 17 other files: auto-save file and others

[Figure: I/O during the save task. Axes: Files, Threads. Legend: small I/O, big I/O.]

Case study observations
- Auxiliary files dominate
- Multiple threads perform I/O
  - Interactive programs must avoid blocking
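To illustrate why an interactive program hands I/O to other threads, here is a minimal Python sketch. It is not taken from the talk: the helper names, the file name, and the use of a thread pool are all assumptions made for illustration. The calling ("UI") thread submits a save and gets control back immediately, while a worker thread performs the blocking write.

```python
import os
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor


def save_document(path, data):
    """Blocking save: write the data and force it to disk.

    Returns the name of the thread that actually performed the I/O.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    return threading.current_thread().name


def save_in_background(executor, path, data):
    """Hypothetical helper: queue the save and return a Future at once."""
    return executor.submit(save_document, path, data)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        with ThreadPoolExecutor(max_workers=1, thread_name_prefix="io") as pool:
            future = save_in_background(pool, os.path.join(d, "doc.bin"), b"x" * 1024)
            # The main thread is free to keep handling events here.
            print("saved by thread:", future.result())
```

In a trace of such a program, the write and fsync calls would appear under the worker thread rather than the main thread.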

[Figure: I/O during the save task, with fsync calls marked. Axes: Files, Threads. Legend: small I/O, big I/O.]

Case study observations
- Auxiliary files dominate
- Multiple threads perform I/O
- Writes are often forced
  - KV-store + SQLite durability
  - Auto-save file

[Figure: I/O during the save task, with fsync and rename calls marked. Axes: Files, Threads. Legend: small I/O, big I/O.]

Case study observations
- Auxiliary files dominate
- Multiple threads perform I/O
- Writes are often forced
- Renaming is popular
  - Often used for key-value store
  - Makes updates atomic
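The write-then-rename pattern behind "makes updates atomic" can be sketched in Python as follows. This is an illustrative sketch, not the code the traced applications or frameworks use; the function name atomic_write and the ".tmp-" prefix are assumptions. The new contents go to a temporary file in the same directory, are forced to disk with fsync, and then replace the old file in a single rename.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at `path` with `data`.

    Readers see either the old contents or the new contents, never a mix.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same file system as the target,
    # so it is created in the same directory.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # force the data out before the rename
        os.replace(tmp_path, path)  # atomic rename on POSIX file systems
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "settings.plist")
        atomic_write(target, b"version 1")
        atomic_write(target, b"version 2")
        with open(target, "rb") as f:
            print(f.read())
```

Each update therefore costs a file creation, a write, an fsync, and a rename, which is consistent with small key-value updates producing many forced writes and renames in a trace.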

[Figure: Writing the DOC file. Legend: read, write.]

Case study observations
- Auxiliary files dominate
- Multiple threads perform I/O
- Writes are often forced
- Renaming is popular
- A file is not a file
  - DOC format is modeled after a FAT file system
  - Multiple “sub-files”
  - Application manages space allocation
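What "modeled after a FAT file system" means can be shown with a toy model in Python. This is not the real DOC layout: the class name, the 8-byte sector size, and the in-memory lists are assumptions chosen to keep the sketch short. The container is split into sectors, an allocation table links each sector to the next one of the same sub-file, and the application (not the file system) decides where each sub-file's data lands.

```python
SECTOR_SIZE = 8  # tiny sectors keep the demo readable (assumption)
END = -1         # marks the last sector in a chain


class ToyContainer:
    """One 'file' that holds several sub-files via a FAT-like table."""

    def __init__(self):
        self.sectors = []  # sector payloads, in on-disk order
        self.fat = []      # fat[i] = index of the next sector, or END
        self.start = {}    # sub-file name -> first sector index
        self.last = {}     # sub-file name -> last sector index

    def append(self, name, data):
        """Append data to a sub-file, allocating new sectors at the end."""
        for off in range(0, len(data), SECTOR_SIZE):
            idx = len(self.sectors)
            self.sectors.append(data[off:off + SECTOR_SIZE])
            self.fat.append(END)
            if name in self.last:
                self.fat[self.last[name]] = idx  # link previous sector to new one
            else:
                self.start[name] = idx
            self.last[name] = idx

    def chain(self, name):
        """Return the sector indices of a sub-file, in logical order."""
        out = []
        idx = self.start[name]
        while idx != END:
            out.append(idx)
            idx = self.fat[idx]
        return out

    def read(self, name):
        """Reassemble a sub-file by following its chain."""
        return b"".join(self.sectors[i] for i in self.chain(name))


if __name__ == "__main__":
    c = ToyContainer()
    c.append("text", b"T" * 16)
    c.append("image", b"I" * 16)
    c.append("text", b"t" * 8)
    print(c.chain("text"))   # sectors of "text" are not contiguous
    print(c.chain("image"))
```

Because the sectors of one sub-file end up scattered among those of another, reading a sub-file in logical order touches non-adjacent regions of the containing file.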

[Figure: Writing the DOC file. Legend: read, write.]

Case study observations
- Auxiliary files dominate
- Multiple threads perform I/O
- Writes are often forced
- Renaming is popular
- A file is not a file
- Sequential access is not sequential
  - Multiple sequential runs in a complex file => random accesses
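The effect of interleaving several sequential runs can be made concrete with a small Python sketch. The metric below (the fraction of accesses that begin exactly where the previous one ended) is a simplified stand-in chosen for illustration, not necessarily the definition used in the study, and the offsets are made up.

```python
def sequential_fraction(accesses):
    """Fraction of accesses that start exactly where the previous one ended.

    `accesses` is a list of (offset, length) pairs in the order issued.
    """
    if len(accesses) < 2:
        return 1.0
    sequential = 0
    prev_end = accesses[0][0] + accesses[0][1]
    for offset, length in accesses[1:]:
        if offset == prev_end:
            sequential += 1
        prev_end = offset + length
    return sequential / (len(accesses) - 1)


if __name__ == "__main__":
    # Two sub-files inside one complex file, each read front to back.
    run_a = [(0, 4096), (4096, 4096), (8192, 4096)]
    run_b = [(1_000_000, 4096), (1_004_096, 4096), (1_008_192, 4096)]
    # The application alternates between the two runs.
    interleaved = [acc for pair in zip(run_a, run_b) for acc in pair]

    print(sequential_fraction(run_a))        # each run alone is sequential
    print(sequential_fraction(run_b))
    print(sequential_fraction(interleaved))  # together they look random
```

Each run is perfectly sequential on its own, yet at the level of the whole file no access follows the previous one.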

[Figure: Writing the DOC file. Legend: read, write.]

Case study observations
- Auxiliary files dominate
- Multiple threads perform I/O
- Writes are often forced
- Renaming is popular
- A file is not a file
- Sequential access is not sequential
- Frameworks influence I/O
  - Example: update value in page function
  - Cocoa, Carbon are a substantial part of application

Outline
- Why study desktop applications?
- Case study: saving a document
- General analysis
  - Introducing iBench
  - Files
  - Accesses
  - Transactional demands
  - Threads
- Conclusion

iBench applications
- Choose popular home-user applications
- iLife suite (multimedia)
  - iPhoto
  - iTunes
  - iMovie
- iWork (like MS Office)
  - Pages (Word)
  - Numbers (Excel)
  - Keynote (PowerPoint)

iBench Tasks
- Automate 34 typical tasks (iBench task suite)
  - Importing photos, playing songs, editing movies
  - Typing documents, making charts, displaying a slideshow
- Collect I/O traces
  - Use DTrace to instrument kernel
  - System-call level traces reveal application behavior
  - Record I/O events: open, close, read, write, fsync, etc.
- The iBench traces
  - Available online:
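As a rough idea of how system-call level events can be turned into per-file-type summaries, here is a Python sketch. The record layout (thread id, call name, path, byte count) and the sample paths are hypothetical; they are not the actual iBench trace format. The sketch weights file types two ways: by number of opens and by bytes transferred.

```python
import os
from collections import Counter


def summarize(events):
    """Tally opens and transferred bytes per file extension.

    Each event is a hypothetical (thread_id, call, path, nbytes) tuple.
    """
    accesses = Counter()  # file type weighted by number of opens
    io_bytes = Counter()  # file type weighted by bytes read or written
    for _thread_id, call, path, nbytes in events:
        ext = os.path.splitext(path)[1].lower() or "(none)"
        if call == "open":
            accesses[ext] += 1
        elif call in ("read", "write"):
            io_bytes[ext] += nbytes
    return accesses, io_bytes


if __name__ == "__main__":
    # Made-up events for illustration only.
    trace = [
        (1, "open", "/prefs/a.plist", 0),
        (1, "read", "/prefs/a.plist", 512),
        (1, "open", "/prefs/b.plist", 0),
        (1, "read", "/prefs/b.plist", 256),
        (2, "open", "/docs/report.doc", 0),
        (2, "write", "/docs/report.doc", 40_000_000),
        (2, "fsync", "/docs/report.doc", 0),
    ]
    by_access, by_bytes = summarize(trace)
    print(by_access.most_common())
    print(by_bytes.most_common())
```

In this made-up trace the small preference files lead when counting opens, while the single document leads when counting bytes, which is why the two weightings are reported separately.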

iBench questions
- What different types of files are accessed?
  - Which types dominate?
- What I/O patterns are used to access the files?
  - Is I/O sequential or random?
- What are the transactional properties?
  - Are writes flushed with fsync or performed atomically?
- How are threads used?
  - How is I/O distributed across different threads?

[Figure: File type (weighted by accesses). Axis: Files.]

General observations
- Auxiliary files dominate
  - Lots of helper files
  - With hundreds of helper files, how can we minimize disk seeks?

[Figure: File type (weighted by I/O bytes). Axis: Files (weighted by I/O). Annotation: mostly complex files.]

General observations
- Auxiliary files dominate
- A file is not a file
  - Complex files have a significant presence
  - How can we allocate space for sub-files in complex files?

[Figure: Read sequentiality. Axis: Read I/O bytes. Annotation: prefetching implications.]

General observations
- Auxiliary files dominate
- A file is not a file
- Sequential access is not sequential
  - How can we prefetch intelligently based on patterns?

[Figure: Fsync (durability). Axis: Write I/O bytes.]

General observations
- Auxiliary files dominate
- A file is not a file
- Sequential access is not sequential
- Writes are often forced
  - Renders write buffering ineffective
  - Can hardware help?
  - What do applications need? Durability? Ordering?

[Figure: Fsync causes. Axis: Write I/O bytes. Annotation: explicit case.]

General observations
- Auxiliary files dominate
- A file is not a file
- Sequential access is not sequential
- Writes are often forced
- Frameworks influence I/O
  - Should there be greater integration between FS and frameworks?

[Figure: Rename and similar calls. Axis: Write I/O bytes. Annotation: locality implications.]

General observations
- Auxiliary files dominate
- A file is not a file
- Sequential access is not sequential
- Writes are often forced
- Frameworks influence I/O
- Renaming is popular
  - How should directory-locality heuristics adapt?
  - Do we need atomicity APIs?
  - Is copy-on-write always best?

[Figure: Thread I/O distribution. Axis: I/O bytes.]

General observations
- Auxiliary files dominate
- A file is not a file
- Sequential access is not sequential
- Writes are often forced
- Frameworks influence I/O
- Renaming is popular
- Multiple threads perform I/O
  - Should file systems do thread-based locality (like ext file systems)?
  - Should GUI threads receive special treatment?

Summary
The general findings agree with the case study findings:
1. Auxiliary files dominate
2. A file is not a file
3. Sequential access is not sequential
4. Writes are often forced
5. Renaming is popular
6. Multiple threads perform I/O
7. Frameworks influence I/O

Conclusion: how has the world changed?

In 1974: “No large ‘access method’ routines are required to insulate the programmer from the system calls; in fact, all user programs either call the system directly or use a small library program, only tens of instructions long…” ~ Ritchie and Thompson. The UNIX Time-Sharing System.

Conclusion: how has the world changed?
- In the past, applications:
  - Used the file-system API directly
  - Performed simple tasks well
  - Chained together for more complex actions
[Diagram labels: Application, File System]

Conclusion: how has the world changed?
- In the past, applications:
  - Used the file-system API directly
  - Performed simple tasks well
  - Chained together for more complex actions
- Today, we see:
  - Applications are graphically rich, multifunctional monoliths
    - “#include reads 112,047 lines from 689 files” ~ Rob Pike ’10
  - They rely heavily on I/O libraries
    - Cocoa, Carbon, and other frameworks
[Diagram labels: Application, File System (past); Developer’s Code, File System (today)]

Resources
- The iBench suite and the paper are available online:
  - Traces:
  - Paper: