Lazy Asynchronous I/O For Event-Driven Servers Khaled Elmeleegy, Anupam Chanda and Alan L. Cox Department of Computer Science Rice University, Houston,

Slides:



Advertisements
Similar presentations
Numbers Treasure Hunt Following each question, click on the answer. If correct, the next page will load with a graphic first – these can be used to check.
Advertisements

1
1 Vorlesung Informatik 2 Algorithmen und Datenstrukturen (Parallel Algorithms) Robin Pomplun.
Chapter 7 Constructors and Other Tools. Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 7-2 Learning Objectives Constructors Definitions.
Copyright © 2003 Pearson Education, Inc. Slide 1 Computer Systems Organization & Architecture Chapters 8-12 John D. Carpinelli.
Chapter 1 The Study of Body Function Image PowerPoint
Processes and Operating Systems
Copyright © 2011, Elsevier Inc. All rights reserved. Chapter 6 Author: Julia Richards and R. Scott Hawley.
Author: Julia Richards and R. Scott Hawley
1 Copyright © 2013 Elsevier Inc. All rights reserved. Chapter 3 CPUs.
Properties Use, share, or modify this drill on mathematic properties. There is too much material for a single class, so you’ll have to select for your.
UNITED NATIONS Shipment Details Report – January 2006.
RXQ Customer Enrollment Using a Registration Agent (RA) Process Flow Diagram (Move-In) Customer Supplier Customer authorizes Enrollment ( )
Business Transaction Management Software for Application Coordination 1 Business Processes and Coordination. Introduction to the Business.
© 2010 Pearson Addison-Wesley. All rights reserved. Addison Wesley is an imprint of Chapter 11: Structure and Union Types Problem Solving & Program Design.
1 RA I Sub-Regional Training Seminar on CLIMAT&CLIMAT TEMP Reporting Casablanca, Morocco, 20 – 22 December 2005 Status of observing programmes in RA I.
Properties of Real Numbers CommutativeAssociativeDistributive Identity + × Inverse + ×
Create an Application Title 1A - Adult Chapter 3.
Custom Statutory Programs Chapter 3. Customary Statutory Programs and Titles 3-2 Objectives Add Local Statutory Programs Create Customer Application For.
FACTORING ax2 + bx + c Think “unfoil” Work down, Show all steps.
1 Processes and Threads Creation and Termination States Usage Implementations.
Chapter 5 Input/Output 5.1 Principles of I/O hardware
1 Click here to End Presentation Software: Installation and Updates Internet Download CD release NACIS Updates.
1. 2 Objectives Become familiar with the purpose and features of Epsilen Learn to navigate the Epsilen environment Develop a professional ePortfolio on.
REVIEW: Arthropod ID. 1. Name the subphylum. 2. Name the subphylum. 3. Name the order.
Break Time Remaining 10:00.
Turing Machines.
PP Test Review Sections 6-1 to 6-6
Bright Futures Guidelines Priorities and Screening Tables
EIS Bridge Tool and Staging Tables September 1, 2009 Instructor: Way Poteat Slide: 1.
Bellwork Do the following problem on a ½ sheet of paper and turn in.
CS 6143 COMPUTER ARCHITECTURE II SPRING 2014 ACM Principles and Practice of Parallel Programming, PPoPP, 2006 Panel Presentations Parallel Processing is.
Operating Systems Operating Systems - Winter 2011 Dr. Melanie Rieback Design and Implementation.
Operating Systems Operating Systems - Winter 2012 Dr. Melanie Rieback Design and Implementation.
Exarte Bezoek aan de Mediacampus Bachelor in de grafische en digitale media April 2014.
1 public class Newton { public static double sqrt(double c) { double epsilon = 1E-15; if (c < 0) return Double.NaN; double t = c; while (Math.abs(t - c/t)
Copyright © 2012, Elsevier Inc. All rights Reserved. 1 Chapter 7 Modeling Structure with Blocks.
1 RA III - Regional Training Seminar on CLIMAT&CLIMAT TEMP Reporting Buenos Aires, Argentina, 25 – 27 October 2006 Status of observing programmes in RA.
Factor P 16 8(8-5ab) 4(d² + 4) 3rs(2r – s) 15cd(1 + 2cd) 8(4a² + 3b²)
Basel-ICU-Journal Challenge18/20/ Basel-ICU-Journal Challenge8/20/2014.
1..
CONTROL VISION Set-up. Step 1 Step 2 Step 3 Step 5 Step 4.
© 2012 National Heart Foundation of Australia. Slide 2.
Adding Up In Chunks.
Copyright © 2013 by John Wiley & Sons. All rights reserved. HOW TO CREATE LINKED LISTS FROM SCRATCH CHAPTER Slides by Rick Giles 16 Only Linked List Part.
1 Processes and Threads Chapter Processes 2.2 Threads 2.3 Interprocess communication 2.4 Classical IPC problems 2.5 Scheduling.
While Loop Lesson CS1313 Spring while Loop Outline 1.while Loop Outline 2.while Loop Example #1 3.while Loop Example #2 4.while Loop Example #3.
1 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt Synthetic.
KAIST Computer Architecture Lab. The Effect of Multi-core on HPC Applications in Virtualized Systems Jaeung Han¹, Jeongseob Ahn¹, Changdae Kim¹, Youngjin.
Model and Relationships 6 M 1 M M M M M M M M M M M M M M M M
1 hi at no doifpi me be go we of at be do go hi if me no of pi we Inorder Traversal Inorder traversal. n Visit the left subtree. n Visit the node. n Visit.
Analyzing Genes and Genomes
Speak Up for Safety Dr. Susan Strauss Harassment & Bullying Consultant November 9, 2012.
©Brooks/Cole, 2001 Chapter 12 Derived Types-- Enumerated, Structure and Union.
Essential Cell Biology
Intracellular Compartments and Transport
PSSA Preparation.
Essential Cell Biology
Immunobiology: The Immune System in Health & Disease Sixth Edition
Physics for Scientists & Engineers, 3rd Edition
Energy Generation in Mitochondria and Chlorplasts
Murach’s OS/390 and z/OS JCLChapter 16, Slide 1 © 2002, Mike Murach & Associates, Inc.
Profile. 1.Open an Internet web browser and type into the web browser address bar. 2.You will see a web page similar to the one on.
User Defined Functions Lesson 1 CS1313 Fall User Defined Functions 1 Outline 1.User Defined Functions 1 Outline 2.Standard Library Not Enough #1.
1 Decidability continued…. 2 Theorem: For a recursively enumerable language it is undecidable to determine whether is finite Proof: We will reduce the.
Rapid Development of High Performance Servers Khaled ElMeleegy Alan Cox Willy Zwaenepoel.
LAIO: Lazy Asynchronous I/O For Event Driven Servers Khaled Elmeleegy Alan L. Cox.
for Event Driven Servers
Presentation transcript:

Lazy Asynchronous I/O For Event-Driven Servers Khaled Elmeleegy, Anupam Chanda and Alan L. Cox Department of Computer Science Rice University, Houston, Texas. Willy Zwaenepoel School of Computer and Communication Sciences EPFL, Lausanne, Switzerland.

2 Event-Driven Architecture Event-driven architecture is widely used for servers Event-driven architecture is widely used for servers Performance Performance Scalability Scalability

3 Problem Developing event-driven servers is hard for many reasons Developing event-driven servers is hard for many reasons We focus on problems with non-blocking I/O: We focus on problems with non-blocking I/O: Incomplete coverage Incomplete coverage Burden of state maintenance Burden of state maintenance

4 Lazy Asynchronous I/O (LAIO) Addresses problems with non-blocking I/O Addresses problems with non-blocking I/O Universality Universality Covers all I/O operations Covers all I/O operations Simplicity Simplicity Requires less code Requires less code High performance for event-driven servers High performance for event-driven servers Meets or exceeds alternatives Meets or exceeds alternatives

5 Outline Background Background Lazy Asynchronous I/O (LAIO) Lazy Asynchronous I/O (LAIO) Evaluation Evaluation Conclusions Conclusions

6 Outline Background Background Event-driven servers Event-driven servers Lazy Asynchronous I/O (LAIO) Lazy Asynchronous I/O (LAIO) Evaluation Evaluation Conclusions Conclusions

7 Event-Driven Servers Event loop processes incoming events Event loop processes incoming events For each incoming event, it dispatches its handler For each incoming event, it dispatches its handler Single thread of execution Single thread of execution Event Loop Handler #1 Handler #2 Handler #k

8 Event Handler I/O operation (Network/Disk) Complete handling event To event loop

9 Event Handler I/O operation (Network/Disk) Complete handling event Block If the I/O operation blocks If the I/O operation blocks The server stalls The server stalls To event loop

10 Outline Background Background Lazy Asynchronous I/O (LAIO) Lazy Asynchronous I/O (LAIO) API API Example of usage Example of usage Implementation Implementation Evaluation Evaluation Conclusions Conclusions

11 LAIO API Return Type Function Name Parameters intlaio_syscall int number,… void*laio_gethandlevoid intlaio_poll laio_completion[] completions, int ncompletions, timespec* ts

12 laio_syscall() Lazily converts any system call into an asynchronous call Lazily converts any system call into an asynchronous call If the system call doesnt block If the system call doesnt block laio_syscall() returns immediately laio_syscall() returns immediately With return value of system call With return value of system call Else if it blocks: Else if it blocks: laio_syscall() returns immediately laio_syscall() returns immediately With return value -1 With return value -1 errno set to EINPROGRESS errno set to EINPROGRESS Background LAIO operation Background LAIO operation

13 laio_gethandle() Returns a handle representing the last issued LAIO operation Returns a handle representing the last issued LAIO operation If operation didnt block, NULL is returned If operation didnt block, NULL is returned

14 laio_poll() Returns a count of completed background LAIO operations Returns a count of completed background LAIO operations Fills an array with completion entries Fills an array with completion entries One for each operation One for each operation Each completion entry has Each completion entry has Handle Handle Return value Return value Error value Error value

15 Event Handler With LAIO laio_syscall() Operation completed Complete handling event Yes No To event loop If operation completes If operation completes Handler continues execution Handler continues execution

16 Event Handler With LAIO laio_syscall() Operation completed Complete handling event Yes No To event loop If operation doesnt complete If operation doesnt complete laio_syscall() returns immediately laio_syscall() returns immediately Handler records LAIO handle Handler records LAIO handle Returns to event loop Returns to event loop Completion notification arrives later Completion notification arrives later Handle = laio_gethandle()

17 LAIO Implementation LAIO uses scheduler activations LAIO uses scheduler activations Scheduler activations Scheduler activations The kernel delivers an upcall when an operation The kernel delivers an upcall when an operation Blocks - laio_syscall() Blocks - laio_syscall() Unblocks - laio_poll() Unblocks - laio_poll()

18 laio_syscall() – Non-blocking case Issue operation Save context Enable upcalls System call blocks? Disable upcalls Return retval No laio_syscall() Application Library

19 laio_syscall() – Blocking case Issue operation Save context Enable upcalls System call blocks? laio_syscall() Application Library Yes

20 laio_syscall() – Blocking case Issue operation Save context Enable upcalls System call blocks? laio_syscall() Application Library Upcall on a new thread Kernel Yes Background laio_syscall

21 laio_syscall() – Blocking case Issue operation Save context Enable upcalls System call blocks? laio_syscall() Application Library Upcall on a new thread Kernel Yes upcall handler Steals old stack using stored context Library Background laio operation

22 laio_syscall() – Blocking case Issue operation Save context Enable upcalls System call blocks? laio_syscall() Application Library Upcall on a new thread Kernel Yes upcall handler Steals old stack using stored context Library Disable upcalls errno = EINPROGRESS Return -1 Background laio operation

23 Unblocking case List of completions is retrieved by the application using laio_poll() List of completions is retrieved by the application using laio_poll() Background laio operation completes, thread dies Upcall on the current thread Kernel upcall handler() Construct completion structure: laio operation handle. System call return value. Error code. Add completion to list of completions. Library

24 Outline Background Background Lazy Asynchronous I/O (LAIO) Lazy Asynchronous I/O (LAIO) Evaluation Evaluation Methodology Methodology Experiments Experiments LAIO vs. conventional non-blocking I/O LAIO vs. conventional non-blocking I/O LAIO vs. AMPED LAIO vs. AMPED Programming complexity Programming complexity Conclusions Conclusions

25 Methodology Flash web server Flash web server Intel Xeon 2.4 GHz with 2 GB memory Intel Xeon 2.4 GHz with 2 GB memory Gigabit Ethernet between machines Gigabit Ethernet between machines FreeBSD 5.2-CURRENT FreeBSD 5.2-CURRENT Two web workloads Two web workloads Rice 1.1 GB footprint Rice 1.1 GB footprint Berkeley 6.4 GB footprint Berkeley 6.4 GB footprint

26 Experiments: LAIO vs. conventional non-blocking I/O Flash-NB-BSingle Disk I/O Flash-NB-AIOSingle Disk I/O other than read and write Flash-LAIO-LAIOSingleNone Server-Network-DiskThreaded Blocking operations Compare performance

27 Performance: Large Workload

28 Performance: Small Workload

29 Why Be Lazy? Most potentially blocking operations dont actually block Most potentially blocking operations dont actually block Experiments: 73% - 86% of such operations dont block Experiments: 73% - 86% of such operations dont block Lower overhead Lower overhead Micro-benchmark: AIO is 3.2 times slower than LAIO for non-blocking operations Micro-benchmark: AIO is 3.2 times slower than LAIO for non-blocking operations

30 Experiments: LAIO vs. AMPED Flash-LAIO-LAIOSingleNone Flash-NB-AMPED Process-based helpers None Server-Network-DiskThreaded Blocking operations Compare performance

31 Asymmetric Multi-process Event- Driven (AMPED) Architecture Event-driven core Event-driven core Potentially blocking I/O handed off to a helper Potentially blocking I/O handed off to a helper Helper can block as it runs on a separate thread of execution Helper can block as it runs on a separate thread of execution Helper #1 Helper #2 Helper #n Event Loop Handler #1 Handler #2 Handler #k

32 Flash-NB-AMPED Stock version of flash Stock version of flash Two relevant helpers Two relevant helpers File read: reads a file from disk File read: reads a file from disk Name conversion: checks for a files existence and permissions Name conversion: checks for a files existence and permissions

33 Performance of LAIO vs. AMPED

34 Performance of LAIO vs. AMPED

35 Programming Complexity Where performance is comparable: Flash- LAIO-LAIO vs. Flash-NB-AMPED Where performance is comparable: Flash- LAIO-LAIO vs. Flash-NB-AMPED Flash-LAIO-LAIO is simpler Flash-LAIO-LAIO is simpler No helpers No helpers No state maintenance No state maintenance

36 State Maintenance With Non- blocking I/O I/O operation Operation completed Complete handling event Yes No To event loop If operation does not complete If operation does not complete Handler returns to event loop Handler returns to event loop handler

37 State Maintenance With Non- blocking I/O I/O operation Operation completed Complete handling event Yes No To event loop If operation does not complete If operation does not complete Handler returns to event loop Handler returns to event loop Operation is continued later Operation is continued later handler

38 State Maintenance With Non- blocking I/O I/O operation Operation completed Complete handling event Yes NoTo event loop Store state If operation does not complete: If operation does not complete: Handler returns to event loop Handler returns to event loop Operation is continued later Operation is continued later This may require storing state This may require storing state handler

39 Lines of Code Comparison Total code size File read Name Conversion Partial-write state maintenance 700 ComponentFlash-NB-AMPED lines of code Flash-LAIO-LAIO 9.5% reduction in lines of code

40 Conclusions LAIO is universal LAIO is universal Supports all system calls Supports all system calls LAIO is simpler LAIO is simpler Used uniformly Used uniformly No state maintenance No state maintenance No helpers No helpers Less lines of code Less lines of code LAIO meets or exceeds the performance of other alternatives LAIO meets or exceeds the performance of other alternatives

41 LAIO source can be found at: