Why Events Are A Bad Idea (for high-concurrency servers)
Rob von Behren, Jeremy Condit, and Eric Brewer
University of California at Berkeley

The Stage
- Highly concurrent applications
  - Internet servers (Flash, Ninja, SEDA)
  - Transaction processing databases
- Workload
  - Operate “near the knee”
  - Avoid thrashing!
- What makes concurrency hard?
  - Race conditions
  - Scalability (no O(n) operations)
  - Scheduling & resource sensitivity
  - Inevitable overload
  - Code complexity
(Figure: performance vs. load in concurrent tasks, showing the ideal curve, a peak where some resource is at its maximum, and an overload region where some resource thrashes.)

The Debate
- Performance vs. programmability
  - Current threads pick one
  - Events are somewhat better
- Questions
  - Threads vs. events?
  - How do we get both performance and programmability?
(Figure: performance vs. ease of programming, locating current threads, current events, and the ideal.)

Our Position
- Thread-event duality still holds
- But threads are better anyway
  - More natural to program
  - Better fit with tools and hardware
- Compiler-runtime integration is key

The Duality Argument
- General assumption: follow “good practices”
- Observations
  - Major concepts are analogous
  - Program structure is similar
  - Performance should be similar, given good implementations!

    Threads                     |  Events
    ----------------------------+------------------------------
    Monitors                    |  Event handler & queue
    Exported functions          |  Events accepted
    Call/return and fork/join   |  Send message / await reply
    Wait on condition variable  |  Wait for new messages

(Figure: web server stages: Accept Conn., Read Request, Pin Cache, Read File, Write Response, Exit.)
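As a small, hedged illustration of this mapping (not code from the paper; struct session, read_request_async, and handle_read_done are made-up names), the same request-handling step can be written with a blocking call in thread style or with a handler and an explicit continuation in event style:

```c
/* Toy illustration of the thread/event duality (not code from the paper; all
 * names here are made up). The same "read a request, then write a response"
 * logic is written two ways: thread style, where the continuation is simply
 * the next line of code, and event style, where it is a registered handler. */
#include <stdio.h>

struct session { int id; };

static void read_request(struct session *s)   { printf("read request %d\n", s->id); }
static void write_response(struct session *s) { printf("write response %d\n", s->id); }

/* Thread style: call/return; waiting would mean blocking this thread. */
static void serve_thread(struct session *s) {
    read_request(s);       /* would block until the request arrives */
    write_response(s);     /* the continuation is just the next statement */
}

/* Event style: "call" becomes an enqueued message, "return" becomes a callback. */
typedef void (*handler_t)(struct session *);

static void handle_read_done(struct session *s) {
    write_response(s);     /* the continuation, now a separate handler */
}

static void read_request_async(struct session *s, handler_t done) {
    read_request(s);       /* a real server would issue non-blocking I/O here */
    done(s);               /* and the event loop would fire this when data is ready */
}

static void serve_event(struct session *s) {
    read_request_async(s, handle_read_done);
}

int main(void) {
    struct session a = {1}, b = {2};
    serve_thread(&a);
    serve_event(&b);
    return 0;
}
```

In both versions the work is identical; what changes is whether the continuation lives on the stack or as a registered callback.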

“But Events Are Better!”
- Recent arguments for events
  - Lower runtime overhead
  - Better live state management
  - Inexpensive synchronization
  - More flexible control flow
  - Better scheduling and locality
- All true, but…
  - No inherent problem with threads!
  - Thread implementations can be improved

Runtime Overhead
- Criticism: threads don’t perform well for high concurrency
- Response
  - Avoid O(n) operations
  - Minimize context switch overhead
- Simple scalability test
  - Slightly modified GNU Pth
  - Thread-per-task vs. single thread
  - Same performance!
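The following is a minimal sketch of the kind of thread-per-task scalability test described above, written against plain POSIX threads rather than the authors' modified GNU Pth; the task body, task count, and batch size are assumptions made for illustration:

```c
/* Minimal sketch of a thread-per-task scalability test (assumption: plain
 * POSIX threads stand in for the authors' modified GNU Pth; the task body,
 * task count, and batch size are made up). It times thread-per-task against
 * a single thread doing the same work in a loop. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define NTASKS 10000
#define BATCH  100

static void *task(void *arg) {
    volatile long sum = 0;
    for (long i = 0; i < 1000; i++)
        sum += i;                       /* a trivial unit of work */
    (void)arg;
    return NULL;
}

static double now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

int main(void) {
    pthread_t tids[BATCH];

    /* Thread-per-task, created and joined in batches.  A scalable threads
     * package should keep this close to the plain loop below. */
    double t0 = now_ms();
    for (int done = 0; done < NTASKS; done += BATCH) {
        for (int i = 0; i < BATCH; i++)
            if (pthread_create(&tids[i], NULL, task, NULL) != 0) {
                perror("pthread_create");
                exit(1);
            }
        for (int i = 0; i < BATCH; i++)
            pthread_join(tids[i], NULL);
    }
    double t1 = now_ms();

    /* Single thread: the same work with no concurrency machinery at all. */
    for (int i = 0; i < NTASKS; i++)
        task(NULL);
    double t2 = now_ms();

    printf("thread-per-task: %.1f ms, single thread: %.1f ms\n", t1 - t0, t2 - t1);
    return 0;
}
```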

Live State Management
- Criticism: stacks are bad for live state
- Response: fix with compiler help
  - Stack overflow vs. wasted space: dynamically link stack frames
  - Retained dead state: static lifetime analysis; plan the arrangement of the stack; put some data on the heap; pop the stack before tail calls
  - Encouraged inefficiency: warn about inefficiency
(Figure: live, dead, and unused portions of thread state (stack) vs. event state (heap).)
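A hedged sketch of the “retained dead state” problem (hypothetical code, not from the paper): a large scratch buffer is dead after parsing, but in the first version it stays pinned on the stack across a long blocking operation, while the second version releases it first, which is the kind of rearrangement a compiler or programmer can apply:

```c
/* Illustrative sketch of the "retained dead state" problem (hypothetical
 * code, not from the paper). In the first version a large scratch buffer is
 * dead after parsing but stays pinned on the stack across a long blocking
 * operation; in the second it is released first, the kind of rearrangement
 * a compiler (or programmer) can apply. */
#include <stdlib.h>
#include <unistd.h>

void handle_request_wasteful(int sock) {
    char scratch[64 * 1024];                          /* dead after parsing, but */
    ssize_t n = read(sock, scratch, sizeof scratch);  /* held until return       */
    if (n <= 0)
        return;
    /* ... parse scratch into a small result ... */
    sleep(1);                                         /* stands in for a long blocking call */
}

void handle_request_frugal(int sock) {
    char *scratch = malloc(64 * 1024);                /* scratch state on the heap */
    if (!scratch)
        return;
    ssize_t n = read(sock, scratch, 64 * 1024);
    if (n > 0) {
        /* ... parse scratch into a small result ... */
    }
    free(scratch);                                    /* dead state released before blocking */
    sleep(1);                                         /* the stack now carries almost nothing */
}
```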

Synchronization
- Criticism: thread synchronization is heavyweight
- Response
  - Cooperative multitasking works for threads, too!
  - It also presents the same problems
    - Starvation & fairness
    - Multiprocessors
    - Unexpected blocking (page faults, etc.)
  - Compiler support helps
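A sketch of why synchronization becomes cheap under cooperative scheduling, under the assumption of user-level threads multiplexed on a single kernel thread (coop_yield is a hypothetical name for the package's yield): a read-modify-write with no yield point in between is atomic with respect to the other cooperative threads, whereas preemptive kernel threads would need a mutex or atomic operations:

```c
/* Sketch of why synchronization is cheap under cooperative scheduling
 * (hypothetical code; coop_yield is a made-up name for the package's yield).
 * With user-level threads multiplexed on one kernel thread, a task only
 * loses the CPU at an explicit yield or blocking call, so a read-modify-write
 * with no yield point in between needs no lock.  Preemptive kernel threads
 * would need a mutex or atomic operations for the same update. */

static long pending_requests = 0;

/* Stub for the threads package's yield; a real package would switch to
 * another ready user-level thread here. */
static void coop_yield(void) { }

void admit_request(void) {
    pending_requests++;        /* atomic w.r.t. other cooperative threads: no yield here */
}

void drain_one_request(void) {
    while (pending_requests == 0)
        coop_yield();          /* give up the CPU explicitly while waiting */
    pending_requests--;        /* safe for the same reason: no yield between test and update */
}
```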

Control Flow
- Criticism: threads have restricted control flow
- Response
  - Programmers use simple patterns
    - Call/return
    - Parallel calls
    - Pipelines
  - Complicated patterns are unnatural
    - Hard to understand
    - Likely to cause bugs
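A minimal sketch of the “parallel calls” pattern in thread style, using plain POSIX threads (fetch_header and fetch_body are made-up stand-ins for two independent blocking operations): fork both calls, join both, then continue sequentially:

```c
/* Minimal sketch of the "parallel calls" pattern in thread style, using plain
 * POSIX threads (fetch_header and fetch_body are made-up stand-ins for two
 * independent blocking operations): fork both, join both, then continue. */
#include <pthread.h>
#include <stdio.h>

static void *fetch_header(void *arg) { (void)arg; puts("header fetched"); return NULL; }
static void *fetch_body(void *arg)   { (void)arg; puts("body fetched");   return NULL; }

int main(void) {
    pthread_t h, b;
    pthread_create(&h, NULL, fetch_header, NULL);   /* fork */
    pthread_create(&b, NULL, fetch_body, NULL);     /* fork */
    pthread_join(h, NULL);                          /* join */
    pthread_join(b, NULL);                          /* join */
    puts("both done; continue sequentially");
    return 0;
}
```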

Scheduling
- Criticism: thread schedulers are too generic; they can’t use application-specific information
- Response
  - 2D scheduling: task & program location
    - Threads schedule based on task only
    - Events schedule by location (e.g. SEDA)
      - Allows batching
      - Allows prediction for SRCT (shortest remaining completion time)
  - Threads can use 2D, too!
    - Runtime system tracks current location
    - Call graph allows prediction
(Figure: a 2D space with axes for task and program location; threads schedule along the task axis, events along the program-location axis.)
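A hedged sketch of what scheduling by program location could look like, loosely in the spirit of SEDA-style stages (all names and the ring-buffer layout are assumptions): each stage has its own queue, and the scheduler drains one stage's queue as a batch, which is the locality benefit a thread runtime could recover by tracking each thread's current location:

```c
/* Hedged sketch of scheduling by program location (hypothetical, loosely in
 * the spirit of SEDA-style stages; all names and the ring-buffer layout are
 * assumptions).  Each stage of the server has its own task queue, and the
 * scheduler drains one stage's queue as a batch before moving on, improving
 * code and data locality.  A thread runtime could get the same effect by
 * tracking each thread's current program location. */
#include <stddef.h>

#define NSTAGES   4
#define QUEUE_CAP 128

struct task { int session_id; };

struct stage {
    struct task queue[QUEUE_CAP];
    size_t head, tail;                /* simple ring buffer of pending tasks */
    void (*run)(struct task *);       /* the handler for this program location */
};

static struct stage stages[NSTAGES];

static int stage_pop(struct stage *s, struct task *out) {
    if (s->head == s->tail)
        return 0;                     /* queue empty */
    *out = s->queue[s->head];
    s->head = (s->head + 1) % QUEUE_CAP;
    return 1;
}

/* One scheduler pass: visit each program location in turn and run its pending
 * tasks as a batch, instead of interleaving tasks from different locations. */
void schedule_by_location(void) {
    for (int i = 0; i < NSTAGES; i++) {
        struct task t;
        while (stage_pop(&stages[i], &t))
            stages[i].run(&t);
    }
}
```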

The Proof’s in the Pudding
- User-level threads package
  - Subset of pthreads
  - Intercepts blocking system calls
  - No O(n) operations
  - Supports > 100K threads
  - 5000 lines of C code
- Simple web server: Knot
  - 700 lines of C code
- Similar performance
  - Linear increase, then steady
  - Drop-off due to poll() overhead
(Figure: throughput in Mbits/second vs. concurrent clients for KnotC (favor connections), KnotA (favor accept), and Haboob.)
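A sketch of how a user-level threads package might intercept a blocking read() (hypothetical; not the authors' actual implementation, and wait_for_readable is a made-up scheduler hook): the wrapper switches the descriptor to non-blocking mode and, whenever the kernel would have blocked, parks the current user-level thread and yields to the scheduler:

```c
/* Sketch of intercepting a blocking system call in a user-level threads
 * package (hypothetical; not the authors' actual implementation, and
 * wait_for_readable is a made-up scheduler hook).  The wrapper makes the
 * descriptor non-blocking and, whenever the kernel would have blocked, parks
 * the current user-level thread and yields to the scheduler instead. */
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Assumed to exist in the threads package: registers fd with the central
 * poll() set and switches to another ready thread until fd is readable. */
void wait_for_readable(int fd);

ssize_t threadpkg_read(int fd, void *buf, size_t count) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags >= 0 && !(flags & O_NONBLOCK))
        fcntl(fd, F_SETFL, flags | O_NONBLOCK);

    for (;;) {
        ssize_t n = read(fd, buf, count);
        if (n >= 0)
            return n;                              /* data or EOF, no blocking needed */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                             /* a real error: report it */
        wait_for_readable(fd);                     /* park this thread; the scheduler */
    }                                              /* resumes it when fd is ready */
}
```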

Our Big But…
- Threads are the more natural programming model
  - Control flow is more apparent
  - Exception handling is easier
  - State management is automatic
- Better fit with current tools & hardware
  - Better existing infrastructure
- Allows better performance?

Control Flow
- Events obscure control flow, for both programmers and tools

  Threads:

    thread_main(int sock) {
        struct session s;
        accept_conn(sock, &s);
        read_request(&s);
        pin_cache(&s);
        write_response(&s);
        unpin(&s);
    }

    pin_cache(struct session *s) {
        pin(s);
        if (!in_cache(s))
            read_file(s);
    }

  Events:

    AcceptHandler(event e) {
        struct session *s = new_session(e);
        RequestHandler.enqueue(s);
    }

    RequestHandler(struct session *s) {
        …;
        CacheHandler.enqueue(s);
    }

    CacheHandler(struct session *s) {
        pin(s);
        if (!in_cache(s))
            ReadFileHandler.enqueue(s);
        else
            ResponseHandler.enqueue(s);
    }

    ...

    ExitHandler(struct session *s) {
        …;
        unpin(s);
        free_session(s);
    }

(Figure: web server stages: Accept Conn., Read Request, Pin Cache, Read File, Write Response, Exit.)

Exceptions
- Exceptions complicate control flow
  - Harder to understand program flow
  - Cause bugs in cleanup code

  Threads:

    thread_main(int sock) {
        struct session s;
        accept_conn(sock, &s);
        if (!read_request(&s))
            return;
        pin_cache(&s);
        write_response(&s);
        unpin(&s);
    }

    pin_cache(struct session *s) {
        pin(s);
        if (!in_cache(s))
            read_file(s);
    }

  Events:

    AcceptHandler(event e) {
        struct session *s = new_session(e);
        RequestHandler.enqueue(s);
    }

    RequestHandler(struct session *s) {
        …;
        if (error)
            return;
        CacheHandler.enqueue(s);
    }

    CacheHandler(struct session *s) {
        pin(s);
        if (!in_cache(s))
            ReadFileHandler.enqueue(s);
        else
            ResponseHandler.enqueue(s);
    }

    ...

    ExitHandler(struct session *s) {
        …;
        unpin(s);
        free_session(s);
    }

(Figure: web server stages: Accept Conn., Read Request, Pin Cache, Read File, Write Response, Exit.)

State Management
- Events require manual state management
  - Hard to know when to free
  - Use GC or risk bugs
- (Thread and event code as on the previous slide: the thread version’s session lives on the stack and is reclaimed automatically, while the event version’s heap-allocated session must be freed explicitly in ExitHandler.)

Existing Infrastructure
- Lots of infrastructure exists for threads
  - Debuggers
  - Languages & compilers
- Consequences
  - More amenable to analysis
  - Less effort to get working systems

Better Performance?
- Function pointers & dynamic dispatch
  - Limit compiler optimizations
  - Hurt branch prediction & I-cache locality
- More context switches with events?
  - Example: Haboob does 6x more than Knot
  - A natural result of queues
- More investigation needed!
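A tiny illustration of the dispatch difference (hypothetical code): thread-style code calls the next step directly, which the compiler can inline and the branch predictor can follow, while event-style code dispatches through a function pointer chosen at run time, which usually defeats inlining:

```c
/* Tiny illustration (hypothetical) of the dispatch difference.  Thread-style
 * code calls the next step directly, a statically known call the compiler can
 * inline and the branch predictor can follow.  Event-style code dispatches
 * through a function pointer chosen at run time, which usually defeats
 * inlining and costs an indirect branch. */
#include <stdio.h>

struct session { int id; };

static void write_response(struct session *s) { printf("response %d\n", s->id); }

/* Thread style: a direct, statically known call. */
static void serve_direct(struct session *s) {
    write_response(s);                    /* candidate for inlining */
}

/* Event style: the next handler is data, not a symbol. */
typedef void (*handler_t)(struct session *);

static void dispatch(handler_t next, struct session *s) {
    next(s);                              /* indirect call, opaque to most optimizers */
}

int main(void) {
    struct session s = {7};
    serve_direct(&s);
    dispatch(write_response, &s);
    return 0;
}
```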

The Future: Compiler-Runtime Integration
- Insight
  - Automate the things event programmers do by hand
  - Additional analysis enables further improvements
- Specific targets
  - Dynamic stack growth*
  - Live state management
  - Synchronization
  - Scheduling*
- Improve performance and decrease complexity
  (* working prototype in the threads package)

Conclusion
- Threads ≥ Events
  - Performance
  - Expressiveness
- Threads > Events
  - Complexity / manageability
- Performance and ease of use? Compiler-runtime integration is key
(Figure: performance vs. ease of programming, locating current threads, current events, and “New Threads?”.)