The Future of Distributed Computing Renaissance or Reformation? Maurice Herlihy Brown University.

Presentation transcript:

The Future of Distributed Computing Renaissance or Reformation? Maurice Herlihy Brown University

PODC Le Quatorze Juillet SAN FRANCISCO, May Intel said on Friday that it was scrapping its development of two microprocessors, a move that is a shift in the company's business strategy…. New York Times

PODC Moore’s Law (hat tip: Simon Peyton-Jones) Clock speed flattening sharply Transistor count still rising

PODC Art of Multiprocessor Programming Still on some of your desktops: The Uniprocessor (diagram labels: memory, cpu)

PODC In the Enterprise: The Shared Memory Multiprocessor (SMP) (diagram labels: cache, bus, shared memory)

PODC Your New Desktop: The Multicore Processor (CMP) All on the same chip (diagram labels: cache, bus, shared memory) Sun T2000 Niagara

PODC Multicores are Here “Learn how the multi-core processor architecture plays a central role in Intel's platform approach. ….” “AMD is leading the industry to multi-core technology for the x86 based computing market …” “Sun's multicore strategy centers around multi-threaded software. ....”

PODC Why should we care? First time ever, –PODC research relevant to Real World™ First time ever, –Real World™ relevant to PODC Plato vs Aristotle

PODC Renaissance? World (re)discovers PODC community achievements This has already happened (sort-of) World learns of PODC results

PODC Reformation? Can we respond to the Real World’s challenges? Are we working on problems that matter? Can we recognize what’s going to be important? Bonfire of the Vanities

PODC In Classic Antiquity Time cured software bloat Double your path length? –Wait 6 months, until –Processor speed catches up

PODC Multiprocessor companies failed in 80s Outstripped by sequential processors Field respected, but not taken seriously Parallelism Didn’t Matter

PODC The Old Order Lies in Ruins Six months means more cores, same clock speed Must exploit more parallelism No one really knows how to do this

PODC What Keeps Microsoft and Intel awake at Night? If more cores does not deliver more value … Then why upgrade? ?

PODC Washing Machine Science? Computers could become like washing machines You don’t trade it in every 2 years for a cooler model You keep it until it breaks.

PODC No Cores Please, we’re Theorists! Computer Science is driven by Moore’s law Each year we can do things we couldn’t do last year Means funding, students, excitement !

PODC With Sudden Relevance Comes Great Responsibility Many challenges involve –concurrent algorithms –Data structures –formal models –complexity & lower bounds –… Stuff we’re good at.

PODC Disclaimer What follows are my Opinions (mine, mine, mine!) –And prejudices Targeted to people –New in the field No offence intended –In most cases.

PODC Concurrent Programming Today

PODC Coarse-Grained Locking Easily made correct … But not scalable.
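
To make the coarse-grained option concrete, here is a minimal sketch, not taken from the slides, of a queue guarded by a single lock. The class name CoarseQueue and the use of ArrayDeque are assumptions made for illustration. It is easy to see that it is correct, and just as easy to see why it does not scale: every enq and deq serializes on the same lock.

```java
import java.util.ArrayDeque;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration: one lock guards the whole queue.
class CoarseQueue<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final ArrayDeque<T> items = new ArrayDeque<>();

    // Appends x at the tail while holding the single lock.
    void enq(T x) {
        lock.lock();
        try {
            items.addLast(x);
        } finally {
            lock.unlock();
        }
    }

    // Removes and returns the head, or null when the queue is empty.
    T deq() {
        lock.lock();
        try {
            return items.pollFirst();
        } finally {
            lock.unlock();
        }
    }
}
```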

PODC Fine-Grained Locking Here comes trouble …

PODC Locks are not Robust If a thread holding a lock is delayed … No one else can make progress

PODC Locking Relies on Conventions Relation between –Lock bit and object bits –Exists only in programmer’s mind
/*
 * When a locked buffer is visible to the I/O layer
 * BH_Launder is set. This means before unlocking
 * we must clear BH_Launder,mb() on alpha and then
 * clear BH_Lock, so no reader can see BH_Launder set
 * on an unlocked buffer and then risk to deadlock.
 */
Actual comment from Linux Kernel (hat tip: Bradley Kuszmaul)

PODC Sadistic Homework enq(x) deq(y) FIFO queue No interference if ends “far enough” apart

PODC Sadistic Homework enq(x) deq(y) FIFO queue Interference OK if ends “close enough” together

PODC You Try It … One lock? –Too Conservative Locks at each end? –Deadlock, too complicated, etc Publishable result? –Once, maybe still?
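
For readers who want to try the homework, one known way to make "locks at each end" work is to keep a sentinel node so the two ends never touch the same node while holding only their own lock, as in the two-lock queue of Michael and Scott. The sketch below is an assumption-laden illustration, not the slides' solution; the names TwoLockQueue, headLock and tailLock are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of a two-lock FIFO queue with a sentinel node.
class TwoLockQueue<T> {
    private static final class Node<T> {
        final T value;
        // volatile: written under tailLock, read under headLock
        volatile Node<T> next;
        Node(T value) { this.value = value; }
    }

    private final ReentrantLock headLock = new ReentrantLock();
    private final ReentrantLock tailLock = new ReentrantLock();
    private Node<T> head; // always the sentinel; guarded by headLock
    private Node<T> tail; // last node; guarded by tailLock

    TwoLockQueue() {
        head = tail = new Node<>(null); // queue starts with only the sentinel
    }

    // Enqueuers contend only with other enqueuers.
    void enq(T x) {
        Node<T> q = new Node<>(x);
        tailLock.lock();
        try {
            tail.next = q;
            tail = q;
        } finally {
            tailLock.unlock();
        }
    }

    // Dequeuers contend only with other dequeuers. Returns null if empty.
    T deq() {
        headLock.lock();
        try {
            Node<T> first = head.next;
            if (first == null) {
                return null;
            }
            head = first; // the dequeued node becomes the new sentinel
            return first.value;
        } finally {
            headLock.unlock();
        }
    }
}
```

No method ever holds both locks, so this particular design cannot deadlock on them. The cost is the subtle reasoning about the sentinel and the volatile field, which is the kind of convention the earlier slides warn about.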

PODC Locks do not compose Hash Table: add(T1, item) Must lock T1 before adding item Move from T1 to T2: delete(T1, item) add(T2, item) Must lock T2 before deleting from T1 (lock T2, lock T1) Exposing lock internals breaks abstraction
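
The composition problem can be sketched in code. In the hypothetical LockedTable below (all names are assumptions for illustration), add and delete are each individually correct, but an atomic move needs both tables' locks at once. That forces the lock to be visible outside the class and forces callers to agree on an acquisition order, here by a per-table id, to avoid deadlock.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical table whose lock must leak out to support move().
class LockedTable<T> {
    private static final AtomicLong IDS = new AtomicLong();

    final long id = IDS.getAndIncrement();          // used only to order lock acquisition
    final ReentrantLock lock = new ReentrantLock(); // exposed: this is the abstraction leak
    private final Set<T> items = new HashSet<>();

    boolean add(T item) {
        lock.lock();
        try { return items.add(item); } finally { lock.unlock(); }
    }

    boolean delete(T item) {
        lock.lock();
        try { return items.remove(item); } finally { lock.unlock(); }
    }

    boolean contains(T item) {
        lock.lock();
        try { return items.contains(item); } finally { lock.unlock(); }
    }

    // Moves item from one table to the other while holding BOTH locks,
    // so no other thread sees the item in neither table or in both.
    static <T> boolean move(LockedTable<T> from, LockedTable<T> to, T item) {
        if (from == to) {
            return from.contains(item);
        }
        // Always lock the table with the smaller id first to avoid deadlock.
        LockedTable<T> first = from.id < to.id ? from : to;
        LockedTable<T> second = (first == from) ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                if (!from.delete(item)) { // reentrant: lock is already held
                    return false;
                }
                to.add(item);
                return true;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }
}
```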

PODC The Transactional Manifesto What we do now is inadequate to meet the multicore challenge Research Agenda –Replace locking with a transactional API –Design languages to support this model –Implement the run-time to be fast enough

PODC © 2006 Herlihy & Shavit Sadistic Homework Revisited (1) Write sequential code: public void enq(item x) { Qnode q = new Qnode(x); this.tail.next = q; this.tail = q; }

PODC Sadistic Homework Revisited (1) public void LeftEnq(item x) { atomic { Qnode q = new Qnode(x); this.tail.next = q; this.tail = q; } }

PODC Sadistic Homework Revisited (1) Enclose in atomic block: public void LeftEnq(item x) { atomic { Qnode q = new Qnode(x); this.tail.next = q; this.tail = q; } }

PODC Warning Not always this simple –Conditional waits –Enhanced concurrency –Complex patterns But often it is –Works for sadistic homework

PODC Composition (1) public void Transfer(Queue<T> q1, Queue<T> q2) { atomic { T x = q1.deq(); q2.enq(x); } } Trivial or what?

PODC Not All Skittles and Beer Algorithmic choices –Lower bounds –Better algorithms Language design Semantic issues –Like memory models –Atomicity checking

PODC Contention Management & Scheduling How to resolve conflicts? Who moves forward and who rolls back? Lots of empirical work but formal work in infancy Judgment of Solomon

PODC I/O & System Calls? Some I/O revocable –Provide transaction-safe libraries –Undoable file system/DB calls Some not –Opening cash drawer –Firing missile

PODC Privatization Transaction makes object inaccessible Works on it without synchronization Works with locks … But not necessarily with transactions … Need algorithms and models!

PODC Strong vs Weak Isolation How do transactional & non-transactional threads synchronize? Similar to memory-model theory? Efficient algorithms?

PODC Single Global Lock Semantics? Each transaction acts as if it acquires a single global lock (SGL) Good: –Intuitively appealing Bad: –What about aborted transactions? –Expensive? Need better models
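
Java has no atomic block, but single-global-lock semantics can be imitated literally, which gives a feel for why the model is intuitively appealing. The sketch below is an assumption for illustration only: Sgl, atomic, atomicRun and SglQueue are hypothetical names, and nothing here models aborts or rollback, which is one of the open questions the slide raises.

```java
import java.util.ArrayDeque;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical stand-in for an atomic block: every "transaction"
// really does acquire one global lock.
final class Sgl {
    private static final ReentrantLock GLOBAL = new ReentrantLock();

    private Sgl() {}

    // Runs body under the global lock and returns its result.
    static <R> R atomic(Supplier<R> body) {
        GLOBAL.lock();
        try {
            return body.get();
        } finally {
            GLOBAL.unlock();
        }
    }

    // Same as atomic(), for bodies that return nothing.
    static void atomicRun(Runnable body) {
        GLOBAL.lock();
        try {
            body.run();
        } finally {
            GLOBAL.unlock();
        }
    }
}

class SglQueue<T> {
    private final ArrayDeque<T> items = new ArrayDeque<>();

    void enq(T x) {
        Sgl.atomicRun(() -> items.addLast(x));
    }

    // Returns null when the queue is empty.
    T deq() {
        return Sgl.atomic(items::pollFirst);
    }

    // Composition: the nested enq/deq calls re-enter the same global lock,
    // so the whole transfer appears to happen at once.
    static <T> void transfer(SglQueue<T> q1, SglQueue<T> q2) {
        Sgl.atomicRun(() -> {
            T x = q1.deq();
            if (x != null) {
                q2.enq(x);
            }
        });
    }
}
```

Unlike the lock-based move between tables, transfer needs no knowledge of either queue's internals. The price in this literal emulation is that all transactions serialize, which is the "Expensive?" concern above.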

PODC Progress, Performance Metrics and Lower Bounds Wait-free –Everyone makes progress Lock-free –Someone makes progress Obstruction-free –Solo threads make progress
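
As a small illustration of the lock-free condition, here is a sketch of a stack in the style of Treiber's algorithm, built on compareAndSet. It is not from the slides, and the class name LockFreeStack is an assumption. A thread's CAS can fail only because some other thread's CAS succeeded, so someone always makes progress, even though an individual thread may retry indefinitely (which is why it is not wait-free).

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical lock-free stack using a CAS retry loop.
class LockFreeStack<T> {
    private static final class Node<T> {
        final T value;
        final Node<T> next;
        Node(T value, Node<T> next) {
            this.value = value;
            this.next = next;
        }
    }

    private final AtomicReference<Node<T>> top = new AtomicReference<>();

    void push(T x) {
        while (true) {
            Node<T> old = top.get();
            // Succeeds only if no other thread changed top in the meantime.
            if (top.compareAndSet(old, new Node<>(x, old))) {
                return;
            }
        }
    }

    // Returns null when the stack is empty.
    T pop() {
        while (true) {
            Node<T> old = top.get();
            if (old == null) {
                return null;
            }
            if (top.compareAndSet(old, old.next)) {
                return old.value;
            }
        }
    }
}
```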

PODC Obstruction-Free? Experience suggests simpler, more efficient and easier to reason about But no real formal justification Progress conditions imperfectly understood

PODC Formal Models of Performance Asynchrony

PODC Formal Models of Performance Asynchrony Multi-level Memory

PODC Formal Models of Performance Asynchrony Multi-level Memory Contention

PODC Formal Models of Performance Asynchrony Multi-level Memory Contention Memory Models

PODC Formal Models of Performance Asynchrony Multi-level Memory Contention Memory Models Reads, writes, CAS, TM and other stuff we may devise …

PODC Formal Verification Concurrent algorithms are hard Need routine verification of real algorithms Model checking? Theorem proving? Probably both

PODC PODC Victories Byzantine agreement

PODC PODC Victories Byzantine agreement Paxos, group communication

PODC PODC Victories Byzantine agreement Paxos, group communication Replication algorithms Photoshop™ replication algorithm

PODC PODC Victories Byzantine agreement Paxos, group communication Replication Lock-free & wait-free algorithms

PODC PODC Victories Byzantine agreement Paxos, group communication Replication Lock-free & wait-free algorithms Formalizing what needs to be formalized!

PODC An Insurmountable Opportunity! (hat tip: Walt Kelly) Multicore forces us to rethink almost everything

PODC An Insurmountable Opportunity! (hat tip: Walt Kelly) Multicore forces us to rethink almost everything The fate of CS as a vibrant field depends on our success

PODC An Insurmountable Opportunity! (hat tip: Walt Kelly) Multicore forces us to rethink almost everything The fate of CS as a vibrant field depends on our success PODC community has unique insights & advantages

PODC An Insurmountable Opportunity! (hat tip: Walt Kelly) Multicore forces us to rethink almost everything The fate of CS as a vibrant field depends on our success PODC community has unique insights & advantages Are we equal to the task?

PODC This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.