Alternative system models

Slides:

Advertisements

Similar presentations

Multiple Processor Systems

Advertisements

There is more Consensus in Egalitarian Parliaments Presented by Shayan Saeed Used content from the author's presentation at SOSP '13

NETWORK ALGORITHMS Presenter- Kurchi Subhra Hazra.

Multiprocessors— Large vs. Small Scale Multiprocessors— Large vs. Small Scale.

Multiple Processor Systems

Cache Coherent Distributed Shared Memory. Motivations Small processor count –SMP machines –Single shared memory with multiple processors interconnected.

The Multikernel: A new OS architecture for scalable multicore systems Andrew Baumann et al CS530 Graduate Operating System Presented by.

1. Overview  Introduction  Motivations  Multikernel Model  Implementation – The Barrelfish  Performance Testing  Conclusion 2.

490dp Synchronous vs. Asynchronous Invocation Robert Grimm.

I/O Hardware n Incredible variety of I/O devices n Common concepts: – Port – connection point to the computer – Bus (daisy chain or shared direct access)

Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.

Chapter 13: I/O Systems I/O Hardware Application I/O Interface

Computer System Overview Chapter 1. Basic computer structure CPU Memory memory bus I/O bus diskNet interface.

Paxos Made Simple Jinghe Zhang. Introduction Lock is the easiest way to manage concurrency Mutex and semaphore. Read and write locks. In distributed system:

Multiple Processor Systems. Multiprocessor Systems Continuous need for faster and powerful computers –shared memory model ( access nsec) –message passing.

Presented by Dr. Greg Speegle April 12,  Two-phase commit slow relative to local transaction processing  CAP Theorem  Option 1: Reduce availability.

Bringing Paxos Consensus in Multi-agent Systems Andrei Mocanu Costin Bădică University of Craiova.

Megastore: Providing Scalable, Highly Available Storage for Interactive Services J. Baker, C. Bond, J.C. Corbett, JJ Furman, A. Khorlin, J. Larson, J-M.

Architectures of distributed systems Fundamental Models

Review COMPSCI 210 Recitation 7th Dec 2012 Vamsi Thummala.

Advanced Computer Networks Topic 2: Characterization of Distributed Systems.

Multiple Processor Systems. Multiprocessor Systems Continuous need for faster computers –shared memory model ( access nsec) –message passing multiprocessor.

Distributed Shared Memory Based on Reference paper: Distributed Shared Memory, Concepts and Systems.

Disco : Running commodity operating system on scalable multiprocessor Edouard et al. Presented by Vidhya Sivasankaran.

S-Paxos: Eliminating the Leader Bottleneck

Latency Reduction Techniques for Remote Memory Access in ANEMONE Mark Lewandowski Department of Computer Science Florida State University.

Chapter 13 – I/O Systems (Pgs ). Devices  Two conflicting properties A. Growing uniformity in interfaces (both h/w and s/w): e.g., USB, TWAIN.

The Mach System Silberschatz et al Presented By Anjana Venkat.

6.894: Distributed Operating System Engineering Lecturers: Frans Kaashoek Robert Morris

T HE M ULTIKERNEL : A NEW OS ARCHITECTURE FOR SCALABLE MULTICORE SYSTEMS Presented by Mohammed Mustafa Distributed Systems CIS*6000 School of Computer.

Chapter 13: I/O Systems.

Last Class: Introduction

Module 12: I/O Systems I/O hardware Application I/O Interface

CS5102 High Performance Computer Systems Thread-Level Parallelism

Distributed Systems – Paxos

The Multikernel: A New OS Architecture for Scalable Multicore Systems

CSCI5570 Large Scale Data Processing Systems

Consistency in Distributed Systems

CMSC 611: Advanced Computer Architecture

The Multikernel A new OS architecture for scalable multicore systems

EECS 498 Introduction to Distributed Systems Fall 2017

Implementing Consistency -- Paxos

Outline Announcements Fault Tolerance.

IS 651: Distributed Systems Midterm

Operating System Concepts

13: I/O Systems I/O hardwared Application I/O Interface

CS703 - Advanced Operating Systems

Multiple Processor Systems

Building your own modern distributed system

Architectures of distributed systems Fundamental Models

Yiannis Nikolakopoulos

EEC 688/788 Secure and Dependable Computing

Architectures of distributed systems Fundamental Models

From Viewstamped Replication to BFT

Chapter 2: Operating-System Structures

EEC 688/788 Secure and Dependable Computing

EECS 498 Introduction to Distributed Systems Fall 2017

CS510 - Portland State University

Chapter 13: I/O Systems I/O Hardware Application I/O Interface

EEC 688/788 Secure and Dependable Computing

EEC 688/788 Secure and Dependable Computing

EEC 688/788 Secure and Dependable Computing

Architectures of distributed systems Fundamental Models

Database System Architectures

EEC 688/788 Secure and Dependable Computing

Chapter 2: Operating-System Structures

Chapter 13: I/O Systems.

Implementing Consistency -- Paxos

Module 12: I/O Systems I/O hardwared Application I/O Interface

Sisi Duan Assistant Professor Information Systems

Presentation transcript:

Alternative system models Tudor David

Outline The multi-core as a distributed system Case study: agreement The distributed system as a multi-core

The system model Whatever can be expressed using shared memory Concurrency: several communicating processes executing at the same time; Implicit communication: shared memory; Resources – shared between processes; Communication – implicit through shared resources; Synchronization – locks, condition variables, non-blocking algorithms, etc. Explicit communication: message passing; Resources – partitioned between processes; Communication – explicit message channels; Synchronization – message channels; Whatever can be expressed using shared memory can be expressed using message passing

So far – shared memory view set of registers that any process can read or write to; communication – implicit through these registers; problems: concurrency bugs – very common; scalability - not great when contention high; Scalability

Alternative model: message passing more verbose but – can better control how information is sent in the multi-core. P1 P2 R1 R2 R3 R4 how do we get message passing on multicores? dedicated hardware message passing channels (e.g. Tilera) more common – use dedicated cache lines for message queues

Programming using message passing System design – more similar to distributed systems; Map concepts from shared memory to message passing; A few examples: Synchronization, data structures: flat combining; Programming languages: e.g. Go, Erlang; Operating systems: the multikernel (e.g. Barrelfish)

Barrelfish: All Communication - Explicit Communication – exclusively message passing Easier reasoning: know what is accessed when and by whom Asynchronous operations – eliminate wait time Pipelining, batching More scalable Applications can still use shared memory

Barrelfish: OS Structure – Hardware Neutral Separate OS structure from hardware Machine dependent components Messaging HW interface Better layering, modularity Easier porting

Barrelfish: Replicated State No shared memory => replicate shared OS state Reduce interconnect load Decrease latencies Asynchronous client updates Possibly NUMA aware replication

Consistency of replicas: agreement protocol Implicit communication (shared memory) Explicit communication (message passing) Locally cached data Replicated state State machine replication Total ordering of updates Agreement High availability, High scalability How should we do message-passing agreement in a multi-core?

Outline The multi-core as a distributed system Case study: agreement The distributed system as a multi-core

First go – a blocking protocol Two-Phase Commit (2PC) 1. broadcast Prepare 2. wait for Acks 3. broadcast Commit/Rollback 4. wait for Acks Blocking, all messages go through coordinator

Is a blocking protocol appropriate? “Latency numbers every programmer should know” L1 cache reference 0.5 ns Branch mispredict 5 ns L2 cache reference 7 ns Mutex lock/unlock 25 ns Main memory reference 100 ns Compress 1K bytes 3 000 ns Send 1K bytes over 1 Gbps network 10 000 ns Read 4K randomly from SSD 150 000 ns Read 1MB sequentially from memory 250 000 ns Round trip within datacenter 500 000 ns Read 1 MB sequentially from SSD 1 000 000 ns Disk seek 10 000 000 ns Read 1 MB sequentially from disk 20 000 000 ns Send packet CA->Netherlands->CA 150 000 000 ns Source: Jeff Dean Blocking agreement – only as fast as the slowest participant Scheduling? I/O? Use a non-blocking protocol

Non-blocking agreement protocols Consensus ~ non-blocking agreement between distributed processes on one out of possibly multiple proposed values Paxos Tolerates non-malicious faults or unresponsive nodes: in multi-cores, slow cores Needs a majority of responses to progress (tolerates partitions) Phase 1: prepare Phase 2: accept Roles: Proposer Acceptor Learner Lots of variations and optimizations: CheapPaxos, MultiPaxos, FastPaxos etc. Usually – all roles on a physical node (Collapsed Paxos)

MultiPaxos Unless failed, keep same leader in subsequent rounds P P P

Does MultiPaxos scale in a multi-core? Limited scalability in the multi-core environment

A closer look at the multi-core environment Where does time go when sending a message? Large networks: Minimize number of rounds/instance Multi-core: Minimize the number of messages < 1 us ~100 us

Can we adapt Paxos to this scenario? Replication of service (availability): Advocate client commands Resolve contention between proposers, short-term memory (reliability, availability) A A A L L L Replication of data (reliability): Long-term memory Using one acceptor significantly reduces the number of messages

1Paxos: The failure-free case 2 3 P P P 1. P2: obtains active acceptor A1 and sends prepare_request(pn) 2. A1: if pn -> max. proposal received, replies to P2 with ack A A A 3. P2 -> A1 accept_request(pn, value) L L L 4. A1 broadcasts value to learners Common case: only steps 3 and 4

1Paxos: Switching the acceptor 2 3 1. P2 leader? P P P 2.PaxosUtility: P2 proposes A3 active acceptor Uncommitted proposed values A A A A 3. P2 -> A3: prepare_request L L L

1Paxos: Switching the leader 2 3 1. A1 – active acceptor? P P P P 2. PaxosUtility: P3 new leader and A1 active acceptor A A A 3. P3 -> A1: prepare_request L L L

Switching leader and acceptor The trade-off: while leader and active acceptor non-responsive at the same time ✖ liveness ✔safety P P P A A A Why is it reasonable? small probability event no network partitions if nodes not crashed, but slow -> system becomes responsive after a while L L L

Latency and throughput 45 clients 7 clients 6 clients 13 clients Smaller # of messages - smaller latency and increased throughput

Agreement - summary Multi-core – message passing distributed system, but distributed algorithm implementations different Agreement in multi-cores non blocking reduced # of messages Use one acceptor: 1Paxos reduced latency increased throughput

Outline The multi-core as a distributed system Case study: agreement The distributed system as a multi-core

Remote Direct Memory Access (RDMA) Read/Write remote memory NIC performs DMA requests Great performance Bypasses the kernel Bypasses the remote CPU Source: A. Dragojevic

FaRM: Fast Remote Memory (NSDI’14) – Dragojevic et al. Source: A. Dragojevic FaRM: Fast Remote Memory (NSDI’14) – Dragojevic et al.