CMSC Cluster Computing Basics

1 CMSC 34702-1 Cluster Computing Basics
Junchen Jiang, The University of Chicago, October 8, 2018

2 Papers
MapReduce: Simplified Data Processing on Large Clusters
The Google File System
Bigtable: A Distributed Storage System for Structured Data
Cassandra - A Decentralized Structured Storage System

3 Consistency, Availability, Partition Tolerance
[Diagram: Replica 1 and Replica 2 both hold the value x.]

4 Consistency, Availability, Partition Tolerance
Consistency: Any read must return the last written value.
[Diagram: a client issues set(y); Replica 1 holds y, Replica 2 still holds x.]

5 Consistency, Availability, Partition Tolerance
Consistency: Any read must return the last written value.
[Diagram: after set(y), Replica 1 and Replica 2 both hold y.]

6 Consistency, Availability, Partition Tolerance
Consistency: Any read must return the last written value.
[Diagram: after set(y), both replicas hold y, and get() returns y.]

7 Consistency, Availability, Partition Tolerance
Consistency: Any read must return the last written value.
Availability: Every request must result in a response.
[Diagram: Replica 1 and Replica 2 both hold x; get() returns x.]

8 Consistency, Availability, Partition Tolerance
Consistency: Any read must return the last written value.
Availability: Every request must result in a response.
Partition Tolerance: The network can lose any messages between servers.
[Diagram: Replica 1 holds y, Replica 2 holds x; get() returns y.]
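
The three properties above can be made concrete with a toy two-replica store. This is only a sketch under stated assumptions: the class `TwoReplicaStore`, its `partitioned` flag, and its `prefer_consistency` switch are hypothetical names invented for illustration, writes are assumed to arrive at Replica 1, and reads are assumed to be served by Replica 2.

```python
class TwoReplicaStore:
    """Toy model of the two-replica diagram; every name here is hypothetical."""

    def __init__(self, initial, prefer_consistency):
        self.values = [initial, initial]   # [Replica 1, Replica 2]
        self.partitioned = False           # True: messages between replicas are lost
        self.prefer_consistency = prefer_consistency

    def set(self, value):
        # Assumption: writes always arrive at Replica 1.
        self.values[0] = value
        if not self.partitioned:
            # The replication message reaches Replica 2 only without a partition.
            self.values[1] = value

    def get(self):
        # Assumption: reads are always served by Replica 2.
        if self.partitioned and self.prefer_consistency:
            # Replica 2 cannot confirm it has the last written value,
            # so it refuses to answer: consistency is kept, availability is lost.
            raise RuntimeError("unavailable during partition")
        # Otherwise always answer, even if the local value may be stale.
        return self.values[1]
```

With `partitioned = True`, a store built with `prefer_consistency=False` answers `get()` with the stale `x` after `set("y")`, while one built with `prefer_consistency=True` raises instead of answering. Without a partition, both return `y`.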

9 Cassandra: Gossip-based consensus protocol
Consistency: Any read must return the last written value.
Availability: Every request must result in a response.
Partition Tolerance: The network can lose any messages between servers.
[Diagram: a client issues set(y); a get() returns x.]
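
The stale read in the diagram, and how replicas can later converge, can be sketched as below. This is a toy illustration and not Cassandra's actual protocol: `GossipReplica` and `gossip_with` are hypothetical names, and the reconciliation rule assumed here is "the value with the newer timestamp wins".

```python
class GossipReplica:
    """Toy replica that always answers and reconciles by timestamp (hypothetical)."""

    def __init__(self, value):
        self.value = value
        self.timestamp = 0

    def set(self, value, timestamp):
        # Keep the incoming value only if it is newer than what we hold.
        if timestamp > self.timestamp:
            self.value = value
            self.timestamp = timestamp

    def get(self):
        # Always respond (availability), even if the value may be stale.
        return self.value

    def gossip_with(self, peer):
        # Exchange state in both directions; each side keeps the newer value.
        peer.set(self.value, self.timestamp)
        self.set(peer.value, peer.timestamp)


r1, r2 = GossipReplica("x"), GossipReplica("x")
r1.set("y", timestamp=1)
stale = r2.get()        # "x": the write has not reached r2 yet
r1.gossip_with(r2)
fresh = r2.get()        # "y": the replicas have converged
```

The design choice illustrated is that every request gets a response immediately, at the cost of a window in which a read may not return the last written value.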

10 Bigtable: Paxos-based consensus protocol (Chubby)
Consistency: Any read must return the last written value.
Availability: Every request must result in a response.
Partition Tolerance: The network can lose any messages between servers.
[Diagram: a client issues set(y) to a Chubby master with three Chubby slaves.]
Service is unavailable until a quorum is reached.
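
The quorum rule on this slide can be sketched as a simple majority check. This is a minimal illustration, not Chubby's implementation: `has_quorum` and `handle_set` are hypothetical helpers, and "quorum" is assumed to mean a strict majority of the servers in the cell.

```python
def has_quorum(reachable, cluster_size):
    """True if a strict majority of servers can be reached (assumed quorum rule)."""
    return reachable > cluster_size // 2


def handle_set(value, reachable, cluster_size):
    """Accept a write only when a quorum is reachable; otherwise refuse it."""
    if not has_quorum(reachable, cluster_size):
        # Consistency is preserved by giving up availability.
        raise RuntimeError("service unavailable: no quorum")
    return value
```

For a cell of four servers, as drawn in the diagram, three reachable servers form a quorum but two do not, so a partition that splits the cell evenly leaves the service unavailable on both sides.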

11 Is it possible to achieve all three simultaneously?
Consistency: Any read must return the last written value.
Availability: Every request must result in a response.
Partition Tolerance: The network can lose any messages between servers.
[Diagram labels: Cassandra, Bigtable, Impossible.]
Unfortunately, No. (CAP Theorem) (Eric Brewer.

12 This Class: Stream Processing

