From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design Chapter 3 System Models.



2 System Model
- Provides an abstract, simplified but consistent description of a relevant aspect of distributed system design
- Physical model
- Architectural model
- Fundamental model

3 Physical Model
- Physical models capture the hardware composition of a system in terms of the computers and their interconnecting networks.
[Figure labels: mobile network, global ISP, regional ISP, home network, institutional network]

4 Different Types of Physical Model
- Baseline physical model
- Early distributed systems
- Internet-scale distributed systems
- Contemporary distributed systems
- Distributed systems of systems

5 Architectural Model
- Architectural models describe a system in terms of the computational and communication tasks performed by its computational elements

6 Architectural Entities
- What are the entities that are communicating in the distributed system?
- How do they communicate, or, more specifically, what communication paradigm is used?
- What (potentially changing) roles and responsibilities do they have in the overall architecture?
- How are they mapped onto the physical distributed infrastructure (what is their placement)?

7 Communication Paradigm
- Interprocess communication
  - Socket programming
- Remote invocation
  - Request-reply model
  - Remote procedure call
  - Remote method invocation
- Indirect communication
  - Group communication
  - Publish-subscribe

8 Client-Server Model

9 P2P Model

10 Web proxy server

11 Layering Architecture in Distributed Systems

12 Tier Architectures in Distributed Systems

13 Fundamental Model
- Fundamental models examine individual aspects of a distributed system.
  - Interaction model
  - Failure model
  - Security model

14 Interaction Model
- Performance of communication channel
  - Delay
  - Bandwidth
  - Jitter
- Two types of models
  - Synchronous: message exchange completes within a known bounded delay
  - Asynchronous: message exchange may take an arbitrarily long time

15 Failure Model
- Omission failure
  - Process omission failure
  - Communication omission failure
- Arbitrary failure
- Timing failure
- Masking failure

16 Security Model
- Threats
  - Threats to server
  - Threats to client
  - Threats to communication channel
- Protection
  - Cryptography
  - Authentication
  - Secure channel
  - Protecting objects

17 Case Study: Google Search Engine

18 Main components
- Crawler
  - URL servers assign lists of URLs to crawlers
  - Crawlers request the pages and send them to the store server
  - The store server compresses the pages and stores them in the repository
- Indexer (and sorter)
  - Reads the repository and uncompresses the pages
  - Generates barrels based on hits (occurrences, font, etc.)
  - Parses out all the links into an anchors file
  - The URL resolver reads the anchors file and resolves the URLs
- PageRank
  - The number of links (both to and from a page)
  - The importance of the links
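
The PageRank idea on this slide (a page matters more when many pages link to it, and when those pages are themselves important) can be sketched as a small power iteration. This is a simplified illustration, not Google's production algorithm; the damping factor 0.85, the iteration count, and the function name `pagerank` are assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults (assumptions).
    """
    pages = sorted(set(links) | {q for targets in links.values() for q in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[p] / n
                for q in pages:
                    new_rank[q] += share
        rank = new_rank
    return rank


ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

Here page `c` is linked from both `a` and `b`, so it ends up ranked above `b`, which is linked only from `a`.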

19 Case Study: The Overall Google Systems Architecture

20 Example Google applications

21 Physical Model of the Google Infrastructure

22 Layering Architecture of Google Infrastructure

23 Overall architecture of GFS
- Single master
  - File and chunk namespace
  - Mapping from files to chunks
  - Chunk locations

24 GFS
- Chunk
  - Fixed size of 64 MB
  - Chunk handle (64-bit unique identifier)
  - Three replicas by default
- Chunk location
  - The master polls chunkservers for location information
  - Clients contact the master for location information and cache it for a while
  - Clients contact chunkservers directly to download data
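
Because chunks have a fixed size, a client can work out which chunk holds a given byte offset before it asks the master for that chunk's location. The sketch below illustrates only this arithmetic; the function names `chunk_index` and `chunk_ranges` are hypothetical and not part of any GFS client API.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB fixed chunk size, as on the slide


def chunk_index(offset):
    """Index of the chunk that contains the given byte offset of a file."""
    return offset // CHUNK_SIZE


def chunk_ranges(offset, length):
    """Split a byte range into (chunk_index, offset_in_chunk, nbytes) pieces."""
    pieces = []
    end = offset + length
    while offset < end:
        idx = offset // CHUNK_SIZE
        start_in_chunk = offset % CHUNK_SIZE
        # Read up to the end of this chunk or the end of the request.
        nbytes = min(CHUNK_SIZE - start_in_chunk, end - offset)
        pieces.append((idx, start_in_chunk, nbytes))
        offset += nbytes
    return pieces


# A 30-byte read that starts 10 bytes before a chunk boundary touches two chunks.
pieces = chunk_ranges(CHUNK_SIZE - 10, 30)
```

A client would then ask the master for the locations of chunks 0 and 1 (or use its cached copy) and fetch each piece from a chunkserver.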

25 GFS Master
- Metadata
  - File and chunk namespace
  - Mapping from files to chunks
  - Location of each chunk
- Problems
  - Single point of failure, scalability bottleneck
- Solution
  - Shadow masters, minimized master involvement
- Operation log
  - Keeps track of activities, providing a logical timeline
  - Replicated at multiple sites for reliability

26 Lease and mutation
- The master picks one replica as the primary by granting it a lease
- The primary defines the mutation order
- The other replicas follow the same order
- This minimizes overhead at the master
- Data flows along a carefully defined linear chain

27 Chubby
- Chubby is a lock service
- Primary usage
  - Synchronize access to shared resources
- Other usage
  - Primary election
  - Metadata storage
  - Root of distributed data structures
- A lock service should be reliable and available

28 Chubby
- A Chubby cell consists of a small set of servers (replicas)
- A master is elected from the replicas via a consensus protocol
  - Master lease: several seconds
  - If the master fails, a new one is elected when the master lease expires
- Clients talk to the master via the Chubby library
  - All replicas are listed in DNS; clients discover the master by talking to any replica

29 Chubby
- Replicas maintain copies of a simple database
- Clients send read/write requests only to the master
- For a write:
  - The master propagates it to the replicas via the consensus protocol
  - The master replies after the write reaches a majority of replicas
- For a read:
  - The master satisfies the read alone
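
The write and read paths above can be illustrated with a toy model in which the master counts acknowledgements and reports success only once a majority of replicas hold the write. This is a deliberately simplified sketch: a plain loop stands in for the consensus protocol, failed writes are not rolled back, and the `Replica` and `Master` classes are hypothetical, not Chubby's interfaces.

```python
class Replica:
    """One copy of the simple database; `up` simulates whether it is reachable."""

    def __init__(self, up=True):
        self.up = up
        self.db = {}

    def apply(self, key, value):
        if not self.up:
            return False  # unreachable replica sends no acknowledgement
        self.db[key] = value
        return True


class Master:
    """Toy master; replicas[0] is assumed to be the master's own copy."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        # Stand-in for the consensus protocol: count acknowledgements
        # and succeed only when a strict majority has the write.
        acks = sum(1 for r in self.replicas if r.apply(key, value))
        return acks > len(self.replicas) // 2

    def read(self, key):
        # Reads are served by the master alone.
        return self.replicas[0].db.get(key)


cell = Master([Replica(), Replica(), Replica(), Replica(up=False), Replica(up=False)])
ok = cell.write("/ls/cell/lock", "client-42")  # 3 of 5 acknowledge

minority = Master([Replica(), Replica(), Replica(up=False), Replica(up=False), Replica(up=False)])
failed = minority.write("/ls/cell/lock", "client-7")  # only 2 of 5 acknowledge
```

With five replicas the cell tolerates two unreachable replicas; with three unreachable the write is not acknowledged.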

30 Chubby
- If a replica fails and does not recover for a long time (a few hours):
  - A fresh machine is selected as a new replica, replacing the failed one
  - The DNS tables are updated
  - The new replica obtains a recent copy of the database
  - The current master polls DNS periodically to discover new replicas

31 Paxos Problem
- A collection of processes propose values
  - Only a value that has been proposed may be chosen
  - Only a single value is chosen
  - A process learns that a value has been chosen only when it actually has been
- Roles: proposers, acceptors, learners
- Asynchronous, non-Byzantine model
  - Processes run at arbitrary speeds, may fail by stopping, and may restart
  - Messages are not corrupted

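32 Paxos Algorithm
- Phase 1
  - (a) A proposer sends a prepare request with proposal number n
  - (b) Acceptor: if n is greater than the number of any prepare request it has already replied to, it responds with a promise (including the highest-numbered proposal it has accepted, if any)
- Phase 2
  - (a) If a majority of acceptors reply, the proposer sends an accept request with value v, where v is the value of the highest-numbered proposal reported in the replies, or the proposer's own value if none was reported
  - (b) An acceptor accepts unless it has already responded to a prepare request with a number higher than n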
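
The two phases can be sketched as a single-process simulation of single-decree Paxos. It is a teaching sketch under simplifying assumptions: messages are ordinary method calls that never get lost, proposal numbers start at 1 (0 means "nothing accepted yet"), and the names `Acceptor` and `propose` are chosen for this example. It includes the standard rule that a proposer must adopt the value of the highest-numbered proposal reported in the promises.

```python
class Acceptor:
    def __init__(self):
        self.promised_n = 0        # highest prepare number replied to
        self.accepted_n = 0        # number of the accepted proposal, 0 if none
        self.accepted_value = None

    def prepare(self, n):
        """Phase 1(b): promise only if n exceeds every earlier prepare number."""
        if n > self.promised_n:
            self.promised_n = n
            return (True, self.accepted_n, self.accepted_value)
        return (False, None, None)

    def accept(self, n, value):
        """Phase 2(b): accept unless a higher-numbered prepare was answered."""
        if n >= self.promised_n:
            self.promised_n = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False


def propose(acceptors, n, value):
    """Run both phases for one proposal; return the chosen value or None."""
    majority = len(acceptors) // 2 + 1
    # Phase 1(a): send prepare(n) to all acceptors and collect promises.
    replies = [a.prepare(n) for a in acceptors]
    promises = [(an, av) for ok, an, av in replies if ok]
    if len(promises) < majority:
        return None
    # Adopt the value of the highest-numbered accepted proposal, if any.
    prior_n, prior_value = max(promises, key=lambda p: p[0])
    if prior_n > 0:
        value = prior_value
    # Phase 2(a): send accept(n, value) to all acceptors.
    accepted = sum(1 for a in acceptors if a.accept(n, value))
    return value if accepted >= majority else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, 1, "x")   # "x" is chosen
second = propose(acceptors, 2, "y")  # must adopt the already chosen "x"
stale = propose(acceptors, 1, "z")   # stale proposal number is rejected
```

The second proposal shows why only a single value is ever chosen: once `"x"` has been accepted by a majority, any later proposer learns about it in phase 1 and re-proposes it.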

33 Bigtable
- Bigtable is a distributed storage system for managing structured data
- Designed to scale to a very large size
  - Petabytes of data across thousands of servers
- Used for many Google projects
  - Web indexing, Personalized Search, Google Earth, Google Analytics, Google Finance, …
- Flexible, high-performance solution for all of Google’s products

34 The Table Abstraction in Bigtable
- Rows
  - The row key is an arbitrary string (e.g. a URL); rows are ordered lexicographically by key
- Columns
  - Grouped into column families, each with associated information
- Timestamps
  - Distinguish different versions of the data in a cell
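
The abstraction can be read as a sorted map from (row key, column, timestamp) to a value. The in-memory sketch below illustrates that reading only; the class `TinyTable` and its methods are hypothetical and unrelated to the real Bigtable API, and the reversed-URL row keys are illustrative.

```python
import bisect


class TinyTable:
    """Toy (row, column, timestamp) -> value map with rows kept in sorted order."""

    def __init__(self):
        self.rows = {}         # row key -> {column -> [(timestamp, value), ...]}
        self.sorted_keys = []  # row keys in lexicographic order

    def put(self, row, column, timestamp, value):
        if row not in self.rows:
            bisect.insort(self.sorted_keys, row)
            self.rows[row] = {}
        versions = self.rows[row].setdefault(column, [])
        versions.append((timestamp, value))
        # Keep the newest version first.
        versions.sort(key=lambda tv: tv[0], reverse=True)

    def get(self, row, column, timestamp=None):
        """Newest value, or the newest value at or before `timestamp`."""
        for ts, value in self.rows.get(row, {}).get(column, []):
            if timestamp is None or ts <= timestamp:
                return value
        return None

    def scan(self, start, end):
        """Row keys in the half-open range [start, end), in sorted order."""
        lo = bisect.bisect_left(self.sorted_keys, start)
        hi = bisect.bisect_left(self.sorted_keys, end)
        return self.sorted_keys[lo:hi]


table = TinyTable()
table.put("com.cnn.www", "contents:", 3, "<html>v3</html>")
table.put("com.cnn.www", "contents:", 5, "<html>v5</html>")
table.put("com.bbc.www", "contents:", 4, "<html>bbc</html>")
```

Because rows are kept in lexicographic order, a range scan over a key prefix touches only neighbouring rows, which is what makes splitting a table into contiguous row ranges practical.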

35 SSTable, Tablet and Table
[Figure labels: SSTable = 64K blocks plus an index; tablet (Start: aardvark, End: apple) built from SSTables; table made of tablets aardvark..apple and apple_two_E..boat]

36 Tablet Location
- Level 1: a file stored in Chubby contains the location of the root tablet, i.e., a directory of ranges (tablets) and associated metadata. The root tablet never splits.
- Level 2: each metadata tablet contains the location of a set of user tablets.
- Level 3: a set of SSTable identifiers for each tablet.
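
The lookup path from the Chubby file through the root tablet and a metadata tablet to a user tablet can be sketched as two range searches. The data below is entirely made up for illustration: the tablet names, the dictionary `TABLETS`, and the assumption that each range is identified by its end row key are choices made for this example.

```python
import bisect


def find(ranges, key):
    """ranges is a sorted list of (end_key, target).

    Return the target of the first range whose end_key is >= key.
    """
    ends = [end for end, _ in ranges]
    i = bisect.bisect_left(ends, key)
    if i == len(ranges):
        raise KeyError(key)
    return ranges[i][1]


# Hypothetical contents: the Chubby file names the root tablet, the root tablet
# maps row ranges to metadata tablets, and metadata tablets map row ranges to
# user tablets.
CHUBBY_FILE = "root"
TABLETS = {
    "root": [("m", "meta-1"), ("\uffff", "meta-2")],
    "meta-1": [("apple", "user-tablet-A"), ("m", "user-tablet-B")],
    "meta-2": [("\uffff", "user-tablet-C")],
}


def locate(row_key):
    root = TABLETS[CHUBBY_FILE]            # read the root tablet location from Chubby
    meta = TABLETS[find(root, row_key)]    # root tablet -> metadata tablet
    return find(meta, row_key)             # metadata tablet -> user tablet


where = [locate("aardvark"), locate("boat"), locate("zebra")]
```

A real client would cache the results so that most lookups skip the upper levels.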

37 Bigtable System Architecture