From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design Chapter 3 System Models.


1 From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design Chapter 3 System Models

2 System Model
- Provides an abstract, simplified but consistent description of a relevant aspect of distributed system design
- Physical model
- Architectural model
- Fundamental model

3 Physical Model
- Physical models capture the hardware composition of a system in terms of the computers and their interconnecting networks.
[Figure labels: mobile network, global ISP, regional ISP, home network, institutional network]

4 Different Types of Physical Model
- Baseline physical model
- Early distributed systems
- Internet-scale distributed systems
- Contemporary distributed systems
- Distributed systems of systems

5 Architectural Model
- Architectural models describe a system in terms of the computational and communication tasks performed by its computational elements

6 Architectural Entities
- What are the entities that are communicating in the distributed system?
- How do they communicate, or, more specifically, what communication paradigm is used?
- What (potentially changing) roles and responsibilities do they have in the overall architecture?
- How are they mapped onto the physical distributed infrastructure (what is their placement)?

7 Communication Paradigm
- Interprocess communication
  - Socket programming
- Remote invocation
  - Request-reply model
  - Remote procedure call
  - Remote method invocation
- Indirect communication
  - Group communication
  - Publish-subscribe

8 Client-Server Model

9 P2P Model

10 Web proxy server

11 Layering Architecture in Distributed System

12 Tier Architectures in Distributed System

13 Fundamental Model
- Fundamental models examine individual aspects of a distributed system.
  - Interaction model
  - Failure model
  - Security model

14 Interaction Model
- Performance of communication channel
  - Delay
  - Bandwidth
  - Jitter
- Two types of models
  - Synchronous: message exchange within a bounded delay
  - Asynchronous: message exchange with arbitrary (unbounded) delay

15 Failure Model
- Omission failure
  - Process omission failure
  - Communication omission failure
- Arbitrary failure
- Timing failure
- Masking failure

16 Security Model
- Threats
  - Threats to server
  - Threats to client
  - Threats to communication channel
- Protection
  - Cryptography
  - Authentication
  - Secure channel
  - Protecting objects

17 Case Study: Google Search Engine

18 Main components
- Crawler
  - URL servers assign lists of URLs to crawlers
  - Crawlers request the pages and store them in the storeserver
  - The storeserver compresses the pages and stores them in the repository
- Indexer (and sorter)
  - Reads the repository and uncompresses the pages
  - Generates barrels based on hits (occurrences, font, etc.)
  - Parses out all the links into anchors
  - The URL resolver reads the anchors and gets the URLs
- PageRank
  - The number of links (including both from and to)
  - The importance of the links
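The PageRank bullets above can be illustrated with a small power-iteration sketch. This is not the production algorithm: the damping factor of 0.85, the fixed iteration count, the handling of pages without outgoing links, and the function name `pagerank` are assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages from a link graph.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults (assumptions).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its own importance among the pages it links to,
                # so both incoming and outgoing link counts matter.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Assumption: a page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

In this toy graph, page `c` is linked from both `a` and `b`, so it ends up ranked above `b`, which is linked only from `a`.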

19 Case Study: The Overall Google Systems Architecture

20 Example Google applications

21 Physical Model of the Google Infrastructure

22 Layering Architecture of Google Infrastructure

23 Overall architecture of GFS
- Single master
  - File and chunk namespace
  - Mapping from file to chunks
  - Chunk location

24 GFS
- Chunk
  - 64 MB fixed size
  - Chunk handle (64-bit unique identifier)
  - Three replicas by default
- Chunk location
  - The master polls chunkservers for location information
  - Clients contact the master for location information and cache it for a while
  - Clients contact chunkservers directly to download data
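Because chunks have a fixed size, a client can turn a byte offset within a file into a chunk index before asking the master where that chunk lives. The sketch below shows only that arithmetic; the function names `chunk_index` and `chunks_for_read` are hypothetical and not part of any GFS API.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB fixed chunk size, as on the slide


def chunk_index(byte_offset):
    """Return the index of the chunk that contains byte_offset."""
    if byte_offset < 0:
        raise ValueError("byte_offset must be non-negative")
    return byte_offset // CHUNK_SIZE


def chunks_for_read(byte_offset, length):
    """Split a read into (chunk index, offset within chunk, byte count) pieces."""
    pieces = []
    pos = byte_offset
    end = byte_offset + length
    while pos < end:
        within = pos % CHUNK_SIZE
        # Read up to the end of the current chunk or the end of the request.
        count = min(CHUNK_SIZE - within, end - pos)
        pieces.append((chunk_index(pos), within, count))
        pos += count
    return pieces


# A 30-byte read that starts 10 bytes before a chunk boundary touches two chunks.
pieces = chunks_for_read(CHUNK_SIZE - 10, 30)
```

Each piece corresponds to one chunk, so a client would need the location of every chunk index that appears in the result.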

25 GFS Master
- Metadata
  - File and chunk namespace
  - Mapping from file to chunks
  - Location of each chunk
- Problems
  - Single point of failure, scalability bottleneck
- Solution
  - Shadow master, minimize master involvement
- Operation log
  - Keeps track of activities, providing a logical timeline
  - Replicated at multiple sites for reliability

26 Lease and mutation
- The master picks one replica as primary
- The primary defines the mutation order
- The other replicas follow the same order
- Overhead at the master is minimized
- Data flows along a carefully defined linear chain

27 Chubby
- Chubby is a lock service
- Primary usage
  - Synchronize access to shared resources
- Other usage
  - Primary election
  - Metadata storage
  - Root of distributed data structures
- A lock service should be reliable and available

28 Chubby
- A Chubby cell consists of a small set of servers (replicas)
- A master is elected from the replicas via a consensus protocol
  - Master lease: several seconds
  - If the master fails, a new one is elected when the master lease expires
- Clients talk to the master via the Chubby library
  - All replicas are listed in DNS; clients discover the master by talking to any replica

29 Chubby
- Replicas maintain copies of a simple database
- Clients send read/write requests only to the master
- For a write:
  - The master propagates it to the replicas via the consensus protocol
  - The master replies after the write reaches a majority of replicas
- For a read:
  - The master satisfies the read alone
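The read and write rules above can be sketched with a toy in-memory model. This is a deliberate simplification: it only counts which replicas are reachable and does not implement a consensus protocol, and the class names `Replica` and `Master` are hypothetical.

```python
class Replica:
    """One copy of the simple database; `up` models whether it is reachable."""

    def __init__(self, up=True):
        self.up = up
        self.db = {}


class Master:
    """Toy master: replicas[0] is assumed to be the master's own copy."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        reachable = [r for r in self.replicas if r.up]
        # Succeed only if a strict majority of all replicas can take the write.
        if len(reachable) <= len(self.replicas) // 2:
            return False
        for replica in reachable:
            replica.db[key] = value
        return True

    def read(self, key):
        # The master answers reads from its own copy, without contacting others.
        return self.replicas[0].db.get(key)


# Five replicas with two down: three of five is still a majority.
cell = Master([Replica(), Replica(), Replica(), Replica(up=False), Replica(up=False)])
ok = cell.write("/ls/cell/lock", "held-by-client-1")
value = cell.read("/ls/cell/lock")
```

With three of five replicas down, the same `write` call would return `False`, since the write could not reach a majority.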

30 Chubby
- If a replica fails and does not recover for a long time (a few hours):
  - A fresh machine is selected as a new replica, replacing the failed one
  - The DNS is updated to list the new replica
  - The new replica obtains a recent copy of the database
  - The current master polls DNS periodically to discover new replicas

31 Paxos Problem
- A collection of processes propose values
  - Only a value that has been proposed may be chosen
  - Only a single value is chosen
  - A process learns of a chosen value only when it has actually been chosen
- Proposers, acceptors, learners
- Asynchronous, non-Byzantine model
  - Processes run at arbitrary speeds, may fail by stopping, and may restart
  - Messages are not corrupted

32 Paxos Algorithm
- Phase 1
  - (a) Proposer sends a prepare request with number n
  - (b) Acceptor: if n is greater than the number of any prepare request it has already replied to, it responds with a promise
- Phase 2
  - (a) If a majority reply, the proposer sends an accept request with value v
  - (b) Acceptor accepts unless it has responded to a prepare request with a number higher than n
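The two phases can be sketched as a single-process simulation. The sketch assumes reliable, in-order delivery and no crashes, so it illustrates the acceptor rules rather than a fault-tolerant implementation. One detail comes from the standard description of Paxos rather than from the slide: a promise carries the highest-numbered proposal the acceptor has already accepted, and the proposer must reuse that value. The names `Acceptor` and `propose` are hypothetical.

```python
class Acceptor:
    def __init__(self):
        self.promised_n = -1        # highest prepare number replied to so far
        self.accepted_n = -1        # number of the last accepted proposal, if any
        self.accepted_value = None

    def on_prepare(self, n):
        # Phase 1(b): promise only if n exceeds every earlier prepare number.
        if n > self.promised_n:
            self.promised_n = n
            return (self.accepted_n, self.accepted_value)
        return None

    def on_accept(self, n, value):
        # Phase 2(b): accept unless a higher-numbered prepare was promised.
        if n >= self.promised_n:
            self.promised_n = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False


def propose(acceptors, n, value):
    """Run both phases for proposal number n; return the chosen value or None."""
    majority = len(acceptors) // 2 + 1
    # Phase 1(a): send prepare(n) to every acceptor and collect promises.
    promises = [p for p in (a.on_prepare(n) for a in acceptors) if p is not None]
    if len(promises) < majority:
        return None
    # Reuse the value of the highest-numbered proposal already accepted, if any.
    prev_n, prev_value = max(promises, key=lambda p: p[0])
    chosen = prev_value if prev_n >= 0 else value
    # Phase 2(a): send accept(n, chosen) and count acceptances.
    acks = sum(1 for a in acceptors if a.on_accept(n, chosen))
    return chosen if acks >= majority else None


acceptors = [Acceptor(), Acceptor(), Acceptor()]
first = propose(acceptors, 1, "x")   # nothing accepted yet, so "x" is chosen
second = propose(acceptors, 2, "y")  # must reuse the already accepted "x"
stale = propose(acceptors, 1, "z")   # number too low, no promises are given
```

The second call shows why only a single value is chosen: a later proposer with a different value ends up re-proposing the value that was already accepted.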

33 Bigtable
- Bigtable is a distributed storage system for managing structured data.
- Designed to scale to a very large size
  - Petabytes of data across thousands of servers
- Used for many Google projects
  - Web indexing, Personalized Search, Google Earth, Google Analytics, Google Finance, …
- Flexible, high-performance solution for all of Google’s products

34 The Table Abstraction in Bigtable
- Rows
  - The row key is an arbitrary string (e.g., a URL); rows are ordered lexicographically
- Columns
  - Column families group associated information
- Timestamps
  - Different versions of the data in a cell
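The table abstraction can be pictured as a map from (row key, column, timestamp) to a value. The sketch below is an in-memory toy under that assumption; the class name `Table`, its methods, and the `family:qualifier` column strings are illustrative and not the Bigtable client API.

```python
class Table:
    """Toy model: row key -> column ("family:qualifier") -> {timestamp: value}."""

    def __init__(self):
        self.cells = {}

    def put(self, row, column, timestamp, value):
        self.cells.setdefault(row, {}).setdefault(column, {})[timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest version at or before timestamp (newest overall if None)."""
        versions = self.cells.get(row, {}).get(column, {})
        candidates = [ts for ts in versions if timestamp is None or ts <= timestamp]
        if not candidates:
            return None
        return versions[max(candidates)]

    def scan(self, start_row, end_row):
        """Return row keys in [start_row, end_row) in lexicographic order."""
        return [r for r in sorted(self.cells) if start_row <= r < end_row]


table = Table()
table.put("com.cnn.www", "contents:", 3, "<html>v1</html>")
table.put("com.cnn.www", "contents:", 5, "<html>v2</html>")
table.put("com.example.www", "anchor:cnnsi.com", 4, "CNN")
latest = table.get("com.cnn.www", "contents:")
older = table.get("com.cnn.www", "contents:", timestamp=4)
rows = table.scan("com.cnn", "com.d")
```

Writing URLs with the host name reversed, as in the row keys above, keeps pages from the same domain next to each other in the lexicographic row order.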

35 SSTable, Tablet and Table
[Figure labels: SSTables, each made of 64K blocks plus an index; tablets built from SSTables, with row ranges such as Start: aardvark, End: apple and apple_two_E to boat]

36 Tablet Location
- Level 1: A file stored in Chubby contains the location of the root tablet, i.e., a directory of ranges (tablets) and associated metadata. The root tablet never splits.
- Level 2: Each metadata tablet contains the location of a set of user tablets.
- Level 3: A set of SSTable identifiers for each tablet.
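A lookup through this hierarchy can be sketched as two range searches that start from the root location held in Chubby. All names and data below (`chubby_file`, `tablets`, `find_range`, `locate`, and the tablet names) are hypothetical, and the sketch assumes each directory entry is keyed by the last row of the range it covers.

```python
import bisect

# Assumption: the Chubby file simply names the root tablet.
chubby_file = "root"

# Each tablet is a sorted list of (end_row, target) entries.
tablets = {
    "root": [("m", "meta1"), ("\uffff", "meta2")],
    "meta1": [("apple", "user_tablet_A"), ("m", "user_tablet_B")],
    "meta2": [("\uffff", "user_tablet_C")],
}


def find_range(directory, row_key):
    """Return the target of the first range whose end_row is >= row_key."""
    end_rows = [end_row for end_row, _ in directory]
    i = bisect.bisect_left(end_rows, row_key)
    if i == len(directory):
        raise KeyError(row_key)
    return directory[i][1]


def locate(row_key):
    """Follow Chubby file -> root tablet -> metadata tablet -> user tablet."""
    root = tablets[chubby_file]
    metadata = tablets[find_range(root, row_key)]
    return find_range(metadata, row_key)


location = locate("boat")
```

A real client would cache the result so that later lookups for nearby rows skip the upper levels.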

37 Bigtable System Architecture

