Distributed Storage System Survey


1 Distributed Storage System Survey
Yang Kun

2 Agenda
1. History of DSS
2. Definition & Terminology
3. Basic Factors
4. DSS Common Design
5. Basic Theories
6. Popular Algorithms
7. Replication Strategies
8. Implementations
9. Open Source & Business

3 History of DSS Network File System (1980s)

4 History of DSS Storage Area Network (SAN) File System (1990s)

5 History of DSS Object-oriented parallel file system (2000s)

6 History of DSS Cloud Storage

7 Definition & Terminology
Transparency: network transparency, user mobility.
Performance measurement: the amount of time needed to satisfy service requests. Performance should be comparable to that of a conventional file system.

8 Definition & Terminology
Fault tolerance: the system should tolerate communication faults, machine failures (of the fail-stop type), storage device crashes, and decay of storage media.
Scalability: a scalable system should react more gracefully to increased load. Its performance should degrade more moderately than that of a non-scalable system, and its resources should reach a saturated state later.

9 Definition & Terminology
Consistency: there must exist a total order on all operations such that each operation looks as if it were completed at a single instant.
Availability: every request received by a non-failing node in the system must result in a response.
Reliability

10 Basic Factors
Location transparency
User mobility
Security
Performance
Scalability
Availability
Failure tolerance

11 DSS Common Design
Client: Reading
Client: Writing

12 Basic Theories
CAP Theory
ACID vs. BASE Model
Quorum NRW

13 CAP Theory

14 CAP Theory
In a network subject to partitions (in both the asynchronous and the partially synchronous model), it is impossible for a web service to provide consistency, availability and partition tolerance at the same time.
Consistency
Availability
Partition tolerance

15 CAP Theory
CP: All data lives on a single node, and the other nodes read from and write to that node.
CA: Traditional database systems.
AP: The system always returns a value for every request.
Cassandra = A + P + eventual consistency

16 ACID vs. BASE Model
ACID          BASE
Atomicity     Basically Available
Consistency   Eventually consistent
Isolation     Soft state
Durability

17 Quorum NRW
N: The number of replicas, that is, how many copies are kept of each data object.
R: The minimum number of replicas that must respond for a read to be considered successful.
W: The minimum number of replicas that must acknowledge for a write to be considered successful.
Together, these three parameters determine availability, consistency and fault tolerance. Strong consistency can be guaranteed only if W + R > N.
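The W + R > N rule can be checked mechanically. The following is a minimal sketch, not taken from the slides: the function name `quorum_properties` and the returned field names are hypothetical, and the fault-tolerance figures simply count how many replicas may be down while a read or write quorum can still be assembled.

```python
def quorum_properties(n: int, r: int, w: int) -> dict:
    """Summarize what an (N, R, W) configuration provides (illustrative only)."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must each be between 1 and N")
    return {
        # If R + W > N, every read quorum shares at least one replica
        # with every write quorum, so a read sees the latest write.
        "strong_consistency": r + w > n,
        # Replicas that may be unavailable while reads still succeed.
        "read_fault_tolerance": n - r,
        # Replicas that may be unavailable while writes still succeed.
        "write_fault_tolerance": n - w,
    }


print(quorum_properties(3, 2, 2))  # overlapping quorums
print(quorum_properties(3, 1, 1))  # fast, but quorums may not overlap
```

With N = 3, choosing R = W = 2 satisfies the rule and tolerates one unavailable replica for both reads and writes, whereas R = W = 1 tolerates two but gives up the overlap guarantee.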

18 Popular Algorithms
Paxos Algorithm
Roles: Proposer, Acceptor, Learner
Phases: Prepare, Accept, Learn
[Figure: ballot table with columns "#", "decree", and "Quorum & Voters"]
“Paxos Made Simple”, Leslie Lamport, 01 Nov. 2001
“The Part-Time Parliament”, Leslie Lamport, ACM Transactions on Computer Systems 16, 2 (May 1998)

19 PAXOS
Phase 1.
(a) A proposer selects a proposal number n and sends a prepare request with number n to a majority of acceptors.
(b) If an acceptor receives a prepare request with number n greater than that of any prepare request to which it has already responded, then it responds to the request with a promise not to accept any more proposals numbered less than n and with the highest-numbered proposal (if any) that it has accepted.
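The acceptor's side of Phase 1 can be sketched in a few lines. This is an illustrative, single-process sketch: the class name `Acceptor`, its fields and its return values are hypothetical, and there is no networking or persistence. The `on_accept` method corresponds to Phase 2 of "Paxos Made Simple", which the slide above does not describe; it is included only so that the promise has a previously accepted proposal to report.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Acceptor:
    # Highest prepare number this acceptor has responded to (-1 means none).
    promised_n: int = -1
    # (number, value) of the highest-numbered proposal accepted so far.
    accepted: Optional[Tuple[int, str]] = None

    def on_prepare(self, n: int):
        """Phase 1: promise only if n exceeds every prepare already answered."""
        if n > self.promised_n:
            self.promised_n = n
            # The promise carries the highest-numbered accepted proposal, if any.
            return ("promise", n, self.accepted)
        return None  # stale prepare request: ignored in this sketch

    def on_accept(self, n: int, value: str) -> bool:
        """Phase 2 (not on the slide): accept unless a higher prepare was promised."""
        if n >= self.promised_n:
            self.promised_n = n
            self.accepted = (n, value)
            return True
        return False


a = Acceptor()
print(a.on_prepare(5))   # first prepare is promised, nothing accepted yet
print(a.on_prepare(3))   # lower-numbered prepare is ignored
a.on_accept(5, "x")
print(a.on_prepare(7))   # promise now reports the accepted proposal (5, "x")
```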

20 PAXOS

21 Popular Algorithms Consistent Hashing
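The slide names consistent hashing without detail. As a rough sketch under stated assumptions: nodes are placed on a hash ring at several virtual points each, and a key belongs to the first node point at or after the key's hash, wrapping around. The class name `HashRing`, the use of MD5 and the default of 100 virtual nodes are illustrative choices, not something the presentation specifies.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # MD5 is used here only as a well-spread, deterministic hash.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes: int = 100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (point, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # Each node owns several points so that load spreads more evenly.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def lookup(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        # First point at or after the key's hash; wrap to index 0 past the end.
        i = bisect.bisect_left(self._ring, (_hash(key),))
        return self._ring[i % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.lookup("user:42"))
```

The property that makes this useful for storage systems is that removing a node only reassigns the keys that node owned; every other key keeps its owner.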

22 Popular Algorithms
Mutual Exclusion Algorithms (messages per critical-section entry in parentheses)
Lamport algorithm (3*(n - 1))
Improved Lamport algorithm (3*(n - 1))
Ricart–Agrawala algorithm (2*(n - 1))
Maekawa algorithm
Roucairol-Carvalho algorithm

23 Popular Algorithms
Election Algorithms
Chang-Roberts algorithm (O(n log n) messages on average)
Garcia-Molina's bully algorithm
Non-comparison-based algorithms
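To make the Chang-Roberts message count concrete, here is a small simulation sketch. It assumes a unidirectional ring in which every process starts the election at once, and it leaves out the final announcement of the leader; the function name `chang_roberts` and the list-based ring are hypothetical simplifications.

```python
def chang_roberts(ids):
    """Simulate Chang-Roberts election on a unidirectional ring.

    ids[i] is the unique identifier of the process at ring position i;
    messages travel from position i to position (i + 1) % n.
    Returns (leader_id, messages_sent).
    """
    n = len(ids)
    # Every process initiates by sending its own id to its successor.
    in_flight = [((i + 1) % n, ids[i]) for i in range(n)]
    messages = n
    leader = None
    while in_flight:
        pos, uid = in_flight.pop()
        if uid == ids[pos]:
            leader = uid  # own id travelled the whole ring: elected
        elif uid > ids[pos]:
            in_flight.append(((pos + 1) % n, uid))  # forward a larger id
            messages += 1
        # A smaller id is swallowed and travels no further.
    return leader, messages


print(chang_roberts([1, 2, 3, 4]))  # ids increase along the ring
print(chang_roberts([4, 3, 2, 1]))  # ids decrease along the ring
```

When ids increase in the direction of travel, only the largest id survives its first hop, giving 2n - 1 messages (7 for n = 4). When they decrease, the id of rank k travels k hops, giving n(n + 1)/2 messages (10 for n = 4), which is the quadratic worst case behind the O(n log n) average.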

24 Popular Algorithms
Bidding Algorithms
Self-Stabilization Algorithms

25 Replication Strategies
Asynchronous Master/Slave Replication: Log appends are acknowledged at the master in parallel with transmission to slaves. (Does not support ACID.)
Synchronous Master/Slave Replication: A master waits for changes to be mirrored to slaves before acknowledging them. (Requires timely detection of failures.)
Optimistic Replication: Any member of a homogeneous replica group can accept mutations. (The order is not known, so transactions are impossible.)

26 Chain Replication

27 CRAQ: Chain Replication with Apportioned Queries

28 Funnel Replication
Topology
Vector Clock
Total Order
Write Request: (key, value, vector clock, originating head replica)
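The write request tuple above carries a vector clock. The sketch below shows only the generic vector-clock operations (increment and comparison) together with a tuple shaped like the one on the slide. It is not the Funnel Replication protocol itself, and the names `WriteRequest`, `increment` and `compare` are hypothetical.

```python
from typing import Dict, NamedTuple

VectorClock = Dict[str, int]  # replica id -> logical counter


class WriteRequest(NamedTuple):
    key: str
    value: str
    clock: VectorClock
    origin: str  # originating head replica


def increment(clock: VectorClock, replica: str) -> VectorClock:
    """Return a copy of the clock with the replica's counter advanced by one."""
    new = dict(clock)
    new[replica] = new.get(replica, 0) + 1
    return new


def compare(a: VectorClock, b: VectorClock) -> str:
    """Return 'equal', 'before', 'after' or 'concurrent' for a relative to b."""
    ids = set(a) | set(b)
    a_le_b = all(a.get(i, 0) <= b.get(i, 0) for i in ids)
    b_le_a = all(b.get(i, 0) <= a.get(i, 0) for i in ids)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


c1 = increment({}, "head-1")
w1 = WriteRequest("k", "v1", c1, "head-1")
w2 = WriteRequest("k", "v2", increment(c1, "head-2"), "head-2")
w3 = WriteRequest("k", "v3", increment(c1, "head-1"), "head-1")
print(compare(w1.clock, w2.clock))  # w1 happened before w2
print(compare(w2.clock, w3.clock))  # neither saw the other: concurrent
```

Vector clocks alone yield only a partial order. Concurrent writes such as `w2` and `w3` need an additional tie-breaking rule before a total order exists, and that rule is not specified here.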

29 Atomic Commit Protocol
Two-Phase Commit (2PC)
1. Voting phase: The coordinator requests all participating sites to prepare to commit.
2. Decision phase: The coordinator either commits the transaction if all participants are prepared to commit (voted “yes”), or aborts the transaction if any participant has decided to abort (voted “no”).
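The two phases can be sketched as a coordinator loop over in-memory participants. This is a simplified illustration: the `Participant` class and the `two_phase_commit` function are hypothetical, and timeouts, logging to stable storage and recovery after coordinator failure are deliberately left out.

```python
class Participant:
    def __init__(self, name: str, can_commit: bool = True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self) -> bool:
        """Voting phase: vote yes (True) or no (False)."""
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants) -> str:
    # Phase 1 (voting): ask every participant to prepare and collect the votes.
    votes = [p.prepare() for p in participants]
    # Phase 2 (decision): commit only if every vote was "yes".
    if all(votes):
        for p in participants:
            p.commit()
        return "commit"
    for p in participants:
        p.abort()
    return "abort"


print(two_phase_commit([Participant("a"), Participant("b")]))
print(two_phase_commit([Participant("a"), Participant("b", can_commit=False)]))
```

A single "no" vote is enough to abort the transaction at every site, including the sites that voted "yes".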

30 Atomic Commit Protocol
Presumed Abort Protocol: designed to reduce the cost associated with aborting transactions.
Presumed Commit Protocol: designed to reduce the cost associated with committing transactions by interpreting missing information about transactions as commit decisions.
One-PC: the One-Phase Commit protocol consists of only a single phase, which is the decision phase of 2PC.
One-Two-PC

31 Implementations
BigTable
Windows Azure Storage
Google MegaStore
Chubby

32 Open Source & Business
Business                         Open Source
Amazon Simple Storage Service    MongoDB
Windows Azure Storage            MemcacheDB
Google MegaStore                 ThruDB
Google BigTable                  HBase
Alibaba's Cloud Storage          Cassandra
IBM XIV                          Scalaris
Chubby                           ZooKeeper

33 Thank you!!!

