Presentation is loading. Please wait.

Presentation is loading. Please wait.

Building and Programming the Cloud, Mysore, Jan 2010 1 Accountable distributed systems and the accountable cloud Peter Druschel joint work with Andreas.

Similar presentations


Presentation on theme: "Building and Programming the Cloud, Mysore, Jan 2010 1 Accountable distributed systems and the accountable cloud Peter Druschel joint work with Andreas."— Presentation transcript:

1 Building and Programming the Cloud, Mysore, Jan 2010 1 Accountable distributed systems and the accountable cloud Peter Druschel joint work with Andreas Haeberlen 1, Petr Kuznetsov 2, Rodrigo Rodrigues 1 University of Pennsylvania 2 TU Berlin/Deutsche Telekom Labs

2 2 Outline Why accountability? A definition A practical implementation: PeerReview Accountability in the Cloud Technical Challenges Conclusion Building and Programming the Cloud, Mysore, Jan 2010

3 3 What is the problem? Building and Programming the Cloud, Mysore, Jan 2010 Multiple administrative domains (federated, p2p) Multiple stakeholders (hosting, Web) different actors, somewhat different interests lack of global visibility, control Complex faults software faults, mis-configuration, negligence, disgruntled employees, outside attacks, manipulation Lack of transparency

4 4 Learning from the 'offline' world Relies heavily on accountability to deal with faults, misbehavior Example: Banking Record can be used to (manually) detect problems identify the responsible party convince that a problem does (not) exist RequirementSolution CommitmentSigned receipts Tamper-evident recordDouble-entry bookkeeping InspectionsAudits Building and Programming the Cloud, Mysore, Jan 2010

5 5 What does accountability mean in distributed systems? 1. Tamper-evident record of each node‘s actions 2. (Automated) audit for fault detection, localization 3. Evidence to convince a third party that a fault has (not) occured Accountability provides transparency trust incentives to avoid faults Building and Programming the Cloud, Mysore, Jan 2010

6 6 Outline Why accountability? A definition A practical implementation: PeerReview Accountability in the Cloud Technical Challenges Conclusion Building and Programming the Cloud, Mysore, Jan 2010

7 7 Ideal accountability Whenever a node is faulty, the system generates a proof of misbehavior against that node Fault := Node deviates from expected behavior Our goal is to automatically detect faults identify the faulty nodes convince others that a node is (or is not) faulty Can we build a system that provides the following guarantee? Building and Programming the Cloud, Mysore, Jan 2010

8 8 Can we detect all faults? Problem: Faults that affect only a node's internal state Would require online trusted probes at each node Focus on observable faults: Faults that affect a correct node Can detect observable faults without requiring trusted components A A X C C 1001010110 0010110101 1100100100 0 Building and Programming the Cloud, Mysore, Jan 2010

9 9 Can we always get a proof? Problem: He-said-she-said Three possible causes: A never sent X B refuses to acknowledge X X was delayed by the network Cannot get proof of misbehavior! Generalize to verifiable evidence: a proof of misbehavior, or a challenge that a faulty node cannot answer What if the challenged node does not respond? Does not prove a fault, but node is suspected until it responds A A X B B C C ? I sent X! I never received X! ?! Building and Programming the Cloud, Mysore, Jan 2010

10 10 Practical accountability Requirement for an accountable distributed system: This is useful Any (!) fault that affects a correct node is eventually detected and linked to a faulty node It can be implemented in practice Whenever a fault is observed by a correct node, the system eventually generates verifiable evidence against a faulty node Building and Programming the Cloud, Mysore, Jan 2010

11 11 Outline Why accountability? A definition A practical implementation: PeerReview Accountability in the Cloud Technical Challenges Conclusion Building and Programming the Cloud, Mysore, Jan 2010

12 12 Adds accountability to a given system Implemented as a library Provides tamper-evident record Detects faults via state-machine replay Assumptions: PeerReview 1. Nodes can be modeled as deterministic state machines 2. There is a trusted reference implementation of the state machines 3. Correct nodes can eventually communicate 4. Nodes can sign messages Building and Programming the Cloud, Mysore, Jan 2010

13 13 PeerReview is widely applicable App #1: NFS server in the Linux kernel Many small, latency-sensitive requests Tampering with files Lost updates App #2: Overlay multicast Transfers large volume of data Freeloading Tampering with content App #3: P2P email Complex, large, decentralized Denial of service Attacks on DHT routing Details in [Haeberlen et al., SOSP’07] NetReview [ Haeberlen et al. NSDI’08 ] Metadata corruption Incorrect access control Censorship Building and Programming the Cloud, Mysore, Jan 2010

14 14 How much does PeerReview cost? Log storage 10 – 100 GByte per month, depending on application Message signatures Message latency (e.g. 1.5ms RTT with RSA-1024) CPU overhead (embarrassingly parallel) Log/authenticator transfer, replay overhead Depends on # witnesses Can be deferred to exploit bursty/diurnal load patterns Building and Programming the Cloud, Mysore, Jan 2010

15 15 Outline Why accountability? A definition A practical implementation: PeerReview Accountability in the Cloud Technical Challenges Conclusion Building and Programming the Cloud, Mysore, Jan 2010

16 Split administration in the Cloud Bug in Alice‘s software Subtle differences between Alice and Bob‘s environments... 16 Alice Bob Alice's customers Bug in Bob‘s software Insufficient resource allocation Hacker attack... What if there is a problem? Building and Programming the Cloud, Mysore, Jan 2010

17 Split administraction: Alice‘s perspective 17 Building and Programming the Cloud, Mysore, Jan 2010 Alice Alice's customers ? ? ? ? ? ? ? ? Bob If something is wrong, how will I know? How can I tell if it's my software or the cloud? If it's the cloud, how can I convince Bob?

18 If something is wrong, how will I know? How can I tell if it's my software or the cloud? If it's the cloud, how can I convince Bob? Split administraction: Bob's perspective 18 Building and Programming the Cloud, Mysore, Jan 2010 Alice Bob Alice's customers ? ? ???????? ??? If something is wrong, how will I know? How can I tell if it's the cloud or Alice's software? If it's Alice's software, how can I convince Alice?

19 An idealized solution What if we had an oracle that Alice and Bob could ask about problems? Completeness: If the cloud is faulty, the oracle will say so Accuracy: If the cloud is not faulty, the oracle will say so Verifiability: The oracle produces evidence that would convince a third party 19 Building and Programming the Cloud, Mysore, Jan 2010 Alice Bob Alice's customers Oracle

20 The accountable cloud Idea: Make cloud accountable Cloud records its actions in a tamper-evident log Alice can audit the log and check for faults Use log to construct evidence that a fault does (not) exist Should work even if one party was compromised! 20 Building and Programming the Cloud, Mysore, Jan 2010 Alice Bob Alice's customers Tamper-evident log

21 Discussion Is this too pessimistic? Cloud isn't malicious! Hacker attacks, software bugs, operator error, malicious client, … Difficult to come up with a more restrictive fault model Without provable properties, evidence has little value Why would a provider want to deploy this? Attractive to prospective customers (peace of mind) Helps in handling customer complaints, resolve disputes 21 Building and Programming the Cloud, Mysore, Jan 2010

22 22 Outline Why accountability? A definition A practical implementation: PeerReview Accountability in the Cloud Technical Challenges Conclusion Building and Programming the Cloud, Mysore, Jan 2010

23 Is the technology ready? Cloud accountability should Have provable guarantees Work for most cloud applications Require no changes to application code Cover a wide spectrum of properties Have reasonable overhead Can existing techniques deliver this? CATS, Repeat&Compare, AIP, PeerReview, NetReview, AudIt,... More work is needed! 23 Building and Programming the Cloud, Mysore, Jan 2010 ? ? ?

24 Work in progress: AVM Goal: Provide accountability for arbitrary binary executables Idea: Accountable virtual machine (AVM) Cloud records enough data to enable deterministic replay Alice can replay log against a reference implementation Can audit any part of the hosted execution 24 Building and Programming the Cloud, Mysore, Jan 2010 AliceBob Virtual machine

25 Challenges Complete state-machine replay expensive limit to spot checks, investigation of suspected faults multi-core replay is hard replay log against an abstract model? Checking performance properties Checking information flow Lots of research opportunities 25 Building and Programming the Cloud, Mysore, Jan 2010

26 Summary Accountability is a useful capability in distributed systems tamper-evident record fault detection and localization evidence Proposal: the accountable cloud Can verify correct operation, produce evidence Provable guarantees  solid foundation for both players Challenges remain 26 Questions? Building and Programming the Cloud, Mysore, Jan 2010


Download ppt "Building and Programming the Cloud, Mysore, Jan 2010 1 Accountable distributed systems and the accountable cloud Peter Druschel joint work with Andreas."

Similar presentations


Ads by Google