Slide 2: The Subprime Mortgage Debacle
- Banks issued dubious mortgages to customers
- Banks packed mortgages as securities and sold them
- They “outsourced” risk
- But risk came home to roost…
- E.g. Bank of America’s Q3 ’07 earnings dropped 32%
- The moral: Risk is passed around in complex, difficult-to-trace ways…

Slide 3: Storage-as-a-Service (SaaS)
[Diagram labels: Alice; MyWeddingPhotos; MegArchive Corp. (Seattle, USA); MiniFile Inc. (Bangalore, India); peer-to-peer network; Bob (Liverpool, UK)]
- One day, Alice’s machine crashes, so she contacts MegArchive…

Slide 4: Other financially motivated service degradation
- MegArchive moves Alice’s file to a slow disk array
  - File is there, but retrieval is unacceptably slow
- MegArchive uses cheap storage that degrades over time
- MegArchive throws away a portion of Alice’s file to save space
  - Most users don’t retrieve their backup files anyway

Slide 5: PORs: Proofs of Retrievability
- Alice would like to ensure that her file F is retrievable
- She may also want to ensure compliance with a Service-Level Agreement (SLA)
- The simple approach: Alice periodically downloads F
  - This is resource-intensive!
- What about spot-checking instead?

Slide 9: Spot checking
[Diagram labels: Archive holding F~ with blocks ~f1, ~f2, ~f3; Alice holding f1, f2, f3]
Pros:
- Alice needn’t download all of F
- Can detect large erasure / corruption
Cons:
- Alice must store chunks of F
- Can’t detect small erasure / corruption

Slide 13: MAC refinement: Verification
[Diagram labels: Archive; f1, f2, f3; c1, c2, c3; F~; F; Alice; k]
Pros:
- Alice needn’t store any of F
- Can detect large erasure / corruption
Cons:
- Can’t detect small erasure / corruption

Slide 14: Error correcting code
[Diagram labels: Archive; F*; parity bits; decoder; F]
- With ECC to encode F → F*, big error in F* induces small error in F
- In effect, we can amplify errors in stored file

Slide 15: ECC + MAC [loosely like LIBBI ’03, NR ’06]
[Diagram labels: Archive; f1, f2, f3, f4; parity bits; c1, c2, c3, c4; F*; F; Alice; k]
Pros:
- Alice needn’t store any of F—only key k
- Alice can detect even small corruption in F (= large corruption in F*)

Slide 16: ECC + MAC: Verification
[Diagram labels: Archive; f1, f2, f3, f4; parity bits; c1, c2, c3, c4; F*; F; Alice; k]
Example:
- ECC corrects up to 10% corruption
- Alice checks 30 positions
- Each check detects corruption with probability ≥ .9
- Alice detects insurmountable corruption with prob. ≥ > 95%

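The MAC-based spot check of slides 13 to 16, as an added Python example that is not from the talk: Alice keeps only k, the archive stores each block with its MAC, and `detect_prob` computes how often 30 random checks hit a corrupted block. The block size, the index-bound HMAC-SHA256 construction, the key and the 1000-block file are assumptions made for the illustration.

```python
import hashlib
import hmac
import random

BLOCK = 16  # assumed block size in bytes (the slides do not give one)

def mac(k, i, block):
    # The block index is bound into the MAC so a valid (block, MAC) pair
    # from another position cannot answer a challenge for position i.
    return hmac.new(k, i.to_bytes(8, "big") + block, hashlib.sha256).digest()

def encode(k, data):
    """Alice: split F into blocks f_i and attach c_i = MAC_k(i, f_i)."""
    blocks = [data[o:o + BLOCK] for o in range(0, len(data), BLOCK)]
    return [(f, mac(k, i, f)) for i, f in enumerate(blocks)]

def respond(store, idxs):
    """Archive: return the stored (f_i, c_i) pairs for the challenged positions."""
    return [store[i] for i in idxs]

def verify(k, idxs, answer):
    """Alice: recompute each MAC using only k; she keeps no part of F."""
    return len(answer) == len(idxs) and all(
        hmac.compare_digest(mac(k, i, f), c) for i, (f, c) in zip(idxs, answer))

def detect_prob(n, corrupted, q):
    """Chance that q distinct random positions out of n hit a corrupted block."""
    miss = 1.0
    for j in range(q):
        miss *= (n - corrupted - j) / (n - j)
    return 1.0 - miss

def demo():
    k = b"constructed-demo-key"                    # constructed sample key
    data = bytes(i % 251 for i in range(16000))    # constructed file: 1000 blocks
    store = encode(k, data)
    idxs = random.Random(7).sample(range(len(store)), 30)
    honest_ok = verify(k, idxs, respond(store, idxs))
    f, c = store[idxs[0]]
    store[idxs[0]] = (bytes([f[0] ^ 1]) + f[1:], c)  # archive corrupts one checked block
    tampered_ok = verify(k, idxs, respond(store, idxs))
    return len(store), honest_ok, tampered_ok

if __name__ == "__main__":
    print(demo())
    # 10% of blocks corrupted vs. a single corrupted block, 30 checks each
    print(detect_prob(1000, 100, 30), detect_prob(1000, 1, 30))
```

With 100 of 1000 blocks corrupted the 30 checks succeed in detecting it more than 95% of the time, while a single corrupted block is caught only 3% of the time, which is the gap the error correcting code is brought in to close.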
Slide 17: A challenge
- Applying such an ECC to all of F is impractical
- Instead, we can stripe the ECC
- We’ll need many stripes for efficiency
- But then a clever adversarial archive can corrupt selectively…

Slide 18: Selective corruption
[Diagram labels: Archive; c1, c2, c3, c4; F*; F]
- Adversary corrupts just one stripe S
- File is now irretrievable!
- Unless Alice checks S, she does not detect corruption
- If S is tiny fraction of file, adversary escapes detection with high probability

Slide 19: Solution: Hide ECC stripes
[Diagram labels: Archive; c1, c2, c3, c4; F*; F]
- Do secret, randomized partitioning of F into stripes
  - E.g. use secret key to generate pseudorandom permutation and then choose stripes sequentially
- Encrypt and permute parity bits
  - In paper, we permute and encrypt F too, but not strictly necessary
- F itself is untouched, i.e., intact within F*!
- But adversary does not know where stripes are, so
- Adversary cannot feasibly target a stripe!

Slide 20: Another challenge: How do we formally define a POR?
- Adversary may respond correctly to challenges, but deny service on retrieval!
- No global solution: Pattern / number of requests betrays nature of request
- A practical approach to definition:
  - Assume memoryless adversary, e.g., software that can be rewound/reset
  - Ensure that challenge interface is identical to retrieval interface, i.e., formal extraction model
  - Limit distinction to pattern / number of requests

Slide 21: Yet another challenge: Bandwidth
- As Alice makes many checks, she fetches many file pieces f_i
- Can we reduce bandwidth?
- One approach: MAC on server side
  - For ith check, generate one-time MACing key k_i = g(k)
  - Let archive compute and return m'_i = MAC_{k_i}[f_i] and E[m_i]
  - Alice checks that m'_i = m_i
  - To preserve extraction interface, Alice must also request and check a (small) piece of f_i

Slide 22: Another bandwidth-limiting approach: Sentinels
[Diagram labels: F]
- Insert hidden, random sentinel blocks into encryption of F
- Idea: Check sentinel blocks instead of file blocks
- Theoretically interesting because POR independent of F!
  - Compare with proof of knowledge…
- Interesting in practical sense because sentinels can be combined
  - Archive can XOR sentinels before sending
  - Allows for, e.g., hierarchical POR
- (In paper, we permute F too, but not strictly necessary)

Slide 23: Further / related research
- Implementation
  - F may be too large to fit in main memory
- Quantifying QoS guarantees
- File updates: PORs break!
  - Efficient fix?
- Extended models
  - E.g., publicly verifiable
- Algebraic approaches
  - Burns et al., CCS ’07 (next talk!) and other previous work [E.g., FB ’06]