Sushil Jajodia, George Mason U; Witold Litwin, U Paris Dauphine; Thomas Schwarz, S.J., U Católica Uruguay
Scalable Distributed Data Structures store data in the potentially hostile environment of distributed systems, clouds, and P2P networks
Standard protection for confidentiality, integrity, and authentication is encryption
◦ Client-Side Encryption: the challenge is key management. Clients lose keys, keys need to be revoked, … Key escrow could be used, but it is not widely used
◦ Server-Side Encryption: the challenge is again key management, and the client has no control over her/his data
Built on top of LH*, the distributed version of linear hashing
◦ Data distributed into buckets, each at a different server
◦ Gracefully adjusts to growth of the file
◦ Very efficient access via record identifiers
Uses client-side encryption
All keys are stored in LH* itself:
◦ Each key $C$ is broken into $k + 1$ shares through secret sharing: generate $k$ random strings $C_0, C_1, \dots, C_{k-1}$ of the same size as $C$; these form the first $k$ shares. The last share is $C_k = C_0 \oplus C_1 \oplus C_2 \oplus \dots \oplus C_{k-1} \oplus C$
◦ Key shares are stored in share records
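The share construction on this slide can be sketched in a few lines. This is a minimal illustration, assuming the operator lost between the terms of the last share is bitwise XOR; the function names `split_key` and `recover_key` are hypothetical and not part of LH* RE.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(key: bytes, k: int) -> list:
    """Split `key` into k + 1 shares (hypothetical helper).

    The first k shares are random strings of the same size as the key.
    The last share is the XOR of all random shares and the key itself.
    """
    shares = [secrets.token_bytes(len(key)) for _ in range(k)]
    last = key
    for share in shares:
        last = xor_bytes(last, share)
    return shares + [last]


def recover_key(shares: list) -> bytes:
    """XOR all k + 1 shares together to get the key back."""
    key = bytes(len(shares[0]))  # all-zero string of the share size
    for share in shares:
        key = xor_bytes(key, share)
    return key
```

Under this assumption, any k of the k + 1 shares are statistically independent of the key, which is what makes the missing share essential for recovery.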
LH* data is stored in records consisting of a RID and a non-key field (payload)
LH* RE adds three fields:
◦ I-field: identifies the application, …
◦ F-field: flag that distinguishes between data records and key share records
◦ T-field: identifies the key being used
Client records are LH* records; data records are translated into LH* RE format
Key Management
◦ Clients maintain their key chain in a table T
◦ Each key is broken up into key shares
◦ Each key share is stored as a share record
Key Recovery:
◦ Use the LH* scan operation to find all key shares belonging to a certain application
Key Revocation:
◦ Find all key shares
◦ Find all records
◦ Delete, re-encrypt, and reinsert records
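The recovery step can be sketched with an in-memory stand-in for the file. Everything here is an assumption made for illustration: the `Record` class, its field types, the `"share"`/`"data"` flag values, the use of XOR to combine shares, and the nested loop that plays the role of the LH* scan. It is not the LH* RE interface.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical record layout: RID, the three added fields, payload."""
    rid: int
    i_field: str    # identifies the application (assumed to be a string)
    f_field: str    # flag: "data" or "share" (assumed encoding)
    t_field: int    # identifies the key being used (assumed to be an int)
    payload: bytes  # ciphertext for data records, share for share records


def recover_keys(buckets, app_id):
    """Collect all key shares of one application and rebuild its keys.

    `buckets` is a list of lists of Record; iterating over it is a
    stand-in for a scan over all servers of the file.
    """
    shares_by_key = defaultdict(list)
    for bucket in buckets:
        for rec in bucket:
            if rec.f_field == "share" and rec.i_field == app_id:
                shares_by_key[rec.t_field].append(rec.payload)

    keys = {}
    for key_id, shares in shares_by_key.items():
        key = bytes(len(shares[0]))
        for share in shares:  # assumed XOR-based secret sharing
            key = bytes(a ^ b for a, b in zip(key, share))
        keys[key_id] = key
    return keys
```

Revocation would follow the same pattern: locate the shares and the records encrypted under the revoked key, then delete, re-encrypt, and reinsert them.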
LH* RE is k-safe
◦ An intrusion into up to k servers is not successful
Threat model
◦ Servers are autonomous, with no common vulnerabilities: physical access, administrative access, common configuration
Assurance
◦ Probability that x intrusions did not yield records to the attacker
Disclosure
◦ Expected proportion of records obtained by an intrusion into x servers
Assurance in an LH* file with K = 4, 8, and 16 key shares (top to bottom) extending over 16, 32, 64, 128, 256, 512, and 1024 servers. The x-axis is chosen to show the 99.999% (five nines) assurance level.
Ratio r of assurances with random placement over assurance with the LH* RE placement scheme for N = 256 sites, K = 4, 8, and 16.
Assurance in an LH* file with K = 4 and r = 10 and r = 100 keys. We vary N from 16 to 1024. The x-axis shows the two nines assurance level.
Expected Disclosure
◦ N: number of sites; K: number of key shares; x: number of intruded sites
◦ Independent of the number of keys used!
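To give a feel for how N, K, and x interact, the sketch below computes the chance that one key is fully exposed. It is a simplified model and not the formula from the slides: it assumes the K shares of a key sit on K distinct sites, that the attacker intrudes into x sites chosen uniformly at random, and it ignores the location of the encrypted record itself. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def key_exposure_probability(n_sites: int, k_shares: int, x_intruded: int) -> Fraction:
    """Probability that all K shares of one key lie in the x intruded sites.

    Simplified model: shares on K distinct sites, x sites intruded
    uniformly at random out of N. This is C(x, K) / C(N, K).
    """
    if not 0 <= x_intruded <= n_sites:
        raise ValueError("x must be between 0 and N")
    if not 1 <= k_shares <= n_sites:
        raise ValueError("K must be between 1 and N")
    # comb(x, K) is 0 when x < K, so fewer than K intrusions never succeed.
    return Fraction(comb(x_intruded, k_shares), comb(n_sites, k_shares))
```

In this model the expected fraction of exposed keys equals the per-key probability by linearity of expectation, whatever the number of keys, which matches the slide's observation that the expectation does not depend on how many keys are used.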
Conditional Disclosure
◦ Expected proportion of records disclosed, assuming that a successful intrusion has occurred
◦ Parameters: the probability of a successful attack into one bucket; r, the number of keys
Contour Graph for the conditional disclosure. We vary N, the number of sites, and r, the number of keys. We set K, the number of shares to 8, and show figures for x = 8, 9, 16 and 32 intrusions. The upper right corner of each picture has close to zero conditional disclosure.
Refined Disclosure Costs
◦ The cost of a disclosure of data depends on the fact of disclosure (negative publicity, costs of filing with authorities and penalties, costs of incident analysis, …) and on the number of records disclosed
◦ Model the cost of disclosure by assuming that a proportion α of the maximum disclosure cost is fixed
Refined Disclosure Proportion for N = 100, K = 8, α = 0, 0.1, 0.5, 1, and x (x-axis) varying between 0 and 50. Notice the different scales on the y-axis.
On Balance:
◦ While maintaining the number of nines of safety, we can introduce more keys as the file extends over more sites
◦ Number of keys has no impact on expected disclosure
◦ Number of keys has positive impact on conditional disclosure
◦ Number of keys has negative impact on refined disclosure costs
Up to now, we assumed an attacker without knowledge of the LH* layout
◦ With knowledge of where the buckets are (e.g. from observing traffic flow), a “savvy attacker” has an advantage
Assume the initial number of buckets is 3, but the file has now grown to 6:
◦ One key share in Buckets 0 and 4
◦ One key share in Buckets 1, 3, and 5
◦ One key share in Bucket 2
Optimal attack uses key share distribution and size
◦ Optimal 3-attack plan: attack either 0 or 4; attack 3 (it has half the records descending from original Bucket 1); attack 2
◦ Success rate is ½ · ½ · 1 = 0.25
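The arithmetic of this example can be checked with a small sketch. It rests on one reading of the slide, stated here as an assumption: each key share now sits in exactly one bucket of its group, each listed fraction is the chance that it sits in that bucket, and the attacker intrudes into one bucket per group. Only the fractions the slide implies are filled in; the shares of Buckets 1 and 5 are not stated and are left out. The names `share_groups` and `best_attack` are hypothetical.

```python
from fractions import Fraction
from math import prod

# One dict per key share: bucket number -> chance the share is in that bucket.
share_groups = [
    {0: Fraction(1, 2), 4: Fraction(1, 2)},  # share is in Bucket 0 or 4
    {3: Fraction(1, 2)},                     # Bucket 3 holds half; 1 and 5 unstated
    {2: Fraction(1)},                        # share is certainly in Bucket 2
]


def best_attack(groups):
    """Pick the most promising bucket per share and multiply the chances."""
    plan = [max(group, key=group.get) for group in groups]
    success = prod((group[b] for group, b in zip(groups, plan)), start=Fraction(1))
    return plan, success


plan, success = best_attack(share_groups)
print(plan, float(success))  # [0, 3, 2] 0.25
```

Ties are broken in favor of the first bucket listed, so Bucket 0 is chosen over Bucket 4; either choice gives the same success rate.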
Savvy attacker needs to optimize bucket intrusions
Advantage is higher if:
◦ There are buckets of different size
◦ The initial number of buckets is not a power of 2