Private Inference Control

1 Private Inference Control
David Woodruff (MIT). Joint work with Jessica Staddon (PARC).

2 Contents
Background
  Access Control and Inference Control
Our contribution: Private Inference Control (PIC)
Related Work
PIC model & definitions
Our Results
Conclusions

3 Access Control
User queries a database. Some info in the DB is sensitive.
User: "What's Bob's salary?"
Server (DB of n records): "Sensitive: access denied."
Access control prevents the user from learning individual sensitive relations/attributes.
Does access control prevent the user from learning sensitive info?

4 Inference Control
Name               Job                 Salary
Alyssa P. Hacker   Software Engineer   $90,000
Paul E. Nomial     Mathematician       $31,415
Query 1: How much does Alyssa make? Answer: Sensitive.
Query 2: What is Alyssa's job? Answer: Software Engineer
Query 3: How much do software engineers make? Answer: $90,000
Combining non-sensitive info may yield something sensitive.
Inference channel: {(name, job), (job, salary)}
Inference control: block all inference channels.
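The inference channel on this slide can be illustrated with a short Python sketch. It splits the slide's table into two lookups that are each non-sensitive on their own; that split, and the name `infer_salary`, are illustrative assumptions and not part of the talk.

```python
# Toy data taken from the slide's table. Splitting it into two separate,
# individually permitted relations is an assumption made for illustration.
name_to_job = {
    "Alyssa P. Hacker": "Software Engineer",
    "Paul E. Nomial": "Mathematician",
}
job_to_salary = {
    "Software Engineer": 90_000,
    "Mathematician": 31_415,
}

def infer_salary(name):
    """Chain two permitted lookups into a fact that was meant to be blocked."""
    job = name_to_job[name]       # Query 2: (name, job) is allowed on its own
    return job_to_salary[job]     # Query 3: (job, salary) is allowed on its own

print(infer_salary("Alyssa P. Hacker"))  # prints 90000
```

Access control on the direct (name, salary) relation does not stop this chain, which is why the pair {(name, job), (job, salary)} has to be treated as a channel.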

5 Inference Control
Database x ∈ ({0,1}^m)^n: a DB of n records with m attributes 1, …, m per record; n tends to infinity, m = O(1).
Inference engine: generates a collection C of subsets of [m] denoting all the inference channels. We assume we have such an engine [QSKLG93] (exhaustive search).
F ∈ C means that for all i, the user shouldn't learn x_{i,j} for all j ∈ F.
Assume C is monotone (if F ∈ C, then every superset of F is also in C).
Assume C is input to both user and server. The user learns C anyway when his queries are blocked. C is data-independent and reveals info only about attributes.

6 Our contribution: Private Inference Control
Existing inference control schemes require the server to learn user queries to check if they form an inference.
Our goal: User Privacy + Inference Control = PIC
Privacy: an efficient S learns nothing about an honest user's queries except the number made so far. The number of queries made so far enables S to do inference control.
Private and symmetrically-private information retrieval: not sufficient since stateless, while the user's permissions change.
Generic secure function evaluation: not efficient; our communication is exponentially smaller.
This talk: arbitrary malicious users U*, semi-honest S. Can apply [NN] to handle malicious S.

7 Application
Government analysts inspect repositories (DB) for terrorist patterns.
Inference Control: prevent analysts from learning sensitive info about non-terrorists.
User Privacy: prevent the server from learning what analysts are tracking. If discovered, this info could go to terrorists!

8 Related Work
Data perturbation [AS00, B80, TYW84]: so much noise is required that the data is not as useful [DN03].
Adaptive Oblivious Transfer [NP99]: one record can be queried adaptively at most k times.
Priced Oblivious Transfer [AIR01]: one record, supports more inference channels than the threshold version considered in [NP99].
We generalize [NP99] and [AIR01]: arbitrary inference channels and multiple records; more efficient/private than parallelizing [NP99] and [AIR01] on each record.

9 The Model
Offline Stage: S is given x, C, 1^k, and can preprocess x.
Online Stage: at time t, honest U generates a query (i_t, j_t). (i_t, j_t) can depend on all prior info/transactions with S.
Let T denote all queries U makes: (i_1, j_1), …, (i_|T|, j_|T|). T is a random variable that depends on U's code, x, and randomness.
T is permissible if there is no i such that (i, j) ∈ T for all j ∈ F, for some F ∈ C. We require honest U to generate a permissible T.
U and S interact in a multiround protocol, then U outputs out_t.
View_U consists of C, n, m, 1^k, all messages from S, and randomness.
View_S consists of C, n, m, 1^k, x, all messages from U, and randomness.
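The condition on T above can be checked directly. The following is a minimal Python sketch, assuming queries are (record, attribute) pairs and C is given as a list of attribute sets; the function name `is_permissible` is hypothetical.

```python
from collections import defaultdict

def is_permissible(T, C):
    """True if no record i has had every attribute of some channel F in C queried.

    T: iterable of (i, j) queries (record index, attribute index).
    C: collection of channels, each a set of attribute indices from [m].
    """
    seen = defaultdict(set)            # record index -> attributes queried so far
    for i, j in T:
        seen[i].add(j)
    # Because C is monotone, testing F <= attrs against the minimal channels
    # also covers every superset of those channels.
    return not any(set(F) <= attrs for attrs in seen.values() for F in C)

C = [{1, 2}]                                   # one channel: attributes 1 and 2 together
print(is_permissible([(1, 1), (2, 2)], C))     # different records, no channel completed
print(is_permissible([(1, 1), (1, 2)], C))     # record 1 completes the channel
```

In the actual protocol the server never sees T, so a check like this only describes what the ideal behaviour should be; it is not something S can run on the user's queries.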

10 Security Definitions
Correctness: For all x, C, for all honest users U, for all t ∈ [|T(U, x)|], if T is permissible, out_t = x_{i_t, j_t}.
User Privacy: For all x, C, for all honest U, for any two sequences T1, T2 with |T1| = |T2|, for all semi-honest servers S* and random coin tosses of S*: (View_S* | T(U, x) = T1) ≈ (View_S* | T(U, x) = T2), i.e. the two views are indistinguishable.
Inference Control: comparison with an ideal model. For every U*, every x, any random coins of U*, and every C, there exists a simulator U' interacting with a trusted party Ch for which View_U* ≈ View_<U', Ch>, where U' just asks Ch for tuples (i_t, j_t) that are permissible.

11 Efficiency
Efficiency measures are per query.
Minimize communication & round complexity: ideally O(polylog(n)) bits and 1 round.
Minimize server's time complexity: ideally O(n) without preprocessing. With preprocessing, potentially better, but O(n) is optimal w.r.t. known single-server PIR schemes.

12 Our Result
Using best-known PIR schemes [CMS99], [L04], a PIC scheme with:
Communication O~(1)
Work O~(n)
1 round
(O~ hides polylog(n), poly(k) terms)

13 A Generic Reduction
A protocol is a threshold PIC (TPIC) if it satisfies the definitions of a PIC scheme assuming C = {[m]}.
Theorem (roughly speaking): If there exists a TPIC with communication C(n), work W(n), and round complexity R(n), then there exists a PIC with communication O(C(n)), work O(W(n)), and round complexity O(R(n)).

14 PIC ideas
User/server do SPIR on a table of encryptions.
Idea: the table holds encryptions of both data and keys that will help the user decrypt encryptions on future queries.
User can only decrypt if he has the appropriate keys, which is only possible if he is not in danger of making an inference.

15 Stateless PIC
Efficiency of PIC is a data structures problem.
Which keys are most efficient for the user to:
  Update as the user makes new queries?
  Prove the user is not in danger of making an inference on current/future queries?
Keys must prevent replay attacks: the user can't use "old" keys to pretend to have made fewer queries to records than he actually has.

16 PIC Scheme #1 – Stage 1
Let E be a homomorphic, semantically secure encryption scheme (e.g., Paillier).
Suppose we allow accessing each record at most once.
User holds PK, SK and the current query (i_3, j_3); server holds PK.
User sends E(i_3), E(j_3), ZKPOK.
Server computes, from the encryptions of the earlier queries: E(i_1) -> E(r_1(i_1 - i_3)), E(i_2) -> E(r_2(i_2 - i_3)).
User recovers r_1, r_2 iff he hasn't previously accessed i_3.
From r_1 and r_2 the user can reconstruct a secret S.
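The masking step E(i_1) -> E(r_1(i_1 - i_3)) can be tried out with a toy additively homomorphic scheme. The sketch below is a textbook Paillier instance with g = n + 1 and tiny hard-coded primes; it is insecure and only illustrates the arithmetic. All function names are hypothetical, and the zero-knowledge proof and everything else in the protocol are omitted.

```python
import math
import random

def keygen(p=1789, q=1861):
    """Toy Paillier key pair with g = n + 1 (tiny primes, NOT secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                                # valid because g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """E(m) = (1 + n)^m * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(1 + n, m % n, n2) * pow(r, n, n2)) % n2

def decrypt(n, sk, c):
    lam, mu = sk
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def mask_difference(n, c_old, c_new, r):
    """Server side: from E(i_old) and E(i_new) compute E(r * (i_old - i_new))."""
    n2 = n * n
    diff = (c_old * pow(c_new, -1, n2)) % n2   # homomorphic subtraction
    return pow(diff, r, n2)                    # homomorphic scalar multiplication

def recover_share(n, sk, c, i_old, i_new):
    """User side: returns r if i_old != i_new; otherwise the plaintext is 0."""
    v = decrypt(n, sk, c)
    if i_old == i_new:
        return None                            # v == 0, so r stays hidden
    return (v * pow((i_old - i_new) % n, -1, n)) % n

n, sk = keygen()
r1 = random.randrange(1, n)
fresh = mask_difference(n, encrypt(n, 1), encrypt(n, 3), r1)    # earlier record 1, new record 3
repeat = mask_difference(n, encrypt(n, 3), encrypt(n, 3), r1)   # same record queried again
assert recover_share(n, sk, fresh, 1, 3) == r1
assert decrypt(n, sk, repeat) == 0
```

When the new index differs from the earlier one, the user divides out the known difference and learns r_1; when it repeats, the plaintext collapses to 0 and r_1 is not revealed, matching the "recovers r_1, r_2 iff hasn't previously accessed i_3" condition.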

17 PIC Scheme #1 – Stage 2
User holds PK, SK and the current query (i_3, j_3); server holds PK.
User sends E(i_3), E(j_3), commit, ZKPOK.
Table of encryptions (first entries shown):
E(r_{1,1}(j - j_3) + r'_{1,1}(i - i_3) + S + x_{1,1})
E(r_{1,2}(j - j_3) + r'_{1,2}(i - i_3) + S + x_{1,2})
E(r_{2,1}(j - j_3) + r'_{2,1}(i - i_3) + S + x_{2,1})
User, having recovered S, does "SPIR on records" on the table of encryptions.

18 PIC Scheme #1 - Wrapup
To extend to querying a record < m times: on the t-th query, let r_1, …, r_{t-1} be a (t-m+1)-out-of-(t-1) secret sharing of S.
This scheme can be proven to be a TPIC; use the generic reduction to get a PIC.
User Privacy: semantic security of E, ZK of proof, privacy of SPIR.
Inference Control: the user can recover at most t-m of the r_i if he has already queried the record m-1 times; can build a simulator using SPIR w/ knowledge extractor [NP99].
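The threshold argument on this slide can be made concrete with a sketch of a (t-m+1)-out-of-(t-1) sharing. The slide does not say which sharing scheme is used; Shamir's scheme over a prime field is assumed here, and the names `share`, `reconstruct` and `PRIME` are illustrative.

```python
import random

PRIME = 2_147_483_647  # 2^31 - 1, a Mersenne prime used as the field size

def share(secret, threshold, count):
    """Split `secret` into `count` Shamir shares; any `threshold` of them suffice."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(threshold - 1)]

    def f(x):
        return sum(c * pow(x, k, PRIME) for k, c in enumerate(coeffs)) % PRIME

    return [(x, f(x)) for x in range(1, count + 1)]

def reconstruct(shares):
    """Lagrange interpolation of the sharing polynomial at x = 0."""
    total = 0
    for xa, ya in shares:
        num, den = 1, 1
        for xb, _ in shares:
            if xb != xa:
                num = num * (-xb) % PRIME
                den = den * (xa - xb) % PRIME
        total = (total + ya * num * pow(den, -1, PRIME)) % PRIME
    return total

# Example parameters (assumed): t = 5th query, m = 3 attributes per record.
t, m = 5, 3
S = 424242
shares = share(S, threshold=t - m + 1, count=t - 1)   # 3-out-of-4 sharing

# Record queried once before: one share is lost, 3 remain, S is recoverable.
assert reconstruct(shares[1:]) == S
# Record queried m-1 = 2 times before: only t-m = 2 shares remain, which is
# below the threshold, so interpolation no longer determines S.
```

Each earlier query to the same record costs the user one share r_i, so after m-1 earlier queries only t-m shares are available, one short of the threshold.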

19 PIC Scheme #2 - Glimpse
O~(1)-communication, O~(n) work PIC
Balanced binary tree B. Leaves are attributes. Parents of leaves are records.
Internal node n is accessed when record r is queried and n is on the path from r to the root.
Keys encode the number of times nodes in B have been accessed.
[Figure: tree B with node keys K_{u,a}, K_{v,b}, K_{w,c}, K_{x,d}, K_{y,e}, K_{z,f}, labelled a + b = t.]
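The bookkeeping behind "keys encode the number of times nodes have been accessed" can be sketched with plain counters. This is only an illustration of the counting invariant (the counts of two children sum to the parent's count, and the root's count is the total number of queries t); it contains no cryptography, stops at the record level rather than modelling attribute leaves, and the class `AccessTree` is hypothetical.

```python
class AccessTree:
    """Access counters on a complete binary tree whose bottom level stands for records."""

    def __init__(self, num_records):
        size = 1
        while size < num_records:
            size *= 2
        self.size = size
        self.count = [0] * (2 * size)    # heap layout: root at index 1

    def query(self, record):
        """Increment every node on the path from the record's node to the root."""
        node = self.size + record
        while node >= 1:
            self.count[node] += 1
            node //= 2

tree = AccessTree(4)
for record in [0, 0, 3]:                 # t = 3 queries in total
    tree.query(record)

a, b = tree.count[2], tree.count[3]      # counts at the root's two children
assert a + b == tree.count[1] == 3       # a + b = t at the root
```

Each query touches only the nodes on one root-to-record path, which is logarithmic in the number of records; this is consistent with the slide's O~(1) communication claim, though how the keys actually encode these counts is not shown in the transcript.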

20 Conclusions
Extensions not in this talk:
  Multiple users (pseudonyms)
  Collusion resistance: c-resistance => an m-channel becomes a collection of (m-1)/c channels.
Summary:
  New primitive: PIC
  Essentially optimal construction w.r.t. known PIR schemes
