Privacy Preserving Data Mining, Lecture 2: Cryptographic Solutions
Benny Pinkas, HP Labs
10th Estonian Winter School in Computer Science, March 1, 2005


Page 1: Privacy Preserving Data Mining, Lecture 2: Cryptographic Solutions
Benny Pinkas, HP Labs, Israel

Page 2: Secure two-party computation: definition
[Figure: two parties with inputs x and y. Output: F(x,y) and nothing else, as if the parties had handed x and y to an ideal party that returns only F(x,y).]

Page 3: Secure Function Evaluation
A major topic of cryptographic research.
How to let n parties, $P_1, \ldots, P_n$, compute a function $F(x_1, \ldots, x_n)$:
– Input $x_i$ is known to party $P_i$.
– The parties learn the final output and nothing else.
Caveat: cryptographic definitions of secure computation are both too strong and too weak:
– Too strong: they do not allow leakage of harmless information; the price of this extra security is efficiency.
– Too weak: they do not address leakage or misuse caused by the function itself (e.g., information implied by the outputs, or misbehavior in choosing an input).

Page 4: Leak no other information
A protocol is secure if it emulates the ideal solution.
Alice learns F(x,y), and can therefore compute everything that is implied by x, her prior knowledge of y, and F(x,y).
Alice must not be able to compute anything else.
Simulation:
– A protocol is considered secure if, for every adversary in the real world, there exists a simulator in the ideal world that outputs an indistinguishable "transcript", given access to the information that the adversary is allowed to learn in the ideal model.

Page 5: Secure Function Evaluation
Major result [Yao]: "Any function that can be evaluated using polynomial resources can be securely evaluated using polynomial resources" (under some cryptographic assumption).

Page 6: SFE building block: 1-out-of-2 Oblivious Transfer
[Figure: Bob, the sender, holds $Y_0, Y_1$ and learns nothing. Alice, the receiver, holds $j \in \{0,1\}$ and learns $Y_j$.]
1-out-of-2 OT can be based on most public-key systems.
There are implementations with two communication rounds.
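To make the functionality concrete, here is a minimal Python sketch of a 1-out-of-2 OT in the style of the classic RSA-based construction of Even, Goldreich and Lempel. The slide does not name a specific protocol, so this choice is an assumption. The toy modulus, the function names and the message flow are illustrative only; the sketch assumes semi-honest parties and is not secure at these parameter sizes.

```python
import secrets

# Toy RSA key built from two Mersenne primes (assumed parameters, far too small
# for real use). Bob, the sender, owns the private exponent D.
P_PRIME = 2**61 - 1
Q_PRIME = 2**31 - 1
N = P_PRIME * Q_PRIME
E = 65537
D = pow(E, -1, (P_PRIME - 1) * (Q_PRIME - 1))  # modular inverse (Python 3.8+)


def sender_round1():
    """Bob sends two random values r0, r1 to Alice."""
    return secrets.randbelow(N), secrets.randbelow(N)


def receiver_round1(j, r0, r1):
    """Alice blinds r_j with a random k; v hides which r_j she picked."""
    k = secrets.randbelow(N)
    v = ((r0, r1)[j] + pow(k, E, N)) % N
    return k, v


def sender_round2(y0, y1, r0, r1, v):
    """Bob unblinds v both ways; only one of k0, k1 equals Alice's k."""
    k0 = pow((v - r0) % N, D, N)
    k1 = pow((v - r1) % N, D, N)
    return (y0 + k0) % N, (y1 + k1) % N


def receiver_output(j, k, c0, c1):
    """Alice removes her mask k from the ciphertext she chose."""
    return ((c0, c1)[j] - k) % N


def oblivious_transfer(y0, y1, j):
    """Run all steps locally; y0 and y1 must be integers below N."""
    r0, r1 = sender_round1()
    k, v = receiver_round1(j, r0, r1)
    c0, c1 = sender_round2(y0, y1, r0, r1, v)
    return receiver_output(j, k, c0, c1)
```

Calling `oblivious_transfer(y0, y1, j)` returns $Y_j$. Bob sees only `v`, which is uniformly distributed for either choice of j, and Alice cannot unmask the other value without the private exponent.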

Page 7: General two-party computation
Two-party protocol.
Input:
– Sender: a function F (in some representation). The sender's input Y is already embedded in F.
– Receiver: $x \in \{0,1\}^n$.
Output:
– Receiver: F(x) and nothing else about F.
– Sender: nothing about x.

Page 8: Representations of F
Boolean circuits [Yao, GMW, …]
Algebraic circuits [BGW, …]
Low-degree polynomials [BFKR]
Matrix product over a large field [FKN, IK]
Randomizing polynomials [IK]
Communication complexity protocol [NN]

Page 9: Secure two-party computation of general functions [Yao]
First, represent the function F as a Boolean circuit C.
– It's always possible.
– Sometimes it's easy (additions, comparisons).
– Sometimes the result is inefficient (e.g. for indirect addressing such as A[x]).
Then, "garble" the circuit.
Finally, evaluate the garbled circuit.

Page 10: Garbling the circuit
Bob constructs the circuit, and then garbles it.
[Figure: gate G with input wires i and j and output wire k; the wires carry the value pairs $w_i^0, w_i^1$; $w_j^0, w_j^1$; and $w_k^0, w_k^1$.]
The w values will serve as cryptographic keys:
– $w_k^0$ corresponds to 0 on wire k
– $w_k^1$ corresponds to 1 on wire k
(Alice will learn one string per wire, but not which bit it corresponds to.)

Page 11: Gate tables
For every gate, every combination of input values is used as a key for encrypting the corresponding output.
Assume G = AND. Bob constructs a table:
– Encryption of $w_k^0$ using keys $w_i^0, w_j^0$ (AND(0,0) = 0)
– Encryption of $w_k^0$ using keys $w_i^0, w_j^1$ (AND(0,1) = 0)
– Encryption of $w_k^0$ using keys $w_i^1, w_j^0$ (AND(1,0) = 0)
– Encryption of $w_k^1$ using keys $w_i^1, w_j^1$ (AND(1,1) = 1)
Result: given $w_i^x, w_j^y$, one can compute $w_k^{G(x,y)}$.

Page 12: Secure computation
Bob sends the table of gate G to Alice.
Given, e.g., $w_i^0, w_j^1$, Alice computes $w_k^0$ by decrypting the corresponding entry in the table, but she does not know the actual values of the wires.
[Figure: gate G with wire values $w_i^0, w_i^1$; $w_j^0, w_j^1$; $w_k^0, w_k^1$, and its table in permuted order:]
– Encryption of $w_k^0$ using keys $w_i^0, w_j^0$
– Encryption of $w_k^0$ using keys $w_i^0, w_j^1$
– Encryption of $w_k^1$ using keys $w_i^1, w_j^1$
– Encryption of $w_k^0$ using keys $w_i^1, w_j^0$
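The gate-table construction can be sketched in a few lines of Python. The slides do not say which encryption scheme is used, so this sketch makes its own assumptions: each table entry is the output key followed by 16 zero bytes, XORed with a SHA-256 hash of the two input keys, and Alice recognises the correct entry by the zero bytes. The names `new_wire`, `garble_gate` and `evaluate_gate` are hypothetical. Entries here are 32 bytes rather than the 16 bytes implied by the 64|C| figure quoted later in the lecture.

```python
import hashlib
import random
import secrets

KEY_LEN = 16            # bytes per garbled wire value (assumed)
CHECK = bytes(KEY_LEN)  # zero bytes that mark a correct decryption (assumed)


def new_wire():
    """Return the pair (w^0, w^1) of random keys for one wire."""
    return secrets.token_bytes(KEY_LEN), secrets.token_bytes(KEY_LEN)


def _pad(key_i, key_j):
    """Derive a 32-byte one-time pad from the two input-wire keys."""
    return hashlib.sha256(key_i + key_j).digest()


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def garble_gate(gate, w_i, w_j, w_k):
    """Bob: encrypt the right output key under every pair of input keys."""
    table = [
        _xor(w_k[gate(x, y)] + CHECK, _pad(w_i[x], w_j[y]))
        for x in (0, 1)
        for y in (0, 1)
    ]
    random.SystemRandom().shuffle(table)  # permuted order hides the row index
    return table


def evaluate_gate(table, key_i, key_j):
    """Alice: try every entry; only one decrypts to a key followed by CHECK."""
    pad = _pad(key_i, key_j)
    for row in table:
        plain = _xor(row, pad)
        if plain[KEY_LEN:] == CHECK:
            return plain[:KEY_LEN]
    raise ValueError("no table entry decrypted correctly")
```

With `w_i, w_j, w_k = new_wire(), new_wire(), new_wire()` and `table = garble_gate(lambda x, y: x & y, w_i, w_j, w_k)`, calling `evaluate_gate(table, w_i[0], w_j[1])` yields `w_k[0]` without telling Alice which bit that key stands for. A full circuit would also mix a gate identifier into the hash so that pads are never reused across gates.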

Page 13: Secure computation
Bob sends to Alice:
– Tables encoding each circuit gate.
– Garbled values (w's) of his input values.
– Translation from garbled values of output wires to actual 0/1 values.
If Alice gets the garbled values (w's) of her input values, she can compute the output of the circuit, and nothing else.

Page 14: Alice's input
For every wire i of Alice's input:
– The parties run an OT protocol.
– Alice's input is her input bit s.
– Bob's input is $w_i^0, w_i^1$.
– Alice learns $w_i^s$.
The OTs for all input wires can be run in parallel.
Afterwards Alice can compute the circuit by herself.

Page 15: Secure computation: the big picture
Represent the function as a circuit C.
Bob sends to Alice 4|C| encryptions (e.g. 64|C| bytes, i.e. 16 bytes per encryption), 4 encryptions for every gate.
Alice performs an OT for every input bit. (Can do, e.g., [number missing] OTs per sec.)
~One round of communication.
Efficient for medium-size circuits!

Page 16: Example
The Millionaires problem: comparing two N-bit numbers.
What's the overhead?

Page 17: Applications
Two parties. Two large data sets.
Max? Mean? Median? Intersection?
Decision tree learning? ID3?

Page 18: Fairplay – a secure two-party computation system
Malkhi, Nisan, P., Sella
A full-fledged secure two-party computation system, implementing Yao's "garbled circuit" protocol.
Goals:
– Investigate whether two-party SFE is practical
– Actual measurements of overall computation
– Breakdown of computation into parts
– Computation versus communication?
– Test-bed for various optimizations

Page 19: Fairplay
The compilation paradigm:
– Programs are written in SFDL, a high-level programming language.
– This allows a clear, formal definition of the requirements that humans can easily understand.
– SHDL: a low-level language describing Boolean circuits.
– SFDL → SHDL compiler and optimizer.
– SHDL → Java programs implementing Yao's protocol.

Page 20: Fairplay – SFDL example
    program Millionaires {
        type int = Int<20>; // 20-bit integer
        type AliceInput = int;
        type BobInput = int;
        type AliceOutput = Boolean;
        type BobOutput = Boolean;
        type Output = struct {AliceOutput alice, BobOutput bob};
        type Input = struct {AliceInput alice, BobInput bob};
        function Output output(Input input) {
            // each party learns whether its own input is the larger one
            output.alice = input.alice > input.bob;
            output.bob = input.bob > input.alice;
        }
    }

Page 21: SFDL properties
Conventional syntax (C/Pascal-like).
Type system:
– Boolean, integer, enumerated
Program structure:
– Declarations: global constants, types
– Sequence of functions (no nesting [C], no recursion)
– Function name is its return value [Pascal]
Conditional execution and loops:
– if-then and if-then-else statements, for-loops (loop boundaries should be known at compile time)
Assignments and expressions:
– constants, variables, array entries, structure items, function calls, operators (+, -, logical, comparison), parentheses

Page 22: SHDL example
    0 input //output$input.bob$0
    1 input //output$input.bob$1
    2 input //output$input.bob$2
    3 input //output$input.bob$3
    4 input //output$input.alice$0
    5 input //output$input.alice$1
    6 input //output$input.alice$2
    7 input //output$input.alice$3
    8 gate arity 2 table [ ] inputs [ 4 5 ]
    9 gate arity 2 table [ ] inputs [ 4 5 ]

Page 23: k-th ranked element (e.g. median)
Inputs:
– Alice: $S_A$; Bob: $S_B$
– Large sets of unique items ($\in D$).
Output:
– $x \in S_A \cup S_B$ such that exactly k-1 elements are smaller than x.
The rank k:
– Could depend on the size of the input datasets.
– Median: $k = (|S_A| + |S_B|) / 2$
Motivation:
– Basic statistical analysis of distributed data.
– E.g. histogram of salaries in CS departments.
The problem: generic constructions using circuits [Yao …] yield an overhead which is at least linear in k.

Page 24: An (insecure) two-party median protocol
[Figure: $S_A$ is split at its median $m_A$ into a lower half $L_A$ and an upper half $R_A$; $S_B$ is split at $m_B$ into $L_B$ and $R_B$. The case shown is $m_A < m_B$.]
$L_A$ lies below the median, $R_B$ lies above the median.
Discarding both leaves the median unchanged: the new median is the same as the original median.
Recursion → need log n rounds (assume each set contains $n = 2^i$ items).

Page 25: A secure two-party median protocol
A finds its median $m_A$; B finds its median $m_B$.
Secure comparison (e.g. a small circuit): is $m_A < m_B$?
– YES: A deletes elements $\le m_A$; B deletes elements $> m_B$.
– NO: A deletes elements $> m_A$; B deletes elements $\le m_B$.
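The control flow of the protocol can be sketched in Python as follows. This is a minimal sketch under several assumptions: both sets have the same size $n = 2^i$, all items are distinct, the "median" of a set of even size is its lower median, and the function `secure_less_than` is a plain stand-in for the secure comparison (in a real run it would be a small garbled circuit that reveals only the result bit). The final `min` would likewise be computed securely.

```python
def lower_median(sorted_items):
    """Lower median of a sorted list (its only element if the length is 1)."""
    return sorted_items[(len(sorted_items) - 1) // 2]


def secure_less_than(m_a, m_b):
    """Stand-in for the secure comparison; reveals only the bit m_a < m_b."""
    return m_a < m_b


def two_party_median(set_a, set_b):
    """Median (element of rank n) of the union of two n-item sets, n = 2^i."""
    a, b = sorted(set_a), sorted(set_b)
    if len(a) != len(b) or len(a) == 0 or len(a) & (len(a) - 1):
        raise ValueError("both sets must have the same power-of-two size")
    while len(a) > 1:
        half = len(a) // 2
        if secure_less_than(lower_median(a), lower_median(b)):
            a = a[half:]   # A deletes elements <= m_A
            b = b[:half]   # B deletes elements >  m_B
        else:
            a = a[:half]   # A deletes elements >  m_A
            b = b[half:]   # B deletes elements <= m_B
    # One element left on each side; the median is the smaller of the two.
    return min(a[0], b[0])
```

For example, with `set_a = [1, 3, 5, 7]` and `set_b = [2, 4, 6, 8]` the two rounds compare 3 with 4 and then 5 with 2, and the result is 4, the fourth smallest of the eight items. Each round halves both sets, which gives the log n rounds mentioned earlier.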

Page 26: An example
[Figure: a worked run on the sets of A and B. The successive comparison results are $m_A > m_B$, $m_A < m_B$, $m_A < m_B$, $m_A > m_B$, $m_A < m_B$. Median found!]

Page 27: Proof of security
[Figure: the sets of A and B, the median, and the same sequence of comparison results shown twice: $m_A > m_B$, $m_A < m_B$, $m_A < m_B$, $m_A > m_B$, $m_A < m_B$.]

Page 28: Arbitrary input size, arbitrary k
[Figure: $S_A$ and $S_B$ are each reduced to k items and padded with $+\infty$ and $-\infty$ values up to size $2^i$.]
Now, compute the median of two sets of size k.
The size should be a power of 2.
Median of the new inputs = k-th element of the original inputs.
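One way to read this reduction is sketched below in Python. The details are assumptions, since the slide only shows the figure: each party keeps its k smallest items (filling up with $+\infty$ if it has fewer), then A adds $+\infty$ values and B adds $-\infty$ values until both lists have a power-of-two size. The helper names are hypothetical, and the median of the union is computed in the clear here only to check the claim.

```python
import math


def pad_for_kth(set_a, set_b, k):
    """Build two equal-size lists whose joint lower median is the k-th element."""
    size = 1 << (k - 1).bit_length()  # smallest power of two that is >= k

    def k_smallest(items):
        kept = sorted(items)[:k]
        return kept + [math.inf] * (k - len(kept))

    padded_a = k_smallest(set_a) + [math.inf] * (size - k)
    padded_b = [-math.inf] * (size - k) + k_smallest(set_b)
    return padded_a, padded_b


def lower_median_of_union(list_a, list_b):
    """Element of rank (total size / 2) in the union, computed in the clear."""
    merged = sorted(list_a + list_b)
    return merged[len(merged) // 2 - 1]
```

With `set_a = [1, 4, 9]`, `set_b = [2, 3, 7]` and `k = 3`, the padded lists are `[1, 4, 9, inf]` and `[-inf, 2, 3, 7]`, and their joint lower median is 3, the third smallest original item. The $-\infty$ values shift the median rank by exactly the amount of padding, which is why the two kinds of padding must be balanced.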

Page 29: Hiding size of inputs
One can search for the k-th element without revealing the size of the input sets.
However, k = n/2 (median) reveals the input size.
Solution: let $S = 2^i$ be a bound on the input size.
[Figure: each dataset, of size $|S_A|$ and $|S_B|$ respectively, is padded up to size S with $-\infty$ and $+\infty$ values.]
The median of the new datasets is the same as the median of the original datasets.

Page 30: Privacy preserving data mining
[Figure: party $P_1$ holds confidential database D1 and party $P_2$ holds confidential database D2; the databases are huge.]
Wish to "mine" D1 ∪ D2 without revealing more information.
Examples:
– Medical databases protected by law
– Competing businesses
– Government agencies (privacy, "need to know")

Page 31: The classification problem
Column headers: Age > 30 | Sex | Time insured | Claim > $500 | Did fraud occur?
Rows (one value per row is missing from the transcript):
– C1: Yes | M | t ∈ [0,9] years | No
– C2: No | F | t ∈ [10,19] years | Yes
– …
– Cn: Yes | F | t ∈ [20,29] years | No
Goal: based on the available data, design an algorithm to classify new data.

Page 32: Classification using decision trees
[Figure: a decision tree whose root tests "Time insured" with branches [0,9] years, [10,19] years and > 20 years; lower nodes test "Age > 30" and "Claim > $500" with No/Yes branches, and the leaves carry the classification.]
ID3: choose the attribute A that minimizes the conditional entropy of the class attribute.

Page 33: Privacy Preserving ID3
Scenario: the inputs are private information of $P_1$ and $P_2$.
Main technical problem: comparing entropies while preserving privacy (entropy $= -\sum x \log x$).
Efficiency:
– Most of the computation is done independently by the parties.
– The overhead of cryptographic operations depends only on the size of the decision tree (not on the input size).
Basic task: compute x log x, where $x = x_1 + x_2$ is, e.g., the total number of customers with (age > 30) and (fraud = yes).
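To see why x log x is the basic task, the following Python sketch computes the conditional entropy of the class given an attribute in two ways: directly from the definition, and as a combination of x ln x terms over counts. It works in the clear and uses natural logarithms; the layout of the `counts` table and the function names are assumptions made for illustration.

```python
import math


def conditional_entropy(counts):
    """H(class | attribute); counts[a][c] = records with value a and class c."""
    n = sum(sum(row) for row in counts)
    h = 0.0
    for row in counts:
        n_a = sum(row)
        for n_ac in row:
            if n_ac:
                h -= (n_ac / n) * math.log(n_ac / n_a)
    return h


def xlnx(x):
    """x ln x, with the usual convention 0 ln 0 = 0."""
    return x * math.log(x) if x > 0 else 0.0


def conditional_entropy_from_xlnx(counts):
    """The same quantity written only in terms of x ln x over counts."""
    n = sum(sum(row) for row in counts)
    per_value = sum(xlnx(sum(row)) for row in counts)
    per_cell = sum(xlnx(n_ac) for row in counts for n_ac in row)
    return (per_value - per_cell) / n
```

Both functions agree up to rounding, for instance on `counts = [[3, 1], [2, 4]]`. Every count is a sum $x = x_1 + x_2$ of the two parties' local counts, and n is the same for all attributes, so comparing attributes reduces to computing and combining shares of x ln x.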

Page 34: Privacy Preserving ID3
Computing x log x:
– $x = x_1 + x_2$, where $x_1$ and $x_2$ are known to $P_1$ and $P_2$ respectively (independently computed from their databases).
– Might as well compute x ln x, or ln x.
– First run a protocol to compute random shares $y_1 + y_2 = \ln x$.
ln x is real-valued, while cryptography works over finite fields, so numerical analysis is needed.

Page 35: Cryptographic tools
Secure Function Evaluation (SFE) [Yao]
Oblivious Polynomial Evaluation [NP]:
– Input: one party holds a polynomial Q(·), the other holds x.
– Output: the party holding x learns Q(x) and nothing else; the party holding Q learns nothing.
– Implementation: two passes, O(degree) (or O(log|F|)) exponentiations.

Page 36: Computing random shares of $\ln x = \ln(x_1 + x_2)$
Use a Taylor approximation for ln x:
– $x = x_1 + x_2 = 2^n (1 + \varepsilon)$, with $-\tfrac{1}{2} < \varepsilon < \tfrac{1}{2}$
– $\ln x = \ln(2^n (1 + \varepsilon)) = \ln 2^n + \ln(1 + \varepsilon) \approx \ln 2^n + \sum_{i=1}^{k} (-1)^{i-1} \varepsilon^i / i = \ln 2^n + T(\varepsilon)$
$T(\varepsilon)$ is a polynomial of degree k. The error is exponentially small in k.
We only know how to work over finite fields:
– Compute $c \cdot \ln x$, where c compensates for fractions.
– Work in F, where |F| is sufficiently large.
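The approximation itself is easy to check numerically. The Python sketch below works in floating point and in the clear, so it illustrates only the Taylor step and not the finite-field encoding or the sharing. The function names and the default k = 20 are assumptions.

```python
import math


def split_power_of_two(x):
    """Write x = 2^n (1 + eps) with 2^n the power of two closest to x."""
    n = x.bit_length() - 1      # 2^n <= x < 2^(n+1)
    if 2 * x >= 3 * 2**n:       # x >= 1.5 * 2^n, so 2^(n+1) is closer
        n += 1
    eps = x / 2**n - 1          # eps lies in [-1/4, 1/2)
    return n, eps


def taylor_ln1p(eps, k):
    """T(eps) = sum_{i=1..k} (-1)^(i-1) eps^i / i, a degree-k polynomial."""
    return sum((-1) ** (i - 1) * eps**i / i for i in range(1, k + 1))


def approx_ln(x1, x2, k=20):
    """Approximate ln(x1 + x2) as n ln 2 + T(eps) for a positive integer sum."""
    n, eps = split_power_of_two(x1 + x2)
    return n * math.log(2) + taylor_ln1p(eps, k)
```

For `x1 = 400, x2 = 600` the sum is $1000 = 2^{10}(1 + \varepsilon)$ with $\varepsilon \approx -0.023$, and `approx_ln(400, 600)` agrees with `math.log(1000)` to well below 1e-6. Since $|\varepsilon| \le \tfrac{1}{2}$, each extra term shrinks the error bound by at least a factor of two.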

Page 37: $\ln(x_1 + x_2)$ protocol
Step 1 of the protocol:
– Find n and $\varepsilon$.
– Apply Yao's protocol to the following small circuit:
  Input: $x_1$ and $x_2$
  Output (random shares):
    random $a_1$ and $a_2$ s.t. $a_1 + a_2 = x - 2^n = \varepsilon \cdot 2^n$
    random $b_1$ and $b_2$ s.t. $b_1 + b_2 = \ln 2^n$
  Operation: the protocol finds the $2^n$ closest to $x_1 + x_2$ and computes $\varepsilon 2^n = x_1 + x_2 - 2^n$.
– $x = x_1 + x_2 = 2^n + \varepsilon 2^n$
– $\ln x = \ln(2^n (1 + \varepsilon)) = \ln 2^n + \ln(1 + \varepsilon)$

Page 38: $\ln(x_1 + x_2)$ protocol (cont.)
Step 2 of the protocol:
– Compute random shares of $T(\varepsilon)$ (the Taylor approximation).
– $P_1$ chooses a random $w_1 \in F$ and defines a polynomial Q(x) s.t. $w_1 + Q(a_2) = T(\varepsilon)$ (recall $a_1 + a_2 = \varepsilon \cdot 2^n$).
– Namely, $Q(x) = T((a_1 + x)/2^n) - w_1$.
– Run an oblivious polynomial evaluation in which $P_2$ computes $w_2 = Q(a_2) = T(\varepsilon) - w_1$.
– Now the parties have random $w_1$ and $w_2$ s.t.
– $w_1 + w_2 = T(\varepsilon) \approx \ln(1 + \varepsilon)$
– $(b_1 + w_1) + (b_2 + w_2) \approx \ln 2^n + \ln(1 + \varepsilon) = \ln x$

Page 39: Computing x ln x
Tool: Multiply($c_1$, $c_2$)
– Input: $c_1$, $c_2$ (one value held by each party)
– Output: $d_1$, $d_2$ s.t. $d_1 + d_2 = c_1 \cdot c_2$
– How? OPE of $Q(z) = c_1 \cdot z - d_1$
Actual task: x ln x
– Input: $x_1 + x_2 = x$, $c_1 + c_2 = \ln x$
– Output: shares of $x \ln x = (x_1 + x_2) \cdot (c_1 + c_2)$
– Run Multiply($x_1$, $c_2$) and Multiply($c_1$, $x_2$); the terms $x_1 c_1$ and $x_2 c_2$ are computed locally.
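The share arithmetic can be sketched in Python as follows. This is a minimal sketch under stated assumptions: values are integers modulo a prime `P_FIELD` chosen here for illustration, and `ope` is a local stand-in that simply evaluates the polynomial, whereas a real oblivious polynomial evaluation would hide Q from the receiver and z from the sender. All names are hypothetical.

```python
import secrets

P_FIELD = 2**61 - 1  # prime modulus standing in for the field F (assumed)


def ope(poly, z):
    """Stand-in for oblivious polynomial evaluation: the receiver gets poly(z)."""
    return poly(z) % P_FIELD


def multiply(c1, c2):
    """Return shares d1, d2 with d1 + d2 = c1 * c2 (mod P_FIELD)."""
    d1 = secrets.randbelow(P_FIELD)           # chosen by the party holding c1
    d2 = ope(lambda z: c1 * z - d1, c2)       # Q(z) = c1*z - d1, evaluated at c2
    return d1, d2


def product_shares(x1, c1, x2, c2):
    """Shares of (x1 + x2) * (c1 + c2); party 1 holds x1, c1, party 2 holds x2, c2."""
    u1, u2 = multiply(x1, c2)                 # u1 + u2 = x1 * c2
    v1, v2 = multiply(c1, x2)                 # v1 + v2 = c1 * x2
    s1 = (x1 * c1 + u1 + v1) % P_FIELD        # local term plus party 1's shares
    s2 = (x2 * c2 + u2 + v2) % P_FIELD        # local term plus party 2's shares
    return s1, s2
```

For any inputs, `sum(product_shares(x1, c1, x2, c2)) % P_FIELD` equals `(x1 + x2) * (c1 + c2) % P_FIELD`, while each share on its own is masked by a uniformly random value. In the lecture's setting $c_1 + c_2$ would be a scaled fixed-point encoding of ln x; that encoding is left out here.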

Page 40: The rest of the work
The parties compute shares of ln x.
Then they compute shares of x ln x.
Each party computes a share of the entropy by summing shares of x ln x ($H(X) = -\sum x \ln x$).
A small circuit finds the attribute giving the minimal conditional entropy.
The attribute is assigned to the node.
The databases are divided according to the value of this attribute.

Page 41: Efficiency
ln x protocol:
– secure computation of a small circuit
– one oblivious polynomial evaluation
ID3 for a database with:
– 1,000,000 transactions
– 15 attributes
– 10 values per attribute
– 4 class values
– Communication per node takes seconds (T1)
– Computation per node takes minutes (P3)

Page 42: Contributions
Cryptographic protocols where the bulk of the operations is done independently.
Data mining:
– Rigorous model for secure data mining.
– Efficient, secure protocols for specific problems (median, ID3).
Cryptography:
– Sub-linear complexity: secure computation for large data sets.
– Efficient protocols for complex known algorithms.
– Secure computation of logarithms (real function, numerical analysis).
Drawbacks:
– Privacy preserving solutions are less efficient.
– It's hard to find efficient private solutions for all interesting functions.
– Security against malicious parties.

Page 43: References
Lecture notes and overview papers:
– B. Pinkas, Cryptographic Techniques for Privacy-Preserving Data Mining, SIGKDD Explorations, January.
– R. Cramer, Introduction to Secure Computation.
– I. Damgård, Theory and practice of multiparty computation, 8th EWSCS.
Research papers:
– G. Aggarwal, N. Mishra and B. Pinkas, Secure Computation of the K'th-ranked Element, Eurocrypt.
– Y. Lindell and B. Pinkas, Privacy Preserving Data Mining, Journal of Cryptology, Vol. 15, No. 3.