Yan Huang, David Evans, Jonathan Katz

1 Yan Huang, David Evans, Jonathan Katz
Private Set Intersection: Are Garbled Circuits Better than Custom Protocols? Yan Huang, David Evans, Jonathan Katz. University of Virginia, University of Maryland.
Today I am going to talk about our study comparing generic secure two-party computation protocols with custom protocols. We will look at private set intersection (PSI) in detail as a case study, since it is an important and widely studied primitive.

2 Motivation --- Common Acquaintances
Private Set Intersection (PSI for short) is an important component of many interesting applications. For example, it allows two strangers to find their common acquaintances by comparing the contact information stored on their phones, without revealing much about the contacts they don't share. We've developed a proof-of-concept Android app for this, freely available through the Android Market. Besides this, PSI can also enable privacy-preserving joint database search, and can be used creatively in many other scenarios.

3 Financial Crypto 2010 EUROCRYPT 2004 CRYPTO 2005 TCC 2008
Because of its importance, the PSI problem and its variants have attracted numerous researchers. Many successful papers have been published on it in the past 7 to 8 years, of which these are just a select few. Many clever protocols have been proposed. However, as we will explain later, some issues still plague the state-of-the-art protocols.

4 e.g., Garbled Circuit Protocols
Custom Protocols: designed around specific crypto assumptions and primitives; cannot be easily composed with other secure computations; new design and security proofs need to be done for every individual scheme.
Generic Protocols (e.g., Garbled Circuit Protocols): use generic and flexible cryptographic primitives; can securely compute arbitrary functions; security proofs automatically derived from the generic proof.
This diagram summarizes the traits and issues of existing custom PSI protocols and compares them to generic secure computation protocols.
First, custom protocols often either rely on specific (and less standard) cryptographic hardness assumptions, e.g., the one-more-DL and one-more-RSA assumptions used in [DT10], or require more powerful (but also more expensive) cryptographic primitives, e.g., the homomorphic encryption scheme used in [FNP04]. In contrast, a generic approach like the GC-based protocols needs only very basic primitives such as secure symmetric encryption and OT schemes, which can be instantiated in many ways based on a variety of hardness assumptions.
Second, PSI can inherently leak a significant amount of information, so instead of being used as a standalone protocol, it is safer to use it as a component in a larger secure computation, where the input sets and the output intersection are secret intermediate values hidden from both parties. In such scenarios, it remains unclear how custom PSI protocols can be adopted, because they assume the inputs and outputs will be known to the respective parties, whereas generic PSI protocols can readily be composed with any secure computation.
Last, every custom protocol comes with its own unique design and needs to be proved secure in its own specific way. This can incur significantly more design cost than generic protocols, where the customization doesn't affect the security aspects of the protocol, so the original security proof remains valid.
With all these important benefits of generic protocols, you might worry about their performance cost. To the contrary, our work shows that, for the PSI problem, a generic protocol can be very competitive even in terms of performance.

5 Garbled Circuits & Oblivious Transfers
[Figure: a garbled circuit with an AND gate and an OR gate. Each gate is a table of four ciphertexts of the form Enc_{a,b}(x), an output-wire label x encrypted under a pair of input-wire labels a and b. Alice and Bob run an oblivious transfer protocol.]
Garbled circuits: Andrew Yao, 1982/1986. Oblivious transfer: Rabin, 1981; Even, Goldreich, and Lempel, 1985; Naor and Pinkas, 2001; Ishai et al., 2003. Free-XOR technique: Kolesnikov and Schneider, 2008.
In this talk, by generic, we refer to "garbled circuit"-based protocols. Such a protocol mainly consists of an OT protocol followed by a garbled circuit execution protocol. This technique has been shown to scale well to arbitrarily large secure computation problems in the semi-honest adversary model. Due to the time limit, I won't be able to cover the technical background, but it is covered in both this NDSS paper and our 2011 USENIX paper.
Y. Huang, D. Evans, J. Katz, L. Malka, Faster Secure Computation Using Garbled Circuits, USENIX Security 2011.
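To make the idea of a garbled gate concrete, here is a minimal textbook-style sketch in Python. It is not the implementation from the paper: the helper names (`garble_gate`, `evaluate_gate`), the use of SHA-256 as the encryption pad, and the all-zero tag used to recognize the correct row are all assumptions made for illustration, and the sketch omits oblivious transfer and optimizations such as free-XOR.

```python
import hashlib
import secrets

LABEL_BYTES = 16
TAG = b"\x00" * 16  # assumed padding that marks a correctly decrypted row


def _pad(key_a: bytes, key_b: bytes, gate_id: int) -> bytes:
    # One-time pad derived from the two input-wire labels (32 bytes).
    return hashlib.sha256(key_a + key_b + gate_id.to_bytes(4, "big")).digest()


def _xor(x: bytes, y: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(x, y))


def garble_gate(truth_table, gate_id: int = 0):
    """Return labels for wires a, b, x and a shuffled 4-row garbled table."""
    a = [secrets.token_bytes(LABEL_BYTES) for _ in range(2)]
    b = [secrets.token_bytes(LABEL_BYTES) for _ in range(2)]
    x = [secrets.token_bytes(LABEL_BYTES) for _ in range(2)]
    rows = []
    for i in (0, 1):
        for j in (0, 1):
            out_label = x[truth_table(i, j)]
            rows.append(_xor(_pad(a[i], b[j], gate_id), out_label + TAG))
    # Shuffle so that row order does not reveal the input bits.
    secrets.SystemRandom().shuffle(rows)
    return a, b, x, rows


def evaluate_gate(rows, label_a: bytes, label_b: bytes, gate_id: int = 0) -> bytes:
    """Recover the output label using exactly one label per input wire."""
    pad = _pad(label_a, label_b, gate_id)
    for row in rows:
        plain = _xor(pad, row)
        if plain[LABEL_BYTES:] == TAG:
            return plain[:LABEL_BYTES]
    raise ValueError("no row decrypted correctly")


# Garble an AND gate and evaluate it on inputs (1, 1).
a, b, x, table = garble_gate(lambda i, j: i & j)
assert evaluate_gate(table, a[1], b[1]) == x[1]
```

The evaluator learns only one random-looking output label, not the bit it stands for; in a full protocol the evaluator's own input labels would be obtained through oblivious transfer.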

6 Threat Model Semi-Honest Adversary: follows the protocol as specified, but tries to learn more from the protocol execution transcript

7 Generic PSI Protocols Overview
Protocol                          Cost in non-XOR gates   Best for
Bitwise-AND (BWA)                 2^σ                     Small element space
Pairwise-Comparison (PWC)         O(σn^2)
Sort-Compare-Shuffle-WN (SCS-WN)  O(σn log n)             Large element space
σ: the number of bits used to denote a set element. n: the size of the sets.
This table lists 3 of the 5 protocols we investigated in the paper. Note that we count only the non-XOR gates, because XOR gates can be evaluated without any communication or encryption. Compared to non-XOR gates, they are almost free.

8 Generic PSI Protocols Overview
Out of the 3 protocols listed here, we only have time to talk about the Bitwise-AND protocol and the Sort-Compare-Shuffle with Waksman Network protocol. We selected them because they are the more interesting ones. Each has its own best fit, depending on the size of the set element space.

9 PSI: Needn’t be Complex
Encode set elements as bit vectors, then apply Bitwise-AND:
[ 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0]
[ 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0]
AND AND AND . . .
Example, recessive genes: [ PAH, PKU, CF, … ]
We first look at how Bitwise-AND can be used to solve the PSI problem. Initially, each party encodes its private set into a bit vector using a straightforward position-based scheme: the i-th bit of the vector is 1 if and only if the private set contains the i-th element of the space. Then we just apply binary AND gates to the corresponding bits to find the intersection. (The length of the bit vector grows linearly with the number of elements in the space.)
Simple as the scheme is, it can still solve many interesting practical problems. For example, the recessive genes that solely determine some genetic diseases can be encoded into a bit vector. Two people dating each other could then figure out potential health problems of their future offspring without overly revealing their private genetic information.
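The position-based encoding can be sketched in a few lines of Python. This shows only the plaintext functionality; in the protocol each AND would be a garbled gate. The function names and the tiny three-element universe are hypothetical, chosen for illustration.

```python
def encode(private_set, universe):
    """Bit i is 1 exactly when the i-th element of the universe is in the set."""
    return [1 if element in private_set else 0 for element in universe]


def bitwise_and(vec_a, vec_b):
    """One AND gate per position of the two bit vectors."""
    return [bit_a & bit_b for bit_a, bit_b in zip(vec_a, vec_b)]


def decode(bits, universe):
    """Map the result vector back to the elements it represents."""
    return {element for element, bit in zip(universe, bits) if bit}


# Hypothetical universe of three gene labels, in a fixed public order.
universe = ["PAH", "PKU", "CF"]
alice = encode({"PAH", "CF"}, universe)
bob = encode({"CF"}, universe)
common = decode(bitwise_and(alice, bob), universe)
```

Both parties must agree on the order of the universe in advance, since a bit's meaning depends only on its position.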

10 What if the element space is large?
BWA Performance. What if the element space is large?
The BWA scheme is actually very efficient: we can do PSI for two sets of 16-bit elements in less than 3 seconds. The graph shows how the time cost scales with the size of the element space, which is 2^σ, so the cost grows exponentially in σ. In many other PSI settings, such as computing common contacts, we'd like to intersect elements from a much larger element space, such as hashes of the contacts, which can have as many as 160 bits. Clearly BWA doesn't work well at that scale.
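A quick calculation shows why BWA stops being practical. The sketch below assumes one AND gate per element of the space, i.e. 2^σ gates; the function name is hypothetical.

```python
def bwa_and_gates(sigma: int) -> int:
    """Number of AND gates when every element of a sigma-bit space gets one bit."""
    return 2 ** sigma


for sigma in (16, 32, 160):
    print(f"sigma={sigma:3d}: {bwa_and_gates(sigma):.3e} AND gates")
```

For 16-bit elements this is 65,536 gates, while every additional bit of element length doubles the circuit.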

11 Sort-Compare-Shuffle
Sort: take advantage of the total order of elements. Compare adjacent elements. Shuffle to hide positions.
Next, we are going to talk about our most interesting PSI protocol, which is good at dealing with set elements from large element spaces. We call this scheme sort-compare-shuffle with Waksman Network, or SCS-WN for short. The basic idea is evident from its name: we first sort all the elements so that the elements in the intersection are adjacent in the sorted sequence, and then pick them out using oblivious comparisons. Last, the elements are obliviously shuffled before they are revealed. The oblivious shuffling is indispensable; otherwise the positions of the intersected elements, which carry extra information beyond the intersection itself, would be revealed.
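The three stages can be sketched as a plaintext reference in Python. This shows only what the protocol computes, not how it is computed securely: the function name is hypothetical, Python's built-in sort and shuffle stand in for the oblivious circuits, and the sketch assumes that neither set contains duplicates and that 0 is reserved as a "no match" placeholder.

```python
import random


def scs_reference(alice_set, bob_set):
    """Plaintext stand-in for sort, compare-adjacent, shuffle."""
    # Sort: elements common to both sets end up next to each other.
    merged = sorted(list(alice_set) + list(bob_set))
    # Compare: keep a value only where two neighbours are equal, else emit 0.
    filtered = [
        merged[i] if merged[i] == merged[i + 1] else 0
        for i in range(len(merged) - 1)
    ]
    # Shuffle: hide where in the sorted order the matches occurred.
    random.shuffle(filtered)
    # After revelation, the placeholder zeros are simply discarded.
    return [value for value in filtered if value != 0]


result = scs_reference({3, 5, 8, 13}, {2, 3, 5, 7})
```

Without the shuffle, the position of each match in `filtered` would tell the receiver how many non-shared elements of the other party fall between the shared ones.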

12 Sort-Compare-Shuffle
Now let's zoom into the sorting stage. Certainly, all the secret elements could be sorted with a secure sorting procedure. However, that would be unnecessarily expensive in our case. Observe that each party can sort its own private set locally, which is almost free compared to the cost of collaborative secure computation. Therefore, the protocol only needs to merge two sorted sequences using the more expensive secure computation. We implemented the oblivious merging with a bitonic sorting network.

13 Bitonic Sorting
[Animation: a bitonic sequence of 8 numbers is merged by a network of compare-and-swap elements.] Sorting Networks and their Applications, Ken Batcher, 1968.
Here is an animated demo illustrating how a bitonic sequence of 8 numbers is merged. The vertical line segments denote comparison-based swappers. Note that the traditional linear-time merge can't be used because its efficiency relies on knowing the intermediate comparison results. In addition, the best practical sorting network requires O(n log^2 n) gates, so we actually gain something by having the parties locally sort their inputs first. To compute the merge in a privacy-preserving way, each swapper is realized as a circuit and executed using the GC technique.
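A bitonic merger can be sketched in plain Python as follows. The sequence of compare-and-swap positions is fixed in advance and never depends on the data, which is what makes it expressible as a circuit. The function names and input values are hypothetical, and the sketch assumes the total length is a power of two.

```python
def compare_swap(seq, i, j):
    """Put the smaller value at position i and the larger at position j."""
    if seq[i] > seq[j]:
        seq[i], seq[j] = seq[j], seq[i]


def bitonic_merge(seq):
    """Sort a bitonic sequence in place; return the number of swappers used."""
    n = len(seq)
    assert n & (n - 1) == 0, "length must be a power of two"
    swappers = 0
    half = n // 2
    while half >= 1:
        # Each block of size 2 * half is bitonic; split it into two halves.
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                compare_swap(seq, i, i + half)
                swappers += 1
        half //= 2
    return swappers


def merge_locally_sorted(alice_sorted, bob_sorted):
    """Ascending run followed by a descending run forms a bitonic sequence."""
    seq = list(alice_sorted) + list(reversed(bob_sorted))
    swappers = bitonic_merge(seq)
    return seq, swappers


merged, swappers = merge_locally_sorted([1, 4, 7, 9], [2, 3, 4, 5])
```

For 8 inputs the merger uses (8/2) * log2(8) = 12 swappers, arranged in 3 layers of 4.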

14 … CMP Filter CMP Filter CMP Filter
After sorting is done, the sorted elements go through a filtering layer consisting of a row of comparators. Each comparator filter outputs the input value if its two inputs are equal, and 0 otherwise. Assuming 0 is never used to encode an element, after this stage we get the intersected elements accompanied by lots of 0s.

15 CMP3 Filter CMP3 Filter CMP3 Filter
In our paper, we discuss an interesting optimization that uses 3-input comparators instead of 2-input ones. It reduces the number of output values from 2n-1 to n. The key insight is that, given any 3 consecutive numbers in the sorted sequence, at most 2 of them can be equal, since the original two private sets don't contain duplicates. Interested listeners are encouraged to read our paper for the details.

16 Can’t reveal results yet! Position leaks information.
As a second reminder, we can't reveal the filtered sequence directly, because the positions of the intersected elements convey extra information that is supposed to be kept secret. One way to deal with this is to use oblivious sorting to sort these elements and then reveal them, but this is more expensive than necessary. Our better alternative is to shuffle them in a privacy-preserving manner, using a shuffling network proposed by Abraham Waksman.

17 Journal of the ACM, January 1968
It is really an old gadget, invented in 1968 and originally proposed to implement telephone network switches efficiently.

18 Waksman Network Same circuit can generate any permutation:
Same circuit can generate any permutation: select a random permutation, and pick the swap gates accordingly.
The Waksman Network has a beautiful recursive structure: a shuffler of n numbers is implemented using 2 shufflers of n/2 numbers, plus n-1 oblivious swappers. Here we applied an interesting optimization that reduces the cost of GC execution to 1/3 of the original; please read our paper for the details. By oblivious shuffling, we mean that one party chooses a random permutation and sets the swappers' control bits accordingly, but the permuted sequence is revealed only to the second party, who can locally sort it before sending it back when necessary.
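The recursive structure gives a simple recurrence for the number of swappers, sketched below in Python. The function names are hypothetical, n is assumed to be a power of two, and the closed form n*log2(n) - n + 1 is obtained by unrolling the recurrence.

```python
def waksman_swappers(n: int) -> int:
    """Swappers in a shuffler of n inputs: two half-size shufflers plus n - 1."""
    assert n >= 1 and n & (n - 1) == 0, "n must be a power of two"
    if n == 1:
        return 0  # a single wire needs no swapper
    return 2 * waksman_swappers(n // 2) + (n - 1)


def waksman_closed_form(n: int) -> int:
    """Closed form of the recurrence for n a power of two."""
    log_n = n.bit_length() - 1
    return n * log_n - n + 1


for n in (2, 4, 8, 16):
    assert waksman_swappers(n) == waksman_closed_form(n)
```

For example, a shuffler of 8 inputs uses 2 * 5 + 7 = 17 swappers.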

19 Private Set Intersection Protocol
[Table: gates to generate and evaluate, per stage]
σ: the number of bits used to denote a set element. n: the size of the sets.
As a recap, the SCS-WN protocol has 3 main stages. We summarize the exact non-XOR gate count for each stage on the right. Overall, the asymptotic cost is O(σn log n), where n is the size of each set and σ is the number of bits used to represent an element.

20 SCS-WN Protocol Results
32-bit values.
Our experiments show that the SCS-WN protocol scales well with the set size, even when the element space is large. The red and blue lines plot our experimental timings and the theoretical projection based on the gate counts shown on the last slide. Note that the curves are actually n log n; they look linear only because both axes are plotted on a logarithmic scale.

21 Relating Performance to Security
DL key sizes: (1024, 160), (2048, 224), (3072, 256), (7680, 384), (15360, 512). Corresponding symmetric key sizes: 80, 112, 128, 192, 256.
We compared the SCS-WN protocol to the fastest existing custom PSI protocol, by De Cristofaro and Tsudik, published in 2010, which we also implemented in Java for a fair comparison. [DT10] deals with σ = 160-bit elements, but doesn't get more efficient for smaller σ. We found that our protocol is faster except at the shortest key length. Notably, in applications where 32 bits are enough to distinguish the elements, the SCS-WN protocol is faster at all key lengths.

22 Conclusion
Generic protocols offer many advantages: composability, flexibility on hardness assumptions, design cost, performance.
In conclusion, the most interesting takeaway is that generic protocols can be competitive with custom protocols while offering many advantages. They are easily composable with larger secure computations and can be based on a wider range of hardness assumptions. The design cost of generic protocols is relatively low. And finally, all of this comes without sacrificing performance. We demonstrated this using the PSI problem as a case study, and we encourage studies on other kinds of secure computation problems to supply more evidence.

23 Q & A?
