
1 Personalized Privacy Preservation: beyond k-anonymity and ℓ-diversity SIGMOD 2006 Presented By Hongwei Tian

2 Outline: What is Privacy Breaching? Drawbacks of k-Anonymity and ℓ-Diversity. Personalized Anonymity. How the Adversary Attacks. How the Data Owner Defeats the Attacks. Experiments.

3 What is Privacy Breaching? Mainly, there are two classes of definitions. The first compares prior belief and posterior belief: if prior belief < posterior belief, the release helps the adversary (e.g., belief in "Bob has cancer" rises from 50% to 80%); if prior belief > posterior belief (e.g., belief in "Bob has cancer" drops from 80% to 50%), is that really a breach? I don't think so, and many others share this view.

4 What is Privacy Breaching? The second class looks only at the posterior belief. k-Anonymity: posterior belief ≤ 1/k (every QI group contains at least k tuples). ℓ-Diversity: posterior belief = p ≤ threshold (in a QI group, a fraction p of the tuples belong to the largest sensitive-value sub-group). Personalized: posterior belief = P_breach(t) ≤ threshold, where P_breach(t) is the breach probability.
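
To make these checks concrete, here is a minimal Python sketch (mine, not the paper's) that computes the k-anonymity bound 1/k and the ℓ-diversity-style posterior belief for one QI group. The example group, the value names, and the function names are all hypothetical.

```python
# Minimal sketch of the posterior-belief bounds for a single QI group.
# The example group and all identifiers are hypothetical illustrations.
from collections import Counter

def k_anonymity_bound(group_size: int) -> float:
    """Posterior belief bound under k-anonymity: at most 1/k for a group of k tuples."""
    return 1.0 / group_size

def largest_subgroup_fraction(sensitive_values) -> float:
    """l-diversity-style posterior: the fraction of tuples carrying the most
    frequent sensitive value in the QI group."""
    counts = Counter(sensitive_values)
    return max(counts.values()) / len(sensitive_values)

group = ["flu", "flu", "cancer", "gastritis", "dyspepsia"]   # hypothetical QI group
print(k_anonymity_bound(len(group)))        # 0.2 -> 1/k with k = 5
print(largest_subgroup_fraction(group))     # 0.4 -> largest sub-group has 2 of 5 tuples
```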

5 Drawbacks of k-Anonymity and ℓ-Diversity: A k-anonymous table only prevents association between individuals and tuples. ℓ-Diversity and the personalized method both prevent association between individuals and sensitive values.

6 Drawbacks of k-Anonymity and ℓ-Diversity: A k-anonymous table may lose considerable information. ℓ-Diversity has the same problem.

7 Drawbacks of k-Anonymity and ℓ-Diversity: Consider the following situation. In one QI group, all tuples come from the same individual v, and from external datasets the adversary knows only that v is in this QI group. (Slide cartoon: Bob says, "I am unlucky to have so many diseases"; the adversary says, "Bob must be here... aha, I know Bob has four diseases.")

8 Drawbacks of k-Anonymity and ℓ-Diversity: Neither method takes personal anonymity requirements into account.

9 Personalized Anonymity: a person can specify the degree of privacy protection for her/his own sensitive values. So far, the literature has focused on a universal approach that exerts the same amount of privacy preservation for all persons, without catering to their concrete needs.

10 Personalized Anonymity

11 Breach Probability: For a tuple t ∈ T, its breach probability P_breach(t) equals the probability that an adversary can infer from T* that any of the associations {o, v_1}, ..., {o, v_x} exists in T, where v_1, ..., v_x are the leaf values in SUBTR(t.GN).

12 Personalized Anonymity (Breach Probability): Both the data owner and the adversary can compute it. The data owner wants P_breach(t) < threshold, so that the privacy of the individual corresponding to t holds. The adversary hopes to get P_breach(t) > threshold, which breaches the privacy of the individual. How does the adversary attack?

13 How the Adversary Attacks: the adversary knows that each individual has exactly one tuple (Primary Case). Possible reconstructions: P(5,4) × 3 × 3 = 1080. Breaching reconstructions: 2 × P(4,3) × 3 × 3 = 432. P_breach(t) = 432 / 1080 = 2/5.

14 How the Adversary Attacks: the adversary knows that one individual may have multiple tuples (Non-Primary Case). Possible reconstructions: 5^4 × 3 × 3 = 5625. Breaching reconstructions: 2 × 5^3 × 3 × 3 - 5^2 × 3 × 3 = 2025. P_breach(t) = 2025 / 5625 = 9/25.
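
As a sanity check on the counting above, here is a small Python snippet (mine, not from the paper) that simply reproduces the slide arithmetic; the meaning of the factors (permutations of the 5 tuples, 2 breaching assignments, two independent factors of 3 from the rest of the reconstruction) is taken from the slides.

```python
# Reproduce the reconstruction counts quoted on slides 13-14.
from math import perm

# Primary case: each individual has exactly one tuple.
possible_primary = perm(5, 4) * 3 * 3        # 1080 possible reconstructions
breaching_primary = 2 * perm(4, 3) * 3 * 3   # 432 breaching reconstructions
print(breaching_primary / possible_primary)  # 0.4 == 2/5 (equals b/n with b = 2, n = 5)

# Non-primary case: an individual may have several tuples.
possible_nonprimary = 5**4 * 3 * 3                        # 5625 possible reconstructions
breaching_nonprimary = 2 * 5**3 * 3 * 3 - 5**2 * 3 * 3    # 2025, by inclusion-exclusion
print(breaching_nonprimary / possible_nonprimary)         # 0.36 == 9/25
```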

15 How the Data Owner Defeats the Attacks: the formal computation of P_breach(t), for the Primary Case and the Non-Primary Case. [The slide shows the closed-form formulas as images, with worked examples labelled "overlap" (n = 5, b = 2) and "disjoint" (n = 2, b = 2, c = 1/3).]

16 How the Data Owner Defeats the Attacks. Utility Measure: Information Loss.
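
The slide names the metric but the formula itself appears only in the slide image. As a rough illustration, here is one common way to score the information loss of a taxonomy-based generalization, a normalized subtree-size penalty; this is an assumption of mine and not necessarily the exact metric used in the paper.

```python
# Illustrative normalized information-loss penalty for generalizing a
# categorical value to a taxonomy node: 0 for an exact leaf, 1 for the root.
# This is an assumed common formulation, not necessarily the paper's metric.

def information_loss(leaves_under_published_value: int, leaves_in_domain: int) -> float:
    """Penalty of publishing a generalized value covering
    `leaves_under_published_value` of the domain's `leaves_in_domain` leaves."""
    if leaves_in_domain <= 1:
        return 0.0
    return (leaves_under_published_value - 1) / (leaves_in_domain - 1)

print(information_loss(1, 30))   # 0.0   -> leaf value published exactly
print(information_loss(3, 30))   # ~0.07 -> generalized to a small sub-category
print(information_loss(30, 30))  # 1.0   -> suppressed all the way to the root
```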

17 How the Data Owner Defeats the Attacks: algorithm overview. [The slide shows a flow diagram: the table is split into QI groups (Group 1 ... Group N), SA-Generalization is applied to each group, and the resulting new table replaces the current one if it has more utility.]

18 How the Data Owner Defeats the Attacks: the algorithm. Start with all QI values generalized to the roots of their hierarchies and SA values generalized so that every P_breach(t) < threshold. Then split QI attributes top-down in order to increase utility. "Single split" means that in each step only one attribute is split into its direct children. SA-Generalization guarantees that in every QI group every P_breach(t) < threshold, so the whole table preserves privacy. In each iteration the algorithm must find a split that, after SA-Generalization, increases utility; otherwise it quits.
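
A minimal Python skeleton of this greedy loop, with deliberately toy stand-ins for the helpers so that it actually runs; every identifier, the single-attribute "hierarchy", and the toy utility measure are my own assumptions, not the paper's implementation.

```python
# Skeleton of the greedy top-down loop from slide 18, with toy helpers.
from typing import Dict, List

Table = List[Dict[str, object]]

def generalize_qi_to_roots(table: Table) -> Table:
    # Start: every QI attribute fully generalized (modelled here as level 0).
    return [dict(t, qi_level=0) for t in table]

def single_splits(table: Table) -> List[Table]:
    # "Single split": refine exactly one QI attribute by one level.
    # Toy model: a single attribute with a 3-level hierarchy.
    level = table[0]["qi_level"] if table else 0
    return [] if level >= 2 else [[dict(t, qi_level=level + 1) for t in table]]

def sa_generalize(table: Table, threshold: float) -> Table:
    # Placeholder: the real step regroups tuples by QI values and generalizes
    # SA values until every P_breach(t) < threshold (see slide 19).
    return table

def utility(table: Table) -> float:
    # Toy measure: more specific QI values mean more utility.
    return sum(t["qi_level"] for t in table)

def anonymize(table: Table, threshold: float = 0.25) -> Table:
    current = sa_generalize(generalize_qi_to_roots(table), threshold)
    while True:
        candidates = [sa_generalize(c, threshold) for c in single_splits(current)]
        best = max(candidates, key=utility, default=None)
        if best is None or utility(best) <= utility(current):
            return current          # no split improves utility -> stop
        current = best              # keep the best single split and iterate

print(anonymize([{"Age": 23, "Disease": "flu"}]))
```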

19 How the Data Owner Defeats the Attacks: the algorithm (SA-Generalization). Generalize SA values bottom-up to improve privacy. If the tuples in S_prob satisfy the privacy requirement, then all tuples in G satisfy it. The SA values of all tuples that violate the privacy requirement are generalized to the parent of the guarding node that is closest to the root. The procedure finishes when no tuple in S_prob violates the privacy requirement, or when no further generalization is possible.
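
A toy, self-contained sketch of this bottom-up step, assuming a tiny hypothetical taxonomy and a naive stand-in for the breach probability (a simple published-value frequency, not the paper's formula); all names and the lifting schedule after the first step are my own simplifications.

```python
# Toy sketch of bottom-up SA-Generalization (slide 19). Taxonomy, breach
# estimate, and identifiers are illustrative assumptions, not the paper's code.

# Child -> parent links of a tiny sensitive-value taxonomy ("any" is the root).
PARENT = {
    "flu": "respiratory", "pneumonia": "respiratory",
    "gastritis": "digestive", "dyspepsia": "digestive",
    "respiratory": "any", "digestive": "any",
}

def steps_to_root(node: str) -> int:
    n = 0
    while node in PARENT:
        node, n = PARENT[node], n + 1
    return n

def in_subtree(value: str, ancestor: str) -> bool:
    while True:
        if value == ancestor:
            return True
        if value not in PARENT:
            return False
        value = PARENT[value]

def naive_breach(t, group) -> float:
    """Illustrative stand-in for P_breach: fraction of published SA values
    that fall inside SUBTR(t.GN). Not the paper's formula."""
    hits = sum(1 for u in group if in_subtree(u["published_sa"], t["guarding_node"]))
    return hits / len(group)

def sa_generalize_group(group, threshold, breach=naive_breach):
    """While some tuple violates its requirement, lift the violating tuples'
    SA values: first to the parent of the guarding node closest to the root
    (per slide 19), then one level further per round until the root."""
    violating = [t for t in group if breach(t, group) >= threshold]
    if not violating:
        return group
    target = min((t["guarding_node"] for t in violating), key=steps_to_root)
    node = PARENT.get(target, target)      # start at its parent (or stay at the root)
    while True:
        for t in violating:
            t["published_sa"] = node
        violating = [t for t in group if breach(t, group) >= threshold]
        if not violating or node not in PARENT:
            return group                   # requirement met, or nothing left to lift
        node = PARENT[node]                # lift one more level toward the root

# Hypothetical QI group: Bob guards "respiratory", others guard their own leaf value.
group = [
    {"published_sa": "flu",       "guarding_node": "respiratory"},
    {"published_sa": "pneumonia", "guarding_node": "pneumonia"},
    {"published_sa": "gastritis", "guarding_node": "gastritis"},
    {"published_sa": "dyspepsia", "guarding_node": "dyspepsia"},
]
print(sa_generalize_group(group, threshold=0.5))
```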

20 Experiments: Adult dataset (http://www.ipums.org); 5 QI attributes, 1 SA attribute; settings Pri-leaf, Pri-mixed, Nonpri-leaf, Nonpri-mixed; breaching threshold = 0.25; all attribute weights = 1.

21 Experiments Breaching Probability

22 References
Xiaokui Xiao and Yufei Tao. Personalized Privacy Preservation. In SIGMOD, 2006.
A. Machanavajjhala, J. Gehrke, and D. Kifer. ℓ-Diversity: Privacy Beyond k-Anonymity. In ICDE, 2006.
K. LeFevre, D. J. DeWitt, and R. Ramakrishnan. Incognito: Efficient Full-Domain k-Anonymity. In SIGMOD, 2005.
A. Evfimievski, J. Gehrke, and R. Srikant. Limiting Privacy Breaches in Privacy Preserving Data Mining. In PODS, 2003.

