Xiaowei Ying Xintao Wu Univ. of North Carolina at Charlotte 2009 SIAM Conference on Data Mining, May 1, Sparks, Nevada Graph Generation with Prescribed.


1 Graph Generation with Prescribed Feature Constraints
Xiaowei Ying, Xintao Wu
Univ. of North Carolina at Charlotte
2009 SIAM Conference on Data Mining, May 1, Sparks, Nevada

2 Framework
Motivation
-- Generate graphs for publishing social networks
-- Generate graphs for testing data mining results
Generator with feature range constraint
-- Privacy risks introduced by feature constraints
Generator with feature distribution constraint

3 Motivation
Publishing social networks: privacy vs. utility
Privacy issue: anonymization is not enough
-- Active/passive attacks [Backstrom et al., WWW07]
-- Subgraph attacks [M. Hay et al., VLDB08]
K-anonymity in social networks [B. Zhou et al., ICDE08] [K. Liu et al., SIGMOD08]
Randomization approach
-- Local topology is changed, which reduces re-identification risk
-- Links are randomized, so link privacy is protected

4 Motivation
Publishing social networks: privacy vs. utility
Randomization approach
-- Pure randomization cannot preserve many topological features. [Ying SDM08]
-- the largest eigenvalue of the adjacency matrix
-- the second smallest eigenvalue of the Laplacian matrix
-- harmonic mean of shortest distance
-- transitivity
How can we generate graphs that preserve data utility?
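Two of the features listed above can be computed directly from an edge list. The sketch below is illustrative and is not the authors' code. It assumes transitivity is 3 x triangles / connected triples, and that the harmonic mean of shortest distance is the inverse of the average of 1/d(i,j) over all node pairs (unreachable pairs contribute 0). The function names are hypothetical. The two eigenvalue features need a linear algebra library and are left out.

```python
from collections import deque

def adjacency(n, edges):
    """Neighbor sets for an undirected graph on nodes 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def transitivity(n, edges):
    """Assumed definition: 3 * triangles / connected triples."""
    adj = adjacency(n, edges)
    # Summing common neighbors over all edges counts each triangle three times.
    closed = sum(len(adj[u] & adj[v]) for u, v in edges)
    triples = sum(len(a) * (len(a) - 1) // 2 for a in adj)
    return closed / triples if triples else 0.0

def harmonic_mean_distance(n, edges):
    """Assumed definition: (mean over node pairs of 1/d(i,j)) ** -1."""
    adj = adjacency(n, edges)
    inv_sum = 0.0
    for s in range(n):
        # Breadth-first search gives shortest distances from s.
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        # Count each unordered pair once (t > s).
        inv_sum += sum(1.0 / d for t, d in dist.items() if t > s)
    pairs = n * (n - 1) // 2
    return pairs / inv_sum if inv_sum else float("inf")

triangle = [(0, 1), (1, 2), (0, 2)]
path = [(0, 1), (1, 2)]
```

On the three-node path, the pair distances are 1, 1 and 2, so the harmonic mean is 3 / 2.5 = 1.2, while the triangle gives 1.0 for both features.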

5 Motivation
Generate graphs for testing data mining results
-- Generate a set of graph samples such that a feature of the samples follows a specified distribution.

6 Framework
Motivation
-- Generate graphs for publishing social networks
-- Generate graphs for testing data mining results
Generator with feature range constraint
-- Privacy risks introduced by feature constraints
Generator with feature distribution constraint

7 Switch and Uniform Graph Generator
Uniform switch procedure [Taylor, 1981]
-- Preserves the degree sequence/distribution
1. Accessibility: every graph with the given degree sequence can be reached
2. Uniformity: all such graphs are generated with the same probability
3. Application: empirically learning the properties of graph features given the degree sequence
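A degree-preserving switch can be sketched as follows. This is a minimal illustration, not the authors' implementation: it picks two edges (a,b) and (c,d) at random and rewires them to (a,d) and (c,b), skipping any attempt that would create a self-loop or a duplicate edge. The names `norm`, `try_switch` and `degree_sequence` are hypothetical.

```python
import random

def norm(u, v):
    """Store an undirected edge as an ordered tuple."""
    return (u, v) if u < v else (v, u)

def try_switch(edges, rng):
    """Attempt one switch in place; return True if the graph changed."""
    (a, b), (c, d) = rng.sample(sorted(edges), 2)
    # Randomly orient the second edge so both rewirings can be proposed.
    if rng.random() < 0.5:
        c, d = d, c
    if len({a, b, c, d}) < 4:
        return False  # would create a self-loop
    new1, new2 = norm(a, d), norm(c, b)
    if new1 in edges or new2 in edges:
        return False  # would create a duplicate edge
    edges.difference_update({norm(a, b), norm(c, d)})
    edges.update({new1, new2})
    return True

def degree_sequence(n, edges):
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

rng = random.Random(0)
ring = {norm(i, (i + 1) % 8) for i in range(8)}  # 8-node cycle, all degrees 2
before = degree_sequence(8, ring)
graph = set(ring)
changed = sum(try_switch(graph, rng) for _ in range(200))
after = degree_sequence(8, graph)
```

Each successful switch removes one edge at every endpoint and adds one back, so `before` and `after` are identical no matter how many switches are applied.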

8 Graph Generator with FRC
How to generate a graph:
1. with the given degree sequence
2. with the feature range constraint (FRC): feature S within a range R
-- Uniformity for accessible graphs
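One way to combine the switch with a feature range constraint is to accept a switch only when the resulting graph still satisfies the constraint. The sketch below assumes that approach; the slide does not spell out the procedure. The triangle count stands in for the feature S, and the names `propose_switch`, `triangle_count` and `generate_with_frc` are hypothetical.

```python
import random

def norm(u, v):
    return (u, v) if u < v else (v, u)

def propose_switch(edges, rng):
    """Return a switched copy of the edge set, or None if the switch is invalid."""
    (a, b), (c, d) = rng.sample(sorted(edges), 2)
    if rng.random() < 0.5:
        c, d = d, c
    new1, new2 = norm(a, d), norm(c, b)
    if len({a, b, c, d}) < 4 or new1 in edges or new2 in edges:
        return None
    return (edges - {norm(a, b), norm(c, d)}) | {new1, new2}

def triangle_count(edges):
    """Stand-in feature S: number of triangles in the graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Each triangle is counted once per edge, i.e. three times.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3

def generate_with_frc(edges, feature, low, high, steps, rng):
    """Random walk over graphs with a fixed degree sequence and low <= S <= high."""
    current = set(edges)
    if not low <= feature(current) <= high:
        raise ValueError("starting graph violates the feature range constraint")
    for _ in range(steps):
        candidate = propose_switch(current, rng)
        # A rejected proposal leaves the current graph unchanged for this step.
        if candidate is not None and low <= feature(candidate) <= high:
            current = candidate
    return current

rng = random.Random(1)
ring = {norm(i, (i + 1) % 10) for i in range(10)}
sample = generate_with_frc(ring, triangle_count, 0, 1, 500, rng)
```

The walk can only reach graphs connected to the starting graph through constraint-satisfying switches, which is why the slide restricts the uniformity claim to accessible graphs.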

9 Framework
Motivation
-- Generate graphs for publishing social networks
-- Generate graphs for testing data mining results
Generator with feature range constraint
-- Privacy risks introduced by feature constraints
Generator with feature distribution constraint

10 Graph Generator and Privacy Issues
Privacy risks introduced by FRC
Attackers know:
1. The released graph preserves the true degree sequence
2. The true graph has its feature S within range R
What can attackers do?
-- With the released graph, attackers can explore the graph space

11 Graph Generator and Privacy Issues
Graph space: {G : G has the given degree sequence and satisfies the FRC}
Uniformly sample the graph space
Attacker's confidence on link (i,j)
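A natural way to estimate the attacker's confidence on a link is the fraction of sampled graphs that contain it. The sketch below assumes that estimator; the slide's own formula is not preserved in this transcript. The four sample graphs are hand-written placeholders, not output of a real sampler, and `link_confidence` is a hypothetical name.

```python
from collections import Counter

def link_confidence(samples):
    """Map each node pair (i, j), i < j, to the fraction of samples containing it."""
    counts = Counter()
    for edges in samples:
        counts.update(edges)  # each edge set contributes at most 1 per pair
    total = len(samples)
    return {pair: c / total for pair, c in counts.items()}

# Placeholder samples standing in for graphs drawn from the graph space.
samples = [
    {(0, 1), (1, 2), (2, 3)},
    {(0, 1), (1, 3), (2, 3)},
    {(0, 1), (0, 2), (2, 3)},
    {(0, 1), (1, 2), (0, 3)},
]
conf = link_confidence(samples)
```

Here the pair (0,1) appears in every sample and gets confidence 1.0, (2,3) gets 0.75, and a pair that never appears is simply absent from the result (use `conf.get(pair, 0.0)`).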

12 FRC Can Jeopardize Privacy -- A real network example
Network of US political books (105 nodes, 441 edges)
Books about US politics sold by Amazon.com. Edges represent frequent co-purchasing of books by the same buyers. Nodes have been given colors of blue, white, or red to indicate whether they are "liberal", "neutral", or "conservative".
http://www-personal.umich.edu/~mejn/netdata/

13 FRC Can Jeopardize Privacy -- A real network example
Polbook network: 105 nodes, 441 edges
The attacker simply takes the t node pairs with the highest probabilities as candidate links.
-- Some features jeopardize privacy, and others do not
-- Top candidates can seriously jeopardize privacy!
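The top-t attack described above can be sketched in a few lines. The confidence values and true edges below are made-up placeholders, and scoring the attack by the fraction of candidates that are true links is an assumption for illustration; the slide does not name its measure.

```python
def top_candidates(confidence, t):
    """Return the t node pairs with the highest confidence (ties broken by pair)."""
    ranked = sorted(confidence.items(), key=lambda kv: (-kv[1], kv[0]))
    return [pair for pair, _ in ranked[:t]]

def precision(candidates, true_edges):
    """Fraction of candidate links that are real links in the true graph."""
    if not candidates:
        return 0.0
    hits = sum(1 for pair in candidates if pair in true_edges)
    return hits / len(candidates)

# Placeholder inputs, not data from the talk.
confidence = {(0, 1): 0.9, (1, 2): 0.2, (0, 2): 0.7, (2, 3): 0.4}
true_edges = {(0, 1), (2, 3)}
picked = top_candidates(confidence, 2)
score = precision(picked, true_edges)
```

With these placeholder numbers the attacker picks (0,1) and (0,2); one of the two is a true link, so the score is 0.5.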

14 FRC Can Jeopardize Privacy -- More real network examples
Polbook network: 105 nodes, 441 edges
Enron email network: 151 nodes, 869 edges

15 FRC Can Jeopardize Privacy -- A theoretical result

16 FRC Can Jeopardize Privacy -- A theoretical result
Conclusion: If the FRC specifies a sub-space close to the true graph, privacy is seriously breached.

17 Framework
Motivation
-- Generate graphs for publishing social networks
-- Generate graphs for testing data mining results
Generator with feature range constraint
-- Privacy risks introduced by feature constraints
Generator with feature distribution constraint

18 Graph Generator with FDC
Feature Distribution Constraint (FDC)
Uniform generator: gives the natural distribution f(x) of feature S, which is highly skewed in the range
How to generate graphs with the given degree sequence such that the feature value follows the target distribution g(x)?
Natural distribution: f(x)
Target distribution: g(x)

19 Graph Generator with FDC
Based on the Metropolis-Hastings method
-- The acceptance ratio depends on the target distribution g(x) and the natural distribution f(x)
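The role of g(x) and f(x) in the acceptance ratio can be illustrated on a toy state space. The sketch below is an assumption-laden stand-in, not the authors' generator: bit strings of length 8 play the role of graphs, flipping one bit plays the role of a switch (a symmetric proposal), and the number of ones plays the role of the feature S. Uniform sampling of strings gives a binomial natural distribution f(x). Weighting each state by g(x)/f(x) makes the feature's marginal proportional to f(x) * g(x)/f(x) = g(x), which leads to the acceptance probability min(1, g(x') f(x) / (g(x) f(x'))).

```python
import math
import random
from collections import Counter

N = 8  # toy state: bit string of length N; feature x = number of ones

def f(x):
    """Natural distribution of x under uniform sampling of bit strings."""
    return math.comb(N, x) / 2 ** N

def g(x):
    """Target distribution of x: uniform over 0..N (chosen for illustration)."""
    return 1.0 / (N + 1)

def accept_probability(x_old, x_new, g, f):
    """Metropolis-Hastings acceptance for a symmetric proposal with weight g/f."""
    return min(1.0, (g(x_new) * f(x_old)) / (g(x_old) * f(x_new)))

def sample_features(steps, rng):
    """Run the chain and return the empirical distribution of the feature."""
    state = [0] * N
    x = 0
    hist = Counter()
    for _ in range(steps):
        i = rng.randrange(N)  # propose flipping one bit
        x_new = x + (1 if state[i] == 0 else -1)
        if rng.random() < accept_probability(x, x_new, g, f):
            state[i] ^= 1
            x = x_new
        hist[x] += 1  # a rejected proposal repeats the current state
    return {k: hist[k] / steps for k in range(N + 1)}

freq = sample_features(200_000, random.Random(0))
```

Under plain uniform sampling the value x = 0 would appear with probability 1/256; with the g/f weighting each of the nine feature values should appear roughly equally often. In the graph setting f(x) is generally not known in closed form and would have to be estimated, for example from samples of the uniform generator.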

20 Graph Generator with FDC
Evaluation: target distribution vs. natural distribution

21 Summary
Graph generator with feature range constraint
-- Attackers can sample the graph space near the true graph and breach privacy.
Graph generator with feature distribution constraint
-- Generates a set of graph samples for statistical testing

22 Questions?
Acknowledgments: This work was supported in part by U.S. National Science Foundation grants IIS-0546027 and CNS-0831204.
Thank you!

23 Graph Generator and Privacy Issues
Example: graphs with degree sequence {3,2,2,2,3}. Are nodes 1 and 5 connected?
Figures: true graph, published graph
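For a graph this small, the question can be answered by brute force. The sketch below enumerates every labeled simple graph with degree sequence {3,2,2,2,3} and counts how many contain the link between the first and last node. It uses 0-based labels, so nodes 1 and 5 become 0 and 4. It considers the degree constraint only, with no feature constraint, and the function name is hypothetical.

```python
from itertools import combinations

def graphs_with_degrees(deg):
    """All labeled simple graphs whose node degrees equal deg exactly."""
    n = len(deg)
    pairs = list(combinations(range(n), 2))
    m = sum(deg) // 2  # number of edges implied by the degree sequence
    result = []
    for edges in combinations(pairs, m):
        count = [0] * n
        for u, v in edges:
            count[u] += 1
            count[v] += 1
        if count == list(deg):
            result.append(set(edges))
    return result

graphs = graphs_with_degrees([3, 2, 2, 2, 3])
with_link = sum(1 for g in graphs if (0, 4) in g)
confidence = with_link / len(graphs)
```

There are 7 such graphs and 6 of them contain the link, so an attacker who knows only the degree sequence already has confidence 6/7 that nodes 1 and 5 are connected. The one exception is the graph in which both degree-3 nodes are joined to all three degree-2 nodes.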

24 Graph Generator with FDC
Problem of the generator with FRC:
-- Uniform generator: gives the natural distribution of feature S, which is highly skewed in the range, so it generates biased feature values
Figure labels: real-world graph, range

