Privacy Preserving Subgraph Matching on Large Graphs in Cloud


Zhao Chang*, Lei Zou†, Feifei Li*
*University of Utah, USA; †Peking University, China

Outline: Background; K-Automorphism; Subgraph Matching; Experimental Results.

Background. Subgraph matching on large graphs. Problem: high computational complexity. Solution: cloud computing. Data privacy. Problem: untrusted cloud. Solution: construct a k-automorphic graph and anonymize query graphs.

Background: Privacy Concerns. [Figures: original graph G; naïve anonymized graph G’.] Naïve anonymization does not stop an attacker who has background knowledge about the original graph G: "I know Tom and his wife graduated from the same university and work at the same company…"

K-Automorphism (K = 2). [Figures: original graph G and original query Q; k-automorphic graph Gk; outsourced query Qo.] Supporting structures: Label Correspondence Table (LCT), Alignment Vertex Table (AVT), automorphic function F1.
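
As a rough illustration of how an automorphic function relates to an AVT, the Python sketch below assumes that F1 sends each vertex to the next entry in its AVT row (wrapping around at the end of the row) and checks that F1 maps the edge set onto itself. The 4-vertex graph, the AVT rows, and the helper names are made up for this example and do not come from the slides; label checks are left out.

```python
def automorphic_function(avt):
    """Build F1 from AVT rows: each vertex maps to the next vertex in its row."""
    f = {}
    for row in avt:
        for i, v in enumerate(row):
            f[v] = row[(i + 1) % len(row)]  # wrap around at the end of the row
    return f


def is_automorphism(edges, f):
    """Return True if applying f to every edge reproduces the same edge set."""
    edge_set = {frozenset(e) for e in edges}
    mapped = {frozenset((f[u], f[v])) for u, v in edges}
    return mapped == edge_set


# Hypothetical toy data (K = 2): two AVT rows over four vertices.
avt = [(1, 3), (2, 4)]
edges = [(1, 2), (3, 4), (1, 3), (2, 4)]

f1 = automorphic_function(avt)
print(f1)
print(is_automorphism(edges, f1))
```

In this toy case no vertex is mapped to itself, so every vertex has one structurally identical counterpart, which is the property K = 2 is meant to provide.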

K-Automorphism: Details of Block Alignment. One more example, on a graph with vertices 1 to 10. [Figure: the example graph, its partition into blocks, and the aligned blocks.] Steps: graph partition, then graph alignment. Resulting Alignment Vertex Table (AVT) rows: (1, 7), (2, 9), (3, 8), (4, 10), (5, 6). Finding the optimal block alignment is an NP-hard problem (even in the case of no attributes).

K-Automorphism: Heuristic Block Alignment. Alignment becomes more complicated when attributes are considered. [Figure: the two blocks of the example graph, vertices 1 to 10, with vertex labels a, b, c, d.] Heuristic: start from the vertices that have the same label and the largest equal degree, run a BF-search, and pair up vertices with the same label and/or the same (or similar) degrees. Resulting Alignment Vertex Table (AVT) rows: (1, 7), (2, 6), (3, 9), (4, 8), (5, 10).
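
The heuristic can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the function name `align_blocks`, the mismatch score (label difference first, then degree difference), and the toy graph are all hypothetical. The sketch only produces the pairing; it does not show how mismatched labels or edges are reconciled afterwards.

```python
from collections import deque


def align_blocks(adj, labels, block_a, block_b):
    """Greedily pair each vertex of block_a with a vertex of block_b."""

    def degree(v):
        return len(adj[v])

    def mismatch(u, v):
        # Label disagreement dominates; degree difference breaks ties.
        return (labels[u] != labels[v], abs(degree(u) - degree(v)))

    # Seed pair: best-matching vertices, preferring the largest degree.
    seed = min(
        ((u, v) for u in sorted(block_a) for v in sorted(block_b)),
        key=lambda p: (mismatch(*p), -degree(p[0])),
    )
    pairs = {seed[0]: seed[1]}
    used = {seed[1]}
    queue = deque([seed])

    # Breadth-first search in both blocks at once, pairing neighbours.
    while queue:
        u, v = queue.popleft()
        for nu in sorted(adj[u]):
            if nu not in block_a or nu in pairs:
                continue
            free = [nv for nv in sorted(adj[v]) if nv in block_b and nv not in used]
            if not free:
                continue
            nv = min(free, key=lambda c: mismatch(nu, c))
            pairs[nu] = nv
            used.add(nv)
            queue.append((nu, nv))

    # Vertices the search never reached are paired with the best leftover.
    leftovers = [v for v in sorted(block_b) if v not in used]
    for u in sorted(block_a):
        if u not in pairs and leftovers:
            v = min(leftovers, key=lambda c: mismatch(u, c))
            leftovers.remove(v)
            pairs[u] = v
    return pairs


# Hypothetical toy graph: two labelled paths, 1-2-3 and 4-5-6.
adj = {1: {2}, 2: {1, 3}, 3: {2}, 4: {5}, 5: {4, 6}, 6: {5}}
labels = {1: "a", 2: "b", 3: "c", 4: "a", 5: "b", 6: "c"}
alignment = align_blocks(adj, labels, {1, 2, 3}, {4, 5, 6})
print(sorted(alignment.items()))
```

Each key-value pair of the returned dictionary plays the role of one AVT row for K = 2.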

K-Automorphism: Optimization 1. Upload the outsourced graph Go rather than Gk to the cloud, which saves cloud storage. [Figures: outsourced graph Go (K = 2); outsourced query Qo; Alignment Vertex Table (AVT).]

Subgraph Matching: Star Matching. Query decomposition: the outsourced query Qo is decomposed into stars S1 and S2. Star matching: each star is matched on the outsourced graph Go. Let |R(Si)| denote the number of matches of a star Si on Go.
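
To make |R(Si)| concrete, here is a small Python sketch that enumerates the matches of one star on a labelled graph. It assumes a star is a root label plus a list of leaf labels, and that a match assigns the root to a data vertex and the leaves to distinct neighbours with equal labels. The function name and the toy graph are hypothetical, and the generalized labels of the outsourced graph are not modelled.

```python
from itertools import permutations


def star_matches(adj, labels, root_label, leaf_labels):
    """List (root, leaves) assignments that match the star in the graph."""
    matches = []
    for v in sorted(adj):
        if labels[v] != root_label:
            continue
        # Try every ordered choice of distinct neighbours as the leaves.
        for leaves in permutations(sorted(adj[v]), len(leaf_labels)):
            if all(labels[n] == lab for n, lab in zip(leaves, leaf_labels)):
                matches.append((v, leaves))
    return matches


# Hypothetical toy graph: vertex 1 (label a) with neighbours 2 (b), 3 (b), 4 (c).
adj = {1: {2, 3, 4}, 2: {1}, 3: {1}, 4: {1}}
labels = {1: "a", 2: "b", 3: "b", 4: "c"}

r_s1 = star_matches(adj, labels, "a", ["b", "c"])
print(len(r_s1), r_s1)
```

Here `len(r_s1)` plays the role of |R(S1)|. Enumerating permutations is only workable for small stars and low-degree vertices.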

Subgraph Matching: Query Decomposition. [Figure: outsourced query Qo decomposed into stars S1 and S2.] Given a set of stars {S1, S2, …, Sn}, the cost of the query decomposition is defined as: [formula not preserved in this transcript]. Suppose each |R(Si)| has been given. Finding the query decomposition with the minimum cost is an NP-hard problem. Solution: formulate it as an integer linear programming (ILP) problem and use an available ILP tool to solve it.
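
The cost formula is not legible in this transcript, so the following Python sketch assumes, purely for illustration, that the cost of a decomposition is the sum of |R(Si)| over the chosen stars, and that choosing a query vertex as a root yields the star formed by that vertex and all its incident edges. It also swaps the ILP solver mentioned on the slide for a brute-force search, which is only workable for small queries such as those with 4 to 12 edges. All names and numbers are hypothetical.

```python
from itertools import combinations


def best_star_roots(edges, est_matches):
    """Pick star roots that cover every query edge at minimum assumed cost.

    est_matches[v] is the given |R(S)| for the star rooted at query vertex v.
    Assumed cost of a decomposition: the sum of est_matches over its roots.
    """
    vertices = sorted(est_matches)
    best, best_cost = None, float("inf")
    for size in range(1, len(vertices) + 1):
        for roots in combinations(vertices, size):
            chosen = set(roots)
            # Every query edge must touch at least one chosen root.
            if all(u in chosen or v in chosen for u, v in edges):
                cost = sum(est_matches[r] for r in roots)
                if cost < best_cost:
                    best, best_cost = roots, cost
    return best, best_cost


# Hypothetical path query 1-2-3-4 with made-up match counts per root.
q_edges = [(1, 2), (2, 3), (3, 4)]
est = {1: 5, 2: 100, 3: 7, 4: 50}

roots, cost = best_star_roots(q_edges, est)
print(roots, cost)
```

Under the assumed cost this is a minimum-weight vertex cover of the query graph, which is consistent with the NP-hardness noted on the slide.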

Subgraph Matching: Optimization 2. Estimate the search space of matching a star query (|R(Si)|). Benefits: better label generalization, better query decomposition, and lower query time cost.

Subgraph Matching: Result Join. [Figures: Alignment Vertex Table (AVT); automorphic function F1.]

Subgraph Matching: Optimization 3. Generate Rin rather than all the results, which saves query time cost and communication cost. [Figures: Alignment Vertex Table (AVT); automorphic function F1.]
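
The transcript does not define Rin. The Python sketch below assumes it is a subset of the matches from which the remaining ones follow by applying the automorphic function, which relies on the fact that the image of a match under a graph automorphism is again a match. The function name, the tuple encoding of a match, and the toy values are hypothetical.

```python
def expand_matches(r_in, f, k):
    """Apply the automorphic function up to k - 1 times to every match.

    Each match is a tuple of data vertices, one per query vertex.
    """
    results = set()
    for match in r_in:
        current = match
        for _ in range(k):
            results.add(current)
            current = tuple(f[v] for v in current)  # move to the next copy
    return results


# Hypothetical F1 for K = 2, pairing vertex 1 with 3 and vertex 2 with 4.
f1 = {1: 3, 3: 1, 2: 4, 4: 2}
r_in = {(1, 2)}

all_matches = expand_matches(r_in, f1, k=2)
print(sorted(all_matches))
```

Only `r_in` needs to be computed and sent in this sketch; the other matches are derived from it with the AVT-based function.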

Subgraph Matching: Client Filtering. Two steps: generating candidate results, then filtering. [Figures: original query Q; matching result.]
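
A minimal Python sketch of the filtering step, assuming the client holds the original graph G and the original query Q, and that a candidate is kept only if every query label and every query edge is present in G under the candidate mapping. The assumption that candidates can contain vertices, labels, or edges that do not exist in G (because of anonymization) is not spelled out on the slide. All names and data are hypothetical.

```python
def filter_candidates(candidates, q_edges, q_labels, g_edges, g_labels):
    """Keep the candidate matches that are valid on the original graph."""
    g_edge_set = {frozenset(e) for e in g_edges}
    kept = []
    for match in candidates:  # match: query vertex -> data vertex
        labels_ok = all(
            g_labels.get(match[q]) == q_labels[q] for q in match
        )
        edges_ok = all(
            frozenset((match[u], match[v])) in g_edge_set for u, v in q_edges
        )
        if labels_ok and edges_ok:
            kept.append(match)
    return kept


# Hypothetical original graph G: a single edge between vertex 1 (a) and 2 (b).
g_edges = [(1, 2)]
g_labels = {1: "a", 2: "b", 3: "a", 4: "b"}

# Hypothetical original query Q: one edge between labels a and b.
q_edges = [("x", "y")]
q_labels = {"x": "a", "y": "b"}

candidates = [{"x": 1, "y": 2}, {"x": 3, "y": 4}]
final = filter_candidates(candidates, q_edges, q_labels, g_edges, g_labels)
print(final)
```

The second candidate is dropped because the edge between vertices 3 and 4 does not exist in the toy original graph.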

Experimental Results: Setup. Server: a Windows server with 64 GB main memory and four CPU cores. Client: a Windows PC with 8 GB main memory and an Intel Core 2 Duo CPU. Original data graphs: 10^5 to 10^7 vertices and 10^6 to 10^8 edges; the ratio of the number of vertices to the number of different labels is around 10^3. Original query graphs: 4 to 12 edges.

Experimental Results. [Figures: result plots, not preserved in this transcript.]

Conclusions: (1) Present an efficient framework. (2) Protect both structural and label privacy. (3) Explore a number of optimization techniques and leverage an effective cost model. (4) Perform extensive experiments on large real graphs.

Thanks!