Subgraph Search Over Uncertain Graphs Erşan Demircioğlu.


Motivation – 1/2
A graph structure effectively models a set of items and the relations between them, so it has many application areas, such as modeling:
– Chemical compounds
– Social networks
– Network traffic

Motivation – 2/2
Data is generally modeled in order to extract information from it.
– In a graph model, frequent or queried subgraphs are of interest.
For example:
– In bioinformatics, a known gene pattern is queried to determine whether an organism is pathogenic.
– In social networks, frequent grouping structures could be of interest.

Formal Definition of Graph

Subgraph and Supergraph Isomorphism

Uncertain Graphs
In an ordinary graph, the relations between vertices are certain: two vertices are either connected or not. In real-life applications, however, these relations are often uncertain.
– In bioinformatics, we do not know the gene pattern exactly, but we know the probability of a connection between proteins.
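
A minimal sketch of one way to represent such a graph, assuming each edge exists independently with a known probability. The independence assumption, the protein names, and the helper `world_probability` are illustrative and do not come from the slides.

```python
from math import prod

# Hypothetical uncertain graph: each undirected edge maps to its existence probability.
uncertain_edges = {
    ("p1", "p2"): 0.9,
    ("p2", "p3"): 0.5,
    ("p1", "p3"): 0.2,
}

def world_probability(uncertain_edges, present):
    """Probability that exactly the edges in `present` exist.

    Assumes edges exist independently of each other.
    """
    present = set(present)
    unknown = present - set(uncertain_edges)
    if unknown:
        raise ValueError(f"edges not in the uncertain graph: {unknown}")
    # Multiply p for every present edge and (1 - p) for every absent edge.
    return prod(p if e in present else 1.0 - p for e, p in uncertain_edges.items())

# Exact graph with p1-p2 and p2-p3 present and p1-p3 absent: 0.9 * 0.5 * (1 - 0.2).
p = world_probability(uncertain_edges, [("p1", "p2"), ("p2", "p3")])
```

Under this model, an ordinary (certain) graph is the special case in which every probability is 1.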

Formal Definition of Uncertain Graph

Subgraph Search over Graphs

Scope of the Presentation
Because verifying subgraph isomorphism is an NP-complete problem, we want to deal with as few candidate graphs as possible. Most studies therefore focus on pruning graphs that are not of interest. Within the scope of this study, two pruning strategies are investigated.

Method Proposed by Chen
This study proposes:
– A novel feature structure
– An effective pruning method

Node Projected Vector – 1/4

Node Projected Vector – 2/4
Then the dimensions of the graph are extracted.
– A dimension is generated from an NNT and represents an edge between two nodes together with its level in the NNT.
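
The following is a rough sketch of how such dimensions could be counted into a node-projected vector. Several details are assumptions, since the defining slides are not part of this transcript: the NNT of a node is taken to contain every path from that node that uses no edge twice, up to a fixed depth, and a dimension is taken to be the triple (level, label of the parent node, label of the child node). The function name and the example graph are hypothetical.

```python
from collections import Counter

def node_projected_vector(adj, labels, root, depth):
    """Count NNT edges per dimension (level, parent label, child label).

    Assumption: the NNT rooted at `root` holds all paths of at most `depth`
    edges that never reuse an edge.
    """
    npv = Counter()
    # Each stack entry: (current node, level of the next edge, edges used so far).
    stack = [(root, 1, frozenset())]
    while stack:
        node, level, used = stack.pop()
        if level > depth:
            continue
        for nbr in adj[node]:
            edge = frozenset((node, nbr))
            if edge in used:
                continue  # do not walk the same edge twice on one path
            npv[(level, labels[node], labels[nbr])] += 1
            stack.append((nbr, level + 1, used | {edge}))
    return npv

# Hypothetical labeled path graph 1 - 2 - 3.
adj = {1: {2}, 2: {1, 3}, 3: {2}}
labels = {1: "A", 2: "B", 3: "A"}
npv_1 = node_projected_vector(adj, labels, root=1, depth=2)
npv_2 = node_projected_vector(adj, labels, root=2, depth=2)
```

Here node 1 sees one A-B edge at level 1 and one B-A edge at level 2, while node 2 sees two B-A edges at level 1.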

Node Projected Vector – 3/4

Node Projected Vector – 4/4

Dominated Set Cover Algorithm
If Q is subgraph isomorphic to G, then the union of the query vectors dominated by G covers all node-projected vectors in Q.
– First, the dominated vectors are computed for each node of G.
– Then, the dominated vectors of all nodes of G are unioned to obtain the dominated vector set.

Dominated Set Cover Algorithm
NPVs of query graph Q = {NPV(1), NPV(2), NPV(3), NPV(4)}
NPVs of graph G = {NPV(a), NPV(b), NPV(c), NPV(d)}
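
The two steps of the dominated set cover test can be sketched as below. This assumes that a vector dominates another when it is at least as large in every dimension, with missing dimensions counted as zero. The dimension names `d1` and `d2`, the vector values, and the function names are made up for illustration.

```python
def dominates(big, small):
    """True if `big` is at least as large as `small` in every dimension of `small`."""
    return all(big.get(dim, 0) >= count for dim, count in small.items())

def may_contain(query_npvs, graph_npvs):
    """Pruning test for "Q is subgraph isomorphic to G".

    False means G can be discarded. True only means G remains a candidate
    that still has to be verified with a real isomorphism test.
    """
    covered = set()
    for g_vec in graph_npvs.values():
        # Step 1: query vectors dominated by this node of G.
        dominated = {q for q, q_vec in query_npvs.items() if dominates(g_vec, q_vec)}
        # Step 2: union over all nodes of G.
        covered |= dominated
    return covered == set(query_npvs)

# Hypothetical vectors, keyed by node.
query_npvs = {1: {"d1": 1}, 2: {"d1": 1, "d2": 1}}
graph_ok = {"a": {"d1": 2}, "b": {"d1": 1, "d2": 3}}
graph_pruned = {"a": {"d1": 2}}

keep = may_contain(query_npvs, graph_ok)
prune = not may_contain(query_npvs, graph_pruned)
```

The test is a necessary condition only, so it never discards a true match but can let false candidates through.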

Application on Uncertain Graphs

Probability Pruning

Opinion About the Method
Representing graphs as feature vectors is an innovative idea.
– As shown in the corresponding paper, feature vectors are used effectively to identify unrelated graphs.
– In addition, the vector representation makes it possible to apply many classification methods to classify graphs or to extract frequent subgraphs.

Method Proposed by Papapetrou
This study focuses on finding frequent subgraphs in an uncertain graph database. In this study:
– First, every possible exact graph is generated from the given uncertain graph, and the probability of each exact graph is calculated.
– Then, the expected support value of a query graph is calculated from the possible exact graphs.
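
A brute-force sketch of these two steps, not the paper's algorithm, under the following assumptions: edges exist independently, graphs are unlabeled, and expected support is the average over the database of each graph's probability of containing the query. All function names and the example database are hypothetical.

```python
from itertools import combinations, permutations
from math import prod

def possible_worlds(uncertain_edges):
    """Yield (edge set, probability) for every exact graph of one uncertain graph."""
    edges = list(uncertain_edges)
    for k in range(len(edges) + 1):
        for present in combinations(edges, k):
            chosen = set(present)
            prob = prod(p if e in chosen else 1.0 - p
                        for e, p in uncertain_edges.items())
            yield chosen, prob

def contains(world_edges, nodes, query_edges, query_nodes):
    """Brute-force subgraph isomorphism test; only viable for tiny graphs."""
    undirected = {frozenset(e) for e in world_edges}
    for image in permutations(nodes, len(query_nodes)):
        mapping = dict(zip(query_nodes, image))
        if all(frozenset((mapping[u], mapping[v])) in undirected
               for u, v in query_edges):
            return True
    return False

def containment_probability(uncertain_edges, nodes, query_edges, query_nodes):
    """Total probability of the exact graphs that contain the query."""
    return sum(prob for world, prob in possible_worlds(uncertain_edges)
               if contains(world, nodes, query_edges, query_nodes))

def expected_support(database, query_edges, query_nodes):
    """Average containment probability over all uncertain graphs (assumed definition)."""
    return sum(containment_probability(edges, nodes, query_edges, query_nodes)
               for nodes, edges in database) / len(database)

# Query: a path with two edges, x - y - z.
query_nodes = ["x", "y", "z"]
query_edges = [("x", "y"), ("y", "z")]
database = [
    (["a", "b", "c"], {("a", "b"): 0.9, ("b", "c"): 0.5}),
    (["a", "b", "c"], {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 0.5}),
]
support = expected_support(database, query_edges, query_nodes)
```

The first graph contains the path only when both of its edges exist (0.9 * 0.5 = 0.45) and the second always contains it, so the expected support is (0.45 + 1.0) / 2 = 0.725.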

Calculating Exact Graphs

Expected Support

Opinion About the Method
Calculating all possible exact graphs from uncertain graphs removes the complexity of uncertainty from the problem. However, it:
– Increases dataset size and complexity, since an uncertain graph with m uncertain edges has up to 2^m exact graphs
– Is not efficient for large graphs

Questions

References
Papapetrou, O., Ioannou, E., and Skoutas, D., "Efficient discovery of frequent subgraph patterns in uncertain graph databases," in Proceedings of the 14th International Conference on Extending Database Technology (EDBT/ICDT '11), Anastasia Ailamaki, Sihem Amer-Yahia, Jignesh Patel, Tore Risch, Pierre Senellart, and Julia Stoyanovich (Eds.). ACM, New York, NY, USA.
Chen, L., Wang, C., "Continuous Subgraph Pattern Search over Certain and Uncertain Graph Streams," IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 8, 2010.
Shang, H., Zhu, K., Lin, X., Zhang, Y., Ichise, R., "Similarity search on supergraph containment," 2010 IEEE 26th International Conference on Data Engineering (ICDE), 1-6 March 2010.