Presentation is loading. Please wait.

Presentation is loading. Please wait.

Graph Indexing: A Frequent Structure- based Approach Alicia Cosenza November 26 th, 2007.

Similar presentations


Presentation on theme: "Graph Indexing: A Frequent Structure- based Approach Alicia Cosenza November 26 th, 2007."— Presentation transcript:

1 Graph Indexing: A Frequent Structure- based Approach Alicia Cosenza November 26 th, 2007

2 Presentation Outline Introduction Frequent Fragment Discriminative Fragment Gindex Experimental Result

3 Introduction Graphs are used to model complicated structures such as proteins, circuits, images and XML documents Current index approach is path based – Example: GraphGrep – Advantages Paths are easier to handle Index space is predefined :all the path up to maxL length are selected – Disadvantages Path is too simple There are too many paths and too many false positives

4 Introduction “Can we use a graph structure instead of a a path as the basic index feature?” – gIndex Indexes only “frequent subgraphs” Creates a smaller index Improves query times

5 Presentation Outline Introduction Frequent Fragment Discriminative Fragment Gindex Experimental Result

6 Frequent Fragment Key concept Fragment – small subgraph minsup – minimum support threshold – A graph is frequent if its support or the number of times it appears in the graph database is greater than minsup Only frequent fragments will be indexed

7 Frequent Fragment low minimum support on small fragments (for effectiveness) – Want to index lots of the small subgraphs high minimum support on large fragments (for compactness) – Only want to index a large fragment if it appears a lot Otherwise it will be indexed by the smaller subgraphs Problem: There could be a lot of frequent fragments!

8 Presentation Outline Introduction Frequent Fragment Discriminative Fragment Gindex Experimental Result

9 Discriminative Fragment Definition (Redundant Fragment) – Fragment is redundant with respect to feature set if Definition (Discriminative Fragment). – Fragment is discriminative with respect to if Fragments that are not redundant are called discriminative

10 Presentation Outline Introduction Frequent Fragment Discriminative Fragment Gindex Experimental Result

11 GIndex - Construction First generates all frequent fragments while taking out redundant ones Translates fragments into sequences and holds them in a prefix tree – Each fragment has an id list: the ids of the graphs containing the fragment – Graph Sequentialization (DFS Code) Labeled edge is a 5-tuple (I,j,l i, l (I,j),l j ) Described in another paper

12 GIndex - Construction gIndex Tree – each fragment can be mapped to an edge sequence (DFS code), insert the edge sequences of discriminative fragments in a prefix tree called the gIndex Tree

13 GIndex - Construction gIndex Tree – Implemented using a hash table Both black and white nodes are included in the table The tree is still an important concept since it determines what white nodes will be included

14 GIndex - Search

15 Optimization Apriori Pruning – If a fragment is not in the gIndex tree, we need not check its super-graphs

16 GIndex - Search Verification – After getting the candidate answer set, we have to verify that the graphs in the set really contain the query graph perform a subgraph isomorphism test on each graph one by one

17 GIndex – Maintenance

18 Presentation Outline Introduction Frequent Fragment Discriminative Fragment Gindex Experimental Result

19 The index size of gIndex is more than 10 times smaller than that of GraphGrep; gIndex outperforms GraphGrep by 3 to 10 times in various query loads; the index returned by the incremental maintenance algorithm is effective: it performs as well as the index computed from scratch provided the data distribution does not change much.

20 Experimental Result Data is from an AIDS Antiviral Screen Dataset

21 Experimental Result

22 The End


Download ppt "Graph Indexing: A Frequent Structure- based Approach Alicia Cosenza November 26 th, 2007."

Similar presentations


Ads by Google