Index Structures For ISAT Work In Progress By Biswanath Panda, Mirek Riedewald, Paul Chew & Johannes Gehrke.


1 Index Structures For ISAT Work In Progress By Biswanath Panda, Mirek Riedewald, Paul Chew & Johannes Gehrke

2 Where Do We Fit In?
- ISAT uses a tabulation method to find approximate function values
- Currently a simple binary tree
- Try existing and new index structures
  - Lot of work on indexing high dimensional data
  - Very little work with real applications

3 Outline Of Talk
- New API for ISATAB
- Description of indexes
- Experimental results
- Ongoing work
- Discussion

4 API Design For ISAT
- Why did we need it?
- Old API
  - ISATAB logic and index logic separate
  - Index logic and ISATAB operations on the index still together
- [Diagram] ISATAB: snepQuery (containment search), snepQueryList (proximity search), nepAdd, nepUpdate, nepRemove

5 New API Design
- ISATAB
- General Index Template
- ISATAB API: snepQuery (containment search), snepQueryList (proximity search), nepAdd, nepUpdate, nepRemove
- General Index API: containmentIterator, proximityIterator, insertDataItem, deleteDataItem, updateDataItem
- Specific Index: implements index logic
- Packaging Object: transforms ISATAB objects into index objects
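The layering on this slide can be sketched in Python. This is only an illustration under stated assumptions: the method names of the General Index API and the ISATAB-side calls snepQuery and snepQueryList come from the slide, while the signatures, the PackagedItem class (a ball standing in for an ISATAB record), and the LinearScanIndex class are hypothetical and not taken from the ISATAB code.

```python
import itertools
import math
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterator, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class PackagedItem:
    """Hypothetical packaging object: an ISATAB record reduced to a ball."""
    center: Tuple[float, ...]
    radius: float

    def contains(self, query: Sequence[float]) -> bool:
        return math.dist(self.center, query) <= self.radius


class GeneralIndex(ABC):
    """General Index API; method names follow the slide, signatures are assumed."""

    @abstractmethod
    def containmentIterator(self, query: Sequence[float]) -> Iterator[PackagedItem]: ...

    @abstractmethod
    def proximityIterator(self, query: Sequence[float]) -> Iterator[PackagedItem]: ...

    @abstractmethod
    def insertDataItem(self, item: PackagedItem) -> None: ...

    @abstractmethod
    def deleteDataItem(self, item: PackagedItem) -> None: ...

    @abstractmethod
    def updateDataItem(self, old: PackagedItem, new: PackagedItem) -> None: ...


class LinearScanIndex(GeneralIndex):
    """Hypothetical specific index: a plain list, scanned on every query."""

    def __init__(self) -> None:
        self._items: List[PackagedItem] = []

    def containmentIterator(self, query):
        # Yield every stored item whose region contains the query point.
        return (item for item in self._items if item.contains(query))

    def proximityIterator(self, query):
        # Yield all items ordered by distance from their center to the query.
        return iter(sorted(self._items, key=lambda it: math.dist(it.center, query)))

    def insertDataItem(self, item):
        self._items.append(item)

    def deleteDataItem(self, item):
        self._items.remove(item)

    def updateDataItem(self, old, new):
        self._items[self._items.index(old)] = new


def snepQuery(index: GeneralIndex, query) -> Optional[PackagedItem]:
    """ISATAB-side containment search: first item containing the query, if any."""
    return next(index.containmentIterator(query), None)


def snepQueryList(index: GeneralIndex, query, k: int) -> List[PackagedItem]:
    """ISATAB-side proximity search: the k nearest items."""
    return list(itertools.islice(index.proximityIterator(query), k))
```

Because snepQuery and snepQueryList only touch the abstract interface, swapping LinearScanIndex for a tree-based index would not require changes on the ISATAB side, which is the benefit the next slide claims for the template.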

6 Benefits And Losses
- Advantages
  - General Index Template does not need to be changed
  - Can use any existing index that follows the API
- Disadvantages
  - Different layers cannot talk easily
  - Extra overhead of multiple function calls

7 Indexes For ISATAB
- Initial results showed usefulness of caching
  - LRU list
- Two broad categories of indexes
  - Point indexes
    - Current binary tree
    - Rtree of points
  - Ellipsoid indexes
    - Rtree of rectangles
    - LRU list
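A minimal sketch of an LRU list used as a cache of regions, assuming a fixed capacity and a most-recently-used-first scan order. The slides do not describe the actual data structure, so the LRUList class, its method names, and the Ball stand-in for an ellipsoid are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Ball:
    """Hypothetical stand-in for an ellipsoid of accuracy."""
    center: Tuple[float, ...]
    radius: float

    def contains(self, query: Sequence[float]) -> bool:
        return math.dist(self.center, query) <= self.radius


class LRUList:
    """Fixed-capacity list of regions, most recently used first."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries: List[Ball] = []

    def touch(self, entry: Ball) -> None:
        """Move (or add) an entry to the front, evicting the least recently used."""
        if entry in self._entries:
            self._entries.remove(entry)
        self._entries.insert(0, entry)
        if len(self._entries) > self.capacity:
            self._entries.pop()

    def lookup(self, query: Sequence[float]) -> Optional[Ball]:
        """Scan from most to least recently used; a hit is moved to the front."""
        for entry in self._entries:
            if entry.contains(query):
                self.touch(entry)  # safe: we return immediately after mutating
                return entry
        return None

    def entries(self) -> List[Ball]:
        return list(self._entries)
```

A miss in the list would fall through to the underlying index (binary tree or Rtree), and whatever region that search returns would be passed to touch.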

8 What Is An Rtree?

9 Rtree Properties
- Balanced tree
- Each node must have a minimum occupancy
- Overlapping bounding boxes degrade search performance
- Delete operation: removes underfull nodes and reinserts their entries

10 Rtree For ISAT
- Point Rtree
  - Indexes centers of ellipsoids
  - Find nearest neighbors both for queries and growing
  - No delete operation
- Bounding Box Rtree
  - Take the bounding box of the ellipsoids
  - Check for containment in bounding box for queries
  - Find nearest neighbor to bounding box for a grow
  - Delete operation in grow
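The bounding-box variant can be illustrated with a short sketch. It assumes each ellipsoid is stored as a center, a set of orthonormal axis directions, and a half-length per axis; that representation, the Ellipsoid class, and the query function are assumptions made for illustration and are not taken from ISATAB. The box acts as a cheap first filter, and a box hit still needs the exact ellipsoid test.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Box = Tuple[List[float], List[float]]  # (lower corner, upper corner)


@dataclass
class Ellipsoid:
    """Assumed form: {center + sum_i u_i * radii[i] * axes[i], |u| <= 1}."""
    center: Sequence[float]
    axes: Sequence[Sequence[float]]  # orthonormal unit vectors
    radii: Sequence[float]           # half-length along each axis

    def contains(self, x: Sequence[float]) -> bool:
        d = [xi - ci for xi, ci in zip(x, self.center)]
        total = 0.0
        for q, r in zip(self.axes, self.radii):
            total += (sum(qi * di for qi, di in zip(q, d)) / r) ** 2
        return total <= 1.0

    def bounding_box(self) -> Box:
        # Along coordinate j the extent is the norm of (radii[i] * axes[i][j])_i.
        lo, hi = [], []
        for j, cj in enumerate(self.center):
            half = math.sqrt(sum((r * q[j]) ** 2 for q, r in zip(self.axes, self.radii)))
            lo.append(cj - half)
            hi.append(cj + half)
        return lo, hi


def in_box(x: Sequence[float], box: Box) -> bool:
    lo, hi = box
    return all(l <= xi <= h for l, xi, h in zip(lo, x, hi))


def query(ellipsoids: Sequence[Ellipsoid], x: Sequence[float]) -> Optional[Ellipsoid]:
    """Box test first (what the Rtree would answer), exact test second."""
    for e in ellipsoids:
        if in_box(x, e.bounding_box()) and e.contains(x):
            return e
    return None
```

For an elongated ellipsoid that is not axis-aligned, the box can be much larger than the ellipsoid, so many box hits fail the exact test. In a real Rtree the linear loop in query would be replaced by the tree's containment search over the stored boxes.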

11 Experiments
- Methane simulation with 32 species

12 Takeaways
- Caching is good for searching, not for growing
- Fast scan does seem to do well
  - Point Rtree + list and original ISAT do the best
  - Original ISAT does only 50% primary retrieves
- Rtree does well but is expensive
- What we do not understand
  - Our code always does more grows

13 Another Example
- Methane simulation with 55 species
- Large number of grows: on the order of 10^5 grows in 2x10^6 queries
- Point Rtree + list = 65% hits
- Original ISATAB = 90% hits (44% secondary retrieves)
- Rtree = 86% hits (could only reach 10^5 queries)
- Simple caching not going to work
- Grow cost
  - Grow for Rtree more expensive than for point Rtree
  - Searching for growable ellipsoids and growing them dominates the simulation

14 Summary Of Experiments
- Different simulations have very different characteristics
- Concept of growing still not clear
  - Definitely needed
  - When, what and how much to grow?
- Simple caching definitely not good
- Tradeoff discovered
  - Indexing ellipsoids helps search but growing may dominate costs
  - Indexing points helps updates but search suffers

15 Ongoing Work
- Hybrid index structures
  - Transition from update-friendly to search-friendly indexes
  - Dynamically change index parameters
- Study statistics as the simulation proceeds
- Rtree with ellipsoidal bounding region
- Random projections
  - Project ellipsoids onto random lines
  - If a point's projection falls outside any projected interval, the point lies outside the ellipsoid; lying within all projections is necessary for containment but not sufficient
- Simple averaging of nearest neighbors
- Error analysis with different index structures
- First paper to introduce the problem to the database community
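The random-projection idea can be sketched as a filter. This is a hedged illustration, not the authors' implementation: it assumes an ellipsoid stored as a center, orthonormal axes, and half-lengths, and the ProjectionFilter class, the random_directions helper, and the tolerance value are hypothetical. Projecting onto a line can only rule a point out, so a point that passes every interval test still needs the exact containment test.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


@dataclass
class Ellipsoid:
    """Assumed form: {center + sum_i u_i * radii[i] * axes[i], |u| <= 1}."""
    center: Sequence[float]
    axes: Sequence[Sequence[float]]  # orthonormal unit vectors
    radii: Sequence[float]

    def contains(self, x: Sequence[float]) -> bool:
        d = [xi - ci for xi, ci in zip(x, self.center)]
        return sum((dot(q, d) / r) ** 2 for q, r in zip(self.axes, self.radii)) <= 1.0

    def project(self, direction: Sequence[float]) -> Tuple[float, float]:
        """Interval covered by the ellipsoid under x -> dot(x, direction)."""
        mid = dot(self.center, direction)
        half = math.sqrt(sum((r * dot(q, direction)) ** 2
                             for q, r in zip(self.axes, self.radii)))
        return mid - half, mid + half


def random_directions(dim: int, count: int, seed: int = 0) -> List[Tuple[float, ...]]:
    """Gaussian random directions; they need not be normalized for the test."""
    rng = random.Random(seed)
    return [tuple(rng.gauss(0.0, 1.0) for _ in range(dim)) for _ in range(count)]


class ProjectionFilter:
    """Precomputes one projected interval per direction for a single ellipsoid."""

    def __init__(self, ellipsoid: Ellipsoid, directions: Sequence[Sequence[float]]) -> None:
        self.directions = [tuple(d) for d in directions]
        self.intervals = [ellipsoid.project(d) for d in self.directions]

    def may_contain(self, x: Sequence[float], tol: float = 1e-12) -> bool:
        # False means "certainly outside"; True means "still a candidate".
        return all(lo - tol <= dot(x, d) <= hi + tol
                   for d, (lo, hi) in zip(self.directions, self.intervals))
```

Each interval test costs one dot product, compared with a full quadratic-form evaluation for the exact test, so the filter pays off only when it rejects most non-containing ellipsoids early.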

