
Cache Conscious Indexing for Decision-Support in Main Memory Pradip Dhara.




1 Cache Conscious Indexing for Decision-Support in Main Memory Pradip Dhara

2 Why In-Memory Databases: Telecommunications. CAD tools. Moore’s law will allow us to store relations in memory.

3 Redesigning DBMSs: Optimize memory-CPU performance rather than disk-memory performance. Re-evaluate the space/time tradeoff: space isn’t cheap. Given a certain space requirement, we need to optimize response time for lookups.

4 Indices in In-Memory DBMSs: Trade-off: a little extra space vs. increased performance. Index design takes on new dimensions when looking at in-memory databases. Space overhead cannot be ignored, so hash tables are unacceptable.

5 Hardware Solutions: Caches. There is a growing disparity between CPU performance and memory performance. Cache misses can’t be overlapped.

6 Solution: CSS-tree indices exploit cache behavior to improve performance.

7 Direct Mapped Cache

8 Fully Associative Cache

9 2-Way Set Associative Cache
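The three cache organizations named on these slides differ only in how many line slots a given memory block may occupy. A minimal sketch, assuming a hypothetical cache of 8 lines of 32 bytes each (the function name `candidate_lines` and the sizes are illustrative, not taken from the slides):

```python
def candidate_lines(address, line_size=32, num_lines=8, ways=1):
    """Return the cache line slots that may hold the block containing `address`.

    ways=1 models a direct-mapped cache, ways=num_lines a fully
    associative one, and anything in between a set-associative cache.
    """
    block = address // line_size      # memory block number
    num_sets = num_lines // ways      # each set groups `ways` lines
    set_index = block % num_sets      # the only set this block may use
    first = set_index * ways
    return list(range(first, first + ways))


direct = candidate_lines(0x1234, ways=1)   # exactly one possible slot
two_way = candidate_lines(0x1234, ways=2)  # either slot of one set
fully = candidate_lines(0x1234, ways=8)    # any slot in the cache
```

With one way, two blocks that map to the same slot evict each other even when the rest of the cache is empty, which is why the layout of an index in memory matters.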

10 Binary Search on Sorted Array: Store the relation in sorted order on a key. Cache performance depends on tuple size. [Figure: sorted array of 14 elements, numbered 1 to 14.]
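To see why tuple size matters, one can count how many distinct cache lines a binary search touches. A rough sketch, assuming tuples are stored contiguously from address 0 and that only the line holding the start of each probed tuple is counted (`binary_search_lines` is a made-up helper, and the 32-byte line is an assumption):

```python
def binary_search_lines(keys, target, tuple_size=4, line_size=32):
    """Binary-search sorted `keys`; return (index or -1, distinct lines touched)."""
    lo, hi = 0, len(keys) - 1
    lines = set()
    while lo <= hi:
        mid = (lo + hi) // 2
        lines.add(mid * tuple_size // line_size)  # line holding tuple `mid`
        if keys[mid] == target:
            return mid, len(lines)
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, len(lines)


keys = list(range(1, 15))  # a 14-element sorted array
idx_small, lines_small = binary_search_lines(keys, 1, tuple_size=4)
idx_large, lines_large = binary_search_lines(keys, 1, tuple_size=32)
```

The same three probes stay inside one line when tuples are 4 bytes wide, but land on three different lines when each tuple fills a whole line.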

11 T-trees [Figure: T-tree nodes, each holding (key, pointer to record) entries such as (4, *), (8, *), …; (0, *), (3, *), …; (10, *), (16, *), ….]

12 Enhanced B+ trees [Figure: B+-tree with leaf entries (1, *) through (20, *), four per leaf, and separator keys 5, 9, 13, 17.]

13 Hash Indices: Put as many pairs as fit into a cache line. [Figure: hash directory with 3-bit addresses 000 to 111 pointing to buckets of entries such as (0, *), (8, *), (80, *), ….]

14 Idea Behind CSS-trees: Save space by not storing pointers. Use an array as a tree. Implicitly store pointers as offsets into the array.

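15 Useful Formulas for CSS-trees: Let n = # of elements, m = # of elements per node, N = # of leaf nodes, and $k = \lceil \log_{m+1} N \rceil$. The children of a node b are nodes $b(m+1)+1$ to $b(m+1)+(m+1)$ (EQ 1). $N = \lceil n/m \rceil$ (EQ 2). # of internal nodes $= \frac{(m+1)^k - 1}{m} - \left\lfloor \frac{(m+1)^k - N}{m} \right\rfloor$ (EQ 3). First leaf node in bottom level $= \frac{(m+1)^k - 1}{m}$ (EQ 4).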
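These quantities can be checked with a few lines of integer arithmetic. The sketch below assumes that children of node b are numbered b(m+1)+1 through b(m+1)+(m+1), that N counts leaf nodes, and that the two counts follow the Lemma 4.1 values quoted on the "How it works" slide; the function names are illustrative only.

```python
def css_layout(n, m):
    """Return (leaf nodes, k, internal nodes, first bottom-level leaf)."""
    leaves = -(-n // m)            # N = ceil(n / m)
    k, cap = 0, 1
    while cap < leaves:            # k = ceil(log_{m+1} N); cap = (m+1)^k
        cap *= m + 1
        k += 1
    first_bottom_leaf = (cap - 1) // m
    internal = first_bottom_leaf - (cap - leaves) // m
    return leaves, k, internal, first_bottom_leaf


def children(b, m):
    """Node numbers of the m+1 child slots of node b."""
    return range(b * (m + 1) + 1, b * (m + 1) + (m + 1) + 1)


layout = css_layout(n=10, m=2)
```

For n = 10 and m = 2 this yields 5 leaf nodes, k = 2, 2 internal nodes, and first bottom-level leaf 4, and node 0 has children 1, 2, 3.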

16 How it works [Figure: a sorted array of 10 keys (1 to 10), the CSS-tree directory array, and the resulting full CSS-tree with its internal nodes and leaf nodes numbered.] Values (Lemma 4.1): m (# keys per node) = 2; n (# keys) = 10; $k = \lceil \log_{m+1} N \rceil = 2$; N (# of leaf nodes) = 5; internal nodes = 2; first leaf node in bottom level = 4.

17 Building a full CSS-tree
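One way to build and search such a directory is sketched below. It is an illustration under stated assumptions, not the presenter's code: keys are distinct, each directory entry stores the largest key of the corresponding child subtree, bottom-level leaves map to the front of the sorted array, and child slots with no node are padded with the largest key of that front part. The class and method names are hypothetical.

```python
from bisect import bisect_left


class FullCSSTree:
    """Pointer-free (m+1)-ary directory over a sorted list of distinct keys."""

    def __init__(self, keys, m):
        self.a, self.m, n = keys, m, len(keys)
        self.leaves = -(-n // m)                    # N = ceil(n / m)
        cap = 1
        while cap < self.leaves:                    # cap = (m+1)^k
            cap *= m + 1
        self.first_bottom = (cap - 1) // m          # first leaf on bottom level
        self.internal = self.first_bottom - (cap - self.leaves) // m
        # Leaves numbered >= first_bottom hold the front of the array.
        self.bottom = self.internal + self.leaves - self.first_bottom
        self.dir = [self._subtree_max(b * (m + 1) + 1 + j)
                    for b in range(self.internal) for j in range(m)]

    def _leaf_range(self, node):
        """Map a leaf node number to its [lo, hi) slice of the sorted array."""
        if node >= self.internal + self.leaves:
            return None                             # child slot with no node
        if node >= self.first_bottom:
            chunk = node - self.first_bottom
        else:
            chunk = node - self.internal + self.bottom
        lo = chunk * self.m
        return lo, min(lo + self.m, len(self.a))

    def _subtree_max(self, node):
        while node < self.internal:                 # follow the rightmost child
            node = (node + 1) * (self.m + 1)
        rng = self._leaf_range(node)
        # Missing slot: pad with the largest key of the bottom-level part.
        hi = rng[1] if rng else min(self.bottom * self.m, len(self.a))
        return self.a[hi - 1]

    def search(self, key):
        """Return the index of `key` in the sorted array, or -1."""
        m, node = self.m, 0
        while node < self.internal:
            base = node * m
            j = bisect_left(self.dir, key, base, base + m) - base
            node = node * (m + 1) + 1 + j           # implicit child pointer
        rng = self._leaf_range(node)
        if rng is None:
            return -1
        i = bisect_left(self.a, key, *rng)
        return i if i < rng[1] and self.a[i] == key else -1


tree = FullCSSTree(list(range(1, 11)), m=2)
```

For the 10-key, m = 2 example, `tree.dir` is `[6, 8, 2, 4]`: node 0 holds the largest keys of its first two subtrees, and node 1 holds the largest keys of its first two leaves. No child pointers are stored; each step computes the next node number from the current one.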

18 Searching Within a Node [Figure: a node with entries numbered 1 to 8.]
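Within a node, the search only has to decide which child branch to follow. A minimal sketch of that step as a plain binary search, assuming each entry holds the largest key of its subtree so that the branch is the position of the leftmost entry not smaller than the search key (`branch_in_node` is an illustrative name):

```python
def branch_in_node(node_keys, key):
    """Return the child branch (0..m): index of the leftmost entry >= key."""
    lo, hi = 0, len(node_keys)
    while lo < hi:
        mid = (lo + hi) // 2
        if node_keys[mid] < key:
            lo = mid + 1      # key lies to the right of entry `mid`
        else:
            hi = mid          # entry `mid` is still a candidate
    return lo


node = [1, 2, 3, 4, 5, 6, 7, 8]
branch = branch_in_node(node, 5)
```

Because all entries of a node share cache lines, these comparisons cost CPU time but no additional cache misses once the node has been fetched.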

19 Level CSS-trees: $m = 2^t$; entries per node $= m - 1$. [Figure: a node with entries numbered 1 to 7; each entry holds the value of the largest key in its subtree.]

20 Level vs. Full CSS-trees: Level CSS-trees will be deeper due to the difference in branching factor. Level CSS-trees have fewer comparisons per node. Level CSS-trees have more cache accesses and node traversals. Comparisons: $\log_2 N$ vs. $\log_2 N \cdot \log_{m+1} m \cdot (1 + 2/(m+1))$. Node traversals: $\log_m N$ vs. $\log_{m+1} N$.
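The expressions on this slide can be evaluated numerically to see the size of each effect. The sketch below reads the first expression of each pair as the level CSS-tree figure and the second as the full CSS-tree figure, following the order in the slide title; N = 1,250,000 and m = 8 are assumed sample values, not results from the slides.

```python
import math


def comparisons_and_traversals(N, m):
    """Evaluate the slide's expressions for N leaf nodes and node size m."""
    level = {
        "comparisons": math.log2(N),
        "traversals": math.log(N, m),
    }
    full = {
        "comparisons": math.log2(N) * math.log(m, m + 1) * (1 + 2 / (m + 1)),
        "traversals": math.log(N, m + 1),
    }
    return level, full


level, full = comparisons_and_traversals(N=1_250_000, m=8)
```

Under these assumptions the level tree performs fewer comparisons in total, while the full tree visits fewer nodes, which matches the trade-off stated on the slide.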

21 Time Analysis: Parameters: R (size of RID) = 4 bytes; K (size of key) = 4 bytes; P (size of pointer) = 4 bytes; h = 1.2; n (# records) = $10^7$; c (cache line) = 32 bytes; s (node size / c) = 1, where s = mK/c; D = time to dereference a pointer; $A_b$ = time to compute child address for binary search; $A_{fcss}$ = time to compute child address for full CSS; $A_{lcss}$ = time to compute child address for level CSS.

22 Space Analysis: Parameters: R (size of RID) = 4 bytes; K (size of key) = 4 bytes; P (size of pointer) = 4 bytes; h = 1.2; n (# records) = $10^7$; c (cache line) = 32 bytes; s (node size / c) = 1, where s = mK/c; D = time to dereference a pointer; $A_b$ = time to compute child address for binary search; $A_{fcss}$ = time to compute child address for full CSS; $A_{lcss}$ = time to compute child address for level CSS.
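The listed parameters are enough to estimate the directory size of a full CSS-tree. A rough sketch, assuming the directory stores m keys of K bytes per internal node and that the internal-node count follows the Lemma 4.1 values quoted on the "How it works" slide (`directory_bytes` is an illustrative helper, not part of the slides):

```python
def directory_bytes(n, K, c, s=1):
    """Return (keys per node, internal nodes, directory size in bytes)."""
    m = s * c // K                 # from s = mK/c
    leaves = -(-n // m)            # ceil(n / m) leaf nodes
    cap = 1
    while cap < leaves:            # cap = (m+1)^k
        cap *= m + 1
    internal = (cap - 1) // m - (cap - leaves) // m
    return m, internal, internal * m * K


m, internal, size = directory_bytes(n=10**7, K=4, c=32)
```

With these parameters a node holds 8 keys and the directory occupies 5,000,000 bytes, a small fraction of the 40,000,000 bytes taken by the keys themselves.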

23 Experiment: Results are for an UltraSPARC II. Keys are randomly generated integers between 0 and 1 million. Performed 5 tests of 100,000 searches for random keys.

24 Figure 5a: Array Size vs. Time

25 Figure 5b: Array Size vs. Time

26 Figure 6a: Array Size vs. 2nd-level cache accesses

27 Figure 6b: Array Size vs. 2nd-level cache misses

28 Figure 7: Node Size vs. Time

29 CSS Performance on Other Queries: CSS is very good for individual selection queries. CSS will probably perform the best in range queries. Index nested loops join vs. sort merge join.

30 Doubts About CSS: Flexibility of CSS-trees across different cache designs. Any applicability to variable-sized records? Multiple CSS-tree indices on different keys.

31 Conclusion: CSS-trees improve search performance by being cache conscious.

32 One Last Thought: Cache designs. Should we redesign them to let programmers have control?

