pTrees (predicate Trees): fast, accurate, DM-ready horizontal processing of compressed, vertical data structures


1 pTrees (predicate Trees): fast, accurate, DM-ready horizontal processing of compressed, vertical data structures.

Ubiquitous vertical processing of horizontal data: horizontal (record-oriented) data sets such as R(A1 A2 A3 A4) must be scanned vertically (vertical loops). pTrees restructure the table instead:

1. Project onto each attribute (4 files): R[A1] R[A2] R[A3] R[A4].
2. Convert base 10 to base 2 and vertically slice off bit positions (12 files): R11 R12 R13 R21 R22 R23 R31 R32 R33 R41 R42 R43.
3. Compress each bit slice into a pTree, e.g., compress R11 into P11, giving P11 P12 P13 P21 P22 P23 P31 P32 P33 P41 P42 P43.

To compress, record the truth of the predicate pure1 (= entirely 1-bits) in a tree, recursively on intervals (halves), until pure. For R11:

1. Whole thing pure1? false → 0
2. Left half pure1? false → 0
3. Right half pure1? false → 0
4. Left half of right half pure1? false → 0
5. Right half of right half pure1? true → 1

A half that is false because it is pure0 (entirely 0-bits) is also pure, so its branch ends there.

E.g., to find the number of occurrences of (7,0,1,4) in the table, AND the pTrees, complementing (') those whose target bit is 0:

P11^P12^P13^P'21^P'22^P'23^P'31^P'32^P33^P41^P'42^P'43

The count is read off the resulting tree by weighting the 1-bits at each level by 2^3, 2^2, 2^1, 2^0, which gives 2.
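The slicing, compression and counting steps above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the table values in `TABLE` are assumed (the slide's numbers did not survive in this transcript) and were chosen so that the first bit slice follows the five pure1 steps and (7,0,1,4) occurs twice. The function names are hypothetical, and the AND is done on uncompressed bit slices for simplicity rather than directly on the compressed trees.

```python
# Assumed example table R(A1 A2 A3 A4) with 3-bit attribute values.
TABLE = [(2, 7, 6, 1), (3, 7, 6, 0), (2, 6, 5, 1), (2, 7, 5, 7),
         (5, 2, 1, 4), (2, 2, 1, 5), (7, 0, 1, 4), (7, 0, 1, 4)]

def bit_slices(values, nbits=3):
    """Vertically slice a column into nbits bit vectors, high-order bit first."""
    return [[(v >> (nbits - 1 - j)) & 1 for v in values] for j in range(nbits)]

def build_ptree(bits):
    """Record pure1 recursively on halves until pure.

    Returns 1 for a pure1 interval, 0 for a pure0 interval (branch ends),
    and (0, left, right) for a mixed interval.
    """
    if all(bits):
        return 1
    if not any(bits):
        return 0
    mid = len(bits) // 2
    return (0, build_ptree(bits[:mid]), build_ptree(bits[mid:]))

def root_count(tree, size):
    """Number of 1-bits represented by a pTree covering `size` rows."""
    if tree == 1:
        return size
    if tree == 0:
        return 0
    _, left, right = tree
    half = size // 2
    return root_count(left, half) + root_count(right, size - half)

def count_tuple(table, target, nbits=3):
    """Count rows equal to `target` by ANDing bit slices (complemented where the target bit is 0)."""
    mask = [1] * len(table)
    for a, want in enumerate(target):
        column = [row[a] for row in table]
        for j, bits in enumerate(bit_slices(column, nbits)):
            want_bit = (want >> (nbits - 1 - j)) & 1
            mask = [m & (b if want_bit else 1 - b) for m, b in zip(mask, bits)]
    return root_count(build_ptree(mask), len(table))

R11 = bit_slices([row[0] for row in TABLE])[0]   # high-order bit of A1
P11 = build_ptree(R11)
print(R11, P11, count_tuple(TABLE, (7, 0, 1, 4)))
```

With this assumed table, `R11` is `[0, 0, 0, 0, 1, 0, 1, 1]`, so the left half is pure0 and its branch ends, while the right half is subdivided further.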

2 More generally, given a row predicate (rp) and a row ordering (ro), the sequence of row predicate truth bits is called the level-0 pTree.

Next, intervalize the level-0 pTree (in practice, equiwidth=64?; in this example, equiwidth=5), specify a bit-interval predicate (bip) (e.g., pure1, pure0, gte50%1, ...), and define the bip stride=m level-1 pTree as the string of truth bits of bip on those intervals.

E.g., the IRIS dataset (partial), with columns Name, SL, SW, PL, PW, Color (the SL, SW, PL and PW values are not reproduced here):

Name        Color
setosa      red
setosa      blue
setosa      red
setosa      white
setosa      blue
versicolor  red
versicolor  red
versicolor  white
versicolor  blue
versicolor  white
virginica   white
virginica   red
virginica   blue
virginica   red
virginica   red

Level-0 pTrees in the example (ro = table order, as given):
PSL,0: rp: rem(SL/2)=1
PSL,1: rp: rem(div(SL/2)/2)=1
PColor=red: rp: Color='red'
PPW<7: rp: PW<7

Level-1 pTrees in the example:
PSL,1, stride=5: gte50%1, pure1, gte25%1, gte75%1
PColor=red, stride=5: gte50%1, pure1, gte25%1, gte75%1
PSL,0: gte50%1 with stride=4 and with stride=8
PPW<7, stride=5: gte50%1

Note that the gte50%1 stride=5 level-1 pTree of PPW<7 classifies setosa (it is the mask pTree for the setosa class).
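The level-0 and level-1 definitions above can be illustrated on the Color column of the partial IRIS table, which is the one column whose values are listed in this transcript. This is a small sketch under that assumption; the helper names `level0`, `level1`, `pure1` and `gte` are hypothetical and not taken from the presentation.

```python
# Color column of the partial IRIS table, in table order (the row ordering ro).
COLORS = ["red", "blue", "red", "white", "blue",     # setosa
          "red", "red", "white", "blue", "white",    # versicolor
          "white", "red", "blue", "red", "red"]      # virginica

def level0(rows, row_predicate):
    """Level-0 pTree: one truth bit of the row predicate per row."""
    return [1 if row_predicate(r) else 0 for r in rows]

def level1(bits, stride, bip):
    """Level-1 pTree: one truth bit of the bit-interval predicate per stride-sized interval."""
    return [1 if bip(bits[i:i + stride]) else 0
            for i in range(0, len(bits), stride)]

def pure1(interval):
    """True when the interval is entirely 1-bits."""
    return all(interval)

def gte(percent):
    """Build a bip that is true when at least `percent`% of the interval is 1-bits."""
    return lambda interval: 100 * sum(interval) >= percent * len(interval)

p_red = level0(COLORS, lambda color: color == "red")   # rp: Color='red'
p_red_gte50 = level1(p_red, 5, gte(50))
p_red_gte25 = level1(p_red, 5, gte(25))
p_red_gte75 = level1(p_red, 5, gte(75))
p_red_pure1 = level1(p_red, 5, pure1)
print(p_red, p_red_gte50, p_red_gte25, p_red_gte75, p_red_pure1)
```

Because the stride of 5 lines up with the five rows per class, each level-1 bit summarizes one class. A level-1 pTree whose bits are 1 for exactly one class acts as a mask for that class.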

