RankSQL: Supporting Ranking Queries in RDBMS
Chengkai Li (UIUC), Mohamed A. Soliman (Univ. of Waterloo), Kevin Chen-Chuan Chang (UIUC), Ihab F. Ilyas (Univ. of Waterloo)
Overview
Ranking (top-k) is an important functionality in many real-world database applications:
– E-Commerce, Web Sources
– Multimedia Databases
– Text Retrieval, Search Engine
– OLAP, Decision Support
RankSQL:
– Support ranking as a first-class query type in RDBMS;
– Integrate ranking with traditional Boolean query constructs.
Demo Query
SELECT * FROM A, B, C
WHERE A.j1 = B.j1 AND B.j2 = C.j2
ORDER BY A.p1 + B.p2 + C.p3 DESC
LIMIT 10;
membership dimension: Boolean predicates, Boolean function
order dimension: ranking predicates, monotonic scoring function
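To make the two dimensions of the demo query concrete, here is a minimal sketch that runs the same query shape on tiny in-memory tables. It uses Python's sqlite3 module rather than the RankSQL/PostgreSQL system from the demo, and the function name `run_demo_query`, the column types, and the sample rows are all assumptions made for illustration. The WHERE clause decides membership; the ORDER BY sum decides order.

```python
import sqlite3


def run_demo_query(rows_a, rows_b, rows_c, k=10):
    """Run the demo query shape on small in-memory tables (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: join columns are integers, ranking predicates are reals.
    conn.executescript(
        """
        CREATE TABLE A (j1 INTEGER, p1 REAL);
        CREATE TABLE B (j1 INTEGER, j2 INTEGER, p2 REAL);
        CREATE TABLE C (j2 INTEGER, p3 REAL);
        """
    )
    conn.executemany("INSERT INTO A VALUES (?, ?)", rows_a)
    conn.executemany("INSERT INTO B VALUES (?, ?, ?)", rows_b)
    conn.executemany("INSERT INTO C VALUES (?, ?)", rows_c)
    query = """
        SELECT A.j1, B.j2, A.p1 + B.p2 + C.p3 AS score
        FROM A, B, C
        WHERE A.j1 = B.j1 AND B.j2 = C.j2   -- membership dimension
        ORDER BY score DESC                 -- order dimension
        LIMIT ?
    """
    result = conn.execute(query, (k,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    a = [(1, 0.75), (2, 0.5)]
    b = [(1, 10, 0.25), (2, 20, 0.75), (2, 10, 0.125)]
    c = [(10, 0.5), (20, 0.125)]
    for row in run_demo_query(a, b, c, k=2):
        print(row)
```

A conventional engine answers this by joining everything and then sorting, which is the materialize-then-sort behaviour the demo compares against.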
RankSQL System
Extends PostgreSQL:
– Query Engine:
  Ranking Algebra
  Execution engine of ranking operators
  Ranking query optimizer
– Front End: Visualizing the enumeration and execution
Dataset: 3-table join, 100,000 tuples/table, key-foreign key join
The Differences and Insights
Differences
– Traditional Plan: ~7 sec, materialize-then-sort
– New Plan: ~0.8 sec. Ranking is split and interleaved with Boolean query constructs.
Why? Early ranking enables
– Reduced Boolean effort: cuts intermediate results for Boolean join/filter;
– Reduced ranking effort: expensive ranking predicates can be optimized as well.
[Figure: the two query plans. Operator labels: sort, join (traditional plan); rank-join, ranking (new plan). Cardinality annotations as extracted: 100000, 99720, 10; 10, 100000, 2897422.]
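The "reduced Boolean effort" point can be illustrated with a rank-join that stops reading its inputs once the top-k results are certain. The sketch below is a simplified HRJN-style rank-join written for this transcript, not the RankSQL operator itself. It assumes two inputs already sorted by descending score, an equi-join on a single key, and a sum as the monotonic scoring function. The name `rank_join` and the `(key, score)` tuple layout are hypothetical.

```python
import heapq


def rank_join(left, right, k):
    """Return (top-k join results, number of input tuples pulled).

    `left` and `right` are lists of (join_key, score), sorted by descending
    score. A join result's score is the sum of the two input scores.
    """
    inputs = (left, right)
    if k <= 0 or not left or not right:
        return [], 0
    pos = [0, 0]        # tuples pulled so far from each input
    seen = ({}, {})     # per input: join key -> scores pulled so far
    buffered = []       # max-heap of join results, stored as (-score, key)
    output = []
    side = 0
    while len(output) < k:
        if pos[side] >= len(inputs[side]):
            side = 1 - side                      # this input is exhausted
        if pos[side] >= len(inputs[side]):
            # Both inputs exhausted: whatever is buffered is final.
            while buffered and len(output) < k:
                neg, key = heapq.heappop(buffered)
                output.append((key, -neg))
            break
        key, score = inputs[side][pos[side]]
        pos[side] += 1
        seen[side].setdefault(key, []).append(score)
        for other_score in seen[1 - side].get(key, []):
            heapq.heappush(buffered, (-(score + other_score), key))
        # Threshold: upper bound on the score of any result not yet produced.
        # Such a result needs an unseen tuple from some input i, whose score
        # is at most the last score pulled from i, joined with a tuple from
        # the other input, whose score is at most that input's top score.
        bounds = []
        for i in (0, 1):
            if pos[i] == 0:
                bounds.append(float("inf"))      # nothing known about i yet
            elif pos[i] < len(inputs[i]):
                last_i = inputs[i][pos[i] - 1][1]
                top_other = inputs[1 - i][0][1]
                bounds.append(last_i + top_other)
        threshold = max(bounds, default=float("-inf"))
        # Buffered results at or above the threshold cannot be beaten.
        while buffered and len(output) < k and -buffered[0][0] >= threshold:
            neg, key = heapq.heappop(buffered)
            output.append((key, -neg))
        side = 1 - side                          # alternate between inputs
    return output, pos[0] + pos[1]
```

Because results are emitted as soon as they reach the threshold, the operator can answer a small k after reading only a prefix of each input, instead of materializing the full join and sorting it.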
RankSQL [SIGMOD 05]
Rank-Relational Algebra: as the foundation, support splitting and interleaving at the algebra level.
Two-Dimensional Enumeration: ranking (ranking predicate scheduling) and filtering (join order selection).
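One way to picture "splitting" at the algebra level is an operator that evaluates a single ranking predicate and still emits tuples in score order without reading its whole input. The sketch below is an illustrative reading of that idea, not the operator definition from the RankSQL paper. It assumes a sum scoring function, predicate scores in [0, max_score], and an input ordered by an upper-bound score in which every predicate not yet evaluated counts as max_score. The names `rank_operator`, `ranked_input`, and `predicate` are hypothetical.

```python
import heapq
from itertools import count


def rank_operator(ranked_input, predicate, max_score=1.0):
    """Evaluate one more ranking predicate, keeping output in bound order.

    `ranked_input` yields (row, bound) in descending order of `bound`, where
    `bound` counts the not-yet-evaluated `predicate` as `max_score`.
    Yields (row, new_bound) in descending order of `new_bound`.
    """
    buffered = []          # max-heap stored as (-new_bound, tiebreak, row)
    tiebreak = count()     # avoids comparing rows when bounds are equal
    for row, bound in ranked_input:
        # Every row not yet processed has a new bound of at most `bound`,
        # so buffered rows at or above it can be emitted safely.
        while buffered and -buffered[0][0] >= bound:
            neg, _, ready = heapq.heappop(buffered)
            yield ready, -neg
        # Replace the optimistic max_score with the real predicate score.
        new_bound = bound - max_score + predicate(row)
        heapq.heappush(buffered, (-new_bound, next(tiebreak), row))
    # Input exhausted: drain the buffer in order.
    while buffered:
        neg, _, ready = heapq.heappop(buffered)
        yield ready, -neg
```

Since the function is a generator, a consumer that only needs the first few results (for example through `itertools.islice`) pulls only as much of the input as the bounds require, and several such operators can be chained or placed between joins, which is the interleaving the slide refers to.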
Welcome to our Demo
Group 8
Wednesday 2pm-3:30pm
Friday 9am-10:30am