University of Washington Database Group Tiresias The Database Oracle for How-To Queries Alexandra Meliou § ✜ Dan Suciu ✜ § University of Massachusetts.

Slides:



Advertisements
Similar presentations
1 Datalog: Logic Instead of Algebra. 2 Datalog: Logic instead of Algebra Each relational-algebra operator can be mimicked by one or several Database Logic.
Advertisements

CHAPTER OBJECTIVE: NORMALIZATION THE SNOWFLAKE SCHEMA.
Chapter 10: Designing Databases
University of Washington Database Group The Complexity of Causality and Responsibility for Query Answers and non-Answers Alexandra Meliou, Wolfgang Gatterbauer,
SCIP Optimization Suite
Fast Algorithms For Hierarchical Range Histogram Constructions
DISCOVER: Keyword Search in Relational Databases Vagelis Hristidis University of California, San Diego Yannis Papakonstantinou University of California,
The Power of How-to Queries joint work with Dan Suciu (University of Washington) Alexandra Meliou.
Efficient Query Evaluation on Probabilistic Databases
CMPT 354, Simon Fraser University, Fall 2008, Martin Ester 52 Database Systems I Relational Algebra.
Paper by: A. Balmin, T. Eliaz, J. Hornibrook, L. Lim, G. M. Lohman, D. Simmen, M. Wang, C. Zhang Slides and Presentation By: Justin Weaver.
1 Primitives for Workload Summarization and Implications for SQL Prasanna Ganesan* Stanford University Surajit Chaudhuri Vivek Narasayya Microsoft Research.
1 9. Evaluation of Queries Query evaluation – Quantifier Elimination and Satisfiability Example: Logical Level: r   y 1,…y n  r’ Constraint.
University of Washington Database Group Reverse Data Management … and the case for Reverse What-If queries 1 Alexandra Meliou, Wolfgang Gatterbauer, Dan.
A scalable algorithm for answering queries using views Rachel Pottinger, Alon Levy [2000] Rachel Pottinger and Alon Y. Levy A Scalable Algorithm for Answering.
Database management concepts Database Management Systems (DBMS) An example of a database (relational) Database schema (e.g. relational) Data independence.
VLDB Revisiting Pipelined Parallelism in Multi-Join Query Processing Bin Liu and Elke A. Rundensteiner Worcester Polytechnic Institute
1 Relational Model. 2 Relational Database: Definitions  Relational database: a set of relations  Relation: made up of 2 parts: – Instance : a table,
Fundamentals, Design, and Implementation, 9/e Chapter 7 Using SQL in Applications.
Ontology-Based Free-Form Query Processing for the Semantic Web Mark Vickers Brigham Young University MS Thesis Defense Supported by:
Embedded SQL Direct SQL is rarely used: usually, SQL is embedded in some application code. We need some method to reference SQL statements. But: there.
1 Provenance in O RCHESTRA T.J. Green, G. Karvounarakis, Z. Ives, V. Tannen University of Pennsylvania Principles of Provenance (PrOPr) Philadelphia, PA.
10/3/2000SIMS 257: Database Management -- Ray Larson Relational Algebra and Calculus University of California, Berkeley School of Information Management.
Towards Constraint-based Explanations for Answers and Non-Answers Boris Glavic Illinois Institute of Technology Sean Riddle Athenahealth Corporation Sven.
Working with cursors in Python GISDE Python Workshop Qiao Li.
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 7-1 David M. Kroenke’s Chapter Seven: SQL for Database Construction and.
DBS201: DBA/DBMS Lecture 13.
Context Tailoring the DBMS –To support particular applications Beyond alphanumerical data Beyond retrieve + process –To support particular hardware New.
1 Experimental Evidence on Partitioning in Parallel Data Warehouses Pedro Furtado Prof. at Univ. of Coimbra & Researcher at CISUC DEI/CISUC-Universidade.
FEN  Concepts and terminology  Operations (relational algebra)  Integrity constraints The relational model.
CHAPTER EIGHT Accessing Data Processing Databases.
1 Evaluating top-k Queries over Web-Accessible Databases Paper By: Amelie Marian, Nicolas Bruno, Luis Gravano Presented By Bhushan Chaudhari University.
DATA-DRIVEN UNDERSTANDING AND REFINEMENT OF SCHEMA MAPPINGS Data Integration and Service Computing ITCS 6010.
Chapter 13 Query Processing Melissa Jamili CS 157B November 11, 2004.
Author: Graham Hughes, Tevfik Bultan Computer Science Department, University of California, Santa Barbara, CA 93106, USA Source: International Journal.
Christopher Re and Dan Suciu University of Washington Efficient Evaluation of HAVING Queries on a Probabilistic Database.
Copyright © Curt Hill SQL The Intergalactic Standard Database Query Language.
Online aggregation Joseph M. Hellerstein University of California, Berkley Peter J. Haas IBM Research Division Helen J. Wang University of California,
FALL 2004CENG 351 File Structures and Data Management1 Relational Model Chapter 3.
Database Design and Management CPTG /23/2015Chapter 12 of 38 Functions of a Database Store data Store data School: student records, class schedules,
A COURSE ON PROBABILISTIC DATABASES Dan Suciu University of Washington June, 2014Probabilistic Databases - Dan Suciu 1.
Chapter 9 Query-by-Example Pearson Education © 2009.
© ETH Zürich Eric Lo ETH Zurich a joint work with Carsten Binnig (U of Heidelberg), Donald Kossmann (ETH Zurich), Tamer Ozsu (U of Waterloo) and Peter.
Chapter 10 Analysis and Design Discipline. 2 Purpose The purpose is to translate the requirements into a specification that describes how to implement.
FEN Introduction to the database field:  The Relational Model Seminar: Introduction to relational databases.
Efficient RDF Storage and Retrieval in Jena2 Written by: Kevin Wilkinson, Craig Sayers, Harumi Kuno, Dave Reynolds Presented by: Umer Fareed 파리드.
University of Toronto Department of Computer Science Lifting Transformations to Product Lines Rick Salay, Michalis Famelis, Julia Rubin, Alessio Di Sandro,
Chapter 4 An Introduction to SQL. Copyright © 2004 Pearson Addison-Wesley. All rights reserved.4-2 Topics in this Chapter SQL: History and Overview The.
From Theory to Practice: Efficient Join Query Processing in a Parallel Database System Shumo Chu, Magdalena Balazinska and Dan Suciu Database Group, CSE,
Robust Estimation With Sampling and Approximate Pre-Aggregation Author: Christopher Jermaine Presented by: Bill Eberle.
M.Kersten MonetDB, Cracking and recycling Martin Kersten CWI Amsterdam.
Query Caching and View Selection for XML Databases Bhushan Mandhani Dan Suciu University of Washington Seattle, USA.
Types of Information Systems Basic Computer Concepts Types of Information Systems  Knowledge-based system  uses knowledge-based techniques that supports.
Relaxing Queries Presented by Ashwin Joshi Kapil Patil Sapan Shah.
03/02/20061 Evaluating Top-k Queries Over Web-Accessible Databases Amelie Marian Nicolas Bruno Luis Gravano Presented By: Archana and Muhammed.
Chapter 9: Web Services and Databases Title: NiagaraCQ: A Scalable Continuous Query System for Internet Databases Authors: Jianjun Chen, David J. DeWitt,
SQL Introduction to database and SQL. Chapter 1: Databases and Database Users 6 Introduction to Databases Databases touch all aspects of our lives. Examples:
Default-all is dangerous! Wolfgang Gatterbauer Alexandra Meliou Dan Suciu Database group University of Washington.
1 Overview of Query Evaluation Chapter Outline  Query Optimization Overview  Algorithm for Relational Operations.
Chapter 13: Query Processing
CS 440 Database Management Systems Stored procedures & OR mapping 1.
Chapter 3 The Relational Model. Why Study the Relational Model? Most widely used model. Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc. “Legacy.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 The Relational Model Chapter 3.
Chapter 4 An Introduction to SQL.
Kyriaki Dimitriadou, Brandeis University
Chapter 13 The Data Warehouse
Lecture 16: Probabilistic Databases
Lecture 12: Data Wrangling
Query Processing.
Probabilistic Databases with MarkoViews
Presentation transcript:

University of Washington Database Group Tiresias The Database Oracle for How-To Queries Alexandra Meliou § ✜ Dan Suciu ✜ § University of Massachusetts Amherst ✜ University of Washington

Hypothetical (What-if) Queries Brokerage company DB Key Performance Indicators (KPI) Example from [Balmin et al. VLDB’00]: “An analyst of a brokerage company wants to know what would be the effect on the return of customers’ portfolios if during the last 3 years they had suggested Intel stocks instead of Motorola.” change something in the source (hypothesis) observe the effect in the target forward

How-To Queries Brokerage company DB Key Performance Indicators (KPI) Modified example: “An analyst wants to ask how to achieve a 10% return in customer portfolios, with the least number of trades.” find changes to the source that achieve the desired effect declare a desired effect in the target reverse

TPC-H example  A manufacturing company keeps records of inventory orders in a LineItem table.  KPI: Cannot order more than 7% of the inventory from any single country  Can reassign orders to new suppliers as long as the supplier can supply the part  Minimize the number of changes (constraints) (variables) (optimization objective) constraint optimization

extract data Constraint Optimization on Big Data DB construct optimization model this is for a set of 10 lineitems and 40 suppliers Mixed Integer Programming (MIP) solver transform into data updates MathProg Impractical!

Demo: Tiresias a tool that makes how-to queries practical

Tiresias: How-To Query Engine DBMS MIP solver Tiresias TiQL (Tiresias Query Language) Declarative interface, extension to Datalog

Overview MathProg or AMPL TiQL Visualizations

MathProg or AMPL TiQL Visualizations Overview Demo

MathProg or AMPL TiQL Visualizations Overview Language semantics Evaluation of a TiQL program: Translation from TiQL to linear constraints Performance optimizations

Tiresias Query Language  Datalog-like notation:  TiQL semantics: Mapping from EDBs (Extensional Database) to possible worlds over HDBs (Hypothetical Database) headbody: conjunction of predicates HDB

TiQL Rules Reduction RuleConstraint RuleDeduction Rule Semantics: Similar to repair-key semantics [Antonova et al. SIGMOD’07], [Koch ICDT’09] Semantics: Takes a subset of tuples Semantics: The head predicate needs to hold for all tuples

Demo MathProg or AMPL TiQL Visualizations Overview Language semantics Evaluation of a TiQL program: Translation from TiQL to linear constraints Performance optimizations

Evaluating a TiQL Program Mixed Integer Program (MIP) TiQL DB

Evaluating a TiQL Program possible worlds

Key Constraints NOT a possible world

Provenance Constraints  A TiQL rule specifies transformations  Transformations define provenance  Boolean semantics for queries without aggregates  Semi-module provenance for queries with aggregates [Amsterdamer et al. PODS’11] Disjunction:Conjunction:

Demo MathProg or AMPL TiQL Visualizations Overview Language semantics Evaluation of a TiQL program: Translation from TiQL to linear constraints Performance optimizations

Optimizing Performance  Model optimizer  eliminates variables, constraints, and parameters  uses key constraints, functional dependencies, and provenance  Partitioning optimizer Significantly faster than letting the MIP solver do it

Evaluation of the Model Optimizer baseline with optimization

Evaluation of Tiresias Partitioning 10k tuples 1M tuples granularity of partitioning complex dependency on the granularity of partitioning

Scalability The MIP solver runtime (per partition) does not increase with data size Constructor time depends on DB query execution time

Related Work  Provenance [Amsterdamer et al. PODS’11], [Cui et al. TODS’00], [Green et al. PODS’07]  Incomplete databases [Antonova et al. SIGMOD’07], [Imielinski et al. JACM’84], [Koch ICDT’09]  Other RDM problems [Arasu et al. SIGMOD’11], [Binnig et al. ICDE’07], [Bohannon et al. PODS’06], [Fagin et al. JACM’10]

Next Steps with Tiresias Tiresias Handling non-partitionable problems Approximations Parallelization and handling of skew Result analysis and feedback-based problem generation

SIGMOD Demo Group C Location: Vaquero A Time:13:30-15:00

Contributions  How-To queries  Using MIP solvers to answer How-To queries  Tiresias prototype implementation