Combining Fuzzy Information: An Overview Ronald Fagin.

Slides:



Advertisements
Similar presentations
Optimal Top-k Generation of Attribute Combinations based on Ranked Lists Jiaheng Lu, Renmin University of China Joint work with Pierre Senellart, Chunbin.
Advertisements

Continuing Abstract Interpretation We have seen: 1.How to compile abstract syntax trees into control-flow graphs 2.Lattices, as structures that describe.
Web Information Retrieval
1 Top-K Algorithms: Concepts and Applications by Demetris Zeinalipour Visiting Lecturer Department of Computer Science University of Cyprus Department.
Supporting Top-k join Queries in Relational Databases By:Ihab F. Ilyas, Walid G. Aref, Ahmed K. Elmagarmid Presented by: Calvin R Noronha ( )
Branch-and-Bound In this handout,  Summary of branch-and-bound for integer programs Updating the lower and upper bounds for OPT(IP) Summary of fathoming.
Mining High-Speed Data Streams
The Efficiency of Algorithms
Sorting CMSC 201. Sorting In computer science, there is often more than one way to do something. Sorting is a good example of this!
Introduction to Algorithms
STATISTICS. SOME BASIC STATISTICS MEAN (AVERAGE) – Add all of the data together and divide by the number of elements within that set of data. MEDIAN –
Adaptive Resonance Theory (ART) networks perform completely unsupervised learning. Their competitive learning algorithm is similar to the first (unsupervised)
SUPPORTING TOP-K QUERIES IN RELATIONAL DATABASES. PROCEEDINGS OF THE 29TH INTERNATIONAL CONFERENCE ON VERY LARGE DATABASES, MARCH 2004 Sowmya Muniraju.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu.
1 8. Safe Query Languages Safe program – its semantics can be at least partially computed on any valid database input. Safety is tied to program verification,
MAE 552 – Heuristic Optimization Lecture 27 April 3, 2002
6/15/20151 Top-k algorithms Finding k objects that have the highest overall grades.
Rank Aggregation. Rank Aggregation: Settings Multiple items – Web-pages, cars, apartments,…. Multiple scores for each item – By different reviewers, users,
1 Searching and Integrating Information on the Web Seminar 4: Ranking Queries and Data Privacy Professor Chen Li UC Irvine.
MAE 552 – Heuristic Optimization Lecture 26 April 1, 2002 Topic:Branch and Bound.
This material in not in your text (except as exercises) Sequence Comparisons –Problems in molecular biology involve finding the minimum number of edit.
Aggregation Algorithms and Instance Optimality
Combining Fuzzy Information: an Overview Ronald Fagin Abdullah Mueen -- Slides by Abdullah Mueen.
Lecture 14: A Programming Example: Sorting Yoni Fridman 7/24/01 7/24/01.
A Unified Approach for Computing Top-k Pairs in Multidimensional Space Presented By: Muhammad Aamir Cheema 1 Joint work with Xuemin Lin 1, Haixun Wang.
Evaluating Top-k Queries over Web-Accessible Databases Nicolas Bruno Luis Gravano Amélie Marian Columbia University.
Top- K Query Evaluation with Probabilistic Guarantees Martin Theobald, Gerhard Weikum, Ralf Schenkel Presenter: Avinandan Sengupta.
CS246 Ranked Queries. Junghoo "John" Cho (UCLA Computer Science)2 Traditional Database Query (Dept = “CS”) & (GPA > 3.5) Boolean semantics Clear boundary.
CHAPTER 7: SORTING & SEARCHING Introduction to Computer Science Using Ruby (c) Ophir Frieder at al 2012.
1 Evaluating top-k Queries over Web-Accessible Databases Paper By: Amelie Marian, Nicolas Bruno, Luis Gravano Presented By Bhushan Chaudhari University.
Computer Science Searching & Sorting.
Towards Robust Indexing for Ranked Queries Dong Xin, Chen Chen, Jiawei Han Department of Computer Science University of Illinois at Urbana-Champaign VLDB.
Ranking in DB Laks V.S. Lakshmanan Depf. of CS UBC.
Ordering Whole Numbers Lesson 1-2. Ordering Whole Numbers There are two ways to order whole numbers: 1._______________________________ 2._______________________________.
Higher Grade Computing Studies 4. Standard Algorithms Higher Computing Software Development S. McCrossan 1 Linear Search This algorithm allows the programmer.
Data Structures Simple Sorts Phil Tayco Slide version 1.0 Feb. 8, 2015.
Scaling up Decision Trees. Decision tree learning.
Information Networks Rank Aggregation Lecture 10.
Centroids part 2 Getting rid of outliers and sorting.
1University of Texas at Arlington.  Introduction  Motivation  Requirements  Paper’s Contribution.  Related Work  Overview of Ripple Join  Rank.
The university of Hong Kong Department of Computer Science Continuous Monitoring of Top-k Queries over Sliding Windows Authors: Kyriakos Mouratidis, Spiridon.
All right reserved by Xuehua Shen 1 Optimal Aggregation Algorithms for Middleware Ronald Fagin, Amnon Lotem, Moni Naor (PODS01)
Supporting Top-k join Queries in Relational Databases Ihab F. Ilyas, Walid G. Aref, Ahmed K. Elmagarmid Presented by: Z. Joseph, CSE-UT Arlington.
Presented by Suresh Barukula 2011csz  Top-k query processing means finding k- objects, that have highest overall grades.  A query in multimedia.
CS4432: Database Systems II Query Processing- Part 2.
CSE 6392 – Data Exploration and Analysis in Relational Databases April 20, 2006.
Optimal Aggregation Algorithms for Middleware By Ronald Fagin, Amnon Lotem, and Moni Naor.
Measures of Central Tendency Mean, Median, Mode, and Range.
8.1 8 Algorithms Foundations of Computer Science  Cengage Learning.
Top-k Query Processing Optimal aggregation algorithms for middleware Ronald Fagin, Amnon Lotem, and Moni Naor + Sushruth P. + Arjun Dasgupta.
Database Searching and Information Retrieval Presented by: Tushar Kumar.J Ritesh Bagga.
03/02/20061 Evaluating Top-k Queries Over Web-Accessible Databases Amelie Marian Nicolas Bruno Luis Gravano Presented By: Archana and Muhammed.
Sorting Algorithms Written by J.J. Shepherd. Sorting Review For each one of these sorting problems we are assuming ascending order so smallest to largest.
Neighborhood - based Tag Prediction
Query Reranking As A Service
Challenges in Creating an Automated Protein Structure Metaserver
Top-k Query Processing
Sorting Algorithms Written by J.J. Shepherd.
Algorithm design and Analysis
Rank Aggregation.
Enumerating Distances Using Spanners of Bounded Degree
Laks V.S. Lakshmanan Depf. of CS UBC
Popular Ranking Algorithms
Lecture 2- Query Processing (continued)
Matthew Renner, Trish Beeksma, Patch Kenny
Ensemble learning.
INFORMATION RETRIEVAL TECHNIQUES BY DR. ADNAN ABID
Efficient Processing of Top-k Spatial Preference Queries
Query Specific Ranking
Algorithm Course Algorithms Lecture 3 Sorting Algorithm-1
Presentation transcript:

Combining Fuzzy Information: An Overview Ronald Fagin

Problem Definition Given a collection of objects, our goal is to find Top-k objects, whose scores are greater than the remaining objects.

A sample set of Databases ObjectArea (x 3 ) ObjectRedness (x 1 ) ObjectRoundness (x 2 ) Attributes Grades Every subsystem is sorted by the grade it holds

Top-k Object Problem Naïve Algorithm Fagin’s Algorithm Threshold Algorithm

Naïve Algorithm Basic Idea:  For for each object, use the aggregation function to get the score  According to the scores, get the top k.

Questions Do we need to count the score for every object in the database? Can we SAFELY ignore some objects whose scores are lower than what we already have?

Fagin’s Algorithm Do Sorted access in parallel at all the lists Stop when we have k objects which appear in all the lists Calculate score value of all the objects Compute Top-k objects

Example: Fagin’s Algorithm { } 1 Objects appear in every list: Objects seen so far: {, } k = 3

Example: Fagin’s Algorithm { } 2 Objects appear in every list: Objects seen so far: {,,, } k = 3

Example: Fagin’s Algorithm { } 3 Objects appear in every list: Objects seen so far: {,,,, } k = 3

Example: Fagin’s Algorithm {,, } 4 Objects appear in every list: Objects seen so far: {,,,, }We got enough objects k = 3

Example: Fagin’s Algorithm {,, } 4 Objects appear in every list: Objects seen so far: {,,,, }We got enough objects For all these, calculate the score and get the Top-k k = 3

The Threshold Algorithm Do Sorted access in parallel at all the lists until τ < g – For each object R that has been seen at least once in any of the list Do random accesses to get the attribute values of R from the lists where the object has not been seen yet. Compute t(R) and update the list of top k objects (Y) if necessary. – Compute τ = t(x 1,x 2,…,x m ) where x i is the grade of the last seen object from list L i under sorted access. – If τ is less than the lowest aggregated grade (g) of the top k set (Y) then halt.

Example : Threshold Algorithm τ = 3, Y = {, } g = x x x-marked objects are the first to be seen of their kind and when seen they have been accessed in the other databases randomly to compute their aggregate function. t=sum and k=3 iterations

Example : Threshold Algorithm τ = 3, Y = {, } g = τ = 2.95, Y = {,, } g = 1.8 x x x x x-marked objects are the first to be seen of their kind and when seen they have been accessed in the other databases randomly to compute their aggregate function. t=sum and k=3 iterations

Example : Threshold Algorithm τ = 3, Y = {, } g = 1.8 τ = 2.02, Y = {,, } g = τ = 2.95, Y = {,, } g = 1.8 x x x x x-marked objects are the first to be seen of their kind and when seen they have been accessed in the other databases randomly to compute their aggregate function. t=sum and k=3 iterations x

Example : Threshold Algorithm τ = 3, Y = {, } g = 1.8 τ = 2.02, Y = {,, } g = 1.95 τ = 1.55, Y = {,, } g = τ = 2.95, Y = {,, } g = 1.8 x x x x x-marked objects are the first to be seen of their kind and when seen they have been accessed in the other databases randomly to compute their aggregate function. t=sum and k=3 iterations x

Restricting Sorted Access A subset Z’ of the databases are not accessible under sorted access. TA is modified to handle such scenario. τ = t(x 1,x 2,…,x m ) where x i is 1 for all inaccessible database L i. All databases in Z’ are accessed only under random access mode.

Restricting Sorted Access τ = 3, Y = { } g = t=sum and k=3 x Inaccessible under sorted access x-marked objects are the first to be seen of their kind

τ = 3, Y = {,, } g = 1.8 Restricting Sorted Access τ = 3, Y = { } g = t=sum and k=3 x x x Inaccessible under sorted access x-marked objects are the first to be seen of their kind

τ = 3, Y = {,, } g = 1.8 Restricting Sorted Access τ = 3, Y = { } g = 2.75 τ = 2.17, Y = {,, } g = t=sum and k=3 x x x x Inaccessible under sorted access x-marked objects are the first to be seen of their kind

τ = 3, Y = {,, } g = 1.8 Restricting Sorted Access τ = 3, Y = { } g = 2.75 τ = 2.17, Y = {,, } g = 1.95 τ = 1.8, Y = {,, } g = t=sum and k=3 x x x x x Inaccessible under sorted access x-marked objects are the first to be seen of their kind

Restricting Random Access If t is a monotone, W(R) is a lower bound on t(R) computed by replacing unknown attribute values with 0 in t. B(R) is an upper bound on t(R) computed by replacing unknown attribute values with the least value seen in the database. Here Y is the top k list that contains k objects with the largest W values seen so far. Ties broken by B values and then arbitrarily.

Example: Restricting Random Access Y = {, } 1 x1x x2x x3x W B Y is the sorted top-k list

Example: Restricting Random Access Y = {,, } 2 x1x x2x x3x W B W( ) = = 1.95

Example: Restricting Random Access Y = {,, } 3 x1x x2x x3x W B B( ) = = 2.17

Example: Restricting Random Access Y = {,, } 4 x1x x2x x3x W B

Example: Restricting Random Access Y = {,, } 5 x1x x2x x3x W B At this point the algorithm halts because all the objects not in Y have smaller B values than the smallest W value in the Y which is 1.95 here.