CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #4: Multi-key and Spatial Access Methods - I C. Faloutsos.

Slides:



Advertisements
Similar presentations
TWO STEP EQUATIONS 1. SOLVE FOR X 2. DO THE ADDITION STEP FIRST
Advertisements

You have been given a mission and a code. Use the code to complete the mission and you will save the world from obliteration…
Copyright © 2003 Pearson Education, Inc. Slide 1 Computer Systems Organization & Architecture Chapters 8-12 John D. Carpinelli.
Chapter 1 The Study of Body Function Image PowerPoint
Copyright © 2011, Elsevier Inc. All rights reserved. Chapter 6 Author: Julia Richards and R. Scott Hawley.
1 Copyright © 2010, Elsevier Inc. All rights Reserved Fig 2.1 Chapter 2.
1 Chapter 40 - Physiology and Pathophysiology of Diuretic Action Copyright © 2013 Elsevier Inc. All rights reserved.
By D. Fisher Geometric Transformations. Reflection, Rotation, or Translation 1.
We need a common denominator to add these fractions.
Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13
Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13
Exit a Customer Chapter 8. Exit a Customer 8-2 Objectives Perform exit summary process consisting of the following steps: Review service records Close.
Multiplying binomials You will have 20 seconds to answer each of the following multiplication problems. If you get hung up, go to the next problem when.
0 - 0.
1 1  1 =.
1  1 =.
ALGEBRAIC EXPRESSIONS
DIVIDING INTEGERS 1. IF THE SIGNS ARE THE SAME THE ANSWER IS POSITIVE 2. IF THE SIGNS ARE DIFFERENT THE ANSWER IS NEGATIVE.
MULTIPLYING MONOMIALS TIMES POLYNOMIALS (DISTRIBUTIVE PROPERTY)
ADDING INTEGERS 1. POS. + POS. = POS. 2. NEG. + NEG. = NEG. 3. POS. + NEG. OR NEG. + POS. SUBTRACT TAKE SIGN OF BIGGER ABSOLUTE VALUE.
MULTIPLICATION EQUATIONS 1. SOLVE FOR X 3. WHAT EVER YOU DO TO ONE SIDE YOU HAVE TO DO TO THE OTHER 2. DIVIDE BY THE NUMBER IN FRONT OF THE VARIABLE.
SUBTRACTING INTEGERS 1. CHANGE THE SUBTRACTION SIGN TO ADDITION
MULT. INTEGERS 1. IF THE SIGNS ARE THE SAME THE ANSWER IS POSITIVE 2. IF THE SIGNS ARE DIFFERENT THE ANSWER IS NEGATIVE.
FACTORING ax2 + bx + c Think “unfoil” Work down, Show all steps.
Addition Facts
Year 6 mental test 10 second questions Numbers and number system Numbers and the number system, fractions, decimals, proportion & probability.
Introduction to Relational Database Systems 1 Lecture 4.
Around the World AdditionSubtraction MultiplicationDivision AdditionSubtraction MultiplicationDivision.
CMU SCS : Multimedia Databases and Data Mining Lecture #17: Text - part IV (LSI) C. Faloutsos.
Displaying Data from Multiple Tables
O X Click on Number next to person for a question.
© S Haughton more than 3?
CMU SCS : Multimedia Databases and Data Mining Lecture #7: Spatial Access Methods - IV Grid files, dim. curse C. Faloutsos.
Twenty Questions Subject: Twenty Questions
Copyright © 2012, Elsevier Inc. All rights Reserved. 1 Chapter 7 Modeling Structure with Blocks.
Solve by Substitution: Isolate one variable in an equation
Energy & Green Urbanism Markku Lappalainen Aalto University.
© 2012 National Heart Foundation of Australia. Slide 2.
Introduction to Indexes Rui Zhang The University of Melbourne Aug 2006.
Adding Up In Chunks.
Lets play bingo!!. Calculate: MEAN Calculate: MEDIAN
Lecture plan Outline of DB design process Entity-relationship model
Past Tense Probe. Past Tense Probe Past Tense Probe – Practice 1.
Event 4: Mental Math 7th/8th grade Math Meet ‘11.
Addition 1’s to 20.
25 seconds left…...
Test B, 100 Subtraction Facts
Week 1.
Number bonds to 10,
We will resume in: 25 Minutes.
©Brooks/Cole, 2001 Chapter 12 Derived Types-- Enumerated, Structure and Union.
CMU SCS : Multimedia Databases and Data Mining Lecture#1: Introduction Christos Faloutsos CMU
1 Unit 1 Kinematics Chapter 1 Day
O X Click on Number next to person for a question.
PSSA Preparation.
Essential Cell Biology
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
Copyright © Cengage Learning. All rights reserved.
Copyright © Cengage Learning. All rights reserved.
Non-Perfect Squares 7.N.18 Identify the two consecutive whole numbers between which the square root of a non-perfect square whole number less than 225.
CMU SCS : Multimedia Databases and Data Mining Lecture #11: Fractals: M-trees and dim. curse (case studies – Part II) C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #7: Spatial Access Methods - Metric trees C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture #10: Fractals - case studies - I C. Faloutsos.
CMU SCS : Multimedia Databases and Data Mining Lecture#5: Multi-key and Spatial Access Methods - II C. Faloutsos.
© 2006, François Brouard Case Real Group François Brouard, DBA, CA January 6, 2006.
CMU SCS : Multimedia Databases and Data Mining Lecture#2: Primary key indexing – B-trees Christos Faloutsos - CMU
CMU SCS : Multimedia Databases and Data Mining Lecture #7: Spatial Access Methods - Metric trees C. Faloutsos.
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
15-826: Multimedia Databases and Data Mining
Presentation transcript:

CMU SCS : Multimedia Databases and Data Mining Lecture #4: Multi-key and Spatial Access Methods - I C. Faloutsos

CMU SCS Copyright: C. Faloutsos (2012)2 Must-Read Material MM-Textbook, Chapter 4 [Bentley75] J.L. Bentley: Multidimensional Binary Search Trees Used for Associative Searching, CACM, 18,9, Sept Ramakrinshan+Gehrke, Chapter

CMU SCS Copyright: C. Faloutsos (2012)3 Outline Goal: ‘Find similar / interesting things’ Intro to DB Indexing - similarity search Data Mining

CMU SCS Copyright: C. Faloutsos (2012)4 Indexing - Detailed outline primary key indexing secondary key / multi-key indexing spatial access methods text...

CMU SCS Copyright: C. Faloutsos (2012)5 Sec. key indexing attributes w/ duplicates (eg., EMPLOYEES, with ‘job-code’) Query types: –exact match –partial match ‘job-code’= ‘PGM’ and ‘dept’=‘R&D’ –range queries ‘job-code’=‘ADMIN’ and salary < 50K

CMU SCS Copyright: C. Faloutsos (2012)6 Sec. key indexing Query types - cont’d –boolean ‘job-code’=‘ADMIN’ or salary>20K –nn salary ~ 30K

CMU SCS Copyright: C. Faloutsos (2012)7 Solution?

CMU SCS Copyright: C. Faloutsos (2012)8 Solution? Inverted indices (usually, w/ B-trees) Q: how to handle duplicates? salary-index 50 70

CMU SCS Copyright: C. Faloutsos (2012)9 Solution A#1: eg., with postings lists salary-index postings lists

CMU SCS Copyright: C. Faloutsos (2012)10 Solution A#2: modify B-tree code, to handle dup’s salary-index

CMU SCS Copyright: C. Faloutsos (2012)11 How to handle Boolean Queries? salary-index eg., ‘sal=50 AND job-code=PGM’?

CMU SCS Copyright: C. Faloutsos (2012)12 How to handle Boolean Queries? salary-index –from indices, find lists of qual. record-ids –merge lists (or check real records)

CMU SCS Copyright: C. Faloutsos (2012)13 Sec. key indexing easily solved in commercial DBMS: create index sal-index on EMPLOYEE (salary); select * from EMPLOYEE where salary > 50 and job-code = ‘ADMIN’

CMU SCS Copyright: C. Faloutsos (2012)14 Sec. key indexing can create combined indices: create index sj on EMPLOYEE( salary, job- code);

CMU SCS Copyright: C. Faloutsos (2012)15 Indexing - Detailed outline primary key indexing secondary key / multi-key indexing –main memory: quad-trees –main memory: k-d-trees spatial access methods text...

CMU SCS Copyright: C. Faloutsos (2012)16 Quad-trees problem: find cities within 100mi from Pittsburgh assumption: all fit in main memory Q: how to answer such queries quickly?

CMU SCS Copyright: C. Faloutsos (2012)17 Quad-trees A: recursive decomposition of space, e.g.: PGH ATL PHL

CMU SCS Copyright: C. Faloutsos (2012)18 Quad-trees A: recursive decomposition of space, e.g.: PGH ATL PHL (30,10) SW

CMU SCS Copyright: C. Faloutsos (2012)19 Quad-trees A: recursive decomposition of space, e.g.: PGH ATL PHL (30,10) SW ,20 NE

CMU SCS Copyright: C. Faloutsos (2012)20 Quad-trees - search? find cities with (35<x<45, 15<y<25): PGH ATL PHL (30,10) SW ,20 NE

CMU SCS Copyright: C. Faloutsos (2012)21 Quad-trees - search? find cities with (35<x<45, 15<y<25): PGH ATL PHL (30,10) SW ,20 NE

CMU SCS Copyright: C. Faloutsos (2012)22 Quad-trees - search? pseudocode: range-query( tree-ptr, range) if (tree-ptr == NULL) exit; if (tree-ptr->point within range){ print tree-ptr->point} for each quadrant { if ( range intersects quadrant ) { range-query( tree-ptr->quadrant-ptr, range); }

CMU SCS Copyright: C. Faloutsos (2012)23 Quad-trees - k-nn search? k-nearest neighbor algo - more complicated: –find ‘good’ neighbors and put them in a stack –go to the most promising quadrant, and update the stack of neighbors –until we hit the leaves

CMU SCS Copyright: C. Faloutsos (2012)24 Quad-trees - discussion great for 2- and 3-d spaces several variations, like fixed decomposition: PGH ATL PHL PGH ATL PHL ‘adaptive’ ‘fixed’ z-ordering (later) middle

CMU SCS Copyright: C. Faloutsos (2012)25 Quad-trees - discussion but: unsuitable for higher-d spaces (why?)

CMU SCS Copyright: C. Faloutsos (2012)26 Quad-trees - discussion but: unsuitable for higher-d spaces (why?) A: 2^d pointers, per node! Q: how to solve this problem? A: k-d-trees!

CMU SCS Copyright: C. Faloutsos (2012)27 Indexing - Detailed outline primary key indexing secondary key / multi-key indexing –main memory: quad-trees –main memory: k-d-trees spatial access methods text...

CMU SCS Copyright: C. Faloutsos (2012)28 k-d-trees Binary trees, with alternating ‘discriminators’ PGH ATL PHL (30,10) SW quad-tree

CMU SCS Copyright: C. Faloutsos (2012)29 k-d-trees Binary trees, with alternating ‘discriminators’ PGH ATL PHL (30,10) W k-d-tree E

CMU SCS Copyright: C. Faloutsos (2012)30 k-d-trees Binary trees, with alternating ‘discriminators’ PGH ATL PHL (30,10) x<=30 x>30 ATL

CMU SCS Copyright: C. Faloutsos (2012)31 k-d-trees Binary trees, with alternating ‘discriminators’ PGH ATL PHL (30,10) x<=30 x>30 ATL (40,20) y<=20 y>20 PHL x y

CMU SCS Copyright: C. Faloutsos (2012)32 (Several demos/applets, e.g.) s/kdtree.htmlhttp://donar.umiacs.umd.edu/quadtree/point s/kdtree.html

CMU SCS Copyright: C. Faloutsos (2012)33 Indexing - Detailed outline primary key indexing secondary key / multi-key indexing –main memory: quad-trees –main memory: k-d-trees insertion; deletion range query; k-nn query spatial access methods text...

CMU SCS Copyright: C. Faloutsos (2012)34 k-d-trees - insertion Binary trees, with alternating ‘discriminators’ PGH ATL PHL (30,10) x<=30 x>30 ATL (40,20) y<=20 y>20 PHL x y

CMU SCS Copyright: C. Faloutsos (2012)35 k-d-trees - insertion discriminators: may cycle, or.... Q: which should we put first? PGH ATL PHL (30,10) x<=30 x>30 ATL (40,20) y<=20 y>20 PHL x y

CMU SCS Copyright: C. Faloutsos (2012)36 k-d-trees - deletion How? PGH ATL PHL (30,10) x<=30 x>30 ATL (40,20) y<=20 y>20 PHL x y

CMU SCS Copyright: C. Faloutsos (2012)37 k-d-trees - deletion Tricky! ‘delete-and-promote’ (or ‘mark as deleted’) PGH ATL PHL (30,10) x<=30 x>30 ATL (40,20) y<=20 y>20 PHL x y

CMU SCS Copyright: C. Faloutsos (2012)38 k-d-trees - range query PGH ATL PHL (30,10) x<=30 x>30 ATL (40,20) y<=20 y>20 PHL

CMU SCS Copyright: C. Faloutsos (2012)39 k-d-trees - range query similar to quad-trees: check the root; proceed to appropriate child(ren). PGH ATL PHL (30,10) x<=30 x>30 ATL (40,20) y<=20 y>20 PHL

CMU SCS Copyright: C. Faloutsos (2012)40 k-d-trees - k-nn query e.g., 1-nn: closest city to ‘X’ PGH ATL PHL (30,10) x<=30 x>30 ATL (40,20) y<=20 y>20 PHL X

CMU SCS Copyright: C. Faloutsos (2012)41 k-d-trees - k-nn query A: check root; put in stack; proceed to child PGH ATL PHL (30,10) x<=30 x>30 ATL (40,20) y<=20 y>20 PHL X

CMU SCS Copyright: C. Faloutsos (2012)42 k-d-trees - k-nn query A: check root; put in stack; proceed to child PGH ATL PHL (30,10) x<=30 x>30 ATL (40,20) y<=20 y>20 PHL X

CMU SCS Copyright: C. Faloutsos (2012)43 Indexing - Detailed outline primary key indexing secondary key / multi-key indexing –main memory: quad-trees –main memory: k-d-trees insertion; deletion range query; k-nn query discussion spatial access methods text...

CMU SCS Copyright: C. Faloutsos (2012)44 k-d trees - discussion great for main memory & low ‘d’ (~<10) Q: what about high-d? A: Q: what about disk A:

CMU SCS Copyright: C. Faloutsos (2012)45 k-d trees - discussion great for main memory & low ‘d’ (~<10) Q: what about high-d? A: most attributes don’t ever become discriminators Q: what about disk? A: Pagination problems, after ins./del. (solutions: next!)

CMU SCS Copyright: C. Faloutsos (2012)46 Conclusions sec. keys: B-tree indices (+ postings lists) multi-key, main memory methods: –quad-trees –k-d-trees

CMU SCS Copyright: C. Faloutsos (2012)47 References [Bentley75] J.L. Bentley: Multidimensional Binary Search Trees Used for Associative Searching, CACM, 18,9, Sept [Finkel74] R.A. Finkel, J.L. Bentley: Quadtrees: A data structure for retrieval on composite keys, ACTA Informatica,4,1, 1974 Applet: eg.,