CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #30: Conclusions C. Faloutsos.

CMU SCS (c) 2012, C. Faloutsos

Outline
Goal: ‘Find similar / interesting things’
Intro to DB
Indexing - similarity search
– Points
– Text
– Time sequences; images etc.
– Graphs
Data Mining

Indexing - similarity search

Indexing - similarity search
R-trees
z-ordering / Hilbert curves
M-trees
(DON’T FORGET … )
[Figure labels: A to J; P1 to P4]

Indexing - similarity search
R-trees
z-ordering / Hilbert curves
M-trees
beware of high intrinsic dimensionality
[Figure labels: A to J; P1 to P4]
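The z-ordering idea listed above can be sketched in a few lines: interleave the bits of the coordinates so that each point gets a single integer key. This is only an illustration; the function name `z_order` and the 16-bit default are assumptions made for the example, not something the slides specify.

```python
def z_order(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one z-value (Morton key)."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i + 1)  # bits of x go to odd positions
        key |= ((y >> i) & 1) << (2 * i)      # bits of y go to even positions
    return key


# Sorting 2-D points by z-value yields a 1-D order that a B-tree could index.
points = [(3, 5), (0, 0), (2, 2), (1, 1)]
ordered = sorted(points, key=lambda p: z_order(*p))
```

Points that are close in the plane tend to get nearby keys, although the curve has jumps; the Hilbert curve mentioned on the same slide reduces those jumps at the cost of a more involved mapping.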

Outline
Goal: ‘Find similar / interesting things’
Intro to DB
Indexing - similarity search
– Points
– Text
– Time sequences; images etc.
– Graphs
Data Mining

Text searching
‘find all documents with word bla’

Text searching
Full text scanning (‘grep’)
Inversion (B-tree or hash index)
(signature files)
Vector space model
– Ranked output
– Relevance feedback
String editing distance (-> dynamic prog.)
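The last item, string editing distance via dynamic programming, can be sketched as below. It assumes unit cost for insertion, deletion and substitution (the classic Levenshtein setting); the slides do not state which cost model the course used, and the name `edit_distance` is chosen here for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs, keeping one row of the DP table."""
    prev = list(range(len(b) + 1))  # distance from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            substitute = prev[j - 1] + (0 if ca == cb else 1)
            delete = prev[j] + 1
            insert = curr[j - 1] + 1
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]
```

The table has one cell per pair of prefixes, so the running time is proportional to `len(a) * len(b)`, while memory stays proportional to `len(b)` because only the previous row is kept.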

Multimedia indexing
[Figure: sequences S1 … Sn, each plotted over day 1 to 365]

‘GEMINI’ - Pictorially
[Figure: sequences S1 … Sn over day 1 to 365, mapped to feature points F(S1) … F(Sn); example features: avg, std]

Multimedia indexing
Feature extraction for indexing (GEMINI)
– Lower-bounding lemma, to guarantee no false dismissals
MDS/FastMap
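A minimal sketch of the filter-and-refine idea behind GEMINI follows, using the average as the single feature (as in the ‘avg’ example on the pictorial slide). The scaling by sqrt(n) is an assumption added here so that the feature distance never exceeds the Euclidean distance between equal-length sequences (this follows from the Cauchy-Schwarz inequality). Because of that bound, the filter step cannot discard a true match; candidates that pass the filter but fail the exact test are dropped in the refine step. All function names are illustrative.

```python
import math


def euclidean(x, y):
    """Exact distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def avg_feature(seq):
    """One-dimensional feature: the mean scaled by sqrt(n), i.e. sum / sqrt(n)."""
    return sum(seq) / math.sqrt(len(seq))


def range_query(database, query, eps):
    """Return all sequences within distance eps of the query."""
    fq = avg_feature(query)
    # Filter: cheap test in feature space; it can only over-include.
    candidates = [s for s in database if abs(avg_feature(s) - fq) <= eps]
    # Refine: the exact distance removes the extra candidates.
    return [s for s in candidates if euclidean(s, query) <= eps]


database = [[1, 2, 3], [3, 2, 1], [10, 10, 10], [1, 2, 4]]
hits = range_query(database, [1, 2, 3], eps=1.5)
```

In this toy data, `[3, 2, 1]` has the same average as the query and therefore passes the filter, but its true distance is about 2.83, so the refine step drops it. In a real system the filter would run on a spatial index over the features rather than a linear scan.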

Outline
Goal: ‘Find similar / interesting things’
Intro to DB
Indexing - similarity search
– Points
– Text
– Time sequences; images etc.
– Graphs
Data Mining

Time series & forecasting
Goal: given a signal (e.g., sales over time and/or space)
Find: patterns and/or compress
[Figure: count of lynx caught per year; axes: year, count]

Wavelets
[Figure: axis labels t, f; time, value]
Q: baritone/silence/soprano - DWT?

Time series + forecasting
Fourier; Wavelets
Box/Jenkins and AutoRegression
non-linear/chaotic forecasting (fractals again)
– ‘Delayed Coordinate Embedding’ ~ nearest neighbors
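As a small illustration of the AutoRegression item, the sketch below fits the simplest case, an AR(1) model x_t ≈ a · x_{t-1}, by least squares and then rolls it forward. The lack of an intercept term and the single lag are simplifying assumptions made for the example; the function names are hypothetical.

```python
def fit_ar1(series):
    """Least-squares estimate of a in x_t = a * x_{t-1} (no intercept)."""
    num = sum(prev * curr for prev, curr in zip(series, series[1:]))
    den = sum(prev * prev for prev in series[:-1])
    if den == 0:
        raise ValueError("series has no signal to regress on")
    return num / den


def forecast(series, coeff, steps):
    """Roll the fitted AR(1) model forward for `steps` time ticks."""
    out, last = [], series[-1]
    for _ in range(steps):
        last = coeff * last
        out.append(last)
    return out


history = [8.0, 4.0, 2.0, 1.0]
a = fit_ar1(history)                  # halving at every step
future = forecast(history, a, steps=2)
```

A higher-order model would regress on several past values at once, which turns the scalar division into a small linear system.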

Outline
Goal: ‘Find similar / interesting things’
Intro to DB
Indexing - similarity search
– Points
– Text
– Time sequences; images etc.
– Graphs
Data Mining

Graphs
Real graphs: surprising patterns
– ??

Graphs
Real graphs: surprising patterns
– ‘six degrees’
– Skewed degree distribution (‘rich get richer’)
– Super-linearities (2x nodes -> 3x edges)
– Diameter: shrinks (!)
– Might have no good cuts
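The super-linearity bullet can be made concrete with a little arithmetic. Assuming the pattern is a power law E ∝ N^a (an interpretation, since the slide only gives the ‘2x nodes -> 3x edges’ example), the exponent follows by taking logarithms. The function names below are illustrative.

```python
import math


def densification_exponent(node_factor, edge_factor):
    """Exponent a such that multiplying nodes by node_factor multiplies edges by edge_factor."""
    return math.log(edge_factor) / math.log(node_factor)


def predicted_edges(edges_now, nodes_now, nodes_later, exponent):
    """Extrapolate the edge count under E proportional to N ** exponent."""
    return edges_now * (nodes_later / nodes_now) ** exponent


a = densification_exponent(2, 3)            # log(3) / log(2), roughly 1.58
later = predicted_edges(1000, 100, 400, a)  # 4x nodes, i.e. two doublings
```

With this exponent, two doublings of the node count multiply the edge count by about nine, so the average degree grows as the graph grows.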

Outline
Goal: ‘Find similar / interesting things’
Intro to DB
Indexing - similarity search
Data Mining

Data Mining - DB

Data Mining - DB
Association Rules (‘diapers’ -> ‘beer’)
[ OLAP (DataCubes, roll-up, drill-down) ]
[ Classifiers ]
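To illustrate what a rule such as ‘diapers’ -> ‘beer’ measures, the sketch below computes the two standard quantities, support and confidence, on a made-up set of baskets. The transactions are invented for the example and the function names are assumptions.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def confidence(transactions, lhs, rhs):
    """Estimated P(rhs | lhs): support of both sides over support of lhs."""
    return support(transactions, set(lhs) | set(rhs)) / support(transactions, lhs)


# Hypothetical market baskets, one set of items per transaction.
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"milk", "bread"},
]
rule_support = support(baskets, {"diapers", "beer"})
rule_confidence = confidence(baskets, {"diapers"}, {"beer"})
```

Here the rule holds in 2 of 4 baskets (support 0.5) and in 2 of the 3 baskets that contain diapers (confidence 2/3). A mining algorithm searches for all rules whose support and confidence exceed user-chosen thresholds.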

Taking a step back: We saw some fundamental, recurring concepts and tools:

Powerful, recurring tools
Fractals / self similarity
– Zipf, Korcak, Pareto’s laws
– intrinsic dimension (Sierpinski triangle)
– correlation integral
– Barnsley’s IFS compression
– (Kronecker graphs)

Powerful, recurring tools
Fractals / self similarity
– Zipf, Korcak, Pareto’s laws
– intrinsic dimension (Sierpinski triangle)
– correlation integral
– Barnsley’s IFS compression
– (Kronecker graphs)
‘Take logarithms’
AVOID ‘avg’
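The correlation integral and the ‘take logarithms’ advice fit together: count the pairs of points within radius r, and the slope of log(count) against log(r) estimates the intrinsic dimension. The sketch below uses only two radii and a brute-force pair count, which is a simplification for illustration (a careful estimate would fit a line over many radii); the data set and the function names are made up for the example.

```python
import math


def pair_count(points, radius):
    """Number of distinct point pairs within `radius` of each other."""
    n = len(points)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) <= radius
    )


def correlation_dimension(points, r_small, r_large):
    """Slope of log(pair count) versus log(radius) between two radii."""
    rise = math.log(pair_count(points, r_large)) - math.log(pair_count(points, r_small))
    run = math.log(r_large) - math.log(r_small)
    return rise / run


# 200 points on a diagonal line: embedded in 2-D, but intrinsically 1-D.
line = [(i / 200, i / 200) for i in range(200)]
dim = correlation_dimension(line, 0.03, 0.24)
```

For this line the estimate comes out close to 1 even though each point has two coordinates, which is the sense in which the intrinsic dimension, not the embedding dimension, is what matters.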

Powerful, recurring tools
SVD (optimal L2 approx)
– LSI, KL, PCA, ‘eigenSpokes’, (& in ICA)
– HITS (PageRank)
[Figure: v1, the first singular vector]
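One way to see why the first singular vector keeps reappearing (for example in HITS) is that it can be found by repeated multiplication. The sketch below runs power iteration on AᵀA in plain Python. It assumes the uniform starting vector is not orthogonal to v1 and that the top singular value is strictly larger than the second; it is an illustration, not a substitute for a numerical library.

```python
import math


def first_singular_vector(matrix, iterations=100):
    """Power iteration on A^T A; returns (sigma1, v1) for a list-of-rows matrix."""
    n_cols = len(matrix[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols  # uniform unit-length start
    for _ in range(iterations):
        av = [sum(a * x for a, x in zip(row, v)) for row in matrix]  # A v
        atav = [
            sum(row[j] * av[i] for i, row in enumerate(matrix))      # A^T (A v)
            for j in range(n_cols)
        ]
        norm = math.sqrt(sum(x * x for x in atav))
        if norm == 0.0:
            break  # the matrix maps v to zero; nothing to iterate on
        v = [x / norm for x in atav]
    av = [sum(a * x for a, x in zip(row, v)) for row in matrix]
    sigma = math.sqrt(sum(x * x for x in av))  # sigma1 = |A v1|
    return sigma, v


sigma, v1 = first_singular_vector([[1.0, 2.0], [2.0, 4.0]])
```

For this rank-one example the result is sigma1 = 5 with v1 proportional to (1, 2). Projecting each row onto v1 and scaling back gives the best rank-one approximation in the L2 sense.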

Powerful, recurring tools
Discrete Fourier Transform
Wavelets
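As a reminder of how a wavelet transform works mechanically, here is a sketch of the Haar transform, the simplest wavelet: repeatedly replace neighbouring pairs by their scaled sum (smooth part) and scaled difference (detail part). The orthonormal 1/sqrt(2) scaling and the output order (overall smooth coefficient first, then details from coarse to fine) are choices made for this example.

```python
import math


def haar_dwt(signal):
    """Full orthonormal Haar transform; the length must be a power of two."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        smooth = [(a + b) / math.sqrt(2) for a, b in pairs]
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        details = detail + details  # coarser levels end up in front
        approx = smooth
    return approx + details


coeffs = haar_dwt([1.0, 1.0, 5.0, 5.0])
```

The step signal above needs only two nonzero coefficients (approximately 6 and -4), which is why wavelets compress signals with sharp, localized changes well. The transform preserves energy: the squared coefficients sum to the same value as the squared samples.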

Powerful, recurring tools
Matrix inversion lemma
– Recursive Least Squares
– Sherman-Morrison(-Woodbury)
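The rank-one case of the matrix inversion lemma (Sherman-Morrison) can be written directly from its formula, (A + u vᵀ)⁻¹ = A⁻¹ − (A⁻¹ u vᵀ A⁻¹) / (1 + vᵀ A⁻¹ u). The sketch below is plain Python on lists of rows and is meant only to show the mechanics; the function name is an assumption.

```python
def sherman_morrison(a_inv, u, v):
    """Return (A + u v^T)^-1 given A^-1, using the rank-one update formula."""
    n = len(a_inv)
    a_inv_u = [sum(a_inv[i][k] * u[k] for k in range(n)) for i in range(n)]   # A^-1 u
    vt_a_inv = [sum(v[k] * a_inv[k][j] for k in range(n)) for j in range(n)]  # v^T A^-1
    denom = 1.0 + sum(v[k] * a_inv_u[k] for k in range(n))
    if denom == 0.0:
        raise ValueError("A + u v^T is singular")
    return [
        [a_inv[i][j] - a_inv_u[i] * vt_a_inv[j] / denom for j in range(n)]
        for i in range(n)
    ]


identity = [[1.0, 0.0], [0.0, 1.0]]
updated = sherman_morrison(identity, [1.0, 2.0], [3.0, 4.0])
```

In Recursive Least Squares each new observation x adds x xᵀ to the matrix being inverted, so the update costs on the order of n² operations per observation instead of a fresh inversion.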

Summary
fractals / power laws probably lead to the most startling discoveries (‘the mean may be meaningless’)
SVD: behind PageRank/HITS/tensors/…
Wavelets: Nature seems to prefer them
RLS: seems to achieve the impossible
(approximate counting – NOT in the exam)

Thank you!
Feel free to contact me: GHC 8019
Reminder: faculty course eval’s
Final: Thu. Dec 13, A302
– Double-check at
Have a great break!