Daniel A. Keim, Hans-Peter Kriegel, Institute for Computer Science, University of Munich. 3/23/2011. VisDB: Database Exploration Using Multidimensional Visualization.

Presentation transcript:

Daniel A. Keim, Hans-Peter Kriegel, Institute for Computer Science, University of Munich. VisDB: Database Exploration Using Multidimensional Visualization

Created by: Rohan Ladkhedkar, Ajinkya Raulkar, Vrushali Date, Anuja Surgude

Contents
Introduction to VisDB; Basic Idea of VisDB; Techniques Used (Basic Visualization, Mapping 2D to Axis, Grouping the Dimensions); Working; Hardware/Software; Future Scope; Conclusion

Introduction to VisDB
Typical difficulties faced with large databases: finding a specific data item; no knowledge of the database system, query language, or data model; intersection data spots; 1 to 1 queries return multiple data items with no feedback.

Introduction to VisDB (contd.)
Sorting the data items according to the user query. Visualizing as many data items as possible (say, out of ten million) at the same time to give the user some kind of feedback on the query. The resolution of current displays (1 to 3 million pixels) is an important consideration. Interaction of the system with the user.

Basic Idea of VisDB
Support the query specification process by visually representing the result. Leave out of the visualization those dimensions that are of no interest to the user.

Basic Idea of VisDB (contd.)
Each pixel of the screen is used to visualize the data items resulting from a query. Approximate results are determined using distance functions. These distances are then combined into a relevance factor, which is used for the mapping.

Distance Function
The distance between each attribute value and the corresponding query value is determined. The distance functions used are data type and application dependent. In some cases, multiple distance functions can be used even for a single data type. Distance functions by type:
1. Number types (integer): numerical difference.
2. Ordinal types (grades): domain-specific distance functions.
3. Nominal types (professions): distance matrix.
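
The three kinds of distance function listed above can be sketched as follows. This is a minimal illustration, not the VisDB implementation: the grade ordering, the profession names, and the matrix values are hypothetical examples chosen only to show the shape of each function.

```python
def numeric_distance(value, query_value):
    """Number types: signed numerical difference to the query value."""
    return value - query_value


# Hypothetical ordinal domain: grades ordered from best to worst.
GRADE_ORDER = ["A", "B", "C", "D", "F"]


def ordinal_distance(value, query_value, order=GRADE_ORDER):
    """Ordinal types: one possible domain-specific distance, the signed
    difference of the positions in the domain's ordering."""
    return order.index(value) - order.index(query_value)


# Hypothetical distance matrix for a nominal domain (professions).
# Only one direction of each pair is stored; the lookup is symmetric.
PROFESSION_DISTANCE = {
    ("engineer", "technician"): 1,
    ("engineer", "teacher"): 3,
    ("technician", "teacher"): 2,
}


def nominal_distance(value, query_value, matrix=PROFESSION_DISTANCE):
    """Nominal types: look the distance up in a distance matrix."""
    if value == query_value:
        return 0
    if (value, query_value) in matrix:
        return matrix[(value, query_value)]
    return matrix[(query_value, value)]
```

The signed results of the numeric and ordinal functions keep the direction of the distance, which the second visualization technique relies on.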

Combining Distances into Relevance Factor
The independently calculated distances of the different selection predicates are combined, but the combined value should have a global meaning, which requires user interaction. Weighting factors (Wj, j ∈ 1, ..., #sp, where #sp is the number of selection predicates) are obtained from the user in order of importance. All distances are normalized by a linear transformation of the range [dmin, dmax] of each predicate to a fixed range, e.g. [0, 255].
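
The normalization step can be sketched as below. The function name and the choice to normalize absolute distances are assumptions made for illustration; the slide only states that each predicate's distance range is mapped linearly to a fixed range such as 0 to 255.

```python
def normalize_distances(distances, target_max=255):
    """Linearly map the absolute distances of one predicate from
    [d_min, d_max] to [0, target_max]."""
    magnitudes = [abs(d) for d in distances]
    d_min, d_max = min(magnitudes), max(magnitudes)
    if d_max == d_min:
        # All distances are equal, so there is no range to stretch.
        return [0.0 for _ in magnitudes]
    scale = target_max / (d_max - d_min)
    return [(d - d_min) * scale for d in magnitudes]


print(normalize_distances([0, 5, 10]))  # [0.0, 127.5, 255.0]
```

Normalizing each predicate to the same range keeps a predicate with large raw distances (for example, salaries) from dominating one with small raw distances (for example, grades) when the distances are combined.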

Combining Distances into Relevance Factor (contd.)
For combining the normalized distances, numerical mean functions are used:
1. Weighted arithmetic mean for 'AND'-connected condition parts.
2. Weighted geometric mean for 'OR'-connected condition parts.
The relevance factor is the inverse of the distance value.
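
A small sketch of the two mean functions follows. The function names are hypothetical, and reading "inverse" as `max_distance - combined` is an assumption; the slide does not say which form of inversion is used.

```python
import math


def weighted_arithmetic_mean(distances, weights):
    """Combine the distances of 'AND'-connected predicates."""
    return sum(w * d for w, d in zip(weights, distances)) / sum(weights)


def weighted_geometric_mean(distances, weights):
    """Combine the distances of 'OR'-connected predicates."""
    if any(d == 0 for d in distances):
        # One fulfilled predicate makes the whole product zero.
        return 0.0
    log_sum = sum(w * math.log(d) for w, d in zip(weights, distances))
    return math.exp(log_sum / sum(weights))


def relevance_factor(combined_distance, max_distance=255):
    """Assumed inversion: a small distance gives a high relevance."""
    return max_distance - combined_distance
```

The choice of means matches the logic of the connectives: the geometric mean is zero as soon as one predicate is fully satisfied, which fits 'OR', while the arithmetic mean is zero only when every predicate is satisfied, which fits 'AND'.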

Formula for calculating combined distance
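
For reference, the two weighted means can be written in their standard textbook form. This is a sketch and not necessarily the exact formula shown on the slide; here d_j is assumed to be the normalized distance of selection predicate j, w_j its weighting factor, and #sp the number of selection predicates.

```latex
% Weighted arithmetic mean ('AND'-connected predicates)
d_{\mathrm{AND}} = \frac{\sum_{j=1}^{\#sp} w_j \, d_j}{\sum_{j=1}^{\#sp} w_j}

% Weighted geometric mean ('OR'-connected predicates)
d_{\mathrm{OR}} = \left( \prod_{j=1}^{\#sp} d_j^{\,w_j} \right)^{1 / \sum_{j=1}^{\#sp} w_j}
```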

Reducing the amount of data to be displayed
Adequate heuristics are required to:
1. Reduce the amount of data.
2. Determine the data items whose distances are to be displayed.
Hence the α-quantile is defined as the lowest value ξα such that:
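
As an illustration of how a quantile can limit what is displayed, the sketch below uses the usual empirical definition: the lowest observed value such that at least a fraction α of the distances are less than or equal to it. Treating this as the heuristic used by VisDB is an assumption, and both function names are hypothetical.

```python
import math


def alpha_quantile(distances, alpha):
    """Lowest value q in the data such that at least a fraction alpha
    of the distances are <= q (empirical quantile)."""
    if not distances:
        raise ValueError("distances must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(distances)
    rank = math.ceil(alpha * len(ordered))
    return ordered[rank - 1]


def select_for_display(distances, alpha):
    """Keep only the distances up to the alpha-quantile."""
    cutoff = alpha_quantile(distances, alpha)
    return [d for d in distances if d <= cutoff]


print(select_for_display([10, 40, 20, 30], 0.5))  # [10, 20]
```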

Techniques Used
Three techniques are used:
1. Basic Visualization Technique
2. Mapping Two Dimensions to the Axes
3. Grouping the Dimensions for each Data Item

1. Basic Visualization Technique
The data items are sorted by relevance with respect to the query, and the relevance factors are then mapped to colors. Sorting is needed to avoid sprinkled images, which are unclear to the user. The highest relevance factors are placed at the center of the window. Approximate answers form a rectangular spiral around this region (100% correct answers are yellow).
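
The spiral arrangement can be sketched as follows. The coordinates are offsets from the window center, and the direction of rotation and both function names are assumptions; the slide only states that the most relevant items sit in the middle and the rest spiral outwards.

```python
def spiral_positions(count):
    """Return `count` (x, y) offsets from the center, visiting cells in
    a rectangular spiral: right, up, left, down, with growing legs."""
    positions = []
    x = y = 0
    dx, dy = 1, 0
    leg_length = 1
    while len(positions) < count:
        for _ in range(2):  # two legs share each leg length
            for _ in range(leg_length):
                if len(positions) == count:
                    return positions
                positions.append((x, y))
                x, y = x + dx, y + dy
            dx, dy = -dy, dx  # turn by 90 degrees
        leg_length += 1
    return positions


def arrange_by_relevance(relevance_factors):
    """Place the most relevant items closest to the center."""
    ordered = sorted(relevance_factors, reverse=True)
    return dict(zip(spiral_positions(len(ordered)), ordered))


print(spiral_positions(5))  # [(0, 0), (1, 0), (1, 1), (0, 1), (-1, 1)]
```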

1. Basic Visualization Technique (contd.)
Colors range from yellow in the middle through green, blue, and red to black. These ranges denote the distance from the correct answers.
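
A simple way to picture this color scale is a piecewise linear blend between the named colors. The RGB stops and the even spacing below are assumptions for illustration; the actual VisDB color scale is not given in the slides.

```python
# Assumed RGB stops: yellow, green, blue, red, black.
COLOR_STOPS = [
    (255, 255, 0),
    (0, 255, 0),
    (0, 0, 255),
    (255, 0, 0),
    (0, 0, 0),
]


def distance_to_color(distance, max_distance=255):
    """Map a normalized distance to an RGB color: 0 gives yellow and
    max_distance gives black, with linear blending in between."""
    t = min(max(distance / max_distance, 0.0), 1.0)
    scaled = t * (len(COLOR_STOPS) - 1)
    index = min(int(scaled), len(COLOR_STOPS) - 2)
    fraction = scaled - index
    low, high = COLOR_STOPS[index], COLOR_STOPS[index + 1]
    return tuple(round(a + (b - a) * fraction) for a, b in zip(low, high))


print(distance_to_color(0))    # (255, 255, 0)
print(distance_to_color(255))  # (0, 0, 0)
```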

1. Basic Visualization Technique (contd.)
Multidimensional visualization: a separate window is generated for each selection predicate of the query.

Question 1:
100% correct answers are denoted by which color in the Basic Visualization Technique?
1. Red
2. Yellow
3. Green
4. White
5. Blue

Answer 1:
Correct answer: 2

2. Mapping Two Dimensions to Axes
Although 2D and 3D visualizations are useful, they are not pursued here because they can show only a limited number of data items and because such systems already exist. Improvement: include feedback on the direction of the distance in the visualization.

2. Mapping Two Dimensions to Axes (contd.)
Assign two dimensions to the axes and arrange the relevance factors according to the direction of the distance. For one dimension, negative distances are arranged to the left and positive distances to the right. For the other dimension, negative distances are arranged at the bottom and positive ones at the top.
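
The placement rule can be sketched by deriving a quadrant from the signs of the two distances. Placing a zero distance on the positive side is an assumption, and both function names are hypothetical.

```python
def quadrant(distance_x, distance_y):
    """Return the side of the window for a data item, based on the sign
    of its distance in each of the two dimensions mapped to the axes."""
    horizontal = "left" if distance_x < 0 else "right"
    vertical = "bottom" if distance_y < 0 else "top"
    return horizontal, vertical


def group_by_quadrant(items):
    """Group (label, distance_x, distance_y) tuples by quadrant."""
    groups = {}
    for label, distance_x, distance_y in items:
        groups.setdefault(quadrant(distance_x, distance_y), []).append(label)
    return groups


print(quadrant(-3, 2))  # ('left', 'top')
```

If no data item has, say, a negative distance in both dimensions, the bottom-left quadrant stays empty, which is the kind of wasted screen space this technique can produce.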

2D arrangement of one dimension

Problems in this method
Corners of the window may be completely empty. In the worst case, two diagonally opposite corners of the window are completely empty, so that only half of the data items can be presented. Maximizing the number of data items conflicts with arrangements that have multiple dimensions assigned to the axes.

Question 2:
In one dimension, negative distances are arranged:
1) at the bottom
2) to the right
3) at the top
4) to the left

Answer 2:
Correct answer: 4

3. Grouping the Dimensions for each Data Item
All dimensions of one data item are grouped together in one area. Visualizations generated with this arrangement consist of only one window. Shape is not used to distinguish data items, and the criterion and arrangement of the data items are also different. 2x2 pixels per dimension are needed, as opposed to 1 pixel per dimension in the previous two methods.

Grouping arrangement for 5-dimensional data

3. Grouping the Dimensions for each Data Item (contd.)
The grouping arrangement is only suitable for focused searches on smaller data sets, because only one-fourth of the data items can be displayed on screen at a time. It still provides more visualizations for data sets with larger dimensionality. In the other two techniques, the pixels for each dimension of a data item are related only by their position.
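
The one-fourth figure follows from the pixel budget: 2x2 = 4 pixels per dimension instead of 1. A back-of-the-envelope sketch, assuming a 1-million-pixel display, 5 dimensions, and no space lost to borders or controls (the function name is hypothetical):

```python
def displayable_items(screen_pixels, dimensions, pixels_per_dimension):
    """Upper bound on the number of data items that fit on screen."""
    return screen_pixels // (dimensions * pixels_per_dimension)


one_pixel = displayable_items(1_000_000, 5, 1)  # first two techniques
grouped = displayable_items(1_000_000, 5, 4)    # 2x2 pixels per dimension
print(one_pixel, grouped)  # 200000 50000
```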

Working
The interface is divided into the Visualization portion on the left and the Query Modification portion on the right. In the Visualization portion, the resulting data set, including a certain percentage of approximate answers, is displayed using one of the visualization methods. The Query Modification portion provides sliders for modifying the selection predicates and weighting factors, as well as some other options.

Working (contd.)
Different kinds of sliders are available, e.g. sliders for numbers, sliders for discrete types, and sliders for non-metric types (ordinal and nominal data types). Other parameters listed are: number of results, query range, weighting factors, data values for a selected tuple, and data values corresponding to a selected color range.

Working (contd.)
Changing the percentage of data being displayed may completely change the visualization, because the distance values are normalized according to the new range. Normal mode: the system recalculates the visualization after each modification of the query. Auto-Recalculate Off mode: queries are only recalculated on demand.

Question 3:
Into which two sections is VisDB mainly divided?
1. Visualization Portion
2. Grouping Dimensions
3. Query Modification
4. Coloration of Relevance Factors

Answer 3:
Correct answer: 1 and 3

Question 4:
In which mode does the system recalculate the visualization after each modification of the query?
1. Normal Mode
2. Auto Recalculate Mode
3. Visual Mode
4. None of the above

Answer 4:
Correct answer: 1

Example (1000 data items)

Example (7000 data items)

Hardware/Software
Software used: C++, MOTIF. Hardware used: X-Windows on HP 7xx machines. The current version is main-memory based and allows interactive database exploration for databases containing 50,000 data items.

Future Scope
Automatic generation of queries that correspond to a specific region in one of the visualization windows. Generation of time series of visualizations corresponding to queries that are changed incrementally. Application to many different domains, each with its own parameters, distance functions, query requirements, and so on.

Conclusion
VisDB allows visualization of the largest amount of data that can be displayed at one time on current displays. It provides valuable feedback when querying the database. It allows the user to find results that would otherwise remain hidden in the database.

Thank you