Dynamic query tools for time series data sets: Timebox widgets for interactive exploration Harry Hochheiser Ben Shneiderman Presented by Justin Domke.



Motivation: Data that changes over time is common. Algorithmic and statistical methods are good at answering questions. How to choose the questions themselves?

Standard time plots are very compelling, but can only display a limited amount of data

Idea: Query the data!

Notation: $n_i$ is an item in a time series data set; $n_i(t)$ is the value of $n_i$ at time $t$.

Three Widgets: (1) Timebox. A timebox is a 4-tuple $b = (t_{\min}, t_{\max}, v_{\min}, v_{\max})$. $n_i$ satisfies $b$ if, for all $t$ with $t_{\min} \le t \le t_{\max}$, $v_{\min} \le n_i(t) \le v_{\max}$.
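
The timebox test can be sketched in a few lines of Python. This is only an illustration, assuming each series is a list sampled at integer time indices 0, 1, 2, ... and that the box lies inside the series; the names `Timebox` and `satisfies_timebox` are hypothetical and do not come from TimeSearcher.

```python
from typing import NamedTuple, Sequence


class Timebox(NamedTuple):
    """Hypothetical container for the 4-tuple (t_min, t_max, v_min, v_max)."""
    t_min: int
    t_max: int
    v_min: float
    v_max: float


def satisfies_timebox(series: Sequence[float], box: Timebox) -> bool:
    """Return True if every sample with t_min <= t <= t_max lies in [v_min, v_max].

    Assumption: series[t] is the value at integer time index t.
    """
    window = series[box.t_min : box.t_max + 1]  # inclusive time range
    return all(box.v_min <= v <= box.v_max for v in window)


prices = [1, 2, 3, 10, 3]
print(satisfies_timebox(prices, Timebox(0, 2, 0, 5)))  # samples 1, 2, 3
print(satisfies_timebox(prices, Timebox(1, 3, 0, 5)))  # samples 2, 3, 10
```

The first call covers only values inside the band; the second includes the value 10, which lies above `v_max`, so the item is rejected.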

Three Widgets: (2) Variable Time Timebox. A variable time timebox is a 5-tuple $b = (t_{\min}, t_{\max}, v_{\min}, v_{\max}, R)$. $n_i$ satisfies $b$ if there exists $t_0$ with $t_{\min} \le t_0 \le t_{\max} - R$ such that, for all $t$ with $t_0 \le t \le t_0 + R$, $v_{\min} \le n_i(t) \le v_{\max}$.
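
A minimal sketch of the variable time timebox test, assuming integer time indices and a series stored as a Python list. The function name and argument order are hypothetical. It tries each start time in turn, which is the simplest reading of the definition rather than an optimized method.

```python
from typing import Sequence


def satisfies_variable_timebox(
    series: Sequence[float],
    t_min: int,
    t_max: int,
    v_min: float,
    v_max: float,
    r: int,
) -> bool:
    """Return True if some window [t0, t0 + r] inside [t_min, t_max] stays in [v_min, v_max].

    Assumption: series[t] is the value at integer time index t and t_max < len(series).
    """
    # t0 ranges over t_min .. t_max - r, as in the definition.
    for t0 in range(t_min, t_max - r + 1):
        window = series[t0 : t0 + r + 1]  # the r + 1 samples t0 .. t0 + r
        if all(v_min <= v <= v_max for v in window):
            return True
    return False


values = [9, 1, 2, 9, 9, 1]
print(satisfies_variable_timebox(values, 0, 5, 0, 3, r=1))  # window at t0 = 1 fits
print(satisfies_variable_timebox(values, 0, 5, 0, 3, r=2))  # no 3-sample window fits
```

When `r` exceeds `t_max - t_min`, the range of start times is empty and the function returns False.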

Three Widgets: (3) Angular Query Widget. An angular query widget is a 4-tuple $b = (t_{\min}, t_{\max}, \theta_{\min}, \theta_{\max})$. $n_i$ satisfies $b$ if, for all $t$ with $t_{\min} \le t \le t_{\max}$, $\theta_{\min} \le \varphi(n_i(t), n_i(t+1)) \le \theta_{\max}$, where $\varphi$ is the angle formed on the graph.
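
One way to make the angular test concrete is sketched below. It rests on assumptions not stated on the slide: the angle is taken to be that of the line segment joining consecutive samples, measured in degrees with unit time spacing and unscaled axes, and only segments lying entirely inside the time range are checked. The function names are hypothetical.

```python
import math
from typing import Sequence


def segment_angle_deg(v0: float, v1: float, dt: float = 1.0) -> float:
    """Angle (degrees) of the segment from (t, v0) to (t + dt, v1) against the time axis."""
    return math.degrees(math.atan2(v1 - v0, dt))


def satisfies_angular_query(
    series: Sequence[float],
    t_min: int,
    t_max: int,
    theta_min: float,
    theta_max: float,
) -> bool:
    """Return True if every segment inside [t_min, t_max] has an angle in [theta_min, theta_max].

    Assumption: segments start at t = t_min .. t_max - 1, so none leaves the range.
    """
    return all(
        theta_min <= segment_angle_deg(series[t], series[t + 1]) <= theta_max
        for t in range(t_min, t_max)
    )


rising_then_flat = [0, 1, 2, 2]
print(satisfies_angular_query(rising_then_flat, 0, 2, 40, 50))  # two rising segments
print(satisfies_angular_query(rising_then_flat, 0, 3, 40, 50))  # includes the flat segment
```

On a real display the visible angle depends on how the axes are scaled, so an actual widget would need to convert between screen angles and data slopes.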

Demonstration: Standard Timeboxes (drag from display window; manipulate multiple boxes; coupling of windows); Variable Time Timeboxes; Angular Queries; Query Inversion; Query Multiple Variables; Leaders and Laggards.

Performance: Over 75% of time is spent on query evaluation. Naïve approach: for each item in the set, examine every point in each timebox. Easy improvement: throw an item out as soon as it fails any query.
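
The sequential approach with early rejection might look like the sketch below. The data layout (a dict mapping item names to value lists, and timeboxes as plain 4-tuples) and the function name are assumptions made for illustration only.

```python
from typing import Dict, List, Sequence, Tuple

Box = Tuple[int, int, float, float]  # (t_min, t_max, v_min, v_max)


def sequential_query(items: Dict[str, Sequence[float]], timeboxes: List[Box]) -> List[str]:
    """Return the names of items that satisfy every timebox.

    Assumption: each series is indexed by integer time, starting at 0.
    """
    matches = []
    for name, series in items.items():
        for t_min, t_max, v_min, v_max in timeboxes:
            window = series[t_min : t_max + 1]
            # all() stops at the first out-of-range point.
            if not all(v_min <= v <= v_max for v in window):
                break  # item fails this timebox: skip the remaining boxes
        else:
            matches.append(name)  # no break: every timebox was satisfied
    return matches


stocks = {"a": [1, 2, 3, 4], "b": [1, 9, 3, 4], "c": [1, 2, 3, 0]}
boxes = [(0, 1, 0, 5), (2, 3, 2, 5)]
print(sequential_query(stocks, boxes))
```

Item "b" is discarded at the first box and never tested against the second, which is the saving the "easy improvement" describes.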

Performance (2), Alternatives: Suppose the data has n time series, each with m time points. Think of this as mn points in 2-D space and use geometric methods to find the points that fall in each query range. Keep a counter per series and increment it for each of its points found in the range; if the count equals the number of time points the timebox spans, the series satisfies the query. Use an orthogonal range tree or a grid approach with buckets.
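
The grid variant of this counting idea can be sketched as follows. It is a rough illustration under stated assumptions, not the implementation measured in the talk: the `GridIndex` class, the uniform bucket layout, and the `v_lo`/`v_hi` value bounds are all hypothetical, and every series is assumed to have the same length with integer time indices.

```python
from collections import defaultdict
from typing import List, Sequence


class GridIndex:
    """Hypothetical uniform grid over (time, value) space with buckets x buckets cells."""

    def __init__(self, items: Sequence[Sequence[float]], buckets: int, v_lo: float, v_hi: float):
        self.m = len(items[0])  # time points per series (assumed equal for all)
        self.buckets = buckets
        self.v_lo, self.v_hi = v_lo, v_hi  # assumed value bounds, v_hi > v_lo
        self.cells = defaultdict(list)
        for item_id, series in enumerate(items):
            for t, v in enumerate(series):
                self.cells[(self._col(t), self._row(v))].append((item_id, t, v))

    def _col(self, t: int) -> int:
        # Map a time index to a grid column, clamped to the last column.
        return min(t * self.buckets // self.m, self.buckets - 1)

    def _row(self, v: float) -> int:
        # Map a value to a grid row, clamped to the first and last rows.
        frac = (v - self.v_lo) / (self.v_hi - self.v_lo)
        return min(max(int(frac * self.buckets), 0), self.buckets - 1)

    def query(self, t_min: int, t_max: int, v_min: float, v_max: float) -> List[int]:
        """Return ids of series whose points cover every time step of the timebox."""
        counts = defaultdict(int)
        for col in range(self._col(t_min), self._col(t_max) + 1):
            for row in range(self._row(v_min), self._row(v_max) + 1):
                for item_id, t, v in self.cells.get((col, row), ()):
                    if t_min <= t <= t_max and v_min <= v <= v_max:
                        counts[item_id] += 1  # one more point of this series in range
        needed = t_max - t_min + 1  # points a satisfying series must contribute
        return sorted(i for i, c in counts.items() if c == needed)


index = GridIndex([[1, 2, 3, 4], [1, 9, 3, 4], [5, 5, 5, 5]], buckets=4, v_lo=0, v_hi=10)
print(index.query(0, 1, 0, 5))
print(index.query(2, 3, 2, 4.5))
```

Only cells overlapping the query rectangle are visited, so the work depends on how many points lie near the timebox rather than on the size of the whole data set. An orthogonal range tree would replace the grid lookup with a tree search while keeping the same counting step.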

Performance (3): [Figure: average query completion time vs. number of items for random data (100 time points). Legend: Seq = sequential; Orth = orthogonal range tree; Grid-X = grid approach with X buckets.]

Performance (4): [Figure: average query completion time vs. number of time points for random data (100 items). Legend: Seq = sequential; Orth = orthogonal range tree; Grid-X = grid approach with X buckets.]

Design Studies: 24 computer science students completed various tasks using different but semantically equivalent input mechanisms: timebox queries, form fill-in, and range sliders.

Design Study 1: Fully specified tasks (“During days 22-23, are there more stocks between , , or 49-99”). Form fill-in was fastest, range sliders second, and timeboxes last.

Design Study 2: More open-ended tasks. Compare: timeboxes with graphical output, forms with graphical output, and forms with tabular output. No statistically significant difference. (Were the users already familiar with timeboxes?)

Comments: Problems with the user interface? Why “timesearcher”, instead of “parallelcoordinatesearcher”? In the performance experiment, what did the data look like? In the design study, were the users already familiar with Timesearcher?