DataJewel¹: Tightly Integrating Visualization with Temporal Data Mining
Mihael Ankerst, David H. Jones, Anne Kao, Changzhou Wang
¹ US patent pending.

Presentation transcript:

1 DataJewel¹: Tightly Integrating Visualization with Temporal Data Mining
Mihael Ankerst, David H. Jones, Anne Kao, Changzhou Wang
¹ US patent pending

2 DataJewel: A novel architecture for temporal data mining
Motivation:
- In different domains, different kinds of patterns are of interest -> an architecture that provides access to many temporal mining algorithms
- Databases are built based on organizational needs -> an architecture that links databases together
- Databases can be huge -> data has to be compressed
- Current data mining tools are made for data mining experts -> an architecture that is very intuitive and easy to use

3 Visual Data Mining
Visual Data Mining combines Information Visualization and Data Mining.

                 | Data Mining Algorithms | Visualization
Actionable       | +                      | -
Evaluation       | +                      | -
Flexibility      | -                      | +
User Interaction | -                      | +

4 Visual Data Mining Architecture: Tightly Integrated Visualization
[Diagram: three ways of combining visualization with a data mining (DM) algorithm on the way from data to knowledge]
- Preceding Visualization (PV): Data -> Visualization of the data -> DM-Algorithm -> Result -> Knowledge
- Subsequent Visualization (SV): Data -> DM-Algorithm -> Result -> Visualization of the result -> Knowledge
- Tightly Integrated Visualization (TIV): Data -> DM-Algorithm step 1 … DM-Algorithm step n, with Visualization + Interaction -> Result -> Visualization of the result -> Knowledge

5 Architecture of DataJewel
- Data source layer: access and link multiple heterogeneous databases and data sources
- Statistical layer: compression, aggregation, sampling
- Data mining layer: extensible set of data mining algorithms for automatic pattern discovery
- Visualization layer: extensible set of visualizations for representing the data and the patterns, plus interaction capabilities for the user to incorporate domain expertise

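6 The Visualization Component

Time       | Event type  | Location | …
09/11/2001 | Door broken | Seattle  | …
09/12/2001 | …           | …        | …

[Figure: calendar view of January 2002 with weekday columns S M T W T F S; detail for Tuesday, Jan 1st 2002 showing the event types Doors, Engine, Landing Gear, Lights]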
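The calendar layout on this slide can be approximated as a grid of per-day event counts. The following is only a minimal sketch of that idea, not the DataJewel implementation: the function name `month_grid`, the `(date, event_type)` input format, and the sample events are assumptions made for illustration.

```python
import calendar
from collections import Counter
from datetime import date


def month_grid(events, year, month):
    """Arrange daily event counts into calendar weeks (Sunday first).

    `events` is an iterable of (date, event_type) pairs (hypothetical format).
    Each cell is None for padding days, or (day_number, Counter of event types).
    """
    per_day = {}
    for day, event_type in events:
        if day.year == year and day.month == month:
            per_day.setdefault(day.day, Counter())[event_type] += 1

    cal = calendar.Calendar(firstweekday=calendar.SUNDAY)
    grid = []
    for week in cal.monthdayscalendar(year, month):
        # monthdayscalendar uses 0 for days that belong to another month
        grid.append([None if d == 0 else (d, per_day.get(d, Counter())) for d in week])
    return grid


# Hypothetical sample events, not taken from the presentation's data
sample_events = [
    (date(2002, 1, 1), "Doors"),
    (date(2002, 1, 1), "Doors"),
    (date(2002, 1, 1), "Engine"),
    (date(2002, 1, 15), "Lights"),
]
january = month_grid(sample_events, 2002, 1)
```

Each non-empty cell carries the counts per event type for that day, which a renderer could map to colored segments.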

7 The Temporal Mining Component
Goal: mining algorithms should be
- Very efficient (results in interactive time)
- Able to find several types of patterns:
  single event: recurrence, periodicity, …
  multiple events: similarity, causality, clustering, …
- Tightly integrated with the visualization
Solution:
- The algorithm computes a pattern and updates the visualization by assigning unique colors only to the events contained in the pattern
All algorithms result in updating the color assignment:
- CalendarView visualizes the data and the patterns
- The same color assignment interface is used by the user and the algorithm
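One way to picture the shared color assignment interface is a small object that both the user interface and a mining algorithm write to. This is a hedged sketch only: the class `ColorAssignment`, the function `highlight_pattern`, the default color, and the palette are all hypothetical names and choices, not part of DataJewel.

```python
class ColorAssignment:
    """Maps event types to colors; unassigned events fall back to a neutral color."""

    DEFAULT = "gray"  # assumed neutral color for events outside any pattern

    def __init__(self):
        self._colors = {}

    def assign(self, event_type, color):
        self._colors[event_type] = color

    def reset(self):
        self._colors.clear()

    def color_of(self, event_type):
        return self._colors.get(event_type, self.DEFAULT)


def highlight_pattern(assignment, pattern_event_types, palette):
    """Give each event type in a discovered pattern its own color."""
    event_types = sorted(set(pattern_event_types))
    if len(palette) < len(event_types):
        raise ValueError("palette has fewer colors than event types in the pattern")
    assignment.reset()
    for event_type, color in zip(event_types, palette):
        assignment.assign(event_type, color)


colors = ColorAssignment()
colors.assign("Doors", "blue")  # a user choice goes through assign()
highlight_pattern(colors, ["Engine", "Lights"], ["red", "green", "orange"])  # so does an algorithm result
```

Because both paths end in `assign`, the visualization only needs to query `color_of` and does not have to know whether a color came from the user or from an algorithm.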

8 Implemented new mining algorithms
- LongestStreak
- Most Deviations
- Correlated Events
- Basic ideas of the algorithms are motivated by control charting (stabilized p-chart)
[Chart: frequency over time with the mean marked; value labels as extracted: 755610]
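The slide names the algorithms but does not define them, so the following is only a sketch of how control-chart ideas could drive such algorithms. The standardized score is the usual stabilized p-chart statistic, z_i = (p_i - p̄) / sqrt(p̄(1 - p̄) / n_i). Interpreting "longest streak" as the longest run above the mean and "most deviations" as a count of points beyond a limit are assumptions, as are the function names and the use of a per-period sample size `sizes`.

```python
import math


def stabilized_p_scores(counts, sizes):
    """Standardized scores of a stabilized p-chart.

    counts[i]: number of events in period i; sizes[i]: sample size in period i.
    """
    p_bar = sum(counts) / sum(sizes)
    if not 0 < p_bar < 1:
        raise ValueError("overall proportion must lie strictly between 0 and 1")
    scores = []
    for count, size in zip(counts, sizes):
        sigma = math.sqrt(p_bar * (1 - p_bar) / size)
        scores.append((count / size - p_bar) / sigma)
    return scores


def longest_streak(scores):
    """Return (start_index, length) of the longest run of scores above the mean (z > 0)."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, z in enumerate(scores):
        if z > 0:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


def count_deviations(scores, limit=3.0):
    """Count periods whose standardized score lies outside +/- limit."""
    return sum(1 for z in scores if abs(z) > limit)


# Hypothetical data: events per day out of 100 observations each day
scores = stabilized_p_scores([1, 1, 5, 6, 7, 1], [100] * 6)
streak = longest_streak(scores)
```

Ranking event types by `longest_streak` or `count_deviations` would then select which ones receive a distinct color in the visualization.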

9 The Statistical & Database Component
- Access to data from different databases
- Precompute compressed/aggregated/sampled data
- Use lookup tables to further compress data
- Currently, we can analyze millions of records in real time
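The slide does not say how the lookup tables are built. A common form of this idea is dictionary encoding, where each distinct value is stored once and records hold only small integer codes. The sketch below illustrates that technique under this assumption; `build_lookup` and `decode` are hypothetical helper names.

```python
def build_lookup(values):
    """Dictionary-encode a column: return (lookup_table, codes)."""
    lookup = []   # code -> original value
    index = {}    # original value -> code
    codes = []
    for value in values:
        if value not in index:
            index[value] = len(lookup)
            lookup.append(value)
        codes.append(index[value])
    return lookup, codes


def decode(lookup, codes):
    """Recover the original column from the lookup table and the codes."""
    return [lookup[code] for code in codes]


# Hypothetical column of event types
column = ["Doors", "Engine", "Doors", "Lights", "Engine", "Doors"]
lookup, codes = build_lookup(column)
```

Repeated strings are replaced by integers, so the per-record cost no longer depends on the length of the original value.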

10 The Statistical & Database Component
Data sources: Airline_a and Airline_b, each with a Procurement DB and a Maintenance DB.

Date     | ATA | Complaint_txt | …
1/1/2000 | 35  | ….            | …
1/1/2000 | 35  | …             | …
1/1/2000 | 39  | …             | …

Date      | ATA | Complaint_txt | …
12/1/2000 | 73  | ….            | …
12/1/2000 | 73  | …             | …
15/1/2000 | 49  | …             | …

11 The Statistical & Database Component
Data sources: Airline_a and Airline_b, each with a Procurement DB and a Maintenance DB.

Date     | ATA | Complaint_txt | …
1/1/2000 | 35  | ….            | …
1/1/2000 | 35  | …             | …
1/1/2000 | 39  | …             | …

Date      | ATA | Complaint_txt | …
12/1/2000 | 73  | ….            | …
12/1/2000 | 73  | …             | …
15/1/2000 | 49  | …             | …

Aggregate data with:
    SELECT Date, ATA, count(*) AS Freq
    FROM airline_a
    GROUP BY Date, ATA
    ORDER BY Date, ATA

Date      | ATA | Freq
12/1/1999 | 73  | 27
15/1/1999 | 49  | 9
…         | …   | …

12 The Statistical & Database Component
Data sources: Airline_a and Airline_b, each with a Procurement DB and a Maintenance DB.

Date     | ATA | Complaint_txt | …
1/1/2000 | 35  | ….            | …
1/1/2000 | 35  | …             | …
1/1/2000 | 39  | …             | …

Aggregate data with:
    SELECT Date, ATA, count(*) AS Freq
    FROM airline_b
    GROUP BY Date, ATA
    ORDER BY Date, ATA

Date     | ATA | Freq
1/1/2000 | 35  | 344
1/1/2000 | 39  | 193
…        | …   | …
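The aggregation query from the two preceding slides can be tried out against an in-memory SQLite database. This is a sketch under assumptions: the table schema, the ISO date strings (chosen so that `ORDER BY Date` sorts chronologically), and the sample rows are made up for illustration, and `aggregate_frequencies` is a hypothetical helper.

```python
import sqlite3

ALLOWED_TABLES = {"airline_a", "airline_b"}


def aggregate_frequencies(conn, table):
    """Run the slide's GROUP BY query on one airline table."""
    if table not in ALLOWED_TABLES:
        # Table names cannot be bound as parameters, so check against a whitelist
        raise ValueError(f"unknown table: {table}")
    query = (
        f"SELECT Date, ATA, COUNT(*) AS Freq FROM {table} "
        "GROUP BY Date, ATA ORDER BY Date, ATA"
    )
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE airline_a (Date TEXT, ATA INTEGER, Complaint_txt TEXT)")
conn.executemany(
    "INSERT INTO airline_a VALUES (?, ?, ?)",
    [
        # Hypothetical complaint records
        ("2000-01-12", 73, "fuel control fault"),
        ("2000-01-12", 73, "fuel pump noise"),
        ("2000-01-15", 49, "APU would not start"),
    ],
)
rows = aggregate_frequencies(conn, "airline_a")
```

The result holds one row per (Date, ATA) pair with its frequency, which is the compressed form the statistical layer would hand to the visualization instead of the raw complaint records.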

13 User-Centric Data Mining
- User selects data source/attributes
- Data is compressed and loaded
- Data is visualized
- User selects date range
- User interacts with visualization
- User invokes algorithm
- Raw data is shown
- User selects visualization technique

14 DataJewel – Scenario: Mining Algorithm
Using 41 “different” colors…

16 DataJewel – Scenario: Mining Algorithm
Press here to run the mining algorithm

20 DataJewel – Scenario: User Interaction

24 Screenshots
One airline, one model, ATA: 73 (Engine fuel/control)

25 Screenshots
One airline, one model, ATA: 49 (airborne auxiliary power)

26 Conclusions
- Data mining algorithms and visualization techniques can complement each other well
- CalendarView is a new visualization technique that represents the frequency of daily events
- DataJewel uses the same visualization to represent the data and the patterns
- The color assignment interface is used both by the user (to incorporate domain knowledge) and by the computer (to represent the discovered patterns)
These two key properties greatly improve the applicability of the system for domain experts.
Future work: user studies, new visualizations, algorithms, …

