Mining Weather Data for Decision Support Roy George Army High Performance Computing Research Center Clark Atlanta University Atlanta, GA 30314.

Presentation transcript:

1 Mining Weather Data for Decision Support
  Roy George
  Army High Performance Computing Research Center
  Clark Atlanta University, Atlanta, GA 30314

2 Research
  Clustering Algorithms for Data Mining
  Spatio-Temporal Domain
  Parallelization of Algorithms
  Algorithms for Feature Extraction and Knowledge Discovery

3 Challenges of Geographical Data
  Complexities associated with data volume
    Terabyte databases
  Domain complexities
    Interesting signals hidden by stronger patterns
    Complexities caused by local variation
    Systems are interconnected
  Data gathering and sampling
  Interpretation of aggregated data
  Formalizing the domain

4 Background: Issues with Hard Clustering
  Issue: Forcing data with imprecision and/or uncertainty into discrete classes
  Result: Missing important outliers and boundary patterns
  Approach: Use of an approximate clustering technique
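
The contrast between hard and approximate (fuzzy) assignment can be illustrated with a minimal sketch. It assumes the standard fuzzy c-means membership formula with fuzzifier m, which the slides do not specify; the function names are hypothetical.

```python
def fuzzy_memberships(distances, m=2.0):
    """Membership degree of one point in each cluster, given its distance
    to each centroid. Assumes the fuzzy c-means formula (not from the slides)."""
    if any(d == 0 for d in distances):
        # The point coincides with a centroid: full membership there.
        hits = [1.0 if d == 0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_k) ** exponent for d_k in distances)
            for d_j in distances]


def hard_label(distances):
    """Hard clustering keeps only the index of the nearest centroid."""
    return min(range(len(distances)), key=lambda j: distances[j])


# A boundary point that is equally far from two centroids:
boundary = [1.0, 1.0]
print(hard_label(boundary))         # forced into a single class
print(fuzzy_memberships(boundary))  # shared equally between both classes
```

For the boundary point, the hard label reports cluster 0 only, while the fuzzy memberships are 0.5 and 0.5, so the ambiguity stays visible.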

5 Background: K-Means Clustering
  Partition the data into K clusters that are homogeneous
  Algorithm
    Select K time series as initial centroids
    Assign all time series to the most similar centroid
    Re-compute the centroids
    Repeat until the centroids do not change
  Variations based on different measures of similarity
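
The four algorithm steps can be sketched as follows. This is a minimal illustration, not the presenter's implementation: it assumes Euclidean distance as the similarity measure and takes the first K series as the initial centroids so that the run is deterministic.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(series, k, max_iter=100):
    # Step 1: select K time series as initial centroids
    # (assumption: simply the first K, for reproducibility).
    centroids = [list(s) for s in series[:k]]
    labels = [0] * len(series)
    for _ in range(max_iter):
        # Step 2: assign every series to its most similar (nearest) centroid.
        labels = [min(range(k), key=lambda j: euclidean(s, centroids[j]))
                  for s in series]
        # Step 3: re-compute each centroid as the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [s for s, lab in zip(series, labels) if lab == j]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster as is
        # Step 4: stop when the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


centroids, labels = k_means([[0, 0], [0, 1], [10, 10], [10, 11]], k=2)
print(labels)  # [0, 0, 1, 1]
```

Swapping `euclidean` for another distance (for example one based on correlation) gives the "variations based on different measures of similarity" mentioned on the slide.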

6 Unsupervised Fuzzy K-Means (UKFM) Clustering
  Choose the initial number of clusters
  Develop a clustering using Fuzzy K-Means
  Merge the cluster pair that has maximum correlation
  Compute the validity measure
  Repeat until the termination condition is reached
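
The loop above can be sketched under several assumptions that the slides leave open: fuzzy c-means updates with fuzzifier m = 2, Pearson correlation between centroids as the merge criterion, averaging of the two centroids as the merge step, the partition coefficient as the validity measure, and a minimum cluster count as the termination condition. All function names are hypothetical.

```python
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def memberships(series, centroids, m=2.0):
    exp = 2.0 / (m - 1.0)
    table = []
    for s in series:
        d = [max(dist(s, c), 1e-12) for c in centroids]  # avoid division by zero
        table.append([1.0 / sum((dj / dk) ** exp for dk in d) for dj in d])
    return table


def fuzzy_k_means(series, centroids, m=2.0, iters=50):
    for _ in range(iters):
        u = memberships(series, centroids, m)
        new = []
        for j in range(len(centroids)):
            w = [row[j] ** m for row in u]
            new.append([sum(wi * s[t] for wi, s in zip(w, series)) / sum(w)
                        for t in range(len(series[0]))])
        centroids = new
    return centroids, memberships(series, centroids, m)


def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def partition_coefficient(u):
    # Assumed validity measure: 1/k (fully fuzzy) up to 1 (crisp).
    return sum(x * x for row in u for x in row) / len(u)


def ukfm(series, initial_k, min_k=2):
    centroids = [list(s) for s in series[:initial_k]]  # assumed initialisation
    history = []
    while True:
        centroids, u = fuzzy_k_means(series, centroids)
        history.append((len(centroids), partition_coefficient(u)))
        if len(centroids) <= min_k:  # assumed termination condition
            return history
        pairs = [(i, j) for i in range(len(centroids))
                 for j in range(i + 1, len(centroids))]
        i, j = max(pairs, key=lambda p: pearson(centroids[p[0]], centroids[p[1]]))
        merged = [(x + y) / 2 for x, y in zip(centroids[i], centroids[j])]
        centroids = [c for n, c in enumerate(centroids) if n not in (i, j)] + [merged]


data = [[0, 1, 2], [0, 1, 2.1], [5, 4, 3], [5, 4, 3.2]]
for k, validity in ukfm(data, initial_k=3):
    print(k, round(validity, 3))
```

The returned history pairs each cluster count with its validity value, so a reader can pick the count with the best validity after the run.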

7 UKFM Results
  Weather Data Set
  Initial: 11 Clusters
  Optimal: 8 Clusters
  Final: 4 Clusters

8 Global Earth Science Data
  Collaborative effort with V. Kumar (UMinn)
  Test bed for UKFM (comparison with existing techniques)
  Data Set
    Global Sea Pressure (1989 – 1993)
    Ocean Climate Indices (OCIs) capture teleconnections
  Result
    UKFM can capture even weaker OCIs using coarse clusters

9 Global Climate Data (Sea Level Pressure)
  Intermediate: 60 Clusters

10 Global Climate Data (Sea Level Pressure)
  Final: 26 Clusters

11 Relation with SOI

12 Integrating Multiple Datasets in UKFM Clustering
  Motivation: Data-based approach to determining “interesting” clusters
  Validate using multiple datasets
  Rule: Retain clusters that have supporting data
  Applicable in data-rich environments

13 UKFM Clustering with Multi-Dataset Validation
  Choose the initial number of clusters
  Develop a clustering using Fuzzy K-Means
  Validate cluster with other datasets D_i, i = 1, ..., n
    Merge if the cluster is uncorrelated
    Else consider the next candidate pair to merge
  Repeat until the termination condition is reached
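
One possible reading of the validation step is sketched below. It assumes that candidate pairs arrive ordered by decreasing correlation, that each cluster has one support score per additional dataset D_i, and that a cluster counts as "supported" when any score reaches a threshold. The scoring function, the threshold, and the names are all hypothetical; the slides do not define them.

```python
def pick_merge(candidate_pairs, support_scores, threshold=0.5):
    """Return the first candidate pair that may be merged, or None.

    candidate_pairs: cluster-id pairs, ordered by decreasing correlation.
    support_scores:  cluster id -> list of scores, one per extra dataset D_i
                     (hypothetical measure of how well D_i backs the cluster).
    """
    def supported(cluster):
        return any(score >= threshold for score in support_scores[cluster])

    for a, b in candidate_pairs:
        if supported(a) and supported(b):
            continue      # both clusters have supporting data: retain them
        return (a, b)     # at least one lacks support: merge this pair
    return None           # every candidate pair is retained


scores = {"A": [0.9, 0.2], "B": [0.7, 0.8], "C": [0.1, 0.3]}
print(pick_merge([("A", "B"), ("B", "C")], scores))  # ('B', 'C')
```

Here the most correlated pair (A, B) is skipped because both clusters are backed by another dataset, and the next candidate (B, C) is merged because C has no support.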

14 UKFM Multi-Dataset Results
  Height, Pressure, Temperature, Windspeed

15 Multi-threading Parallel Algorithm
  For each clustering stage
    For each iteration
      Slaves: Calculate M for each cluster
      Master: Normalize M
      Slaves: Calculate C for each cluster
      Master: Normalize C
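
One iteration of this master/slave split might look like the sketch below. It assumes that M is the fuzzy membership matrix and C the set of cluster centroids, which the slide does not state, and it uses Python's `concurrent.futures.ThreadPoolExecutor` as the worker pool. CPython threads share a global interpreter lock, so this shows only the division of work, not the reported speedup.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def raw_membership(points, centroid, m=2.0):
    # Worker task: unnormalised membership of every point in ONE cluster.
    exp = 2.0 / (m - 1.0)
    return [1.0 / max(math.dist(p, centroid), 1e-12) ** exp for p in points]


def raw_centroid(points, weights, m=2.0):
    # Worker task: weighted coordinate sums and total weight for ONE cluster.
    wm = [w ** m for w in weights]
    sums = [sum(w * p[t] for w, p in zip(wm, points))
            for t in range(len(points[0]))]
    return sums, sum(wm)


def parallel_iteration(points, centroids, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Slaves: calculate M for each cluster.
        raw_m = list(pool.map(lambda c: raw_membership(points, c), centroids))
        # Master: normalise M so each point's memberships sum to 1.
        totals = [sum(col) for col in zip(*raw_m)]
        m_cols = [[w / t for w, t in zip(col, totals)] for col in raw_m]
        # Slaves: calculate C for each cluster.
        raw_c = list(pool.map(lambda col: raw_centroid(points, col), m_cols))
        # Master: normalise C by dividing the sums by the total weight.
        new_centroids = [[s / total for s in sums] for sums, total in raw_c]
    return m_cols, new_centroids


pts = [(0.0,), (1.0,), (10.0,), (11.0,)]
m_cols, cents = parallel_iteration(pts, [(0.5,), (10.5,)])
print(cents)
```

Each worker touches only one cluster at a time, and the master needs the results from all clusters only for the two normalisation steps, which is what makes the per-cluster work easy to distribute.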

16 Multi-threading Result
  Implemented on a Sun Fire workstation with four 900-MHz UltraSPARC® III processors
  Near-linear speedup obtained

17 Relevance to the Army
  Directly supports the FBKOF STO (B. Broome)
  Development of the Weather Information and Tactical Support (WITS) System

18 Weather Information and Tactical Support (WITS)
  Objective: Extraction of patterns from weather data, to be fused with external databases (logistics, terrain, forces, etc.) for higher-level planning

19 Approach
  Development of an OLAP Weather Repository
    GA Weather (1981-2002)
    Sources: Nat. Weather Svc, GA Env. Network
  Development of WITS Modules
    Ad-hoc Querying
    Real time Analysis and Planning
    Effects on Army Systems
    Integration with IWEDA
    Abstract Data Representation

20 WITS System Design

21 WITS/IQ

22 WITS/IQ

23 WITS/IWEDA

24 WITS/Analysis

25 WITS/Analysis

26 Work in Progress
  Characterization of Analysis Queries
  Incorporation of Data Mining Algorithms into WITS
  Enhancement of WITS/TAPS
  Implementation of WITS/Real

27 Hybrid Genetic Fuzzy Systems for Feature Extraction and Knowledge Discovery

28 Project Goals
  Design and implement a hybrid genetic fuzzy system for knowledge discovery.
  Develop API/Tools.
  Apply tools to Army-related problems.

29 Contribution
  Hybrid system based on the Simple Genetic Algorithm (SGA).
  Enhanced the SGA by adding three levels of knowledge discovery.
    Level 1: Discovers up to k possible rules for a given set of inputs and outputs. It then attempts to minimize the number of rules and tune the knowledge base.
    Level 2: Takes the set of rules from Level 1 and further minimizes the rules. It also tunes the knowledge base.
    Level 3: Makes one last attempt to further tune the architecture of the knowledge base.

30 Rule Discovery
  Search for k possible rules from the set of p possible rules.
  k is an input parameter of the GA application.
  Discover the smallest value of k, thereby reducing the number of rules needed.
  Example Rules:
    If INPUT_1 is low AND INPUT_2 is medium THEN OUTPUT_1 is high
    If INPUT_1 is high THEN OUTPUT_1 is low
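
The two example rules can be encoded and evaluated as in the sketch below. It assumes inputs scaled to [0, 1], triangular membership functions for "low", "medium" and "high", and the minimum operator for AND. None of these choices come from the slides, and the names `SETS`, `RULES` and `firing_strength` are hypothetical.

```python
def triangular(x, left, peak, right):
    """Triangular membership function with support (left, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Assumed fuzzy sets over inputs scaled to [0, 1].
SETS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}

# The two example rules from the slide: (antecedent, consequent).
RULES = [
    ({"INPUT_1": "low", "INPUT_2": "medium"}, ("OUTPUT_1", "high")),
    ({"INPUT_1": "high"}, ("OUTPUT_1", "low")),
]


def firing_strength(antecedent, inputs):
    # AND is taken as the minimum of the individual membership degrees.
    return min(triangular(inputs[name], *SETS[label])
               for name, label in antecedent.items())


inputs = {"INPUT_1": 0.25, "INPUT_2": 0.5}
for antecedent, consequent in RULES:
    print(consequent, firing_strength(antecedent, inputs))
```

With INPUT_1 = 0.25 and INPUT_2 = 0.5, the first rule fires with strength 0.5 and the second with strength 0.0, so only the first contributes to OUTPUT_1.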

31 Relevance to the Army
  Collaborators: Jeff Passner, John Raby (ARL)
  IMETS weather modeling
    Post processing used to predict additional parameters
      Visibility, Turbulence, Fog, etc.
  Use of Knowledge Discovery to Predict Parameters

32 Visibility Application
  Generate and tune a system that can predict visibility based on input parameters
  Tasks for the fuzzy genetic system
    Search for a set of k rules from p possible rules that describe the relationship of the input parameters with the output (visibility)
    Concurrently discover the architecture and optimize the performance of the knowledge bases in relation to the k rules
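
The search for a small rule set can be sketched with a plain genetic algorithm over bit strings, where bit i says whether rule i of the p candidates is used. This is a generic illustration and not the three-level hybrid system described in the slides. The toy fitness function stands in for a visibility-prediction score, and the "useful" rule indices are invented for the example.

```python
import random


def simple_ga(p, fitness, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(p)]
                  for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        best_history.append(fitness(population[0]))
        next_gen = [population[0][:]]  # elitism: carry the best forward
        while len(next_gen) < pop_size:
            # Binary tournament selection of two parents.
            a = max(rng.sample(population, 2), key=fitness)
            b = max(rng.sample(population, 2), key=fitness)
            cut = rng.randint(1, p - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    best_history.append(fitness(best))
    return best, best_history


# Hypothetical stand-in for prediction quality: rules 1 and 4 are the only
# helpful ones, and every active rule costs 0.1 to favour small rule sets.
USEFUL = {1, 4}


def toy_fitness(chromosome):
    hits = sum(1 for i in USEFUL if chromosome[i])
    return hits - 0.1 * sum(chromosome)


best, history = simple_ga(6, toy_fitness)
print(best, history[-1])
```

The penalty per active rule plays the role of minimising k, while the reward term plays the role of prediction accuracy. In a real application the fitness would come from evaluating the fuzzy rule base against observed visibility data.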

33 Results for Low Visibility Classifier

34 Results for Medium Visibility Classifier

