Parallel Feature Identification and Elimination from a CFD Dataset


Jeremy Davis, CSE 260, November 30, 2006

Introduction
Analysis of scientific data places a high demand on computing resources, both in computational complexity (processing cost) and in large data sets (memory and I/O cost). Parallel processing can help by splitting computation among multiple processors and by providing a larger overall memory size.

Datasets
This project uses a set of 3D computational fluid dynamics (CFD) simulation datasets containing discrete field data. Each point contains flow velocity (X, Y, and Z directions), pressure, and density values. The datasets come in two sizes, 385 x 130 x 194 (370 MB) and 642 x 193 x 385 (1830 MB), with one file per time increment.
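As a rough sanity check on the stated sizes, the sketch below estimates the size of one time step from the grid dimensions. It assumes each of the five values per point is stored as an 8-byte floating-point number; the slides do not state the storage format, so `BYTES_PER_VALUE` and the function name are illustrative only. Under that assumption the estimates come out close to the stated figures.

```python
# Rough size estimate for one time step of each grid.
# Assumption (not stated in the slides): values are 8-byte floats.
VALUES_PER_POINT = 5   # velocity X, Y, Z + pressure + density
BYTES_PER_VALUE = 8    # assumed double precision


def timestep_size_mib(nx, ny, nz):
    """Return the estimated size in MiB of one nx x ny x nz time step."""
    points = nx * ny * nz
    return points * VALUES_PER_POINT * BYTES_PER_VALUE / 2**20


small = timestep_size_mib(385, 130, 194)
large = timestep_size_mib(642, 193, 385)
print(f"small grid: {small:.0f} MiB, large grid: {large:.0f} MiB")
```

Any difference from the stated sizes may come from file headers or from format details that are not described here.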

Analysis
Analysis consisted of calculating vorticity at each point and identifying features that match certain characteristics. Vorticity is calculated from the flow velocities of nearby points. Thresholding is used to identify points that qualify as a feature.
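Vorticity is the curl of the velocity field. The sketch below evaluates it at one interior grid point using second-order central differences on the six neighbouring points. The slides only say that nearby points are used, so the stencil, the uniform spacing `h`, and the function name `vorticity_at` are assumptions made for illustration.

```python
def vorticity_at(u, v, w, i, j, k, h=1.0):
    """Curl of the velocity field (u, v, w) at interior grid point (i, j, k).

    u, v, w are nested lists indexed [i][j][k] along x, y, z; h is the
    (assumed uniform) grid spacing. Uses second-order central differences.
    """
    def d_dx(f):
        return (f[i + 1][j][k] - f[i - 1][j][k]) / (2 * h)

    def d_dy(f):
        return (f[i][j + 1][k] - f[i][j - 1][k]) / (2 * h)

    def d_dz(f):
        return (f[i][j][k + 1] - f[i][j][k - 1]) / (2 * h)

    wx = d_dy(w) - d_dz(v)
    wy = d_dz(u) - d_dx(w)
    wz = d_dx(v) - d_dy(u)
    return wx, wy, wz


# Test field: rigid rotation about the z axis, u = -y, v = x, w = 0.
# Its curl is (0, 0, 2) everywhere.
n = 3
u = [[[-float(j) for k in range(n)] for j in range(n)] for i in range(n)]
v = [[[float(i) for k in range(n)] for j in range(n)] for i in range(n)]
w = [[[0.0 for k in range(n)] for j in range(n)] for i in range(n)]
print(vorticity_at(u, v, w, 1, 1, 1))
```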

Vorticity Features
Columns: points corresponding to high vertical vorticity and low horizontal vorticity.
Dislocations: points corresponding to low vertical vorticity and high horizontal vorticity.
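The two feature definitions can be read as a pair of threshold tests on each point's vorticity. The sketch below is one possible reading, not the project's actual criteria: it assumes z is the vertical axis, takes the horizontal vorticity as the magnitude of the x and y components, and uses hypothetical `low` and `high` thresholds.

```python
import math


def classify_point(wx, wy, wz, low, high):
    """Label a point as 'column', 'dislocation', or None.

    Assumes z is the vertical axis; low and high are hypothetical
    thresholds on the vorticity magnitudes.
    """
    horizontal = math.hypot(wx, wy)   # magnitude of the x and y components
    vertical = abs(wz)
    if vertical >= high and horizontal <= low:
        return "column"
    if vertical <= low and horizontal >= high:
        return "dislocation"
    return None


print(classify_point(0.1, 0.0, 5.0, low=0.5, high=2.0))
print(classify_point(3.0, 4.0, 0.2, low=0.5, high=2.0))
print(classify_point(1.0, 1.0, 1.0, low=0.5, high=2.0))
```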

Analysis Steps
The analysis proceeds in five steps: partition the data to allow parallel computation; calculate vorticity; organize data points based on vorticity values; identify features; calculate and plot results.

Data Partitioning
The dataset is first partitioned into distinct 3D regions, and each parallel process works with a subset of the available regions. Some points are duplicated at region boundaries to allow independent vorticity calculation. This step is I/O intensive and not scalable because of device contention.
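A minimal sketch of this kind of partitioning follows. It splits one axis into contiguous blocks and widens each block by a one-point halo, so the duplicated boundary points are part of the block's read range. A 3D region would combine one such block per axis. The halo width, the round-robin assignment of regions to processes, and both function names are assumptions, not details taken from the project.

```python
def split_axis(n, parts, halo=1):
    """Split range(n) into `parts` contiguous blocks.

    Each block is extended by `halo` points on both sides (clipped to the
    grid) so a stencil can be evaluated without reading a neighbouring region.
    Returns (owned_start, owned_stop, read_start, read_stop) per block.
    """
    blocks = []
    for p in range(parts):
        start = p * n // parts
        stop = (p + 1) * n // parts
        blocks.append((start, stop, max(0, start - halo), min(n, stop + halo)))
    return blocks


def assign_regions(num_regions, num_procs):
    """Round-robin assignment of region indices to process ranks (assumed policy)."""
    return {rank: list(range(rank, num_regions, num_procs))
            for rank in range(num_procs)}


x_blocks = split_axis(385, 4)   # 385 is the x extent of the smaller grid
print(x_blocks)
print(assign_regions(8, 3))
```

Each owned range appears in exactly one block, while the read ranges overlap by the halo width, which is where the duplicated points come from.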

Vorticity Calculation
Each process calculates vorticity values for all points within its assigned region(s). This step is highly scalable: no communication is needed, because processes can work independently within their own regions.
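The independence claim can be illustrated in one dimension. In the sketch below, each region works on a private copy of its points plus one duplicated point on either side, and the per-region results match a single pass over the whole field. The loop runs sequentially here; each iteration stands in for work that a separate process could do. The central-difference stencil and all names are illustrative assumptions.

```python
def central_diff(values, lo, hi):
    """Central difference at indices lo..hi-1; reads values[lo-1] through values[hi]."""
    return [(values[i + 1] - values[i - 1]) / 2.0 for i in range(lo, hi)]


field = [float(i * i) for i in range(12)]
reference = central_diff(field, 1, 11)   # all interior points in one pass

# Two regions owning indices 1..5 and 6..10. Each gets a private copy that
# includes one duplicated point on either side of its owned range.
regions = [(1, 6), (6, 11)]
pieces = []
for lo, hi in regions:
    local = field[lo - 1:hi + 1]          # private copy with halo
    pieces.extend(central_diff(local, 1, len(local) - 1))

print(pieces == reference)
```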

Data Organization
As vorticity is calculated, identifiers for each point are added to a spatial data structure. Horizontal and vertical vorticity determine a point's spatial coordinates in that structure.
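The slides do not name the spatial data structure, so the sketch below uses a simple uniform grid of buckets as a stand-in. Points are keyed by their (horizontal, vertical) vorticity rather than by their position in the dataset. The class name `VorticityGrid`, the cell size, and the use of grid indices as point identifiers are all hypothetical.

```python
from collections import defaultdict


class VorticityGrid:
    """Uniform-grid index over (horizontal, vertical) vorticity space.

    A stand-in for whatever spatial structure the project actually used.
    """

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)   # (col, row) -> [(point_id, h, v), ...]

    def _cell(self, h, v):
        return (int(h // self.cell_size), int(v // self.cell_size))

    def insert(self, point_id, h, v):
        self.cells[self._cell(h, v)].append((point_id, h, v))

    def query(self, h_min, h_max, v_min, v_max):
        """Return ids of points inside the closed rectangle (finite bounds only)."""
        c0, r0 = self._cell(h_min, v_min)
        c1, r1 = self._cell(h_max, v_max)
        hits = []
        for c in range(c0, c1 + 1):
            for r in range(r0, r1 + 1):
                for pid, h, v in self.cells.get((c, r), ()):
                    if h_min <= h <= h_max and v_min <= v <= v_max:
                        hits.append(pid)
        return hits


index = VorticityGrid(cell_size=1.0)
index.insert((0, 0, 0), 0.2, 3.5)   # low horizontal, high vertical
index.insert((0, 0, 1), 2.7, 0.1)   # high horizontal, low vertical
index.insert((0, 1, 0), 0.4, 2.9)
print(sorted(index.query(0.0, 0.5, 2.0, 4.0)))
```

Only the buckets that overlap the query rectangle are visited, so points far from the thresholds are never examined.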

Identify Features
Points meeting the feature thresholds can be found via a spatial query, which only checks points that are within or close to the threshold values. Incremental queries can be done using prior results.
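One way to picture an incremental query is with a single threshold on a single value. In the sketch below, the first query uses binary search over values sorted in advance, and a later, stricter query re-checks only the earlier hits instead of the whole dataset. This one-dimensional simplification and the function names are assumptions for illustration; the project's queries ran over two vorticity coordinates.

```python
import bisect


def points_at_or_above(sorted_pairs, threshold):
    """Ids of points whose value is >= threshold.

    sorted_pairs is a list of (value, point_id) tuples sorted by value.
    """
    start = bisect.bisect_left(sorted_pairs, (threshold,))
    return [pid for _, pid in sorted_pairs[start:]]


def tighten(prior_hits, values, new_threshold):
    """Raise the threshold: only the prior hits need to be re-checked."""
    return [pid for pid in prior_hits if values[pid] >= new_threshold]


values = {"a": 0.5, "b": 1.5, "c": 2.5, "d": 3.5}
sorted_pairs = sorted((val, pid) for pid, val in values.items())

hits = points_at_or_above(sorted_pairs, 1.0)
tighter = tighten(hits, values, 2.0)
print(hits, tighter)
```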

Calculate and Plot Results
Once features are identified, the results can be visualized or further calculations can be performed, for example aggregating values for feature points, or eliminating features and analyzing the remaining points.
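As a small illustration of the two kinds of follow-up calculation, the sketch below computes the mean pressure over the feature points and over the points that remain once the features are eliminated. The choice of pressure, the use of a mean as the aggregate, and the name `summarize` are assumptions; the slides do not say which aggregates were computed.

```python
from statistics import mean


def summarize(pressure, feature_ids):
    """Mean pressure over feature points and over the remaining points.

    pressure maps point id -> value; both groups must be non-empty.
    """
    inside = [p for pid, p in pressure.items() if pid in feature_ids]
    outside = [p for pid, p in pressure.items() if pid not in feature_ids]
    return mean(inside), mean(outside)


pressure = {0: 1.0, 1: 3.0, 2: 5.0, 3: 7.0}
feature_mean, remaining_mean = summarize(pressure, feature_ids={2, 3})
print(feature_mean, remaining_mean)
```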

Performance and Scalability

Conclusions and Future Work
Analysis can be completed in parallel with good scalability, but I/O must be considered. Future work: experiment with other spatial data structures (e.g. R-tree based) and explore interactive applications.