Parallel OpenFOAM CFD Performance Studies
Student: Adi Farshteindiker
Advisors: Dr. Guy Tel-Zur, Prof. Shlomi Dolev
The Department of Computer Science, Faculty of Natural Sciences, Ben-Gurion University of the Negev

Background
Message Passing Interface (MPI) is a communication protocol for programming parallel computers. Its goals are high performance, scalability, and portability. OpenFOAM is a free, open-source Computational Fluid Dynamics (CFD) package with an extensive range of features for solving differential equations. TAU (Tuning and Analysis Utilities) is a portable profiling and tracing toolkit for performance analysis of parallel programs.
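
As a minimal illustration of the MPI model described above, the sketch below shows the basic rank/size pattern in C++ using the standard MPI C API. This is illustrative code, not code from the project; the build command (e.g. mpicxx) is the usual MPI compiler wrapper.

    // Minimal MPI sketch (illustrative): each process learns its rank and
    // the communicator size, then rank 0 collects a sum over all ranks.
    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);

        int rank = 0, size = 1;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  // this process's id
        MPI_Comm_size(MPI_COMM_WORLD, &size);  // total number of processes

        // Each rank contributes one value; rank 0 receives the sum.
        double local = static_cast<double>(rank), sum = 0.0;
        MPI_Reduce(&local, &sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            std::printf("sum over %d ranks = %f\n", size, sum);

        MPI_Finalize();
        return 0;
    }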

Specifications
The Intel Xeon E v2, from the Intel E5 family, is a processor with 12 cores. The Intel Xeon Phi, based on the Intel Many Integrated Core (Intel MIC) architecture, is a coprocessor card that enables dramatic performance gains for some of today's most demanding applications. The Xeon Phi has 60 cores with 4 threads per core.

Motivation
Our project is motivated by the observation that OpenFOAM cannot be run with an out-of-the-box profiler, so only limited information about OpenFOAM's performance is available. In general, integrating a profiler into a program requires additional work, since the program must be compiled and linked with the profiler's libraries and routines.
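
To make this concrete, here is a minimal sketch of what instrumenting a program with TAU looks like using its manual C++ instrumentation API; the TAU_* macros and the TAU.h header belong to TAU itself, while solveStep is a hypothetical function used only for illustration. For a large code base such as OpenFOAM, one would instead rely on TAU's compiler wrapper scripts, which is exactly the extra compile-and-link work described above.

    // Minimal TAU manual-instrumentation sketch (illustrative only).
    // Must be compiled and linked against TAU, e.g. through TAU's
    // compiler wrappers, so the TAU_* macros resolve.
    #include <TAU.h>

    void solveStep() {
        // Scoped timer: starts here, stops when the function returns.
        TAU_PROFILE("solveStep()", "", TAU_USER);
        // ... numerical work to be measured ...
    }

    int main(int argc, char** argv) {
        TAU_PROFILE("main()", "", TAU_DEFAULT);
        TAU_PROFILE_SET_NODE(0);  // required for non-MPI programs
        for (int i = 0; i < 10; ++i)
            solveStep();
        return 0;
    }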

Subject
In this project, we focus on three main topics:
1. Analyze OpenFOAM's programming model, code structure, and compilation procedures.
2. Study basic parallel programming models on the Xeon Phi processor (a minimal sketch follows this list).
3. Integrate the TAU profiler into an OpenFOAM case study.
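
For topic 2, one of the most basic models on the Xeon Phi in native mode is plain OpenMP spread across the card's hardware threads (up to 60 cores x 4 threads = 240 threads). A minimal sketch, assuming any compiler with OpenMP support:

    // Minimal OpenMP sketch (illustrative): a data-parallel loop that can
    // fan out across all hardware threads of a Xeon Phi in native mode.
    // Compile with OpenMP enabled (e.g. -fopenmp or -qopenmp).
    #include <omp.h>
    #include <cstdio>
    #include <vector>

    int main() {
        const int n = 1 << 20;
        std::vector<double> a(n, 1.0), b(n, 2.0), c(n, 0.0);

        #pragma omp parallel for
        for (int i = 0; i < n; ++i)
            c[i] = a[i] + b[i];  // each thread handles a chunk of the loop

        std::printf("ran with up to %d threads\n", omp_get_max_threads());
        return 0;
    }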

Workflow
Stage 1:
- Execute OpenFOAM on Intel Xeon
- Build TAU on Intel Xeon
- Integrate OpenFOAM and TAU on Intel Xeon
Stage 2:
- Execute OpenFOAM on Intel Xeon Phi
- Build TAU on Intel Xeon Phi
- Integrate OpenFOAM and TAU on Intel Xeon Phi

Conclusions
We showed a way to compile and build OpenFOAM's libraries on the Xeon Phi processor. Analysis of OpenFOAM's solver with TAU reveals the bottleneck areas in our computation.