CcaEcloud Phase I Wrap-up Phase I DOE SBIR Stefan Muszala, PI DOE Grant No DE-FG02-08ER85152 Tech-X Corporation Boulder, CO Updates: onRamp, FACETS+Babel,


CcaEcloud Phase I Wrap-up Phase I DOE SBIR Stefan Muszala, PI DOE Grant No DE-FG02-08ER85152 Tech-X Corporation Boulder, CO Updates: onRamp, FACETS+Babel, Babel Structs

Accelerator simulations play vital near-, medium-, and long-term roles
- Particle accelerator programs play a significant role in 14 of 28 DOE laboratories, spanning DOE offices such as the Offices of High Energy Physics (HEP), Nuclear Physics (NP) and Basic Energy Sciences (BES) (Facilities for the Future of Science)
- Accelerator simulation is required throughout the life cycle of accelerators in four areas:
  - Design
  - Analysis
  - Optimization
  - Upgrading

High-performance accelerator software should allow complex applications while promoting good software engineering practices
- Software reuse and common interfaces
- Ability to compose simulations
- Portability
- Mixed-language programming interoperability
- Performance analysis of composed simulations

Since the last CCA meeting: finished Phase I work and prepared for Phase II
- Channel driver component for isolating the space charge kick calculation
  - Tweaked the SIDL interfaces relative to those used for the electron cloud component
- Apply performance analysis
  - Model 1-processor performance while increasing problem size
  - Model 1-4 processor performance (two dual-core AMD Opterons)
- Modifications to Bocca by Stephen Tramer
  - Splicer block protection
  - Bocca change
  - Bocca copy

The original Synergia2 channel driver exercises different space charge routines and provides a concise test bed for a CCA implementation
- Results are comparable
[Figure: plots with axes labeled horizontal width (m) and longitudinal position (m)]

After instrumentation, we can see Solve and Kick behavior even with substantial call overhead

The overhead is due to the existing Synergia2 method structure

Single-processor performance of the space charge calculation goes as $N^3$
- Number of particles = $N^3$, where $N$ = grid size and particles/cell = 1
- Core work consists of two triple-nested for loops over 3 dimensions: $6(N^3)$
- $T_1 = T_u \cdot 6(N^3)$ [...] + [...]
- $T_u$ = {min | max | average} time for a cell update; we use min
- Need to study why [...] and [...] (min [...] $T_u$)
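The single-processor model can be sketched in a few lines of Python. This is only an illustration of the $N^3$ scaling stated on the slide, not the code used in the project. The function name `serial_time` is made up for this sketch, and the `scale` and `offset` parameters are hypothetical stand-ins for the additional terms in the slide's $T_1$ expression, whose symbols did not survive in this transcript. With the defaults they have no effect.

```python
def serial_time(n, t_u, scale=1.0, offset=0.0):
    """Estimate single-processor time for an n x n x n grid.

    n      : grid size per dimension (one particle per cell, so n**3 particles)
    t_u    : time for one cell update (the slide uses the measured minimum)
    scale  : hypothetical multiplicative fit coefficient (assumption)
    offset : hypothetical additive fit coefficient (assumption)
    """
    if n < 1:
        raise ValueError("grid size must be at least 1")
    core_work = 6 * n ** 3  # two triple-nested loops over 3 dimensions
    return scale * t_u * core_work + offset


if __name__ == "__main__":
    # Doubling the grid size multiplies the estimated time by 8.
    for n in (16, 32, 64):
        print(n, n ** 3, serial_time(n, t_u=1.0))
```

Because the work term is cubic in the grid size, doubling $N$ increases the estimate by a factor of eight whenever `offset` is zero.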

Multi-processor performance starts with Amdahl's law
- Start with Amdahl's law, $T_P = S + Q/P$, and let $f = S/(S + Q)$ be the fraction of serial work
- Since $T_1 = S + Q$, Amdahl's law is now $T_P = f T_1 + (1 - f) T_1 / P$
- Account for communication for PEs > 0
- Substitute for $T_1$ and $T_{comm}$
- $T_u$ = {min | max | average} time for a cell update, from the serial model; we use min for now
- Need to actually quantify $f$ (cycle and instruction count) and understand messaging better
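A minimal Python sketch of the Amdahl form above, under stated assumptions: the slide does not give the form of the communication term, so `t_comm` is treated here as a single additive cost supplied by the caller. The function names `amdahl_time` and `speedup` are illustrative only.

```python
def amdahl_time(t1, f, p, t_comm=0.0):
    """Predicted run time on p processors.

    t1     : single-processor time (for example, from the serial model)
    f      : fraction of serial work, S / (S + Q)
    p      : number of processors
    t_comm : additive communication cost (assumed form, supplied by the caller)
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("serial fraction f must lie in [0, 1]")
    if p < 1:
        raise ValueError("processor count p must be at least 1")
    return f * t1 + (1.0 - f) * t1 / p + t_comm


def speedup(t1, f, p, t_comm=0.0):
    """Speedup relative to the single-processor time."""
    return t1 / amdahl_time(t1, f, p, t_comm)


if __name__ == "__main__":
    # Example values only; f is not quantified in the slides.
    for p in (1, 2, 4):
        print(p, amdahl_time(100.0, 0.25, p), speedup(100.0, 0.25, p))
```

With no communication cost the speedup is bounded by $1/f$, so any nonzero `t_comm` lowers the achievable speedup further.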

Multi-processor performance model matches real data for a $32^3$ problem (32,768 particles)

Stephen Tramer was able to add features to Bocca
- Bocca splicer block protection when using bocca-merge in target files
  - --no-preserve option
  - 4 regression tests:
    - merging a protected block inside a key block
    - ignoring a protected block outside a key block
    - standard merge
    - preservation turned off
- Bocca change now supports:
  - --remove-implements
  - --remove-requires
- Bocca copy operation:
  - may now create exact duplicates of {component, class, interface, port, enum}


Update on other projects using CCA tools
- OnRamp:
  - Need to write a prototype with the onRamp mini-tutorial and autotools
  - Possible use in Synergia2/TxPhysics as well as in FACETS transport model integration
- FACETS transport model integration:
  - Will work with Tom to implement an alternative struct-passing mechanism
    - Legacy Fortran allocates memory as the callee; there is reluctance to change contributed codes to fix this
    - Concern over F2003 compilers available on Franklin and Jacquard in the near term
    - Did not want to rewrite existing f90 code to use the Array type
    - Babel doesn't support arrays of structs (and deeper nesting), so a derived-type refactor would be needed in the transport models if using the current Babel struct implementation
- Babel Structs:
  - A few hours spent so far. By the end of May, should be well on the way to implementing F77 and F90 struct support and tests. Java struct support and tests are the next target