Web and Grid Services from Pitt/CMU
Andrew Connolly, Department of Physics and Astronomy, University of Pittsburgh
with Jeff Gardner, Alex Gray, Simon Krughoff, Andrew Moore, Bob Nichol, Jeff Schneider

Serial and Parallel Applications
- ITR collaboration: University of Pittsburgh and Carnegie Mellon University
- Astronomers, computer scientists, and statisticians
- Developing fast, usually tree-based, algorithms to reduce large volumes of data
- Sample projects:
  - Source detection and cross match
  - Parallel correlation functions
  - Parallelized serial applications (through GridShell on the TeraGrid)
  - Object classification (morphologies and general classification)
  - Anomaly finding
  - Density estimation
  - Intersections in parameter space
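The first listed project, source detection and cross match, amounts to matching each detected source against a reference catalog by angular distance. A minimal brute-force sketch of the idea (the toy catalogs and the flat-sky approximation are illustrative assumptions, not the collaboration's actual code):

```python
import math

def cross_match(sources, catalog, radius_deg):
    """Brute-force cross match: for each source, return the index of the
    nearest catalog entry within radius_deg (flat-sky approximation)."""
    matches = []
    for ra1, dec1 in sources:
        best, best_d = None, radius_deg
        for j, (ra2, dec2) in enumerate(catalog):
            # small-angle, flat-sky separation in degrees
            d = math.hypot((ra1 - ra2) * math.cos(math.radians(dec1)),
                           dec1 - dec2)
            if d <= best_d:
                best, best_d = j, d
        matches.append(best)  # None if nothing lies within the radius
    return matches

# toy example: two sources, two catalog entries, 10-arcsec match radius
sources = [(10.000, 20.000), (50.0, -5.0)]
catalog = [(10.001, 20.001), (120.0, 30.0)]
print(cross_match(sources, catalog, radius_deg=10 / 3600.0))  # [0, None]
```

This double loop is O(N·M); replacing the inner scan with a tree (e.g. a k-d tree) reduces it to roughly O(N log M), which is the point of the fast tree-based algorithms mentioned above.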

WSExtractor: Source Detection
- Wraps existing services as web services
- Accessible through clients and web services
- Prototype wraps SExtractor (and interfaces to other services)
- Communication through SOAP messages and attachments
- Fully configurable through the web-service interface
- Outputs VOTables
- Dynamically cross-matches with openskyquery.net and returns the cross-matched VOTables
- Plots through Aladin and VOPlot
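The SOAP messaging mentioned above can be illustrated with the Python standard library. The operation name, parameters, and service namespace below are hypothetical placeholders, not WSExtractor's actual WSDL:

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SVC_NS = "http://example.org/wsextractor"  # hypothetical service namespace

def build_extract_request(image_url, detect_threshold):
    """Assemble a SOAP 1.1 envelope for a hypothetical extractSources call."""
    ET.register_namespace("soap", SOAP_NS)
    env = ET.Element("{%s}Envelope" % SOAP_NS)
    body = ET.SubElement(env, "{%s}Body" % SOAP_NS)
    op = ET.SubElement(body, "{%s}extractSources" % SVC_NS)
    ET.SubElement(op, "{%s}imageUrl" % SVC_NS).text = image_url
    ET.SubElement(op, "{%s}detectThreshold" % SVC_NS).text = str(detect_threshold)
    return ET.tostring(env, encoding="unicode")

request_xml = build_extract_request("http://example.org/field42.fits", 1.5)
print(request_xml)
```

The response side works the same way in reverse: the client parses the returned envelope (and any attachments, e.g. a VOTable) out of the SOAP body.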

Issues
- VOTable and Java: DataHandler type
- Working with C and .NET
- Aladin as an applet on the server: eliminates multiple transfers of images
- Concurrent access: sessions with web services
- Value validation

Grid Services
- Harnessing parallel grid resources in the NVO data mining framework
- N-pt correlation functions for SDSS data:
  - 2-pt: hours
  - 3-pt: weeks
  - 4-pt: 100 years!
- With the NVO, computational requirements will be much more extreme
- There will be many more problems for which throughput can be substantially enhanced by parallel computers
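The jump from hours to weeks to a century follows from the combinatorics: a brute-force n-point function examines every n-tuple of objects, roughly C(N, n) of them, so each extra point multiplies the work by about N/n. A quick check of that growth (the catalog size is an illustrative assumption, not the SDSS object count):

```python
from math import comb

N = 100_000  # illustrative catalog size (an assumption)

counts = {n: comb(N, n) for n in (2, 3, 4)}
for n, c in counts.items():
    print(f"{n}-pt: ~{c:.2e} distinct {n}-tuples")

# each additional point multiplies the work by roughly N/n:
print("3-pt / 2-pt work ratio: %.0f" % (counts[3] / counts[2]))
print("4-pt / 3-pt work ratio: %.0f" % (counts[4] / counts[3]))
```

Tree-based algorithms prune most of those tuples, but the cost still climbs steeply with n, which is why the higher-order functions demand parallel grid resources.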

The Challenge of Parallelism
- Parallel programs are hard to write!
- Parallel codes (especially massively parallel ones) are used by a limited number of "boutique enterprises" with the resources to develop them
- Scientific programming is a battle between run time and development time:
  - Development time must be less than run time!
  - A large development time means the code must be reused again and again to make the development worth it (e.g. an N-body hydrodynamic code)
- Even the "boutiques" find it extraordinarily difficult to conduct new queries in parallel
  - For example, all of the U. Washington "N-Body Shop's" data analysis code is still serial

A Scalable Parallel Analysis Framework
- Canned routines (e.g. "density calculator", "cluster finder", "2-pt correlation function") restrict the inquiry space
- Goal: bring the power of a distributed terascale computing grid to NVO users, providing seamless scalability from single-processor machines to teraflop platforms without restricting the inquiry space
- Toolkit:
  - Highly general
  - Highly flexible
  - Minimizes development time
  - Efficiently and scalably parallel
  - Optimized for common architectures (MPI, SHMEM, POSIX threads, etc.)

Methodology
- Identify critical abstraction sets and critical method sets from the serial-algorithm research done by the existing ITR
- Efficiently parallelize these abstraction and method sets
- Distribute them in the form of a parallel toolkit
- Note: developing a fast serial algorithm is completely different from implementing that algorithm in parallel

Parallel Npt
- Developing an efficient and scalable parallel 2-, 3-, and 4-point correlation function code
- Development on the Terascale Computing System
- Based on the parallel gravity code PKDGRAV:
  - Highly portable (MPI, POSIX threads, SHMEM, and others)
  - Highly scalable

92% of linear speedup on 512 PEs! By Amdahl's law, this means only 0.017% of the code is actually serial.
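The 0.017% figure can be checked directly from Amdahl's law, S(P) = 1 / (f + (1 - f)/P): given the observed speedup S on P processors, the implied serial fraction is f = (P/S - 1)/(P - 1). A quick verification using the two efficiency figures reported on these slides:

```python
def serial_fraction(efficiency, P):
    """Invert Amdahl's law: given parallel efficiency S/P on P
    processors, return the implied serial fraction f."""
    S = efficiency * P          # observed speedup
    return (P / S - 1) / (P - 1)

# 92% of linear speedup on 512 PEs -> ~0.017% serial
print("512 PEs: f = %.3f%%" % (100 * serial_fraction(0.92, 512)))
# 74% of linear speedup on 128 PEs -> ~0.28% serial
print("128 PEs: f = %.2f%%" % (100 * serial_fraction(0.74, 128)))
```

Note how sensitive efficiency is to f at high processor counts: even a fraction of a percent of serial work costs a quarter of the machine at 128 PEs.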

Parallel Npt Performance
- 2-pt correlation function (2°)
- So far, only 74% of linear speedup on 128 PEs

Parallel Npt Performance: 2-pt correlation function (2°)

Issues and Future Directions
- The problem is communication latencies
  - N-body inter-node communication is small; npt communication is large
  - Minimize off-processor communication
- Extend to n-pt (in progress)
- Adaptive learning for allocating resources and workload
- Interface with web services
- Not another IRAF
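The latency point can be made concrete with a simple cost model: sending m messages of b bytes each costs roughly m * (latency + b/bandwidth), so batching many small tree-walk requests into one large message amortizes the per-message latency. A toy illustration with assumed (not measured) interconnect numbers:

```python
LATENCY = 5e-6       # seconds per message (assumed interconnect latency)
BANDWIDTH = 1e9      # bytes per second (assumed)

def transfer_time(n_messages, bytes_per_message):
    """Linear cost model: per-message latency plus serialization time."""
    return n_messages * (LATENCY + bytes_per_message / BANDWIDTH)

total_bytes = 10_000_000  # same total payload either way
many_small = transfer_time(100_000, total_bytes // 100_000)
one_batch = transfer_time(1, total_bytes)

print("100k small messages: %.3f s" % many_small)  # latency-dominated
print("1 batched message:   %.3f s" % one_batch)   # bandwidth-dominated
```

Under these assumed numbers the batched transfer is dozens of times faster for the same payload, which is why minimizing off-processor communication dominates npt's scaling behavior.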