SAN DIEGO SUPERCOMPUTER CENTER Inca 2.0 Shava Smallen Grid Development Group San Diego Supercomputer Center June 26, 2006.

Slides:



Advertisements
Similar presentations
Cultural Heritage in REGional NETworks REGNET Project Meeting Content Group
Advertisements

TeraGrid Deployment Test of Grid Software JP Navarro TeraGrid Software Integration University of Chicago OGF 21 October 19, 2007.
CPSCG: Constructive Platform for Specialized Computing Grid Institute of High Performance Computing Department of Computer Science Tsinghua University.
Test harness and reporting framework Shava Smallen San Diego Supercomputer Center Grid Performance Workshop 6/22/05.
Cross-site data transfer on TeraGrid using GridFTP TeraGrid06 Institute User Introduction to TeraGrid June 12 th by Krishna Muriki
A Computation Management Agent for Multi-Institutional Grids
Condor and GridShell How to Execute 1 Million Jobs on the Teragrid Jeffrey P. Gardner - PSC Edward Walker - TACC Miron Livney - U. Wisconsin Todd Tannenbaum.
Monitoring and performance measurement in Production Grid Environments David Wallom.
1 Software & Grid Middleware for Tier 2 Centers Rob Gardner Indiana University DOE/NSF Review of U.S. ATLAS and CMS Computing Projects Brookhaven National.
Robust Tools for Archiving and Preserving Digital Data Joseph JaJa, Mike Smorul, and Mike McGann Institute for Advanced Computer Studies Department of.
Sergey Belov, LIT JINR 15 September, NEC’2011, Varna, Bulgaria.
Tools and Services for the Long Term Preservation and Access of Digital Archives Joseph JaJa, Mike Smorul, and Sangchul Song Institute for Advanced Computer.
TeraGrid Science Gateway AAAA Model: Implementation and Lessons Learned Jim Basney NCSA University of Illinois Von Welch Independent.
Simo Niskala Teemu Pasanen
MCTS Guide to Microsoft Windows Server 2008 Network Infrastructure Configuration Chapter 11 Managing and Monitoring a Windows Server 2008 Network.
Hands-On Microsoft Windows Server 2008 Chapter 11 Server and Network Monitoring.
Windows Server 2008 Chapter 11 Last Update
Network, Operations and Security Area Tony Rimovsky NOS Area Director
Monitoring and Accounting on the NGS Guy Warner NeSC TOE Team.
GIG Software Integration: Area Overview TeraGrid Annual Project Review April, 2008.
TeraGrid Information Services December 1, 2006 JP Navarro GIG Software Integration.
Open Science Grid Software Stack, Virtual Data Toolkit and Interoperability Activities D. Olson, LBNL for the OSG International.
National Center for Supercomputing Applications The Computational Chemistry Grid: Production Cyberinfrastructure for Computational Chemistry PI: John Connolly.
SOS EGEE ‘06 GGF Security Auditing Service: Draft Architecture Brian Tierney Dan Gunter Lawrence Berkeley National Laboratory Marty Humphrey University.
SAN DIEGO SUPERCOMPUTER CENTER Working with Inca Reporters Jim Hayes Inca Workshop September 4-5, 2008.
OSG Middleware Roadmap Rob Gardner University of Chicago OSG / EGEE Operations Workshop CERN June 19-20, 2006.
Module 7: Fundamentals of Administering Windows Server 2008.
INFSO-RI Enabling Grids for E-sciencE SA1: Cookbook (DSA1.7) Ian Bird CERN 18 January 2006.
Through the development of advanced middleware, Grid computing has evolved to a mature technology in which scientists and researchers can leverage to gain.
The Network Performance Advisor J. W. Ferguson NLANR/DAST & NCSA.
G RID M IDDLEWARE AND S ECURITY Suchandra Thapa Computation Institute University of Chicago.
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES Data Replication Service Sandeep Chandra GEON Systems Group San Diego Supercomputer Center.
SAN DIEGO SUPERCOMPUTER CENTER Inca Data Display (data consumers) Shava Smallen Inca Workshop September 5, 2008.
Kurt Mueller San Diego Supercomputer Center NPACI HotPage Updates.
ARGONNE NATIONAL LABORATORY Climate Modeling on the Jazz Linux Cluster at ANL John Taylor Mathematics and Computer Science & Environmental Research Divisions.
TeraGrid CTSS Plans and Status Dane Skow for Lee Liming and JP Navarro OSG Consortium Meeting 22 August, 2006.
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
SAN DIEGO SUPERCOMPUTER CENTER Inca TeraGrid Status Kate Ericson November 2, 2006.
GRIDS Center Middleware Overview Sandra Redman Information Technology and Systems Center and Information Technology Research Center National Space Science.
Scalable Systems Software for Terascale Computer Centers Coordinator: Al Geist Participating Organizations ORNL ANL LBNL.
NEES Cyberinfrastructure Center at the San Diego Supercomputer Center, UCSD George E. Brown, Jr. Network for Earthquake Engineering Simulation NEES TeraGrid.
US LHC OSG Technology Roadmap May 4-5th, 2005 Welcome. Thank you to Deirdre for the arrangements.
Next Steps.
SAN DIEGO SUPERCOMPUTER CENTER Inca Control Infrastructure Shava Smallen Inca Workshop September 4, 2008.
Grid Monitoring and Information Services: Globus Toolkit MDS4 & TeraGrid Inca Jennifer M. Schopf Argonne National Lab UK National eScience Center (NeSC)
Portal Update Plan Ashok Adiga (512)
6/23/2005 R. GARDNER OSG Baseline Services 1 OSG Baseline Services In my talk I’d like to discuss two questions:  What capabilities are we aiming for.
SAN DIEGO SUPERCOMPUTER CENTER Administering Inca with incat Jim Hayes Inca Workshop September 4-5, 2008.
Testing Grid Software on the Grid Steven Newhouse Deputy Director.
Network, Operations and Security Area Tony Rimovsky NOS Area Director
Feb 2-4, 2004LNCC Workshop on Computational Grids & Apps Middleware for Production Grids Jim Basney Senior Research Scientist Grid and Security Technologies.
SAN DIEGO SUPERCOMPUTER CENTER Welcome to the 2nd Inca Workshop Sponsored by the NSF September 4 & 5, 2008 Presenters: Shava Smallen
TeraGrid QA/INCA Turnover Jeff Koerner Q meeting December 8, 2010.
Quality Assurance (QA) Working Group Update July 1, 2010 Kate Ericson (SDSC) Shava Smallen (SDSC)
Monitoring Guy Warner NeSC Training.
CERN Certification & Testing LCG Certification & Testing Team (C&T Team) Marco Serra - CERN / INFN Zdenek Sekera - CERN.
ITMT 1371 – Window 7 Configuration 1 ITMT Windows 7 Configuration Chapter 8 – Managing and Monitoring Windows 7 Performance.
SAM architecture EGEE 07 Service Availability Monitor for the LHC experiments Simone Campana, Alessandro Di Girolamo, Nicolò Magini, Patricia Mendez Lorenzo,
A System for Monitoring and Management of Computational Grids Warren Smith Computer Sciences Corporation NASA Ames Research Center.
Monitoring and Information Services Core Infrastructure (MIS-CI) Service Description Mark L. Green OSG Integration Workshop at UC Feb 15-17, 2005.
OSG Facility Miron Livny OSG Facility Coordinator and PI University of Wisconsin-Madison Open Science Grid Scientific Advisory Group Meeting June 12th.
Open Science Grid Configuring RSV OSG Resource & Service Validation Thomas Wang Grid Operations Center (OSG-GOC) Indiana University.
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite Grid Introduction Salma Saber Electronic.
TeraGrid Software Integration: Area Overview (detailed in 2007 Annual Report Section 3) Lee Liming, JP Navarro TeraGrid Annual Project Review April, 2008.
Overview – SOE PatchTT November 2015.
GWE Core Grid Wizard Enterprise (
Overview – SOE PatchTT December 2013.
Joseph JaJa, Mike Smorul, and Sangchul Song
Leanne Guy EGEE JRA1 Test Team Manager
Leigh Grundhoefer Indiana University
Presentation transcript:

SAN DIEGO SUPERCOMPUTER CENTER Inca 2.0 Shava Smallen Grid Development Group San Diego Supercomputer Center June 26, 2006

SAN DIEGO SUPERCOMPUTER CENTER You know you have [a distributed system] when the crash of a computer you've never heard of stops you from getting any work done. -- Leslie Lamport Simple Grid application Grid Reliability Grid computing: The ability to dynamically link resources together as an ensemble to support the execution of large- scale, resource-intensive, and distributed applications

SAN DIEGO SUPERCOMPUTER CENTER Is the Grid up? Can I login? Are Grid services the application[s] use available? Compatible versions? Are dataset[s] N accessible to user X? Credentials? … Can user X run application[s] Y on Grid[s] Z? Access dataset[s] N?

SAN DIEGO SUPERCOMPUTER CENTER Testing a Grid 1.Iteratively define a set of concrete requirements 2.Write tests to verify requirements 3.Periodically run tests and collect data 4.Publish data Automate Steps 3 and 4

SAN DIEGO SUPERCOMPUTER CENTER What type of testing? Deployment testing Automated, continuous checking of Grid services, software, and environment Installed? Configured correctly? Running? Accessible to users? Acceptable performance? E.g., gatekeeper ping or scaled down application Software Package (unit, integrated) Software Stack (interoperability) Software Deployment NMI Junit, PyUnit, Tinderbox

SAN DIEGO SUPERCOMPUTER CENTER Who are the consumers? Grid/VO management Responsible for designing & maintaining requirements Verify requirements are fulfilled by resource providers System administrators Notified of problems Enough information to understand context of problem End users View results and compare to problems they are having Debug user account/environment issues Advanced users: feedback to Grid/VO

SAN DIEGO SUPERCOMPUTER CENTER Inca Inca is a framework for the automated testing, benchmarking and monitoring of Grid resources Inca provides: Scheduled execution of information gathering scripts (reporters) Data management collection archiving publishing

SAN DIEGO SUPERCOMPUTER CENTER Incas primary objective: user-level Grid functionality testing and performance measurement Related Grid monitoring tools Hawkeye MDS

SAN DIEGO SUPERCOMPUTER CENTER Unique features of Inca Debugging Runs under a regular user account Flexibly expresses results Captures reporter execution context Securely re-runs reporters Archives full reports Reporters can be run outside framework

SAN DIEGO SUPERCOMPUTER CENTER Unique features of Inca (cont.) Compares results to a specification Easily and securely configured Data collection Installation Profiles and logs reporter resource use

SAN DIEGO SUPERCOMPUTER CENTER Outline Inca in use Architecture overview Software Status

SAN DIEGO SUPERCOMPUTER CENTER Inca today Version 1 aka available from website and NMI distribution Version 2 pre-release Available as of 02/06 Production version available in late summer Both versions of Inca are currently being used in production environments

SAN DIEGO SUPERCOMPUTER CENTER Inca in use 1)Software stack validation and verification (v1) 2)Network bandwidth measurements (v1) 3)Grid benchmarking

SAN DIEGO SUPERCOMPUTER CENTER 1) Inca in use: TeraGrid software stack V&V TeraGrid - an enabling cyberinfrastructure for scientific research ANL, Indiana Univ., NCSA, ORNL, PSC, Purdue Univ., SDSC, TACC 40+ TF, 1+ PB, 40Gb/s net Common TeraGrid Software & Services Common user environment across heterogeneous resources TeraGrid VO service agreement

SAN DIEGO SUPERCOMPUTER CENTER 1) Inca in use: TeraGrid software stack V&V Common software stack: 20 core packages: Globus, SRB, Condor-G, MPICH-G2, OpenSSH, SoftEnv, etc. 9 viz package/builds: Chromium, ImageMagick, Mesa, VTK, NetPBM, etc. 21 IA-64/Intel/Linux packages: glibc, GPFS, PVFS, OpenPBS, intel compilers, etc. 50 version reporters: compatible versions of SW 123 tests/resource: package functionality Services: Globus GRAM, GridFTP, MDS, SRB, DB2, MyProxy, OpenSSH Cross-site: Globus GRAM, GridFTP, OpenSSH

1) Inca in use: TeraGrid deployment 8 sites/17 resources Run under user account inca

1) Inca in use: Summary status page History of percentage of tests passed in Grid category for a 6 month period All tests passed: 100% One or more tests failed: < 100% Key Tests not applicable to machine or have not yet been ported

1) Inca in use: Detailed Status View SW packages Resources

1) Inca in use: Detailed view

2) Inca in use: Comparison of end-to- end bandwidth measurement tools Joint work with Margaret Murray (TACC) and Martin Swany (UDel) Deployed to TeraGrid, GEON Compare bandwidth measurement tools: Pathload [Dovrolis] Pathchirp [Ribeiro] NWS ping [Wolski]

SAN DIEGO SUPERCOMPUTER CENTER 3) Inca in use: Grid benchmarks GrASP: Grid Assessment Probes Omid Khalili, et al., Acquiring and Using Benchmark Data from Computational Grids, accepted for Grid 2006 Set of probes designed to emulate Grid applications Deployed to GEON and TeraGrid Gather probe on TeraGrid

SAN DIEGO SUPERCOMPUTER CENTER 3) Inca in use: Measuring Grid middleware performance

SAN DIEGO SUPERCOMPUTER CENTER 3) Inca in use: Monitoring Grid middleware reliability

SAN DIEGO SUPERCOMPUTER CENTER 3) Inca in use: Error tracking over time

SAN DIEGO SUPERCOMPUTER CENTER Outline Inca in use Architecture overview Software Status

Architecture overview Resource 2 Resource 1 Resource N … Reporter Manager Reporter Manager Reporter Manager Reporter Repository Incat Reporter Agent Depot 1.Create a suite 2.Submit suite to Reporter Agent Suite Depot 3. Reporter Agent invokes Reporter Managers and distributes suite and reporters R R R S R S S 4. Reporter Managers send data to Depot 5. GUIs can display collected data by querying Depot Data Consumers

Architecture overview: Scalable design Resource 2 Resource 1 Resource 3 Incat Reporter Repository Reporter Agent Depot Reporter Manager Reporter Manager Reporter Manager VO A Resource 4 Resource 5 Incat Reporter Repository Reporter Agent Depot Reporter Manager Reporter Manager VO B Forward suite Forward results

SAN DIEGO SUPERCOMPUTER CENTER Outline Inca in use Architecture overview Software Status

SAN DIEGO SUPERCOMPUTER CENTER New features of v2 Full report archiving Flexible querying interface Improved installation and configuration control GUI tool for centralized administration Proxy management via MyProxy Reporter sharing via repositories Binary distribution Profile reporter system usage Inca components communicate using SSL

SAN DIEGO SUPERCOMPUTER CENTER Software Status 2.0 Pre-release Available as of February 6, 2006 More integration/stability testing Not recommended for production deployments Binary distribution 2.0 Production release in August Source and binary distributions

SAN DIEGO SUPERCOMPUTER CENTER Supported by: Inca Information Announcements: Bugs/Feature Requests: Website:

SAN DIEGO SUPERCOMPUTER CENTER Add Interactive Reporter Details Submit information or suggestions View error history View historical graphs of run time, CPU time, memory usage