OSG Information Services, VO Monitoring Services and Resource Selection Services
Gabriele Garzoglio, Rob Quick, Chris Green
July 25, 2007


July 25, 2007, slide 1/21: OSG Information Services (Gabriele Garzoglio, Rob Quick, Chris Green)

OSG Information Services, VO Monitoring Services and Resource Selection Services
Gabriele Garzoglio, Chris Green, Computing Division, Fermilab
Rob Quick, Indiana University
OSG User Meeting & OSG Site Administrators Meeting, July 2007

OSG Information Services Architecture
The VO Resource Service (VORS)
The OSG Resource Selection Service (ReSS)
ClassAd Matchmaking
How these affect the Sites
How these affect the User

Context

The OSG Information Services have 4 goals:
– Provide static and “real-time” (where real-time is still evolving) information about Resource configurations and state.
– Feed OSG-wide monitoring tools and provide interfaces to this information for Grid operations, VOs and Users.
– Provide information for interoperation of OSG and EGEE for LHC Experiments and WLCG operations.
– Provide information for resource selection by OSG VOs and Users.

Please ask questions during this talk

We are looking for input, feedback and guidance.

OSG IS Architecture

[Architecture diagram. Layers: Site, Grid, VO, separated by the Grid / Site Interface and the VO / Grid Interface. Components: Static Info (LDIF), Info Providers, Config, Generic Info Providers (GIP), Site Info Publisher (CEMon), ReSS, BDII, WLCG BDII, VORS, VORS Probes, Condor Schedd, Condor Matchmaker, EGEE Resource Broker (RB). Function labels: Configuration, Info Gathering, Info Formatting, Info Publishing, Info Collection, Info Display, Job Queue, Job / Res. Match. Data formats on the links: LDIF, Classad.]

VORS in OSG IS

[Architecture diagram repeated from the “OSG IS Architecture” slide.]

What VORS does for you…

Allows VO users to pick which sites support their VO
Provides critical site info to a VO user
Gives users a snapshot of current grid and site status
Will provide a facility for users to look at other Grids from an OSG PO


ReSS in OSG IS

[Architecture diagram repeated from the “OSG IS Architecture” slide.]

ReSS Motivations

Implement a light-weight cluster selector for push-based job handling services
Enable users to express requirements on the resources in the job description
Enable users to refer to abstract characteristics of the resources in the job description
Provide soft-registration for clusters
Use the standard characterizations of the resources via the Glue Schema

ReSS Technology

ReSS bases its central services on the Condor Matchmaking service
– Users of Condor-G naturally integrate their scheduler servers with ReSS
– Condor information collector manages resource soft registration
Resource characteristics are handled at sites by the EGEE gLite CE Monitor Service (CEMon)
– CEMon registers with the central ReSS services at startup
– Info is gathered by CEMon at sites running Generic Information Providers (GIP)
– GIP expresses resource information via the Glue Schema model
– CEMon converts the information from GIP into old classad format. Other supported formats: XML, LDIF, new classad
– CEMon publishes information using web services interfaces
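
As a rough illustration of the conversion step, the sketch below renders a flat set of Glue attributes as old-format classad text, one `Attribute = value` pair per line. The function names `format_value` and `to_old_classad` and the sample dictionary are hypothetical; this is not CEMon's code, only a minimal model of the output format that appears in the resource description later in the talk.

```python
def format_value(value):
    """Render one Python value using old-classad literal syntax."""
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    # Everything else is emitted as a double-quoted string.
    # Assumption: an embedded double quote is escaped as \" and nothing else.
    escaped = str(value).replace('"', '\\"')
    return f'"{escaped}"'


def to_old_classad(attrs):
    """Return old-classad text: one 'Name = value' line per attribute."""
    return "\n".join(f"{name} = {format_value(v)}" for name, v in attrs.items())


# Hypothetical subset of the Glue attributes a site might publish.
glue_attrs = {
    "GlueSiteName": "TTU-ANTAEUS",
    "GlueCEStateFreeCPUs": 0,
    "GlueHostNetworkAdapterOutboundIP": True,
}
classad_text = to_old_classad(glue_attrs)
print(classad_text)
```

Strings are quoted, numbers are bare, and booleans use the upper-case `TRUE`/`FALSE` spelling seen in the slide's example.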

A case study: VO Schedd to interact with ReSS

[Architecture diagram repeated from the “OSG IS Architecture” slide.]

VO Condor-Schedd interacts with ReSS

[Diagram. Layers: Site, Grid, VO, separated by the Grid / Site Interface and the VO / Grid Interface. Three clusters, each with a CE gatekeeper (Gate1, Gate2, Gate3) and its job-managers, a GIP, and a CEMon; arrows labelled “jobs” and “info”. The Info Gatherer passes “classads” to the Condor Match Maker. The Condor Scheduler receives a job, asks “What Gate?”, gets the answer “Gate 3”, and sends the job there.]

ReSS Info Gatherer is the Interface Adapter between CEMon and Condor
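
The “What Gate?” exchange in the diagram can be modelled as a small matchmaking loop. This is a simplified Python sketch, not Condor's matchmaker: classads are plain dictionaries, requirements are Python callables, and the names `match`, `make_gate` and the sample gate ads are made up for illustration.

```python
def match(job_ad, resource_ads):
    """Return the first resource ad whose requirements and the job's both hold."""
    for resource in resource_ads:
        # Matching is two-sided: the job must accept the resource
        # and the resource must accept the job.
        if job_ad["Requirements"](resource) and resource["Requirements"](job_ad):
            return resource
    return None


def make_gate(name, vo, cur_matches):
    """Build a hypothetical resource ad for one gatekeeper."""
    ad = {"Name": name, "VO": vo, "CurMatches": cur_matches}
    # Mirrors the slide's Requirements = (CurMatches < 10); reading
    # CurMatches from the resource's own ad is an assumption of this sketch.
    ad["Requirements"] = lambda job: ad["CurMatches"] < 10
    return ad


gates = [
    make_gate("Gate1", "cms", 0),
    make_gate("Gate2", "dzero", 10),   # already at its match limit
    make_gate("Gate3", "dzero", 2),
]

job = {
    "Owner": "user",
    # The job only accepts resources that support its VO.
    "Requirements": lambda resource: resource["VO"] == "dzero",
}

chosen = match(job, gates)
print(chosen["Name"])
```

With these made-up ads the first gate supports the wrong VO and the second is at its match limit, so the job is steered to the third.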

User Interacts with Schedd and ReSS

Job Description (slide callouts: Abstract Resource Characteristic, Resource Requirements):

    universe = globus
    globusscheduler = $$(GlueCEInfoContactString)
    requirements = TARGET.GlueCEAccessControlBaseRule == "VO:DZero"
    executable = /bin/hostname
    arguments = -f
    queue

Resource Description:

    MyType = "Machine"
    Name = "antaeus.hpcc.ttu.edu:2119/jobmanager-lsf-dzero"
    Requirements = (CurMatches < 10)
    ReSSVersion = "1.0.6"
    TargetType = "Job"
    GlueSiteName = "TTU-ANTAEUS"
    GlueSiteUniqueID = "antaeus.hpcc.ttu.edu"
    GlueCEName = "dzero"
    GlueCEUniqueID = "antaeus.hpcc.ttu.edu:2119/jobmanager-lsf-dzero"
    GlueCEInfoContactString = "antaeus.hpcc.ttu.edu:2119/jobmanager-lsf"
    GlueCEAccessControlBaseRule = "VO:dzero"
    GlueCEHostingCluster = "antaeus.hpcc.ttu.edu"
    GlueCEInfoApplicationDir = "/mnt/lustre/antaeus/apps"
    GlueCEInfoDataDir = "/mnt/hep/osg"
    GlueCEInfoDefaultSE = "sigmorgh.hpcc.ttu.edu"
    GlueCEInfoLRMSType = "lsf"
    GlueCEPolicyMaxCPUTime = 6000
    GlueCEStateStatus = "Production"
    GlueCEStateFreeCPUs = 0
    GlueCEStateRunningJobs = 0
    GlueCEStateTotalJobs = 0
    GlueCEStateWaitingJobs = 0
    GlueClusterName = "antaeus.hpcc.ttu.edu"
    GlueSubClusterWNTmpDir = "/tmp"
    GlueHostApplicationSoftwareRunTimeEnvironment = "MountPoints,VO-cms-CMSSW_1_2_3"
    GlueHostMainMemoryRAMSize = 512
    GlueHostNetworkAdapterInboundIP = FALSE
    GlueHostNetworkAdapterOutboundIP = TRUE
    GlueHostOperatingSystemName = "CentOS"
    GlueHostProcessorClockSpeed = 1000
    GlueSchemaVersionMajor = 1
    …
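
The `$$(GlueCEInfoContactString)` reference in the job description is filled in from the matched resource classad, which is how a job can name an abstract characteristic instead of a specific gatekeeper. The Python sketch below imitates that substitution and the requirements check. `expand_dollar_dollar` and `vo_matches` are hypothetical helpers, and the case-insensitive comparison is an assumption about how ClassAd `==` treats strings (it would explain why "VO:DZero" in the job matches "VO:dzero" in the resource).

```python
import re


def expand_dollar_dollar(text, matched_ad):
    """Replace each $$(Attr) in text with Attr's value from the matched ad."""
    def lookup(m):
        return str(matched_ad[m.group(1)])
    return re.sub(r"\$\$\((\w+)\)", lookup, text)


def vo_matches(required_rule, matched_ad):
    """Mimic TARGET.GlueCEAccessControlBaseRule == "<rule>", ignoring case."""
    return matched_ad["GlueCEAccessControlBaseRule"].lower() == required_rule.lower()


# Two attributes copied from the resource description on the slide.
resource_ad = {
    "GlueCEInfoContactString": "antaeus.hpcc.ttu.edu:2119/jobmanager-lsf",
    "GlueCEAccessControlBaseRule": "VO:dzero",
}

submit_line = "globusscheduler = $$(GlueCEInfoContactString)"
if vo_matches("VO:DZero", resource_ad):
    expanded = expand_dollar_dollar(submit_line, resource_ad)
    print(expanded)
```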

Does this sound like something you need to do? (Users)
Does this sound reasonable to you? (Site Admins)

ReSS Deployment on OSG

Click here for live URL

Status of ReSS

ReSS is a lightweight Resource Selection Service for push-based job handling systems
ReSS is deployed on OSG as a general service: talk to us if you are interested!
DZero and Engagement VO use ReSS on OSG
ReSS is used by FermiGrid for campus-wide resource selection
More info at election/

What Sites Need to Do

Configure GIPs correctly so they show Green on the GIP monitor: http://gip-validate.grid.iu.edu/production/index.html
Make sure VORS reports correct info for your site
Make sure CEMon reports info from your site: History.html
Ask for help from if you have any questions or

What VOs and Users need to do

Understand the parameters needed to select resources where your applications can run
Interface the Information services to your application AND/OR use one of the OSG provided resource selectors (details in hidden slides).

Conclusions

OSG Information Services exist and are used in patches, but the information provided is not yet complete or uniform.
We need the Sites to pay attention to the information content and configurations.
We support Users who want to use any or all of the tools.
OSG has a focus on Usability and Robustness over the next 12 months.

Additional Slides for More Detailed Information

User Interaction with ReSS

The ReSS exposes information via condor collector interfaces
– Programmatically: via a Web Service interface
– Command line: via condor_status
  Examples: Tools
  The Engagement VO gets OSG info from ReSS and does match making via a VO Match Making Service: ntVO
– Condor scheduler interaction with ReSS
  See how to connect a scheduler directly to the OSG ReSS (à la DZero):
  See how FermiGrid uses ReSS for campus-wide resource selection:
– Glue Schema Attributes definition:
– FermiGrid classads:
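
For the command-line route, a query could be assembled along these lines. The sketch only builds the `condor_status` argument list and does not run it. The collector address `ress.example.org` is a placeholder, not the real OSG ReSS host, and `build_query` is a hypothetical helper; `-pool`, `-constraint` and `-format` are standard `condor_status` options.

```python
import shlex


def build_query(pool, vo):
    """Build a condor_status command listing contact strings for one VO."""
    constraint = f'GlueCEAccessControlBaseRule == "VO:{vo}"'
    return [
        "condor_status",
        "-pool", pool,                      # which collector to query
        "-constraint", constraint,          # keep only ads for this VO
        "-format", "%s\\n", "GlueCEInfoContactString",
    ]


cmd = build_query("ress.example.org", "dzero")   # placeholder host
print(shlex.join(cmd))
```

The list can be handed to `subprocess.run(cmd)` on a machine with the Condor client tools installed; the attribute names are the ones shown in the resource description slide.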

Glue Schema to old classad Mapping

Mapping the Glue Schema “tree” into a set of “flat” classads: all possible combinations of (Cluster, Subcluster, CE, VO)

Glue Schema tree nodes: Site, Cluster, CE1, CE2, SubCluster1, SubCluster2, VO1, VO2, VO3

Flat classads (added one or two at a time over five slides):
– Site, Cluster, SubCluster1, CE1, VO1
– Site, Cluster, SubCluster2, CE1, VO1
– Site, Cluster, SubCluster1, CE1, VO2
– Site, Cluster, SubCluster2, CE1, VO2
– Site, Cluster, SubCluster1, CE2, VO1
– Site, Cluster, SubCluster2, CE2, VO1
– …
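
The flattening can be sketched as a Cartesian product. This Python example assumes, for simplicity, that every VO is supported on every CE, which the slide's tree does not confirm; the function name `flatten` and the dictionary keys are illustrative only.

```python
from itertools import product


def flatten(site, cluster, subclusters, ces, vos):
    """Yield one flat dict per (SubCluster, CE, VO) combination."""
    # Assumption: every VO is valid on every CE, so a full product is taken.
    for ce, vo, subcluster in product(ces, vos, subclusters):
        yield {
            "Site": site,
            "Cluster": cluster,
            "SubCluster": subcluster,
            "CE": ce,
            "VO": vo,
        }


ads = list(flatten(
    "Site", "Cluster",
    subclusters=["SubCluster1", "SubCluster2"],
    ces=["CE1", "CE2"],
    vos=["VO1", "VO2", "VO3"],
))
print(len(ads))
for ad in ads[:4]:
    print(ad["SubCluster"], ad["CE"], ad["VO"])
```

Under that assumption two subclusters, two CEs and three VOs give twelve flat classads, which shows how quickly the number of ads grows with the size of the tree.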