1 INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org Logging and Bookkeeping and Job Provenance Services Ludek Matyska (CESNET) on behalf of the JRA1 IT-CZ cluster

2 Talk Outline
Logging and bookkeeping (L&B)
 – General overview and main features
 – Deployment
 – Expected use
Job Provenance (JP)
 – Motivation
 – Overview and relationship with L&B
 – Expected use
Conclusion

3 Logging and Bookkeeping
Motivation
 – Keep track of Grid jobs
General overview
 – Capture job control flow
 – Provide job state information
 – Just-in-time or short-term post-mortem analysis
 – Support user-generated events

4 General architecture (diagram slide; the figure is not reproduced in this transcript)

5 Features
L&B events mark important points in the control flow of a job
 – Submission
 – Transfer between components
 – Matchmaking and brokerage results
 – Starting/finishing job execution
 – Events generated directly by the user
    • Only during the actual job execution
Events are delivered in a non-blocking but reliable way
Job state is computed by a fault-tolerant state machine
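
The idea of computing job state from a stream of events can be illustrated with a minimal sketch. The state names, event names, and their ordering below are hypothetical (the slides do not list them), and "fault tolerant" is interpreted here only as tolerating late, duplicate, or unknown events; the actual L&B state machine may work differently.

```python
from dataclasses import dataclass

# Hypothetical states, ranked by how far the job has progressed.
STATE_RANK = {"SUBMITTED": 0, "MATCHED": 1, "TRANSFERRED": 2, "RUNNING": 3, "DONE": 4}

# Hypothetical mapping from an event type to the state it implies.
EVENT_TO_STATE = {
    "submission": "SUBMITTED",
    "match": "MATCHED",
    "transfer": "TRANSFERRED",
    "run_start": "RUNNING",
    "run_end": "DONE",
}


@dataclass(frozen=True)
class Event:
    job_id: str
    type: str


class JobStateMachine:
    """Derives a job state from events that may arrive late or out of order."""

    def __init__(self):
        self.state = None   # no event seen yet
        self.ignored = []   # events of unknown type, kept for inspection

    def feed(self, event):
        new_state = EVENT_TO_STATE.get(event.type)
        if new_state is None:
            # Unknown event type: remember it, but do not change the state.
            self.ignored.append(event)
            return self.state
        # Only move forward; a late event never pushes the state backwards.
        if self.state is None or STATE_RANK[new_state] > STATE_RANK[self.state]:
            self.state = new_state
        return self.state


def compute_state(events):
    """Feed all events to a fresh machine and return the resulting state."""
    machine = JobStateMachine()
    for event in events:
        machine.feed(event)
    return machine.state
```

In this sketch a "match" event that arrives after "run_start" leaves the job in the RUNNING state, which is one simple way to keep the computed state stable under asynchronous delivery.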

6 User interaction
Implicit
 – Submitting a job
Explicit
 – Logging events during job execution
 – Querying the bookkeeping server
Predefined set of common queries
 – Directly available through the UI
Public API to access the bookkeeping server
 – More general, for complex queries
 – User can register to receive notifications about job state
Both reject “dangerous” queries
Support for aggregated information about DAGs

7 Interaction overview (diagram slide; the figure is not reproduced in this transcript)

8 User events
Users can store events in the bookkeeping DB
 – Non-blocking, reliable mechanism for passing job-related information
Information is available through the L&B querying mechanism
 – Through the UI or public API
Still asynchronous
 – Events from the same CE will usually arrive in the correct order
 – Internal and user-issued timestamps may help
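
Because delivery is asynchronous, a consumer of user events may want to reorder them before use. The following is a minimal sketch of one way timestamps could help; the record layout and the field names `internal_ts` and `user_ts` are assumptions made for illustration and are not taken from the L&B API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UserEvent:
    name: str
    value: str
    internal_ts: float               # assumed: set by the logging infrastructure
    user_ts: Optional[float] = None  # assumed: optionally set by the user's job


def order_events(events):
    """Return events sorted by user timestamp when present, else internal one.

    The internal timestamp is used as a tie-breaker. Mixing the two clocks is
    a simplification; it is only meaningful if they are roughly synchronised.
    """
    def sort_key(event):
        primary = event.user_ts if event.user_ts is not None else event.internal_ts
        return (primary, event.internal_ts)

    return sorted(events, key=sort_key)
```

For example, two progress events logged by the job at user times 1.0 and 2.0 are returned in that order even if the infrastructure recorded them in the opposite order.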

9 L&B deployment
EGEE
 – Around 50 production installations of bookkeeping servers
 – Over 20 000 jobs per day on average
 – Over 60 GB of data since January 2005
Other projects using EDG or EGEE middleware
 – LCG
 – CrossGrid

10 L&B Use
Provision of job state
 – Including notification
 – Feed into R-GMA
Provision of more detailed information about job flow
Debugging
 – Transfer between components, failure trace
Statistics (JRA2)
 – Time of submission, execution start and end
 – Matchmaking results, reasons for no match found
 – Failures
End-user events
 – E.g. visualization of the progress of job execution

11 Job Provenance
Motivation
 – Information about jobs has longer-term value
    • E.g. repeating the submission of a job executed a year ago
 – Information about the job control flow and the job execution environment complements the job results
    • E.g. to be able to reliably resubmit a job
Job Provenance
 – Preserve information about Grid jobs
 – Allow data mining of this information
 – Assist job re-submission

12 JP with WMS and L&B (diagram slide; the figure is not reproduced in this transcript)

13 JP Gathered Data
Data from L&B
Job inputs
 – The input sandbox
 – No copies of files in remote storage
    • However, file/collection identification is available
Execution track
 – Data (“measurements”) from CE
    • Installed software versions, environment, …
 – Accounting data
    • DGAS
User annotations
Scalability
 – Record volatile data only

14 Primary Data in JP
Job is the primary entity
Minimal set of core attributes
 – JobID, owner, registration time
Short data items: tags
 – “key = value” pairs
Bulk data: uploaded files

15 JP Job Attributes
A way to provide a generic, unified view of any job data
 – Multivalued
 – Format: “namespace:key = value”
 – Namespaces may have a defined schema
User annotations are mapped directly to Job Attributes
File-type-specific plugins
 – Process bulk files
Job Attributes are used both for internal handling and for user queries
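
The “namespace:key = value” format with multivalued attributes can be sketched as follows. This is an illustration only: the parsing rules (splitting at the first "=" and at the last ":" before it) and the class and function names are assumptions, not the JP implementation.

```python
from collections import defaultdict


def parse_attribute(line):
    """Split 'namespace:key = value' into (namespace, key, value).

    Assumptions: the first '=' separates name from value, and the last ':'
    in the name separates namespace from key, so a namespace may itself
    contain colons.
    """
    name, eq, value = line.partition("=")
    if not eq:
        raise ValueError(f"missing '=' in attribute: {line!r}")
    namespace, colon, key = name.strip().rpartition(":")
    if not colon or not namespace or not key:
        raise ValueError(f"expected 'namespace:key' in attribute: {line!r}")
    return namespace, key, value.strip()


class JobAttributes:
    """Multivalued attribute store for one job (hypothetical helper)."""

    def __init__(self):
        self._values = defaultdict(list)

    def add(self, line):
        namespace, key, value = parse_attribute(line)
        # Multivalued: repeated keys accumulate instead of overwriting.
        self._values[(namespace, key)].append(value)

    def get(self, namespace, key):
        # Return a copy so callers cannot modify the stored list.
        return list(self._values.get((namespace, key), []))
```

Adding "user:color = red" and then "user:color = blue" to the same job leaves both values retrievable under the ("user", "color") attribute, which is what "multivalued" is taken to mean here.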

16 JP Main Components
Primary storage
 – Where the data are stored “forever”
Index server

17 JP Primary Storage
Gather and store data
Process “bulk files” on demand to extract attributes
Interaction with users
 – Annotate
 – Retrieve job attributes, download files
 – Always keyed by JobID only
    • Performance and scalability
Web service control interface
gsiftp for file transfer

18 JP Index Server
Provides scalability for access
Created and configured for a particular purpose
 – Set of Primary servers to register with
 – Conditions on jobs to retrieve
    • Jobs from VO A submitted after January 1st, 2006
 – List of attributes to collect
Only a fraction of the data from Primary storage
Incremental feed from Primary storage
 – Batch feed also available (e.g. after a crash)
Complex user queries
 – May refer only to the attributes configured for the IS
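
The relationship between an Index Server's configuration and what it can answer can be sketched as below. The class names, the dictionary-based job records, and the `feed`/`query` methods are hypothetical; the sketch shows only the idea of a job condition, a list of collected attributes, and queries restricted to those attributes.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, Dict, List


@dataclass
class IndexServerConfig:
    condition: Callable[[dict], bool]  # which jobs to retrieve
    attributes: List[str]              # which attributes to collect


@dataclass
class IndexServer:
    config: IndexServerConfig
    index: Dict[str, dict] = field(default_factory=dict)

    def feed(self, job_id, job):
        """Accept one job record from primary storage (incremental feed)."""
        if not self.config.condition(job):
            return False
        # Keep only the configured attributes: a fraction of the primary data.
        self.index[job_id] = {
            name: job[name] for name in self.config.attributes if name in job
        }
        return True

    def query(self, attribute, value):
        """Return sorted IDs of indexed jobs whose attribute equals value."""
        if attribute not in self.config.attributes:
            raise KeyError(f"attribute not configured on this index: {attribute}")
        return sorted(
            job_id
            for job_id, attrs in self.index.items()
            if attrs.get(attribute) == value
        )


# Example configuration mirroring the slide: jobs from VO "A"
# submitted after January 1st, 2006, collecting two attributes.
example_config = IndexServerConfig(
    condition=lambda job: job["vo"] == "A" and job["submitted"] > datetime(2006, 1, 1),
    attributes=["vo", "owner"],
)
```

A batch feed after a crash could, under the same assumptions, simply call `feed` again for every job held in primary storage, since feeding the same job twice overwrites its index entry with identical data.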

19 JP – Current Status
Prototype implementation
 – Included in gLite 1.5
 – Limited IS configuration
 – Supported files
    • L&B and input sandbox
Plans
 – Available from GUI
 – Complex authorization (VOMS-based)
 – Support for re-submission of jobs

20 Conclusion
Job-centric monitoring approach
 – Users and their jobs
 – User-specific data (annotations)
 – Infrastructure-specific information
Logging and bookkeeping: production
 – Information gathered and provided while the job is within the Grid
 – Generic interfaces (including a web service interface)
 – Security from scratch (VOMS authorization)
Job Provenance: prototype
 – Permanent storage of job-related information
 – Data mining over complex job sets

