Instrumentation of the SAM-Grid Gabriele Garzoglio CSC 426 Research Proposal.


1 Instrumentation of the SAM-Grid Gabriele Garzoglio CSC 426 Research Proposal

2 Overview
-> Characteristics of the High Energy Physics Community
- The SAM-Grid: enabling fully distributed analysis job processing
- The Proposed Instrumentation

3 Characteristics of the work in High Energy Physics
- High Energy Physics studies the fundamental interactions of Nature.
- Only a few laboratories around the world exist, each providing unique facilities (accelerators) to study particular aspects of the field: the collaborations are geographically distributed.
- Experiments become more challenging and expensive every decade: the collaborations are large groups of people.
- The phenomena studied are statistical in nature and the events of interest are very rare: a large amount of data is needed to gather sufficient statistics.

4 The Fermi National Accelerator Laboratory

5 The Nature of the Data

6 An example: the D0 Experiment
- Detector Data
  – 1,000,000 channels
  – Event size 250 KB
  – Event rate ~50 Hz
  – On-line data rate 12 MBps
  – Est. 2 year totals (incl. processing and analysis): 1 x 10^9 events, ~0.5 PB
- Monte Carlo Data (simulations)
  – 5 remote processing centers
  – Estimate ~300 TB in 2 years
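The detector figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below is only a rough consistency check, assuming decimal units (1 MB = 1000 KB, 1 TB = 10^9 KB) and a 365-day year; the function names are illustrative, and the "live fraction" is a derived quantity that the slide does not state.

```python
# Figures quoted on the slide.
EVENT_SIZE_KB = 250
EVENT_RATE_HZ = 50
TOTAL_EVENTS = 1e9
YEARS = 2


def online_rate_mb_per_s(event_size_kb, event_rate_hz):
    """On-line data rate in MB/s (assumes 1 MB = 1000 KB)."""
    return event_size_kb * event_rate_hz / 1000.0


def raw_volume_tb(n_events, event_size_kb):
    """Raw detector data volume in TB (assumes 1 TB = 1e9 KB)."""
    return n_events * event_size_kb / 1e9


def implied_live_fraction(n_events, event_rate_hz, years):
    """Fraction of wall-clock time spent recording, implied by the event total."""
    seconds = years * 365 * 24 * 3600
    return n_events / (event_rate_hz * seconds)


rate = online_rate_mb_per_s(EVENT_SIZE_KB, EVENT_RATE_HZ)
raw = raw_volume_tb(TOTAL_EVENTS, EVENT_SIZE_KB)
live = implied_live_fraction(TOTAL_EVENTS, EVENT_RATE_HZ, YEARS)

print(f"on-line rate: {rate} MB/s")
print(f"raw volume:   {raw} TB")
print(f"implied live fraction: {live:.2f}")
```

Under these assumptions the on-line rate comes out at 12.5 MB/s, in line with the quoted 12 MBps, and the raw detector volume is about 250 TB, so the quoted ~0.5 PB total leaves roughly as much again for processing and analysis output.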

7 The D0 Collaboration: ~500 physicists, 72 institutions, 18 countries

8

9 How can all of them work together? Using middleware for large distributed systems: the Grid

10 Overview
- Characteristics of the High Energy Physics Community
-> The SAM-Grid: enabling fully distributed analysis job processing
- The Proposed Instrumentation

11 The SAM-Grid Project
- Mission: enable fully distributed computing for DZero and CDF
- Strategy: enhance the distributed data handling system of the experiments (SAM), incorporating standard Grid tools and protocols, and developing new solutions for Grid computing (JIM)
- Funds: the Particle Physics Data Grid (US) and GridPP (UK)
- People: computer scientists and physicists from Fermilab and the collaborating universities
- History: SAM since 1997, JIM since the end of 2001
- Schedule: CDF and DZero are running now! A prototype is running and is scheduled for production in Spring 03; long-term deliverables in 2 yrs.

12

13 The Logistics

14 [Architecture diagram. Labels: JOB; User Interface; Submission Client; Job Management; Queuing System; Match Making Service; Resource Selector; Information Collector; Execution Site #1 ... Execution Site #n; Computing Element; Storage Element; Grid Sensors; Data Handling System]

15 Overview
- Characteristics of the High Energy Physics Community
- The SAM-Grid: enabling fully distributed analysis job processing
-> The Proposed Instrumentation

16 Why is this useful?
The SAM-Grid is a complex system: the instrumentation is of critical importance to
- Troubleshoot the system
  – Production systems are maintained 24x7
  – Ease user support
  – Find anomalies/bugs
- Gather statistics
  – User data access patterns
  – Resource utilization
  – Global parameter optimization

17 Why is this challenging?
- The SAM-Grid is composed of hundreds of servers, widely distributed geographically: what is a suitable architecture?
- Servers have very diverse functionalities: is it possible to enable some form of uniform data access?

18 Current instrumentation...
- The SAM system uses a global log service: every SAM server records free-format events/messages.
- JIM V1 is under intense development: the current instrumentation is insufficient.

19 ...and its limitations
- The current log server is centralized: for the SAM system alone it records about 1 GB every few days. This does not scale.
- Message transport is UDP-based: this scales with the number of reporting servers, but message delivery is not guaranteed, so the integrity of the collected data is not assured.
- The messages are not structured: data mining / presentation is non-trivial.

20 The direction 1
The Coda distributed file system is a good example of a successful distributed architecture for instrumentation.
[Diagram labels: Client, Server, Data Collector, Data Log, Reaper, Database, Off-Line Analyses]
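One way to read the diagram labels on this slide is as a pipeline: clients and servers report to a data collector, the collector appends to a data log, a reaper periodically moves the log into a database, and analyses run off-line against the database. The sketch below illustrates that reading only. The class names, the in-memory list standing in for the log file, and the SQLite table layout are all assumptions, not the Coda or SAM-Grid implementation.

```python
import json
import sqlite3


class DataCollector:
    """Receives reports from clients/servers and appends them to a data log."""

    def __init__(self):
        self.data_log = []  # stand-in for an append-only log file

    def report(self, source, kind, payload):
        record = {"source": source, "kind": kind, "payload": payload}
        self.data_log.append(json.dumps(record))


class Reaper:
    """Drains the data log into a database so analyses can run off-line."""

    def __init__(self, db):
        self.db = db
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS events (source TEXT, kind TEXT, payload TEXT)"
        )

    def reap(self, collector):
        records = [json.loads(line) for line in collector.data_log]
        self.db.executemany(
            "INSERT INTO events VALUES (?, ?, ?)",
            [(r["source"], r["kind"], r["payload"]) for r in records],
        )
        self.db.commit()
        collector.data_log.clear()  # the log is emptied once safely stored
        return len(records)


def events_per_source(db):
    """Example off-line analysis: number of recorded events per reporter."""
    rows = db.execute(
        "SELECT source, COUNT(*) FROM events GROUP BY source ORDER BY source"
    )
    return dict(rows.fetchall())


db = sqlite3.connect(":memory:")
collector = DataCollector()
collector.report("station-a", "server", "file delivered")
collector.report("station-a", "server", "cache miss")
collector.report("client-1", "client", "job submitted")

reaper = Reaper(db)
moved = reaper.reap(collector)
counts = events_per_source(db)
print(moved, counts)
```

Separating the collector from the database in this way means a slow or unavailable database does not block the reporting servers; only the reaper has to wait.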

21 The direction 2
The structure of the message should include:
- the name of the client/server
- the types of the client/server: various groupings may be meaningful, e.g. logistical, functional, logical, etc.
- the location of the client/server
- a global time stamp
- an id code, related to the severity of the message
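The message fields listed on this slide could be captured in a small record type. The following is a minimal sketch, not the proposed design: the names `LogMessage` and `Severity`, the numeric severity codes, the free-text `text` field, the JSON encoding, and the use of seconds since the epoch as the global time stamp are all assumptions made for illustration.

```python
import json
import time
from dataclasses import dataclass, asdict, field
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical id codes; a higher value means a more severe message."""
    DEBUG = 10
    INFO = 20
    WARNING = 30
    ERROR = 40


@dataclass
class LogMessage:
    name: str            # name of the reporting client/server
    types: dict          # grouping -> value, e.g. {"functional": "station"}
    location: str        # location of the client/server
    severity: Severity   # id code related to the severity of the message
    text: str            # message body (assumed field, not listed on the slide)
    timestamp: float = field(default_factory=time.time)  # seconds since the epoch

    def to_json(self):
        record = asdict(self)
        record["severity"] = int(self.severity)
        return json.dumps(record, sort_keys=True)

    @classmethod
    def from_json(cls, raw):
        record = json.loads(raw)
        record["severity"] = Severity(record["severity"])
        return cls(**record)


msg = LogMessage(
    name="sam-station-1",
    types={"functional": "station", "logistical": "fnal"},
    location="Fermilab",
    severity=Severity.WARNING,
    text="cache nearly full",
    timestamp=1046476800.0,
)
restored = LogMessage.from_json(msg.to_json())
print(restored)
```

Because every field has a fixed name and type, a collector can filter or group messages (for example, `restored.severity >= Severity.WARNING`) without parsing free-format text.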

22 Rough time estimate
- 1 FTE month to design the architecture and the message structure
- 1 FTE month to implement basic messaging
- 1 FTE month to study initial results
- 1 FTE month to feed changes back into the message structure and implementation

