24 x 7 support in Amsterdam. Jeff Templon, NIKHEF. GDB, 05 September 2006.


1 24 x 7 support in Amsterdam. Jeff Templon, NIKHEF. GDB, 05 September 2006

2 Jeff Templon – Amsterdam 24x7 support, GDB, BNL, 2006.09.05 - 2 24 x 7 support

3 Main Principle: avoid needing it
- Basic Infrastructure
  - Power: on-site emergency generators
  - Network: SURFnet console staffed 24 x 7
  - Guard informs all relevant people in case of ‘calamities’
  - Real people watching all services (and support email / ticket systems) closely during working hours
- Critical Servers: redundant failover
  - DNS server for farm networks
  - Databases (FTS, LFC, 3D, etc.)
  - Pnfs server for dCache
  - dCache server itself

4 More avoidance
- Computing services: NIKHEF and SARA share computing, hence a complete interruption of service means either a failure of the network into Amsterdam or something beyond our control
- Tape Robot: size the incoming disk cache to hold several days of data, so we can survive a weekend without tape if need be

5 Monitoring
- Create Dashboard (pieces exist)
- Pool of about 10 people who agree to watch things and alert the relevant person in case of problems; check at least once every 12 hours
- Look into a system à la IN2P3 that gives this team restart privileges via a special account and scripts
- Already have SMS service in place for some critical components
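The "check at least once every 12 hours" rule could be enforced by a small dashboard helper. The following is only a minimal sketch under stated assumptions: the function names, the service names, and the idea of storing one "last checked" timestamp per service are hypothetical and do not come from the slides.

```python
from datetime import datetime, timedelta

# Maximum allowed gap between two human checks (from the slide: 12 hours).
MAX_GAP = timedelta(hours=12)


def overdue(last_check: datetime, now: datetime,
            max_gap: timedelta = MAX_GAP) -> bool:
    """Return True if more than max_gap has passed since the last check."""
    return now - last_check > max_gap


def services_needing_attention(last_checks: dict[str, datetime],
                               now: datetime) -> list[str]:
    """List (alphabetically) the services nobody has checked within MAX_GAP.

    last_checks maps a hypothetical service name to the time a member of
    the watcher pool last looked at it.
    """
    return sorted(name for name, ts in last_checks.items()
                  if overdue(ts, now))
```

A dashboard could call `services_needing_attention` periodically and send the result through the existing SMS service; how that hookup would work is not described in the slides.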

6 Plan
- Put this in place in early 2007
- No formal 24 x 7 or on-call system
- See how it goes
  - If we reach targets and don’t miss response deadlines, OK
  - If we miss targets and deadlines, start hard discussions
  - Note that 24 x 7 would depend on other pieces (like the NIKHEF mail server) which themselves don’t have 24 x 7!

7 Open Questions
- What about dynamic redistribution from source in case of problems? Increases site load by 1 / (N(N-1)) naively
- How big is CERN’s data buffer?
- What to do with externally identified problems? GGUS will not get our on-call support number
- Cost choices: what is the cheapest road? We expect that paying staff for 24 x 7 is not the cheapest. Grid is about distribution and redundancy; we should exploit it.
- Are we making the best choices? (push vs. pull?)
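The 1 / (N(N-1)) figure can be reproduced with a short calculation. This sketch assumes one reading of "naively": the source splits its data evenly over N sites, one site drops out, and its share is spread evenly over the remaining N-1. The function name is hypothetical.

```python
from fractions import Fraction


def extra_load_per_site(n_sites: int) -> Fraction:
    """Extra share of the total data flow each surviving site absorbs
    when one of n_sites equally loaded sites drops out (assumed model)."""
    if n_sites < 2:
        raise ValueError("need at least two sites to redistribute")
    before = Fraction(1, n_sites)       # share per site with all sites up
    after = Fraction(1, n_sites - 1)    # share per site with one site down
    return after - before               # equals 1 / (N * (N - 1))
```

Under this assumption, with N = 10 each surviving site takes on an extra 1/90 of the total flow (about 1.1%), which is a relative increase of 1/(N-1), or roughly 11%, over its normal share.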

