LCG/EGEE Security Operations HEPiX, Fall 2004 BNL, 22 October 2004 David Kelsey CCLRC/RAL, UK


1 LCG/EGEE Security Operations
HEPiX, Fall 2004
BNL, 22 October 2004
David Kelsey
CCLRC/RAL, UK
d.p.kelsey@rl.ac.uk

2 22-Oct-04 David Kelsey, LCG/EGEE Security Ops, HEPiX
Outline
Security Operations today
(new) Site Registration procedures
EGEE Operational Security Coordination Team
– Show (some) slides by Ian Neilson (CERN)
  Talk given at EGEE ROC Managers mtg (5 Oct 04)
N.B. Will not cover Authentication, Authorization, User Registration, VO management, Firewalls, etc.

3 LCG Policy
Policy documents (picture from Ian Neilson):
Security & Availability Policy
Usage Rules
Certification Authorities
Audit Requirements
GOC Guides
Incident Response
User Registration & VO Management
Application Development & Network Admin Guide
http://cern.ch/proj-lcg-security/documents.html

4 Security Operations today
Not part of LCG GOC activities
– But the RAL GOC did write three Guides
  Part of the Security Policy
Now starting integration with EGEE ROCs
– See later
Ian Neilson (CERN) is the LCG Security Officer
Maria Dimou (CERN) is the LCG Registrar
Both also playing key roles in EGEE
When sites join LCG/EGEE
– Security Contact details must be provided
  Name, mail and phone numbers (out of band contact)
  Local CSIRT mail list (for emergency use)
Mail list of these contacts used for discussion
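
The contact details required when a site joins can be pictured as a small record with a completeness check. This is a minimal sketch, not an LCG/EGEE schema: the `SecurityContact` class, its field names, and the `missing_fields` helper are all hypothetical, and the `@` test is only a crude stand-in for real address validation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SecurityContact:
    """Hypothetical record of the details the slide asks each site to provide."""
    site: str
    name: str
    email: str
    phones: List[str] = field(default_factory=list)  # out-of-band contact
    csirt_list: str = ""                             # local CSIRT mail list


def missing_fields(contact: SecurityContact) -> List[str]:
    """Return the names of required details that are absent or malformed."""
    missing = []
    if not contact.name.strip():
        missing.append("name")
    if "@" not in contact.email:
        missing.append("email")
    if not contact.phones:
        missing.append("phones")
    if "@" not in contact.csirt_list:
        missing.append("csirt_list")
    return missing
```

A registration step could refuse to add a site to the contacts mail list until `missing_fields` returns an empty list.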

5 Security Operations today (2)
Patching responsibilities
– Operating System patches
  Responsibility of the Site managers
– Grid middleware patches
  Pushed out by CERN Deployment Team
Help on security policy and procedures
– mailto:project-lcg-security-support@cern.ch
All sites required to read, digest and follow the policy documents
– There have been very few questions!

6 Security Operations today (3)
LCG Audit Requirements
See https://edms.cern.ch/document/428037/
– Every site must keep (for at least 90 days)
  Jobmanager and/or gatekeeper logfiles
  Data transfer logs
  Batch system and process activity records
– Need to be preserved over system re-installs
– Logs also needed for accounting
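
The 90-day minimum can be expressed as a simple guard in a log clean-up script. This is a sketch under stated assumptions: the function names and the example file names are hypothetical, and a real site would also need to keep the logs somewhere that survives system re-installs, which this check does not address.

```python
from datetime import datetime, timedelta

# Minimum retention period quoted on the slide.
RETENTION = timedelta(days=90)


def may_delete(written: datetime, now: datetime) -> bool:
    """A log may be purged only once it is strictly older than 90 days."""
    return now - written > RETENTION


def split_logs(logs, now):
    """Partition (name, written) pairs into logs to keep and logs past retention."""
    keep, expired = [], []
    for name, written in logs:
        (expired if may_delete(written, now) else keep).append(name)
    return keep, expired


if __name__ == "__main__":
    now = datetime(2004, 10, 22)
    logs = [
        ("gatekeeper.log.1", datetime(2004, 10, 1)),  # hypothetical file names
        ("transfer.log.9", datetime(2004, 7, 1)),
    ]
    print(split_logs(logs, now))
```

A log that is exactly 90 days old is still kept, so the check errs on the side of retention.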

7 Security Operations today (4)
Agreement on Incident Response
See https://edms.cern.ch/document/428035/
What is an incident?
– Investigation -> break in service
– Misuse of remote Grid resources
– Long-lived (>3 days) credentials stolen
Sites must
– Take local action to prevent disruption
– Report to local security officers
– Report to others via Grid CSIRT mail list
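
The "long-lived (>3 days)" criterion amounts to a test on a credential's validity period. The sketch below assumes the lifetime is measured from the start to the end of validity; the function name is hypothetical, and the Agreement on Incident Response itself remains the authority on what must be reported.

```python
from datetime import datetime, timedelta

# Threshold quoted on the slide: credentials valid for more than 3 days.
LONG_LIVED = timedelta(days=3)


def is_long_lived(not_before: datetime, not_after: datetime) -> bool:
    """True when the credential's validity period exceeds three days."""
    return not_after - not_before > LONG_LIVED


if __name__ == "__main__":
    issued = datetime(2004, 10, 22, 9, 0)
    # A 12-hour credential falls below the threshold; a one-year one does not.
    print(is_long_lived(issued, issued + timedelta(hours=12)))
    print(is_long_lived(issued, issued + timedelta(days=365)))
```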

8 Site Registration procedures
Joint Security Policy Group
Working on more formal procedures
– When Sites join LCG/EGEE
– Need to collect all Contact details
  System Managers and Security Contacts
– Sites must confirm (sign?) acceptance of policy and procedures
EGEE sites need to be approved by local ROC
– LCG sites approved by Deployment team or GDB

9 LCG/EGEE Security Coordination
Ian Neilson
Grid Deployment Group, CERN
(OSCT: Operational Security Coordination Team)

10 EGEE ARM-2 – 5 Oct 2004
Security Coordination Objectives
Ownership of …
Security incidents
  From notification to resolution
  Liaise with national/institute CERTs
Middleware security problems
  Liaise with development & deployment groups
Co-ordination of security monitoring
Post-mortem analysis
Access to team of experts
Security Service Challenges - LCG

11 OSG - Security Incident Handling and Response Guide (draft)
To guide the development and maintenance of a common capability for handling and response to cyber security incidents on Grids.
The capability will be established through (1) common policies and processes, (2) common organizational structures, (3) cross-organizational relationships, (4) common communications methods, and (5) a modicum of centrally-provided services and processes.
DPK comment: LCG/EGEE intends to base new procedures on the OSG document

12 The Joint Security Policy Group
Security & Availability Policy
Usage Rules
Certification Authorities
Audit Requirements
GOC Guides
Incident Response
User Registration
Application Development & Network Admin Guide
http://cern.ch/proj-lcg-security/documents.html
(1) Common policies and processes

13 Security Coordination - Groups
Parties from OSG IR
Security Operations Centre(s) (=?GOCs/CICs)
  Organize, coordinate, track, report
Security contacts
  Defined for every grid participant: users and resources
Incident Response & Technical Experts
  Managed list of available expertise
Ad hoc Incident Response teams
  Formed on demand
Security Operations Advisory group
  Advise development and practice of SOC (=JSPG+?)
X-SOC coordination
  SOCs participation/communication across grid boundaries
(2) common organizational structures

14 Security Coordination - Channels
[Diagram of channels between: OSCT, ROC, RC, CIC/GOC, CSIRT, “External” GRID, Media/Press “PR”]
(3) cross-organizational relationships
EGEE operational channels still being established.
Responsibilities and processes being defined.

15 Security Coordination – Comms.
Incident Reporting List
INCIDENT-SEC-L@xxx.yyy
Security Contacts Discussion List
INCIDENT-DISCUSS@xxx.yyy
External contact
Reporting
Other grids
MUST be Encrypted
How is this achieved and managed?
Tracking system
MUST be secure
Press and Public Relations
(4) common communications methods

16 Operational Security - Services
List Management
  Alert/Discuss – ref: previous slide
Multiple ad-hoc IR Teams
Experts
Ticket Tracking System
  Where do problems enter? – local contact
  Can this be part of support lists?
  Must be secure
Public Relations
  Guidelines, practice statements
Policy interface to JSPG
Evidence gathering/preservation – use local law enforcement
OSCT must (help) define process behind all these services
(5) a modicum of centrally-provided services and processes

17 Security Coordination - Issues
“Security Operations Centre”: what is it for EGEE/LCG?
  Don’t think we can have “Central” control
  So formulate activity as “coordination team”
Security contacts lists need management
  Dead boxes, moderated boxes, etc.
  Do we have appropriate contact: site security or local admin?
Need to coordinate through Regional Operations Centres (ROC)
Need to utilise services from Core Infrastructure Centres (CIC)
Wherever possible - don’t duplicate channels
What is the relationship with LCG GOCs and EGEE CICs?
– Are they the same?
Are we communicating with local site security team or grid ‘admin’ responsibles?

18 Operational Security – where to start?
“Start small and keep it simple.”
Define basic structures
  Where/how lists hosted
  Where/how problems tracked
  Who/where/how ‘experts’ organised
JSPG review and update policy documents
ROCs to take over management of contacts lists
  Must integrate with site registration process
Establish what level of support is behind site security entries
  Relationships with local/national CERT
  Validate/test entries
Exercise channels and raise awareness by Security Challenges – next slide.

19 2004 Security Service Challenges
Objectives
a) Evaluate the effectiveness of current procedures by simulating a small and well defined set of security incidents.
b) Use the experiences of a) in an iterative fashion (during the challenges) to update procedures.
c) Formalise the understanding gained in a) & b) in updated incident response procedures.
d) Provide feedback to middleware development and testing activities to inform the process of building security test components.
Exercise response procedures in controlled manner
Non-intrusive:
Compute resource usage trace to owner
– Run a job to send an email
Storage resource trace to owner
– Run a job to store a file
Disruptive:
Disrupt a service and map the effects on the service and grid

20 Summary
There is much work ahead of us!
We need to work together to define and maintain better operational policies and procedures
– Wherever possible should work towards common (or at least interoperable) procedures between Grid projects
Our applications are global
– Must add to existing CSIRT procedures
JSPG and OSCT will be looking for input from site managers and security contacts
– Please help!

