Risk Assessment for Grid
What is the level of risk?
+Grid is not (yet) widely used, with limited resources
+Resources are typically made available through a gatekeeper
–“Grid is an error amplifier on steroids”
–An opportunistic Grid job is much like a worm: it spreads to all available resources and performs some coordinated task
–Needs and desires are outpacing controls
–Participants in a Grid announce themselves
What do the bad guys want?
–Most attacks are automated these days (limited worry about WarGames-style attackers, but the sk attack is a counterargument)
–Spreading is itself the game
–Spread to pop up an untraceable service
–Spread to set up some mass action
Vulnerabilities: Authentication
The authentication system can be compromised:
–CA compromise
–User private key compromise
–Proxy (authentication token) compromise
–Protocols implemented to fail open
Responsibilities
–CA operators have to follow the CP/CPS
–Resource administrators have to protect proxies
–Users have to protect their private keys
–All parties need to perform due diligence on checks
Vulnerabilities: Resources
Software applications (including the OS) can be attacked.
–DoS attacks
–Resource attacks (e.g. “filez”, crackers, …)
–The Grid parallel to a sniffer attack would be a proxy hijacker
Responsibilities
–Resource administrators have to keep software current
–Application authors have to patch defective software
–Incident response?
Vulnerabilities: Authorization
The authorization (AuthZ) system can be compromised:
–AuthZ authority compromise
–Authorization managers’ authentication compromise
–Policy statements (e.g. ACL) compromise
–AuthZ token (attribute certificate?) compromise
Responsibilities
–AuthZ operators have to follow an acceptable operations policy (CP/CPS or equivalent)
–AuthZ admins have to protect their identities
–Resource admins have to protect policy enforcement
Private Keys
Today’s story focuses on user private keys.
Whoever has access to the private key can assert the public identity to which it corresponds.
FNAL does not accept user-held private key PKI for general access systems, save for a few Grid test systems.
Considering the future:
–FNAL PKI for user certs uses KCA.
–Fair test of incremental load of 3rd-party authentication
The debate over user diligence in protecting private keys was tested by looking at site network file systems.
Storing a private key on any network file system exposes the key to network and file system attacks.
openssl is your friend and the Grid incident investigator’s Swiss army knife.
AFS and User Private Keys
Many users have home areas in AFS.
Many users do not understand how AFS access control lists work.
It is easy for users to leave their private keys world readable in AFS space.
Should one proactively create a .globus directory in every user’s $HOME with the proper permissions?
What about SSH RSA keys, browser credential caches, PGP keys, …
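A scan for exposed directories has to look at AFS ACLs rather than Unix mode bits. The following is a minimal sketch, assuming the ACL text comes from `fs listacl <directory>` and has the usual layout (a "Normal rights:" header followed by one "principal rights" pair per line). The function names are hypothetical and the sample ACL is made up for illustration.

```python
def anyuser_rights(acl_text: str) -> str:
    """Return the rights granted to system:anyuser under 'Normal rights:', or ''."""
    in_normal = False
    for line in acl_text.splitlines():
        stripped = line.strip()
        # Section headers look like "Normal rights:" or "Negative rights:".
        if stripped.endswith("rights:"):
            in_normal = stripped.startswith("Normal")
            continue
        parts = stripped.split()
        if in_normal and len(parts) == 2 and parts[0] == "system:anyuser":
            return parts[1]
    return ""


def is_world_readable(acl_text: str) -> bool:
    """True if any AFS client may read file contents in the directory.

    In AFS, 'l' (lookup) only allows listing the directory;
    'r' (read) allows reading the files in it.
    """
    return "r" in anyuser_rights(acl_text)


# Made-up example of what an exposed ~/.globus ACL might look like.
sample_acl = """Access list for /afs/example.org/user/someone/.globus is
Normal rights:
  system:administrators rlidwka
  system:anyuser rl
  someone rlidwka
"""

if __name__ == "__main__":
    print(is_world_readable(sample_acl))
```

A directory that grants only `l` to system:anyuser still reveals that a key file exists, so a site may want to flag that case separately.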
The Stats
Of 18 directories, 14 were world readable.
11 had valid certificates.
After 40 days:
–8 had still not been revoked.
–3 directories were still readable.
–1 new exposure had occurred.
Distribution of sources:
–5 DOEGrids
–5 DOESciencegrids
–1 Princeton self-signed
The Timeline
September 5, scanned all $HOME areas for readable .globus directories.
–Found 14 of 18 directories were world readable
–11 had valid certificates for the matching keys
September 5, sent mail to all affected users
–Basic statement of the problem
–Telling them how to fix the AFS permissions
–Recommending they get their certificates revoked
Details of a user private key
$ more ~/.globus/userkey.pem
Bag Attributes
    friendlyName: Dane Skow's ID
    localKeyID: 53 E0 A1 4A 57 DE 10 E6 79 DF DD AF DA 7D 4F 94 AD 90 E3 51
Key Attributes:
-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: DES-EDE3-CBC,DBAD807ACC

JIgCtaZcD4f2gyeILoxkzd6nlLbK6JxZD/9ZJVK+nPddsu2j+y972JjAYhPR9b7y
123ejBMW4XhikXewhODlkSZD0lNVU+tcWBSKEmyMnkBXoZmHfpxSTQ6MZvAWkBbH
WZt44Zdsw2ICbpqozy7zwAaCYFWOtwoE5DwvpR51koKVAUcAjZdaLKzpxFs4wLm1
oVcA0ONjM6jRBjtf0qQWcMDUYtn57xZZlXscptORP2VRYBRjMY9xDPewnWUcM6FW
b0iGe+rfs435XyAIcWDx0VL5GI1l1d1GBYyyKsoLruTah2IVJpbssmrroUeqt0T6
jvlBgZgD7uRNSvfhBddXUV1uyE5SjgURI+1t0BGoUs03K7MQjvjsenUIwPM/LjJY
gdx2ctWtPR6YgXE4YCoqi30PwWd5SeyJljM3Mp0H28V7425DiU21VHwvHJPXfu5N
sO3Q8oYC90G8H49UZQGh6aZotJJQboGB3qHpYwwu4bSf1Rj9aLqqdr2NVSSHTmTL
8bfchMIb5gGBrSTku1jq10Itdpg5KOvcH8neRllqN4p/NEdbRqf4e6R99E3P+EdS
UnyMJ5yiduE6SLdb49E6Z/McpRcv7SAyomAn4YkADs4Az3MGqQ+nnHHOFThHNcyC
RcPmd/0JW2OQiEqnz5eCXIsOfx2YXUHuPRKmqO/RWHx2yGu3lGuNpqHeouGrfcQf
3zYrBH5jgxeEd627w6cE2Ty0KVgCwVMJd0ULZDlHQ8pqmGTRDkBPaQGC6liX9vRN
arMZKIGQB9EqhskWsVe4WkQAowEumFlDYPqVP4n3wkgM5Ks0My++jg==
-----END RSA PRIVATE KEY-----
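The `Proc-Type: 4,ENCRYPTED` header in the key file is what tells an investigator that an exposed key is still protected by a passphrase. Here is a minimal sketch of that check, assuming PEM text as input; the function name and the sample keys (dummy bodies, not real key material) are illustrative only.

```python
def pem_key_is_encrypted(pem_text: str) -> bool:
    """Report whether a PEM private key block is passphrase-protected.

    Handles the traditional format (a 'Proc-Type: 4,ENCRYPTED' header after
    the BEGIN line) and the PKCS#8 'ENCRYPTED PRIVATE KEY' block label.
    """
    lines = [ln.strip() for ln in pem_text.splitlines()]
    start = next(
        (i for i, ln in enumerate(lines)
         if ln.startswith("-----BEGIN") and ln.endswith("PRIVATE KEY-----")),
        None,
    )
    if start is None:
        raise ValueError("no private key block found")
    if lines[start] == "-----BEGIN ENCRYPTED PRIVATE KEY-----":
        return True
    for ln in lines[start + 1:]:
        if ln.startswith("-----END"):
            break
        # Base64 body lines never contain ':', so only a header can match.
        if ln.startswith("Proc-Type:"):
            return "ENCRYPTED" in ln
    return False


# Dummy examples: the bodies are placeholders, not real keys.
encrypted_sample = (
    "-----BEGIN RSA PRIVATE KEY-----\n"
    "Proc-Type: 4,ENCRYPTED\n"
    "DEK-Info: DES-EDE3-CBC,0011223344556677\n"
    "\n"
    "QUJDREVGRw==\n"
    "-----END RSA PRIVATE KEY-----\n"
)
plain_sample = (
    "-----BEGIN RSA PRIVATE KEY-----\n"
    "QUJDREVGRw==\n"
    "-----END RSA PRIVATE KEY-----\n"
)

if __name__ == "__main__":
    print(pem_key_is_encrypted(encrypted_sample))
    print(pem_key_is_encrypted(plain_sample))
```

An encrypted key is only as safe as its passphrase, so this check narrows the exposure assessment but does not settle it.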
Details of a certificate
$ more ~/.globus/usercert.pem
Bag Attributes
    friendlyName: Dane Skow's ID
    localKeyID: 53 E0 A1 4A 57 DE 10 E6 79 DF DD AF DA 7D 4F 94 AD 90 E3 51
subject=/O=doesciencegrid.org/OU=People/CN=Dane Skow
issuer= /DC=net/DC=es/OU=Certificate Authorities/OU=DOE Science Grid/CN=pki
-----BEGIN CERTIFICATE-----
MIIDFjCCAf6gAwIBAgICAMQwDQYJKoZIhvcNAQEFBQAwdTETMBEGCgmSJomT8ixk
ARkWA25ldDESMBAGCgmSJomT8ixkARkWAmVzMSAwHgYDVQQLExdDZXJ0aWZpY2F0
ZSBBdXRob3JpdGllczEZMBcGA1UECxMQRE9FIFNjaWVuY2UgR3JpZDENMAsGA1UE
AxMEcGtpMTAeFw0wMjA1MTYxNjE2MjlaFw0wMzA1MTYxNjE2MjlaMEkxGzAZBgNV
BAoTEmRvZXNjaWVuY2VncmlkLm9yZzEPMA0GA1UECxMGUGVvcGxlMRkwFwYDVQQD
ExBEYW5lIFNrb3cgOTk1Mzk5MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQDi
UmnCnjw/0cpL9JSzc6VEwUzVkB72eOuSjW+VQ1NSJ+qIEm2f6q/7TA8xtSXaOm6w
IMhqaN95iS0Klmavd3gaUa4e0rvkKltt+7dmqhVU7p9DI74TnpQfJLV/CICIKcWu
mfO3QBnsLPPUYy4n819LGhgSVjrXWOymQxODp9t31QIDAQABo2AwXjARBglghkgB
hvhCAQEEBAMCBeAwDgYDVR0PAQH/BAQDAgTwMB8GA1UdIwQYMBaAFFQXiMoDwTkm
uFWmxJn0KwKrvgDpMBgGA1UdEQQRMA+BDWRhbmVAZm5hbC5nb3YwDQYJKoZIhvcN
AQEFBQADggEBAGKmQ5aNxFzCHlnfhttEQqBH1lz/3n1Whfd4iyyYzZqlcOb8U7L3
/owvB4O4zOAt5v4echSrQxxEmp8O8T0ZZlN9ybOwUBhy5usTczOpTmdokuCuNUsI
CqIBY6+iaQlaq5yDoILCjt30yslDoT+oinJ+fxOsOHltscNWCzNUju65cnz3Hjqm
dHqGWOLOozmrOvoRc44y0BroJ5+TF79YKl7wbH8JbiwL4EOI85EOzPkQC9ZGoo2a
dZFa5zY/OFk5ekiXvk5mTMnXbTQF8kdTi5pG7QwF9KL7f4Nctux85OgnPoF5VZWg
FTjyrRYuiKHaZEJaczAj4JXIjsAqzH800/U=
-----END CERTIFICATE-----
Human readable version available via “openssl x509 -text -in ~/.globus/usercert.pem”
Timeline (cont’d)
September 6-8, had first round of followup with users and their management to explain the problem and why removal from the gridmapfile is insufficient.
September, discussed with CA admins their policy on certificate revocation. No proof of either association or compromise.
September 25, tested SAZ revocation of site access for compromised certificates.
Oct 15, raised issue with LCG Security Group on appropriate response.
–Agreement that site blocks until a CRL is issued may be prudent. Some concern about only triggering on “real” exposures.
–Agreement to distribute lists of compromised certificates to collaborating sites in the “grid”. Complaint from at least one CSIRT about noise and unknown expectations. Concerns about DoS and spoofed reports.
Timeline (cont’d)
Oct 15,
–Repeated scan
  3 directories remained open
  1 new exposure had occurred
–Tested CRLs
  2 certificates had been revoked
  1 was stuck in process
Certificate Revocation Lists
Where do you find them?
–Supposed to be referenced in the certificate
–List of CAs is a useful reference page
Guess what they look like?
-----BEGIN X509 CRL-----
MIIBmTCBgjANBgkqhkiG9w0BAQQFA…
-----END X509 CRL-----
Need to use tools to compare contents
–Certificates are identified by serial number only
–The case of hex serial numbers is not standardized
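Because the CRL identifies certificates only by serial number and tools disagree on hex case, one way to compare is to convert both sides to integers first. The sketch below assumes the CRL has already been dumped to text with `openssl crl -text -noout -in <file>` (which lists revoked entries as "Serial Number: <hex>") and that the certificate serial comes from `openssl x509 -noout -serial` (printed as "serial=<hex>"). The function names and sample data are hypothetical.

```python
import re


def revoked_serials(crl_text: str) -> set:
    """Collect revoked serial numbers from a CRL text dump as integers."""
    return {
        int(hex_serial, 16)
        for hex_serial in re.findall(r"Serial Number:\s*([0-9A-Fa-f]+)", crl_text)
    }


def serial_to_int(serial: str) -> int:
    """Normalize 'serial=00C4', '00:c4', or 'C4' to the same integer."""
    s = serial.strip()
    if s.lower().startswith("serial="):
        s = s[len("serial="):]
    return int(s.replace(":", ""), 16)


def is_revoked(serial: str, crl_text: str) -> bool:
    """True if the serial appears in the CRL dump, ignoring case and padding."""
    return serial_to_int(serial) in revoked_serials(crl_text)


# Made-up CRL dump fragment for illustration.
sample_crl_text = """Revoked Certificates:
    Serial Number: 00C4
        Revocation Date: Oct 10 12:00:00 2003 GMT
    Serial Number: 1a2b
        Revocation Date: Oct 12 12:00:00 2003 GMT
"""

if __name__ == "__main__":
    print(is_revoked("serial=c4", sample_crl_text))
```

A serial number is only unique per issuer, so the comparison is meaningful only against the CRL published by the CA that issued the certificate.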
Followup Issues
What constitutes a private key compromise?
–To prove it, one has to crack the private key encryption.
–Do we run GridCrack on our filesystems regularly (à la passwd/shadow checks)?
–If anything else qualifies, how does one establish trust between the CA and the reporter?
  Correct assessment of exposure
  Correct association of key to certificate
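Without the passphrase, the public key inside an encrypted private key cannot be compared with the one in the certificate. The two listings shown earlier do carry the same `localKeyID` bag attribute, which gives a weak hint of association. The following is a heuristic sketch only, assuming both files keep their "Bag Attributes" preamble; the function names are hypothetical, and since the attribute is plain text anyone can edit, a match is not proof.

```python
import re
from typing import Optional

# Matches a line such as "    localKeyID: 53 E0 A1 4A ..."
_LOCAL_KEY_ID = re.compile(r"^\s*localKeyID:\s*([0-9A-Fa-f ]+?)\s*$", re.MULTILINE)


def local_key_id(pem_text: str) -> Optional[str]:
    """Return the localKeyID bag attribute as uppercase hex, or None if absent."""
    match = _LOCAL_KEY_ID.search(pem_text)
    if match is None:
        return None
    return "".join(match.group(1).split()).upper()


def share_local_key_id(key_text: str, cert_text: str) -> bool:
    """True if both files carry the same localKeyID (a hint, not a proof)."""
    key_id = local_key_id(key_text)
    return key_id is not None and key_id == local_key_id(cert_text)


# Shortened, made-up preambles for illustration.
key_preamble = "Bag Attributes\n    friendlyName: Example ID\n    localKeyID: 53 E0 A1 4A\nKey Attributes:\n"
cert_preamble = "Bag Attributes\n    friendlyName: Example ID\n    localKeyID: 53 e0 a1 4a\nsubject=/O=example/CN=Example\n"

if __name__ == "__main__":
    print(share_local_key_id(key_preamble, cert_preamble))
```

A CA is unlikely to accept this as grounds for revocation on its own, which is part of the trust question raised above.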
Followup II
What coordination between resource providers, VOs, users, … is necessary?
–Learn of suspected compromised identities
  Trusted communication chain
  Agreement on “compromise”
  Determine appropriate scope of response
–Is disabling everywhere overkill?
–Investigate the problem
  Coordinate forensics investigations
  Present conclusions and summarize confidence
–Remediate the problem
  Issue the “all clear”
  Agree on followup responsibilities
Followup III
Incident Response
–How does the case of compromise of a host/service private key differ from this?
  Are there restrictions on types of access?
  Are there differences in service-to-service transactions?
–How does the case of an application hole exploit differ from this?
  Does the grid contain its own advertisement (à la NIS)?
Followup IV
Authorization is handled by a gridmapfile for each resource.
–Think of a gridmapfile as an /etc/passwd file on a host
Authorization is done by DN (Distinguished Name) only
–How to deal with replacement certificates with the same DN?
Maintenance of the gridmapfile is either manual or disconnected from incident response teams.
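The DN-only problem can be seen in a small sketch. It assumes the common grid-mapfile layout of one quoted DN followed by a local account name per line; the account name, the `blocked_dns` parameter, and the function names are hypothetical and not part of any Globus API.

```python
import shlex


def parse_gridmap(text: str) -> dict:
    """Map each DN to its list of local accounts."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = shlex.split(line)  # handles the quoted DN with spaces
        if len(parts) < 2:
            continue
        mapping[parts[0]] = parts[1].split(",")
    return mapping


def authorize(dn: str, mapping: dict, blocked_dns=frozenset()):
    """Return the local account for a DN, or None if unmapped or blocked.

    Only the DN is consulted: the certificate's serial number and key
    play no part in the decision.
    """
    if dn in blocked_dns:
        return None
    accounts = mapping.get(dn)
    return accounts[0] if accounts else None


# Hypothetical mapfile entry; "dane" is an assumed local account name.
sample_gridmap = '"/O=doesciencegrid.org/OU=People/CN=Dane Skow" dane\n'

if __name__ == "__main__":
    gridmap = parse_gridmap(sample_gridmap)
    dn = "/O=doesciencegrid.org/OU=People/CN=Dane Skow"
    # A compromised certificate and its replacement present the same DN,
    # so both get the same answer here.
    print(authorize(dn, gridmap))
    print(authorize(dn, gridmap, blocked_dns={dn}))
```

Blocking by DN shuts out the replacement certificate along with the compromised one, while leaving the DN mapped admits both until the CRL is checked, which is why DN-only authorization needs a revocation check alongside it.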