Enterprise Encryption 101

Enterprise Encryption 101
Good morning. I’m Phil Smith from Voltage Security. We’re a software vendor, and we do encryption—both secure and data encryption for enterprise customers. But of course this isn’t a vendor pitch, so that’s the last I’ll say about that. I will warn you that I’m going to briefly mention a couple data protection technologies that are Voltage-specific, but in the context of other technologies. I’m on the data encryption side of the company, and so that’s where my focus is. And I’m a technologist but not a cryptographer. This is perhaps a good thing, because it means I can talk about real-world uses, rather than cryptography arcana. At least, I’ll try to do that! Phil Smith III Voltage Security, Inc.

(C) 2013 Voltage Security, Inc. All Rights Reserved
Agenda Why we’re here Encryption basics: terminology and types What is “enterprise encryption”? Why encryption is difficult and scary The five Ws of encryption Encryption key management: the “other” gotcha A realistic approach to enterprise encryption Example: Voltage SecureData After I joined Voltage and started learning more about the encryption market and technologies, I realized pretty quickly that a big issue is that folks just don’t understand the subject. This includes customer management, customer techies, and even QSAs (Qualified Security Assessors—security auditors). That was the genesis of this pitch, which is a crash course in enterprise encryption. It’s not bits-and-bytes and how encryption algorithms work: rather, it’s intended to provide an overview of what encryption offers, why encryption is gaining importance in the enterprise, and how to successfully implement it. Throughout, I’ve tried to be objective and skeptical when quoting statistics. The goal isn’t to scare you, but rather to at least get you to take the issues seriously. (C) 2013 Voltage Security, Inc. All Rights Reserved

Enterprise Encryption In Sixty Minutes
So in the next hour, we’ll cover the high points of enterprise encryption.

Why We’re Here Encryption is on many folks’ minds these days CxOs, CISOs are saying “Gotta encrypt stuff now!” Breaches are in the news Heartland, Epsilon, et al…even RSA! Many sites have implemented several point solutions Different platforms, different problems…not interoperable! DLP (data leakage prevention) is not foolproof If it’s leaked but encrypted, you care a whole lot less! The h4xx0rs are out there… …and they’re getting smarter and more creative Internal breaches are increasing Gartner et al. agree: 70%++ breaches are internal If you’re lucky, your management is aware of the need for encryption and is telling you to start working on it. I know, “Lucky???”—you have enough to do already—but the alternative is that you get to do it on a crash basis if and when a data breach occurs, or an auditor tells you that you need encryption now. I’ll bet you’ve gotten a phone call or letter from one of your credit card companies saying “We’re invalidating your card and sending a new one”. Those calls likely happened because of a data breach: your card was among those that may have been compromised. It seems like there’s a data breach in the news every week or so—and those are the ones we hear about. Some number must be occurring that are not reported or simply not detected. You probably have some encryption in place somewhere in the company already, but unless you have developed a company-wide data protection strategy, there are likely huge gaps. And even if you have a well-thought-out, company-wide data protection strategy, you cannot claim to foresee every eventuality. An example: a friend of mine was once planning a DR test, and wanted it to be perfect. He had two sets of backup tapes, two sets of documentation, two sets of everything, all shrink-wrapped and palletized. He had two different trucking companies come and pick up the pallets to deliver them to the DR site. Unbeknownst to him, the two drivers knew and disliked each other: they wound up in some kind of a race on the New Jersey Turnpike and wrecked both trucks. So of course his new DR plan includes different pickup times … (C) 2013 Voltage Security, Inc. All Rights Reserved

Encryption Basics But before we talk about implementing encryption, we need to understand the landscape.

Encryption Basics Encryption means using an algorithm (cipher) plus a secret value (key) to transform data (plaintext) into another format (ciphertext) so it is no longer readable without decryption In other words: Make important data useless to anyone who isn’t authorized to read it! Note: Encryption tends to talk in terms of “messages” Stored data may not go anywhere, but same principles apply This is the classical definition of encryption: an algorithm that uses a key to make data unreadable, hopefully reversibly. But here’s the short version: make data that should be kept private useless to anyone who isn’t supposed to be able to read it. This “message” terminology is important only in terms of thinking of the data originator and the data consumer. Whether you’re sending a file across the network or just storing it and reading it later, you have two “sides”. THE MISSILE LAUNCH CODE IS XYZZY123plover MV*U24AT2HaIKUewzqWPzvLXaT9UGM!\zj(`iwPO…  (C) 2013 Voltage Security, Inc. All Rights Reserved

Encryption Types: Symmetric
Symmetric encryption means same key is used to encrypt and decrypt Means both parties need access to the same keys Many varieties (algorithms): DES, TDES, AES, Twofish, RC4, CAST5, IDEA, Blowfish… Can be strong and also fairly high-performance “Strength” determined by key length in bits as well as algorithmic integrity Symmetric encryption is the encryption we all grew up with—Little Orphan Annie decoder rings, puzzles in magazines, Enigma in WW II… Ciphers of this sort go back to ancient times, as simple substitution ciphers, where A becomes G and so forth. Allegedly Julius Caesar used such a method, and so it’s also called a “Caesar cipher”. By now, symmetric algorithms are a bit more sophisticated (to put it mildly), and there are many, many variations. Something important to recognize about any encryption is that it isn’t free—it takes processing power to encrypt and decrypt. Yet you want it to be “strong”—hard for someone to “break”, to be able to decrypt unless you want them to. Most modern symmetric algorithms strike a reasonable balance between the two. We’ll talk about key strength in more detail later.

Symmetric Encryption: Stream and Block
Symmetric encryption comes in two flavors: Stream ciphers transform the key as they progress, processing one chunk (bit, byte, whatever) at a time Block ciphers use fixed keys every block (blocksize=keysize) Difference matters little in practice Stream generally faster, but requires more key complexity Many block ciphers have modes that effectively operate like stream ciphers Most data protection products use block ciphers There are two types of symmetric ciphers, block and stream. You can think of a stream cipher as starting with a key, using it for one chunk, then transforming that key with some or all of the data from the previous chunk, using it for the next chunk, etc. A traditional block cipher uses the same key for each block, but most modern block ciphers do some sort of key transformation as well, so they’re basically like stream ciphers. They still operate on bigger blocks of data than stream ciphers, however. Stream ciphers are typically used for realtime requirements like encrypting cellphone traffic. For all practical purposes, you likely don’t need to worry about the difference except to be aware that it exists, so if someone says “We use a block cipher” you don’t have to ask what that means. Stream Cipher (C) 2013 Voltage Security, Inc. All Rights Reserved

Asymmetric aka Public Key Encryption
Asymmetric encryption means what it sounds like: Different keys needed to encrypt and decrypt Each entity has two keys: public and private Invented in 1970s (Diffie-Hellman, RSA, UK government) Makes key distribution much easier: I can publish my public key safely You encrypt using public key, I decrypt using my private key Downside is performance Symmetric algorithms are typically much faster—public key often too expensive for application data protection Requires significant data layout/application changes Asymmetric encryption uses different keys to encrypt and decrypt. These keys are mathematically related, but you can’t derive one from the other. This means that you can publish one of the keys, and anyone can use it to encrypt data, but nobody can decrypt the data except you. The idea of asymmetric encryption was proposed many years ago, and first implemented some time after that. It’s been around for a couple of decades now, but is more compute-intensive than symmetric encryption. Thus until relatively recently, it hasn’t been very practical (unless you’re The Gummint). It’s still a lot more expensive than symmetric. But there are ways to exploit both to get the benefits of both. By the way, “PKI” means “Public Key Infrastructure”, which refers to certificate stuff—something a bit different. So yes, it’s confusing! (C) 2013 Voltage Security, Inc. All Rights Reserved

Asymmetric Encryption Uses
Some use cases are ideal for public key encryption Hassle-free (public) key exchange makes some things easy A key is a key, so either (private/public) usable for encryption or decryption, provided “other” used for opposite function Better yet, encrypt twice: my private, your public You and I can each other our public keys I encrypt with my private, your public You decrypt with your private, my public You now know the data was encrypted by me, I know only you could decrypt it Provided neither of us has exposed our private keys! PGP is an example of public-key encryption that you may have come across. The keys are actually symmetrical in that you can use either in either direction, but you need to use both to get back to the plaintext. That is, you can encrypt with public/decrypt with private OR the other way ’round. This means you can use BOTH your private AND someone else’s public to doubly encrypt data: now only they can decrypt it, and only you could have sent it. (C) 2013 Voltage Security, Inc. All Rights Reserved

Hybrids: Key “Wrapping”
Because asymmetric encryption is expensive, hybrid solutions are attractive: Sender generates random symmetric key Encrypts actual data (“payload”) using that symmetric key Encrypts symmetric key using target’s public key Sends encrypted symmetric key with data To decrypt: Key decrypted using (expensive) asymmetric (private key) Payload decrypted using cheaper symmetric algorithm As I said, asymmetric is expensive. And the cost is linear with the data length. So a hybrid approach is a clever way to get the benefits of both: Encrypt the data with a symmetric key Encrypt that key with asymmetric encryption Send both as a “bundle” The recipient then uses the expensive asymmetric to decrypt the short symmetric key with confidence, and the symmetric key to retrieve the actual longer plaintext message. (C) 2013 Voltage Security, Inc. All Rights Reserved

Cryptographic Hashes and Digests
Related to encryption: cryptographic hashes aka digests Functions that convert variable-length input to fixed-length output Any change to original data changes the hash Used in digital signatures, as checksums, etc. Good hashes (SHA-1/2/3, MD4/5) have these properties: Easy to compute for given data Infeasible to reconstruct data from hash Infeasible to modify data without changing hash Collisions (same hash from different data) very rare A good way to represent data without leakage risk Frequently used for things like verifying downloads So there are a few other things related to encryption that come up in the same breath. Hashes or digests are a way of creating a fairly short binary blob that represents some data. A checksum is a very primitive hash. A hash isn’t reversible—you can’t get to the data from the hash—but the good modern hashes like SHA-2 (and the new SHA-3) and MD5 have the properties listed. Folks have spent a lot of computing power looking for collisions with SHA-2—cases where two different pieces of data result in the same hash—and have not found any. We know they must exist; this just means that they’re very rare. You’ll see hashes listed with things like downloads: they let you know that the otherwise unverifiable data you downloaded is what you expect, by matching the hash listed to the one you produce. (C) 2013 Voltage Security, Inc. All Rights Reserved

Digital Signatures Digital signatures are also related to cryptography Generated from the data using public/private-like key pairs Result is a hash-like blob Signatures prove data authenticity and integrity Authenticity: Data is from who it says it’s from Integrity: Data has not been tampered with (since signing) Implements important concept: non-repudiation Means sender cannot (reasonably) say “I didn’t sign that” Frequently used for things like secure Avoids problems due to forged mail A digital signature is kind of an encryption; you can think of it as the asymmetric version of a hash: you “sign” the data by processing it with a private key and an algorithm to produce a “blob” that can only be decrypted using the corresponding public key. When someone decrypts it with the matching public key, if they get something coherent, they know you sent it. This also means you can’t repudiate it—that is, you can’t claim you didn’t sign it. (C) 2013 Voltage Security, Inc. All Rights Reserved

Message Authentication Codes (MACs)
A MAC (Message Authentication Code) is a keyed hash Created using a hash function plus a secret key Verify both data integrity and authenticity Different from digital signatures: same secret key used by creator/reader Thus more like symmetric encryption, where digital signatures are more like public key encryption Generally faster to generate than digital signatures MAC sent along with data Receiver re-generates MAC against data, confirms match Useful for verifying transactions The last thing I’ll mention is Message Authentication Codes, which are usually just called MACs (yes, Yet Another Meaning for “MAC”). You create a MAC by combining a key (think of a password) with some data to produce a hash value. The recipient knows the same password, and uses it against the data to produce another copy of the hash. If they match, the data is complete and authentic. This is faster than creating a digital signature for the entire data (especially if the data is very long). So it’s like the hash version of symmetric encryption. The “useful for verifying transactions” means that things like Amazon Web Services can use a password (key) you have been given to authenticate a request, without the key being sent over the network in plaintext: a request includes a MAC generated from the request + the account password; the Amazon end recalculates the expected MAC (since it knows the request and the account, and has the password) and makes sure they match. So the MAC becomes sort of a one-time password for the request, which the receiver can validate using that previously shared password.

A Few Words About “Encryption Strength”
Encryption strength refers to the likelihood that an attacker can “break” encrypted data Typically tied to bit length of encryption key Exponential: 128-bit key is 264 times as strong as 64-bit See “Understanding Cryptographic Key Strength” on youtube.com/user/VoltageOne for a good discussion/illustration The encryption community is collaborative Research, algorithms are all published and peer-reviewed Cryptographers look for weaknesses in their own and each others’ work So if you spend more than 30 seconds looking at encryption, you’ll come across this term “encryption strength”. It refers to how hard it is for someone without the original data to decrypt it if they don’t have the key. Usually this means “by trying all the possible values”. So if it’s a one-byte key, then there are 256 possible values for the key. That takes less time on a modern computer than saying it. And of course on average, it will take half the number of possible values to find the right one. If it’s a 128-bit key, then there are 2-to-the-128th possible values for the key. That’s a lot. With a 256-bit key, there are 2-to-the-256th possible values—a BIG lot more. The YouTube reference here is one of our guys with a whiteboard; it’s not as popular as Justin Bieber or the dancing baby, but it’s short and very accessible. The fact that the encryption community is collaborative is very important. You’ve all heard the term “Security through obscurity”, I expect; it would be easy to create an encryption product that uses a Top Secret Algorithm and make wild claims about it. But that isn’t how things work. Instead, the community of crypto geeks all review each others’ stuff, looking for weaknesses—patterns in the results, for example. (C) 2013 Voltage Security, Inc. All Rights Reserved

More About “Encryption Strength”
Cryptographers “cheat” in favor of attacker when analyzing Make assumptions like “attacker has multiple known examples of encrypted data and matching plaintext” Also assume they’ll know plaintext when they find it, and that the encryption algorithm is known “Weaknesses” reported are often largely theoretical— only NSA could really exploit Huge amounts of time, brute-force computing power required It’s also important to understand that when cryptographers do this analysis, they “cheat”—that is, they make assumptions about what the attacker has and knows that are likely more generous than any real-world attack. They also assume that they’ll recognize the plaintext when they find it. That’s an important cheat: if I if I DES encrypt something and then AES encrypt it, and you think I only AES encrypted it, you’ll never find the plaintext. Or if I take a Word document, rename it to .ZIP, and then encrypt it as “ENCRYPTED.ZIP”, you could spend the next six thousand millennia trying to decrypt it and looking for the ZIP file header, and you’re probably not going to find it (and if you do, it won’t mean it’s a readable ZIP). Even if you know the format of what you’re looking for, you need some way to verify that you’ve really found it: if you’re trying to crack an encrypted credit card number, do you declare victory just because the decrypted value is sixteen decimal digits? Even with 56-bit DES, you’re likely to find some “false positives”. And even when weaknesses are found, they’re often amazingly theoretical—things like “10,000 PCs working for ‘only’ six months would have a 50% probability…”, or “A 128-bit key is only 124 bits strong” (which means that it would take, oh, only 2 billion years with a trillion machines doing a billion attempts per second…) (C) 2013 Voltage Security, Inc. All Rights Reserved

More About “Encryption Strength”
This “cheating” ensures encryption strength is real* This approach increases security for all By the time an algorithm is accepted as a standard and implemented in products, confidence is high Even if a weakness is later discovered, it’s likely largely theoretical/impractical for most to exploit Makes it easy to spot the charlatans Companies whose proprietary algorithms are not peer-reviewed Also look for claims like “unbreakable encryption”, or focus on key length rather than standards-based cryptography So clearly this peer review is a good thing for encryption users: it means something that is insecure is likely discovered by the good guys before the bad guys. This has been demonstrated with SSL exploits, for example. And it means you can spot products and companies that are serious, by looking for use of standard algorithms and plausible claims. Companies using proprietary algorithms and really, really long encryption keys are suspect: a long key with an insecure algorithm is, of course, meaningless. * Well, as real as the smartest minds in the business can make it! (C) 2013 Voltage Security, Inc. All Rights Reserved

Encryption Algorithm Examples
DES: Data Encryption Standard Selected as standard by US government in 1976 Block cipher, uses 56-bit keys Considered insecure: as of 1999, “breakable” in < 24 hours TDES: Triple DES What it sounds like: DES applied three times Uses two or three different keys Thus at least 2112-bit key strength (168-bit with three keys) Considered secure, though relatively slow So let’s talk briefly about a few encryption algorithms. DES was a variation of an IBM algorithm called Lucifer. But Lucifer used 64- or 128-bit keys. When DES was announced with 56-bit keys, many were suspicious that The Gummint wanted it to be weak, and/or that it had a “secret back door”. However, eventually weaknesses were found in Lucifer—weaknesses the NSA likely knew about before 1999, when they were found by the community. The suspicions, however, probably contributed to the evolution of the crypto community’s collaborative peer-review approach, so it was ultimately goodness. (C) 2013 Voltage Security, Inc. All Rights Reserved

More Encryption Algorithm Examples
AES: Advanced Encryption Standard Adopted as US standard in 2001 128-, 192-, or 256-bit keys Relatively fast Blowfish, Twofish, Serpent… Similar to AES in strength Mostly a bit slower (with exceptions) Algorithms are public domain (as is AES) Dozens (hundreds!) more exist, of course Given AES’s ubiquity and proven strength, generally no reason to use anything else So DES is considered obsolete, although it’s perfectly viable for most applications. AES is the current US standard, and has been for most of a decade. The fact that AES is the standard means that it has likely been subjected to more research/attacks than any other algorithm. As such, the fact that it has survived is a pretty good indicator that it can be considered secure. Of course there are lots of others… By the way, you may have heard that “quantum computing” will make encryption “trivial” to break; I’ll believe it when I see it… (C) 2013 Voltage Security, Inc. All Rights Reserved

Implementing Encryption
OK, enough about what’s out there. Let’s talk about actually USING this stuff.

What is “Enterprise Encryption”?
A scalable, manageable data protection plan Standards-based, provably secure Applies across multiple data sources (databases etc.) Not just point solutions for specific data sources Cross-platform Everyone has multiple platforms nowadays Includes key management When I talk about Enterprise Encryption, I mean something that is strategic—not a point solution for one database or one application. Enterprise Encryption has all of these attributes: scalability, manageability, provable security, cross-platform, cross-application, cross-datasource; and all of these require good key management, which we’ll also talk about. (C) 2013 Voltage Security, Inc. All Rights Reserved

Encryption Is Difficult
Lots of different technologies Hardware-based, software-based, hardware-assisted DES, TDES, AES, Blowfish, Twofish, CAST, PGP, GPG … ! Companies have lots of data in lots of places Much of it probably of unknown value/use The sheer volume is daunting Difficult to imagine how to get started Easier to stick your head in the sand and hope it goes away First, it’s not just you—this stuff is complicated. There’s the usual alphabet soup, some of which we just talked about: TLAs, ETLAs, One fish two fish redfish blowfish, symmetric algorithms like DES/AES, public/private key stuff…who can keep it all straight? I know I can’t. And you have data all over: in databases, on servers, on tapes, on punch cards…ok, maybe not punch cards. For extra credit, identify the two pretty diagrams (AES and Twofish algorithms) (C) 2013 Voltage Security, Inc. All Rights Reserved

Encryption Is Scary Most of us don’t understand the technologies Math classes were a looong time ago It changes constantly We hear “DES has been broken, use AES” What does that mean? Is DES useless? Is AES next to fall? Lots of snake-oil salesmen in encryption touts “unbreakable encryption” Easy to decide encryption is unapproachably complex Like buying your first house, or doing your own taxes… Yes, if you get it wrong, you will lose data! Another reason prompting avoidance behavior… When one of our math PhDs tells me, “Elliptic curve encryption is based on a point on an elliptic curve representing the product of two large prime numbers”, I nod my head sagely. But I don’t claim to understand it, other than remembering that determining whether a large number is prime is hard. I had followed cryptography very casually for years before joining Voltage, enough to know that there were new developments happening in it, both good and bad. The popular press, of course, doesn’t help much: they love to say things like “DES has been broken!”, without even pretending to understand it. “Broken” in this case means that, with an only slightly insane amount of computing power, a brute-force attack on a DES-encrypted text can decrypt it in under a week. That doesn’t mean DES is useless—how often can someone afford that kind of attack? (and TDES—triple DES—is still quite effective). Neither was it a surprise: cryptographers warned that DES was weak from the start, due to its relatively short keys. But any encryption can be broken, in theory, given enough computing power and time. If you’ve followed encryption at all, you also know that there are lots of charlatans who pop up, claiming to have a new encryption algorithm that’s better than any previous (for some reason, data compression attracts the same kind of fraudster, perhaps because both involve complex math). For a good time, check out meganet.com, which claims “1 Mega Bit Unbreakable Encryption” The “unbreakable” claim is a dead giveaway, for anyone in cryptography knows that while some algorithms are unbreakable for all practical purposes given current technology, none is theoretically unbreakable. They also sell bomb jammers and encrypted flash drives. Not sure I want to trust my bomb-prevention strategy to them... (C) 2013 Voltage Security, Inc. All Rights Reserved

The Five Ws of Encryption
Why encrypt data? What should be encrypted? Where should it be encrypted? When should it be encrypted? Who should be able to encrypt/decrypt? How will you encrypt it? So, let’s look at encryption from the perspective of the “five Ws”: Why, What, Where, When, Who, How? (OK, that’s six, and the last one isn’t a W…though it has a W in it)

Why Encrypt? Every company has data to protect NPPI, PII, or just PI Customer information Internal account information Intellectual property Financial data Every company moves data around Backup tapes Networks Laptops Flash drives Data for test systems First, let me note some terminology: NPPI means Non-Public Personal Information PII means Personally Identifiable Information PI means Personal Information …and there are others, such as SPI (Sensitive Personal Information). I usually refer to “PII” as my term of choice for any of these. No matter what you call it, everyone has some data they want to protect—any of the things listed, plus the CEO’s bonus program. Huge volumes of data move every day, both formally and informally. Increasingly, that’s what our industry does—we don’t make things, we move data. (C) 2013 Voltage Security, Inc. All Rights Reserved

Why Encrypt? Different media have different issues Very few backup tapes get lost…but it does happen Networks get compromised fairly regularly Laptops are lost or stolen every day Flash drives are disposable nowadays Different media types mean different levels of risk Deliberate, targeted network breaches are obvious concern Missing backups probably won’t be read Missing laptops probably won’t be analyzed for PII Found flash drives are probably given to the kids And “stuff happens”. Backup tapes get lost; a Trojan gets into an internal network and phones home. Google Answers claims that over a million laptops are reported lost or stolen every year. I find that hard to believe (it’s about one every 30 seconds!), but the real number is surely large. And flash drives are so ubiquitous that many companies forbid their use—but short of strip-searching every employee, that’s pretty hard to enforce (although disabling USB ports works pretty well). Remember in the movie The Recruit, where Bridget Moynahan smuggled data out of CIA using a thumb drive hidden in a coffee mug? I’ve heard of crackers dropping infected flash drives in parking lots to try to get rootkits exposure to internal networks; not sure I believe it, but it’s at least possible. It’s important to recognize that the problems are different for different media: Hackers trying to break into your network are obviously an issue, but if a backup tape goes missing, should you worry? What about if a sales rep loses a laptop? What about a lost flash drive? Each of those becomes less likely to matter—but, again, you can’t afford to be complacent. (C) 2013 Voltage Security, Inc. All Rights Reserved

Why Encrypt? Breaches happen! Identity Theft Resource Center says:
Annual totals: 2009: : : : 447 Not improving…and what about undetected/small ones? Can you afford to bet your job/business? Data encryption is not a luxury Claimed cost per compromised card is $154–$215!!! * Heartland breach: 130M cards; TJX: 94M cards Do the math… * Source: Ponemon Institute $154 = negligent inside $215 = malicious/criminal act I already mentioned the real big breaches that made the news—RSA, TJX, WorldPay—but what about the tiny ones? If it’s your information that gets leaked, or your company’s, you don’t really care that it was “small”. And the numbers are growing. Here’s another of those statistics that seems hard to believe: if it really costs about two hundred bucks per compromised card, then the 2009 Heartland breach of 180M records cost $20 billion—that’s the GNP of Costa Rica. I suspect that the $200 is more like the average cost if a compromised card gets used fraudulently, which is obviously a lot fewer than the 130M cards in the Heartland breach. But it’s still significant. NOTE: While the total number of reported breaches isn’t changing much, the number of malicious attacks keeps rising: we’re getting better about accidental breaches, but the bad guys are getting better too!

Why Encrypt? Data breach sources: 73%: external 18%: insiders 39%: business partners 30%: multiple parties But insider breaches far more expensive: External attack costs averages $57,000 Insider attacks average $2,700,000! Source: Verizon Business Data Breach Investigations Report The 2010 Verizon Business Data Breach report said, in part, “In a finding that may be surprising to some, most data breaches investigated were caused by external sources. Breaches attributed to insiders, though fewer in number, were much larger than those caused by outsiders when they did occur. As a reminder of risks inherent to the extended enterprise, business partners were behind well over a third of breaches, a number that rose five-fold over the time period of the study.” It also said IT spending on security went from 5% of budget in 2008 to 13% in So awareness is growing. Cost numbers are from Ernst & Young. (C) 2013 Voltage Security, Inc. All Rights Reserved

Why Encrypt? Commonalities: Causes:
66%: victim unaware data was on system 75%: not discovered by victim 83%: not “highly difficult” 85%: opportunistic 87%: avoidable through “reasonable” controls Causes: 62%: attributed to a “significant error” 59%: from hacking or intrusions 31%: used malicious code 22%: exploited vulnerability 15%: physical attacks Here are some more numbers and some quotes from the Verizon Business Data Breach Investigations Report. “Most breaches resulted from a combination of events rather than a single action. Some form of error often directly or indirectly contributed to a compromise. In terms of deliberate action against information systems, hacking and malcode proved to be the attack method of choice among cybercriminals. Intrusion attempts targeted the application layer more than the operating system and less than a quarter of attacks exploited vulnerabilities. Ninety percent of known vulnerabilities exploited by these attacks had patches available for at least six months prior to the breach.” It goes on to say that “9 of 10 breaches involved some type of “unknown unknown,” the most common of which was data that was not known to be on the compromised system. Most breaches go undetected for quite a while and are discovered by a third party rather than the victim organization. Attacks tend to be of low to moderate difficulty and largely opportunistic in nature rather than targeted. Due, in part, to these reasons, investigators concluded that nearly all breaches would likely have been prevented if basic security controls had been in place at the time of attack.” The pictures, by the way, show an ATM skimmer, used in conjunction with a hidden camera so they get the PIN as well as the card data.

Why Encrypt? The law is catching up with the reality PCI DSS (Payment Card Industry Data Security Standard) Red Flag Identity Theft Rules (FACTA) GLBA (Gramm-Leach-Bliley Act) SB1386 (California) Directive 95/46/EC (EU) HIPAA etc. PCI DSS not only requires data encryption, but also: Restrict cardholder data access by business need-to-know This is called separation of duties It’s really pretty surprising, when you think about it, that we spent all those years not encrypting data. I told the story of the wrecked trucks with the backup tapes; at the time, nobody worried much about the fact that the tapes themselves could have been stolen from the crash site. As a vendor, I remember being surprised 20 years ago when a customer told me they couldn’t send me diagnostic output because it might contain customer information; now that seems obvious. So the result is all this legislation that specifies how PII must be handled and protected. Some of it is good and usable, like PCI DSS; some of it is so poorly worded, like HIPAA, that doctor’s offices don’t know whether to make you whisper your name when you come in for an appointment or to post it on a bulletin board. But it’s all evolving, and the end result is that we must protect data, even from internal access. PCI DSS is one of the most comprehensive standards. It’s focused on credit card info, but it can be applied to any PII. One of its most important requirements is this separation of duties. This means not only that the minimum-wage front line phone rep can’t troll the database collecting PII, but that the DBA can’t do it either. This raises an interesting challenge: how does the DBA do his or her job without full access? The answer is encryption: if an Evil DBA can only access the encrypted data, then (s)he can’t do anything dastardly with it. In the rare cases that access to the actual data is required, someone with appropriate authority can get involved. (C) 2013 Voltage Security, Inc. All Rights Reserved

What To Encrypt? Everything! (Well, maybe not…) Performance, usability, cost are barriers Partners likely use different encryption technology Changing every application that uses the data is prohibitive No single answer Laptops, flash drives: at least PII, probably all data Backup tapes: all data Whole-database encryption possible but not a good answer So, you’ve decided or been told “Thou shalt encrypt PII”. So what does that mean? How do you decide what to encrypt? In theory, the simplest answer would be to encrypt everything. And there are solutions—typically hardware-based, such as the new DS8000s and various tape drives—which encrypt everything automatically. But these really only save you from lost backup tapes; when was the last time anyone left a DS8000 at an airline gate? (Although full-disk encryption does have value: for outsourced data centers, disk encryption can help give an additional level of isolation/security; and when disks are decommissioned, there’s less concern about latent data.) Encryption isn’t free performance-wise: if encrypting the data makes you miss batch windows or slows interactive response unacceptably, it’s not a solution. If you exchange data with partners, they have to be able to decrypt it, too. And if you’re like most shops and have dozens/hundreds/thousands of applications, you’d rather not change every one of them. Again, the answer varies depending on the media. For laptops and flash drives, all data is often simplest, and modern hardware is fast enough that performance is probably acceptable. That’s assuming your users will put up with it, and won’t just write the passwords on Post-Its! For backup tapes, all data is definitely simplest. (C) 2013 Voltage Security, Inc. All Rights Reserved

What To Encrypt? Whole database encryption fails on several counts Can impose unacceptable performance penalty Prevents data compression, using more disk space etc. Violates separation of duties requirements Better to just encrypt the PII (whatever that is)! What about referential integrity and other data relationships? Database 1 and database 2 both use SSN as key If you encrypt them, encrypted SSNs better match! Else must decrypt every access, and indexes useless Whole database encryption really isn’t a good answer. First, it often imposes an unacceptable performance penalty, not just for the cost of decrypting the data you want: it tends to break most query optimizations, resulting in table scans every time. It also means data no longer compresses, which can use tremendous amounts of additional disk space (and I/O bandwidth, and backup tapes, and…) Whole database encryption also fails to meet the Separation of Duties requirements of PCI DSS and other standards, because it means the data is decrypted automatically on access. By the way, deciding what constitutes “the PII” is non-trivial: if you encrypt names and addresses, does encrypting SSNs matter any more? Or vice versa? Maybe not, maybe so. But selective database encryption of PII must preserve cross-database integrity—that is, the same encryption must be performed on all related databases, so references to, say, an SSN as a key still work. (C) 2013 Voltage Security, Inc. All Rights Reserved

Application & Database Encryption Today: Four Approaches
Whole Database Encryption Encrypt all data in DB—slows all applications No granular access control, no separation of duties No security of data within applications Column Encryption Solutions Encrypt data via DB API or stored procedure Major DB type/version dependencies No data masking support and poor separation of duties U2FsdGVkX1+ybFt... Encrypted CC# CC# Traditional Application-level Encryption Encrypt data itself via complex API Requires DB schema/application format changes High implementation cost plus key management complexity There are four basic approaches that folks usually take to encryption in the real world. Encrypting the whole DB is unpalatable for the reasons I just discussed. Encrypting specific columns using a database API, stored procedure, or user-defined function can work, but with most solutions, is complex and typically doesn’t support separation of duties. DB2 V10 adds some controls for this, which is a big step forward. With most encryption, you have to rework the database and the application anyway, since the data changes format. And that also can affect middleware, which may be dependent on data format (for example, I believe DB2 Connect does ASCII-EBCDIC translation, which means that the data must be decrypted before getting to DB2 Connect). Finally, an option that has some advantages is a lookaside database, aka “Tokenization”, because you’re replacing the actual data with a “token” -- something that represents the real data. This means that an application scans the database, replacing each unique PII value with a fake value. The good news here is that this preserves referential integrity and makes it easy to “roll” keys (which I’ll talk about later); the bad news is, it requires a lookup into another database whenever data is accessed. This often requires significant work in the application, and can have a major performance impact. It also doesn’t necessarily fully solve the problem: it just moves the data into another database. This might be easier to protect, but the fundamental issues remain. It also means that you have to have access to this other database at all times – “offline” operation is not possible. We’ll talk about tokenization some more later. Lookaside Database (aka “Tokenization”) CC# indexed, actual CC# in protected DB Requires online lookup for every access Requires major rearchitecting; scope issues 383491 CC Index Account # CC# (C) 2013 Voltage Security, Inc. All Rights Reserved

Where To Encrypt? Different question than “what”: Data at rest and in motion Data at rest “Brown, round, and spinning” (disks of all types) On tape (backup or otherwise) Data in motion Traversing the network Where data is encrypted is also important. The jargon used is data at rest and data in motion. Data at rest means data residing on disk or tape (or CDs, or punch cards); data in motion means data traversing the network. The threats are different, so the encryption requirements are different. I suppose your laptop being carried through an airport is sort of “data in motion” too…though we don’t usually think of it that way! I heard of a company whose VP of Sales managed to lose a laptop by leaving it on the trunk of a car: when he drove away, it of course fell off. An unemployed techie found it, pieced it together enough to fire it up, and then managed to break into it enough to figure out the company it belonged to. He showed up at corporate headquarters saying, “I think this belongs to someone here”; not only was the company appreciative, but they asked him about it, and wound up hiring him—and he’s become one of their most valued employees. So now when they have a slot to fill, they joke about sending the VP out to troll for techies with his laptop… (C) 2013 Voltage Security, Inc. All Rights Reserved

Where To Encrypt? Data in motion particularly troublesome
How do you know if it’s been sniffed as it went by? Data at rest somewhat easier Intrusion detection systems fairly effective (if installed and configured, and if someone actually checks the logs) ESMs very effective on z/OS (if administered correctly) Different issues, thus different criteria! The big problem with data in motion is the threat of “sniffing” it as it goes across the wire. This seems unlikely in most cases, but it does happen—and shops with particularly valuable data do get targeted. Once it gets out into the Big Bad Internet, if it’s not encrypted, there is a small but real chance that it might be intercepted. SSL solves this problem. For data at rest, External Security Managers and Intrusion Detection Systems help let us at least know after a breach occurs. (Wireshark picture)

When To Encrypt? Ideally, data is encrypted as it’s captured By the data entry application, or the card swipe machine In reality, it’s often done far downstream The handheld the flight attendant just used—is it encrypting? Did last night’s restaurant encrypt your credit card number? If the data goes over a wireless network, is it WEP? WPA? “Doing it right” is harder: more touchpoints Easier (if less effective) to say “Just encrypt at the database” Avoids interoperability issues (ASCII/EBCDIC, partners) When to do the encryption is closely related to where, but I mean “at what step in the process”, not “on what medium”. Clearly the way to minimize risk is to encrypt PII as soon as possible, preferably at the point of entry—in the data entry app, or the card-swipe machine, or wherever it first becomes electronic. Of course this is not the case today: the waiter takes your credit card into the back room and processes it. Hopefully he doesn’t also write down the info for his own use. In the EU and some other countries, the card may not leave your presence: they bring you a handheld device, similar to the ones you’ve seen on airplanes. Of course, some of those handhelds transmit the data wirelessly for processing. Is that data encrypted? One can only hope. For obvious practical reasons: it’s a lot easier to say “We’ll encrypt it when it gets to the database, so we only have to deal with the issue in one place.” But that doesn’t make it a good reason. Note that with traditional encryption technologies, data that ends up on the mainframe may also have ASCII/EBCDIC translation issues, which are made much more complex if the data is encrypted before passing over the translation boundary. (C) 2013 Voltage Security, Inc. All Rights Reserved

Who Can Encrypt/Decrypt?
Usual question is: Who can decrypt? Who should have the ability to decrypt PII? Should your staff have full access to all data? Many unreported (or undetected) internal breaches occur What if someone leaves the company? How do you ensure their access is ended? What if an encryption key is compromised? Can you revoke it, so it’s no longer useful? PCI DSS et al. require these kinds of controls This is a big deal—not trivial to implement “Who” is the Separation of Duties issue: Only people with need-to-know should be able to see the unencrypted data. Customer Service reps, for example, often only need “last four digits”: they do not need to see the entire SSN or credit card number! Separation of Duties means that access to unencrypted data is limited to staff who have a legitimate need to see that data. Not to pick on DBAs, but they do not typically need the ability to see my credit card information! They need enough access to be able to do their jobs, of course, but no more. They might need to be able to find my record and see whether there is a credit card number; they don’t usually need the number itself. So you not only need to protect the PII, but also to protect it with some granularity and control. (C) 2013 Voltage Security, Inc. All Rights Reserved

How Will You Encrypt Data?
Hardware? Software? Many options exist for both Is a given solution cross-platform? If not, you must decrypt/re-encrypt when data moves AES? TDES? Symmetric? Public/private key? Many, many choices exist—too many! Finally, we get to how. There are tons of encryption products. There are general software solutions; there are built-in database solutions. There are hardware solutions—encrypting tape drives, encrypting DASD. (IBM 4764 PCI-X next to an nCipher (Thales) HSM) (C) 2013 Voltage Security, Inc. All Rights Reserved

How Will You Encrypt Data?
Different issue: How do you get from here to there? 100M++ data records—how to encrypt without outage? “Customer database down next week while we encrypt”?! What about data format changes? Encrypted data usually larger than original Does not compress well (typically “not at all”) Database schema, application fields expect current format Can you change everything that touches the data? (Should you need to?) Besides the specific technology, you also have to cross the Rubicon and encrypt the existing production data: how do you make that leap? You can’t go out of business for a week while you do it. Changing the data format has huge application impact, and also tends to make the data larger, as well as randomizing the data enough that it doesn’t compress as well as plaintext values. This can be significant with storage subsystems that do automatic compression. This is why storage subsystems that do automatic encryption typically do automatic compression before encrypting, but again, that gets you back to whole-data encryption, which isn’t that useful. (C) 2013 Voltage Security, Inc. All Rights Reserved

Key Management “Encryption is easy, key management is hard” Ultimately, encryption is just some function applied to data To recover the original data, you need key management Three main key management functions: Give encryption keys to applications that must protect data Give decryption keys to users/applications that correctly authenticate according to some policy Allow administrators to specify that policy: who can get what keys, and how they authenticate A mantra at Voltage is that “Encryption is easy, key management is hard”. I’m not sure who said this first, but it’s certainly true. As with many things, the first and most obvious problem isn’t necessarily the hardest. Encrypting the data today is relatively easy; five years from now when you need to get at that data, you’d really, really like to be able to find the key to decrypt it! So good key management has to do these three things: Generate keys for encryption on demand Provide the correct keys for decryption to users or apps that are authorized to get them And of course allow admins to control that authorization. Obviously this problem is worse with outsourced development groups—now you have to either decrypt data before sending it to them, or get them keys somehow. (C) 2013 Voltage Security, Inc. All Rights Reserved

Key Management Key servers generate keys for each new request Key server must back those up—an ongoing nightmare What about keys generated between backups? Maybe punch a card every time a key is generated… Alternative: Derive keys on-the-fly No key database, no ongoing backups/synchronization What about distributed applications? How do you distribute keys among isolated networks? What about keys for partners who receive encrypted data? “Allow open key server access” not a good answer Suggest it, watch network security folks’ heads explode So typically an application says “Please, Mr. Key Server, may I have a key named FRED?” The key server verifies that the requestor is entitled to that key, then looks in its database to see if it already knows that key. If so, great, it serves it. If not, it generates it randomly, makes sure it isn’t a duplicate of an existing key, stores it in the database along with the key name, and serves it. This presents some significant management challenges, because every time you generate an encryption key, you really don’t want to lose it. How do you back it up instantly? If you do backups once an hour, what happens if the key server dies between backups? Are the keys generated since the last backup gone forever? That would be crossing-the-streams Bad. The usual solution is to pre-generate keys, so you have (say) a week’s worth already backed up before they’re used. That assumes your key consumption rate is at least semi-predictable, of course. You also need to keep multiple key servers synchronized, which probably means that every time a key is generated, there’s some sort of two-phase commit-type activity, since if two key servers both create a key named FRED at the same time, chances are those keys won’t be the same—and that way lies madness. A better method is to derive keys dynamically, using standard key generation algorithms plus a base key, instantiated when the key server is created (and backed up then), plus a key name provided by the application. So FRED and the base key are combined to produce a key. And this key can be reproduced at will; it also means the key server can be cloned—for high availability/failover and/or geographic distribution—without any synchronization issues. And if you need a key from ten years ago, you restore the server and the base key material and you can regenerate that key—you don’t have to restore a terabyte database just to be able to fetch one 256-bit key. The key distribution problem is also nasty. This is what PKI is all about: avoiding the hassles of distributing keys. We can’t say that PKI has been a wild success, however. And you can’t just let anyone get at your key server; you often can’t even get a firewall opened for a partner. So how do you do it? (C) 2013 Voltage Security, Inc. All Rights Reserved

Getting There From Here: A Realistic Approach
OK, we’ve spent some time looking at the problems. Now let’s talk about how you can solve them without losing your sanity.

A Realistic Approach: Take A Deep Breath
Investigate encryption, now or soon Better now than after a breach That light at the end of the tunnel is a train! Understand that choices have far-reaching effects Data tends to live on for a very long time Expect to use multiple solutions Backups, laptops, databases all have different requirements “Right” answer differs E.g., for backups, hardware-based solution; for customer database, column-based encryption My goal here isn’t to scare you: lots of companies even more dysfunctional than yours have implemented enterprise encryption successfully—and you can too. First, it behooves you to keep track of encryption options, and to start planning to encrypt at least some data. Remember that old saying: “data never dies—it just flakes off of old backup tapes”. In other words, you’re going to be stuck with your solutions for a long time. And don’t look for a magic bullet—different media require different solutions. (C) 2013 Voltage Security, Inc. All Rights Reserved

A Realistic Approach: High-Level Roadmap
Classify data by degree of sensitivity This is harder than it sounds! Analyze risks: Security costs How secure can you afford to be? Implement solution (remediation) Must be a gradual process Use compensating controls sparingly By definition, they’re suboptimal Goal: persistent encryption everywhere Best achieves regulatory compliance 1. Data Classification 2. Risk Analysis 3. Remediation So a very early step is to do that analysis of your data: what is the PII? How P is it? There’s an old saying from the racing business: “Speed costs; how fast can you afford to go?” and this applies to security, too: “Security costs; how secure can you afford to be?” You cannot guarantee 100% security, but you need to be able to quantify the costs of a breach to some degree. Even if you can’t get anyone to do anything about it, you’ll be better positioned when something bad does happen. PCI DSS and other regulations mostly define remediation as encryption, but they also talk about “compensating controls”. This means having alternative protections in place when a PCI DSS requirement cannot be met. For example, if a group does not have enough resources to provide proper separation of duties, then they must demonstrate that there is some post-operational review by a manager to ensure that access was appropriate. So on the way to encryption, you may employ some compensating controls—it isn’t “all or nothing, day 1”. 4. Persistent Encryption 3a. Compensating Controls (C) 2013 Voltage Security, Inc. All Rights Reserved

A Realistic Approach: Key Steps
Key: Involve stakeholders across the enterprise “No database is an island”: multiple groups use the data Partners, widespread applications need access too… Key: Find a “starter” application Generating test data from production is a good beachhead If you “get it wrong”, you haven’t lost anything “real” Key: Designate data by sensitivity: Red: Regulated (legally required to be protected) Yellow: Intellectual property or other internal (unregulated) Green: Public Each requires a different level of isolation/encryption As with any such wide-ranging project, it’s important that it not be a secret IT effort. It’s going to affect lots of folks, and they need to be on-board. Otherwise they’ll at best be behind, at worst openly resistant or even hostile to the effort. Everyone worries about losing data because a decryption key gets lost somehow. That’s a valid concern, and it’s happened. So you need to find a “safe” place to start; if you have any need for test data, that’s a good one, because it lets you validate the process without risking anything “real”. `One of our early customers came up with this “red/yellow/green” scheme, and it works quite nicely. They’re a multinational telecom, with thousands of applications on most every version of any platform you might imagine, and they did their analysis and came up with a couple of dozen fields that need protection, everything from name and SSN to criminal record. Something similar works for pretty well any enterprise. (C) 2013 Voltage Security, Inc. All Rights Reserved

A Realistic Approach: Proof of Concept
Encrypt a representative database “Database” could be RDBMS, .CSV, flat file... Update application(s) that access it You know what all your applications do, right?  Validate performance, usability, integrity Encryption is not free: may see significant performance hit Demonstrate to other groups Invite discussion, counter-suggestions Once (if!) project approved, request executive mandate Otherwise, some groups may simply not participate Your encryption POC should represent where your data actually lives: encrypting a tape doesn’t prove that you can do real-time encryption in your customer service app, now does it? So pick a database. Update the apps that use that data—this is where those involved stakeholders get to speak up and say “Oh, we use that database for <something you didn’t know about>“. Be prepared for (and measure!) the performance impact. Once you have the POC working, demonstrate it to the other groups who will be affected, and get their input. They may still be hostile—”It ain’t broken” may be their attitude. Once you’ve proven that it works and have approval, you may need a mandate saying “You WILL get on board with this”, or you’ll be waiting forever on those groups. (C) 2013 Voltage Security, Inc. All Rights Reserved

A Realistic Approach: Finishing the Job
Doing all databases/applications takes time Expect glitches Perhaps most difficult: understanding data relationships Table A and Table B seem unrelated, but aren’t Lather, rinse, repeat… Each database will have its own issues/surprises Of course this all makes it sound easy. If you have one database and one app, it’s not that hard. But who fits that description? Make sure folks are aware that there will be some glitches, that there will be complex data interrelationships that nobody understands (or at least, nobody remembers until something breaks). The real grind is going through the rest of the apps and databases. (C) 2013 Voltage Security, Inc. All Rights Reserved

Alternatives to Traditional Encryption
So we know that traditional encryption is pretty secure, but apps take a big hit – lots of data format changes, so lots of application changes. There are a couple of other approaches that are pretty interesting.

Tokenization Tokenization is another approach to data protection Replaces values with randomly generated values Values stored in database, requires database lookups Confusion abounds re tokenization vs. encryption Some QSAs think tokenization is better—”no encryption key” Cryptographers see the database index itself as the key Traditional tokenization has some challenges Ever-growing databases: replication, backup, collisions Enter Voltage Secure Stateless Tokenization technology! Avoids all these—no database: random table instantiated once I mentioned Tokenization back a few slides ago, talking about the four usual approaches to data protection. Tokenization is something that a number of companies are pushing as an alternative to encryption. With tokenization, you replace specific data elements (credit card numbers, SSNs, etc.) with “tokens” – randomly generated values that look like the original value. This is nice because the encrypted data still looks and feels like the original, but of course isn’t. The downside is that rather than solving the problem, it just moves it: the tokenization database is now where the real data lives, AND you must have access to this database every time you need a plaintext. So every data access now involves a second database access, probably across the network. Yes, it may be easier to protect this one database than the thousand other databases that contain those SSNs or credit card numbers, but depending on the environment and application, that second database dip may not be practical. Also, some QSAs seem to think tokenization is somehow preferred from a security standpoint, because there is no encryption key per se. In fact, from a cryptographic perspective, the database index itself is a key – a single key that holds all the golden eggs. The existing standards such as PCI are vague about tokenization vs. encryption; as they evolve, clarity will emerge. Voltage has a technology called Secure Stateless Tokenization, which produces tokens from a table of random values that’s generated at installation time. This avoids all the database issues, and also allows copies of the tokenization server to exist, without any synchronization issues: because they all use copies of the same table, they’re guaranteed to produce the same results. This is patent-pending technology, with available security proofs. (C) 2013 Voltage Security, Inc. All Rights Reserved

Format-Preserving Encryption
Format-Preserving Encryption is another choice Data encrypted with FPE has same format as input Encrypted SSN still 9 digits; name has same number of characters; credit card number has same number of digits… Name SS# Credit Card # Street Address Zip James Potter 1279 Farland Avenue 77901 Ryan Johnson 111 Grant Street 75090 Carrie Young 4513 Cambridge Court 72801 Brent Warner 1984 Middleville Road 91706 Anna Berman 2893 Hamilton Drive 21842 Format-Preserving Encryption is a technology that lets you encrypt data and get back something that’s the same size and uses the same character set: SSNs are 9 digits, credit card numbers are n digits, names and addresses have the same pattern of upper/lowercase characters, numbers, and blanks, etc. You can define patterns too, for things like customer IDs – “5 letters followed by two digits”, if you were doing CA PMFIDs, for example. Here’s an example: the last name, entire SSN, part of CCN, and street addresses have been FPEd. Of course the first name and ZIP could be, too. Name SS# Credit Card # Street Address Zip James Cqvzgk 289 Ykzbpoi Clpppn 77901 Ryan Iounrfo 406 Cmxto Osfalu 75090 Carrie Wntob 1498 Zejojtbbx Pqkag 72801 Brent Gzhqlv 8261 Saicbmeayqw Yotv 91706 Anna Tbluhm 8412 Wbbhalhs Ueyzg 21842 (C) 2013 Voltage Security, Inc. All Rights Reserved

Format-Preserving Encryption
Format-Preserving Encryption benefits: Avoids database schema changes Minimizes application changes In fact, most applications can operate on the encrypted data: Fewer than 10% of applications need actual data Another Voltage technolgoy Invented by Voltage Security, based on work at Stanford Future AES mode, standards process almost completed Google “nist ffx” Peer-reviewed, proven technology—not snake oil! With FPE, you don’t have to change the database schema, the middleware, or most of the applications that touch the data. Our customers tell us that they’ve found that 90%+ of their apps don’t care about the actual PII, since they can use the encrypted values for indexes/keys etc. (To search, you FPE the search term and then find that.) When I first talked to Voltage, this sounded like snake oil, but once it was explained to me, I understood it, at least enough to believe in it. And with it having passed peer review, I believe in it as much as AES itself. As does NIST. It’s based on combining two encryption concepts: variable-length AES keys, and “distilling” or “packing” data. Think of encrypting digits: there are ten values for each digit. So you take the first digit, and add it to your running total (zero). Then you take the second digit, multiply it by ten, and add it to the running total. And so on. For uppercase only, you’d assign ordinal values to each letter: A = 0, B=1, etc. (you’ve done this implicitly with digits). If you were encrypting hex (0-F), you’d use a set of 16 values, and so on. Once you’ve done that with the input data, you have a “distilled” value that you can encrypt. Getting the right length of key is a bit tricky, and consists of using a Feistel network (an encryption construct used in DES and AES) to generate a secure but shorter key for the last (or only) block. You then encrypt the distilled data using that key, and “undistill”. Thus the output uses the same alphabet. (C) 2013 Voltage Security, Inc. All Rights Reserved

FPE: Cross-Platform Capable
ASCII/EBCDIC issues go away Data converted to Unicode before encryption/decryption Results in native host format (ASCII or EBCDIC) Possible because character sets are deterministic (FPE!) Encrypt/decrypt where the data is created/used Avoids plaintext data ever traversing the network With Format-Preserving Encryption, you can process the data internally as UTF-8 (or, I suppose, EBCDIC). This works because the encryption operates on glyphs – representations of characters. That is, when you say “encrypt numeric digits”, you don’t mean F0-F9, you mean “whatever 0-9 look like on this platform”. Same for A-Z, etc. And since the output is in the same character set, it gets returned in the local format, and you can store it in the local format, and then you can decrypt it at either end (after appropriate ASCII-EBCDIC translation, of course—which your application must be doing already on the plaintext). This means that plaintext never has to go across the network, and that no additional special handling is required by applications – they can just do database queries etc. locally and add a single “decrypt this please” line. Encrypt on mainframe, decrypt on distributed Decrypt on mainframe, encrypt on distributed (C) 2013 Voltage Security, Inc. All Rights Reserved

FPE for Data Masking Application testing needs realistic datasets Fake sample datasets typically too small, not varied enough Best bet: Use production data…but: Test systems may not be as secure Testing staff should not have full access to PII! Answer: Use FPE to mask (anonymize) test data With FPE, encrypted production data is already usable for test No extra steps required! Now, another use for encryption is for data masking or obfuscation; I alluded to this before. To test applications, you need data. The best data is your production data (or a subset of it), since you know it’s representative. But random testers really shouldn’t be able to see details of customers that they don’t need to see (and some of the legislation I’ve mentioned in fact prohibits such access, or at least strongly discourages it). But if you can encrypt that data without changing its format, then you can use it for testing. And Format-Preserving Encryption makes that easy. And if you use a random key, the data is essentially impossible for anyone to ever decrypt. You can also use a key that you keep secure (either generated by the key server, or specified explicitly), so that you can decrypt test data when absolutely necessary for debugging. This is what I call the “sealed envelope in the safe” approach. (C) 2013 Voltage Security, Inc. All Rights Reserved

Making Intelligent Choices
I could go on for several more hours here, but you’d all fall asleep (or leave). So I’m going to wrap up with three slides that discuss how to make intelligent choices among the far-too-many options. Approaches and Options

Choose a Hardware Solution?
Many choices: encrypting disk arrays, tape drives… Minimal performance impact Biggest issue: hardware only protects hardware! Layers “above” all unprotected Use case determines value Also: key management Usually proprietary Difficult to integrate Separation of Duties? Security Coverage Application Middleware Security Gap (data in the clear) Network Security Gap (data in the clear) Everyone likes the idea of hardware, because it avoids adding load to existing systems. The bad news it provides less protection. So you need to understand the use case. We like to use the stack diagram you see here to show that a data protection solution at one level only protects at that level and below. I already talked about this in the context of whole-database encryption: sure, it’s transparent, but anything that can access the database has full access. This is even more the case with hardware solutions: that transparency costs you some security. It also costs you some control: the keys used by that hardware need to be managed, and so far things like Key Management Interoperability Protocol, or KMIP, haven’t taken hold. That means that the key technology used by the hardware is going to be specific to the hardware, so you can’t easily integrate it into your overall key management strategy. Database Security Gap (data in the clear) OS/Hardware (C) 2013 Voltage Security, Inc. All Rights Reserved

Or a Software Solution? Encryption software products fall into two categories SaaS/SOA/SOAP (web services) remote server-based Native (encryption/tokenization performed on local machine) Some are very narrow “point” solutions E.g., platform-specific, or file encryption only Do you want to manage dozens of products? (Hint: “No”) Enterprise solution is cross-platform, solves multiple problems Web services is good for low-volume/obscure platforms Native solutions perform better, avoid network vagaries Software products for data protection fall into two categories: Services accessed on a server across the network Code that runs locally—natively on the machine requesting the encryption or tokenization We’ve seen large companies with literally dozens of point solutions, each of which requires its own management strategy. That isn’t what you want: not only is there a lot more management to do, but there’s a lot more decryption and re-encryption as data travels between systems using those various point solutions. Code that runs natively provides much higher performance than server-based processing, even with a fast network, due to network latency and SSL overhead. I have seen cases like the Australian PTT, whose systems are state-of-the-art Tru64 UNIX on DEC Alpha, for which web services was the only practical answer, since nobody wanted to develop or support native code for that! (C) 2013 Voltage Security, Inc. All Rights Reserved

Questions to Ask Security-related questions Are algorithms strong, peer-reviewed? Does solution support hardware security modules/assists? Is key management part of the solution? Operational/deployment questions Is implementation cost reasonable? Is implementation under your control? Is product multi-platform? Finally, once you’ve found some candidate products that claim to meet your requirements, here are some questions you’ll want to get answers to: Is their crypto proven, strong, peer-reviewed? As we’ve discussed, there’s generally no reason to use anything but AES; if it’s an asymmetric use cases, it should usually use “wrapped” AES. Does it support hardware assists, such as on-the-chip AES instructions, or hardware security modules? These improve performance, eliminate side channel risks, and generally make auditors happier, which is always a good thing. What’s the key management strategy? Must keys be stored multiple places, independently secured? If you will have to do key rollover (for PCI DSS compliance, for example), how difficult is that and how long does it take? If your production database is going to take a month to do key rollover, that’s not workable. What about long-term historical key access? If you need a key for data that hasn’t been used in five years, will you be able to find it? This is of course another reason for a single, enterprise solution: trying to manage multiple key archives forever is exponentially more difficult. What does it cost to implement? Some vendors offer cheap software because their main play is professional services—millions of dollars over many months of consulting services. And every time you need to add or upgrade, you get to call them again and get out your wallet. Consider having to train tens or hundreds of different developers in the solution—is that going to be a never-ending drain on the security office? You’re your folks develop crypto expertise to exploit it? It shouldn’t be—they’re applications folks, they should be able to view data protection as a function call that does a data transformation, like uppercasing. And as I’ve already discussed, is it multi-platform? If you know that you need data protection on multiple platforms, this is a very important requirement. And even if you don’t think you do, what happens if/when encryption use spreads? (And it will!) (C) 2013 Voltage Security, Inc. All Rights Reserved

Summary So in conclusion…

Conclusion Encryption is not a luxury, not optional today Many solutions exist Different data/media require different solutions A complex topic, but one that can be mastered! I hope this has given you some idea what’s involved in enterprise encryption. The nastiest thing about encryption is that nobody wants to do it—it doesn’t make life better (except maybe around audit time), and it’s never a fun thing. Done right, in fact, you can’t even tell it’s there. (C) 2013 Voltage Security, Inc. All Rights Reserved

Encryption Resources InfoSecNews.org: /RSS feed of security issues infosecnews.org/mailman/listinfo/isn Voltage security, cryptography, and usability blog Bruce Schneier’s CRYPTO-GRAM monthly newsletter schneier.com/crypto-gram.html RISKS Digest: moderated forum on technology risks catless.ncl.ac.uk/risks US Computer Emergency Response Team advisories us-cert.gov/cas/signup.html Track breaches: privacyrights.org and databreachtoday.com and datalossdb.org and idtheftcenter.org and breachtool.html Here are several useful data sources that you might want to check out. InfoSecNews gets you daily updates on security issues. Our blog isn’t necessarily updated daily, but has some good stuff. Schneier is always interesting, if provocative. I subscribe to probably newsletters, and his is the only one I don’t read and dispose of in one shot. RISKS has been around for decades, and is an occasional mailing, required for all techies. The CERT advisories should be on everyone’s subscription list-news about vulnerabilities. Search for “breaches” on the last four—they keep changing the sub-pages, but they’re easy to find. (C) 2013 Voltage Security, Inc. All Rights Reserved

Questions/Discussion
Phil Smith III

Enterprise Encryption 101

Similar presentations

Presentation on theme: "Enterprise Encryption 101"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Enterprise Encryption 101

Similar presentations

Presentation on theme: "Enterprise Encryption 101"— Presentation transcript:

Similar presentations

About project

Feedback