SWE 681 / ISA 681 Secure Software Design & Programming: Lecture 5: Calling Out to Other Components (including injection) Dr. David A. Wheeler 2018-06-04.

Abstract view of a program
Process Data (Structured Program Internals) Input Output Call-out to other programs (also consider input & output issues) You are here

The XKCD cartoon on SQL injection
Randall Munroe, “Exploits of a Mom”, XKCD, - Included under the conditions of

Outline General issues Focus on metacharacters & issues of:
SQL (countering “SQL injection”) CSV Command/shell (countering “command injection”) Filenames Insecure deserialization Assume breach, detect/respond, & logs Check what you get back

No Man (or Program) is an Island
Practically no program is truly self-contained Nearly all programs depend on/call out to other programs for resources Operating systems Software Libraries Local services Remote services (DNS, web servers, tile servers, etc.) Sometimes dependency is not obvious Transitive dependency (A depends on B depends on C …) Leads to dependencies on many components/hidden infrastructure E.G., dynamic libraries, kernel modules, language run-times, C run-time, remote webserver operating systems, etc.

Be careful about the components/ services your program trusts
What components/services you trust… and how much? What do you send to them (output to them)? Read the documentation – what’s allowed/supported? Does the component trust its data, e.g., does it auto-download refs or execute embedded code? Encode/filter as necessary How do you send data? Alternatives? Encrypted channels? What do you accept back (input from them)? How much do you trust the results? Maybe check it! How do you accept it back, and what happens if it takes too long?

Call only safe library routines
You want to reuse libraries But their specifications may not guarantee that they’re secure Their specification may even guarantee they’re insecure Implementation may be insecure on your platform/environment Prefer libraries that guarantee what you want Use the right library – e.g., don’t use eval(), use something specialized E.g., when parsing JSON with JavaScript, use JSON.parse() not eval() Sometimes you can wrap a routine to do what you need Where possible, test if it provides capabilities you need for security Perform test in compilation, installation, and/or at start-up You may sometimes need to re-implement a library if it’s insecure If you can’t be sure, re-implement It is your users who are hurt if you choose an insecure library You are responsible for the libraries/infrastructure you choose

AJAX & JSON AJAX = “Asynchronous JavaScript and XML”
Common set of technologies/techniques Often uses JSON = JavaScript Object Notation For data serialization, original spec RFC 4627 JSON example:
    {
      "firstName": "David",
      "lastName": "Wheeler",
      "address": {
        "streetAddress": "1600 Pennsylvania Ave",
        "city": "Washington",
        "state": "DC"
      }
    }
JSON doesn’t allow trailing commas (JSON5 does)

Don’t just “eval” untrusted data… JSON is an example
In general, don’t run “eval” with untrusted data Most JSON-formatted text is also syntactically legal JavaScript code “Easy” way to parse data in JavaScript is via eval() Incorrect - doesn’t support some Unicode characters Security vulnerability if data & JavaScript environment not controlled by single trusted source E.g., malicious JavaScript attack, application forgery, etc. In general, don’t “eval” untrusted data!!! In JavaScript, instead use newer function JSON.parse() Mozilla Firefox 3.5+, MS IE 8+, Opera 10.5+, Google Chrome, Safari

Limit call-outs to valid values
Ensure any call out to another program only permits valid and expected values for every parameter Including libraries, daemons, remote servers This is more difficult than it sounds Many library calls or commands call other components in potentially surprising ways Don’t always clearly document exactly what’s safe Result: Need to be conservative Common problem: Metacharacters

Metacharacters Metacharacters = characters in an input that are not interpreted as data Metacharacters instead control how the other characters are interpreted May be commands, delimiters of data, etc. If there's a language specification, it has metacharacters Examples: SQL: the single-quote ' can begin/end a string POSIX shell: $ can begin a parameter reference, e.g., $HOME If your program allows attackers to insert such metacharacters, don’t pass them on unescaped You often must allow metacharacters through (“O'Mally”)

SQL – Getting Connected in Java
    import java.sql.*;
    class ExecuteSqlQuery {
      public static void main(String[] args) {
        // In real code, use try {…} catch () … to deal with database connection failure, etc.
        String connectionURL = "jdbc:mysql://localhost:3306/mydatabase";
        // Load JDBC driver for MySQL (for example):
        Class.forName("com.mysql.jdbc.Driver").newInstance();
        // Connect given connectionURL, username, password:
        Connection connection = DriverManager.getConnection(connectionURL, "root", "root");
        // What’s wrong with this password handling approach?
        Statement statement = connection.createStatement();
        ResultSet rs = null;

SQL – How to do it wrong
    …
    // User input search_lastname:
    String QueryString = "select * from authors where lastname = '"
                         + search_lastname + "'"; // data surrounded by single-quotes
    rs = statement.executeQuery(QueryString);
    while (rs.next()) {
      System.out.println(rs.getString(2) + "\n");
    }
    // … eventually close …
    rs.close();
    statement.close();
    connection.close();
This string concatenation, followed by an execution, takes the untrusted user input data & passes it directly to an interpreter. Bad idea.

SQL Query: Intent & Actual
Intent of programmer is to create: select * from authors where lastname = 'user_input' Imagine attacker provides, as input value: name' OR 'a'='a Resulting query is: select * from authors where lastname = 'name' OR 'a'='a' Last part always true, so whole table returned Single quote is a SQL metacharacter
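
The intent-versus-actual gap above can be reproduced without a database. The following is a minimal sketch that only builds the query string the wrong way; the class and method names (QueryConcatenationDemo, unsafeQuery) are hypothetical and chosen for illustration.

```java
public class QueryConcatenationDemo {
    /** Builds the query the WRONG way: untrusted data is pasted between the quotes. */
    public static String unsafeQuery(String searchLastname) {
        return "select * from authors where lastname = '" + searchLastname + "'";
    }

    public static void main(String[] args) {
        // Ordinary input gives the query the programmer intended.
        System.out.println(unsafeQuery("Wheeler"));
        // The attacker's input closes the string early and adds its own condition.
        System.out.println(unsafeQuery("name' OR 'a'='a"));
    }
}
```

Printing both strings side by side shows that the single quote in the input ends the data and starts SQL syntax, which is exactly what a metacharacter does.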

SQL Injection SQL injection = attack where:
An attacker inserts data That will eventually be supplied to a SQL interpreter In a way where that data will be misinterpreted (e.g. as a metacharacter) Previous input example, name' OR 'a'='a, is an example of a SQL injection attack

SQL injection: Other common attacks
Massive variation in SQL interpreters SQL “standard” isn’t So attacks vary depending on interpreter Common sequences in SQL injection attacks: Single/double quote (as already seen) Using “;” as command separator Insert whole new commands after separator Using “--” as comment token Specify “ignore material afterwards” & foil limiting text Again, don’t try to create a blacklist

Solution for SQL Injection: Prepared statements
“Prepared statement” to identify placeholders Pre-existing library then escapes it properly Properly-implemented Object-Relational Mapping (ORM) systems internally use prepared statements Many advantages Library does the escaping for you – simpler, more likely to get right Tends to produce easier-to-maintain code Tends to execute faster Especially important for SQL – different SQL engines can have different rules

Java “prepareStatement” method
public interface “Connection” represents a connection (session) with a specific database SQL statements executed & results are returned within context of a connection Includes method PreparedStatement prepareStatement(String sql) throws SQLException … Creates a PreparedStatement object for sending prepared SQL statements to the database has the effect of escaping metacharacters in each parameter “sql” is an SQL statement that may contain one or more '?' IN parameter placeholders

Prepared statements example (Java)
    String QueryString = "select * from authors where lastname = ?";
    PreparedStatement pstmt = connection.prepareStatement(QueryString);
    // Set first param - library escapes it
    pstmt.setString(1, search_lastname);
    // executeQuery() returns the ResultSet; execute() only returns a boolean
    ResultSet results = pstmt.executeQuery();

Warning: “prepareStatement” can be misused, too!!
Prepared statements only work if all input influenced by untrusted users is actually prepared E.g., substituted as “?” in query string & setString Don’t do this (where search_lastname is untrusted):
    String QueryString = "select * from authors where lastname = '"
                         + search_lastname + "'";
    PreparedStatement pstmt = connection.prepareStatement(QueryString);
    ResultSet results = pstmt.executeQuery();
I’m using prepared statements, so nothing can go wrong … can go wrong … can go wrong …

Other (SQL) injection countermeasures
Writing your own escape code Be careful – identify characters that must not or don’t need escaping (e.g., alphanumerics) & escape everything else (whitelist) Rules vary depending on SQL engine & its version Don’t do this with SQL – use libraries designed for purpose! Stored procedures – can help prevent SQL injection attacks By limiting types of statements that can be passed to their parameters However, attackers can often work around those limits Can prevent some attacks, but by themselves they don’t counter SQL injection (you still have to escape, etc.) Prevent metacharacters from getting in as input (input validation) If you can, do it. ASCII alphanumerics are normally not metachars Input validation often cannot be sole countermeasure; often must accept some metacharacters (e.g., '). Input validation not enough Do not depend solely on input validation to counter injection Mistakes too easily made & future mods may require removing restrictions

Not just SQL Nearly all database systems (SQL or not) support languages At least for queries So need to prevent metacharacter misinterpretation Use a prepared statement library if you can If none available, consider creating one At least create escaping library - easily check & reuse Metacharacters also show up in CSV, XML, command shell, etc.

Comma Separated Value (CSV) Injection
CSV format is very simple:
    UserId,BillToDate,ProjectName,Description,DurationMinutes
    1, ,Test Project,Flipped the jibbet,60
    2, ,Important Client,"Bop, dop, and giglip", 240
    2, ,Important Client,"=2+5", 240
    2, ,Important Client,"=2+5+cmd|' /C calc'!A0", 240
    2, ,Important Client,"=IMPORTXML(CONCAT("" CONCATENATE(A2:E2)), ""//a"")",240
Spreadsheets often used to read CSV files but they may execute formulas in CSV files (e.g., if begin with “=”)
Some executions request confirmation, but users often do
IMPORTXML & others enable no-confirmation exfiltration
Source: “The Absurdly Underestimated Dangers of CSV Injection”, George Mauer, 7 October,
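
One commonly suggested partial mitigation, which is not from the slide itself, is to neutralize cells that a spreadsheet might treat as a formula before writing them out. The sketch below assumes a hypothetical helper named CsvCell.neutralize; the set of trigger characters and the apostrophe prefix follow widely circulated guidance and are assumptions, not a guarantee for every spreadsheet program. The prefix also changes the stored data, so it may not suit every consumer of the file.

```java
public class CsvCell {
    // Leading characters that spreadsheets commonly treat as starting a formula (assumed list).
    private static final String FORMULA_TRIGGERS = "=+-@\t\r";

    /** Returns the value as a quoted CSV field, prefixed with ' if it looks like a formula. */
    public static String neutralize(String value) {
        if (!value.isEmpty() && FORMULA_TRIGGERS.indexOf(value.charAt(0)) >= 0) {
            value = "'" + value; // makes the spreadsheet treat the cell as text
        }
        // Standard CSV quoting: wrap in double quotes and double any embedded quote.
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(neutralize("Flipped the jibbet"));
        System.out.println(neutralize("=2+5"));
    }
}
```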

XML: Check formatting Lots of data/messages formatted using XML
“Well-formed”: Follows certain syntax rules E.G., all opened tags are closed, nesting ok Check before using from untrusted sources! “Valid”: Meets some schema definition Check for validity before using untrusted input Eliminates many problems – schema == whitelist Don’t let attacker determine what schema to use! Decide what schema is okay & use that

XML Vulnerability: External Entities / External References
XML supports external references which can be auto-loaded, e.g.: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" " <!DOCTYPE letter [ <!ENTITY part1 SYSTEM " <!ENTITY part2 SYSTEM "../../../secrets/part2.xml"> ]> … <building> &part1; &part2; </building> Don’t accept unchecked external references from untrusted sources Possible solutions: Configure XML reader to ignore external references (& TEST that!) Forbid or check (with whitelist) external reference before use Don’t use XML, SOAP (which uses XML), etc. This is #4 vulnerability on the OWASP top 10 of 2017 (XML External Entities (XXE))
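
As one way to "configure the XML reader to ignore external references (and test that)", the sketch below uses the JDK's built-in DOM parser and rejects any document containing a DOCTYPE at all. The class name SafeXmlReader and its methods are hypothetical; the feature URI is the one recognized by the Xerces-derived parser shipped with the JDK, and other parsers may need different settings.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class SafeXmlReader {
    /** Parses XML with DOCTYPE declarations (and so external entities) disallowed. */
    public static Document parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Any DOCTYPE becomes a fatal parse error, which rules out entity definitions.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Convenience check used to TEST the configuration: true if the parser accepts the input. */
    public static boolean isAccepted(String xml) {
        try {
            parse(xml);
            return true;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return false;
        }
    }
}
```

Calling isAccepted with a document that declares an external entity should return false, while a plain document with no DOCTYPE should still parse. The parser may also print a fatal-error line to standard error when it rejects input.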

Command shell Most systems have at least one command shell
Useful for quickly combining programs, query, debugging, etc. Unix-like systems: e.g., /bin/sh, bash, dash, ksh, csh Windows: e.g., cmd.exe, PowerShell Command shell idea originally from MULTICS (1960s) Many metacharacters, and shell can call other commands Useful… but easy to make mistakes Only use when you really need its capability & be cautious doing so Several library calls may call command shell system(3), popen(3); sometimes execlp(3) and execvp(3) Perl & shell backtick (`); Python os.system() Java java.lang.Runtime.getRuntime().exec() In C, execve(2) does not invoke shell Use execve(2), not system(3), if you don’t need full system(3) functionality Many programs have mechanisms to invoke commands too Mail, vi, vim, emacs, ….

Unix-like shell metacharacters
Shell metacharacter list is long; one list is: & ; ` ' \ " | * ? ~ < > ^ ( ) [ ] { } $ \n \r Yet that list is known to be wrong, e.g.: Space & tab: normal separators # is comment (all rest on line ignored) ! means “not” in some contexts & history in others Leading “-” on parameter may be considered option Whitelist – escape non-alphanumerics Other commands have own character set and may need to escape through shell too Escape in right order, so that it’s actually escaped

SAMATE 1596 (1) – Unix-like // SAMATE CWE-077: Failure to Sanitize Data into a Control Plane (Command Injection) at line 44,45 /* Description: Tainted input allows command execution. Keywords: Port Java Size0 Complex0 Taint Unsafe InvalidParam: "user=bogus;ls -l /" ValidParam: user=root Copyright 2005 Fortify Software. Permission is hereby granted, without written agreement or royalty fee, to use, copy, modify, and distribute this software and its documentation for any purpose, provided that the above copyright notice and the following three paragraphs appear in all copies of this software. IN NO EVENT SHALL FORTIFY SOFTWARE BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION, EVEN IF FORTIFY SOFTWARE HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMANGE. FORTIFY SOFTWARE SPECIFICALLY DISCLAIMS ANY WARRANTIES INCLUDING, BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT. THE SOFTWARE IS PROVIDED ON AN "AS-IS" BASIS AND FORTIFY SOFTWARE HAS NO OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS. */

SAMATE 1596 (2) – Unix-like
    import java.lang.Runtime;
    import java.io.*;
    import javax.servlet.*;
    import javax.servlet.http.*;
    public class Proc1_bad extends HttpServlet {
      public void doGet(HttpServletRequest req, HttpServletResponse res)
          throws ServletException, IOException {
        res.setContentType("text/html");
        ServletOutputStream out = res.getOutputStream();
        out.println("<HTML><HEAD><TITLE>Test</TITLE></HEAD><BODY><blockquote><pre>");

SAMATE 1596 (3) – Unix-like
        String user = req.getParameter("user");
        if(user != null) {
          try {
            String[] args = { "/bin/sh", "-c", "finger " + user }; // Uh oh!!!!
            Process p = Runtime.getRuntime().exec(args); // Disaster here.
            BufferedReader fingdata = new BufferedReader(new InputStreamReader(p.getInputStream()));
            String line;
            while((line = fingdata.readLine()) != null)
              out.println(line);
            p.waitFor();
          } catch(Exception e) {
            throw new ServletException(e);
          }
        } else {
          out.println("specify a user");
        }
        out.println("</pre></blockquote></BODY></HTML>");
        out.close();
      }
    }
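
A hedged sketch of how the same request might be handled more safely: validate the parameter against a whitelist, then pass it as a separate argument with no /bin/sh in between. The class FingerCommand, its method names, and the exact whitelist pattern are assumptions for illustration and are not part of the SAMATE test case.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class FingerCommand {
    // Whitelist: ASCII letters, digits, underscore; excludes ';', space, '/', leading '-'.
    private static final Pattern VALID_USER = Pattern.compile("[A-Za-z0-9_]{1,32}");

    public static boolean isValidUser(String user) {
        return user != null && VALID_USER.matcher(user).matches();
    }

    /** Builds the argument vector; the user name is its own argument, never shell text. */
    public static List<String> buildArgs(String user) {
        if (!isValidUser(user)) {
            throw new IllegalArgumentException("invalid user name");
        }
        return Arrays.asList("finger", user);
    }

    /** Prepares (but does not start) the process; no shell is involved. */
    public static ProcessBuilder prepare(String user) {
        return new ProcessBuilder(buildArgs(user));
    }
}
```

With this shape, the test case's InvalidParam value "bogus;ls -l /" is rejected by the whitelist, and even if it were not, the semicolon would reach finger as literal data rather than being interpreted by a shell.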

Example of bad code on Windows
    String btype = request.getParameter("backuptype");
    String cmd = new String("cmd.exe /K \"c:\\util\\rmanDB.bat " + btype + "&&c:\\utl\\cleanup.bat\"");
    Runtime.getRuntime().exec(cmd);
Again, data from untrusted user is used, unescaped, as part of larger command – disaster. Source: “Secure Programming with Static Analysis”, Chess & West, page 168

Storing data: Databases & simple files
Databases (e.g., SQL) provide many useful functions E.G., arbitrary query Enable storage of field data But using them means that your program depends on potentially-complex DBMS Need to set up / configure May have its own vulnerabilities, etc. You have to include lower-level kernel anyway If simple files work for storage (e.g., with filenames as key), may want to use them for storage Can be faster, simpler. Directory structures can simplify much But not if you start rewriting a DBMS (!) Design decision (DBMS vs. files), but either involves calling out

Common data type passed elsewhere: Pathnames
Pathname = how to find file system object Unix-like: Sequence of 1+ filenames separated by “/” Pathnames often partly controlled by untrusted user Often useful to use file/directory names as a key to identify relevant data, can lead to untrusted user controlling filenames Monitoring/management of VM/shared system – untrusted monitoree controls filenames Defense-in-depth: Counter attacker who gets in part way Need to protect… so need to know about filename issues Obvious case: Don’t allow redirection outside dir E.G., if whitelist allowed “.”, “/”, and maybe “\” – problem! “trusted_root_path + username” might go somewhere unexpected if username is “../../../mysecrets” As always, use limited whitelist for info used to create filenames
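
The “trusted_root_path + username” problem can be countered in two layers: a limited whitelist for the name, plus a check that the resolved path is still under the trusted root. The following is a minimal sketch; the class UserDirectory, its methods, and the whitelist pattern are hypothetical. It works on path strings only, so it does not account for symbolic links inside the root.

```java
import java.nio.file.Path;
import java.util.regex.Pattern;

public class UserDirectory {
    // Limited whitelist: no '.', '/', or '\' can appear in the name at all.
    private static final Pattern SAFE_NAME = Pattern.compile("[A-Za-z0-9_]{1,64}");

    /** Resolves username under trustedRoot, or throws if the name is not acceptable. */
    public static Path resolve(Path trustedRoot, String username) {
        if (username == null || !SAFE_NAME.matcher(username).matches()) {
            throw new IllegalArgumentException("username not allowed");
        }
        Path root = trustedRoot.toAbsolutePath().normalize();
        Path candidate = root.resolve(username).normalize();
        // Defense-in-depth: even after the whitelist, confirm we stayed inside the root.
        if (!candidate.startsWith(root)) {
            throw new IllegalArgumentException("path escapes trusted root");
        }
        return candidate;
    }

    public static boolean isAllowed(Path trustedRoot, String username) {
        try {
            resolve(trustedRoot, username);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```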

Windows Pathnames: Difficult to make secure
Windows pathname interpretations vary depending on: Version of Windows API used (some use CreateFile, support \\.\) “letter:…” and “\\server\share...” have special meaning Nasty issue: reserved names in files Built-in reserved device names: CON, PRN, AUX, NUL, COM1, COM2, COM3, COM4, COM5, COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, LPT8, and LPT9 Even worse, drivers can create more reserved names (!) Avoid reserved names with/without extension.. if attacker can trick into reading/writing (e.g., com1.txt), may (depending on API) cause r/w to device In this case, even simple alphanumerics can cause disaster (rare situation!) Directory separator \ and / (\ widely used for escapes) Don’t end a file or directory name with space or period Underlying file system may support, but Windows shell & user interface don’t More info:

Unix-like filenames Unix-like systems in practice allow almost any sequence of bytes as a pathname Separated by “/”, terminated by \0 Case sensitive (normally) So yes, they allow problematic filenames with: Spaces Control characters (including tab, newline, escape) Non-characters (e.g., non-UTF-8) Leading “-” (option marker) Problematic filenames can cause trouble later Some additional problems happen when using shell Filename problems not limited to shell

First, a POSIX shell specific issue: Variable quoting
POSIX (Bourne) shell references need to be quoted if variable can include input separators Separators are space, tab, newline by default So this is usually wrong: cat $filename If $filename might include space, tab, or newline, it will be split into multiple filenames So if you use a Bourne shell to process filenames: Always quote variable references unless known the variable cannot have the space, tab, or newline characters: cat "$filename" Set IFS to just newline & tab (no space) early on, e.g.: IFS="$(printf '\n\t')" Do both if can (defense-in-depth); IFS change may require other changes

POSIX filenames: How to do it wrongly (one-directory)
One-directory example 1:
    cat * > ../collection # WRONG
Fails if a filename begins with “-” (e.g., “-n”)
This can happen in any language
One-directory example 2:
    for file in * ; do # WRONG
      cat "$file" >> ../collection
    done
Also fails if filename begins with “-”
Fails if no files (will loop on file named “*”)
Primarily a problem in shell (glob libraries simplify this)

POSIX filenames: How to do it wrongly (multiple-directories)
POSIX “find” is usual mechanism for walking directories
Automatically inserts directory name in front
Leading “-” can’t happen if directory name doesn’t start with “-”
“Easy” ways often fail with space, newline, tab (often split up)
If a produced filename contains “*” etc., then when $(…) returns, shell will try to expand using that (“set -f” disables this)
Multi-directory example 1:
    cat $(find . -type f) > ../collection # WRONG
Fails if file contains space, tab, newline
Multi-directory example 2:
    ( for file in $(find . -type f) ; do # WRONG
        cat "$file"
      done ) > ../collection
Same problem
Multi-directory example 3:
    ( for file in $(find . -type f) ; do # WRONG
        cat "$file"
      done ) > ../collection

POSIX filenames: How to do it wrongly (multiple-directories)
Multi-directory example 4:
    ( find . -type f | xargs cat ) > ../collection # WRONG
Wrong. By default, xargs’ input is parsed, so space characters (as well as newlines) separate arguments, and the backslash, apostrophe, and double-quote characters are used for quoting
Multi-directory example 5:
    ( find . -type f |
      while IFS="" read -r filename ; do
        cat "$filename"
      done ) > ../collection # WRONG
Wrong. Ok if filenames can’t include newline
If filenames can include newline, then any line-at-a-time processing of filenames fails

POSIX filenames: How to do it right
Prefix all globs/filenames so they cannot begin with "-" when expanded Add “./” prefix, etc., as needed Handle any filename byte sequence Including tab, newline, escape, non-UTF-8 Filenames are sequences of bytes, not characters Beware of printing filenames “Escape” control character may control other things Newline/tab/space may separate surprisingly Non-UTF-8: Unix-like don’t guarantee printability “<” and “&” if output to HTML/XML

Correct glob use #1 (one directory)
    # Prefix glob (do this in any language) with ./
    # Check if file exists (more important in shell)
    for file in ./* ; do # Use "./*", NEVER "*"
      if [ -e "$file" ] ; then # Prevent empty match
        COMMAND ... "$file" ...
      fi
    done

Correct find use Quick ways to use find:
find ... -exec COMMAND... {} \; find ... -exec COMMAND... {} \+ Nonstandard but common find/xargs extensions: find . -print0 | xargs -0 COMMAND Nonstandard extensions to find/shell (bash ok): find . -print0 | while IFS="" read -r -d "" file ; do ... COMMAND "$file" # Use quoted "$file", not $file done

File walking in shell – a simpler alternative, but with a caveat
If IFS doesn’t include space (per above), and filenames cannot include tab/newline…
Then shell patterns like this work just fine:
    set -f # Need so “*” in filenames not expanded
    for file in $(find .) ; do
      COMMAND "$file" ...
    done
Simpler – but only if you can guarantee it
Alternative – try, & skip files with control chars:
    controlchars="$(printf '*[\001-\037\177]*')"
    set -f
    for file in $(find . ! -name "$controlchars") ; do
      COMMAND "$file" ...
    done

More information on Unix-like pathnames
“Filenames and Pathnames in Shell: How to do it correctly” “Fixing Unix/Linux/POSIX Filenames: Control Characters (such as Newline), Leading Dashes, and Other Problems”

Windows-specific file content problems
Windows Binary vs. Text file semantics “Text” often requires extra translations on input & output Windows text files use the CP/M line-ending convention CRLF Most other systems/languages use 1-char line-ending (LF) Yes, there are rare alternatives (IBM NEL, pre-MacOS-X CR) Can lead to misinterpretation or data corruption If translates when it shouldn’t, or doesn’t when it should E.G., \n in C/C++/Java often has to be converted back & forth Character encoding Many nonstandard Windows encodings (e.g., 1252, etc.) Many files use a UTF-16-style encoding (byte order!) Guessing common in Windows apps… yet can easily go wrong Many programs use Win32 charset detection function IsTextUnicode Especially bad pre-Vista (“Bush hid the facts” or “this app can break”) More recent tweaks make less likely, but fundamental problem still there Can lead to mojibake (aka “character salad”)/misinterpretation

Introduction: Insecure Deserialization / Demarshalling
Deserialization = convert stream of bytes/characters back into a copy of the original object Applications/APIs may request deserialization of hostile or tampered objects from attacker Typically by calling some deserialization library, which may be hidden within a larger system Most deserialization attacks can be subdivided as: Object and data structure related attacks where the attacker modifies application logic or achieves arbitrary remote code execution if there are classes available to the application that can change behavior during or after deserialization. Typical data tampering attacks such as access-control-related attacks where existing data structures are used but the content is changed [OWASP] OWASP 2017 top ten #8 is Insecure deserialization; see also CWE-502. More:

Examples where Insecure Deserialization could happen
Java’s Serializable format Python’s “pickle” library (can recreate objects) Remote- and inter-process communication (RPC/IPC) Wire protocols, web services, message brokers HTTP cookies If serialize object to cookie, deserialize cookie later, & cookie signature unchecked

Countering insecure deserialization
Use data-only formats (e.g., JSON) instead of formats that can embed arbitrary objects
This only deals with case #1, object and data structure related attacks
Only deserialize data with trusted signature
If an attacker could have created/modified it, don’t trust it
Must verify signature first!
Your application could be signer – if so, can use (fast) shared secret MAC algorithm (secret shared with self)
Don’t let attacker control data to be deserialized
Input validation – but beware, poor approach
Hard to do correctly, so in general unreliable
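
The “application as signer with a shared-secret MAC” idea can be sketched with the JDK’s HMAC support. The class SignedData and its method names are hypothetical, and HMAC-SHA256 is an assumed algorithm choice; key generation, storage, and rotation are outside this sketch. The point is the order: check the tag first, and only hand the bytes to a deserializer if the check passes.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class SignedData {
    /** Computes an HMAC-SHA256 tag over data using a secret the application shares with itself. */
    public static byte[] sign(byte[] key, byte[] data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 unavailable", e);
        }
    }

    /** Must return true BEFORE the data is passed to any deserializer. */
    public static boolean isAuthentic(byte[] key, byte[] data, byte[] tag) {
        // MessageDigest.isEqual compares the full arrays rather than stopping at the first difference.
        return MessageDigest.isEqual(sign(key, data), tag);
    }
}
```

A typical use would be to store data and tag together (for example in a cookie), and on return to reject the request outright if isAuthentic is false.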

Quick aside: “Assume breach” paradigm
Depending only on prevention doesn’t work Some attackers do break through all protections Esp. if software wasn’t designed & implemented to be secure (most software today) “Assume breach” mindset: “limits the trust placed in applications, services, identities and networks… by treating them all—both internal and external—as not secure and probably already compromised” Source: “The Assume Breach paradigm” –

“Assume Breach”: The “new” paradigm?
“Assume breach” mindset can help, but it depends on how it’s applied If everything is actually always breached, you end up with the ridiculous: Can’t do anything useful - no secure state to recover back to & use Detecting often expensive (auto not enough) & always misses some things Recovering after-the-fact is often extremely expensive & sometimes impossible (e.g., usually can’t undo a data release) Some say “assume breach” to justify failing to prevent attacks at all, even when it’s cheaper & more effective as it often is (bad!) Some say “assume breach” to: Ensure that resources are also spent on detection & recovery (good!) Ensure that “least privilege” design principle is applied (good!) Focusing only on prevention – or only on detection & recovery – has always been bad “Least privilege” is a 1970s S&S principle, it’s certainly not new Prevention often cheaper: Ounce of prevention still worth a pound of cure Prevention reduces attacker successes, makes detection & recovery practical Need total package: prevention and also detection & recovery

Detection & recovery’s impact on design & implementation
Recovery: Plan for it E.g., backup your data, enable reset back to safe/known state, implement “degraded” state(s) (e.g., “read only” mode), enable migration to alternative systems/services, user notification system (“we’re sorry to inform you…”) Rate limiting can often detect & auto-recover (by request IP address, login ID, etc.) - limits over period of time can auto-handle burstiness Often the problem is doing it, not its technical complexity.. so ID those as security requirements, design, & do it in your system before needed Detection: Key is logging (audit trails) The logs can then be monitored, along with other indicators, to detect ongoing attacks Detection important: often the trigger for recovery “How to monitor for detection” outside our scope, but we do need to discuss how to make detection possible! Thus, we need to discuss logging (an external system that is especially important for security)

Calling out to logging/debugging systems
Centralize all logging/debugging, use consistently Simplifies analysis (all data in one place) Eases change/reconfiguration Log instead of revealing problem details to users Ok to say there’s a problem, but don’t say too much Attackers love it when you give them detailed data! Record important successes & failures Try to reuse existing log systems Less code, easier to integrate, etc. Existing ones: log4j, java.util.logging, syslog, ... Deployments typically want to centralize logs so they can easily combine data from multiple sources, change how & how much to log, where it’s stored, send to separate protected system, etc.

When to log Logging only useful if important events are recorded
Log all important events, including: Login, logout, & other authorization changes Anything possibly indicating an attack or attempt to work around defenses Categorize messages so can configure what gets logged

If you must roll your own logging/debugging system
Record date/time & source Source = machine & application Sub-second accuracy very helpful Log(category, message) Allow configuration of: What to actually record (which categories) Where to send it (file, remote system, etc.) What to do on “log full” (Throw away old? New? Stop running?) Escape messages

Log/debug entries can become security vulnerabilities
Data destined for logs may include untrusted user data Including debugging systems – which will be used, since operational systems sometimes have problems Attackers may intentionally create data that will create problems later, e.g.: Crash/take over logging system Forge log entries Create attack on later retrieval Many store ASCII text In that case, encode all nonprintable chars (esp. control chars) so they’re something else, e.g., URL-encode or \ddd

Log forging example (1)
    // Do not do this:
    String val = request.getParameter("val");
    try {
      int value = Integer.parseInt(val);
    } catch (NumberFormatException e) {
      log.info("Failed to parse val=" + val);
    }
If a user submits a “val” value of “twenty-one”, then this entry is logged:
    INFO: Failed to parse val=twenty-one

Log forging example (2) But if attacker submits “val” value of:
twenty-one%0a%0aINFO:+User+logged+out%3dbadguy Then the log will falsely record: INFO: Failed to parse val=twenty-one INFO: User logged out=badguy Possibly fooling later log viewers: badguy “couldn’t” have done later actions Make it appear things okay or confuse causes Frame someone else
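
One way to counter this forgery is to encode every byte outside printable ASCII before the value reaches the log, so an embedded newline can no longer start a new entry. The sketch below uses a hypothetical LogEscaper.escape helper and an assumed backslash-x-plus-two-hex-digits encoding; URL-encoding or the \ddd form would serve equally well as long as it is applied consistently.

```java
import java.nio.charset.StandardCharsets;

public class LogEscaper {
    /** Returns message with every non-printable or non-ASCII byte, and backslash, hex-encoded. */
    public static String escape(String message) {
        StringBuilder out = new StringBuilder();
        for (byte b : message.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xff;
            boolean printable = v >= 0x20 && v <= 0x7e;
            if (printable && v != '\\') {
                out.append((char) v);
            } else {
                // Backslash itself is encoded so the escaped form stays unambiguous.
                out.append(String.format("\\x%02x", v));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String val = "twenty-one\n\nINFO: User logged out=badguy";
        System.out.println("INFO: Failed to parse val=" + escape(val));
    }
}
```

With the attacker's value from this slide, the forged text stays on the same line as the genuine entry, with the newlines visible as escapes, instead of appearing as a separate INFO record.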

Protect logs Prevent read or write log access by untrusted users
Logs usually sent to separate system in operation Logs give away a lot, including: What you’re looking at.. and what you aren’t May include sensitive data Logs useful for: Debugging problems Evidence of attack

Do not include passwords & other sensitive data in logs
Logs should normally be private, but: Sometimes logs will be revealed to others Recipient or recipient’s later use may be unauthorized Thus, don’t include passwords & very sensitive data in logs Beware of including data if might include passwords Ensure URLs don’t include passwords! If must include, log encrypted data (or use salted hash) Example: IEEE log data breach 99,979 usernames + plaintext (!) passwords Publicly available on their FTP server for at least one month prior to discovery More info:

Display attacks Many displays simulate long-gone consoles
ESCAPE + codes can change color, erase screen, sometimes even send a screen content back Result: Merely displaying a filename or file content can cause command execution Many systems store info in HTML/XML May include JavaScript, etc., that is executed on display Consider encoding data that users/admins might directly display later

Call Only Interfaces Intended for Programmers
Usually unwise to invoke a program intended for human interaction (text or GUI) Programs for humans are intentionally rich & often difficult to completely control May have “escape hatches” to unintended functions Interactive programs often try to intuit the “most likely” meaning May not be what you were expecting Attacker may find a way to exploit this Usually there’s a different program/API for other programs’ use; use that instead E.G., don’t invoke ed/vi/emacs for text processing, use sed/awk/perl Similarly, provide an API/programmer interface if sensible Perhaps have GUI that then invokes API (good approach anyway)

Check All System Call Returns
Check every system call that can return an error condition. Nearly all system calls require limited system resources, and users can often affect resources in a variety of ways. If the error cannot be handled gracefully, then fail safe.
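
In Python, failing system calls surface as `OSError` exceptions rather than return codes, and `os.write` may still write fewer bytes than requested. The sketch below shows one way to check each step and fail safe; the function names are hypothetical, and returning `False` is just one possible fail-safe policy.

```python
import os


def write_all(fd, data):
    """Write every byte of data, looping because os.write may be partial."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)  # raises OSError on failure
        view = view[written:]


def save_record(path, data):
    """Return True only if data was fully written, flushed, and closed."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    except OSError:
        return False  # could not open: report failure, do not continue
    ok = True
    try:
        write_all(fd, data)
        os.fsync(fd)  # disk-full and I/O errors can show up here
    except OSError:
        ok = False
    finally:
        try:
            os.close(fd)  # close can fail too
        except OSError:
            ok = False
    return ok
```

The caller can then refuse to proceed (for example, refuse to acknowledge a transaction) unless `save_record` returned `True`.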

Check information when it returns
Reuse the input filtering concepts from earlier: values from libraries are yet more input. If it is a number, is it within some plausible range? If it is a string, does it match a whitelist filter? If it is complex (e.g., file/data type), is it one of the permitted file/data types? If it is an image (e.g., a tile from a geometry server), are the height and width in range, and is it really an image format? If it takes too long to respond, consider alternatives. This can be hard to do everywhere, in which case prioritize where it’s riskier. These checks can help counter defects in components, even when the defect is not a security issue.
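
For the image case, a check along these lines can be sketched for PNG data: confirm the file signature, then read the width and height from the leading IHDR chunk and compare them to a limit. The function name and the 1024-pixel limit are assumptions for illustration; the check inspects only the header and does not prove the rest of the file is well formed.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_TILE_SIDE = 1024  # assumed limit for this example


def looks_like_valid_png_tile(data, max_side=MAX_TILE_SIDE):
    """Check the PNG signature and that IHDR width/height are in range."""
    if len(data) < 24 or not data.startswith(PNG_SIGNATURE):
        return False
    # The first chunk must be IHDR with a 13-byte payload.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        return False
    width, height = struct.unpack(">II", data[16:24])
    return 1 <= width <= max_side and 1 <= height <= max_side


# Minimal headers built by hand to exercise the check.
good = PNG_SIGNATURE + struct.pack(">I4sII", 13, b"IHDR", 256, 256)
huge = PNG_SIGNATURE + struct.pack(">I4sII", 13, b"IHDR", 100000, 256)
```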

Counter Web Bugs When Retrieving Embedded Content
Some data formats (e.g., HTML) can embed references to content that is automatically retrieved when the data is viewed (transclusion), without waiting for user selection. This is a privacy issue: it enables a “web bug”, so others can obtain information about a reader without the reader’s knowledge. A web bug is a reference intentionally inserted into a document and used by the content author to track who, where, and how often a document is read (e.g., a 1x1 pixel “image” loaded from somewhere else). It can reveal how a “bugged” document is passed from one person to another or from one organization to another. This is primarily an issue with file format design (we can’t undo HTML). If your users value their privacy, you will probably want to limit the automatic downloading of included files (e.g., from other sites).
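
One way a viewer could limit automatic downloads is to first find references that would be fetched from another host, then block them or ask the user. The sketch below uses Python's standard `html.parser`; the class name, function name, and the short list of auto-loaded tag/attribute pairs are assumptions and are far from exhaustive (CSS and other formats can also pull in remote content).

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class RemoteRefFinder(HTMLParser):
    """Collect references that a viewer would fetch without user action."""

    # Assumed, incomplete list of (tag, attribute) pairs that auto-load.
    AUTO_LOADED = {("img", "src"), ("script", "src"),
                   ("iframe", "src"), ("link", "href")}

    def __init__(self):
        super().__init__()
        self.remote_refs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # A non-empty network location means another host is contacted.
            if (tag, name) in self.AUTO_LOADED and value and urlsplit(value).netloc:
                self.remote_refs.append(value)


def find_remote_refs(html_text):
    finder = RemoteRefFinder()
    finder.feed(html_text)
    finder.close()
    return finder.remote_refs


message = ('<p>Hello</p>'
           '<img src="http://tracker.example/pixel.gif?id=42" width="1" height="1">'
           '<img src="logo.png">')
refs = find_remote_refs(message)
```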

Hide Sensitive Information (1)
Hide sensitive information (e.g., private, personally-identifying information, passwords, …) both in transit (input & output) and at rest (stored). In transit, web-based applications typically use https: (HTTP on top of SSL or TLS). Don’t allow GET to submit such information, because it is encoded in the Request-URI, which is often logged. Encrypt any storage: this doesn’t help if an attacker breaks into the application, but it does defend against someone who gets the storage device without the encryption keys. Encrypt passwords (with salted hashes, explained later).
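
The slide defers the details of salted hashes, so the following is only an illustrative sketch of the idea, using PBKDF2 from Python's standard library as the example algorithm. The function names and the iteration count are assumptions; a real system should follow current guidance for its chosen algorithm.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed work factor for this sketch


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is used per password."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


# Store only the salt and digest, never the password itself.
stored_salt, stored_digest = hash_password("correct horse battery staple")
```

Because each password gets its own salt, two users with the same password end up with different stored digests.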

Hide Sensitive Information (2)
Don’t send it if you don’t have to. E.g., create special “local” ids when sending to other sites, then translate them back when results return. Make it hard for external sites to reveal info about your users, data, etc.
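
The “local” id idea can be sketched as a two-way mapping between real identifiers and random tokens that carry no information about the user. The class name `LocalIdMap` is hypothetical, and this in-memory version is only an illustration; a real service would persist the mapping and protect it like other sensitive data.

```python
import secrets


class LocalIdMap:
    """Map real identifiers to random stand-ins for use with other sites."""

    def __init__(self):
        self._to_local = {}
        self._to_real = {}

    def to_local(self, real_id):
        """Return the stand-in for real_id, creating one on first use."""
        local_id = self._to_local.get(real_id)
        if local_id is None:
            local_id = secrets.token_urlsafe(16)  # unguessable, meaningless
            self._to_local[real_id] = local_id
            self._to_real[local_id] = real_id
        return local_id

    def to_real(self, local_id):
        """Translate a stand-in back; unknown values raise KeyError."""
        return self._to_real[local_id]


ids = LocalIdMap()
outgoing = ids.to_local("alice@example.com")  # this is what the other site sees
incoming = ids.to_real(outgoing)              # translated back on return
```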

Conclusions
In practice, programs must depend on other components: libraries, OS, DBMS, etc. Be careful about how you call out to them. Metacharacters in particular can cause trouble: SQL, command shell, filenames, log/debug entries. Don’t allow anything to be sent to another component unless you’re sure it’s okay. For SQL, use prepared statements (the usual approach); in other cases, you often need to escape the data to be sent. Input validation is often not enough. Support detection & recovery, e.g., by logging. Be careful about what you accept back.

Released under CC BY-SA 3.0
This presentation is released under the Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) license.
You are free: to Share (to copy, distribute and transmit the work); to Remix (to adapt the work); to make commercial use of the work.
Under the following conditions: Attribution: You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). Share Alike: If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.
These conditions can be waived by permission from the copyright holder (dwheeler at dwheeler dot com). Attribute me as “David A. Wheeler”.
