Authors: James Newsome, Dawn Song

Dynamic Taint Analysis for Automatic Detection, Analysis, and Signature Generation of Exploits on Commodity Software
Authors: James Newsome, Dawn Song
Presenters: Sheikh M Qumruzzaman, Khaled M Al-Naami

Welcome and Introduction
Overview
Dynamic Taint Analysis
TaintCheck: TaintSeed, TaintTracker, TaintAssert, Exploit Analyzer
Security Analysis of TaintCheck
Evaluation and Performance
Automatic Signature Generation
Conclusion

Overview
Worms exploit software vulnerabilities such as buffer overflows, format string bugs, dangling pointers, and SQL injection. Worms such as CodeRed and Slammer exploit these vulnerabilities and can compromise hundreds of thousands of hosts within hours or even minutes.

Slammer
The geographical spread of Slammer in the 30 minutes after its release. The diameter of each circle is a function of the logarithm of the number of infected machines, so large circles visually underrepresent the number of infected cases in order to minimize overlap with adjacent locations. For some machines, we can determine only the country of origin rather than a specific city. Source:

CodeRed
Code Red's probe rate during its re-emergence on 1 August 2001, as seen on one Internet subnetwork, matched against the random constant spread (RCS) worm behavior model. The Code Red worm was a typical random-scanning worm: it selects IP addresses at random to infect, eventually finding all susceptible hosts. Random-scanning worms initially spread exponentially, but the infection of new hosts slows as the worm spends more effort retrying addresses that are either already infected or immune. Thus, as with the Code Red worm of 2001, the proportion of infected hosts follows a classic logistic form of initially exponential growth in a finite system [5,3]. Source:

WHAT DO WE NEED?
An automatic detection and defense system.
Automatic development of attack signatures.
Tools for exploit analysis.
To fight new, fast-spreading worms, an automatic detection and defense system is necessary, and signatures for a new attack should be developed automatically. In this paper, the authors propose a new technique, dynamic taint analysis, and show how it can be used to detect and analyze software exploits.

ATTACK DETECTORS
Coarse-grained detectors: Detect anomalous behavior, such as scanning or unusual activity at a certain port. They may produce frequent false positives and do not provide detailed information about the vulnerability or how it is exploited.
Fine-grained detectors: Detect attacks on a program vulnerability and provide detailed information about them. Several approaches exist, but most of them require source code or special recompilation of the program, and most are not dynamic. These constraints hinder deployment and applicability, especially for commodity software, where the source code is usually not available.

Dynamic Taint Analysis
Tainted data: data from untrusted sources.
Keep track of tainted data: monitor program execution to track how the taint attribute propagates, and check when tainted data is used in dangerous ways.
TaintCheck: an automatic dynamic taint analysis tool.
In dynamic taint analysis, data originating from or arithmetically derived from untrusted sources, such as the network or user input, is labeled as tainted. The propagation of tainted data is tracked as the program executes (i.e., which data in memory is tainted), and an alarm is raised when tainted data is used in dangerous ways that could indicate an attack. This approach detects overwrite attacks: attacks that cause a sensitive value (such as a return address, function pointer, or format string) to be overwritten with the attacker's data.

TaintCheck
Doesn’t require source code or special compilation.
Reliably detects most overwrite attacks.
No known false positives.
Enables automatic semantic-analysis-based signature generation.
Overwrite attacks: modify the return address, a function pointer, or a function pointer offset.

Design and Implementation (TaintCheck)
TaintCheck performs dynamic taint analysis on a program by running the program in its own emulation environment, built on Valgrind. Whenever program control reaches a new basic block, Valgrind first translates the block of x86 instructions into its own RISC-like instruction set, called UCode. It then passes the UCode block to TaintCheck, which instruments the UCode block to incorporate its taint analysis code. TaintCheck then passes the rewritten UCode block back to Valgrind, which translates the block back to x86 code so that it may be executed. Once a block has been instrumented, it is kept in Valgrind’s cache so that it does not need to be reinstrumented every time it is executed.
[Figure: x86 instructions are translated to UCode, passed through the binary rewriter (TaintCheck), and translated back to x86 instructions. Courtesy: Devendra Salvi]

Questions?
To use dynamic taint analysis for attack detection, we need to answer three questions:
What inputs should be tainted?
How should the taint attribute propagate?
What usage of tainted data should raise an alarm as an attack?

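Answers
TaintSeed. TaintTracker. TaintAssert.
To make TaintCheck flexible and extensible, the authors designed it as three components, plus an Exploit Analyzer.
[Figure: TaintCheck detection of an attack. An untainted memory byte is marked as tainted in shadow memory, which points to a Taint data structure; the taint propagates through copy and add operations; when the data is used as a function pointer, an attack is detected and passed to the Exploit Analyzer. Courtesy: Devendra Salvi]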

TaintSeed
It marks any data from untrusted sources as “tainted”.
Each byte of memory has a four-byte shadow memory that stores a pointer to a Taint data structure if that location is tainted, or a NULL pointer if it is not. Optionally, logging can be disabled, and the shadow memory locations can simply store a single bit indicating taint.
TaintSeed considers input from network sockets to be untrusted, since for most programs the network is the most likely vector of attack. TaintSeed can also be configured with an extended policy to taint inputs from other sources, e.g., input data from certain files or stdin.
TaintSeed examines the arguments and results of each system call and determines whether any memory written by the system call should be marked as tainted or untainted according to the TaintSeed policy. When memory is tainted, TaintSeed allocates a Taint data structure that records the system call number, a snapshot of the current stack, and a copy of the data that was written.
[Figure label: memory is mapped to a Taint data structure (TDS).]
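The shadow-memory idea can be sketched in a few lines of Python. This is only an illustrative model, not TaintCheck's implementation: the `Taint` and `ShadowMemory` classes, the `taint_seed` function, and the `UNTRUSTED_SOURCES` policy set are hypothetical names, and a dictionary stands in for the per-byte shadow pointers.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Taint:
    """Hypothetical stand-in for the Taint data structure."""
    syscall: str        # system call that wrote the data
    stack: List[str]    # snapshot of the call stack at that point
    data: bytes         # copy of the data that was written


class ShadowMemory:
    """Maps each byte address to a Taint record, or to None if untainted."""

    def __init__(self) -> None:
        self._shadow: Dict[int, Taint] = {}

    def get(self, addr: int) -> Optional[Taint]:
        return self._shadow.get(addr)

    def set(self, addr: int, taint: Optional[Taint]) -> None:
        if taint is None:
            self._shadow.pop(addr, None)   # plays the role of a NULL pointer
        else:
            self._shadow[addr] = taint


# Assumed policy: only data returned by these calls is untrusted.
UNTRUSTED_SOURCES = {"recv"}


def taint_seed(shadow: ShadowMemory, syscall: str, addr: int,
               data: bytes, stack: List[str]) -> None:
    """Mark the bytes written by a system call as tainted or untainted."""
    taint = Taint(syscall, list(stack), data) if syscall in UNTRUSTED_SOURCES else None
    for offset in range(len(data)):
        shadow.set(addr + offset, taint)


shadow = ShadowMemory()
taint_seed(shadow, "recv", 0x1000, b"GET /", ["main", "handle_request"])
taint_seed(shadow, "getpid", 0x2000, b"\x01\x02", ["main"])
```

After these two calls, the five bytes starting at 0x1000 all refer to one shared record, while the bytes at 0x2000 stay untainted.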


Dynamic taint analysis
TaintTracker
It tracks each instruction that manipulates data in order to determine whether the result is tainted. When the result of an instruction is tainted by one of its operands, TaintTracker sets the shadow memory of the result to point to the same Taint data structure as the tainted operand.
[Figure labels: memory is mapped to a Taint data structure (TDS); the result is mapped to the same TDS.]
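A minimal sketch of this propagation rule, under simplifying assumptions: locations are named by strings, a string label stands in for the pointer to a Taint data structure, and the helper names `propagate` and `load_constant` are invented for illustration.

```python
from typing import Dict, Optional

# Shadow state: location name -> taint label (None means untainted).
shadow: Dict[str, Optional[str]] = {}


def propagate(dest: str, *sources: str) -> None:
    """Model a data-movement or arithmetic instruction dest = f(sources)."""
    label = None
    for src in sources:
        if shadow.get(src) is not None:
            label = shadow[src]   # result shares the tainted operand's record
            break
    shadow[dest] = label


def load_constant(dest: str) -> None:
    """Loading a constant overwrites dest with untainted data."""
    shadow[dest] = None


shadow["buf[0]"] = "taint#1"     # seeded from a network read
propagate("eax", "buf[0]")       # mov: eax = buf[0]
propagate("ebx", "eax", "ecx")   # add: ebx = eax + ecx
load_constant("eax")             # mov: eax = 0
```

Here `ebx` ends up carrying the label of the original network read, while `eax` becomes untainted again once a constant is stored into it.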


TaintAssert
TaintAssert checks whether tainted data is used in ways that its policy defines as illegitimate.
Default policy: Jump addresses. Format strings. System call arguments. Application- or library-specific checks.

Jump addresses: Checks whether tainted data is used as a jump target. Instruments before each UCode jump instruction.
Format strings: Checks whether tainted data is used as a format string argument. Intercepts calls to the printf family of functions.
System call arguments: Checks whether the arguments specified in system calls are tainted. Optional policy for the execve system call.
Application- or library-specific checks: Detect application- or library-specific attacks.
Jump addresses: Many attacks attempt to overwrite a jump target in order to redirect control flow to the attacker’s code, to a standard library function such as exec, or to another point in the program (possibly circumventing security checks). The authors implemented these checks by having TaintCheck place instrumentation before each UCode jump instruction to ensure that the data specifying the jump target is not tainted.
Format strings: These checks detect format string attacks, in which an attacker provides a malicious format string to trick the program into leaking data or into writing an attacker-chosen value to an attacker-chosen memory address. The checks currently fire whenever tainted data is used as a format string, even if it does not contain malicious format specifiers. To implement them, the authors intercept calls to the printf family of functions (including syslog) with wrappers that ask TaintCheck to ensure that the format string is not tainted and then call the original function. For most programs, this catches any format string attack and does not interfere with normal functionality. However, if an application uses its own implementation of these functions, the wrappers may not be called.
System call arguments: As an example, the authors implemented an optional policy to check whether the argument specified in any execve system call is tainted. This could be used to detect an attacker attempting to overwrite data that is later used to specify the program to be loaded via an execve system call.
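The jump-target and format-string checks can be illustrated with a small Python model. It is a sketch under assumptions, not the tool's code: a set of tainted addresses replaces the shadow memory, and `assert_untainted`, `checked_jump`, `checked_printf`, and `TaintedUseError` are hypothetical names.

```python
from typing import Set


class TaintedUseError(Exception):
    """Raised when tainted data reaches a sensitive use."""


tainted: Set[int] = set()   # addresses currently marked tainted


def assert_untainted(addr: int, length: int, use: str) -> None:
    for a in range(addr, addr + length):
        if a in tainted:
            raise TaintedUseError(f"tainted byte at {a:#x} used as {use}")


def checked_jump(target_addr: int) -> None:
    """Check placed before a jump whose 4-byte target is stored at target_addr."""
    assert_untainted(target_addr, 4, "jump target")


def checked_printf(fmt_addr: int, fmt: bytes) -> str:
    """Wrapper in the spirit of the printf-family interception."""
    assert_untainted(fmt_addr, len(fmt), "format string")
    return fmt.decode("ascii")   # stand-in for calling the original function


tainted.update(range(0x5000, 0x5008))   # pretend network input landed here
try:
    checked_printf(0x5000, b"%x%x%n")
    detected = False
except TaintedUseError:
    detected = True
```

As in the policy described above, the wrapper raises an alarm for any tainted format string, whether or not it contains dangerous specifiers.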

Exploit Analyzer
It provides useful information about how the exploit happened and what the exploit attempts to do.
Usage: Identifying vulnerabilities. Generating exploit signatures.
By backtracking the chain of Taint structures, the Exploit Analyzer provides information including the original input buffer that the tainted data came from, the program counter and call stack at every point the program operated on the relevant tainted data, and the point at which the exploit actually occurred. This information helps to determine the nature and location of a vulnerability quickly and to identify the exploit being used.
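Backtracking a chain of taint records might look like the following sketch. The `TaintRecord` class, its `parent` link, and the example addresses and notes are assumptions made for illustration; the slides do not specify how the records are linked.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TaintRecord:
    pc: int                                  # program counter where the data was touched
    note: str                                # what happened at that point
    parent: Optional["TaintRecord"] = None   # assumed link to the earlier record


def backtrack(record: TaintRecord) -> List[TaintRecord]:
    """Walk from the point of detection back to the original input."""
    chain = []
    current: Optional[TaintRecord] = record
    while current is not None:
        chain.append(current)
        current = current.parent
    chain.reverse()   # oldest first: input, each operation, then the misuse
    return chain


# Made-up example chain: network input, a copy, then the tainted return.
source = TaintRecord(0x8048A10, "recv() wrote request into buffer")
copy = TaintRecord(0x8048B44, "strcpy copied buffer onto stack", parent=source)
misuse = TaintRecord(0x8048C02, "ret used tainted return address", parent=copy)

report = [f"{r.pc:#x}: {r.note}" for r in backtrack(misuse)]
```

The resulting `report` lists the input first and the point of misuse last, which is the order an analyst would read it in.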

Security Analysis for TaintCheck
The good news: attacks detected by TaintCheck.
The bad news: false negatives and false positives.

Security Analysis – Attacks detected
Overwrite attacks: TaintCheck detects when jump targets (such as return addresses and function pointers) are overwritten, whether they are altered to point to existing code (existing code attack) or to injected code (code injection attack).

Security Analysis – Attacks detected
Overwrite attacks: TaintCheck also detects format string attacks, in which an attacker provides a malicious format string to trick the program into writing an attacker-chosen value to an attacker-chosen memory address.

Security Analysis – Attacks detected
Overwrite attacks: Most worm attacks up to 2005 fall into the following categories.
[Table: overwrite method versus value overwritten.]

Security Analysis – False negative analysis – the bad news
A false negative occurs when the attacker causes sensitive data not to be tainted.
Scenario: the altered data originates from or is arithmetically derived from trusted inputs, but is influenced by untrusted inputs.
The paper does not track the taint attribute of condition flags. Example: suppose x is tainted.
if (x == 0) y = 0; else if (x == 1) y = 1; ...
This is equivalent to y = x. However, y is not tainted, because it is influenced by x only indirectly, via the condition flags. An attacker might then use y to overwrite sensitive data without being detected.
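The difference between a direct copy and a copy through comparisons can be made concrete with a toy model. The `Tracked` class and the two copy functions are hypothetical; they only mimic a tracker that follows data dependencies and ignores control dependencies.

```python
class Tracked:
    """An integer paired with a taint bit."""

    def __init__(self, value: int, tainted: bool = False) -> None:
        self.value = value
        self.tainted = tainted


def copy_direct(x: Tracked) -> Tracked:
    """y = x: a data dependency, so the taint bit travels with the value."""
    return Tracked(x.value, x.tainted)


def copy_via_branches(x: Tracked) -> Tracked:
    """Same value as y = x for 0 <= x < 256, but produced only by comparisons."""
    for candidate in range(256):
        if x.value == candidate:        # the comparison result is not tracked
            return Tracked(candidate)   # a constant, hence untainted
    return Tracked(0)


x = Tracked(65, tainted=True)
y_direct = copy_direct(x)
y_branch = copy_via_branches(x)
```

Both results hold the value 65, but only `y_direct` is still marked tainted, which is exactly the gap described above.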

Security Analysis – False negative analysis – the bad news – cont’d
A false negative can also occur if TaintCheck is configured to trust inputs that should not be trusted. For example, data from the network could first be written to a file on disk and then read back into memory.

Security Analysis – False positive analysis – the bad news
A false positive is an attack reported when there is no real attack: TaintCheck detects that tainted data is being used in an illegitimate way even though no attack is taking place. However, this usually indicates a vulnerability in the program. For example, the program may be using an unchecked input as a format string. The vulnerability can then be located with the Exploit Analyzer and fixed.

Experiments and Evaluation
Compatibility and false positives.
Evaluation of attack detection: synthetic exploits and actual exploits.

Evaluation - Compatibility and false positives
TaintCheck was used to monitor a number of programs for false positives.
Server programs: apache, ATPhttpd, bftpd, cfingerd, and named.
Client programs: ssh and firebird.
Non-network programs: gcc, ls, bzip2, make, latex, vim, emacs, and bash.
All ran normally with no false positives, except for vim and firebird.

Evaluation - Evaluation of attack detection
TaintCheck’s ability to detect attacks was tested with synthetic exploits and actual exploits.

Synthetic exploits
The authors wrote small programs for three cases:
Return address: the program used “gets” to read a long input, which overwrote the stack and the return address. The attack was detected because the return address was tainted from user input.
Function pointer: the same approach overwrote the stack and a function pointer. The attack was detected because the function pointer was tainted from user input.
Format string: the program took a line of input from the user, which overwrote the format string. TaintCheck determined correctly when the format string was tainted.

Actual exploits
TaintCheck was evaluated on exploits against three vulnerable servers: a web server, a finger daemon, and an FTP server.

ATPhttpd exploit (web server program): Ver 0.4b and lower are vulnerable to a buffer overflow. A malicious GET request with a very long file name (containing shellcode and a return address) was sent to the server. The return address is overwritten, so when the function returns it jumps to the shellcode inside the file name, giving the attacker a remote shell. TaintCheck detected that the return address was tainted and identified the new value.
cfingerd exploit (finger daemon): older versions are vulnerable to a format string attack. When cfingerd prompts for a user name, the exploit responds with a string beginning with “version” followed by malicious code. cfingerd copies the whole string into memory but only reads to the end of the string “version”, leaving the malicious code in memory to be executed. The format string used to overwrite the return address was detected.
wu-ftpd exploit (FTP server): a version of wu-ftpd has a format string vulnerability in a call to vsnprintf. This attack was also detected: TaintCheck successfully detects both that the format string supplied to vsnprintf is tainted and that the overwritten return address is tainted.

Performance
TaintCheck performance was measured using two “worst-case” workloads (a CPU-bound workload and a short-lived process workload) and, in addition, a common workload (a long-lived I/O-bound workload).
Each workload was timed natively, under Nullgrind (a Valgrind skin that adds no instrumentation), under Memcheck (Valgrind's memory checker), and under TaintCheck.
Test machine: 2.00 GHz Pentium 4 with 512 MB of RAM, running RedHat 8.0.

Performance
CPU-bound: bzip2, used to compress a 15 MB package of source code (Vim 6.2).
Short-lived processes: cfingerd, measuring how long cfingerd takes to start and serve a finger request.
Common case: Apache.
bzip2: normally 8.2 sec; Nullgrind 25.6 sec (3.1 times longer); Memcheck 109 sec (13.3 times longer); TaintCheck 305 sec (37.2 times longer).
cfingerd: Nullgrind 13 times longer; Memcheck 32 times longer; TaintCheck 36 times longer.
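The slowdown factors quoted for bzip2 follow directly from the measured times. A short check, using only the numbers on this slide (the variable names are arbitrary):

```python
NATIVE_SECONDS = 8.2   # bzip2 run natively

# Measured bzip2 run times in seconds under each tool.
measured = {"Nullgrind": 25.6, "Memcheck": 109.0, "TaintCheck": 305.0}

# Slowdown factor = instrumented time / native time, rounded to one decimal.
slowdown = {tool: round(sec / NATIVE_SECONDS, 1) for tool, sec in measured.items()}
```

The computed factors are 3.1, 13.3, and 37.2, matching the values in parentheses above.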

Performance: Apache – cont’d (common case)
For network services, the latency experienced is dominated by network and/or disk I/O, so the TaintCheck performance penalty should not be noticeable.

Improving Performance
First, some of the performance overhead is due to the implementation of Valgrind. Another x86 emulator, DynamoRIO, offers much better performance than Valgrind, due to better caching and other optimization mechanisms. Second, optimization can be performed on the instrumentation itself: each basic block can be analyzed to eliminate redundant tracking code.

Automatic signature generation
Once an exploit is detected, generate a signature to filter out matching exploit requests until the vulnerability is patched.
The approach is automatic semantic analysis of attack payloads, implemented using TaintCheck.
Previous approach, content pattern extraction: considers attack payloads as opaque byte sequences.
New approach, automatic semantic analysis: identifies which parts of the payload are useful in a signature.
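One way to picture semantic signature generation is to locate, inside the request, the bytes that ended up overwriting the protected value and to match on those. The sketch below is only an assumption-laden illustration: the function names, the little-endian 32-bit encoding, the choice to keep the three most significant bytes, and the example payload and address are all hypothetical.

```python
import struct
from typing import Optional, Tuple


def overwrite_signature(payload: bytes, overwrite_value: int,
                        keep_bytes: int = 3) -> Optional[Tuple[int, bytes]]:
    """Locate the overwrite value inside the payload and return (offset, pattern)."""
    encoded = struct.pack("<I", overwrite_value)   # little-endian, as on x86
    offset = payload.find(encoded)
    if offset < 0:
        return None
    # Keep only the most significant bytes; in little-endian order these are
    # the last bytes of the encoded value.
    skip = 4 - keep_bytes
    return offset + skip, encoded[skip:]


def matches(request: bytes, signature: Tuple[int, bytes]) -> bool:
    """Return True if the request carries the pattern at the recorded offset."""
    offset, pattern = signature
    return request[offset:offset + len(pattern)] == pattern


# Made-up exploit request: padding followed by a stack address.
payload = b"GET /" + b"A" * 16 + struct.pack("<I", 0xBFFFF4C0) + b" HTTP/1.0\r\n"
sig = overwrite_signature(payload, 0xBFFFF4C0)
```

Dropping the least significant byte lets the pattern still match variants of the exploit that jump to a slightly different address; a benign request such as `b"GET /index.html HTTP/1.0\r\n"` does not match.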

Conclusion
To combat the rapid spread of new worms, attack detection has to be automatic.
Dynamic taint analysis, implemented in TaintCheck, works without requiring source code or special compilation of a program.
It identifies the input that caused the exploit and the value used to overwrite the protected data (e.g., the return address).
TaintCheck enables automatic signature generation.

Questions?

Thank you.
