Auditing Closed-Source Software: Using reverse engineering in a security context
© 2001 by HalVar Flake
Speech Outline (I):
- Introduction to the topic: different approaches to auditing binaries
- Review of C/C++ programming mistakes and how to spot them in the binary
- Demonstration of finding a vulnerability in a binary
- Legal considerations
--- Break ---

Speech Outline (II):
- Problems encountered in the OOP world
- Manual structure & class reconstruction
- Automated structure & class reconstruction
- Automating the process of scanning for suspicious constructs
- Free time to answer questions and discuss the topic

Legal considerations
Technically, the reverse engineer breaks the license agreement between him and the software vendor, as he is forced to accept upon installation that he will not reverse engineer the program. The vendor could theoretically sue the reverse engineer and revoke the license. Depending on your local law, there are different ways to defend your position:

Legal considerations (EU)
EU law: 1991 EC Directive on the Legal Protection of Computer Programs
- Article 6 grants the right to decompilation for interoperability purposes
- Article 5.3 grants the right to decompilation for error correction purposes
- Under EU law, these rights cannot be contracted away

Legal considerations (USA)
US law: the final form of the DMCA includes exceptions to copyright for:
- Reverse engineering for interoperability
- Encryption research
- Security testing
Ask your lawyer whether these rights can be contracted away.

Why audit binaries?
If you're a blackhat:
- Many interesting systems (firewalls) run closed-source software
- You can annoy vendors by finding problems in their code
If you're a whitehat:
- New security vulnerabilities are every administrator's nightmare
- You can get an idea of how secure a particular application's code is

Approach A: Stress Testing
Long strings of data are generated more or less randomly and sent to the application, usually in an attempt to overflow every single string that gets parsed by a certain protocol.
Pros:
- Stress testing tools are reusable for a given protocol
- Will work automatically with little to no supervision
- Do not require specialized personnel to use
Cons:
- The analyzed protocol needs to be known in advance
- Complex problems involving several conditions at once will be missed
- Undocumented options and backdoors will be missed
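As a minimal sketch of the kind of input such a stress-testing tool produces, the helper below builds one overlong protocol line. The name `make_probe`, its parameters, and the `USER` keyword in the usage note are illustrative assumptions, not part of the talk.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Build "<prefix><fill_len copies of fill><suffix>" as a heap string.
 * Hypothetical helper for illustration; the caller frees the result. */
char *make_probe(const char *prefix, char fill, size_t fill_len,
                 const char *suffix)
{
    assert(prefix != NULL && suffix != NULL && fill != '\0');
    size_t plen = strlen(prefix);
    size_t slen = strlen(suffix);
    char *out = malloc(plen + fill_len + slen + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, prefix, plen);                        /* protocol keyword */
    memset(out + plen, fill, fill_len);               /* overlong field   */
    memcpy(out + plen + fill_len, suffix, slen + 1);  /* copies the NUL   */
    return out;
}
```

A call such as `make_probe("USER ", 'A', 1000, "\r\n")` would yield a 1007-byte line whose field is far longer than most fixed-size buffers. A real tool would vary the length and the field being overflowed.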

Approach B: Manual Audit
A reverse engineer carefully reads the disassembly of the program, tediously reconstructing the program flow and spotting programming errors. This was the approach Joey__ demonstrated at BlackHat Singapore.
Pros:
- Even the most complex issues can be spotted
Cons:
- The process is incredibly time-consuming and nearly infeasible for large applications
- A highly skilled and specialized auditor is needed
- There is an inherent danger that the auditor will burn out and thus miss obvious problems

Approach C: Looking for suspicious constructs
The reverse engineer tries to identify suspicious code constructs, then works his way backwards through the application to determine how this code is reached.
Pros:
- Reasonable depth: even relatively complex issues can be uncovered
- Saves time and work in comparison to Approach B
- The process of identifying suspicious code constructs can be partially automated
Cons:
- Not all problems will be uncovered
- Needs a highly specialized auditor
- Reading code backwards is very time-consuming and can be frustrating
- If nothing is found, the auditor is back to Approach B

Skills the auditor needs
- A good understanding of assembly language and compiler internals
- Good knowledge of C/C++ and the coding mistakes that lead to security vulnerabilities
- Only a good C/C++ code auditor can be a good binary auditor
- Lots and lots of endurance, patience and time

Tools the auditor needs
As disassembler: IDA Pro by Ilfak Guilfanov
- Can disassemble x86, SPARC, MIPS and much more...
- Includes a powerful scripting language
- Can recognize statically linked library calls
- Features a powerful plug-in interface
- Features a CPU Module SDK for self-developed CPU modules
- Automatically reconstructs arguments to standard calls via type libraries, allows parsing of C headers for adding new standard calls & types
- ... much more...

C/C++ code auditing recap
strcpy() and strcat()
Old news: any call to strcpy() or strcat() that copies non-static strings without proper bounds checking beforehand has to be considered dangerous.
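To make "proper bounds checking beforehand" concrete, here is a small sketch of a guarded copy. The wrapper name `checked_strcpy` is hypothetical. It illustrates what an auditor hopes to find in front of a strcpy() call and is not code from the talk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dest only if it fits, terminator included.
 * Returns 0 on success, -1 if src is too long (dest is left untouched). */
int checked_strcpy(char *dest, size_t destsize, const char *src)
{
    assert(dest != NULL && src != NULL);
    if (strlen(src) >= destsize)   /* need room for the trailing NUL */
        return -1;
    strcpy(dest, src);             /* safe: length was verified above */
    return 0;
}
```

In a binary, the absence of any such length comparison between the point where the source string enters the program and the call itself is what marks the call as suspicious.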

C/C++ code auditing recap
sprintf() and vsprintf()
Old news: any call to sprintf(), or to a homemade function that uses vsprintf(), that expands user-supplied data into a buffer by just using "%s" in the format string is dangerous.
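For contrast with an unbounded "%s" expansion, the sketch below uses snprintf() and checks its return value. It assumes a C99-conforming snprintf(). The function name `format_name` and the `user=` prefix are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Expand user-supplied data with "%s", but bounded by destsize.
 * Returns 0 if the whole string fit, -1 if it was truncated or failed. */
int format_name(char *dest, size_t destsize, const char *user)
{
    assert(dest != NULL && user != NULL && destsize > 0);
    int n = snprintf(dest, destsize, "user=%s", user);
    /* C99: n is the length that would have been written without a limit */
    return (n >= 0 && (size_t)n < destsize) ? 0 : -1;
}
```

With plain sprintf() the same call would write past dest as soon as the user data is longer than the remaining space.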

C/C++ code auditing recap
The *scanf() function family
Old news: any call to a member of the *scanf() function family that uses the "%s" format character in the format string to parse user-supplied data into a buffer is dangerous.
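The danger comes from "%s" having no length limit. A field width in the conversion removes it, as in this sketch (the name `parse_word` and the 16-byte buffer are assumptions chosen for the example):

```c
#include <assert.h>
#include <stdio.h>

/* Parse one whitespace-delimited word into a 16-byte buffer.
 * "%15s" stores at most 15 characters plus the terminating NUL.
 * Returns 0 if a word was read, -1 otherwise. */
int parse_word(const char *input, char out[16])
{
    assert(input != NULL && out != NULL);
    return sscanf(input, "%15s", out) == 1 ? 0 : -1;
}
```

When auditing, a "%s" without such a width in the format string is the construct to look for.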

C/C++ code auditing recap
The strncpy() pitfall
While strncpy() supports size checking, it does not guarantee NUL-termination of the destination buffer. So in cases where the code includes something like
    strncpy(destbuff, srcbuff, sizeof(destbuff));
problems will arise whenever the source is at least sizeof(destbuff) bytes long.

C/C++ code auditing recap
The strncpy() pitfall
Before the copy: [source string \x0] [data]
After copying the source into a smaller buffer, the destination string is no longer properly terminated:
After the copy: [destination string] [data with a \x0 somewhere]
Any subsequent operation that expects the string to be terminated will work on the data behind our original string as well.
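The missing terminator can be checked and avoided as sketched below. Both helper names are hypothetical; the fix shown (copy one byte less, then terminate by hand) is one common idiom and not necessarily the one the speaker had in mind.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the first n bytes of buf contain a NUL, i.e. buf holds
 * a string that is terminated within its own bounds. */
int is_terminated(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}

/* Bounded copy that always NUL-terminates.
 * Returns 1 if src had to be truncated, 0 otherwise. */
int copy_terminated(char *dest, size_t destsize, const char *src)
{
    assert(dest != NULL && src != NULL && destsize > 0);
    strncpy(dest, src, destsize - 1);  /* leave room for the terminator */
    dest[destsize - 1] = '\0';         /* strncpy() will not do this    */
    return strlen(src) >= destsize;
}
```

After `strncpy(raw, "ABCDEFGH", 4)` on a 4-byte array, `is_terminated(raw, 4)` is false, which is exactly the situation in which later string operations run on into the neighbouring data.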

C/C++ code auditing recap
The strncat() pitfall
Like strncpy(), strncat() supports size checking; unlike strncpy(), it guarantees proper termination of the string by writing a NUL after the last byte copied. Furthermore, the fact that strncat() usually has to handle dynamic values for len increases the risk of cast screwups.

C/C++ code auditing recap
The strncat() pitfall
Consider code like this:
    strncat(dest, src, sizeof(dest)-strlen(dest));
This will write an extra NUL behind the end of dest if the maximum size is fully utilized (the so-called poison NUL byte).
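The off-by-one comes from forgetting that strncat() writes its terminator in addition to the n bytes it copies. A sketch of the correct length computation, using a hypothetical helper named `strncat_room`:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Largest n that may be passed to strncat(dest, src, n) for a
 * destination array of destsize bytes: the free space minus one
 * byte reserved for the NUL that strncat() always appends. */
size_t strncat_room(const char *dest, size_t destsize)
{
    size_t used = strlen(dest);
    assert(used < destsize);        /* dest must already be terminated */
    return destsize - used - 1;
}
```

For an 8-byte dest holding "abc", this returns 4, so the longest possible result is 7 characters plus the terminator, which fits exactly.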

C/C++ code auditing recap
The strncat() pitfall
© 2001 by Thomas Dullien aka HalVar Flake
Furthermore, one has to be careful about handling the dynamic size_t len parameter:
    void foo(char *source1, char *source2)
    {
        char buff[100];
        strncpy(buff, source1, sizeof(buff)-1);
        strncat(buff, source2, sizeof(buff)-strlen(source1)-1);
    }
If source1 is longer than 99 bytes, sizeof(buff)-strlen(source1)-1 wraps around to a huge unsigned value.
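The wrap-around in the len expression can be reproduced in isolation. In this sketch `slide_len` mirrors the expression from the slide and `checked_len` is one possible repair; both names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The slide's expression: buffsize - strlen(source1) - 1.
 * size_t is unsigned, so a long source1 wraps below zero
 * and produces a huge length instead of a negative one. */
size_t slide_len(size_t buffsize, const char *source1)
{
    return buffsize - strlen(source1) - 1;
}

/* Possible repair: measure what is actually in the buffer and
 * clamp to zero when no room is left. */
size_t checked_len(size_t buffsize, const char *buff)
{
    size_t used = strlen(buff);
    return used + 1 < buffsize ? buffsize - used - 1 : 0;
}
```

With a 150-byte source1 and a 100-byte buffer, `slide_len` returns a value far above 100, so the strncat() call is effectively unbounded.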

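C/C++ code auditing recap
Cast Screwups
    void func(char *dnslabel)
    {
        char buffer[256];
        char *indx = dnslabel;
        int count;
        count = *indx;
        buffer[0] = '\x00';
        while (count != 0 && (count + strlen (buffer)) < sizeof (buffer) - 1) {
            strncat (buffer, indx, count);
            indx += count;
            count = *indx;
        }
    }
Where char is signed, a length byte of 0x80 or above makes count negative. Once buffer holds at least that many bytes, count + strlen(buffer) wraps around and passes the check, and strncat() receives count converted to a huge size_t.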
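The conversions involved in this loop can be examined one at a time. The three helpers below are hypothetical and only restate the arithmetic of the slide's code; they assume a platform where size_t is at least as wide as int.

```c
#include <assert.h>
#include <stddef.h>

/* What strncat() receives when it is handed an int count. */
size_t as_size(int count)
{
    return (size_t)count;
}

/* The loop condition from the slide. Because used is a size_t,
 * count is converted to unsigned before the addition. */
int slide_check(int count, size_t used, size_t bufsize)
{
    return count != 0 && (count + used) < bufsize - 1;
}

/* Possible repair: read the length byte as a value in 0..255. */
int label_length(const char *p)
{
    return (unsigned char)*p;
}
```

For example, `slide_check(-3, 5, 256)` is true because the unsigned sum wraps to 2, while `as_size(-3)` is far larger than the buffer. Reading the byte through `unsigned char` keeps count non-negative.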

C/C++ code auditing recap
Format String Vulnerabilities
Any call that passes user-supplied input directly to a *printf()-family function as the format string is dangerous. These calls can also be identified by their argument deficiency. Consider this code:
    printf("%s", userdata);
    printf(userdata);          /* argument deficiency */
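"Argument deficiency" means the format string asks for more arguments than the caller pushed. The sketch below counts how many arguments a format string consumes. It is a simplified illustration (the name `count_format_args` is made up, and positional `%n$` forms are not handled), not a full printf parser.

```c
#include <assert.h>
#include <string.h>

/* Count the variadic arguments a printf-style format string consumes.
 * Handles "%%" and '*' width/precision; ignores positional forms. */
int count_format_args(const char *fmt)
{
    assert(fmt != NULL);
    int n = 0;
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        p++;
        if (*p == '%')
            continue;                       /* literal percent sign */
        /* skip flags, width, precision and length modifiers */
        while (*p != '\0' && strchr("-+ #0123456789.*hlLqjzt", *p) != NULL) {
            if (*p == '*')
                n++;                        /* '*' takes an int argument */
            p++;
        }
        if (*p == '\0')
            break;                          /* truncated specification */
        n++;                                /* the conversion itself */
    }
    return n;
}
```

`printf(userdata)` supplies zero variadic arguments, so any userdata for which this count is above zero makes printf() read values from the stack that were never pushed.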

C/C++ code auditing recap
x86 Assembly Recap
    void *memcpy(void *dest, void *src, size_t n);
Assembly representation:
    push    4
    mov     eax, unkn_40D278
    push    eax
    lea     eax, [ebp+var_458]
    push    eax
    call    _memcpy

Finding it in the disassembly
strcpy() and strcat()
- This call targets a stack buffer
- The source is variable, not a static string

Finding it in the disassembly
sprintf() and vsprintf()
- Target buffer is a stack buffer
- Expanded strings are not static and not fixed in length
- Format string containing "%s"

Finding it in the disassembly
The *scanf() function family
- Format string contains "%s"
- Data is parsed into stack buffers

Finding it in the disassembly
The strncpy()/strncat() pitfall
- If the source is at least n (4000) bytes long, no NUL will be appended
- Copying data into a stack buffer again...

Finding it in the disassembly
The strncpy()/strncat() pitfall
- The target buffer is only n bytes long

Finding it in the disassembly
The strncat() pitfall
- Dangerous handling of the len parameter

Finding it in the disassembly
Cast Screwups
- Generally any function that uses a size_t for copying memory into a buffer (strncpy(), strncat(), fgets())
- The size_t has to be generated at run time and must not be hardcoded
- The size_t has to have been subtracted from, or loaded via a movsx assembler instruction, beforehand

Finding it in the disassembly
Format String Vulnerabilities
- Argument deficiency
- Format string is a dynamic variable

An Example: iWS 4.1 SHTML
Why go after iWS SHTML again?
- Earlier research has shown that the "improved" SHTML parsing code has not been written with security in mind
- Since it was written before the wide publication of format string bugs, it has probably not been audited for them yet
- I already had the file disassembled on my box, and disassembly takes way too long

An Example: iWS 4.1 SHTML
The INTlog_error() call
- printf()-like parsing of arguments
- Minimum stack correction for a dynamic format string is 0x1C - 4 = 0x18

An Example: iWS 4.1 SHTML
A suspicious construct
- The format string is dynamic
- We have an argument deficiency, as 0x14 < 0x18
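The comparison of stack corrections is slot arithmetic: with caller cleanup on 32-bit x86, the value in "add esp, N" after a call is four bytes per pushed argument. A minimal sketch, assuming that calling convention; the helper names are hypothetical and the figure of five fixed arguments in the usage note is inferred from 0x18 / 4 - 1, not stated in the slides.

```c
#include <assert.h>

/* Number of 4-byte stack slots released by "add esp, stack_corr"
 * (assumes 32-bit code where the caller cleans up the stack). */
unsigned pushed_slots(unsigned stack_corr)
{
    assert(stack_corr % 4 == 0);
    return stack_corr / 4;
}

/* 1 if fewer slots were pushed than the fixed arguments plus the
 * arguments the format string would consume, 0 otherwise. */
int arg_deficient(unsigned stack_corr, unsigned fixed_args, unsigned fmt_args)
{
    return pushed_slots(stack_corr) < fixed_args + fmt_args;
}
```

Under that assumption, `arg_deficient(0x14, 5, 1)` is true and `arg_deficient(0x18, 5, 1)` is false, matching the 0x14 < 0x18 observation: the call pushes no argument for the format string to consume.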

An Example: iWS 4.1 SHTML
Creating the format string (I)
- Creates the string passed to INTlog_error()

An Example: iWS 4.1 SHTML
Creating the format string (II)
- Bingo!
- Afterwards, user-supplied data is appended
- Some string-class size checking

An Example: iWS 4.1 SHTML
Creating the SHTML file
- An invalid SSI tag to trigger the error logging routine

An Example: iWS 4.1 SHTML
The happy end
- Exploitable user-supplied format string bug in iWS 4.1 SHTML parsing

--- BREAK ---

Advanced topics: Automation
A simple sprintf()-scanning script
Things to check for in a sprintf() call:
- Does the call expand a string using "%s"?
- Does the call target a stack buffer?
- Does the call suffer from an argument deficiency?
- If so, is the format string dynamic?

Advanced topics: Automation
Getting the stack correction
    static GetStackCorr(lpCall)
    {
        // Trace the code further until an "add esp, somevalue" is found
        // (loop while the instruction is NOT both "add" and on "esp")
        while((GetMnem(lpCall) != "add")||(GetOpnd(lpCall, 0) != "esp"))
            lpCall = Rfirst(lpCall);
        // Convert somevalue to a number and return it
        return(xtol(GetOpnd(lpCall, 1)));
    }

Advanced topics: Automation
Retrieving a string
    static GetBinString(eaString)
    {
        auto strTemp, chr;
        strTemp = "";               // zero the string
        chr = Byte(eaString);       // get a byte
        // Until either a NUL or a 0xFF is found, append one byte
        // at a time to the string, then return the string
        while((chr != 0)&&(chr != 0xFF)) {
            strTemp = form("%s%c", strTemp, chr);
            eaString = eaString + 1;
            chr = Byte(eaString);
        }
        return(strTemp);
    }

Advanced topics: Automation
Retrieving argument n
We must take the following steps to retrieve argument n of a certain function call:
- Locate the n-th push before the call
- If an immediate value is pushed, return that value (or the offset)
- If a register is pushed, find where it was last written to and return the value it was loaded with

    static GetArg(lpCall, n)
    {
        auto TempReg;
        // Trace back until the n-th push is found
        while(n > 0) {
            lpCall = RfirstB(lpCall);
            if(GetMnem(lpCall) == "push")
                n = n-1;
        }
        // Is the pushed operand a register?
        if(GetOpType(lpCall, 0) == 1) {
            // Find where the register was last written to and
            // return the value it was loaded with
            TempReg = GetOpnd(lpCall, 0);
            lpCall = RfirstB(lpCall);
            while(GetOpnd(lpCall, 0) != TempReg)
                lpCall = RfirstB(lpCall);
            return(GetOpnd(lpCall, 1));
        }
        else
            return(GetOpnd(lpCall, 0));
    }
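The same backward trace can be sketched in C over a plain array of decoded instructions. The `Insn` record and `get_arg` are hypothetical stand-ins for the disassembler database; like the IDC version, the sketch walks linearly and ignores branches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical decoded instruction; a stand-in for the IDA database. */
typedef struct {
    const char *mnem;      /* "push", "mov", "lea", "call", ...   */
    const char *op0;       /* first operand, or NULL              */
    const char *op1;       /* second operand, or NULL             */
    int op0_is_reg;        /* nonzero if op0 names a register     */
} Insn;

/* Return the operand text for argument n (1-based) of the call at
 * code[call_idx], or NULL if it cannot be determined. */
const char *get_arg(const Insn *code, int call_idx, int n)
{
    assert(code != NULL && n > 0);
    int i = call_idx;
    while (n > 0) {                        /* find the n-th push going back */
        if (--i < 0)
            return NULL;
        if (strcmp(code[i].mnem, "push") == 0)
            n--;
    }
    if (!code[i].op0_is_reg)
        return code[i].op0;                /* immediate or offset */
    const char *reg = code[i].op0;
    while (--i >= 0) {                     /* find the last write to reg */
        if (strcmp(code[i].mnem, "push") != 0 &&
            code[i].op0 != NULL && strcmp(code[i].op0, reg) == 0)
            return code[i].op1;
    }
    return NULL;
}
```

Applied to the six-instruction memcpy() listing from the x86 recap, argument 1 resolves to `[ebp+var_458]`, argument 2 to `unkn_40D278` and argument 3 to the immediate `4`.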

    static AuditSprintf(lpCall)
    {
        auto fString, fStrAddr, buffTarget;
        buffTarget = GetArg(lpCall, 1);
        fString = GetArg(lpCall, 2);
        // Clean up the arguments
        if(strstr(fString, "offset") != -1)
            fString = substr(fString, 7, -1);
        fStrAddr = LocByName(fString);
        fString = GetBinString(fStrAddr);
        // Check for argument deficiency
        if(GetStackCorr(lpCall) < 12)
            // Check for a dynamic format string
            if(strlen(fString) < 2)
                Message("%lx --> Format String Problem ?\n", lpCall);
        // Check for "%s" in format string
        if(strstr(fString, "%s") != -1)
            // Check if the target is a stack variable
            if(strstr(buffTarget, "var_") != -1)
                Message("%lx --> Overflow problem ? \"%s\"\n", lpCall, fString);
    }

    static main()
    {
        auto FuncAddr, xref;
        // Ask auditor to enter the address of the sprintf()
        FuncAddr = AskAddr(-1, "Enter address:");
        // Call the auditing function once for each call to sprintf()
        // (RfirstB/RnextB walk the code references TO the function)
        xref = RfirstB(FuncAddr);
        while(xref != -1) {
            if(GetMnem(xref) == "call")
                AuditSprintf(xref);
            xref = RnextB(FuncAddr, xref);
        }
        // Repeat for all indirect calls
        xref = DfirstB(FuncAddr);
        while(xref != -1) {
            if(GetMnem(xref) == "call")
                AuditSprintf(xref);
            xref = DnextB(FuncAddr, xref);
        }
    }

Advanced topics: Automation
A simple strncpy()-scanning script
Things to check for in a strncpy() call:
- Is the target buffer a stack variable?
- Is the maxlen parameter equal to the estimated size of the target buffer?
- Is the source buffer a non-static string?

Advanced topics: Automation
Estimating Stack Buffer size
    static StckBuffSize(lpCall, cName)
    {
        auto frameID, ofs, count;
        frameID = GetFrame(lpCall);
        // Clean up name
        while(strstr(cName, "+") != -1)
            cName = substr(cName, strstr(cName, "+")+1, strlen(cName));
        cName = substr(cName, 0, strlen(cName)-1);
        ofs = GetMemberOffset(frameID, cName);
        // Walk stackframe until another var is found
        count = ofs + 1;
        while(GetMemberName(frameID, count) == "")
            count = count + 1;
        count = count-ofs;
        return count;
    }
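The estimate amounts to "distance to the next named variable in the frame". A C sketch of that idea over a sorted table of frame variables follows; the `FrameVar` record, the function name and the sample offsets in the usage note are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stack-frame entry: name and offset within the frame. */
typedef struct {
    const char *name;
    long offset;
} FrameVar;

/* vars must be sorted by ascending offset. Returns the distance from
 * the named variable to the next one, or -1 if the variable is unknown
 * or is the last entry (no upper neighbour to measure against). */
long stack_buf_size(const FrameVar *vars, size_t nvars, const char *name)
{
    assert(vars != NULL && name != NULL);
    for (size_t i = 0; i < nvars; i++) {
        if (strcmp(vars[i].name, name) == 0) {
            if (i + 1 == nvars)
                return -1;
            return vars[i + 1].offset - vars[i].offset;
        }
    }
    return -1;
}
```

With variables at -0x458, -0x58 and -4, the buffer at -0x458 is estimated at 0x400 bytes. The result is an upper bound, since padding or unreferenced variables in between would be counted as part of the buffer.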

Advanced topics: Automation
The AudStrncpy() function
    static AudStrncpy(lpCall)
    {
        auto buffTarget, buffSrc, maxlen;
        auto srcString;
        // Retrieve arguments
        buffTarget = GetArg(lpCall, 1);
        buffSrc = GetArg(lpCall, 2);
        maxlen = GetArg(lpCall, 3);
        // Check stack buffer size against maxlen
        if(StckBuffSize(lpCall, buffTarget) <= xtol(maxlen)) {
            // Check for non-static source buffer
            if(strlen(GetBinString(LocByName(buffSrc)))<2)
                Message("Suspicious strncpy() at %lx !\n", lpCall);
        }
    }

Advanced topics
Structure reconstruction (I)
- Frequently, large structures on the heap are used to hold connection data, error strings and the like
- IDA cannot yet reconstruct those structures
- In order to check strncpy() and similar calls, one has to estimate the size of individual structure members

Advanced topics
Structure reconstruction (II)
- Access to structure members

Automating the boring parts
Automated struc reconstruction
- Reconstructed struc members, which can now be named as we wish

Automating the boring parts
bas_objrec.idc results

C++ specific topics
Problems with auditing OOP
- Since the class data structure is unknown, estimating buffer size is hard. This leads to problems when analyzing certain function calls (e.g. strncpy())
- Most overflows/problems occur in heap memory
- If dangerous constructs exist, it is hard to evaluate the risk they pose, as it is difficult to determine what is overwritten

C++ specific topics
Reconstructing classes
Many classes have a vtable that lists all methods for that class. This table gives the reverse engineer a list of functions that all operate upon the same structure (the class itself). By using something like the bas_objrec.idc script, one can reconstruct the class data structure and thus the member boundaries.
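One way to picture the recovery of member boundaries: collect every offset at which the methods in the vtable access the object, then sort and deduplicate. The C sketch below illustrates that idea only. It is not a description of what bas_objrec.idc does, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int cmp_long(const void *a, const void *b)
{
    long x = *(const long *)a;
    long y = *(const long *)b;
    return (x > y) - (x < y);
}

/* Sort and deduplicate the observed access offsets in place.
 * Returns the number of distinct offsets, i.e. candidate members. */
size_t member_offsets(long *offs, size_t n)
{
    assert(offs != NULL || n == 0);
    if (n == 0)
        return 0;
    qsort(offs, n, sizeof *offs, cmp_long);
    size_t out = 0;
    for (size_t i = 0; i < n; i++)
        if (out == 0 || offs[out - 1] != offs[i])
            offs[out++] = offs[i];
    return out;
}

/* Upper bound on the size of the member starting at offs[i],
 * or -1 for the last member (its end is unknown). */
long member_size(const long *offs, size_t n, size_t i)
{
    return (i + 1 < n) ? offs[i + 1] - offs[i] : -1;
}
```

If the methods touch offsets 0, 4, 0x104 and 0x108, the member at offset 4 can be at most 0x100 bytes long, which is the size a strncpy() into it should be checked against.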

Further reading
RE-oriented webpages:
- Home of the IDA Pro disassembler
- Cristina Cifuentes' decompilation page
- REC: Reverse engineering compiler

Advanced topics
Open discussion concerning reverse engineering