Windows Heap Exploitation (Win2KSP0 through WinXPSP2)

Original CanSecWest 04 presentation: Matt Conover & Oded Horovitz. XP SP2 additions added and presented by Matt Conover @ SyScan 2004.

Agenda
- “Practical” Windows heap internals
- How to exploit Win2K to WinXP SP1 heap overflows
- 3rd party (me) assessment of WinXP SP2 improvements
- How to exploit WinXP SP2 heap overflows
- Summary

Windows Heap Internals
- Many heaps can coexist in one process (normally 2-3).
- Diagram labels: PEB with 0x0010 Default Heap, 0x0080 Heaps Count, 0x0090 Heap List; Default Heap at 0x70000 and 2nd Heap at 0x170000.
- The Process Environment Block (PEB) is a structure that maintains a set of global variables unique to the process.
- The PEB is pointed to by each of the process's Thread Environment Blocks (TEBs), which in turn are pointed to by the special register FS.
- The PEB address is the same across all processes in the system. By default the address of the PEB is 0x7FFDF000.
- The only case in which the PEB is not located at the default address is when the system is configured to reserve 3GB for user-mode applications instead of the default 2GB.

Windows Heap Internals: Important heap structures
- Segments
- Segment List
- Virtual Allocation list
- Free Lists
- Lookaside List

Windows Heap Internals: Introduction to Free Lists
- 128 doubly-linked lists of free chunks (from 8 bytes to 1024 bytes)
- Chunk size is table row index * 8 bytes
- Entry [0] is a variable-sized free list that contains buffers of 1KB <= size < 512KB, sorted in ascending order
- Diagram: entry [0] holds chunks of 1400, 2000, 2000, and 2408 bytes; row 2 holds two 16-byte chunks and row 6 holds two 48-byte chunks
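The row-index rule on this slide can be sketched in a few lines of C. This is only an illustration of the arithmetic; the function and macro names are hypothetical and do not come from NTDLL.

```c
#include <assert.h>
#include <stddef.h>

#define GRANULARITY    8u    /* chunk sizes are multiples of 8 bytes (from the slide) */
#define NUM_FREE_LISTS 128u  /* FreeLists[0] .. FreeLists[127] */

/* Hypothetical helper: map a chunk size in bytes to its free list row.
   Sizes of 1KB and above do not fit a dedicated row, so they map to
   row 0, the variable-sized list. */
unsigned free_list_index(size_t chunk_bytes)
{
    assert(chunk_bytes % GRANULARITY == 0);
    size_t units = chunk_bytes / GRANULARITY;
    return units < NUM_FREE_LISTS ? (unsigned)units : 0u;
}
```

With this rule a 16-byte chunk lands in row 2 and a 48-byte chunk in row 6, matching the slide's diagram, while a 1400-byte chunk goes to entry [0].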

Windows Heap Internals: Lookaside Table
- Used for “fast” allocates and deallocates when available
- Starts empty
- 128 singly-linked lists of busy chunks (free but left marked as busy)
- Diagram: rows 1-6, with one 16-byte chunk in row 2 and two 48-byte chunks in row 6

Windows Heap Internals: Why have lookasides at all?
- Speed!
- Singly-linked
- Used to quickly allocate or deallocate
- No coalescing (leads to fragmentation)
- So the lookaside lists “fill up” quickly (4 entries)

Windows Heap Internals: Basic chunk structure (8 bytes)
- Header fields, in overflow direction: Self Size (bytes 1-2), Previous chunk size (bytes 3-4), Segment Index (byte 5), Flags (byte 6), Unused bytes (byte 7), Tag index (Debug) (byte 8)
- Flag values: 0x01 Busy, 0x02 Extra present, 0x04 Fill pattern, 0x08 Virtual Alloc, 0x10 Last entry, 0x20 FFU1, 0x40 FFU2, 0x80 No coalesce
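As a reading aid, the 8-byte header and its flag bits can be written as a C struct. The field and constant names below are hypothetical; only the field order, the total size, and the flag values are taken from the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits as listed on the slide. */
enum chunk_flags {
    CHUNK_BUSY          = 0x01,
    CHUNK_EXTRA_PRESENT = 0x02,
    CHUNK_FILL_PATTERN  = 0x04,
    CHUNK_VIRTUAL_ALLOC = 0x08,
    CHUNK_LAST_ENTRY    = 0x10,
    CHUNK_FFU1          = 0x20,
    CHUNK_FFU2          = 0x40,
    CHUNK_NO_COALESCE   = 0x80
};

/* Pre-SP2 chunk header as drawn on the slide (names are illustrative). */
struct chunk_header {
    uint16_t self_size;      /* bytes 1-2 */
    uint16_t prev_size;      /* bytes 3-4 */
    uint8_t  segment_index;  /* byte 5 */
    uint8_t  flags;          /* byte 6 */
    uint8_t  unused_bytes;   /* byte 7 */
    uint8_t  tag_index;      /* byte 8, debug only */
};

_Static_assert(sizeof(struct chunk_header) == 8, "header must be 8 bytes");

/* Returns nonzero when the Busy bit is set. */
int chunk_is_busy(const struct chunk_header *h)
{
    assert(h != NULL);
    return (h->flags & CHUNK_BUSY) != 0;
}
```

Because the header sits directly in front of the user data, an overflow from the previous chunk reaches these fields in the order listed.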

Windows Heap Internals: Free chunk structure (16 bytes)
- Same 8-byte header as a busy chunk: Self Size, Previous chunk size, Segment Index, Flags, Unused bytes, Tag index (Debug)
- Followed by two list pointers: Next chunk (Flink), then Previous chunk (Blink)

Windows Heap Internals: Allocation algorithm (high level)
- If size >= 512K, virtual memory is used (not on heap)
- If < 1K, first check the Lookaside lists. If there are no free entries on the Lookaside, check the matching free list
- If >= 1K or no matching entry was found, use the heap cache (not discussed in this presentation)
- If >= 1K and no free entry in the heap cache, use FreeLists[0] (the variable-sized free list)
- If still can’t find any free entry, extend the heap as needed
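The size thresholds above amount to a small decision function. The sketch below only classifies which source is tried first for a given request size; the enum and function names are made up for illustration, and the fallbacks are described in comments rather than implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Which allocator path is consulted first (illustrative names). */
enum alloc_source {
    SRC_LOOKASIDE,   /* < 1K: lookaside, then the matching free list */
    SRC_HEAP_CACHE,  /* >= 1K: heap cache, then FreeLists[0] */
    SRC_VIRTUAL      /* >= 512K: virtual memory, not on the heap */
};

enum alloc_source first_source(size_t bytes)
{
    assert(bytes > 0);
    if (bytes >= 512u * 1024u)
        return SRC_VIRTUAL;
    if (bytes < 1024u)
        return SRC_LOOKASIDE;
    return SRC_HEAP_CACHE;
}
```

If every listed source comes up empty, the heap is extended, which this sketch does not model.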

Windows Heap Internals: Allocate algorithm, FreeLists[0]
- This is usually what happens for chunk sizes > 1K
- FreeLists[0] is sorted from smallest to biggest
- Check FreeLists[0]->Blink (the biggest block) to see if it is big enough
- Then return the smallest free entry from FreeLists[0] that fulfills the request, like this:
  while (Entry->Size < NeededSize)
      Entry = Entry->Flink;
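A self-contained version of that search, assuming a circular doubly-linked list with a sentinel head (the struct layout and names are illustrative, not the real NTDLL types):

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative free list entry; the list is circular and sorted by
   ascending size, with `head` acting as a sentinel. */
struct free_entry {
    struct free_entry *flink;
    struct free_entry *blink;
    size_t size;
};

/* Returns the smallest entry that satisfies `needed`, or NULL if even
   the biggest entry (head->blink) is too small. */
struct free_entry *find_fit(struct free_entry *head, size_t needed)
{
    assert(head != NULL);
    struct free_entry *biggest = head->blink;
    if (biggest == head || biggest->size < needed)
        return NULL;

    /* The biggest entry fits, so this walk stops before reaching head. */
    struct free_entry *entry = head->flink;
    while (entry->size < needed)
        entry = entry->flink;
    return entry;
}
```

Checking the tail first is what lets the walk skip any end-of-list test: once the biggest block is known to fit, the loop must terminate on a real entry.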

Windows Heap Internals: Allocate algorithm, Virtual Allocate
- Used when ChunkSize > VirtualAlloc threshold (508K)
- A virtual allocate header is placed at the beginning of the buffer
- The buffer is added to the busy list of virtually allocated buffers (this is what Halvar’s VirtualAlloc overwrite is faking)

Windows Heap Internals: Free Algorithm (high level)
- If the chunk < 512K, it is returned to a lookaside or free list
- If the chunk < 1K, put it on the lookaside (can only hold 4 entries)
- If the chunk < 1K and the lookaside is full, put it on the free list
- If the chunk > 1K, put it on the heap cache (if present) or FreeLists[0]

Windows Heap Internals: Free Algorithm, Free to Lookaside
- Free a buffer to the Lookaside list only if:
  - The lookaside is available (e.g., present and unlocked)
  - Requested size is < 1K (to fit the table)
  - Lookaside is not “full” yet (no more than 3 entries already)
- To add an entry to the Lookaside:
  - Put it at the head of the Lookaside
  - Point it to the former head of the Lookaside
  - Keep the buffer flags set to busy (to prevent coalescing)
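The lookaside behaves like a depth-limited LIFO stack. A minimal sketch, assuming a plain singly-linked list and a depth counter (the types and names are hypothetical, and locking is left out):

```c
#include <assert.h>
#include <stddef.h>

#define LOOKASIDE_MAX_DEPTH 4  /* "no more than 3 entries already" before a push */

struct la_chunk  { struct la_chunk *next; };
struct lookaside { struct la_chunk *head; unsigned depth; };

/* Push a freed chunk onto the lookaside. Returns 1 on success, or 0
   when the list is full and the caller must use the free list instead.
   The chunk's Busy flag would be left set; that is not modeled here. */
int lookaside_push(struct lookaside *la, struct la_chunk *c)
{
    assert(c != NULL);
    if (la == NULL || la->depth >= LOOKASIDE_MAX_DEPTH)
        return 0;
    c->next = la->head;  /* point to the former head */
    la->head = c;        /* become the new head */
    la->depth++;
    return 1;
}

/* Pop the most recently freed chunk, or NULL if the list is empty. */
struct la_chunk *lookaside_pop(struct lookaside *la)
{
    struct la_chunk *c = la->head;
    if (c == NULL)
        return NULL;
    la->head = c->next;
    la->depth--;
    return c;
}
```

Note that the pop trusts whatever `next` pointer is stored inside the chunk, which is why the later slides treat a freed lookaside chunk as an interesting overflow target.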

Windows Heap Internals: Free Algorithm, Coalesce
- Diagram: three adjacent buffers A, B, C
- Step 1: Buffer free
- Step 2: Buffer removed from free list (A + B coalesced)
- Step 3: Buffer removed from free list (A + B + C coalesced)
- Step 4: Buffer placed back on the free list

Windows Heap Internals: Free Algorithm, Coalesce
Where coalesce cannot happen:
- Chunk to be freed is virtually allocated
- Chunk to be freed will be put on Lookaside
- Chunk to be coalesced with is busy
- Highest bit in chunk flags is set
…

Windows Heap Internals: Free Algorithm, Coalesce (cont)
Where coalesce cannot happen:
- Chunk to be freed is first -> no backward coalesce
- Chunk to be freed is last -> no forward coalesce
- The size of the coalesced chunk would be >= 508K

Windows Heap Internals: Summary. Questions?
Just remember:
- Lookasides are allocated from and freed to before free lists
- FreeLists[0] is mainly used for 1K <= ChunkSize < 512K
- Coalescing only happens for entries going onto a FreeList, not a lookaside list
- Entries on a certain lookaside will stay there until they are allocated from

Heap Exploitation: Basic Terms
- 4-byte Overwrite: able to overwrite any arbitrary 32-bit address (WhereTo) with an arbitrary 32-bit value (WithWhat)
- 4-to-n-byte Overwrite: using a 4-byte overwrite to indirectly cause an overwrite of an arbitrary n bytes

Arbitrary Memory Overwrite Explained: Coalesce-On-Free 4-byte Overwrite
- Utilizes the coalescing algorithms of the heap
- This is the method first discussed by Oded and me at CSW04; it is our preferred method for reliable heap exploitation on all versions < XPSP2
- Just make sure to fill the Lookaside[ChunkSize] (put 4 entries on heap) before freeing a chunk of ChunkSize to ensure coalescing
- The arbitrary overwrite happens when the overflowed buffer gets freed
- Fake chunk layout, from the overflow start: Index < 64, Flags != 1, Fake Flink (WithWhat), Fake Blink (WhereTo)
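Why an unchecked unlink yields a 4-byte overwrite is easiest to see in a toy model. The sketch below uses ordinary C variables in place of heap memory; `unsafe_unlink` is a hypothetical stand-in for the pre-SP2 list removal, not the actual NTDLL routine.

```c
#include <assert.h>
#include <stddef.h>

struct list_entry {
    struct list_entry *flink;  /* next */
    struct list_entry *blink;  /* previous */
};

/* Classic doubly-linked removal with no consistency checks: it trusts
   both pointers stored in the entry being removed. */
void unsafe_unlink(struct list_entry *e)
{
    assert(e != NULL);
    struct list_entry *f = e->flink;
    struct list_entry *b = e->blink;
    b->flink = f;  /* *WhereTo = WithWhat */
    f->blink = b;  /* side effect: WithWhat must itself be writable */
}
```

If an overflow sets the entry's Flink to WithWhat and its Blink to WhereTo, the first store writes WithWhat to the location WhereTo. The second store writes back into the memory WithWhat points at, so that location has to be writable too.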

Arbitrary Memory Overwrite: Lookaside List Head Overwrite (4-to-n-byte overwrite)
- What we want to do is overwrite a Lookaside list head and then allocate from it
- We must be the first one to allocate that size
- We will get a chunk back pointing to whatever location in memory we want
- Use this to overwrite a function pointer or put the shellcode at a known writable location

Arbitrary Memory Overwrite: Lookaside List Head Overwrite, How To
- Use the Coalesce-on-Free Overwrite, with these values:
  - FakeChunk.Blink = &Lookaside[ChunkSize], where ChunkSize is a pretty infrequently allocated size
  - FakeChunk.Flink = what we want a pointer to
- To calculate the FakeChunk.Blink value:
  - LookasideTable = HeapBase + 0x688
  - Index = (ChunkSize/8)+1
  - FakeChunk.Blink = LookasideTable + Index * EntrySize (0x30)
- Set FakeChunk.Flags = 0x20, FakeChunk.Index = 1-63, FakeChunk.PreviousSize = 1, FakeChunk.Size = 1

Exploitation Made Simple
- Overwrite the PEB lock routine to point to PEB space
- Put shellcode into PEB space
- Then cause the PEB lock routine to execute
- Diagram: PEB header; PEB lock/unlock function pointers at 0x7ffdf020 and 0x7ffdf024; ~1k of payload at 0x7ffdf130

Exploitation Made Simple: Win2K through WinXP SP1 in a single attempt
- First 4-byte overwrite: Blink = 0x7ffdf020, Flink = 0x7ffdf154
- 4-to-n-byte overwrite: Blink = &Lookaside[(n/8)+1]
- Be the first to allocate n bytes (cause HeapAlloc(n)) and put your shellcode into the returned buffer
- All done! Either wait, or cause a crash immediately. For example, do a 4-byte overwrite with Blink = 0xABABABAB

Exploitation Made Simple: Forcing Shellcode To Run
- Most applications (read: everyone but MSSQL) don’t specially handle access violations
- An access violation results in ExitProcess() being called
- Once the process attempts to exit, ExitProcess() is called
- The first thing ExitProcess() does is call the PEB lock routine
- Thus, causing a crash = instant shellcode execution
- Nice

Exploitation Made Simple Demo

Heap Exploitation: Questions?
- The technique we just covered is very reliable, providing success almost every time on all Win2K (all service packs) and WinXP (up to, but not including, SP2)
- On to XP SP2…

XP Service Pack 2: Effects on Heap Exploitation
- New low fragmentation heap for chunks < 16K
- PEB “shuffling” (aka randomization)
- New security cookie in each heap chunk
- Safe unlinking: (usually) stops 4-byte overwrites

XP Service Pack 2: PEB Randomization
- In theory, it could have a big impact on heap exploitation, though not in reality
- Prior to XP SP2, the PEB was always at the highest page available (0x7ffdf000)
- The first (and ONLY the first) TEB is also randomized
- They seem to never be below 0x7ffd4000

XP Service Pack 2: PEB Randomization. Does it make any difference?
- Not much; randomization is definitely a misnomer
- If 2 threads are present: we can write to 0x7ffdf000-0x7ffdffff, and 2 other pages between 0x7ffd4000-0x7ffdefff
- If 3 threads are present: 0x7ffde000-0x7ffdffff …
- If 11 threads are present: 100% success, no empty pages

XP Service Pack 2: PEB Randomization, Summary
Provides little protection for…
- Any application that has m workers per n connections (IIS? Exchange?)
- Any service in dllhost/services/svchost or any other “active” surrogate process

XP Service Pack 2: Heap header cookie
- Reminder: the overflow direction runs from byte 1 to byte 8
- XP SP2 header: Self Size, Previous chunk size, New Cookie, Flags, Unused bytes, Segment Index
- Current (pre-SP2) header: Self Size, Previous chunk size, Segment Index, Flags, Unused bytes, Tag index (Debug)

XP Service Pack 2: Heap header cookie calculation
- If ((AddressOfChunkHeader / 8) XOR Chunk->Cookie XOR Heap->Cookie != 0), the chunk is CORRUPT
- Since the cookie has only 8 bits, it has 2^8 = 256 possible values
- A random guess of the security cookie will be right, on average, 1 of every 256 attempts
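The check and the 1-in-256 figure can be reproduced with a short model. The sketch assumes the comparison is done on the low 8 bits only (the slide says the cookie is 8 bits wide but does not spell out the truncation), and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Cookie a valid chunk at header_addr would carry (assumed 8-bit truncation). */
uint8_t make_cookie(uint32_t header_addr, uint8_t heap_cookie)
{
    assert(header_addr % 8u == 0);
    return (uint8_t)((header_addr / 8u) ^ heap_cookie);
}

/* Mirrors the slide's test: a nonzero XOR result means CORRUPT. */
int cookie_ok(uint32_t header_addr, uint8_t chunk_cookie, uint8_t heap_cookie)
{
    return (uint8_t)((header_addr / 8u) ^ chunk_cookie ^ heap_cookie) == 0;
}

/* Count how many of the 256 possible guesses pass the check. */
unsigned count_valid_guesses(uint32_t header_addr, uint8_t heap_cookie)
{
    unsigned guess, hits = 0;
    for (guess = 0; guess < 256; guess++)
        if (cookie_ok(header_addr, (uint8_t)guess, heap_cookie))
            hits++;
    return hits;
}
```

Exactly one guess out of 256 passes for any given address and heap cookie, which is where the slide's average comes from.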

XP Service Pack 2
- On a normal WinXP SP2 system, corrupting a chunk will do nothing
- Since we only overwrite the Flink/Blink of the chunk, we corrupt no other chunks
- Thus we can keep trying until we run out of memory

XP Service Pack 2: Summary so far…
- At this point, we see that, with enough time, we can trivially defeat all the other protection mechanisms
- On to “safe” unlinking…

XP Service Pack 2: Safe Unlinking
- Diagram: list A <-> B <-> C, where B is the header to free
- Safe unlinking means that RemoveListEntry(B) will make this check: (B->Flink)->Blink == B && (B->Blink)->Flink == B
- In other words: C->Blink == B && A->Flink == B
- Can it be evaded? Yes, in one particular case.
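The check can be modeled directly. This is a minimal sketch with a hypothetical `safe_unlink` function operating on plain C structs, not the SP2 implementation itself.

```c
#include <assert.h>
#include <stddef.h>

struct list_entry {
    struct list_entry *flink;  /* next */
    struct list_entry *blink;  /* previous */
};

/* Unlink e only if both neighbours still point back at it.
   Returns 1 if the entry was removed, 0 if the check failed. */
int safe_unlink(struct list_entry *e)
{
    assert(e != NULL);
    struct list_entry *f = e->flink;
    struct list_entry *b = e->blink;
    if (f->blink != e || b->flink != e)
        return 0;  /* list is inconsistent: refuse to write */
    b->flink = f;
    f->blink = b;
    return 1;
}
```

For a well-formed list A <-> B <-> C, removing B succeeds. An entry whose Flink and Blink were replaced with arbitrary addresses fails the test, because those addresses do not contain pointers back to the entry, and so no write takes place.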

XP Service Pack 2: UnSafe-Unlinking FreeList Overwrite Technique
p = HeapAlloc(n);
FillLookaside(n);
HeapFree(p);
EmptyLookaside(n);
Overwrite p[0] (somewhere on the heap) with:
    p->Flags = Busy (to prevent accidental coalescing)
    p->Flink = (BYTE *)&ListHead[(n/8)+1] - 4
    p->Blink = (BYTE *)&ListHead[(n/8)+1] + 4
HeapAlloc(n); // defeats safe unlinking (ignore result)
p = HeapAlloc(n); // defeats safe unlinking
// p now points to &ListHead[(n/8)].Blink

XP Service Pack 2: Defeating Safe Unlinking (before overwrite)
- Diagram: ListHead[n-1], ListHead[n], and ListHead[n+1], each with Flink at offset [0] and Blink at offset [4]; FreeChunk (Flink at [0], Blink at [4]) is linked to ListHead[n]

XP Service Pack 2: Defeating Safe Unlinking, Step 1 (Overwrite)
- Diagram: ListHead[n-1], ListHead[n], ListHead[n+1], and FreeChunk, each with Flink at offset [0] and Blink at offset [4], after FreeChunk has been overwritten
- Now call HeapAlloc(n) to unlink FreeChunk from ListHead
- FreeChunk->Blink->Flink == *(*(FreeChunk+4)+0)
- FreeChunk->Flink->Blink == *(*(FreeChunk+0)+4)
- Both point to FreeChunk, unlink proceeds!

XP Service Pack 2: Defeating Safe Unlinking, Step 2 (1st alloc)
- Diagram: ListHead[n-1], ListHead[n], and ListHead[n+1], each with Flink at offset [0] and Blink at offset [4]
- FreeChunk->Blink->Flink = FreeChunk->Flink
- FreeChunk->Flink->Blink = FreeChunk->Blink
- Returns pointer to previous FreeChunk

XP Service Pack 2: Defeating Safe Unlinking, Step 3 (2nd alloc)
- Diagram: ListHead[n-1], ListHead[n], and ListHead[n+1], each with Flink at offset [0] and Blink at offset [4]
- Returns pointer to &ListHead[n-1].Blink
- Now the FreeLists point to whatever data the user puts in it

XP Service Pack 2 Questions?

XP Service Pack 2: Unsafe-Unlinking FreeList Overwrite Technique
- For vulnerabilities where you can control the allocation size, safe unlinking can be evaded
- But is this reliable? Hardly. …

XP Service Pack 2: Unsafe-Unlinking FreeList Overwrite Technique (cont)
- We have to flood the heap with this repeating 8-byte sequence: [FreeListHead-4][FreeListHead+4]
- And hope the chunk’s Flink/Blink pair is within the range we can overflow
- But there is an even easier method…

XP Service Pack 2: Chunk-on-Lookaside Overwrite Technique
- In fact on XP SP2, there is an even easier method
- Lookaside lists take precedence over free lists
- This is quite convenient because lookaside lists (singly linked) are easier to exploit than the free lists (doubly linked)

XP Service Pack 2: Chunk-on-Lookaside Overwrites
- HeapAlloc checks the lookaside before the free list
- There is no check to see if the cookie was overwritten since it was freed
- It is a singly-linked list, thus the safe unlinking check doesn’t apply
- Result: a clean exploitation technique (albeit with brute-forcing required)

XP Service Pack 2: Chunk-on-Lookaside Overwrites (Technique Summary)
// We need at least 2 entries on lookaside
a_n[0] = HeapAlloc(n)
a_n[1] = HeapAlloc(n)
HeapFree(a_n[1])
HeapFree(a_n[0])
Overwrite a_n[0] (somewhere on the heap) with:
    a_n[0].Flags = Busy (to prevent accidental coalescing)
    a_n[0].Flink = AddressWeWant
HeapAlloc(n) // discard, this returns a_n[0]
p = HeapAlloc(n)
// p now points to AddressWeWant

XP Service Pack 2: Chunk-on-Lookaside Overwrite, Success rate?
- Requires overwriting a chunk already freed to the lookaside
- If an attacker overflows a buffer repeatedly, how many times will he/she need to do so before succeeding?

XP Service Pack 2: Chunk-on-Lookaside Overwrite, Empirical results
- 64K heap with 1 segment
- All chunk sizes between 8-1024 bytes
- Max overflow size = 1016 bytes
- Random number of allocs between 10-1000
- Free probability of 50%
- Took an average of 84 allocations to be within overflow range
- It will take at least 2 overwrites (one to overwrite a function pointer, one to place shellcode)

XP Service Pack 2: Chunk-on-Lookaside Overwrite, Empirical results
- Application-specific function pointer and writable location for shellcode: 84*2 = 168 attempts to execute shellcode
- Using PEB lock routine + PEB space (application generic): 84*2*12 = 2,016 attempts to execute shellcode
- The 12 is for the 12 possible locations of the PEB due to PEB randomization
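The two attempt counts are a straight product of the figures quoted on these slides. A small helper (hypothetical name) makes the arithmetic explicit:

```c
#include <assert.h>

/* Expected attempts = average allocations to get within overflow range
   * number of overwrites needed * candidate PEB locations.
   Pass peb_locations = 1 for an application-specific target. */
unsigned long expected_attempts(unsigned avg_allocs,
                                unsigned overwrites,
                                unsigned peb_locations)
{
    assert(peb_locations >= 1);
    return (unsigned long)avg_allocs * overwrites * peb_locations;
}
```

Calling it with (84, 2, 1) gives 168 and with (84, 2, 12) gives 2016, the two numbers on the slide.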

XP Service Pack 2: Chunk-on-Lookaside Overwrite, Summary
- A non-application-specific heap exploit will take 2000+ attempts to succeed reliably
- But now ask yourself… how long does it take to generate 2000 heap overwrite attempts?
- Let’s be overly conservative and assume 5 minutes
- That will really slow down a worm…
- But will it help you if someone is specifically trying to hack your machine?

XP Service Pack 2: Low Fragmentation Heap (LFH)
- Looks really solid… kudos to its author
- Uses a 32-bit cookie
- Obscures the address of Lookaside list heads:
  ChunkSizes = *((DWORD *)Chunk) // (ChunkSize<<16|PrevChunkSize)
  pLookasideEntry = (DWORD)Chunk / 8
  pLookasideEntry ^= Lookaside->Key
  pLookasideEntry ^= ChunkSizes
  pLookasideEntry ^= RtlpLFHKey
  …
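The XOR chain in the slide's pseudocode can be modeled as follows. The function names are hypothetical and the sketch covers only the lines shown on the slide; whatever follows the trailing "…" is not reproduced.

```c
#include <assert.h>
#include <stdint.h>

/* Mask a chunk address the way the slide's pseudocode does:
   (Chunk / 8) XOR Lookaside->Key XOR ChunkSizes XOR RtlpLFHKey. */
uint32_t lfh_mask(uint32_t chunk_addr, uint32_t chunk_sizes,
                  uint32_t lookaside_key, uint32_t global_key)
{
    assert(chunk_addr % 8u == 0);
    uint32_t v = chunk_addr / 8u;
    v ^= lookaside_key;
    v ^= chunk_sizes;
    v ^= global_key;
    return v;
}

/* XOR is its own inverse, so unmasking needs all three values again. */
uint32_t lfh_unmask(uint32_t masked, uint32_t chunk_sizes,
                    uint32_t lookaside_key, uint32_t global_key)
{
    return (masked ^ lookaside_key ^ chunk_sizes ^ global_key) * 8u;
}
```

The point of the construction is that recovering or forging a masked value requires knowing the random 32-bit global key, not just values an overflow can read or predict from the chunk itself.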

XP Service Pack 2: Low Fragmentation Heap (LFH)
The RtlpLFHKey is a “show stopper”:
  push eax
  call _RtlRandomEx@4
  mov _RtlpLFHKey, eax
  lea eax, [ebp+var_4]
  imul eax, _RtlpLFHKey
  push esi
  …

XP Service Pack 2: Low Fragmentation Heap (LFH)
- Must be enabled manually (via NTDLL!RtlSetHeapInformation or KERNEL32!HeapSetInformation)
- It is used for chunks < 16K
- It is not used by anything on XP SP2 Professional
- What irony

Summary: Win2K to WinXP SP1
- Fixed heap base and fixed PEB allow for writing very stable exploits
- Overwriting FreeList/Lookaside list heads gives us the ability to overwrite any writable address with 1K of data

Summary: WinXP SP2
- Decreases reliability (more bruteforcing is necessary)
- But with enough time, exploitation will still succeed
- XP SP2 will really slow worm propagation, but not help a targeted victim
...

Summary: WinXP SP2
- Heap corruption handling is weak
- PEB randomization is weak
- Safe unlinking is evadable
- Non-LFH cookie checks are weak
- LFH looks good

Summary: Solutions
- Use low fragmentation heap by default
  - Just be sure it is the lowest address on the heap
- Expand PEB randomization over 1MB or so
  - Most machines have 1GB+ RAM these days
- Inform user if heap corruption exceeds a threshold
  - If I have an application with 50 corrupt chunks in 60 seconds, I want to know someone is owning me
- Check security cookies on allocation also

Summary: The eventual death of 4-byte overwrites…
- Whether an attacker can predict the ChunkSize/PrevSize or not, he/she won’t be able to predict a larger security cookie (like LFH has)
- Heap exploits will focus more on attacking application data on the heap (not the heap itself)