Download presentation
Presentation is loading. Please wait.
Published byJosephine Crewdson Modified over 9 years ago
1
The Future of RE: Dynamic Binary Visualization
{ christopher.domas // REcon 2013 //
2
Myself Myself Battelle Chris Domas Embedded Systems Engineer
Cyber Innovation Unit National Security Division The Battelle Memorial Institute Battelle World’s largest non-profit R&D organization Manages leading national laboratories Awesome Including Brookhaven National Laboratory Idaho National Laboratory Oak Ridge National Laboratory Pacific Northwest National Laboratory Lawrence Livermore National Laboratory Myself
3
Information Analysis Reverse Engineering
4
In other words What IS this?
48 3B C3 74 2A 4C 89 6C C C0 41 8D FF 15 F B F B 4C FF 15 EB EB B 0D 82 C FF 15 DC CB FF 4C 89 1D 71 C B F3 0F 84 A B C B E3 0F 85 D F B7 0E BA EF BB B CA 0F 84 D F9 FF FE F BF C B E C B C5 48 D1 E8 48 FF C8 8B F B EB B FB 0F C C0 45 8D 71 0B 41 8B D6 48 8B 0D DD BF FF 15 AF C C0 BA B B 0D C5 BF FF C C0 BA B B 0D AD BF FF 15 7F D D2 45 8D 46 F7 48 8B 0D DE BF FF 15 F0 7F B C3 0F B C8 FF C 8B F B FB 0F 84 BD B8 FF FE F B C7 4D 03 C0 48 8D B CE E8 4E F3 FF FF EB BF C 8B EB D B F B CE FF F C 3B F3 0F B C6 3B FB B CF F 84 BD C E ED 8B C C E 2E 0F 84 AE B F3 EB 00 4C 8B AC B CD FF 15 9E 7F C 89 2D 07 BF B D C D 8B C4 4C 2B C2 48 8D 81 FA FE FF 7F 48 3B C F B B C3 74 0D C E DD 48 3B CB 0F 84 8B A 49 8B CC E8 6E EC FF FF 89 1D E0 BD C7 05 0A BE C9 4C 8B 05 A4 BE BA BC B 0D 50 BE FF D E7 BD F D DB BD C C0 BA C B 0D 25 BE FF 15 E B F3 0F C9 BA D 42 A9 48 8B 0D 04 BE FF 15 D C 8B C0 BA B CA 48 8B 0D DC BD FF C9 41 8D 51 0B 45 8D B 0D D4 BD FF 15 A D2 44 8D B 0D C1 BD FF 15 F B 0D B4 BD FF 15 EE B 0D E7 BD FF B EB B 8C CC E8 EA CE FF FF 4C 8D 9C B 5B B B E3 41 5F 41 5E 41 5D 41 5C 5F C3 48 8D B CF FF 15 5F 7E B D A B D F8 FF 0F 84 D D 4C 24 7C FF B CB FF E E9 D1 EB FF FF 48 8B D6 48 8D 35 C1 C B C3 48 8B CE E BD C D C9 BA B CE 45 8B C C FF E FA BC F8 FF 0F 84 F B 15 6E BC B CE E8 3A FB FF FF 90 E9 D7 E8 FF FF C C EC B 05 6F BC C F B FF 48 8B DA 48 8B EA F8 22 0F F8 27 0F B CB FF 15 F B C7 48 8D 3D F8 C F 85 A C9 4C 8B C7 BA B CB FF D D B CF FF D F8 FF 0F 84 8B B C8 FF 15 0D 7D B C5 48 8B 8C CC E8 6A CD FF FF 4C 8D 9C B 5B B 6B B E3 5F C3 39 1D 5F BB C 8D 2D 88 C D 8D BB B CD 48 0F 45 0D 66 BE E8 1D D 77 BB B E8 83 F8 06 0F 85 C3 D2 FF FF E9 1F 0C C EC E B BB C D B F9 48 8D 4C D2 41 B8 9C E8 03 CD FF FF 48 8B BB B 0D 9D BB D C 8D D B BA B9 0B C A C 24 3C C C C B8 0B C C FF 15 6D 7F C0 0F 84 6F 0B FF 0F 84 5F 0B B CF FF C B D8 EB D 8C 24 D FF C B D D D2 FF 15 3C 7C B D C0 0F 84 2A 0B D 8C 24 D B CE 4C 8B C0 48 8B D7 E8 EE D D 4C C C C 24 5C FF B 4C BA C0 0F 48 CA 89 4C B CB FF 15 3C 7C B EB 00 8B C2 48 8B 8C 24 D CC E8 CC CB FF FF 4C 8D 9C 24 E B 5B B B E3 5F C3 FF 15 A9 7B E9 DA D0 FF FF FF 15 9D 7B E9 DE D0 FF FF C EC B F1 49 8B C8 49 8B D8 48 8B FA FF 15 4F 7B B D C0 48 8B D B CE C FF 15 4D B 5C B C4 30 5F C3 1B C0 83 D8 FF E9 13 D0 FF FF C F B7 1D 27 B DB 41 FF C9 44 8B D3 EB B D1 73 1F 0F B FF C F B C What IS this? In other words
5
The ever-present issues
How we conceptualize binary information How we use our conceptualization for analysis The ever-present issues
6
The RE Dichotomy Flexible Exact Complex Rigid Rules Succinct
B D0 FA B4 1B D0 C0 66 D1 C8 E6 74 E4 75 F6 C C4 E6 74 E C4 8A E6 66 5A C B D0 86 C4 B4 1B D0 C0 66 D1 C8 FA A E0 8A C2 86 C4 E C4 E F6 C C4 E6 74 E B C2 66 5A C3 E6 72 E4 73 C C4 E C4 E C3 66 9C FA 66 8B CB DB 8A C1 E8 87 FF FF FF 02 D8 80 D7 00 FE C1 38 E9 76 EE D C E4 80 B3 10 0A DC push dx mov dx, ax cli mov ah, 1Bh rol al, 1 ror ax, 1 out h, al in al, 75h test ah, 80h jz $+0x0A xchg al, ah mov ah, dh Flexible Exact Complex Rigid Rules Succinct The RE Dichotomy
7
Investigating the concept
xor edx, edx ; int lea rcx, [rsp+78h] ; void * mov [rsp+arg_68], ebx lea r8d, [rdx+60h] ; size_t call memset xor ecx, ecx ; hWnd call cs:GetDC mov rdi, rax cmp rax, rbx jz loc_ A4 lea rsi, stru_ A0 mov edx, 5Ah ; int mov rcx, rax ; HDC mov [rsp+arg_68], 68h mov [rsp+78h], rbp mov [rsp+88h], rsi call cs:GetDeviceCaps mov ecx, cs:nNumber mov r8d, 2D0h mov edx, eax call cs:MulDiv mov rdx, rdi ; hDC neg eax mov dword ptr [rsp+94h], h mov eax, 2000h mov [rsp+0C8h], ax call cs:ReleaseDC call cs:ChooseFontW cmp eax, ebx mov rcx, cs:hCursor ; hCursor call cs:SetCursor mov rcx, rsi ; LOGFONTW * call cs:CreateFontIndirectW jz short loc_100001B8C mov rcx, cs:wParam ; HGDIOBJ call cs:DeleteObject mov rcx, cs:qword_ A0 ; hWnd mov r9, r ; lParam mov r8, rdi ; wParam mov edx, 30h ; Msg mov cs:wParam, rdi call cs:SendMessageW What does this do? Where’s the vulnerability? Reverse engineering as an art Intuition and instinct. Investigating the concept
8
Reverse Engineering Let’s fix this. Bridge the analysis gap
Change RE from an ART to a SCIENCE Reverse Engineering
9
How? Visual RE Statistical Exploration Address the conceptualization
Address the analysis How?
10
Visual RE Idea: Take a computationally difficult task
Translate it to a problem our brains do naturally Visual RE
11
…what? In other words in 20 seconds
48 3B C3 74 2A 4C 89 6C C C0 41 8D FF 15 F B F B 4C FF 15 EB EB B 0D 82 C FF 15 DC CB FF 4C 89 1D 71 C B F3 0F 84 A B C B E3 0F 85 D F B7 0E BA EF BB B CA 0F 84 D F9 FF FE F BF C B E C B C5 48 D1 E8 48 FF C8 8B F B EB B FB 0F C C0 45 8D 71 0B 41 8B D6 48 8B 0D DD BF FF 15 AF C C0 BA B B 0D C5 BF FF C C0 BA B B 0D AD BF FF 15 7F D D2 45 8D 46 F7 48 8B 0D DE BF FF 15 F0 7F B C3 0F B C8 FF C 8B F B FB 0F 84 BD B8 FF FE F B C7 4D 03 C0 48 8D B CE E8 4E F3 FF FF EB BF C 8B EB D B F B CE FF F C 3B F3 0F B C6 3B FB B CF F 84 BD C E ED 8B C C E 2E 0F 84 AE B F3 EB 00 4C 8B AC B CD FF 15 9E 7F C 89 2D 07 BF B D C D 8B C4 4C 2B C2 48 8D 81 FA FE FF 7F 48 3B C F B B C3 74 0D C E DD 48 3B CB 0F 84 8B A 49 8B CC E8 6E EC FF FF 89 1D E0 BD C7 05 0A BE C9 4C 8B 05 A4 BE BA BC B 0D 50 BE FF D E7 BD F D DB BD C C0 BA C B 0D 25 BE FF 15 E B F3 0F C9 BA D 42 A9 48 8B 0D 04 BE FF 15 D C 8B C0 BA B CA 48 8B 0D DC BD FF C9 41 8D 51 0B 45 8D B 0D D4 BD FF 15 A D2 44 8D B 0D C1 BD FF 15 F B 0D B4 BD FF 15 EE B 0D E7 BD FF B EB B 8C CC E8 EA CE FF FF 4C 8D 9C B 5B B B E3 41 5F 41 5E 41 5D 41 5C 5F C3 48 8D B CF FF 15 5F 7E B D A B D F8 FF 0F 84 D D 4C 24 7C FF B CB FF E E9 D1 EB FF FF 48 8B D6 48 8D 35 C1 C B C3 48 8B CE E BD C D C9 BA B CE 45 8B C C FF E FA BC F8 FF 0F 84 F B 15 6E BC B CE E8 3A FB FF FF 90 E9 D7 E8 FF FF C C EC B 05 6F BC C F B FF 48 8B DA 48 8B EA F8 22 0F F8 27 0F B CB FF 15 F B C7 48 8D 3D F8 C F 85 A C9 4C 8B C7 BA B CB FF D D B CF FF D F8 FF 0F 84 8B B C8 FF 15 0D 7D B C5 48 8B 8C CC E8 6A CD FF FF 4C 8D 9C B 5B B 6B B E3 5F C3 39 1D 5F BB C 8D 2D 88 C D 8D BB B CD 48 0F 45 0D 66 BE E8 1D D 77 BB B E8 83 F8 06 0F 85 C3 D2 FF FF E9 1F 0C C EC E B BB C D B F9 48 8D 4C D2 41 B8 9C E8 03 CD FF FF 48 8B BB B 0D 9D BB D C 8D D B BA B9 0B C A C 24 3C C C C B8 0B C C FF 15 6D 7F C0 0F 84 6F 0B FF 0F 84 5F 0B B CF FF C B D8 EB D 8C 24 D FF C B D D D2 FF 15 3C 7C 00 00 In other words how to traverse a hundred thousand pages of in 20 seconds …what?
12
If we change the way we process binary information…
13
… we find unexpected ways of making sense of it.
14
Why bother? obfuscated file headers no headers
multiple binaries embedded in a single blob unique instruction sets proprietary data formats overwhelming complexity steganography memory dumps rapid RE triage firmware forensics Why bother?
15
Our best RE tools are completely dependent on known structure
Information is evolving faster than our tools can keep up Gate’s Law Software is getting slower more rapidly than hardware becomes faster The amount of information we need to analyze is growing exponentially We’re left in the dust by this avalanche of information. Every day there's a new instruction set, a new file format, a new data type that our tools need to adapt to; trying to keep them up to date is a losing battle. Why bother?
16
Some interesting ideas
Greg Conti United States Military Academy Black Hat 2010 Aldo Cortesi Nullcube corte.si Some interesting ideas
17
Digraphs (conti) Idea: Even in unstructured data, there is structure
Sequential bytes have an implicit relationship Digraphs (conti)
18
recon re (72,65) ec (65,72) co (63,6F) on (6F,6E) Digraphs (conti)
19
(0x20, 0x20) (0x20, 0x35 ) Digraphs (conti)
20
ASCII Digraphs (conti)
21
Image Digraphs (conti)
22
Audio Digraphs (conti)
23
Hilbert Translation (cortesi)
Introducing the Hilbert Curve Continuous fractal space filling curve Maps 1D space to an alternate dimensional space Preserves Locality Unexpected utility in computer science Hilbert Translation (cortesi)
24
Hilbert Translation (cortesi)
Traditional byte plot “Zig-Zag” order on left Hilbert byte plot on right Shaded by Byte Class Hilbert Translation (cortesi)
25
Our software Build on these concepts Goal understand data
independent of format Unknown code Format deviations Proprietary structure a priori Independent of format is key! We should work as well on x86 as ARM as some random instruction set nobody’s ever seen. Our software is not tied to any specific type of file or data – it treats all data equally, as arbitrary binary information, so that any file can be explored. Our software
26
Introducing ..cantor.dust..
Named after this guy → Illustrating concepts Doesn’t need to be this software Introducing ..cantor.dust..
27
..cantor.dust.. // background
An embarrassing mistake ..cantor.dust.. // background
28
..cantor.dust.. // background
B D0 FA B4 1B D0 C0 66 D1 C8 E6 74 E4 75 F6 C C4 E6 74 E C4 8A E6 66 5A C B D0 86 C4 B4 1B D0 C0 66 D1 C8 FA A E0 8A C2 86 C4 E C4 E F6 C C4 E6 74 E B C2 66 5A C3 E6 72 E4 73 C C4 E C4 E C3 66 9C FA 66 8B CB DB 8A C1 E8 87 FF FF FF 02 D8 80 D7 00 FE C1 38 E9 76 EE D C E4 80 B3 10 0A DC B D0 FA B4 1B D0 C0 66 D1 C8 E6 74 E4 75 F6 C C4 E6 74 E C4 8A E6 66 5A C B D0 86 C4 B4 1B D0 C0 66 D1 C8 FA A E0 8A C2 86 C4 E C4 E F6 C C4 E6 74 E B C2 66 5A C3 E6 72 E4 73 C C4 E C4 E C3 66 9C FA 66 8B CB DB 8A C1 E8 87 FF FF FF 02 D8 80 D7 00 FE C1 38 E9 76 EE D C E4 80 B3 10 0A DC B D0 FA B4 1B D0 C0 66 D1 C8 E6 74 E4 75 F6 C C4 E6 74 E C4 8A E6 66 5A C B D0 86 C4 B4 1B D0 C0 66 D1 C8 FA A E0 8A C2 86 C4 E C4 E F6 C C4 E6 74 E B C2 66 5A C3 E6 72 E4 73 C C4 E C4 E C3 66 9C FA 66 8B CB DB 8A C1 E8 87 FF FF FF 02 D8 80 D7 00 FE C1 38 E9 76 EE D C E4 80 B3 10 0A DC B D0 FA B4 1B D0 C0 66 D1 C8 E6 74 E4 75 F6 C C4 E6 74 E C4 8A E6 66 5A C B D0 86 C4 B4 1B D0 C0 66 D1 C8 FA B D0 FA B4 1B D0 C0 66 D1 C8 E6 74 E4 75 F6 C C4 E6 74 E C4 8A E6 66 5A C B D0 86 C4 B4 1B D0 C0 66 D1 C8 FA A E0 8A C2 86 C4 E C4 E F6 C C4 E6 74 E B C2 66 5A C3 E6 72 E4 73 C C4 E C4 E C3 66 9C FA 66 8B CB DB 8A C1 E8 87 FF FF FF 02 D8 80 D7 00 FE C1 38 E9 76 EE D C E4 80 B3 10 0A DC B D0 FA B4 1B D0 C0 66 D1 C8 E6 74 E4 75 F6 C C4 E6 74 E C4 8A E6 66 5A C B D0 86 C4 B4 1B D0 C0 66 D1 C8 FA A E0 8A C2 86 C4 E C4 E F6 C C4 E6 74 ..cantor.dust.. // background
29
..cantor.dust.. // background
An embarrassing mistake Spare time ..cantor.dust.. // background
30
Interface Patterns ..cantor.dust.. // demo
31
..cantor.dust.. // demo Visualizations Tuple systems Why Time/Space
Coordinate Systems Fun ..cantor.dust.. // demo
32
..cantor.dust.. // demo Visualizations Metric map Entropy Uses
Encryption Keys Packing Obfuscation Examples Malware Key check ..cantor.dust.. // demo
33
Notepad.exe dissection
..cantor.dust.. // demo
34
..cantor.dust.. // classificaton
Binary region identification through statistical analysis Naïve Bayes classification of n-gram models Identify regions of an object based on supplied templates Uses: Custom/proprietary formats Unique Instruction Sets Data/code features ..cantor.dust.. // classificaton
35
..cantor.dust.. // parsing Conventional parsing A new approach?
Recursive descent Linear sweep A new approach? “Probabilistic parsing” Identify structure based on statistical patterns, not grammar definitions ..cantor.dust.. // parsing
36
..cantor.dust.. // case study
“Attacking Intel BIOS” Invisible Things Lab Rafal Wojtczuk and Alexander Tereshkin Black Hat 2009 ..cantor.dust.. // case study
37
..cantor.dust.. // case study
Goal: Bypass EFI signing protection ..cantor.dust.. // case study
38
..cantor.dust.. // case study
Background: EFI is a BIOS replacement Update images can be signed and checked before they are flashed Hardware protections prevent flashing from the OS EFI sets these protections early in the boot process “Update” exe places image on disk, flags update on reboot Hardware protections prevent access to the lowest levels of the processor ..cantor.dust.. // case study
39
..cantor.dust.. // case study
Background Update image “envelope” Signed module Unsigned module ← ..cantor.dust.. // case study
40
..cantor.dust.. // case study
Background Update image “envelope” Signed module Unsigned module ← Attack Vector! ..cantor.dust.. // case study
41
..cantor.dust.. // case study
Why have an unsigned module? Boot Splash Logo Can be changed by the OEMs ..cantor.dust.. // case study
42
..cantor.dust.. // case study
Exploit! Goal: Unsigned Code Execution Can’t update code, can update bitmap Vulnerability identified in bitmap parser for splash screen In EFI template code used by most IBVs Flash the bitmap with an invalid image Overflow with width & height Cause an overflow, get execution ..cantor.dust.. // case study
43
..cantor.dust.. // case study
Observations No one is really “EFI” There’s just “more” or “less” EFI for now This means Less structure More headache ..cantor.dust.. // case study
44
..cantor.dust.. // case study
ITL covered the exploit Let’s figure out everything else ..cantor.dust.. // case study
45
case study // step 1 Choose a victim
Targeting my junk laptop Don’t care if I brick it Grab the update executable from vendor’s website case study // step 1
46
case study // step 2 Quick RE to extract the firmware image
Ultra-secret secure flags have been encrypted by adding 1 to each character We’re looking for “WRITEROMFILE” a.k.a. “XSJUFSPNGJMF” case study // step 2
47
case study // step 3 RE image Find a custom decompression routine
But only one module Where are the others? case study // step 3
48
case study // step 4 Use ..cantor.dust.. Crop out the known module
Use as a template for statistical analysis Identify regions of the binary image with patterns that match the template case study // step 4
49
case study // step 5 Decompress the located modules
Use Chris Eagle’s x86emu No need to understand the algorithm case study // step 5
50
Modules contain proprietary headers
case study // step 6
51
case study // step 6 How do we find the splash screen?
Too time consuming at a binary level! Use visual abstractions to locate the “bitmap data” fingerprint case study // step 6
52
Examine in a hex editor – follows the bitmap format, minus the ‘BM’ signature
Easy to find width/height case study // step 7
53
Repackage the splash screen with modified information
case study // step 8
54
case study // step 9 Try to flash Image is not accepted
Modules are signed Bitmap is not “Envelope” is checksum’d Need to fix the checksum Drop first module into IDA There’s no known structure to this module IDA tells us nothing case study // step 9
55
Use probabilistic parsing to identify key functions
Find debug_printf Find calls to debug_printf Isolate CRC checksum routine Identify CRC algorithm Locate checksum position in image case study // step 10
56
Repackage the image with proper checksum
Success! case study // step X
57
Without ..cantor.dust..? Took me 37 hours case study // results
58
Hope I’ve shown… case study // results
59
Useful: User interface ..cantor.dust.. // future
60
Useful: Complete integration ..cantor.dust.. // future
61
..cantor.dust.. // future
62
..cantor.dust.. // future
63
..cantor.dust.. // future
64
Pointless: Hands free ..cantor.dust.. // future
65
..cantor.dust.. // future
66
..cantor.dust.. // future
67
..cantor.dust.. // future ..minority.report..
68
Conclusions We need new ways to understand and analyze information
Otherwise RE will stagnate Conclusions
69
Conclusions We need new ways to understand and analyze information
Otherwise RE will stagnate Use this software Start thinking differently Conclusions
70
Thank you Getting a copy We’ve just gotten started
sites.google.com/site/xxcantorxdustxx/home We’ve just gotten started Thank you
71
..cantor.dust.. // redux
Similar presentations
© 2024 SlidePlayer.com Inc.
All rights reserved.