Presentation is loading. Please wait.

Presentation is loading. Please wait.

The Future of RE: Dynamic Binary Visualization

Similar presentations


Presentation on theme: "The Future of RE: Dynamic Binary Visualization"— Presentation transcript:

1 The Future of RE: Dynamic Binary Visualization
{ christopher.domas // REcon 2013 //

2 Myself Myself Battelle Chris Domas Embedded Systems Engineer
Cyber Innovation Unit National Security Division The Battelle Memorial Institute Battelle World’s largest non-profit R&D organization Manages leading national laboratories Awesome Including Brookhaven National Laboratory Idaho National Laboratory Oak Ridge National Laboratory Pacific Northwest National Laboratory Lawrence Livermore National Laboratory Myself

3 Information Analysis Reverse Engineering

4 In other words What IS this?
48 3B C3 74 2A 4C 89 6C C C0 41 8D FF 15 F B F B 4C FF 15 EB EB B 0D 82 C FF 15 DC CB FF 4C 89 1D 71 C B F3 0F 84 A B C B E3 0F 85 D F B7 0E BA EF BB B CA 0F 84 D F9 FF FE F BF C B E C B C5 48 D1 E8 48 FF C8 8B F B EB B FB 0F C C0 45 8D 71 0B 41 8B D6 48 8B 0D DD BF FF 15 AF C C0 BA B B 0D C5 BF FF C C0 BA B B 0D AD BF FF 15 7F D D2 45 8D 46 F7 48 8B 0D DE BF FF 15 F0 7F B C3 0F B C8 FF C 8B F B FB 0F 84 BD B8 FF FE F B C7 4D 03 C0 48 8D B CE E8 4E F3 FF FF EB BF C 8B EB D B F B CE FF F C 3B F3 0F B C6 3B FB B CF F 84 BD C E ED 8B C C E 2E 0F 84 AE B F3 EB 00 4C 8B AC B CD FF 15 9E 7F C 89 2D 07 BF B D C D 8B C4 4C 2B C2 48 8D 81 FA FE FF 7F 48 3B C F B B C3 74 0D C E DD 48 3B CB 0F 84 8B A 49 8B CC E8 6E EC FF FF 89 1D E0 BD C7 05 0A BE C9 4C 8B 05 A4 BE BA BC B 0D 50 BE FF D E7 BD F D DB BD C C0 BA C B 0D 25 BE FF 15 E B F3 0F C9 BA D 42 A9 48 8B 0D 04 BE FF 15 D C 8B C0 BA B CA 48 8B 0D DC BD FF C9 41 8D 51 0B 45 8D B 0D D4 BD FF 15 A D2 44 8D B 0D C1 BD FF 15 F B 0D B4 BD FF 15 EE B 0D E7 BD FF B EB B 8C CC E8 EA CE FF FF 4C 8D 9C B 5B B B E3 41 5F 41 5E 41 5D 41 5C 5F C3 48 8D B CF FF 15 5F 7E B D A B D F8 FF 0F 84 D D 4C 24 7C FF B CB FF E E9 D1 EB FF FF 48 8B D6 48 8D 35 C1 C B C3 48 8B CE E BD C D C9 BA B CE 45 8B C C FF E FA BC F8 FF 0F 84 F B 15 6E BC B CE E8 3A FB FF FF 90 E9 D7 E8 FF FF C C EC B 05 6F BC C F B FF 48 8B DA 48 8B EA F8 22 0F F8 27 0F B CB FF 15 F B C7 48 8D 3D F8 C F 85 A C9 4C 8B C7 BA B CB FF D D B CF FF D F8 FF 0F 84 8B B C8 FF 15 0D 7D B C5 48 8B 8C CC E8 6A CD FF FF 4C 8D 9C B 5B B 6B B E3 5F C3 39 1D 5F BB C 8D 2D 88 C D 8D BB B CD 48 0F 45 0D 66 BE E8 1D D 77 BB B E8 83 F8 06 0F 85 C3 D2 FF FF E9 1F 0C C EC E B BB C D B F9 48 8D 4C D2 41 B8 9C E8 03 CD FF FF 48 8B BB B 0D 9D BB D C 8D D B BA B9 0B C A C 24 3C C C C B8 0B C C FF 15 6D 7F C0 0F 84 6F 0B FF 0F 84 5F 0B B CF FF C B D8 EB D 8C 24 D FF C B D D D2 FF 15 3C 7C B D C0 0F 84 2A 0B D 8C 24 D B CE 4C 8B C0 48 8B D7 E8 EE D D 4C C C C 24 5C FF B 4C BA C0 0F 48 CA 89 4C B CB FF 15 3C 7C B EB 00 8B C2 48 8B 8C 24 D CC E8 CC CB FF FF 4C 8D 9C 24 E B 5B B B E3 5F C3 FF 15 A9 7B E9 DA D0 FF FF FF 15 9D 7B E9 DE D0 FF FF C EC B F1 49 8B C8 49 8B D8 48 8B FA FF 15 4F 7B B D C0 48 8B D B CE C FF 15 4D B 5C B C4 30 5F C3 1B C0 83 D8 FF E9 13 D0 FF FF C F B7 1D 27 B DB 41 FF C9 44 8B D3 EB B D1 73 1F 0F B FF C F B C What IS this? In other words

5 The ever-present issues
How we conceptualize binary information How we use our conceptualization for analysis The ever-present issues

6 The RE Dichotomy Flexible Exact Complex Rigid Rules Succinct
B D0 FA B4 1B D0 C0 66 D1 C8 E6 74 E4 75 F6 C C4 E6 74 E C4 8A E6 66 5A C B D0 86 C4 B4 1B D0 C0 66 D1 C8 FA A E0 8A C2 86 C4 E C4 E F6 C C4 E6 74 E B C2 66 5A C3 E6 72 E4 73 C C4 E C4 E C3 66 9C FA 66 8B CB DB 8A C1 E8 87 FF FF FF 02 D8 80 D7 00 FE C1 38 E9 76 EE D C E4 80 B3 10 0A DC push dx mov dx, ax cli mov ah, 1Bh rol al, 1 ror ax, 1 out h, al in al, 75h test ah, 80h jz $+0x0A xchg al, ah mov ah, dh Flexible Exact Complex Rigid Rules Succinct The RE Dichotomy

7 Investigating the concept
xor edx, edx ; int lea rcx, [rsp+78h] ; void * mov [rsp+arg_68], ebx lea r8d, [rdx+60h] ; size_t call memset xor ecx, ecx ; hWnd call cs:GetDC mov rdi, rax cmp rax, rbx jz loc_ A4 lea rsi, stru_ A0 mov edx, 5Ah ; int mov rcx, rax ; HDC mov [rsp+arg_68], 68h mov [rsp+78h], rbp mov [rsp+88h], rsi call cs:GetDeviceCaps mov ecx, cs:nNumber mov r8d, 2D0h mov edx, eax call cs:MulDiv mov rdx, rdi ; hDC neg eax mov dword ptr [rsp+94h], h mov eax, 2000h mov [rsp+0C8h], ax call cs:ReleaseDC call cs:ChooseFontW cmp eax, ebx mov rcx, cs:hCursor ; hCursor call cs:SetCursor mov rcx, rsi ; LOGFONTW * call cs:CreateFontIndirectW jz short loc_100001B8C mov rcx, cs:wParam ; HGDIOBJ call cs:DeleteObject mov rcx, cs:qword_ A0 ; hWnd mov r9, r ; lParam mov r8, rdi ; wParam mov edx, 30h ; Msg mov cs:wParam, rdi call cs:SendMessageW What does this do? Where’s the vulnerability? Reverse engineering as an art Intuition and instinct. Investigating the concept

8 Reverse Engineering Let’s fix this. Bridge the analysis gap
Change RE from an ART to a SCIENCE Reverse Engineering

9 How? Visual RE Statistical Exploration Address the conceptualization
Address the analysis How?

10 Visual RE Idea: Take a computationally difficult task
Translate it to a problem our brains do naturally Visual RE

11 …what? In other words in 20 seconds
48 3B C3 74 2A 4C 89 6C C C0 41 8D FF 15 F B F B 4C FF 15 EB EB B 0D 82 C FF 15 DC CB FF 4C 89 1D 71 C B F3 0F 84 A B C B E3 0F 85 D F B7 0E BA EF BB B CA 0F 84 D F9 FF FE F BF C B E C B C5 48 D1 E8 48 FF C8 8B F B EB B FB 0F C C0 45 8D 71 0B 41 8B D6 48 8B 0D DD BF FF 15 AF C C0 BA B B 0D C5 BF FF C C0 BA B B 0D AD BF FF 15 7F D D2 45 8D 46 F7 48 8B 0D DE BF FF 15 F0 7F B C3 0F B C8 FF C 8B F B FB 0F 84 BD B8 FF FE F B C7 4D 03 C0 48 8D B CE E8 4E F3 FF FF EB BF C 8B EB D B F B CE FF F C 3B F3 0F B C6 3B FB B CF F 84 BD C E ED 8B C C E 2E 0F 84 AE B F3 EB 00 4C 8B AC B CD FF 15 9E 7F C 89 2D 07 BF B D C D 8B C4 4C 2B C2 48 8D 81 FA FE FF 7F 48 3B C F B B C3 74 0D C E DD 48 3B CB 0F 84 8B A 49 8B CC E8 6E EC FF FF 89 1D E0 BD C7 05 0A BE C9 4C 8B 05 A4 BE BA BC B 0D 50 BE FF D E7 BD F D DB BD C C0 BA C B 0D 25 BE FF 15 E B F3 0F C9 BA D 42 A9 48 8B 0D 04 BE FF 15 D C 8B C0 BA B CA 48 8B 0D DC BD FF C9 41 8D 51 0B 45 8D B 0D D4 BD FF 15 A D2 44 8D B 0D C1 BD FF 15 F B 0D B4 BD FF 15 EE B 0D E7 BD FF B EB B 8C CC E8 EA CE FF FF 4C 8D 9C B 5B B B E3 41 5F 41 5E 41 5D 41 5C 5F C3 48 8D B CF FF 15 5F 7E B D A B D F8 FF 0F 84 D D 4C 24 7C FF B CB FF E E9 D1 EB FF FF 48 8B D6 48 8D 35 C1 C B C3 48 8B CE E BD C D C9 BA B CE 45 8B C C FF E FA BC F8 FF 0F 84 F B 15 6E BC B CE E8 3A FB FF FF 90 E9 D7 E8 FF FF C C EC B 05 6F BC C F B FF 48 8B DA 48 8B EA F8 22 0F F8 27 0F B CB FF 15 F B C7 48 8D 3D F8 C F 85 A C9 4C 8B C7 BA B CB FF D D B CF FF D F8 FF 0F 84 8B B C8 FF 15 0D 7D B C5 48 8B 8C CC E8 6A CD FF FF 4C 8D 9C B 5B B 6B B E3 5F C3 39 1D 5F BB C 8D 2D 88 C D 8D BB B CD 48 0F 45 0D 66 BE E8 1D D 77 BB B E8 83 F8 06 0F 85 C3 D2 FF FF E9 1F 0C C EC E B BB C D B F9 48 8D 4C D2 41 B8 9C E8 03 CD FF FF 48 8B BB B 0D 9D BB D C 8D D B BA B9 0B C A C 24 3C C C C B8 0B C C FF 15 6D 7F C0 0F 84 6F 0B FF 0F 84 5F 0B B CF FF C B D8 EB D 8C 24 D FF C B D D D2 FF 15 3C 7C 00 00 In other words how to traverse a hundred thousand pages of in 20 seconds …what?

12 If we change the way we process binary information…

13 … we find unexpected ways of making sense of it.

14 Why bother? obfuscated file headers no headers
multiple binaries embedded in a single blob unique instruction sets proprietary data formats overwhelming complexity steganography memory dumps rapid RE triage firmware forensics Why bother?

15 Our best RE tools are completely dependent on known structure
Information is evolving faster than our tools can keep up Gate’s Law Software is getting slower more rapidly than hardware becomes faster The amount of information we need to analyze is growing exponentially We’re left in the dust by this avalanche of information. Every day there's a new instruction set, a new file format, a new data type that our tools need to adapt to; trying to keep them up to date is a losing battle. Why bother?

16 Some interesting ideas
Greg Conti United States Military Academy Black Hat 2010 Aldo Cortesi Nullcube corte.si Some interesting ideas

17 Digraphs (conti) Idea: Even in unstructured data, there is structure
Sequential bytes have an implicit relationship Digraphs (conti)

18 recon re (72,65) ec (65,72) co (63,6F) on (6F,6E) Digraphs (conti)

19 (0x20, 0x20) (0x20, 0x35 ) Digraphs (conti)

20 ASCII Digraphs (conti)

21 Image Digraphs (conti)

22 Audio Digraphs (conti)

23 Hilbert Translation (cortesi)
Introducing the Hilbert Curve Continuous fractal space filling curve Maps 1D space to an alternate dimensional space Preserves Locality Unexpected utility in computer science Hilbert Translation (cortesi)

24 Hilbert Translation (cortesi)
Traditional byte plot “Zig-Zag” order on left Hilbert byte plot on right Shaded by Byte Class Hilbert Translation (cortesi)

25 Our software Build on these concepts Goal understand data
independent of format Unknown code Format deviations Proprietary structure a priori Independent of format is key! We should work as well on x86 as ARM as some random instruction set nobody’s ever seen. Our software is not tied to any specific type of file or data – it treats all data equally, as arbitrary binary information, so that any file can be explored. Our software

26 Introducing ..cantor.dust..
Named after this guy → Illustrating concepts Doesn’t need to be this software Introducing ..cantor.dust..

27 ..cantor.dust.. // background
An embarrassing mistake ..cantor.dust.. // background

28 ..cantor.dust.. // background
B D0 FA B4 1B D0 C0 66 D1 C8 E6 74 E4 75 F6 C C4 E6 74 E C4 8A E6 66 5A C B D0 86 C4 B4 1B D0 C0 66 D1 C8 FA A E0 8A C2 86 C4 E C4 E F6 C C4 E6 74 E B C2 66 5A C3 E6 72 E4 73 C C4 E C4 E C3 66 9C FA 66 8B CB DB 8A C1 E8 87 FF FF FF 02 D8 80 D7 00 FE C1 38 E9 76 EE D C E4 80 B3 10 0A DC B D0 FA B4 1B D0 C0 66 D1 C8 E6 74 E4 75 F6 C C4 E6 74 E C4 8A E6 66 5A C B D0 86 C4 B4 1B D0 C0 66 D1 C8 FA A E0 8A C2 86 C4 E C4 E F6 C C4 E6 74 E B C2 66 5A C3 E6 72 E4 73 C C4 E C4 E C3 66 9C FA 66 8B CB DB 8A C1 E8 87 FF FF FF 02 D8 80 D7 00 FE C1 38 E9 76 EE D C E4 80 B3 10 0A DC B D0 FA B4 1B D0 C0 66 D1 C8 E6 74 E4 75 F6 C C4 E6 74 E C4 8A E6 66 5A C B D0 86 C4 B4 1B D0 C0 66 D1 C8 FA A E0 8A C2 86 C4 E C4 E F6 C C4 E6 74 E B C2 66 5A C3 E6 72 E4 73 C C4 E C4 E C3 66 9C FA 66 8B CB DB 8A C1 E8 87 FF FF FF 02 D8 80 D7 00 FE C1 38 E9 76 EE D C E4 80 B3 10 0A DC B D0 FA B4 1B D0 C0 66 D1 C8 E6 74 E4 75 F6 C C4 E6 74 E C4 8A E6 66 5A C B D0 86 C4 B4 1B D0 C0 66 D1 C8 FA B D0 FA B4 1B D0 C0 66 D1 C8 E6 74 E4 75 F6 C C4 E6 74 E C4 8A E6 66 5A C B D0 86 C4 B4 1B D0 C0 66 D1 C8 FA A E0 8A C2 86 C4 E C4 E F6 C C4 E6 74 E B C2 66 5A C3 E6 72 E4 73 C C4 E C4 E C3 66 9C FA 66 8B CB DB 8A C1 E8 87 FF FF FF 02 D8 80 D7 00 FE C1 38 E9 76 EE D C E4 80 B3 10 0A DC B D0 FA B4 1B D0 C0 66 D1 C8 E6 74 E4 75 F6 C C4 E6 74 E C4 8A E6 66 5A C B D0 86 C4 B4 1B D0 C0 66 D1 C8 FA A E0 8A C2 86 C4 E C4 E F6 C C4 E6 74 ..cantor.dust.. // background

29 ..cantor.dust.. // background
An embarrassing mistake Spare time ..cantor.dust.. // background

30 Interface Patterns ..cantor.dust.. // demo

31 ..cantor.dust.. // demo Visualizations Tuple systems Why Time/Space
Coordinate Systems Fun ..cantor.dust.. // demo

32 ..cantor.dust.. // demo Visualizations Metric map Entropy Uses
Encryption Keys Packing Obfuscation Examples Malware Key check ..cantor.dust.. // demo

33 Notepad.exe dissection
..cantor.dust.. // demo

34 ..cantor.dust.. // classificaton
Binary region identification through statistical analysis Naïve Bayes classification of n-gram models Identify regions of an object based on supplied templates Uses: Custom/proprietary formats Unique Instruction Sets Data/code features ..cantor.dust.. // classificaton

35 ..cantor.dust.. // parsing Conventional parsing A new approach?
Recursive descent Linear sweep A new approach? “Probabilistic parsing” Identify structure based on statistical patterns, not grammar definitions ..cantor.dust.. // parsing

36 ..cantor.dust.. // case study
“Attacking Intel BIOS” Invisible Things Lab Rafal Wojtczuk and Alexander Tereshkin Black Hat 2009 ..cantor.dust.. // case study

37 ..cantor.dust.. // case study
Goal: Bypass EFI signing protection ..cantor.dust.. // case study

38 ..cantor.dust.. // case study
Background: EFI is a BIOS replacement Update images can be signed and checked before they are flashed Hardware protections prevent flashing from the OS EFI sets these protections early in the boot process “Update” exe places image on disk, flags update on reboot Hardware protections prevent access to the lowest levels of the processor ..cantor.dust.. // case study

39 ..cantor.dust.. // case study
Background Update image “envelope” Signed module Unsigned module ← ..cantor.dust.. // case study

40 ..cantor.dust.. // case study
Background Update image “envelope” Signed module Unsigned module ← Attack Vector! ..cantor.dust.. // case study

41 ..cantor.dust.. // case study
Why have an unsigned module? Boot Splash Logo Can be changed by the OEMs ..cantor.dust.. // case study

42 ..cantor.dust.. // case study
Exploit! Goal: Unsigned Code Execution Can’t update code, can update bitmap Vulnerability identified in bitmap parser for splash screen In EFI template code used by most IBVs Flash the bitmap with an invalid image Overflow with width & height Cause an overflow, get execution ..cantor.dust.. // case study

43 ..cantor.dust.. // case study
Observations No one is really “EFI” There’s just “more” or “less” EFI for now This means Less structure More headache ..cantor.dust.. // case study

44 ..cantor.dust.. // case study
ITL covered the exploit Let’s figure out everything else ..cantor.dust.. // case study

45 case study // step 1 Choose a victim
Targeting my junk laptop Don’t care if I brick it Grab the update executable from vendor’s website case study // step 1

46 case study // step 2 Quick RE to extract the firmware image
Ultra-secret secure flags have been encrypted by adding 1 to each character We’re looking for “WRITEROMFILE” a.k.a. “XSJUFSPNGJMF” case study // step 2

47 case study // step 3 RE image Find a custom decompression routine
But only one module Where are the others? case study // step 3

48 case study // step 4 Use ..cantor.dust.. Crop out the known module
Use as a template for statistical analysis Identify regions of the binary image with patterns that match the template case study // step 4

49 case study // step 5 Decompress the located modules
Use Chris Eagle’s x86emu No need to understand the algorithm case study // step 5

50 Modules contain proprietary headers
case study // step 6

51 case study // step 6 How do we find the splash screen?
Too time consuming at a binary level! Use visual abstractions to locate the “bitmap data” fingerprint case study // step 6

52 Examine in a hex editor – follows the bitmap format, minus the ‘BM’ signature
Easy to find width/height case study // step 7

53 Repackage the splash screen with modified information
case study // step 8

54 case study // step 9 Try to flash Image is not accepted
Modules are signed Bitmap is not “Envelope” is checksum’d Need to fix the checksum Drop first module into IDA There’s no known structure to this module IDA tells us nothing case study // step 9

55 Use probabilistic parsing to identify key functions
Find debug_printf Find calls to debug_printf Isolate CRC checksum routine Identify CRC algorithm Locate checksum position in image case study // step 10

56 Repackage the image with proper checksum
Success! case study // step X

57 Without ..cantor.dust..? Took me 37 hours case study // results

58 Hope I’ve shown… case study // results

59 Useful: User interface ..cantor.dust.. // future

60 Useful: Complete integration ..cantor.dust.. // future

61 ..cantor.dust.. // future

62 ..cantor.dust.. // future

63 ..cantor.dust.. // future

64 Pointless: Hands free ..cantor.dust.. // future

65 ..cantor.dust.. // future

66 ..cantor.dust.. // future

67 ..cantor.dust.. // future ..minority.report..

68 Conclusions We need new ways to understand and analyze information
Otherwise RE will stagnate Conclusions

69 Conclusions We need new ways to understand and analyze information
Otherwise RE will stagnate Use this software Start thinking differently Conclusions

70 Thank you Getting a copy We’ve just gotten started
sites.google.com/site/xxcantorxdustxx/home We’ve just gotten started Thank you

71 ..cantor.dust.. // redux


Download ppt "The Future of RE: Dynamic Binary Visualization"

Similar presentations


Ads by Google