DroidScope: Seamlessly Reconstructing the OS and Dalvik Semantic Views for Dynamic Android Malware Analysis Lok Yan Heng Yin August 10, 2012.

Slides:



Advertisements
Similar presentations
DroidScope: Seamlessly Reconstructing the OS and Dalvik Semantic Views for Dynamic Android Malware Analysis Lok Kwong Yan, and Heng Yin Syracuse University.
Advertisements

Dynamic Taint Analysis for Automatic Detection, Analysis, and Signature Generation of Exploits on Commodity Software Paper by: James Newsome and Dawn Song.
1 Created by Another Process Reason: modeling concurrent sub-tasks Fetch large amount data from network and process them Two sub-tasks: fetching  processing.
CSC 501 Lecture 2: Processes. Von Neumann Model Both program and data reside in memory Execution stages in CPU: Fetch instruction Decode instruction Execute.
The Process Model.
1 Processes and Pipes COS 217 Professor Jennifer Rexford.
Processes CSCI 444/544 Operating Systems Fall 2008.
LIFT: A Low-Overhead Practical Information Flow Tracking System for Detecting Security Attacks Feng Qin, Cheng Wang, Zhenmin Li, Ho-seop Kim, Yuanyuan.
Computer System Overview
Advanced OS Chapter 3p2 Sections 3.4 / 3.5. Interrupts These enable software to respond to signals from hardware. The set of instructions to be executed.
Process in Unix, Linux and Windows CS-3013 C-term Processes in Unix, Linux, and Windows CS-3013 Operating Systems (Slides include materials from.
CS-502 Fall 2006Processes in Unix, Linux, & Windows 1 Processes in Unix, Linux, and Windows CS502 Operating Systems.
Unix & Windows Processes 1 CS502 Spring 2006 Unix/Windows Processes.
Processes in Unix, Linux, and Windows CS-502 Fall Processes in Unix, Linux, and Windows CS502 Operating Systems (Slides include materials from Operating.
ThreadsThreads operating systems. ThreadsThreads A Thread, or thread of execution, is the sequence of instructions being executed. A process may have.
D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,
Process in Unix, Linux, and Windows CS-3013 A-term Processes in Unix, Linux, and Windows CS-3013 Operating Systems (Slides include materials from.
OPERATING SYSTEM OVERVIEW. Contents Basic hardware elements.
Introduction to Processes CS Intoduction to Operating Systems.
Cellular Networks and Mobile Computing COMS , Spring 2014
Cellular Networks and Mobile Computing COMS , Spring 2013 Instructor: Li Erran Li
Processes and Threads CS550 Operating Systems. Processes and Threads These exist only at execution time They have fast state changes -> in memory and.
Lecture 3 Process Concepts. What is a Process? A process is the dynamic execution context of an executing program. Several processes may run concurrently,
Hardware process When the computer is powered up, it begins to execute fetch-execute cycle for the program that is stored in memory at the boot strap entry.
4P13 Week 3 Talking Points 1. Process State 2 Process Structure Catagories – Process identification: the PID and the parent PID – Signal state: signals.
Source: Operating System Concepts by Silberschatz, Galvin and Gagne.
CNIT 127: Exploit Development Ch 3: Shellcode. Topics Protection rings Syscalls Shellcode nasm Assembler ld GNU Linker objdump to see contents of object.
Processes CS 6560: Operating Systems Design. 2 Von Neuman Model Both text (program) and data reside in memory Execution cycle Fetch instruction Decode.
Cellular Networks and Mobile Computing COMS , Fall 2012 Instructor: Li Erran Li
Wireless and Mobile Security
Hardware process When the computer is powered up, it begins to execute fetch-execute cycle for the program that is stored in memory at the boot strap entry.
Dynamic Taint Analysis for Automatic Detection, Analysis, and Signature Generation of Exploits on Commodity Software Paper by: James Newsome and Dawn Song.
Question What technology differentiates the different stages a computer had gone through from generation 1 to present?
ACCESS CONTROL. Components of a Process  Address space  Set of data structures within the kernel - process’s address space map - current status - execution.
What is a Process ? A program in execution.
QEMU, a Fast and Portable Dynamic Translator Fabrice Bellard (affiliation?) CMSC 691 talk by Charles Nicholas.
CopperDroid Logan Horton. Android - Background Android is complicated to analyse due to having 2 places to check for code execution Normally, code is.
Qin Zhao1, Joon Edward Sim2, WengFai Wong1,2 1SingaporeMIT Alliance 2Department of Computer Science National University of Singapore
Introduction to Operating Systems Concepts
Chapter 4: Threads Modified by Dr. Neerja Mhaskar for CS 3SH3.
Android Mobile Application Development
Lok Yan Manjukumar Jayachandra Mu Zhang Heng Yin
Processes and threads.
Process Management Process Concept Why only the global variables?
Protection of System Resources
Pinpointing Vulnerabilities
CASE STUDY 1: Linux and Android
Concurrency, Processes and Threads
Taint tracking Suman Jana.
Processes in Unix, Linux, and Windows
Improving java performance using Dynamic Method Migration on FPGAs
Processes in Unix, Linux, and Windows
More examples How many processes does this piece of code create?
Processes in Unix, Linux, and Windows
Mid Term review CSC345.
Lecture Topics: 11/1 General Operating System Concepts Processes
Threads and Concurrency
Lecture 6: Multiprogramming and Context Switching
Reverse engineering through full system simulations
Chapter 1 Computer System Overview
Processes in Unix, Linux, and Windows
Introduction to Virtual Machines
Processes in Unix and Windows
CS510 Operating System Foundations
Computer System Overview
Introduction to Virtual Machines
Concurrency, Processes and Threads
Chapter 11 Processor Structure and function
Dynamic Binary Translators and Instrumenters
Presentation transcript:

DroidScope: Seamlessly Reconstructing the OS and Dalvik Semantic Views for Dynamic Android Malware Analysis Lok Yan Heng Yin August 10, 2012

Android Java Components System Services Native Components Apps

Android Java Components System Services Native Components Apps

Motivation: Static Analysis Dalvik/Java Static Analysis: ded, Dexpler, soot, Woodpecker, DroidMoss Native Static Analysis: IDA, binutils, BAP

Motivation: Dynamic Analysis Android Analysis: TaintDroid, DroidRanger System Calls logcat, adb

Motivation: Dynamic Analysis External Analysis: Anubis, Ether, TEMU, …

DroidScope Overview

Goals Dynamic binary instrumentation for Android Leverage Android Emulator in SDK No changes to Android Virtual Devices External instrumentation Linux context Dalvik context Extensible: plugin-support / event-based interface Performance Partial JIT support Instrumentation optimization

Roadmap External instrumentation Linux context Dalvik context Extensible: plugin-support / event-based interface Evaluation Performance Usage

Linux Context: Identify App(s) Shadow task list pid, tid, uid, gid, euid, egid, parent pid, pgd, comm argv[0] Shadow memory map Address Space Layout Randomization (Ice Cream Sandwich) Update on fork, execve, clone, prctl and mmap2

Java/Dalvik View Dalvik virtual machine mterp Which Dalvik opcode? register machine (all on stack) 256 opcodes saved state, glue, pointed to by ARM R6, on stack in x86 mterp offset-addressing: fetch opcode then jump to (dvmAsmInstructionStart + opcode * 64) dvmAsmSisterStart for emulation overflow Which Dalvik opcode? Locate dvmAsmInstructionStart in shadow memory map Calculate opcode = (R15 - dvmAsmInstructionStart) / 64.

Just In Time (JIT) Compiler Designed to boost performance Triggered by counter - mterp is always the default Trace based Multiple basic blocks Multiple exits or chaining cells Complicates external introspection Complicates instrumentation

dvmGetCodeAddr(PC) != NULL Disabling JIT dvmGetCodeAddr(PC) != NULL

Roadmap External instrumentation Linux context Dalvik context Extensible: plugin-support / event-based interface Evaluation Performance Usage

Instrumentation Design Event based interface Execution: e.g. native and Dalvik instructions Status: updated shadow task list Query and Set, e.g. interpret and change cpu state Performance Example: Native instructions vs. Dalvik instructions Instrumentation Optimization

Dynamic Instrumentation Update PC Translate Execute inCache? yes no (un)registerCallback needFlush? flushType invalidateBlock(s) flushCache yes

Instrumentation

Dalvik Instruction Tracer (Example) 1. void opcode_callback(uint32_t opcode) { 2. printf("[%x] %s\n", GET_RPC, opcodeToStr(opcode)); 3. } 4. 5. void module_callback(int pid) { 6. if (bInitialized || (getIBase(pid) == 0)) 7. return; 8. 9. gva_t startAddr = 0, endAddr = 0xFFFFFFFF; 10. 11. addDisableJITRange(pid, startAddr, endAddr); 12. disableJITInit(getGetCodeAddrAddress(pid)); 13. addMterpOpcodesRange(pid, startAddr, endAddr); 14. dalvikMterpInit(getIBase(pid)); 15. registerDalvikInsnBeginCb(&opcode_callback); 16. bInitialized = 1; 17. } 18. 19. void _init() { 20. setTargetByName("com.andhuhu.fengyinchuanshuo"); 21. registerTargetModulesUpdatedCb(&module_callback); 22. } getModAddr(“dfk@classes.dex”, &startAddr, &endAddr);

Plugins API Tracer Native and Dalvik Instruction Tracers Taint Tracker System calls open, close, read, write, includes parameters and return values Native library calls Java API calls Java Strings converted to C Strings Native and Dalvik Instruction Tracers Taint Tracker Taints ARM instructions One bit per byte Data movement & Arithmetic instructions including barrel shifter Does not support control flow tainting

Roadmap External instrumentation Linux context Dalvik context Extensible: plugin-support / event-based interface Evaluation Performance Usage

Implementation Configuration QEMU 0.10.50 – part of Gingerbread SDK “user-eng” No changes to source Linux 2.6.29, QEMU kernel branch

Performance Evaluation Seven free benchmark Apps AnTuTu Benchmark (ABenchMark) by AnTuTu CaffeineMark by Ravi Reddy CF-Bench by Chainfire Mobile processor benchmark (Multicore) by Andrei Karpushonak Benchmark by Softweg Linpack by GreeneComputing Six tests repeated five times each Baseline NO-JIT Baseline – uses a build with JIT disabled at runtime Context Only API Tracer Dalvik Instruction Trace Taint Tracker

Select Performance Results APITracer vs. NOJIT Results are not perfect Dynamic Symbol Retrieval Overhead

Usage Evaluation Use DroidScope to analyze real world malware API Tracer Dalvik Instruction Tracer + dexdump Taint Tracker – taint IMEI/IMSI @ move_result_object after getIMEI/getIMSI Analyze included exploits Removed patches in Gingerbread Intercept system calls Native instruction tracer

Droid Kung Fu Three encrypted payloads Three execution methods ratc (Rage Against The Cage) killall (ratc wrapper) gjsvro (udev exploit) Three execution methods piped commands to a shell (default execution path) Runtime.exec() Java API (instrumented path) JNI to native library terminal emulator (instrumented path) Instrumented return values for isVersion221 and getPermission methods

Droid Kung Fu: TaintTracker

DroidDream Same payloads as DroidKungFu Two processes Normal droiddream process clears logcat droiddream:remote is malicious xor-encrypts private information before leaking Instrumented sys_connect and sys_write

Droid Dream: TaintTracker

DroidDream: crypt trace

Summary DroidScope Dynamic binary instrumentation for Android Built on Android Emulator in SDK External Introspection & Instrumentation support Four plugins API Tracer Native Instruction Tracer Dalvik Instruction Tracers TaintTracker Partial JIT support

Related Works Static Analysis Dynamic Analysis Introspection ded, Dexpler, soot Woodpecker, DroidMoss Dynamic Analysis TaintDroid DroidRanger PIN, Valgrind, DynamoRIO Anubis, TEMU, Ether, PinOS Introspection Virtuoso VMWatcher

Challenges JIT Emulation detection Full JIT support Flushing JIT cache Emulation detection Real Sensors: GPS, Microphone, etc. Bouncer Timing assumptions, timeouts, events Closed source systems, e.g. iOS

Q0. Where can I get DroidScope? Questions? Q0. Where can I get DroidScope?

ratc Vulnerability Three generation (stage) exploit setuid() fails when RLIMIT_NPROC reached adbd fails to verify setuid() success Three generation (stage) exploit Locate adbd in /proc and spawns child Child fork() processes until -11 (-EAGAIN) is returned then spawns child – continues fork() Grandchild kill() adbd and waits for process to re-spawn

ratc: exploit diagnosis

Symbol Information Native library symbols - Static From objdump of libraries Java symbols - Dynamic Dalvik data structures -> address of string Given address, load from Memory File mapped into memory dexdump as backup