AccessMiner: Using System-Centric Models for Malware Protection. Andrea Lanzi, Davide Balzarotti, Christopher Kruegel, Mihai Christodorescu and Engin Kirda.


1 AccessMiner: Using System-Centric Models for Malware Protection
Andrea Lanzi, Davide Balzarotti, Christopher Kruegel, Mihai Christodorescu and Engin Kirda
ACM CCS 2010 Oct.

2 OUTLINE
Malware Detection
System Call Data Collection
Program-Centric Models and Detection
System-Centric Models and Detection
Discussion and Conclusion


4 Malware Detection
Signature
  ◦ Static content
  ◦ Byte strings, instruction sequences
  ◦ => Evaded by code obfuscation
Behavior
  ◦ Dynamic actions
  ◦ Sequences of system calls, API functions
  ◦ A program-centric approach
  ◦ …good results?

5 Malware Detection Problem
Test cases
  ◦ Small scale
    ▪ About 10 benign applications
  ◦ Limited execution
    ▪ A few minutes, in a sandbox
  ◦ Synthetic inputs
  ◦ Single machine

6 Malware Detection Problem (cont.)
Program-centric model
  ◦ Narrow view of a single program
  ◦ Diversity of system call information
  ◦ How do benign programs interact with their environment?
  ◦ Models may be specific to a small set of benign applications only


8 System Call Data Collection
A Microsoft Windows kernel module
  ◦ Collects, anonymizes, and uploads system call logs
  ◦ Hooks the System Service Descriptor Table (SSDT)
  ◦ Mindful of system resources

9 Kernel collector
79 different system calls
  ◦ Related to files, registry, processes and threads, networking, memory
  ◦ Same subset as in Anubis

10 System Call Data
Sensitive data are replaced
  ◦ Non-system paths, user-root registry keys, IP addresses

11 System Call Data Collection
Large and diverse set of system call traces
  ◦ Ten different machines, different users
  ◦ Several weeks
  ◦ 114.5 GB of data
  ◦ 1.556 billion system calls
  ◦ 362,600 processes
  ◦ 242 applications

12 Data set
2~4 days with 2~12 hours
Production systems, development systems

13 Data Normalization
Raw data (system call logs) => accessed resources and access types
Tracking the access operations
  ◦ The set of resources open at any given time
    ▪ OS handles
  ◦ Until the resource is released (NtClose)
Execution path and file name
  ◦ NtOpenFile, NtCreateSection, NtCreateThread
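The handle tracking on this slide can be sketched roughly as follows. This is only an illustration: the event tuple layout `(pid, syscall, handle, path)`, the `normalize` function, and the choice of which calls count as "open", "read" and "write" are assumptions made for the example, not the collector's actual log format.

```python
# Hypothetical event layout: (pid, syscall, handle, path).
# The real log format of the collector is not given in the slides.
OPEN_CALLS = {"NtOpenFile", "NtCreateFile"}
ACCESS_TYPE = {"NtReadFile": "read", "NtWriteFile": "write"}


def normalize(events):
    """Turn raw system call events into (pid, resource, access type) tuples."""
    open_handles = {}  # (pid, handle) -> resource path currently open
    accesses = []
    for pid, syscall, handle, path in events:
        key = (pid, handle)
        if syscall in OPEN_CALLS:
            # Remember which resource this handle refers to.
            open_handles[key] = path
        elif syscall == "NtClose":
            # The resource is released; later uses of the handle are ignored.
            open_handles.pop(key, None)
        elif syscall in ACCESS_TYPE and key in open_handles:
            accesses.append((pid, open_handles[key], ACCESS_TYPE[syscall]))
    return accesses


events = [
    (1, "NtOpenFile", 0x10, r"C:\foo\a.txt"),
    (1, "NtWriteFile", 0x10, None),
    (1, "NtClose", 0x10, None),
    (1, "NtWriteFile", 0x10, None),  # stale handle, not recorded
]
print(normalize(events))
```

The key point is that later calls carry only a handle, so the path has to be resolved through the table of currently open handles.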


15 Analysis of System Call Data
How diverse is the collected system call data?
Focus on system call types
  ◦ Long tradition in the security community
  ◦ Most models rely upon characteristic patterns
Ignore argument values

16 Creating n-gram Models
Follow a “standard” approach
1. Extract n-gram models for a set of malware programs and a set of benign programs
2. Find all n-grams that appear in malware programs but not in benign programs
3. Hope those n-grams are characteristic of malware programs
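The three steps above can be sketched as a set difference over n-grams of system call names. The function names (`ngrams`, `malware_only_ngrams`, `is_flagged`) and the rule "flag a trace if it contains any malware-only n-gram" are assumptions for illustration; the slides do not state the exact matching rule.

```python
def ngrams(trace, n):
    """Return the set of n-grams (as tuples) in a sequence of system call names."""
    return {tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)}


def malware_only_ngrams(malware_traces, benign_traces, n):
    """Steps 1 and 2: n-grams seen in malware traces but in no benign trace."""
    malware = set().union(*(ngrams(t, n) for t in malware_traces))
    benign = set().union(*(ngrams(t, n) for t in benign_traces))
    return malware - benign


def is_flagged(trace, signatures, n):
    """Assumed detection rule: any malware-only n-gram in the trace raises an alert."""
    return not signatures.isdisjoint(ngrams(trace, n))


malware_traces = [["NtOpenFile", "NtWriteFile", "NtClose"]]
benign_traces = [["NtOpenFile", "NtReadFile", "NtClose"]]
signatures = malware_only_ngrams(malware_traces, benign_traces, 2)
print(sorted(signatures))
```

With diverse benign traces, the benign set grows and the difference shrinks, which is the effect the later slides report.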

17 Unique n-gram analysis

18 n-gram Models
10,838 malware samples from Anubis
Ten experiments (ten machines)
  ◦ Train an n-gram model on the system call traces from 9 machines and 2/3 of the malware set
  ◦ Perform detection on the remaining system call traces and the remaining 1/3 of the malware set
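One way to read this setup is as a leave-one-machine-out split, sketched below. The `experiment_splits` helper is hypothetical, and the slides do not say how the 2/3 of the malware set is chosen, so the fixed, order-based cut used here is an assumption.

```python
def experiment_splits(machines, malware):
    """Yield (train_machines, train_malware, test_machine, test_malware).

    Assumption: the malware set is cut once, by position, into 2/3 and 1/3.
    """
    cut = (2 * len(malware)) // 3
    for held_out in machines:
        train_machines = [m for m in machines if m != held_out]
        yield train_machines, malware[:cut], held_out, malware[cut:]


machines = [f"machine{i}" for i in range(1, 11)]
malware = [f"sample{i}" for i in range(9)]
for train_m, train_mal, test_m, test_mal in experiment_splits(machines, malware):
    print(test_m, len(train_m), len(train_mal), len(test_mal))
```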

19 Detection Results

20 Program-Centric Models and Detection
System call sequences invoked by benign applications are diverse
  ◦ Models have difficulty distinguishing normal from malicious behavior
A large amount of data is needed


22 System-Centric Models and Detection
Generalize how benign programs interact with the operating system
Record accesses to files and registry entries
  ◦ Read, write, execute
It is “convergence”

23 Access Activity Model
A set of labels for operating system resources
A label “L” is a set of access tokens
  ◦ {t0, t1, …, tn}
A token “t” is a pair (a, op)
  ◦ a => application
  ◦ op => type of access
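As a data structure, the definitions on this slide might look like the sketch below: a token pairs an application with a type of access, and a label is a set of tokens. The `Token` class, the `Label` alias and the `permits` helper are illustrative names, not taken from the original work.

```python
from typing import FrozenSet, NamedTuple


class Token(NamedTuple):
    """One access token: which application performed which type of access."""
    app: str  # application name
    op: str   # type of access, e.g. "read", "write", "exec"


# A label is a set of access tokens attached to a resource.
Label = FrozenSet[Token]


def permits(label: Label, app: str, op: str) -> bool:
    """True if the label contains a token for this application and access type."""
    return Token(app, op) in label


label: Label = frozenset({Token("a", "write"), Token("b", "exec")})
print(permits(label, "a", "write"), permits(label, "a", "exec"))
```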

24 Initial Access Activity Model (1)
Use system call traces of all benign processes
A virtual file system tree
  ◦ Application “a”: C:\foo\a.txt (write)
  ◦ Application “b”: C:\foo\bar\b.rar (exec)
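A minimal sketch of such a virtual file system tree, using the two example accesses from this slide. The `Node` class and `add_access` function are hypothetical names, and storing the label as a plain set of (application, access type) pairs on the accessed node is an assumption about the representation.

```python
class Node:
    """One path component in the virtual file system tree."""

    def __init__(self):
        self.children = {}  # path component -> Node
        self.label = set()  # set of (application, access type) tokens


def add_access(root, app, path, op):
    """Insert one observed access, creating intermediate folders as needed."""
    node = root
    for part in (p for p in path.split("\\") if p):
        node = node.children.setdefault(part, Node())
    node.label.add((app, op))


root = Node()
add_access(root, "a", r"C:\foo\a.txt", "write")
add_access(root, "b", r"C:\foo\bar\b.rar", "exec")

foo = root.children["C:"].children["foo"]
print(sorted(foo.children))          # ['a.txt', 'bar']
print(foo.children["a.txt"].label)   # {('a', 'write')}
```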

25 Model Pre-processing (2)
Remove some elements in the tree
  ◦ Microsoft Windows services
  ◦ Desktop indexing programs
  ◦ Anti-virus software
Identify applications that start processes with different names
  ◦ C:\Windows\system32 => win_core

26 Model Generalization (3)
Propagated Container
  ◦ All children are private (without *)
  ◦ C:\Program Files
Merged =>

27 System-Centric Model Detection
For any op
  ◦ Find the longest prefix P shared between the path to the resource and the folders in the virtual tree stored by our model
Ten experiments
  ◦ File system access activity model
    ▪ About 100 labels
  ◦ Registry access activity model
    ▪ About 3000 labels
  ◦ Full access activity model
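The longest-prefix lookup can be sketched as below. Several things here are assumptions for the sake of a runnable example: the model is flattened into a dictionary from folder paths (as tuples of components) to labels, `"*"` stands for "any application", an access to a resource with no matching prefix is not reported, and path comparison is case-sensitive (real Windows paths are not).

```python
def longest_prefix(model, path):
    """Return the longest folder prefix of `path` that is stored in the model."""
    parts = tuple(p for p in path.split("\\") if p)
    for end in range(len(parts), 0, -1):
        if parts[:end] in model:
            return parts[:end]
    return None


def is_violation(model, app, path, op):
    """Check an access (app, op) on `path` against the label of the longest prefix."""
    prefix = longest_prefix(model, path)
    if prefix is None:
        return False  # assumption: resources outside the model are not judged
    label = model[prefix]
    return (app, op) not in label and ("*", op) not in label


# Hypothetical model: folder path components -> set of (application, op) tokens.
model = {
    ("C:", "foo"): {("a", "write")},
    ("C:", "Program Files"): {("*", "read")},
}
print(is_violation(model, "a", r"C:\foo\new.txt", "write"))  # False
print(is_violation(model, "b", r"C:\foo\new.txt", "write"))  # True
```

Matching on the longest stored prefix lets a label on a folder cover files beneath it that never appeared in training.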

28 Detection Results (Files)
// Looks sobering
Many malware samples don’t work (!)
  ◦ 10,838 -> 7,847
Use only write operations
  ◦ Our own logging component
  ◦ Software updates

29 Detection Results (Regs)
HKEY_USER\Software\Microsoft
  ◦ Need a larger training set


31 Discussion and Conclusion
Full access activity model
  ◦ 91% detection / 0% false positives
System-centric approach
Policy violations occurred only for a few specific classes of programs
Network limitation
MAC policy
  ◦ SELinux

