Research in Next-Generation Digital Forensics Golden G. Richard III, Ph.D. Associate Professor Dept. of Computer Science

1 Research in Next-Generation Digital Forensics Golden G. Richard III, Ph.D. Associate Professor Dept. of Computer Science

2 Digital Forensics Research Group Fall 2006: –Thursdays @ 1pm in NSSAL (Math 322) Primary Collaborators: –Vassil Roussev [UNO CS] –Vico Marziale [UNO Ph.D. student] –Frank Adelstein [ATC-NY]

3 Digital Forensics Definition: “Tools and techniques to recover, preserve, and examine digital evidence on or transmitted by digital devices.” Devices include computers, PDAs, cellular phones, videogame consoles, copy machines, printers, …

4 Examples of Digital Evidence Threatening emails Documents (e.g., in places they shouldn’t be) Suicide notes Bomb-making diagrams Malicious Software –Viruses –Worms –… Child pornography (contraband) Evidence that network connections were made between machines Cell phone SMS messages

5 Facts (or: Why Digital Forensics?) Deleted files aren’t securely deleted –Recover deleted file + when it was deleted! Renaming files to avoid detection is pointless Formatting disks doesn’t delete much data Web-based email can be (partially) recovered directly from a computer Files transferred over a network can be reassembled and used as evidence

6 Facts (2) Uninstalling applications is much more difficult than it might appear… “Volatile” data hangs around for a long time (even across reboots) Remnants from previously executed applications Using encryption properly is difficult, because data isn’t useful unless decrypted Anti-forensics (privacy-enhancing) software is mostly broken “Big” magnets (generally) don’t work Media mutilation (except in the extreme) doesn’t work Basic enabler: Data is very hard to kill

7 Privacy Through Media Mutilation degausser or forensically-secure file deletion software (but make sure it works!) or

8 Digital Forensics Process Identification of potential digital evidence –Where might the evidence be? –Which devices did the suspect use? Preservation and copying of evidence –On the crime scene… –First, stabilize evidence…prevent loss and contamination –If possible, make identical copies of evidence for examination Careful examination of evidence Presentation –“The FAT was fubared, but using a hex editor I changed the first byte of directory entry 13 from 0xEF to 0x08 to restore ‘HITLIST.DOC’…” –“The suspect attempted to hide the Microsoft Word document ‘HITLIST.DOC’ but I was able to recover it without tampering with the file contents.” Legal: Balance of need to investigate vs. privacy

9 “Traditional” Digital Forensics Pull the plug “Image” (make bit-perfect copies) of hard drives, floppies, USB keys, etc. Use forensics software to analyze copies of drives Investigator typically uses a single computer to perform investigation in the lab Present results to client, to officer-in-charge, court
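One common way to check that an image really is a bit-perfect copy is to compare cryptographic hashes of the original and the copy. The sketch below is only an illustration of that idea, not a tool from the talk; the function names and the choice of SHA-256 are assumptions.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so a large image need not fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"" (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def is_bit_perfect_copy(original_path, copy_path):
    """True when both files hash to the same SHA-256 value."""
    return sha256_of(original_path) == sha256_of(copy_path)
```

In practice the hash of the original is recorded once at acquisition time, so later copies can be checked without touching the original media again.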

10 Traditional: Where’s the evidence? Undeleted files, expect some names to be incorrect Deleted files Windows registry Print spool files Hibernation files Temp files (all those .TMP files!) Slack space Swap files Browser caches Alternate or “hidden” partitions On a variety of removable media (floppies, ZIP, Jaz, tapes, …)

11 But Evidence is Also… In RAM “In” the network On mission-critical machines –Can’t turn off without severe disruption –Can’t turn them ALL off just to see! On huge storage devices –1TB server: image entire machine and drag it back to the lab to see if it’s interesting? –10TB?

12 Next Generation: Needs Broad: –Better design, better software Yes, some of it is engineering (and hacking) Someone has to do it –Better vision, application of ‘real’ CS to problems More specific: –Need for speed –Machine correlation –Machine profiling –Better auditing of investigative process –On-the-spot forensics: Triage –Live forensics –Network forensics –Specific tools for detection and remediation of malware –Phishing investigation –…

13 Next Generation: UNO Better file carving Forensic-aware OS components In-place file carving Forensic accountability On-the-spot forensics Distributed digital forensics

14 File Carving: Basic Idea one cluster one sector header, e.g., 0x474946383761 (GIF, “GIF87a”) unrelated disk blocks interesting file footer, e.g., 0x003B (GIF) “milestones” or “anti-milestones”
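A minimal sketch of the header/footer idea, assuming the whole image is available as a bytes object and files are not fragmented. It uses the GIF87a signature bytes 47 49 46 38 37 61 and the 00 3B trailer; the function name `carve` and the `max_size` cutoff are illustrative assumptions, not Scalpel's actual interface.

```python
GIF_HEADER = bytes.fromhex("474946383761")  # ASCII "GIF87a"
GIF_FOOTER = bytes.fromhex("003B")          # block terminator + GIF trailer

def carve(data, header, footer, max_size):
    """Return (start, end) byte ranges that begin with `header` and end
    with the first `footer` found within `max_size` bytes of the start."""
    results = []
    pos = data.find(header)
    while pos != -1:
        window_end = min(len(data), pos + max_size)
        foot = data.find(footer, pos + len(header), window_end)
        if foot != -1:
            end = foot + len(footer)
            results.append((pos, end))
            pos = data.find(header, end)       # continue after this file
        else:
            pos = data.find(header, pos + 1)   # no footer in range: skip header
    return results
```

The `max_size` cutoff matters because a header with no footer in range would otherwise swallow everything up to the next unrelated footer.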

15 File Carving: Fragmentation header, e.g., 0x474946383761 (GIF, “GIF87a”) footer, e.g., 0x003B (GIF) “milestones” or “anti-milestones”

16 File Carving: Fragmentation header, e.g., 0x474946383761 (GIF, “GIF87a”) footer, e.g., 0x003B (GIF)

17 File Carving: Damaged Files header, e.g., 0x474946383761 (GIF, “GIF87a”) “milestones” or “anti-milestones” No footer

18 File Carving: Doing a Better Job Better design Faster Distributed implementation More flexible description of file types Automatic generation of type descriptions –Patterns –Rule sets Multiple-pass carving –Carve, “remove” validated files from block list, re-carve, hope that some fragmented files coalesce Block-sniffing

19 File Carving: Block Sniffing header, e.g., 0x474946383761 (GIF, “GIF87a”) Do these blocks “smell” right? N-gram analysis entropy tests parsing
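As one illustration of an entropy test, the sketch below computes the Shannon entropy of a block in bits per byte. Compressed or encrypted data tends to score near 8, while text and sparse structures score much lower. How a carver would turn this score into a decision is not specified in the slides; the function name is an assumption.

```python
import math
from collections import Counter

def shannon_entropy(block):
    """Entropy in bits per byte: 0.0 for constant data, 8.0 when all
    256 byte values are equally frequent."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)  # iterating over bytes yields ints 0..255
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A block whose entropy is far from what the candidate file type usually produces is a hint that it may not belong to the file being carved.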

20 Better Software: File Carving: Scalpel Two-pass design Minimizes: –Reads –Seeks –Writes –Data copying –Memory usage Doesn’t yet incorporate all of the carving wizardry we have in mind G. G. Richard III, V. Roussev, "Scalpel: A Frugal, High Performance File Carver," Proceedings of the 2005 Digital Forensics Research Workshop (DFRWS 2005), New Orleans, LA.

21 Some Scalpel Results (1) Big targets, large carve sizes, huge improvement (over 5 hours faster) T read + 238,270,750,000 bytes

22 Some Scalpel Results (2) Big targets, large carve sizes, huge improvement (over 7 hours faster) T read + 117,622,357,936 bytes

23 OS Support for Digital Forensics Export raw disk devices across network for processing –Others: network block device (NBD) –Us: optimization “In-place” file carving –Us: Export results from file carving as a filesystem, w/ minimal extra storage Better auditing of investigative process –Us: “digital evidence bag”-aware filesystems

24 FUSE (Filesystem in User Space) [Diagram] User space: dd if=/evidence/DEC/img.dd of=copy.dd calls read() through the C library; the Filesystem Implementation sits on top of the FUSE library and the C library. Kernel space: the Linux Virtual File System Interface (VFS) dispatches to FUSE, ext3, or reiserFS.

25 In-Place File Carving preview database FUSE scalpel_fs client applications nbd server nbd client network local drive remote drive G. G. Richard III, V. Roussev, V. Marziale, “In-Place File Carving,” submitted to the Third Annual IFIP WG 11.9 International Conference on Digital Forensics, 2007. Scalpel

26 Better Auditing Want: Digital Evidence Bags See: P. Turner, “Unification of Digital Evidence from Disparate Sources (Digital Evidence Bags),” DFRWS 2005 See: Common Digital Evidence Storage Format (CDESF) working group,

27 Better Auditing (2) DEC (DEB, AFF, Gfzip …) FDAM ddscalpel FTK … VFS Interface TSK Evidence Data Audit Log Import/ Export Applications (User space) (Kernel) Operating System Block-level Data Access Filesystem Data Access FDAM Block Device G. G. Richard III, V. Roussev, "Toward Secure, Audited Processing of Digital Evidence: Filesystem Support for Digital Evidence Bags," Research Advances in Digital Forensics, Springer, 2006. Digital Evidence Container

28 Bluepipe: On the Spot Digital Forensics Y. Gao, G. G. Richard III, V. Roussev, “Bluepipe: An Architecture for On-the-Spot Digital Forensics,” International Journal of Digital Evidence (IJDE), 3(1), 2004. Bluepipe Patterns


30 Distributed Digital Forensics V. Roussev, G. G. Richard III, "Breaking the Performance Wall: The Case for Distributed Digital Forensics," Proceedings of the 2004 Digital Forensics Research Workshop (DFRWS 2004), Baltimore, MD 750GB 300GB

31 Distributed Digital Forensics Scalable –Want to support at least IMAGE_SIZE / RAM_PER_NODE nodes Platform independent –Want to be able to incorporate any (reasonable) machine that’s available Lightweight –Horsepower is for forensics, not the framework; less fat Highly interactive Extensible –Allow incorporation of existing sequential tools –e.g., stegdetect, image thumbnailing, file classification, hashing, … Robust –Must handle failed nodes smoothly
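The scalability target is the number of nodes needed to hold the whole image in memory across the cluster. A small sketch of that arithmetic follows, assuming (optimistically) that all of each node's RAM is available for caching; the function name and the example sizes are hypothetical.

```python
def min_nodes(image_size_bytes, ram_per_node_bytes):
    """Smallest node count whose combined RAM can hold the whole image.
    Uses integer ceiling division to avoid float rounding on large sizes."""
    if ram_per_node_bytes <= 0:
        raise ValueError("ram_per_node_bytes must be positive")
    return -(-image_size_bytes // ram_per_node_bytes)

GIB = 2 ** 30
# Hypothetical example: a 6 GiB image with 1 GiB usable per node.
print(min_nodes(6 * GIB, 1 * GIB))
```

Real deployments would need more nodes than this lower bound, since the operating system and the framework itself also consume memory.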

32 Distributed Digital Forensics (2)

33 Distributed Digital Forensics (3)

34 Beowulf [RIP], Slayer of Computer Criminals…

35 DDF: Results (1) Live string search: “Vassil Roussev” Regular expression search: v[a-z]*i[a-z]*a[a-z]*g[a-z]*r[a-z]*a
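To make the regular expression concrete: it matches any run of lowercase letters containing v, i, a, g, r, a in that order. The sketch below applies it to a list of byte blocks, roughly as a node might search its cached portion of an image. The block layout and the function name are assumptions, and matches that straddle a block boundary would be missed by this simple version.

```python
import re

# Same pattern as on the slide, compiled for bytes input.
PATTERN = re.compile(rb"v[a-z]*i[a-z]*a[a-z]*g[a-z]*r[a-z]*a")

def search_blocks(blocks):
    """Yield (block_index, offset, matched_bytes) for every match."""
    for index, block in enumerate(blocks):
        for match in PATTERN.finditer(block):
            yield index, match.start(), match.group()

blocks = [b"buy viagra now", b"nothing here", b"xx vxixaxgxrxa"]
for hit in search_blocks(blocks):
    print(hit)
```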

36 DDF: Results (2) Stego detection using Stegdetect 0.5 under RH9 Linux on the cluster Traditional: –6GB image mounted using loopback device –find /mnt/loop -exec ./stegdetect '{}' \; –790 seconds == 13:10 minutes Using the distributed framework –Stegdetect 0.5 code incorporated into framework –Detection against cached files –“STEGO” command (after IMAGE/CACHE) –82 seconds == 1:22 minutes 9.6X faster with 8 machines CPU bound operation
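The reported speedup can be reproduced from the two timings. The sketch below also computes parallel efficiency (speedup divided by machine count); a value above 1 means the result is superlinear. One plausible reading, given that detection ran against files already cached in RAM, is that the distributed run avoided disk I/O that the traditional run had to pay for, but the slides do not state this explicitly.

```python
def speedup(sequential_seconds, parallel_seconds):
    """Ratio of single-machine time to distributed time."""
    return sequential_seconds / parallel_seconds

machines = 8
s = speedup(790, 82)          # timings from the slide
efficiency = s / machines     # greater than 1.0 means superlinear

print(f"speedup: {s:.1f}x, efficiency: {efficiency:.2f}")
```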

37 DDF: To Do List User interface! (unless you love Putty)

38 DDF: To Do (2) Case persistence Secure support for overlapping cases Better fault tolerance Intelligent caching schemes to support larger images Collaboration with colleagues (you?) working in: –Image analysis/classification –Speech recognition –More stego –Other CPU horsepower-intensive, forensics-applicable stuff –We provide cycles…you provide…

39 Current: Live Forensics Physical memory dumps –Hard to do when adversarial OS is present –Via USB hacking? –Firewire proof of concept developed by Maximillian Dornseif Defeating process hiding techniques, e.g., FU “rootkit” –Check OS components from many angles Remnants of applications (executed) past… –e.g., instant messenger fragments –e.g., recent invocations of process hiding –e.g., fingerprints of recently executed (or executing) malware

40 Conclusion: Lots of Work To Do Benevolent hacking (engineering) meets science Desperately need methods for pipelining investigative process Live forensics critically important –volatile computing –whole disk encryption –hardware-based whole disk encryption! –nasty malware

41 Conclusion (2) Arguably, almost any field in CS can collaborate –All media handling needs work –Algorithms for dealing with huge, partially-organized datasets –Attribution –Correlation –Profiling –Document similarity measures –Databases –High-performance computing –OS Internals

42 Random Bedside Reading… (Digital Forensics Research Workshop) (International Journal of Digital Evidence)
F. Adelstein, “Live Forensics: Diagnosing Your System Without Killing it First,” Communications of the ACM, February 2006.
M. A. Caloyannides, Privacy Protection and Computer Forensics, Second Edition, 2004.
B. Carrier, File System Forensic Analysis, Addison-Wesley, 2005.
B. Carrier, “Risks of Live Digital Forensics Analysis,” Communications of the ACM, February 2006.
E. Casey, Digital Evidence and Computer Crime, Academic Press, 2004.
J. Chow, B. Pfaff, T. Garfinkel, M. Rosenblum, “Shredding Your Garbage: Reducing Data Lifetime Through Secure Deallocation,” 14th USENIX Security Symposium, 2005.
M. Geiger, “Evaluating Commercial Counter-Forensic Tools,” 5th Annual Digital Forensic Research Workshop (DFRWS 2005), New Orleans, 2005.
G. G. Richard III, V. Roussev, “Next Generation Digital Forensics,” Communications of the ACM, February 2006.
G. G. Richard III, V. Roussev, “Digital Forensics Tools: The Next Generation,” invited chapter in Digital Crime and Forensic Science in Cyberspace, IDEA Group Publishing, 2005.
A. Schuster, “Searching for Processes and Threads in Microsoft Windows Memory Dumps,” 6th Annual Digital Forensic Research Workshop (DFRWS 2006), West Lafayette, IN, 2006.
S. Sparks, J. Butler, “Raising the Bar for Windows Rootkit Detection,” Phrack Issue # 63.
G. Hoglund, J. Butler, Rootkits: Subverting the Windows Kernel, Addison-Wesley, 2005.

43 Presentation available: Security Lab (NSSAL): Math 322 ?
