CTF Forensics Part 0x01 MHgwNjB4MEEweDBEMHgxNjB4MTgweDIyMHgyODB4MkYweDMyMHgzNjB4MzkweDNFMHg0MTB4NDQ=


1 CTF Forensics Part 0x01 MHgwNjB4MEEweDBEMHgxNjB4MTgweDIyMHgyODB4MkYweDMyMHgzNjB4MzkweDNFMHg0MTB4NDQ=

2 In this lecture: The forensics mindset: what to expect from CTF forensics. Using the command line. Binary, hex, and other encoding schemes. Files & metadata. File analysis tools: strings, xxd, exiftool, diff, hex editors, Audacity. File systems & deleted files. File system analysis tools: Sleuth Kit.

3 Next Lecture: Review Forensics I homework. File carving. File carving tools: Binwalk. Networking & the OSI stack. Networking tools: Wireshark. Memory forensics. Memory forensics tools: Volatility.

4 Forensics “You can have data without information, but you cannot have information without data.” -Daniel Keys Moran

5 CTF Forensics - Definitions
What is Digital Forensics? Forensics can be considered an art and a science. The science of forensics combined with the art of investigation. Applying scientific method and deductive reasoning to data is the science, and interpreting the data to reconstruct events is an art. Goals of Digital Forensics: Find facts From these facts recreate the truth of an event

6 CTF Forensics - Why Forensics?
In real life it’s about catching bad guys and doing disaster recovery. All computer activity produces data. Bad guys destroy or obfuscate data to hide their activity. Understanding forensics will teach you some fundamentals that will help you catch the people doing the stuff you will learn later in this course. If you go into pentesting, understanding forensics will help you not get caught. Real life applications Catch criminals dealing in illegal files. Reconstruct a criminal’s digital movements and communications. Protect intellectual property / enforce NDAs Fight cybercriminals & APTs Incident Response Computer Crime Investigations

7 CTF Forensics - Typical Challenges
Information hidden inside file contents or metadata. Recovering lost or deleted information or files. Identifying covert communications channels. Reconstructing lost information. Using recovered information to bypass security controls. Basically some hacker is trolling you with obsessive obfuscation.

8 CTF Forensics - Example Problems
Find flag hidden in image metadata Find hidden file inside an MS Office document Repair broken zip archive Reconstruct a file from TCP stream Recover a deleted file from a file system image Find flag steganographically hidden in an image or audio file Recover logon credentials from client crash dump and log in to account. Detect covert communication channel in network traffic capture. Carve hidden file out of a larger file.

9 CTF Forensics - The Mindset
Digital data is just a giant blob of binary bits. Meaning is generated because there exists some external structure which tells both humans and computers how to parse the giant pile of zeros and ones. Things like files, file systems, protocols, etc. are external abstractions we impose on raw binary data. Forensics will become easier once you realize binary information exists independent from the abstractions with which we are familiar. Forensics challenges will require spotting deviations from our abstractions, means by which data was transformed from one abstraction to another, or identifying new abstractions someone has layered over the data.

10 CTF Forensics - The Mindset
Forensics takes time! You will probably be doing research. You will probably have to teach yourself a new file format, or tool, or library. Start Early!

11 How Do I Linux? “The Linux philosophy is 'Laugh in the face of danger'. Oops. Wrong One. 'Do it yourself'. Yes, that's it.” - Linus Torvalds

12 Linux Command Line From here on out most of your work is going to be done out of the linux terminal using command line tools. If you are not comfortable with it already you hopefully will be by the end of this course. Non technical people will think you are literally a wizard if you use a command line interface.

13 Linux Basics: Terminal and Shell
Terminal: There are many different kinds of terminals. By default on Ubuntu you are probably using gnome-terminal, the GUI program that runs the shell you are using. If you are using gnome-terminal, here are some useful commands and suggestions. You can create tabs within the terminal using ctrl-shift-T, and you can select a tab using alt-num, where num is the number of the tab you wish to switch to. Do yourself a favor and go into the terminal settings to change the colors and transparency of the terminal to something that won't hurt your eyes.
Shell: There are many different kinds of shells. The shell is the program that interprets your input and provides basic commands. By default you are probably using GNU Bash. This gives you things like tab completion, and by default the prompt shows your username, computer name, and current directory.

14 Linux Basics: Shell special chars & >> < | && ;
& Run the preceding command as a background process
>> Output from the preceding command is appended to the specified file (a single > overwrites the file instead)
< The file on the right is used as input to the command preceding this character (the double form << starts a here-document rather than reading a file)
| The pipe operator takes the output from the preceding command and uses it as input to the next command
&& The AND operator executes the following command only if the preceding command succeeds
; Execute the preceding command, then execute the next command regardless of the result

15 Linux Commands Refresher
cd - Changes directories. “cd ..” goes up a level, “cd ~” goes to your home directory, “cd /” goes to root
ls - Lists directory contents. “ls -l” provides a verbose listing
pwd - Prints the directory you are in
mkdir - Makes a new directory
cp - Copies a file. “cp -r” recursively copies a directory and all items inside
mv - Renames or moves a file or directory
rm - Deletes a file. “rm -r” deletes a directory and all items inside (required to delete non-empty directories)
cat - Dumps the contents of a file to the screen. Longer files can be viewed a page at a time with less or more
man - Gets command usage information (sometimes you can/must use info instead)

16 Binary, Hex, and other Encodings
“There are 10 kinds of people in the world: those who understand binary numerals, and those who don't.”

17 Hex Basics
Hexadecimal is the base 16 number system.
Binary to Hex:
echo "obase=16;ibase=2; <binary digits>" | bc
python -c 'print(hex(int("<binary digits>", 2)))'
Hex to Binary:
echo "obase=2;ibase=16; FFFF" | bc
echo "FFFF" | xxd -r -p | xxd -b
python -c 'print(bin(0xFF))'
Hex to Decimal:
echo $((16#FF))
python -c 'print(int("FF", 16))'
Number  Binary  Hexadecimal
0       0000    0
1       0001    1
2       0010    2
3       0011    3
4       0100    4
5       0101    5
6       0110    6
7       0111    7
8       1000    8
9       1001    9
10      1010    A
11      1011    B
12      1100    C
13      1101    D
14      1110    E
15      1111    F
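The same conversions can be sketched as small Python 3 helpers. The function names below are illustrative, not part of any tool mentioned in the slides.

```python
def bin_to_hex(bits: str) -> str:
    """Convert a string of binary digits to uppercase hex (no 0x prefix)."""
    return format(int(bits, 2), "X")


def hex_to_bin(hex_digits: str) -> str:
    """Convert hex digits to binary, padded to 4 bits per hex digit."""
    width = 4 * len(hex_digits)
    return format(int(hex_digits, 16), f"0{width}b")


def hex_to_dec(hex_digits: str) -> int:
    """Convert hex digits to a decimal integer."""
    return int(hex_digits, 16)


if __name__ == "__main__":
    print(bin_to_hex("11111111"))
    print(hex_to_bin("0F"))
    print(hex_to_dec("FF"))
```

Padding to four bits per hex digit keeps the output aligned with the table above, where each hex digit corresponds to exactly one 4-bit group.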

18 Hex Basics Hex can represent many things including text encodings
You will need to know how to look up the character that a hex value encodes. There are many different encodings, but a lot of the time you will be dealing with ASCII (a 1 byte encoding using 0x00-0x7f). man ascii (table of ASCII conversions). Online ASCII conversions. python -c 'print(chr(0x45))' cat binary_file | xxd
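As a rough sketch of the lookup described here, the helper below turns a hex dump fragment into ASCII, printing a dot for anything outside the printable range (similar to the right-hand column of xxd output). The name hex_to_ascii is made up for this example.

```python
def hex_to_ascii(hex_string: str) -> str:
    """Decode hex digits to text, replacing non-printable bytes with '.'."""
    raw = bytes.fromhex(hex_string)  # whitespace between byte pairs is allowed
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in raw)


if __name__ == "__main__":
    print(hex_to_ascii("66 6c 61 67"))
```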

19 Hex Editors xxd (CLI) - creates a hex dump of a given file or standard input. It can also convert a hex dump back to its original binary form. hexedit (CLI) - shows a file both in ASCII and in hexadecimal. The file can be a device as the file is read a piece at a time. You can modify the file and search through it. ghex (GUI) - allows the user to load data from any file, view and edit it in either hex or ascii. Any Scripting language and usually text editors

20 Base64 Encoding Binary-to-text encoding scheme that represents binary data as an ASCII string by translating it into a radix-64 representation. That is, it encodes arbitrary binary data into ASCII text. Each base64 digit represents exactly 6 bits of data. Base64 encoding allows protocol agnostic streaming of binary data without worrying that any particular protocol will incorrectly parse a control character (\r, \n, etc.) in the binary stream. The standard alphabet is A-Z, a-z, 0-9, +, and /, with = used for padding (the URL-safe variant uses - and _ in place of + and /). Hint: If you see a long alphanumeric string ending with equals sign(s), think base64.
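A minimal sketch of base64 encoding and decoding with Python's standard base64 module. The sample input is the string that follows "Part 0x01" in this deck's title, which fits the hint above (long alphanumeric text ending in an equals sign). The helper names are illustrative.

```python
import base64

# The string that follows "Part 0x01" in the title of this deck.
SLIDE_TITLE_SUFFIX = (
    "MHgwNjB4MEEweDBEMHgxNjB4MTgweDIyMHgyODB4MkYweDMyMHgzNjB4"
    "MzkweDNFMHg0MTB4NDQ="
)


def b64_decode_text(encoded: str) -> str:
    """Decode base64 and interpret the resulting bytes as ASCII text."""
    return base64.b64decode(encoded).decode("ascii")


def b64_encode_text(text: str) -> str:
    """Encode ASCII text as base64."""
    return base64.b64encode(text.encode("ascii")).decode("ascii")


decoded_title = b64_decode_text(SLIDE_TITLE_SUFFIX)

if __name__ == "__main__":
    print(decoded_title)
```

Decoding to ASCII only works when the hidden payload is text. For arbitrary binary payloads, keep the bytes returned by base64.b64decode and inspect them with xxd or the file command instead.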

21 XOR Encoding Bitwise operation that takes two input bitstrings and performs a logical exclusive OR operation on each pair of corresponding bits. The result in each position is 1 if and only if exactly one input bit is 1. That is, it outputs one when the input bits are different. A very commonly used way to obfuscate data. Why? XOR is a fast operation. XOR is its own inverse. Message XOR Key = Ciphertext Ciphertext XOR Key = Message
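The self-inverse property can be illustrated with a short repeating-key XOR helper. This is a sketch; xor_bytes is a made-up name, and repeating the key across the data is an assumption about how the key is applied.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


if __name__ == "__main__":
    message = b"flag{xor_is_its_own_inverse}"
    key = b"\x22\x1b"
    ciphertext = xor_bytes(message, key)
    # Applying the same key a second time restores the message.
    print(xor_bytes(ciphertext, key))
```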

22 Hashing - Overview A hash function is any function that can be used to map data of arbitrary size to data of fixed size. A cryptographic hash function allows one to easily verify that some given input hashes to a given value, but the input cannot be easily reconstructed if given only the hash value. Used for integrity checks and certain cryptographic purposes. CTF problems typically involve hash functions that are either: Cryptographically broken Secure but have readily available lookup tables for common inputs (rainbow tables) Many of these tables come from leaked password databases

23 Hashing - Functions Common Hash Functions:
MD5: No longer cryptographically secure. Output typically represented as 32 hexadecimal digits. Ex: 5f4dcc3b5aa765d61d8327deb882cf99, 2ab96390c7dbe3439de74d0c9b0b1767 SHA1: SHA is a family of cryptographic hash functions. SHA-1 is no longer considered cryptographically secure. Output typically represented as 40 hexadecimal digits. Ex: 5baa61e4c9b93f3f b6cf8331b7ee68fd8, f3bbbd66a63d4bf ec3d e21d One of many easily googleable hash crackers:
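A lookup table attack can be sketched with Python's hashlib: hash each candidate word and compare against the target. The word list here is a tiny stand-in for a real leaked-password list, and the helper names are illustrative.

```python
import hashlib
from typing import Iterable, Optional


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest as 32 hexadecimal digits."""
    return hashlib.md5(data).hexdigest()


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest as 40 hexadecimal digits."""
    return hashlib.sha1(data).hexdigest()


def lookup_md5(target_hash: str, candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate whose MD5 matches target_hash, else None."""
    for word in candidates:
        if md5_hex(word.encode("utf-8")) == target_hash.lower():
            return word
    return None


if __name__ == "__main__":
    # First MD5 example from the slide, checked against a toy word list.
    words = ["letmein", "123456", "password"]
    print(lookup_md5("5f4dcc3b5aa765d61d8327deb882cf99", words))
```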

24 Morse Code Method of transmitting text information as a series of on-off tones, lights, or clicks. Notice this is a binary medium, only two possible data units. Morse code is a popular choice for encoding information as any signal with only two values can be used. Black and white pixels Two different audio frequencies Variable length transmission
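A small decoder sketch for text-form Morse, assuming letters are separated by spaces and words by a slash. That separator convention is a common one but is an assumption here; challenge authors may use others.

```python
MORSE_TO_LETTER = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E",
    "..-.": "F", "--.": "G", "....": "H", "..": "I", ".---": "J",
    "-.-": "K", ".-..": "L", "--": "M", "-.": "N", "---": "O",
    ".--.": "P", "--.-": "Q", ".-.": "R", "...": "S", "-": "T",
    "..-": "U", "...-": "V", ".--": "W", "-..-": "X", "-.--": "Y",
    "--..": "Z",
}


def decode_morse(message: str) -> str:
    """Decode Morse text; unknown symbols become '?'."""
    words = []
    for word in message.split("/"):
        letters = [MORSE_TO_LETTER.get(symbol, "?") for symbol in word.split()]
        words.append("".join(letters))
    return " ".join(words)


if __name__ == "__main__":
    print(decode_morse("..-. .-.. .- --."))
```

When the signal arrives as pixels or audio tones rather than dots and dashes, the first step is mapping the two observed values (and their durations) onto this symbol alphabet.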

25 QR Codes QR Codes are a type of two dimensional barcode.
Machine readable optical label containing information. Various formats, some more obscure than others. Occasionally used to hide flags. For examples:

26 Files & File Metadata
"Relax, it’s only ONES and ZEROS !"

27 File Analysis Basics Rationale - Often times you will need to find hidden information in files or you may not know about a certain file type. Additionally, a forensics tool may not support an image that you have been tasked to extract data from. This is where you will need file analysis techniques to figure out what type of file/image you are examining and develop a tool to parse the file

28 File Analysis Basics Files are just a huge blob of binary data. The data’s meaning comes from some abstract structure we impose over the raw bits. The file type tells us the structure to use in interpreting the binary information. File type is indicated by the magic number, a hex string(s) at a specific offset(s). See: File type is one form of metadata: data about data. Metadata is a common source of artifacts.
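The magic number check can be sketched as a comparison of a file's leading bytes against a table of known signatures. The table below covers only a handful of common formats and is nowhere near as complete as the database behind the file command. The function names are illustrative.

```python
# A few well-known signatures found at offset 0 of a file.
SIGNATURES = {
    b"\xff\xd8\xff": "JPEG image",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"PK\x03\x04": "ZIP archive (also .docx, .xlsx, .jar)",
    b"%PDF": "PDF document",
    b"GIF8": "GIF image",
}


def identify(header: bytes) -> str:
    """Return a description for the first matching signature, else 'unknown'."""
    for magic, description in SIGNATURES.items():
        if header.startswith(magic):
            return description
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first 16 bytes of a file and identify it by magic number."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))


if __name__ == "__main__":
    print(identify(b"\xff\xd8\xff\xe0\x00\x10JFIF\x00"))
```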

29 File Analysis Basics
Basic tools
file command on Linux - Searches for magic numbers in the file and also basic header information
strings (Linux or Windows) - Gives you the ASCII strings contained in a data blob, which may help identify what type of data you are examining. Note that strings looks for 7-bit ASCII by default; use the -e option to select other character sizes and endianness (for example 16-bit big-endian or little-endian).
xxd - Sometimes looking at the hexdump will give you clues as well
File extensions - More relevant on Windows but can still be used on Linux. They cannot necessarily be trusted
Exiftool - Analyzes exif metadata for image types such as jpeg
File carvers - Binwalk; we will go over these more in the future

30 File Analysis Basics
More basic tools
diff - Compare two text files and print all the differences
Audacity - Open source tool for analyzing audio files
Specific file metadata tools - Once you know the type of image or file you are dealing with, there are often existing tools that can help you interpret the metadata.
Archive tools - You will often receive files compressed using a variety of formats, from common to obscure.

31 File Analysis Basics Sometimes the tools mentioned previously leave you with no information or false information. This is when Google becomes helpful. You may still be able to search for strings or hex sequences on Google that will give you valuable information File analysis can be difficult and can sometimes waste your time.

32 File Analysis Examples
“I hope you have the worst headache of your life, then you will begin to understand” ~unknown

33 Files - example1.png The file is named example1.png, but it won’t open. When we run the file command it will tell us that this is a jpg. We can manually verify with xxd. Notice the ASCII string “JFIF” near the beginning. This identifier, which follows the FF D8 start-of-image bytes, marks the JPG (JFIF) format. Let’s fix it by renaming it as a *.jpg file.

34 Files - example2.jpg In the last slide when we ran xxd we saw other readable ASCII strings after the magic number. This is metadata about the image stored in the jpg file. Many cameras and image processors automatically include a great deal of metadata. This can include camera type, geolocation data, name, copyright, MAC data, etc. Easy place to hide a flag. Let’s run exiftool on example2.jpg In the “Artist” entry we see “flag{ b2ab299cc8bbee5a46af01b3}” That’s a 32 character hexadecimal string, so we should see if it’s an MD5 hash. Note that we could also have found the flag by using strings.

35 Files - example3.jpg This file appears to be a blob of binary data.
Let’s assume the filename is correct and this is supposed to be a jpg. Most likely it was encrypted using an XOR operation. We need to discover the key if we are going to decrypt it, so we need to find some known value in the jpg file from which to deduce the key. Fortunately we know the magic number’s location and value. If we XOR the bytes corresponding to the magic number with the jpg magic value 4A, we recover the key. Now that we have the key of 221B, we can write a python script to decrypt the entire file.
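The known-plaintext step can be sketched as follows. This sketch makes several assumptions that are not in the slide: the known bytes are the FF D8 FF E0 sequence at offset 0 of a JFIF file, the key is two bytes long (matching the 221B key mentioned above), and the key repeats across the whole file. The function names are made up for illustration.

```python
from itertools import cycle

# Assumed known plaintext: the first four bytes of a typical JFIF file.
JPEG_HEADER = bytes.fromhex("FFD8FFE0")


def recover_key(ciphertext: bytes, known_plaintext: bytes, key_len: int) -> bytes:
    """XOR the start of the ciphertext with known plaintext to expose the key."""
    return bytes(c ^ p for c, p in zip(ciphertext[:key_len], known_plaintext))


def xor_decrypt(ciphertext: bytes, key: bytes) -> bytes:
    """Undo a repeating-key XOR by applying the same key again."""
    return bytes(c ^ k for c, k in zip(ciphertext, cycle(key)))


if __name__ == "__main__":
    # Build a toy "encrypted jpg" in memory so the sketch runs without a file.
    secret_key = bytes.fromhex("221B")
    plaintext = JPEG_HEADER + b"\x00\x10JFIF\x00 toy image data"
    ciphertext = xor_decrypt(plaintext, secret_key)

    key = recover_key(ciphertext, JPEG_HEADER, key_len=2)
    print(key.hex().upper())
    print(xor_decrypt(ciphertext, key)[:10])
```

If the key length is unknown, one option is to try increasing lengths and check whether the rest of the known header decrypts correctly.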

36 Files - example4.jpg At first this seems to be a copy of example2, but there is a second flag here. strings doesn’t find a second one, but remember that there are different encodings. After trying a few different patterns we mostly achieve success with strings -e b example4.jpg. However the first f in flag is missing. We can confirm by looking at the hexdump using xxd or ghex.
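A rough Python equivalent of searching for 16-bit big-endian strings is sketched below: it looks for runs of a zero byte followed by a printable ASCII byte. This approximates what strings -e b reports but is not a reimplementation of it, and strings_16be is a made-up name.

```python
import re


def strings_16be(data: bytes, min_len: int = 4) -> list:
    """Find runs of at least min_len printable ASCII chars stored as 16-bit big-endian."""
    pattern = re.compile(rb"(?:\x00[\x20-\x7e]){%d,}" % min_len)
    return [match.group().decode("utf-16-be") for match in pattern.finditer(data)]


if __name__ == "__main__":
    big_endian = b"\xff\xd8" + "flag{x}".encode("utf-16-be")
    little_endian = "flag{x}".encode("utf-16-le")
    print(strings_16be(big_endian))
    print(strings_16be(little_endian))
```

The second call shows one plausible reason for a missing first letter: if the text is really stored little-endian, a big-endian scan pairs each zero byte with the following character and drops the first one. Whether that is what happened in example4.jpg is a guess; the hexdump settles it.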

37 Files - example5.txt Here we have two text files, example5.txt and example5copy.txt. Are they really the same? We check to see if there are differences by using the diff command. Here we’re lucky and diff immediately shows us the flag.
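The same comparison can be sketched with Python's difflib, which is handy when the difference needs further processing (for example, decoding the changed line). The sample texts are invented for the example.

```python
import difflib


def changed_lines(original: str, modified: str) -> list:
    """Return only the lines that were removed ('- ') or added ('+ ')."""
    diff = difflib.ndiff(original.splitlines(), modified.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    first = "alpha\nbravo\ncharlie"
    second = "alpha\nflag{hidden_in_the_copy}\ncharlie"
    for line in changed_lines(first, second):
        print(line)
```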

38 Files - example6.docx All MS Office file formats are really zip archives. Note that when we run xxd we get the same magic number as a zip archive. Bad guys like hiding other files inside of the archive and rezipping it. We may need to rename it as a .zip before extracting it. Looking in .../example6.docx_FILES/word/media we find the second image. Finally note that strings would have recovered this flag because the hidden image was not compressed. Normally you will not be so fortunate.
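Because the container is a zip archive, Python's zipfile module can list its contents without renaming the file. The sketch below builds a toy archive in memory so that it runs on its own; the member names are invented, with word/media/ chosen to match the folder mentioned above.

```python
import io
import zipfile


def list_media(docx_bytes: bytes) -> list:
    """List the archive members stored under word/media/."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        return [name for name in archive.namelist() if name.startswith("word/media/")]


def build_toy_docx() -> bytes:
    """Create a small zip in memory that mimics a .docx layout."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", "<w:document/>")
        archive.writestr("word/media/image1.png", b"visible image bytes")
        archive.writestr("word/media/image2.jpg", b"hidden image bytes")
    return buffer.getvalue()


if __name__ == "__main__":
    print(list_media(build_toy_docx()))
```

For a real file, read its bytes with open(path, "rb") and pass them in, or hand the path directly to zipfile.ZipFile.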

39 File Systems & Deleted Files
"Double your drive space — delete Windows!"

40 File Systems - Disk Structure
Terms
Physical Disk - the physical storage disk, which can be further divided into logical disks
Logical Disk (Drive) - virtual space that is allocated within a drive (Ex. C:\ drive, D:\ drive)
Block - a unit storing contiguous data (usually 512 bytes)
Logical Block - represents contiguous logical data but may be physically stored across drives
Volume - set of addressable sectors used for storage. Can span multiple devices, unlike a partition
Partition - collection of consecutive sectors on a device
Partition Table - identifies the start and end of each partition. Can be falsified (the “sigfind” tool can be used to recover missing and deleted partitions). “TestDisk” can be used to recover partitions in case of a corrupted disk or intentional tampering
Partition Scheme - defines how information is structured on a partition and how partitions are structured. The two major schemes are MBR (master boot record) and GPT (GUID partition table)

41 File Systems - Partitioning and Disk Layout
Linux disk image creation and management
dd - create randomized blobs of data or a blob of data full of zeroes
parted - create partition schemes using mklabel; create partitions using mkpart
mkfs.* - create a file system
mount -o loop,offset=<some_offset>, or losetup then mount
man - read the man pages, it might be faster than Google
Useful files: /dev/zero, /dev/random, /dev/null (for when you don’t want to see your output ;)

42 File Systems - Volume to Disk Comparison

43 File Systems - More Terminology
File System - used to control how data is stored and retrieved. Without a file system we would just have a large body of data with no way to tell where one piece of information ends and the next begins. It is an abstraction of the underlying data. MANY different file systems exist, and they are a huge research area within computer science. File systems can exist on many different types of storage devices and can even be stored in RAM. Purposes (similar to all abstraction concepts): Portability. Security: the user is not relied on, or even allowed, to access the drive directly. Convenience. File - an ordered collection of data blocks. Additional metadata is stored about a file by the file system. Directory - a data structure containing an organized collection of files and sub-directories.

44 File Systems - File System Abstraction Model
Disk - physical storage device. Usually beyond the scope of an average forensics analyst. More possible with SSDs Volume - created using all or part of one or more disks File System - A file system is laid down on a volume and describes the layout of files and their associated metadata. Items in the file system layer include metadata specific to and solely used for the file system’s operation Data Unit - the smallest available freestanding unit of data storage available in a given file system. Metadata - data about data File Name - where humans operate. Artifacts available in this layer vary depending on the file system.

45 File Systems - FAT FAT - File Allocation Table
FAT8, FAT12, FAT16, FAT32, exFAT, vFAT (extension to regular FAT*). Commonly used by removable media (originally for floppy disks). Supported by many operating systems. Each file is stored as a linked list of clusters, each holding part of the file. The titular File Allocation Table holds the links: the entry for a cluster points to the next cluster of the file, while the file's directory entry records its first cluster.

46 File Systems - FAT Structure
Boot section More reserved sectors (optional) FAT #1 FAT #2 Root directory (FAT 12/16 only) Data region (remainder of disk)

47 File Systems - FAT Structure
Boot Sector - Execution is passed from the MBR to the Boot Record contained here. Contains executable code as well as the OEM identifier, number of FATs, media descriptor (type of storage device), and information about the OS to be booted. FAT #1 and #2 - Keep track of the allocation status of clusters (allocated, unallocated, end of file, bad sector). Usually FAT #2 is a mirror of FAT #1 to provide redundancy, but this can be turned off. Root Directory - Contains an entry for each file and directory stored in the file system. Includes information like the file name, starting cluster number, and file size. On FAT12/16 the root directory has a fixed size, typically 512 entries on a hard disk. It usually sits right after the FATs, but with FAT32 it can be anywhere on the partition. Data Region - Where the files are actually stored.

48 File Systems - Fat Structure
Each type of FAT file system has a different size for each entry. Each entry contains one of the following:
the cluster number of the next cluster in a chain
a special end of cluster-chain (EOC) entry that indicates the end of a chain
a special entry to mark a bad cluster
a zero to note that the cluster is unused
Fragmentation Issues and File Slack Space
Directory entries (file name, starting cluster): File1.txt 0002, File2.txt 0005, File3.TXT 0007
FAT Address: 1 2 3 4 5 6 7 8
FAT Value: 0003 0004 FFFF 0006 0008
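Following a cluster chain can be sketched as a walk through a lookup table. The table below is entirely hypothetical (loosely shaped like the slide's example, with invented values where the slide's table is incomplete), and a single 0xFFFF end-of-chain marker is a simplification of real FAT16, where a range of values marks the end.

```python
END_OF_CHAIN = 0xFFFF  # simplified: real FAT16 treats 0xFFF8-0xFFFF as end of chain

# Hypothetical FAT: cluster number -> next cluster in the chain.
EXAMPLE_FAT = {
    2: 3, 3: 4, 4: END_OF_CHAIN,   # a file stored contiguously
    5: 6, 6: 8, 8: END_OF_CHAIN,   # a fragmented file that skips cluster 7
    7: END_OF_CHAIN,               # a single-cluster file
}


def follow_chain(fat: dict, start_cluster: int) -> list:
    """Return the list of clusters that make up a file, in order."""
    chain = []
    seen = set()
    cluster = start_cluster
    while cluster != END_OF_CHAIN:
        if cluster in seen or cluster not in fat:
            raise ValueError(f"corrupt chain at cluster {cluster}")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]
    return chain


if __name__ == "__main__":
    for name, start in [("File1.txt", 2), ("File2.txt", 5), ("File3.TXT", 7)]:
        print(name, follow_chain(EXAMPLE_FAT, start))
```

The loop check matters in CTF images, where a tampered FAT can point a chain back at itself.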

49 File Systems - NTFS Created by Microsoft to provide many features not available in a FAT file system: journaling, security features, larger volume support, and other performance and space saving enhancements. Supported by Windows NT 4.0, Windows 2000, and above. Used for hard disks. The default file system your Windows OS resides in. NTFS stands for New Technology File System.

50 File Systems - NTFS Structure
Clusters on an NTFS volume are numbered sequentially from the beginning of the partition into logical cluster numbers. NTFS stores all objects in the file system using a record called the Master File Table (MFT), similar in structure to a database.

51 File Systems - NTFS Structure
NTFS Boot Sector - Contains the BIOS parameter block that stores information about the layout of the volume and the file system structures, as well as the boot code (MBR). Master File Table - Contains the information necessary to retrieve files from the NTFS partition, such as the attributes of a file. File System Data - Stores data that is not contained within the Master File Table. Master File Table Copy - Includes copies of the records essential for the recovery of the file system if there is a problem with the original copy.
Layout, in order: NTFS Boot Sector, Master File Table, File System Data, Master File Table (Copy)

52 File Systems - NTFS MFT The Master File Table (MFT) contains entries that describe all system files, user files, and directories. The MFT even contains an entry (#0) that describes the MFT itself, which is how we determine its current size. Other system files in the MFT include the Root Directory (#5), the cluster allocation map, Security Descriptors, and the journal.

53 File Systems - NTFS MFT Entries
Each MFT Entry is given a number. The user files and directories start at MFT #25 Contains attribute information MAC times File Name $DATA stream (File content) Index Alloc and Index root which contain directory contents stored in a B-Tree Each type of attribute is given a numerical value and more than one instance of a type can exist for a file. The "id" value for each attribute allows one to specify an instance. A given file can have more than one "$Data" attribute. To get a mapping of attribute type values to name, use the 'fsstat' command. It displays the contents of the $AttrDef system file.

54 File Systems - NTFS MFT Entries
Each attribute has a header and a value and an attribute is either resident or non-resident. A resident attribute has both the header and the content value stored in the MFT entry. This only works for attributes with a small value (the file name for example). For larger attributes, the header is stored in the MFT entry and the content value is stored in Clusters in the data area. A Cluster in NTFS is the same as FAT, it is a consecutive group of sectors. If a file has too many different attributes, an "Attribute List" is used that stores the other attribute headers in additional MFT entries.

55 File Systems - NTFS MFT Entries
Standard Information File or Directory Name Data or Index Unused Space

56 Deleted Files and Deleted File Recovery
Deleted Files (most recoverable) - files that have been unlinked: the filename entry is no longer presented when a user views a directory, and the filename, metadata structure, and data units are marked as “free”. However, the connection between these layers is still intact when forensic techniques are applied to the file system. Recovery consists of recording the relevant file name and metadata structures and then extracting the data units. Orphaned Files - similar to deleted files except the link between the file name and metadata structure is no longer accurate. Recovery of the data is still possible but there is no direct correlation from the file name to recovered data.

57 Deleted Files and Deleted File Recovery
Unallocated Files - files that have had their once-allocated filename entry and associated metadata structure become unlinked and/or reused. In this case the only means of recovery is carving the not-yet-reused data from the unallocated space of the volume. Overwritten Files - files that have had one or more of their data units reallocated to another file. Full recovery is no longer possible, but partial recovery may be, depending on the extent of overwriting. Files with file names and/or metadata structures intact that have had some or all data units overwritten are sometimes referred to as Deleted/Overwritten or Deleted/Reallocated.

58 Deleted File Recovery Tools
"Back up my hard disk? I can't find the reverse switch!"

59 Deleted File Recovery - The Sleuth Kit
A collection of command line tools that allows you to analyze disk images and recover files from them. Compatible with most file systems. Makes it easy to recover deleted files.

60 Deleted File Recovery - SleuthKit
Tools
Prefixes:
“mm-”: tools that operate on volumes (aka “media management”)
“fs-”: tools that operate on file system structures
“blk-”: tools that operate at the data unit (or “block”) layer
“i-”: tools that operate at the metadata (or “inode”) layer
“f-”: tools that operate at the file name layer
“j-”: tools that operate against file system journals
“img-”: tools that operate against image files
Suffixes:
“-stat”: displays general information about the queried item
“-ls”: lists the contents of the queried layer
“-cat”: dumps/extracts the content of the queried layer
A prefix and a suffix combine into a tool name, for example mmls, fsstat, fls, and icat.

61 Deleted File Recovery - SleuthKit
More Tools jpg_extract - extracts jpeg tsk_recover - Export files from an image into a local directory.

62 Deleted File Recovery - Mounting Images
Use mmls to retrieve the partition table from the disk image. We need to associate our loop device with a partition, not a disk image. The loopback device requires the offset in bytes, not sectors. mmls will give us the offset of the partition containing the file system and the sector size. Mount the partition:
mount -o loop,offset=$((<offset>*<sectorsize>)),gid=$USER,uid=$USER <disk.img> ~/mnt
To see which loop device mount used, run losetup --list. You now have access to the file system through your normal OS tools.
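The sector-to-byte conversion is the step that most often goes wrong, so here is a small sketch of the arithmetic. The helper names and the default mount point are illustrative, and the 512-byte default is an assumption to be replaced with the sector size that mmls reports.

```python
def partition_byte_offset(start_sector: int, sector_size: int = 512) -> int:
    """Convert a partition's starting sector (from mmls) to a byte offset."""
    return start_sector * sector_size


def mount_command(image: str, start_sector: int, sector_size: int = 512,
                  mount_point: str = "~/mnt") -> str:
    """Build a loop-mount command line for the partition (not executed here)."""
    offset = partition_byte_offset(start_sector, sector_size)
    return f"mount -o loop,offset={offset} {image} {mount_point}"


if __name__ == "__main__":
    # Example values: a partition starting at sector 2048 with 512-byte sectors.
    print(partition_byte_offset(2048))
    print(mount_command("disk.img", 2048))
```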

63 Deleted File Recovery - tsk_recover
Use mmls to retrieve the partition table from the disk image. Run tsk_recover -o <offset> <disk.image> ~/recovered. This recovers all deleted files from disk.image at the specified offset and saves them to the ~/recovered directory.
-a #recover allocated files only
-e #recover all files, allocated and unallocated

64 DFR- Disk versus Partition Images
Previous two slides assumed you received an image of the entire disk. But mmls will fail if you only receive an image of a single partition. For a single partition we use the fs- layer tools and the offset is usually zero.

65 Deleted File Recovery - The manual way
Using fls & icat
Once we have determined where the file system resides, we can use these tools to recover data.
fls -p -l -r -o119 disk.img
-l #list extended information such as creation time/modified time
-r #recursively go through directories and list contents
-o #offset of the beginning of the file system (in sectors)
-d #display deleted entries only
-p #display the full path of each entry
icat -o119 disk.img 7 > <output.file> #dump the contents of inode 7
-o #offset of the file system (in sectors)

66 Errata Users /nm./: collective term for those who use computers. Users are divided into three types: novice, intermediate and expert. Novice Users: people who are afraid that simply pressing a key might break their computer. Intermediate Users: people who don't know how to fix their computer after they've just pressed a key that broke it. Expert Users: people who break other people's computers.

67 Next Time File Carving OSI Stack, Packet Captures, and Network Forensics Memory, Volatility, and Memory Forensics

68 Homework Advice and Hints
Remember, you need to give a writeup that shows how you got the flag. Don’t forget to include the flag Flags will be in the format flag{flag_contents_here}. Hey, there are some letters with weird formatting in these slides, what’s up with that? Start early, these problems often require time to think. You will have to use at least one tool I didn’t tell you about. Forensic problem authors love to find obscure things to test you on. We’re kinda evil trolls like that. Try and precisely define your problem when google searching. Make a backup copy before you try anything! Expect things to be buggy and have to find workarounds or alternative solutions Believe in yourself!

69 Questions?

