A UNIVERSAL CARVING APPROACH FOR DATABASE FORENSIC ANALYSIS James Wagner, Alexander Rasin, Jonathan Grier
Background (File System Forensics vs. Database Forensics) Reconstructing Data and DICE Reconstructing Deleted Data Experiments Conclusion and Future Work Overview
File System Forensics Header File Fragment 1 File Fragment 2 Footer gap
Database Forensics Challenges Database specific data models No file headers Record reconstruction Values are encoded with metadata Forensics is Reconstruction, not Recovery No database intervention Backups are not available Corrupted disk, deleted files, read-only
Table Customer Table Supplier Index C_Age Materialized View C_Account Database Structures IDNameCity IDName Steve Sally Sam Susy San Diego Springfield St. Louis Seattle Craig Claire Chris Carol Account $2000 $5000 $1000 $4000 Name Claire Carol Account $5000 $4000 AgePointer Age SELECT Name, Account FROM Customer WHERE Account > 3000;
Page Header Row Directory Row Data Other Structures Row 4 : 4, Mark, Boston Row 3 : 3, Mary, Dallas Row 2 : 2, Jane, Chicago Row 1 : 1, John, Boston Row 1 Address Row 2 Address Row 3 Address Row 4 Address Table Data Customer Free space, etc. 20% 80% Database Storage: Pages Minimum Unit of Storage Typically 4KB or 8KB
Parameter Detector Database Management System Content Parser Iteratively load synthetic data Capture DB storage DB config. files Generate DB config. file DBMS disk image DBMS RAM snapshot Data pages (tables & indexes) Deleted data, etc. Volatile user artifacts DICE: Database Image Content Explorer
Parameter Detector Database Management System Content Parser Iteratively load synthetic data Capture DB storage DB config. files Generate DB config. file DBMS disk image DBMS RAM snapshot Data pages (tables & indexes) Deleted data, etc. Volatile user artifacts DICE: Parameter Collector
Parameter Detector Database Management System Content Parser Iteratively load synthetic data Capture DB storage DB config. files Generate DB config. file DBMS disk image DBMS RAM snapshot Data pages (tables & indexes) Deleted data, etc. Volatile user artifacts DICE: Content Parser
DICE and Python Adapters for every database Portable across operating systems Simple database connections Constantly changing and creating page parameters
Parameter Collection: Row Data Example: MySQL primary key storage
Parameter Collection: Datatypes How do you automate it? Example: Integers PostgreSQL(4 bytes): (256 0 * B 1 ) + (256 1 * B 2 ) + (256 2 * B 3 ) + (256 3 * B 4 ) 1 = 1, 0, 0, = 0, 1, 0, = 1, 1, 0, 0 Oracle: Uses zero compression. 4 = 3, 192, 5 40 = 3, 192, = 3, 193, = 4, 193, 5, 41 Datatype Detection. Example: PostgreSQL Integer of String? ASCII Decimal3 J ay
5 | 10John Smith | 11 | 5Texas 1&3&5&15 || 5 | 11 | John Smith | Texas Generalizing Parameters
Parsing Example: PostgreSQL ParameterValue Raw Data Delimiter 2, 9, 24 Raw Data Position 4 Number Storage4 String X String Y 2 3 Number MethodPSQL ASCIIDecimal '39 C67 u117 s115 t116 ASCIIDecimal o111 m109 e101 r114 # C1 = 1579(Integer) C2 = Customer# (String) Length = (39 – 3)/2 = 18
DBMSes Different parameters No Linux Support Hard to get some older DB versions
Deleted Data No longer recoverable A delete only marks data, not overwrite. Unallocated storage Three flavors of deleted data DICE can reconstruct: 1. Rows 2. Pages 3. Values
Deleted Data: Rows 1. MySQL or Oracle 2. PostgreSQL 3. SQLite *DB2 and SQL Server mark a deletion in the row directory
Deleted Data: Pages
Deleted Data: Values
Experiment 1: Recovering Deleted Rows Introduction Lifetime of deleted rows Representative DBs Oracle - Percent page utilization (39%) SQL Server – Overwrite if there’s space Setup 2 tables: 20K random sized rows 85 rows/page, 236 pages Random & contiguous deletes Inserted rows were random size OracleSQLServer ActionT1(Rand)T2(Cont) Delete 1K Rows Insert 1K Rows T1(Rand)T2(Cont) Step 1 Step 2 Step 3
Experiment 1: Example
Experiment 2: Memory Monitoring (Step 1 of 3) Part Synthetic Tables Lineorder EmptyIndexes CustomerOther 1 pixel = 1 page 50K pages total Snapshot 1Snapshot 2
Experiment 2: Memory Monitoring (Step 2 of 3) Part Synthetic Tables Lineorder EmptyIndexes CustomerOther 1 pixel = 1 page 50K pages total Snapshot 2Snapshot 3
Experiment 2: Memory Monitoring (Step 3 of 3) Part Synthetic Tables Lineorder EmptyIndexes CustomerOther 1 pixel = 1 page 50K pages total Snapshot 3Snapshot 4
Experiment 2: Memory Monitoring (Summary)
Conclusion DICE generalizes storage at the page level Reconstruct data (corrupted, deleted, RAM) Future Work Detect log file modification with DICE Audit the DBA Disk - Deleted, updated and inserted rows RAM - Read-only queries
Questions