Forensic Analysis of Database Tampering
Kyriacos Pavlou and Richard T. Snodgrass
Computer Science Department, The University of Arizona
Introduction
The problem: how to systematically perform forensic analysis on a compromised database.
Recent federal laws (HIPAA, the Sarbanes-Oxley Act, etc.) and incidents of corporate collusion mandate audit log security.
Snodgrass et al. [VLDB04] showed how to detect database tampering.
Approach: hash the data manipulated by transactions with a cryptographically strong hash function, notarize the hash values, and periodically validate.
Forensic analysis aims to ascertain:
–When the intrusion transpired
–What data was altered
–Who the intruder is
–Why the intrusion transpired
Tamper Detection
Several related ideas that allow tamper detection:
1. The DBMS can maintain an audit log in the background, as an append-only transaction-time table.
2. The data modified can be cryptographically hashed to produce a secure one-way hash of the transaction.
3. The hash value is notarized with an external notarization service. The hash value cannot change.
4. Implementation optimizations:
–opportunistic hashing
–transaction ordering list
–linked hashing
The latest hash value is a hash of all the changes made to the database since database creation.
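The linked hashing idea on this slide can be illustrated with a minimal sketch. It assumes SHA-256 as the hash function and an all-zero seed; the helper names (`link_hash`, `chain_hash`, `validate`) and the sample log entries are hypothetical and not taken from the presentation, and the external notarization service is reduced to a stored value.

```python
import hashlib

def link_hash(prev_hash: bytes, txn_data: bytes) -> bytes:
    # Assumption: SHA-256 over the previous link followed by the new data.
    return hashlib.sha256(prev_hash + txn_data).digest()

def chain_hash(transactions, seed: bytes = b"\x00" * 32) -> bytes:
    # The final value depends on every transaction since "database creation".
    h = seed
    for txn in transactions:
        h = link_hash(h, txn)
    return h

def validate(transactions, notarized_hash: bytes) -> bool:
    # Rehash the audit log and compare with the notarized value:
    # the validation result is a single bit.
    return chain_hash(transactions) == notarized_hash

log = [b"INSERT acct=1 bal=100", b"UPDATE acct=1 bal=90", b"INSERT acct=2 bal=50"]
notarized = chain_hash(log)          # stands in for the value held by the notary

tampered = list(log)
tampered[1] = b"UPDATE acct=1 bal=9000"

print(validate(log, notarized))       # True
print(validate(tampered, notarized))  # False
```

Because each link folds in the previous one, changing any earlier transaction changes the final hash, which is why a single notarized value can cover the whole log.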
Tamper Detection
Two phases: normal processing and validation. The validation result is a single bit.
[Figure: data flow for the two phases. Labels: transactions, hashing, rehash, hash value, notary ID, result.]
Definitions
Corruption Event (CE): any event that corrupts data and compromises the database (intrusion, human intervention, bug).
Corruption time (t_c): actual time instant at which a CE occurred.
Validation Event (VE): validation of the audit log by the Notarization Service (NS).
Time of VE (t_v): time instant at which a VE occurred.
Validation Failure vs. Validation Success: the NS's answer to a query for a particular hash value, denoting tampering or lack thereof, respectively.
Notarization Event (NE): the notarization of a document by the NS.
Time of NE (t_n): time instant at which an NE occurred.
Definitions (cont'd)
Forensic analysis involves the following:
Temporal detection: determination of t_c.
Spatial detection: determination of "where," i.e., the location in the database of the data affected in a CE. This data is termed the corruption locus data (l_c). In fact, we try to ascertain the locus time (t_l), the time instant at which l_c was originally stored (transaction commit time).
Note that a CE can have many l_c's, termed a multi-locus CE, or only one l_c, termed a single-locus CE.
The Corruption Diagram
[Figure: corruption diagram. Axes: "When" (actual/clock time) and "Where" (commit time). Notarization events NE_0 to NE_6 and validation events VE_1 = TRUE and VE_2 = TRUE are linked to the diagram; a corruption event (CE) is marked. I_N: notarization interval; I_V: validation interval.]
Forensic Analysis
If a corruption is detected, the forensic analyzer springs into action. The analyzer tries to ascertain a corruption region: the bounds on the uncertainty of the "where" and "when" of the corruption.
Notarization and Validation Intervals
Non-aligned validation just delays detection of tampering.
Validation factor V: I_V = V · I_N
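As a small worked example of the relation I_V = V · I_N, the sketch below lists notarization and validation days for the aligned case. The function name `schedule`, the choice of day 0 for the first notarization, and the sample values (I_N = 2 days, V = 2) are assumptions made for illustration only.

```python
def schedule(i_n: int, v: int, horizon: int):
    """Return (I_V, notarization days, validation days) up to `horizon` days.

    Assumes the first notarization happens on day 0 and that validations
    are aligned with notarizations (every V-th notarization interval).
    """
    i_v = v * i_n                                   # I_V = V * I_N
    ne_days = list(range(0, horizon + 1, i_n))      # notarization events
    ve_days = list(range(i_v, horizon + 1, i_v))    # validation events
    return i_v, ne_days, ve_days

i_v, ne_days, ve_days = schedule(i_n=2, v=2, horizon=12)
print(i_v)      # 4
print(ne_days)  # [0, 2, 4, 6, 8, 10, 12]
print(ve_days)  # [4, 8, 12]
```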
Analyzing Timestamp Corruption
So far we have considered data-only CEs. We now examine the case where the timestamps of the tuples are changed.
[Table: CE taxonomy with columns Retroactive and Introactive and rows Data-only, Backdating, and Postdating; one cell is marked ×.]
Monochromatic Algorithm
[Figure: corruption diagram with NE_0 to NE_6, VE_1 = TRUE, and VE_2 = FALSE. Forensic analysis begins at the failed validation. t_c: time of corruption; t_l: place of corruption (commit time). Corruption region: captures the uncertainty as to the position of the CE.]
Monochromatic Algorithm
Central insight: data can be rehashed by the validator and checked.
Corruption region bounds: I_V × I_N
–Area is solely dependent on the two intervals.
Cannot handle CEs involving timestamp corruption.
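One way to picture how rehashing narrows the "where" to a single notarization interval is a binary search over the notarization events. The sketch below is an illustration under stated assumptions, not the authors' algorithm as published: it assumes a single cumulative hash chain, so that once a prefix of the log fails to match its notarized value every longer prefix fails too, and it assumes the last prefix is known to fail. The names `oldest_failed_notarization` and `prefix_matches` are hypothetical.

```python
from typing import Callable, Tuple

def oldest_failed_notarization(
    num_ne: int, i_n: int, prefix_matches: Callable[[int], bool]
) -> Tuple[int, Tuple[int, int]]:
    """Find the first notarization event whose rehashed prefix fails.

    prefix_matches(k) is assumed to rehash the log up to NE_k and compare
    it with the value notarized at NE_k (True means no tampering seen).
    Returns (k, (lower, upper)), where t_l lies in the interval (lower, upper].
    """
    lo, hi = 0, num_ne - 1        # invariant: NE_hi fails, answer in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix_matches(mid):
            lo = mid + 1          # everything up to NE_mid is intact
        else:
            hi = mid              # the first failure is at or before NE_mid
    return lo, ((lo - 1) * i_n, lo * i_n)

# Hypothetical scenario: I_N = 2 days, NEs on days 0, 2, ..., 12,
# and the corrupted tuple was committed on day 5.
I_N, LOCUS_DAY = 2, 5
k, bounds = oldest_failed_notarization(7, I_N, lambda k: k * I_N < LOCUS_DAY)
print(k, bounds)  # 3 (4, 6)
```

The returned interval is I_N wide, matching the "where" side of the I_V × I_N bound; the "when" side is bounded by the last successful and the first failed validation.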
The RGB Forensic Algorithm
[Figure: corruption diagram for a postdating CE, with NE_0 to NE_8, VE_1 to VE_3 = TRUE, and VE_4 = FALSE, where forensic analysis begins. I_V = 4 days, I_N = 2 days. Red (R), green (G), and blue (B) partial hash chains are shown, with successive validations notarizing Red, then Blue and Green, then Red. Marked: t_c, t_l, and t_p (postdating time).]
The RGB Forensic Algorithm
Introduction of RGB partial hash chains:
–Allows the bounding of both t_l and t_p
–Incurs extra NS cost
Each of the two corruption regions is bounded by I_V × I_N.
We would like to reduce the area of the corruption regions.
The Polychromatic Algorithm
[Figure: corruption diagram for a backdating CE, with NE_0 to NE_8, VE_1 to VE_3 = TRUE, and VE_4 = FALSE, where forensic analysis begins. I_V = 4 days, I_N = 2 days, desired uncertainty = 1 day. Successive validations notarize 2 Reds, then 2 Blues and 1 Green, then 2 Reds. Marked: t_c, t_l, and t_b (backdating time).]
The Polychromatic Algorithm
Introduction of extra partial hash chains:
–Reduces uncertainty of corruption region
–Incurs additional NS cost
Uncertainty can be arbitrarily shrunk via a logarithmic number of red and blue hash chains. Hence, the width is no longer dependent on I_V and I_N.
Forensic Strength
Components:
–Work of forensic analysis
–Region-area of CE
–Width of postdating / backdating uncertainty
Inverse Forensic Strength:
IFS(D, I_N, V) = (NumNotarizes(D, I_N, V) + ForensicAnalysis(D, I_N, V)) · RegionArea(I_N, V) · UncertaintyWidth(D, I_N)
where V = I_V / I_N is the validation factor and D is the number of days before the first validation failure.
Monochromatic: O(V · D^2 · I_N)
RGB: O(V · D · I_N^2)
Polychromatic: O((V + lg I_N) · D)
We assume that D >> I_N.
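To get a feel for how the three bounds grow, the sketch below evaluates only the expressions inside the O(·) terms, with all constant factors dropped. The function name `ifs_terms` and the sample values (D = 365, I_N = 2, V = 2) are assumptions chosen for illustration; the outputs are unitless growth terms, not measured costs.

```python
import math

def ifs_terms(d: float, i_n: float, v: float) -> dict:
    """Evaluate the asymptotic IFS terms from the slide (constants ignored)."""
    return {
        "monochromatic": v * d**2 * i_n,            # O(V * D^2 * I_N)
        "rgb": v * d * i_n**2,                      # O(V * D * I_N^2)
        "polychromatic": (v + math.log2(i_n)) * d,  # O((V + lg I_N) * D)
    }

terms = ifs_terms(d=365, i_n=2, v=2)
for name, value in terms.items():
    print(f"{name}: {value}")
# monochromatic: 532900
# rgb: 2920
# polychromatic: 1095.0
```

Since IFS is an inverse measure, a smaller term corresponds to greater forensic strength; with D much larger than I_N, the D^2 factor dominates the monochromatic term.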
Future Work
Develop a stronger lower bound for this problem.
Accommodate multi-locus and complex CEs.
Differentiate postdating and backdating CEs.
Implement forensic analysis in the validator.
Consider the interaction between the transaction-time storage manager and the underlying WORM storage.
Summary
We have presented a means of performing forensic analysis.
We have introduced a graphical representation to visualize CEs, termed the corruption diagram.
We have designed three forensic algorithms:
–Monochromatic
–RGB
–Polychromatic
Acknowledgements NSF grants IIS , IIS and EIA and a grant from Microsoft provided partial support for this work.