1. Forensic Analysis of Database Tampering
Kyriacos Pavlou and Richard T. Snodgrass
Computer Science Department
The University of Arizona
2. Introduction
The problem: how to systematically perform forensic analysis on a compromised database.
Recent federal laws (HIPAA, the Sarbanes-Oxley Act, etc.) and incidents of corporate collusion call for audit log security.
Snodgrass et al. [VLDB04] showed how to detect database tampering.
Approach: hash the data manipulated by transactions using a cryptographically strong hash function, notarize the hash values, and periodically validate.
Forensic analysis aims to ascertain:
- When the intrusion transpired
- What data was altered
- Who the intruder is
- Why the intrusion transpired
3. Outline
- Tamper Detection
- Forensic Analysis
  - The corruption diagram
  - Types of corruption events
- Forensic Algorithms
  - Three algorithms
  - Forensic strength
- Future Work
4. Tamper Detection
Several related ideas that allow tamper detection:
- The DBMS can maintain an audit log in the background
  - Transaction-time table
  - Append-only
- Modified data can be cryptographically hashed to produce a secure one-way hash of the transaction.
- Notarize the hash value with an external notarization service. The hash value cannot change.
Implementation optimizations:
- opportunistic hashing
- transaction ordering list
- linked hashing
The latest hash value is a hash of all the changes made to the database since database creation.
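The idea that the latest hash value covers every change since database creation can be illustrated with a minimal linked-hashing sketch. This is an illustration under assumptions, not the authors' implementation: SHA-256, the all-zero seed, and the names `link_hash` and `chain_hash` are choices made here for the example.

```python
import hashlib

def link_hash(prev_hash: bytes, transaction: bytes) -> bytes:
    """Hash one transaction's data together with the previous chain value."""
    # prev_hash is always 32 bytes, so the concatenation is unambiguous.
    return hashlib.sha256(prev_hash + transaction).digest()

def chain_hash(transactions, seed: bytes = b"\x00" * 32) -> bytes:
    """Return the latest hash value, which depends on every transaction so far."""
    current = seed  # assumed starting value for an empty database
    for tx in transactions:
        current = link_hash(current, tx)
    return current

log = [b"INSERT 1", b"UPDATE 1", b"INSERT 2"]
original = chain_hash(log)

# Altering any earlier transaction changes the latest hash value.
tampered = chain_hash([b"INSERT 1", b"UPDATE 9", b"INSERT 2"])
print(original != tampered)
```

Because each link folds in the previous value, only the latest hash needs to be sent to the notarization service.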
5. Tamper Detection
Two phases:
- Normal Processing
- Validation
The validation result is a single bit.
[Figure: data flow for the two phases. Labels: transactions, hashing, hash value, notary ID, rehash, result.]
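The two phases can be sketched as follows. The `NotarizationService` class is a hypothetical in-memory stand-in for an external notarization service, and `hash_log` is an assumed helper; neither comes from the slides.

```python
import hashlib
from itertools import count

class NotarizationService:
    """Hypothetical in-memory stand-in for an external notarization service."""

    def __init__(self):
        self._records = {}
        self._ids = count(1)

    def notarize(self, hash_value: bytes) -> int:
        """Normal processing: store the hash value and return a notary ID."""
        notary_id = next(self._ids)
        self._records[notary_id] = hash_value
        return notary_id

    def validate(self, notary_id: int, hash_value: bytes) -> bool:
        """Validation: the answer is a single bit."""
        return self._records.get(notary_id) == hash_value

def hash_log(transactions) -> bytes:
    """Hash all transactions, length-prefixing each to keep boundaries unambiguous."""
    h = hashlib.sha256()
    for tx in transactions:
        h.update(len(tx).to_bytes(8, "big"))
        h.update(tx)
    return h.digest()

ns = NotarizationService()
log = [b"INSERT 1", b"UPDATE 1"]

# Phase 1, normal processing: hash the transactions and notarize the value.
notary_id = ns.notarize(hash_log(log))

# Phase 2, validation: rehash the stored data and ask the service to compare.
ok = ns.validate(notary_id, hash_log(log))
bad = ns.validate(notary_id, hash_log([b"INSERT 1", b"UPDATE 9"]))
print(ok, bad)
```

The validator never needs the original data from the service, only the stored hash value, which is why the result reduces to one bit.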
6. Definitions
- Corruption Event (CE): any event that corrupts data and compromises the database (intrusion, human intervention, bug).
- Corruption time (tc): actual time instant at which a CE occurred.
- Validation Event (VE): validation of the audit log by the Notarization Service (NS).
- Time of VE (tv): time instant at which a VE occurred.
- Validation Failure vs. Validation Success: the NS's answer to a query for a particular hash value. Denotes tampering or lack thereof, respectively.
- Notarization Event (NE): the notarization of a document by the NS.
- Time of NE (tn): time instant at which an NE occurred.
7. Definitions (cont'd)
Forensic analysis involves the following:
- Temporal detection: determination of tc.
- Spatial detection: determination of "where," i.e., the location in the database of the data affected in a CE.
  - This data is termed the corruption locus data (lc).
  - In fact, we try to ascertain the locus time (tl), the time instant at which lc was originally stored (transaction commit time).
Note that a CE can have many lc's, termed a multi-locus CE, or only one lc, termed a single-locus CE.
10. Forensic Analysis
If a corruption is detected, the forensic analyzer springs into action.
The analyzer tries to ascertain a corruption region: the bounds on the uncertainty of the "where" and "when" of the corruption.
11. Notarization and Validation Intervals
Non-aligned validation just delays detection of tampering.
Validation factor V: IV = V · IN
12. Analyzing Timestamp Corruption
So far we have considered data-only CEs. We now examine the case where the timestamps of the tuples are changed.
[Table: types of CE. Labels: Data-only, Backdating, Postdating; Retroactive, Introactive. One cell is marked ×.]
13. Monochromatic Algorithm
[Figure: corruption diagram with axes "When" and "Where". Labels: notarization events NE0 to NE6; VE1 = TRUE; VE2 = FALSE, after which forensic analysis begins; the CE, with its time of corruption (tc) and tl, the place of corruption (commit time). The corruption region captures the uncertainty as to the position of the CE.]
14. Monochromatic Algorithm
Central insight: data can be rehashed by the validator and checked.
Corruption region bounds: IV × IN
The area depends solely on the two intervals.
Cannot handle CEs involving timestamp corruption.
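A minimal sketch of the rehash-and-check idea for a data-only CE follows. It assumes one cumulative hash value is notarized per notarization interval and uses a binary search to find the first notarization whose rehash no longer matches; the function names, the use of SHA-256, the search strategy and the day numbering (interval 0 starts at day 0) are assumptions for illustration, not details confirmed by the slides.

```python
import hashlib

def cumulative_hashes(batches, seed=b"\x00" * 32):
    """Hash value after each notarization interval.

    batches[i] holds the transactions committed during interval i.
    """
    current = seed
    values = []
    for batch in batches:
        for tx in batch:
            current = hashlib.sha256(current + tx).digest()
        values.append(current)
    return values

def first_failing_notarization(notarized, current_batches):
    """Index of the first interval whose rehash differs from the notarized value.

    Returns len(notarized) if every rehash matches. Binary search is valid
    because a cumulative chain keeps failing once one interval is corrupted.
    """
    rehashed = cumulative_hashes(current_batches)
    lo, hi = 0, len(notarized)
    while lo < hi:
        mid = (lo + hi) // 2
        if rehashed[mid] == notarized[mid]:
            lo = mid + 1
        else:
            hi = mid
    return lo

I_N, I_V = 2, 4  # interval lengths in days, values chosen for the example

original = [[b"a"], [b"b"], [b"c"], [b"d"]]
notarized = cumulative_hashes(original)
tampered = [[b"a"], [b"b"], [b"X"], [b"d"]]

i = first_failing_notarization(notarized, tampered)
print(i)                         # 2
print((i * I_N, (i + 1) * I_N))  # (4, 6): tl lies between day 4 and day 6
print(I_V * I_N)                 # 8: area of the IV x IN corruption region
```

The "where" bound comes from the notarization interval (width IN); the "when" bound comes from the gap between the last successful and the first failed validation (width IV), which this sketch takes as given.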
15. The RGB Forensic Algorithm
[Figure: corruption diagram for a postdating CE, with axes "When" and "Where"; IV = 4 days, IN = 2 days. Labels: notarization events NE0 to NE8; VE1 = TRUE, VE2 = TRUE, VE3 = TRUE; VE4 = FALSE, after which forensic analysis begins; "Notarization of Red" (twice); "Notarization of Blue & Green"; the CE with tc, tl and tp (postdating time).]
16. The RGB Forensic Algorithm
Introduction of RGB partial hash chains:
- Allows the bounding of both tl and tp
- Incurs extra NS cost
Each of the two corruption regions is bounded by IV × IN.
We would like to reduce the area of the corruption regions.
17. The Polychromatic Algorithm
[Figure: corruption diagram for a backdating CE, with axes "When" and "Where"; IV = 4 days, IN = 2 days, desired uncertainty = 1 day. Labels: notarization events NE0 to NE8; VE1 = TRUE, VE2 = TRUE, VE3 = TRUE; VE4 = FALSE, after which forensic analysis begins; "Notarization of 2 Reds" (twice); "Notarization of 2 Blues & 1 Green"; the CE with tc, tl and tb (backdating time).]
Uncertainty can be arbitrarily shrunk via a logarithmic number of red and blue hash chains.
18. The Polychromatic Algorithm
Introduction of extra partial hash chains:
- Reduces the uncertainty of the corruption region
- Incurs additional NS cost
Uncertainty can be arbitrarily shrunk via a logarithmic number of red and blue hash chains.
Hence, the width is no longer dependent on IV and IN.
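Why a logarithmic number of partial chains can suffice may be seen from a generic group-testing sketch. This is not the red/blue chain layout of the slides: it assumes exactly one corrupted granule (for example, one day of data), and it builds one partial hash per bit position of the granule index. All names here are hypothetical.

```python
import hashlib

def group_hash(granules, members):
    """Hash the selected granules, length-prefixing each one."""
    h = hashlib.sha256()
    for i in members:
        h.update(len(granules[i]).to_bytes(8, "big"))
        h.update(granules[i])
    return h.digest()

def bit_chains(n):
    """For each bit position j, the granule indices whose bit j is set."""
    bits = max(1, (n - 1).bit_length())  # about lg(n) chains
    return [[i for i in range(n) if (i >> j) & 1] for j in range(bits)]

def notarize_chains(granules):
    """Partial hash values that would be sent to the notarization service."""
    return [group_hash(granules, m) for m in bit_chains(len(granules))]

def locate_corrupted(granules_now, notarized):
    """Recover the index of the single corrupted granule.

    Assumes a cumulative check has already failed, so exactly one granule
    is known to be corrupted; a failing chain j means bit j of its index is 1.
    """
    index = 0
    for j, members in enumerate(bit_chains(len(granules_now))):
        if group_hash(granules_now, members) != notarized[j]:
            index |= 1 << j
    return index

days = [f"day-{d}".encode() for d in range(8)]
notarized = notarize_chains(days)

corrupted = list(days)
corrupted[5] = b"tampered"
print(len(notarized))                         # 3 chains for 8 granules
print(locate_corrupted(corrupted, notarized)) # 5
```

Eight granules need three partial hashes, sixteen need four, and so on, which is the sense in which the extra notarization cost grows only logarithmically with the desired precision.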
19. Forensic Strength
Components:
- Work of forensic analysis
- Region-area of the CE
- Width of postdating / backdating uncertainty
Inverse Forensic Strength:
IFS(D, IN, V) = (NumNotarizes(D, IN, V) + ForensicAnalysis(D, IN, V)) · RegionArea(IN, V) · UncertaintyWidth(D, IN)
where V = IV / IN is the validation factor and D is the number of days before the first validation failure. We assume that D >> IN.
- Monochromatic: O(V · D^2 · IN)
- RGB: O(V · D · IN^2)
- Polychromatic: O((V + lg IN) · D)
20. Future Work
- Develop a stronger lower bound for this problem.
- Accommodate multi-locus and complex CEs.
- Differentiate postdating and backdating CEs.
- Implement forensic analysis in the validator.
- Consider the interaction between the transaction-time storage manager and underlying WORM storage.
21. Summary
We have presented a means of performing forensic analysis.
We have introduced a graphical representation to visualize CEs, termed the corruption diagram.
We have designed three forensic algorithms:
- Monochromatic
- RGB
- Polychromatic
22. Acknowledgements
NSF grants IIS , IIS and EIA and a grant from Microsoft provided partial support for this work.