Improving Self-Defense by Learning from Limited Experience. Karen Haigh (BBN Technologies) and Steven Harp (Adventium Labs).


1 Haigh & Harp, Learning from Limited Experience
Improving Self-Defense by Learning from Limited Experience
Karen Haigh, BBN Technologies
Steven Harp, Adventium Labs

2 Overview
Goal: Systems that autonomously improve their defenses with experience.
Several ways to do this. Examples discussed:
–Learning to recognize anomalies
–Self-immunizing against observed exploits
–Acquiring multistage attack concepts
–Learning effective responses

3 Learning in Cyber Security
What is (machine) learning?
–Automatically using prior experience to improve performance over time
Problems addressable by learning?
–Detection: distinguish problem from non-problem
–Immunity:
  Good: “an exploit should succeed at most once”
  Better: “a vulnerability should be exploitable at most once”
–Response: how best to actively counter an attack?
Long-term goal: Cognitive Immunity

4 Opportunities & Techniques
Passive observation
–Detecting attacks: Relatively well explored (example: anomaly detection). Work remains to be done to detect attacks extended over multiple hosts and steps.
–Responding to attacks: Not well explored. CSISM innovation: situation-dependent utility of responses.
Experiment (e.g. in sandbox / taster / laboratory)
–Detecting attacks: Cortex innovation: use experiments to generalize from instances of attacks to classes of attacks. CSISM innovation: use experiments to identify necessary & sufficient elements of multi-step attacks.
–Responding to attacks: Not well explored. CSISM innovation: variations on responses.

5 Modelling Defended Systems
Approaches compared: expert rules, offline learning, online learning, experimental sandbox.
Expert Heuristics
  + Good data
  - Complex environment
  - Dynamic system
Offline Training
  + Good data
  + Complex environment
  - Dynamic system
Online Training
  - Unknown data
  + Complex environment
  + Dynamic system
Experimental Sandbox
  + Good data (self-labeled)
  + Complex environment
  + Dynamic system
  Very hard for an adversary to “train” the learner!

6 Complex Domain: Human Rules are Incomplete
[Figure: Registration Time by Quad, DPASA (DARPA OASIS)]
Quads 0 & 1 are slower than Quads 2 & 3.
Complex domain: human calibration (incorrectly) claimed that Quad 1 was slowest, missing Quad 0.

7 Complex Domain (2)
[Figure: Registration Time by Client Type, DPASA (DARPA OASIS)]
caf_plan, chem_haz and maf_plan are slower than other clients.
Complex domain: human calibration (incorrectly) claimed that caf_plan & maf_plan were slowest because of a hand-typed password, missing chem_haz.

8 Learning for Calibration
Calibrate the parameters of rules for normal operating conditions
–Important first step because it learns how to respond to normal conditions
–For example: learn timing parameters for the rapid response controller, e.g. Client Registration, PSQ server local probes, SELinux enforcement, SELinux flapping, File integrity checks
–Need to handle multi-modal data
CSISM / BBN
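As a rough illustration of calibrating timing parameters from observed normal behavior, the sketch below learns a per-group acceptable range from empirical percentiles. It is a stand-in technique chosen for brevity, not the method used in CSISM; the group names, the percentile cut-offs, and the functions `calibrate` and `is_normal` are all hypothetical. Grouping by client type or quad is one simple way to cope with multi-modal timings.

```python
from collections import defaultdict
from statistics import quantiles


def calibrate(samples, lower_pct=5, upper_pct=95):
    """Learn a (low, high) timing range per group from (group, seconds) pairs.

    Assumption: each group has at least two samples and is roughly unimodal.
    """
    by_group = defaultdict(list)
    for group, seconds in samples:
        by_group[group].append(seconds)

    ranges = {}
    for group, values in by_group.items():
        # quantiles(n=100) returns the 99 cut points between percentiles.
        cuts = quantiles(values, n=100, method="inclusive")
        ranges[group] = (cuts[lower_pct - 1], cuts[upper_pct - 1])
    return ranges


def is_normal(ranges, group, seconds):
    """True when an observation falls inside the learned range for its group."""
    low, high = ranges[group]
    return low <= seconds <= high
```

A registration time that is normal for a slow client type can then be flagged as anomalous for a fast one, which a single global threshold would miss.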

9 Results for CombOps Registration
[Figure panels: Beta = 0.001, Beta = 0.0025, Beta = 0.005]
If the threshold were 0.90, then x-values inside the green box would be OK.
CSISM / BBN

10 Results for all Registration times
[Figure: Beta = 0.0005]
These two “shoulder” points indicate upper and lower limits.
As more observations are collected, the estimates become more confident of the range of expected values (i.e. tighter estimates to observations).
Algorithm of Last & Kandel, 2001
CSISM / BBN

11 Generalization of Attack Signatures: Cortex Project

12 Generalization
Goal: Learn a most general concept from instances of attacks and block all similar attacks against the vulnerability.
→ Dealing with zero-day attacks... Payload Analysis
Challenges
–How to automatically recognize which element(s) of an attack are essential?
–How to generalize them to their boundary conditions? Avoid the fragility of simple pattern-matching rules.
Approach: Experimentation
–Validation of attack concepts → 0 false positives
Cortex / Honeywell
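To make "generalize to boundary conditions" concrete, here is a minimal sketch of how an experimenter might locate the boundary of a single numeric attack parameter (for example a length) by binary search against a tester. The function `find_boundary` and the `is_attack` callback are hypothetical; the sketch assumes the tester's verdict flips exactly once between a known-benign value and the observed attack value, which the slides do not state.

```python
def find_boundary(is_attack, benign_value, attack_value):
    """Return the smallest integer value the tester still reports as an attack.

    Assumes is_attack(benign_value) is False, is_attack(attack_value) is True,
    and the verdict changes exactly once in between (monotonic behavior).
    """
    low, high = benign_value, attack_value
    while high - low > 1:
        mid = (low + high) // 2
        if is_attack(mid):
            high = mid  # mid still triggers the fault; boundary is at or below mid
        else:
            low = mid   # mid is harmless; boundary is above mid
    return high


# Hypothetical tester: any length above 16 bytes crashes the tasted service.
boundary = find_boundary(lambda length: length > 16, benign_value=8, attack_value=500)
```

A blocking rule built from `boundary` covers every value at or beyond it rather than only the one instance that was observed, and it needs about log2 of the range in experiments.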

13 Generalization by Experimentation
Model contains axes of vulnerability
Payload content
–Binary machine instructions
–Unusual payload (e.g. unix commands, registry keys, database administrative commands)
–Length (# bytes/terms)
Resource consumption patterns
Probing (e.g. password guessing)
Session-wide (multiple queries)
Experiment
1) Score suspicious elements
2) Replace with innocuous or generalized values
3) Validate in tester
[Diagram labels: Taste Tester; Model of normal traffic; normal; attack; Blocking Rules]
Cortex / Honeywell

14 Cortex Demo Architecture and Use Cases
[Architecture diagram, use case: normal query from Mission Planning]
Component labels: Tasters; Replicator; RTS; AMP; CSM; Proxy (Dexter); Master DB; Learner.
Action labels: delete tasters, create tasters, switch tasters, replicate queries, heartbeat, status; replicate, switch tasters, rebuild tasters, send to learning (once per phase); block known bad queries, taste test, log results; query; read training data, experiment, generate rules.
Cortex / Honeywell

15 Cortex Demo Architecture and Use Cases
[Architecture diagram, use cases: attack gets through; attack is blocked]
Component labels: Tasters; Replicator; RTS; AMP; CSM; Proxy (Dexter); Master DB; Learner.
Action labels: delete tasters, create tasters, switch tasters, replicate queries, heartbeat, status; replicate, switch tasters, rebuild tasters, send to learning; block known bad queries, taste test, log results; query; read training data, experiment, generate rules.
Cortex / Honeywell

16 Cortex Learning Algorithm
Let M = NULL be the current model of normal traffic
Let S = NULL be the current set of suspicion scores
For an item i of traffic
–if i is normal
    Add i to M
–else
    S = compare i to M
    i_safe = Replace (all) suspicious elements S in i with innocuous elements
    Validate that i_safe is not an attack
    foreach s in sort(S)
    –s0 = calculate experimentation value for s (i.e. value to test)
    –i_test = Replace s in i_safe with s0
    –if Experiment with i_test is an attack
      »Add blocking rule for i_test
    –else Add i_test to M
Simplified for slide: the complete algorithm includes joint probabilities
Cortex / Honeywell
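The slide's loop can be sketched in Python under strong simplifying assumptions, all of them hypothetical: a traffic item is a dict of field to value, the model of normal traffic is a dict of field to the set of values seen so far, an element is "suspicious" simply when its value is unseen (so there are no graded scores and no joint probabilities), the experimentation value is the attack's own value, and `is_attack` stands in for the taster, including the initial normal-or-not decision.

```python
def learn_from_item(item, model, is_attack, blocking_rules):
    """Process one traffic item, updating the model or the blocking rules.

    Assumes every field of an attack item has at least one normal value
    already recorded in the model.
    """
    if not is_attack(item):
        # Normal traffic: fold every value into the model.
        for field, value in item.items():
            model.setdefault(field, set()).add(value)
        return

    # Suspicious elements: values never seen in normal traffic.
    # Sorted by field name here; the slide sorts by suspicion score.
    suspicious = sorted(f for f, v in item.items() if v not in model.get(f, set()))

    # Build i_safe by swapping in a known-normal value for each one.
    safe = dict(item)
    for field in suspicious:
        safe[field] = min(model[field])
    if is_attack(safe):
        return  # the suspicious elements do not explain the attack

    # Re-introduce one suspicious element at a time and test it.
    for field in suspicious:
        test = dict(safe)
        test[field] = item[field]
        if is_attack(test):
            blocking_rules.append((field, item[field]))
        else:
            model[field].add(item[field])  # harmless after all
```

Only elements that reproduce the attack in isolation become rules, which is the mechanism behind the slides' claim of validated attack concepts; elements that test clean are absorbed into the model instead.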

17 Example Results: MySQL
Attack: notes
–MySQL DOS attack: Noted that hex bytes were suspicious, so generalized bytes and correctly blocked integer overflow!
–Integer overflow: Correctly generalized single attack to 0x7FFF max value.
–String buffer overflow (password): Correctly generalized single attack to number of valid bytes.
Project was tested with a red-team model
Cortex / Honeywell

18 Identification of Multistage Attacks: CSISM Project

19 MultiStage Attacks: Challenges
Detect and generalize multi-step attacks across time and space.
–Multistage attacks involve a sequence of actions that span multiple hosts and take multiple steps to succeed.
Challenges:
–Which observations are necessary & sufficient?
  Incidental observations that are either
  –side effects of normal operations, or
  –chaff explicitly added by an attacker to divert the defender.
  Concealment (e.g. to remove evidence)
  Probabilistic actions (e.g. to improve probability of attack success)
–What are the most reliable observations?
–What are the parameter boundaries?
Approach: Experimentation
–Allows validation of pruning
CSISM / BBN

20 Architectural Schema
[Diagram. Labels: CSISM Sensors (ILC, IDS); Observations ending in failure of protected system, only some are essential (numbered 1 to 6); Attack Theory Experimenter; “Sandbox”; Viable Attack Theories; Defense Measures Experimenter; Actions (lettered A to D, X); Viable Defense Strategies and Detection Rules]
CSISM / BBN

21 Multi-Stage Learner
Do {
–Generate Theory according to heuristic (the hard part!)
  Complete set of theories is Permutations( Powerset( observations ))
–Test Theory
–Incrementally update controller rulebases
} while Theories remain
For only 10 observations, there are roughly 10,000,000 possible theories (not including variations on steps!)
CSISM / BBN
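The size of the theory space can be checked with a short sketch. It assumes one reading of Permutations(Powerset(observations)): a theory is an ordered selection of distinct observations, with the empty selection included. Under that reading ten observations yield 9,864,101 theories, on the order of the ten million quoted on the slide. The function names are hypothetical.

```python
from itertools import combinations, permutations
from math import perm


def count_theories(n):
    """Count ordered selections of distinct observations, empty one included."""
    return sum(perm(n, k) for k in range(n + 1))


def theories(observations):
    """Lazily yield every ordering of every subset, shortest first."""
    for size in range(len(observations) + 1):
        for subset in combinations(observations, size):
            yield from permutations(subset)


total_for_ten = count_theories(10)  # 9,864,101 under the stated assumption
```

Because the count grows roughly like e times n factorial, exhaustive testing stops being practical almost immediately, which is why the order in which theories are generated matters.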

22 Hypothesis Generation
Query learner generates attack hypotheses
–in heuristic order to acquire the concept rapidly
Candidate Heuristics
–Look for shorter attacks first (adjustable prior)
–Suspect order of steps has an influence
–Suspect steps to interact positively (for the attacker)
–Prefer hypotheses with less common / more suspicious elements
Project was tested with a red-team model
CSISM / BBN
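Two of the candidate heuristics, shorter attacks first and a preference for more suspicious elements, can be combined into a simple ranking, sketched below. The suspicion scores, the `max_len` cap, and the additive scoring are assumptions made for illustration; the slides do not say how the heuristics are weighted, and the interaction heuristic is not modeled.

```python
from itertools import combinations, permutations


def ranked_hypotheses(suspicion, max_len=3):
    """Return candidate step sequences in the order they would be tested.

    suspicion: dict of observation -> score (higher means more suspicious).
    Shorter sequences come first; within a length, higher total suspicion
    comes first. Orderings of the same steps are distinct hypotheses.
    """
    candidates = []
    for size in range(1, max_len + 1):
        for subset in combinations(sorted(suspicion), size):
            for order in permutations(subset):
                score = sum(suspicion[step] for step in order)
                candidates.append((size, -score, order))
    candidates.sort()
    return [order for _, _, order in candidates]


# Hypothetical scores for three observations.
queue = ranked_hypotheses({"A": 0.9, "B": 0.1, "C": 0.5}, max_len=2)
```

This version materializes every candidate up to `max_len`, which is fine for a handful of observations; a lazy priority queue would be the natural replacement for larger sets.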

23 Response Learning: CSISM Project

24 Situation-dependent Action Utilities
Learn tradeoffs among potential responses; as context changes, the appropriateness of responses changes
–Context includes descriptions of users, attack elements, system performance, etc.
–Benefit is effectiveness of defense action
–Cost includes effort to mount response and impact on availability
Challenges:
–Measuring the effect of responses is hard: Complex domain → rarely identical situations → non-deterministic actions/effects
Approach: Experimentation
–System “snapshots” get close to identical conditions
CSISM / BBN

25 Response Learning: Results Pending
Bias toward results that worked in similar situations in the past
–Hybrid Reinforcement learning and Nearest-Neighbour approaches
Given a set of hypotheses about the locus of an attack
–Search for true locus:
  Hierarchical based on system architecture
  Bias by historical attack patterns
–Select response based on similarity match to prior attacks:
  Same response when quality was high
  Alternate response when quality was low
→ Project will be tested by a red-team on 20 May 2008. Goal is to demonstrate “better” responses over time.
CSISM / BBN
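The nearest-neighbour half of that selection rule might look like the sketch below: find the most similar prior case, reuse its response if it worked well, otherwise try a different one. The Jaccard similarity over feature sets, the quality threshold, and all names are assumptions for illustration; the reinforcement-learning component and the locus search are not modeled.

```python
def similarity(a, b):
    """Jaccard overlap between two sets of context features."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def choose_response(context, history, responses, quality_threshold=0.5):
    """Pick a response using the most similar past case.

    history: list of (context_set, response, quality in [0, 1]).
    responses: all available responses, in default preference order.
    """
    if not history:
        return responses[0]
    _, past_response, quality = max(
        history, key=lambda case: similarity(context, case[0])
    )
    if quality >= quality_threshold:
        return past_response  # it worked in a similar situation: reuse it
    for candidate in responses:
        if candidate != past_response:
            return candidate  # it worked poorly: try an alternative
    return past_response
```

Recording the measured quality of each chosen response back into `history` is what would let the choices improve over time.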

26 Conclusion

27 Learning Benefits
Learning can improve the defensive posture
–better knowledge (about the attacks or attacker), better policies
Learning can improve how the system responds to symptoms
–better connection between response actions and their triggers
Active Learning
–A mechanism for recognizing zero-day attacks
–No false positives: only validated attacks are added
Learning techniques are enablers for the next level of enhancements in adaptive defense
Adaptation is the key to survival

28 From Proof-of-Concept to Production
Generalization
–Demonstrated: Able to generalize instances to classes.
–Future directions: More axes of vulnerability; more handling of joint probabilities; more domains; meta-learning to induce new axes.
Multi-stage attack
–Demonstrated: Able to identify chaff, probabilistic actions, concealment.
–Future directions: Model of normal; generalization.
Responses
–Demonstrated: Able to map context to response.
–Future directions: Richer context, richer responses; automatic measurement of benefit; scalable “snapshots”.

29 Backup

30 Multistage Attacks
Detect and then generalize multi-step attacks across time and space.
Multistage attacks involve a sequence of actions that span multiple hosts and take multiple steps to succeed.
–A sequence of actions with causal relationships.
–An action A must occur to set up the initial conditions for action B. Action B would have no effect without previously executing action A.
–For example:
  1. Gain the ability to execute commands on Box1 as an unprivileged user by exploiting a buffer overflow in Service1.
  2. Gain a root shell by running an exploit of a race condition.
  3. Disable a protection mechanism, e.g. SELinux.
  4. Replace the dpasa jar with attacker jar code.
  5. Run attacker code that sends bad refs to Box2, Box3, Box4.

31 Attacks (MySQL DoS-1)
mysql-com_table-dump-memory-corruption
–Malformed request leaves MySQL unstable
Countermeasures:
–Block the malformed com_table_dump command using learned pattern and proxy filter rules.
–Restart the server.
–Block all requests from the offending sources.

32 Attacks (MySQL DoS-2)
mysql-password-handler-buffer-overflow
–Excessive password length can crash server
Countermeasures:
–Block connections which proffer “abnormal” passwords (learned response or statistical anomaly).
–Restart the server.
–Block all requests from the offending sources.

33 Attacks (MySQL DoS-3)
mysql-remote-fulltext-search-DoS
–Malformed request crashes server
Countermeasures:
–Detect and block malformed queries.
–Block all queries of this type (fulltext-search).
–Block all requests from the offending sources.
–Restart the server.

