Fault Injection and a Timing Channel on an Analysis Technique
John A Clark and Jeremy L Jacob
Dept. of Computer Science, University of York, UK


Fault Injection and a Timing Channel on an Analysis Technique
John A Clark and Jeremy L Jacob
Dept. of Computer Science, University of York, UK
Amsterdam

Structure of the Talk
- Background
- Specific technical parts:
  - Part I: Describing the underlying perceptron problems
  - Part II: Describing simulated annealing
  - Part III: Solving by search
  - Part IV: Fault injection analogy
  - Part V: Timing channel analogy
- Conclusions and future work

Background: Side Channels for All
- Some very high profile attacks demonstrated in the past decade target the implementation rather than the algorithm:
  - Fault injection (Boneh, DeMillo and Lipton)
  - Timing attacks (Kocher)
- In this talk we aim to demonstrate that analysis techniques too may use such concepts:
  - Try to solve mutated or warped problem instances and see what happens (fault injection on the problem)
  - Observe the computational dynamics of the search (a timing channel)
- We will concentrate on general concepts

Background: Identification Problems
- Zero-knowledge (Goldwasser, Micali and Rackoff)
- Early identification scheme by Shamir
- Several more recent schemes based on NP-complete problems:
  - Permuted Kernel Problem (Shamir)
  - Syndrome Decoding (Stern)
  - Constrained Linear Equations (Stern)
  - Permuted Perceptron Problem (Pointcheval)
- We shall demonstrate some new attacks on this last problem

Part I: Underpinning Perceptron Problems
We won't go into the details of the protocols; see "A New Identification Scheme Based on the Perceptron Problems" (Pointcheval, Eurocrypt 1995).

Perceptron Problem (PP)
Given an m x n matrix A with entries in {-1,+1}, find a vector y in {-1,+1}^n so that every component of the image Ay is non-negative. This simple version is used in some experiments.

Permuted Perceptron Problem (PPP)
Given A as before, find y in {-1,+1}^n so that Ay has no negative components and, additionally, the multiset of values in Ay has a particular histogram H over the positive values 1, 3, 5, .... The extra constraint makes the problem harder.

Example: PP and PPP
[Worked example not reproduced in the transcript.] Every PPP solution is a PP solution; a PPP solution additionally has the particular histogram H of positive values 1, 3, 5, ....

Generating Instances
Suggested method of generation:
- Generate a random matrix A
- Generate a random secret S
- Calculate AS
- If any (AS)_i < 0, negate the i-th row of A
There is significant structure in this problem: a high correlation between the majority values of the matrix columns and the corresponding secret bits.
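The generation steps above can be sketched as follows. This is a minimal illustration, not the authors' code; the function names are hypothetical.

```python
import random

def generate_instance(m, n, seed=0):
    """Suggested generation method: random +/-1 matrix A and secret S,
    then negate any row of A whose dot product with S is negative,
    so that every component of AS is non-negative."""
    rng = random.Random(seed)
    S = [rng.choice((-1, 1)) for _ in range(n)]
    A = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(m)]
    for i in range(m):
        if sum(a * s for a, s in zip(A[i], S)) < 0:
            A[i] = [-a for a in A[i]]  # row negation step
    return A, S

def image(A, y):
    """The image vector Ay."""
    return [sum(a * yi for a, yi in zip(row, y)) for row in A]
```

With n odd (as in the recommended (m, m+16) sizes), each image component is an odd integer, so after negation all components are strictly positive.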

Instance Properties
- Each matrix-row/secret dot product is the sum of n Bernoulli (+1/-1) variables
- The initial image histogram therefore has a binomial shape and is symmetric about 0
- After the row negation it simply folds over to be positive
- Image elements tend to be small
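The folded binomial shape can be checked empirically with a small sketch (hypothetical helper, not from the talk):

```python
import random
from collections import Counter

def folded_image_histogram(m, n, seed=0):
    """Sample m random +/-1 rows against a random +/-1 secret of odd
    length n and histogram the dot products after the sign-fixing
    negation: the binomial shape symmetric about 0 folds onto the
    positive odd values 1, 3, 5, ..."""
    rng = random.Random(seed)
    S = [rng.choice((-1, 1)) for _ in range(n)]
    hist = Counter()
    for _ in range(m):
        row = [rng.choice((-1, 1)) for _ in range(n)]
        d = sum(r * s for r, s in zip(row, S))
        hist[abs(d)] += 1  # negation folds negative values positive
    return hist
```

Most of the mass sits on the small values near zero, which is exactly why single-bit moves so easily push small positive components negative.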

Part II: Search - Simulated Annealing

Simulated Annealing
Annealing allows non-improving moves, so that it is possible to go down in order to rise again and reach the global optimum. In practice the neighbourhood may be very large, and a trial neighbour is chosen randomly; it is possible to accept a worsening move even when improving ones exist. [Figure: a cost landscape z(x) over candidate solutions x0 to x13.]

Simulated Annealing
- Improving moves are always accepted
- Non-improving moves may be accepted probabilistically, in a manner depending on the temperature parameter T. Loosely: the worse the move, the less likely it is to be accepted; and a worsening move is less likely to be accepted the cooler the temperature
- The temperature T starts high and is gradually cooled as the search progresses. Initially virtually anything is accepted; at the end only improving moves are allowed (and the search effectively reduces to hill-climbing)

Simulated Annealing
- Current candidate x; minimisation formulation
- At each temperature in the cooling cycle, consider 400 moves
- Always accept improving moves
- Accept worsening moves probabilistically: it gets harder to do this the worse the move, and harder as the temperature decreases

Simulated Annealing: Iteration
At each temperature iteration 1, 2, 3, ..., m, ..., n in turn, do 400 trial moves.
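The annealing loop described above can be sketched generically. The geometric cooling schedule and the starting temperature are assumptions (the slides only say the temperature is "gradually cooled"); 400 moves per temperature matches the slides.

```python
import math
import random

def anneal(cost, neighbour, x0, t0=10.0, alpha=0.95,
           moves_per_temp=400, n_temps=100, seed=0):
    """Generic simulated annealing skeleton: at each temperature do 400
    trial moves; always accept improvements; accept a worsening move
    with probability exp(-delta/T); cool geometrically (assumed)."""
    rng = random.Random(seed)
    x, t = x0, t0
    best, best_cost = x0, cost(x0)
    for _ in range(n_temps):
        for _ in range(moves_per_temp):
            y = neighbour(x, rng)
            delta = cost(y) - cost(x)
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                x = y
                if cost(x) < best_cost:
                    best, best_cost = x, cost(x)
        t *= alpha  # geometric cooling (an assumption)
    return best, best_cost
```

For the PP attack, `cost` would be the negativity cost and `neighbour` a single-bit flip; here any cost/neighbour pair works.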

Part III: Solving By Search

Using Search
Aim: search the space of possible secret vectors x to find one that is an actual solution to the problem at hand.
- Define a cost function: vectors that nearly solve the problem have low cost; vectors that are far from solving the problem have high cost
- Define a means of generating neighbours of the current vector
- Define a means of determining whether to move to a neighbour or not

PP Using Search: Pointcheval
Pointcheval couched the Perceptron Problem as a search problem:
- The current solution is a candidate vector y
- The neighbourhood is defined by single-bit flips of the current solution
- The cost function punishes any negative image components: costNeg(y) is the sum of the magnitudes of the negative components of Ay. For example, if the image contains the values -1 and -3, costNeg(y) = |-1| + |-3| = 4
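The cost function and neighbourhood above can be written directly (a sketch; the image helper is hypothetical):

```python
def image(A, y):
    """The image vector Ay."""
    return [sum(a * yi for a, yi in zip(row, y)) for row in A]

def cost_neg(A, y):
    """Pointcheval's PP cost: punish any negative image component,
    costNeg(y) = sum over i of max(-(Ay)_i, 0)."""
    return sum(max(-v, 0) for v in image(A, y))

def flip(y, k):
    """Single-bit-flip neighbour: negate element k of the current solution."""
    z = list(y)
    z[k] = -z[k]
    return z
```

A candidate with image components (-1, -3, 2) costs |-1| + |-3| = 4, matching the slide's example.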

Using Annealing: Pointcheval
- A PPP solution is also a PP solution, so Pointcheval based estimates of cracking PPP on the ratio of PP solutions to PPP solutions
- He calculated the matrix sizes for which this should be most difficult, giving rise to (m,n) = (m, m+16), and recommended (m,n) = (101,117), (131,147), (151,167)
- He gave estimates of the number of years needed to solve PPP using annealing to find PP solutions
- Instances with matrices of size 200 'could usually be solved within a day', but no PPP problem instance bigger than 71 was ever solved this way 'despite months of computation'

Perceptron Problem (PP)
Knudsen and Meier's approach in 1999 (loosely):
- Carry out sets of annealing runs
- Note the positions where the results obtained all agree
- Fix those elements where there is complete agreement, carry out a new set of runs, and so on
If repeated runs give the same value for a particular bit, the assumption is that the bit is actually set correctly. They used this sort of approach to solve instances of the PP problem up to 180 times faster than before for the (151,167) problem.

Profiling Annealing
The approach is not without its problems: not all bits on which the runs completely agree are correct. [Table: the actual secret against runs 1 to 6; all runs agree on several positions, and on one of them all agree wrongly.]

Knudsen and Meier (1999)
- Used this method to attack PPP problem sizes (101,117)
- Uses an enumeration stage (to search for wrong bits)
- Used a new cost function with a histogram punishment term: cost(y) = w1 * costNeg(y) + w2 * costHist(y), with w1 = 30 and w2 = 1
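A sketch of such a combined cost. The slides do not specify the histogram distance, so the L1 difference between observed and target histograms used here is an assumption, as are the helper names:

```python
def image(A, y):
    """The image vector Ay."""
    return [sum(a * yi for a, yi in zip(row, y)) for row in A]

def cost_neg(A, y):
    """Sum of magnitudes of negative image components."""
    return sum(max(-v, 0) for v in image(A, y))

def cost_hist(A, y, H):
    """Histogram penalty (assumed L1 distance): sum of absolute
    differences between the observed histogram of image values and
    the target histogram H, given as a dict value -> count."""
    obs = {}
    for v in image(A, y):
        obs[v] = obs.get(v, 0) + 1
    keys = set(obs) | set(H)
    return sum(abs(obs.get(k, 0) - H.get(k, 0)) for k in keys)

def cost_ppp(A, y, H, w1=30, w2=1):
    """Knudsen-Meier style cost: w1*costNeg(y) + w2*costHist(y)."""
    return w1 * cost_neg(A, y) + w2 * cost_hist(A, y, H)
```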

Part IV: Fault Injection

PP Move Effects
What limits the ability of annealing to find a PP solution?
- A move changes a single element of the current solution
- We want current negative image values to go positive
- But changing a bit to cause negative values to go positive will often cause small positive values to go negative

Problem Fault Injection
- Can significantly improve results by punishing at a positive value K: for example, punish any image value less than K = 4 during the search. This drags the elements away from the boundary during the search
- Also use a higher exponent R in the differences, e.g. |w_i - K|^2 rather than the simple deviation
- Values used: (201,217): K = 20, 15, 10; (401,417): K = 30, 25, 20, 15; (501,517): K = 25; (601,617): K = 25; with R = 2 and R = 3
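The fault-injected ('warped') cost can be sketched as follows (illustrative names; the image helper is hypothetical):

```python
def image(A, y):
    """The image vector Ay."""
    return [sum(a * yi for a, yi in zip(row, y)) for row in A]

def cost_fault(A, y, K=4, R=2):
    """Warped PP cost: punish any image value below the positive bound
    K, raised to exponent R, i.e. sum of max(K - (Ay)_i, 0)^R.
    Setting K > 0 drags image elements away from the 0 boundary."""
    return sum(max(K - v, 0) ** R for v in image(A, y))
```

With K = 0 and R = 1 this reduces to the plain negativity cost; the warping lies entirely in the choice of K and R.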

Results for PP Fault Injection
Have solved instances of sizes up to (601,617) (numbers of solutions from 30 runs; the per-size counts are not reproduced in this transcript). Some were solved directly, others after a 1-, 2- or 3-bit local search. The secret vectors solved are three times as long as previously.

PP Solution Correlation with Generating Secrets
- (201,217): 79.2%-87.1%
- (401,417): 83.4%-87.5%
- (501,517): 80.6%-86.4%
- (601,617): 77.5%-86.1%

PPP Extensions
- Used a cost function similar to Knudsen and Meier's, but with fault injection on the negativity part (plus different exponents)
- Attack each PPP problem instance using a variety of different weightings G, bounds K and values of the exponent R. These are different 'viewpoints' on each problem

PPP Results: Final Bits Correct
A consequence is that warped problems typically give rise to solutions with more agreement with the original secret than non-warped ones. For example:
- (101,117): up to 108 bits correct
- (131,147): up to 139 bits correct
- (151,167): up to 157 bits correct
However, results may vary considerably, including between runs on the same problem.

Democratic Viewpoint Analysis
Attack warped problems P_1, P_2, ..., P_n derived from the original problem P. Essentially the same as Knudsen and Meier's approach, but this time go for substantial rather than unanimous agreement. By choosing the amount of disagreement tolerated carefully you can sometimes recover over half the key this way; on occasion only 1 bit among the 115 most-agreed bits (out of 167) was incorrect.
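The voting step can be sketched as below. The threshold parameterisation is an illustrative assumption: at threshold 1.0 it reproduces the unanimous-agreement rule, and lower thresholds give the "substantial agreement" variant.

```python
def majority_vote(runs, threshold=1.0):
    """Aggregate final solutions from runs on warped problem instances.
    For each position, if the fraction of runs agreeing with the
    majority value is at least `threshold`, fix that bit to the
    majority value; otherwise leave it undecided (None).
    threshold=1.0 recovers the unanimous Knudsen-Meier rule."""
    n_runs = len(runs)
    fixed = []
    for j in range(len(runs[0])):
        ones = sum(1 for r in runs if r[j] == 1)
        if ones >= n_runs - ones:
            maj, votes = 1, ones
        else:
            maj, votes = -1, n_runs - ones
        fixed.append(maj if votes / n_runs >= threshold else None)
    return fixed
```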

Part V: Timing Channel:PPP

Profiling Annealing: Timing
- A lot of information is thrown away; it is better to monitor the search process as it cools down. Based on the notion of thermostatistical annealing
- Analysis shows that some elements take particular values early in the search and never subsequently change: they get 'stuck' early in the search
- The ones that get stuck early often do so for good reason: they are stuck at the correct values
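Detecting when each bit gets stuck can be sketched by post-processing the search trajectory (an illustrative helper, not the authors' instrumentation):

```python
def track_stuck_times(trajectory):
    """Given the sequence of candidate vectors visited during a cooling
    run, return, for each position, the last iteration at which it
    changed value, i.e. the point after which it 'got stuck'.  Bits
    that stick early are conjectured to stick at their correct
    values, which is the timing channel exploited here."""
    n = len(trajectory[0])
    last_change = [0] * n
    for t in range(1, len(trajectory)):
        for j in range(n):
            if trajectory[t][j] != trajectory[t - 1][j]:
                last_change[j] = t
    return last_change
```

Sorting positions by `last_change` then ranks bits from most to least trustworthy.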

Results: Initial Bits Correct
The timing profile of warped problems can reveal significant information. For example:
- (101,117): up to 72 initial bits correct
- (131,147): up to 97 initial bits correct
- (151,167): up to 98 initial bits correct
Again, results may vary considerably, including between runs on the same problem.

PPP (101, 117)

PPP (131, 147)

PPP (151, 167)

Multiple Clock Watchers Analysis
Attack warped problems P_1, P_2, ..., P_n as before. Essentially the same as the timing analysis, but this time add up, over all runs, the times at which each bit got stuck. As you might expect, the bits that often get stuck early (i.e. have low aggregate times to getting stuck) generally do so at their correct values (take the majority value). This also seems to have significant potential, but needs more work.

Conclusions I
- Search techniques have a computational dynamics too
- We have profiled the action of annealing on various warped problems, i.e. mutants of the original problem. There is an analogy with fault injection, though here it is fault injection on public mathematics
- The trajectory by which a search reaches its final state may reveal more information about the sought secret than the final result of the search: a timing channel on an analysis technique

Future Work
- A local optimum is a strong source of information for cryptanalysis purposes. Can more subtle use be made of the distribution of local optima found using annealing searches? Use the 'results' of optimising as sources of information
- Can we detect secrets with extreme correctness properties?
- MAX-XOR problems: if you are given a large number of linear approximations for key bits (some of which may be misleading), what happens if you try to maximise the number solved?