Security Procedures
Master Program in Web Science, Veroia, March 2010
Y.C. Stamatiou, Department of Mathematics, University of Ioannina, and Research and Academic Computer Technology Institute

Cryptography!
It is all about the following simple, but highly important, scenario:

Cryptanalysis

What is used in Cryptology?
Cryptography: linear algebra, abstract algebra, number theory
Cryptanalysis: probability, statistics, combinatorics, computing
But the foundations lie in Complexity Theory! In essence, cryptology resulted from a “collaboration” between Number Theory and Complexity Theory!

Turing machine: the mathematical model of the computer! (Alan Turing)
An infinite tape divided into cells (memory)
Each cell can hold one input/output symbol, usually a bit (0 or 1), or the blank (#)
A head that can read/write a cell and move about on the tape
A “decision making” mechanism (state transition) over the states q0, q1, …, qn, which maps a pair such as (q1, 0) to a new state, a symbol to write and a head movement

An algorithm!
The “program” below computes the difference between two positive integers m and n (only if m > n, otherwise it “returns” 0) given in the form 0^m 1 0^n on the tape of the Turing machine (isn’t it a bit reminiscent of good, old Assembly?). Each entry reads (next state, symbol written, head move), with R = right and L = left:
States: q0 q1 q2 q3 q4 q5 q6
On reading 0: (q1,#,R) (q1,0,R) (q3,1,L) (q3,0,L) (q4,0,L) (q5,#,R) - (stops)
On reading 1: (q2,1,R) (q4,#,L)
On reading #: (hangs) (q0,#,R) (q6,0,R) (q6,#,R)

Computation resources
Memory (number of tape cells/memory locations used)
Time (number of movements of the read/write head)
Time/space complexity functions t(n), s(n), where n is the size of the input
It is important not to have combinatorial explosion for these functions, so as to avoid an exponential increase in time/space requirements as the input size increases
The complexity functions that avoid the combinatorial explosion are called polynomial
An important note! The size of, e.g., an array or a list of numbers is roughly equal to the number of elements! The size of an integer n is not n, but log n (the base is immaterial)!

Observe how the functions that are bounded from above by a polynomial have “reasonable” rate of increase!

Two important time complexity classes of problems
P: Problems for which there exists a polynomial time deterministic Turing machine (algorithm) that solves them
NP: Problems for which a polynomial time non-deterministic Turing machine exists. P is contained in NP; for many problems in NP, no polynomial time deterministic Turing machine that solves them has been discovered yet!

Integers! God made the integers; all else is the work of man
Leopold Kronecker (1823 – 1891)

Primes: the building blocks of integers!
Prime numbers are integers greater than 1 whose only divisors are 1 and themselves, i.e., they cannot be written as a product of smaller integers greater than 1
e.g. 2, 3, 5, 7 are prime but 4, 6, 8, 9, 10 are not
Prime numbers are central to number theory and to the techniques discussed here
The set of primes is infinite (Euclid)
Stallings Table 8.1 lists the primes (the slide showed the excerpt below 200). Note the way the primes are distributed, in particular the number of primes in each range of 100 numbers.
From Wolfram Demonstration Projects
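
The list of primes below 200 mentioned on this slide can be regenerated with a sieve. The following is a minimal sketch; the sieve of Eratosthenes and the function name `primes_below` are illustrative choices, not something the slides prescribe.

```python
def primes_below(limit):
    """Return all primes p < limit using the sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = [True] * limit
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            # Every multiple of i from i*i upward is composite.
            for multiple in range(i * i, limit, i):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]


primes = primes_below(200)
print(primes)
```

Counting the entries per block of 100 shows the thinning-out of primes that the slide alludes to.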

Prime Factorisation
To factor an integer n is to write it as a product of other numbers greater than 1
The prime factorisation of an integer n is its decomposition into a product of (powers of) primes, obtained by factorising all the factors as far as possible
e.g. 91 = 7x13, 3600 = 2^4 x 3^2 x 5^2
Important! Factoring an integer is hard compared to the ease of multiplying the factors together to generate the integer!

Relatively Prime Numbers & GCD
Two integers a and b are relatively prime if they have no common divisors other than 1
e.g. 8 & 15 are relatively prime, since the divisors of 8 are 1,2,4,8 and of 15 are 1,3,5,15, so 1 is the only common divisor
More generally, the Greatest Common Divisor GCD(a,b) is the largest number that divides both a and b. We can determine it by comparing the prime factorizations and using the least powers
e.g. 300 = 2^2 x 3^1 x 5^2, 18 = 2^1 x 3^2, hence gcd(18,300) = 2^1 x 3^1 x 5^0 = 6
Of course, GCDs are computed much faster with Euclid’s algorithm!
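
Euclid's algorithm, mentioned above, fits in a few lines. This is a sketch of the standard iterative form; the function name `euclid_gcd` is chosen here for illustration.

```python
def euclid_gcd(a, b):
    """Greatest common divisor via Euclid's algorithm."""
    while b:
        # gcd(a, b) = gcd(b, a mod b); the second argument shrinks each step.
        a, b = b, a % b
    return a


print(euclid_gcd(18, 300))  # the slide's example
print(euclid_gcd(8, 15))    # relatively prime
```

No factorization is needed, which is why this is so much faster than comparing prime factorizations.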

Fermat's Little Theorem (FLT)
The following holds: a^(p-1) ≡ 1 (mod p), where p is prime and gcd(a,p)=1 (i.e. a, p are coprime)
Also: a^p ≡ a (mod p), for any integer a
Useful result in public key cryptography and primality testing
Fermat’s theorem (also known as Fermat’s Little Theorem) and Euler’s theorem play important roles in public-key cryptography. See Stallings section 8.2 for the proof.

Euler Totient Function φ(n)
When doing arithmetic (addition/multiplication) modulo n:
The complete set of residues is 0..n-1 (i.e. the set of remainders when an integer is divided by n)
The reduced set of residues consists of those residues which are relatively prime to n
e.g. for n = 10: the complete set of residues is {0,1,2,3,4,5,6,7,8,9} and the reduced set of residues is {1,3,7,9}
The number of elements in the reduced set of residues is called the Euler Totient Function φ(n), i.e. the number of positive integers less than n and relatively prime to n. By convention, φ(1) = 1.

Euler Totient Function φ(n)
To compute φ(n) we need to count the number of residues to be excluded. In general this requires the prime factorization of n, but there are two useful special cases:
for p prime, φ(p) = p-1
for p, q distinct primes, φ(pq) = (p-1)x(q-1)
e.g. φ(37) = 36, φ(21) = (3–1)x(7–1) = 2x6 = 12
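
To make the definition concrete, here is a sketch that computes φ(n) by direct counting. It is only sensible for small n, and the function name `phi` is an illustrative choice.

```python
from math import gcd


def phi(n):
    """Euler totient by counting k in 1..n with gcd(k, n) = 1.

    Including k = n makes phi(1) = 1 come out right; for n > 1,
    gcd(n, n) = n, so n itself is never counted.
    """
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


print(phi(10), phi(37), phi(21))
```

The special cases φ(p) = p-1 and φ(pq) = (p-1)(q-1) can be checked against this brute-force count.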

Euler's Theorem
A generalisation of Fermat's Theorem to any modulus n: a^φ(n) ≡ 1 (mod n) for any a, n where gcd(a,n)=1
e.g. a = 3; n = 10; φ(10) = 4; hence 3^4 = 81 ≡ 1 (mod 10)
a = 2; n = 11; φ(11) = 10; hence 2^10 = 1024 ≡ 1 (mod 11)
See Stallings section 8.2 for its proof.
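
Both theorems are easy to check numerically with Python's built-in three-argument `pow`. The helper names below (`phi`, `euler_holds`, `fermat_holds`) are assumptions made for this sketch.

```python
from math import gcd


def phi(n):
    """Euler totient by direct counting (small n only)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def euler_holds(a, n):
    """Check a^phi(n) = 1 (mod n) for a coprime to n."""
    return gcd(a, n) == 1 and pow(a, phi(n), n) == 1 % n


def fermat_holds(a, p):
    """Check a^p = a (mod p); p is assumed to be prime."""
    return pow(a, p, p) == a % p


print(euler_holds(3, 10), euler_holds(2, 11))
print(all(fermat_holds(a, 13) for a in range(13)))
```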

Primality Testing
We often need to find large prime numbers: many cryptographic functions require one or more very large primes selected at random, so we must decide whether a given large number is prime.
Traditionally: sieve using trial division, i.e. divide by all numbers (primes) in turn up to the square root of the number. This only works for small numbers.
Alternatively: use statistical primality tests based on properties that all prime numbers satisfy, but that some composite numbers, called pseudo-primes, also satisfy.
For certainty: follow up with a slower deterministic primality test, such as the AKS test.

The Miller-Rabin Test
A primality test based on Fermat’s Theorem: if p is prime then a^(p-1) ≡ 1 (mod p). Observe, however, that this theorem is not an “if and only if” theorem!
The Miller-Rabin primality test is a probabilistic, polynomial time algorithm and is typically used to test a large number for primality.
The AKS primality test is a deterministic, polynomial time algorithm.
See Stallings section 8.3 for the proof, which is based on Fermat’s theorem.

Algorithm Miller-Rabin probabilistic primality test
MILLER-RABIN(n,t)
INPUT: an odd integer n ≥ 3 and security parameter t ≥ 1.
OUTPUT: an answer “prime” or “composite”.
1. Write n – 1 = 2^s r such that r is odd.
2. For i from 1 to t do the following:
  2.1 Choose a random integer a, 2 ≤ a ≤ n – 2.
  2.2 Compute y = a^r mod n.
  2.3 If y ≠ 1 and y ≠ n – 1 then do the following:
    j ← 1.
    While j ≤ s – 1 and y ≠ n – 1 do the following:
      Compute y ← y^2 mod n.
      If y = 1 then return (“composite”).
      j ← j + 1.
    If y ≠ n – 1 then return (“composite”).
3. Return (“prime”).
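
The pseudocode above translates almost line by line into Python. This is a sketch, not a hardened implementation: the `rng` parameter and the special case for n = 3 (where the range 2..n-2 is empty) are additions made here.

```python
import random


def miller_rabin(n, t, rng=random):
    """Return "prime" or "composite" for an odd integer n >= 3."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd integer >= 3")
    if n == 3:
        return "prime"  # no base a with 2 <= a <= n - 2 exists
    # Step 1: write n - 1 = 2^s * r with r odd.
    r, s = n - 1, 0
    while r % 2 == 0:
        r //= 2
        s += 1
    # Step 2: t independent rounds with random bases.
    for _ in range(t):
        a = rng.randint(2, n - 2)
        y = pow(a, r, n)
        if y != 1 and y != n - 1:
            j = 1
            while j <= s - 1 and y != n - 1:
                y = pow(y, 2, n)
                if y == 1:
                    return "composite"
                j += 1
            if y != n - 1:
                return "composite"
    return "prime"


print(miller_rabin(2**61 - 1, 10))
print(miller_rabin(561, 20))  # 561 = 3 * 11 * 17 is a Carmichael number
```

A prime is never reported as composite; only the "prime" answer carries a (small) probability of error.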

Probabilistic Considerations
If Miller-Rabin returns “composite” the number is definitely not prime; otherwise it is either a prime or a pseudo-prime.
The chance that a composite number passes one round of the test is < 1/4.
Hence, if we repeat the test with different random a, the chance that n is prime after t tests is: Pr(n prime after t tests) = 1 – (1/4)^t
This converges exponentially fast to 1, e.g. for t = 10 the chance of error is < (1/4)^10, i.e. below one in a million.
If we really need certainty, we can then expend the effort to run a deterministic primality proof such as AKS.

Prime Number Distribution
The prime number theorem states that primes near n are spaced on average one every ln(n) integers; thus prime numbers abound!
Moreover, even numbers can be ignored immediately, so in practice one needs to test only about 0.5 ln(n) numbers of size n to locate a prime.
e.g. for numbers around 2^200 we would check 0.5 ln(2^200) ≈ 69 numbers on average.
Note this is only the “average”: sometimes primes are close together (successive odd numbers can both be prime) and other times there are long runs of composites.

Chinese Remainder Theorem
Used to speed up modular computations when working modulo a product of numbers, e.g. mod M = m1 m2 .. mk
The Chinese Remainder Theorem lets us work modulo each mi separately
Since computational cost grows with the size of the modulus, this is faster than working in the full modulus M
The Chinese remainder theorem (CRT) is one of the most useful results of number theory, so called because it is believed to have been discovered by the Chinese mathematician Sun-Tse in around 100 AD. It is very useful in speeding up some operations in the RSA public-key scheme, since it allows calculations to be performed modulo the factors of the modulus and then combined to get the actual result.

Chinese Remainder Theorem
CRT can be implemented in several ways. To compute A (mod M), with the mi pairwise relatively prime:
first compute all ai = A mod mi separately
determine the constants ci = Mi x (Mi^(-1) mod mi), where Mi = M/mi
then combine the results to get the answer using: A ≡ (a1 c1 + a2 c2 + .. + ak ck) (mod M)
One of the useful features of the Chinese remainder theorem is that it provides a way to manipulate (potentially very large) numbers mod M in terms of tuples of smaller numbers. This can be useful when M is 150 digits or more. However, note that it is necessary to know the factorization of M beforehand. See worked examples in Stallings section 8.4.
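
The recombination step can be sketched as follows, assuming the moduli are pairwise relatively prime and Python 3.8+ (for `math.prod` and modular inverses via `pow(x, -1, m)`). The function name `crt` and the sample numbers are illustrative.

```python
from math import prod


def crt(residues, moduli):
    """Recover A mod M from the residues a_i = A mod m_i."""
    M = prod(moduli)
    total = 0
    for a_i, m_i in zip(residues, moduli):
        M_i = M // m_i
        # c_i is 1 modulo m_i and 0 modulo every other modulus.
        c_i = M_i * pow(M_i, -1, m_i)
        total += a_i * c_i
    return total % M


A = 973
moduli = [37, 49]
residues = [A % m for m in moduli]
print(residues, crt(residues, moduli))
```

Arithmetic on the tuple of residues can be done component by component, and `crt` maps the result back to a single number mod M.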

Primitive Roots
From Euler’s theorem we have a^φ(n) mod n = 1
Consider a^m ≡ 1 (mod n), with gcd(a,n)=1: such an m must exist (m = φ(n) works), but the smallest one may be smaller
Once the powers reach m, the cycle repeats
If the smallest such m is φ(n), then a is called a primitive root
If p is prime, then the successive powers of a primitive root a "generate" all nonzero residues mod p
Such generators are used in a number of public-key algorithms, but they are relatively hard to find

Discrete Logarithms
The inverse problem to exponentiation is to find the discrete logarithm of a number modulo p, that is, to find x such that y = g^x (mod p). This is written as x = log_g y (mod p).
If g is a primitive root then the discrete logarithm always exists; otherwise it may not, e.g.
x = log_3 4 mod 13 does not exist
x = log_2 3 mod 13 = 4 (e.g. by trying successive powers)
Whilst exponentiation is relatively easy, finding discrete logarithms is generally a computationally hard problem, believed to be about as hard as factoring (no proof of hardness is known for either). Discrete logarithms are fundamental to a number of public-key algorithms, including Diffie-Hellman key exchange and the digital signature algorithm (DSA).
This is an example of a problem that is "easy" one way (raising a number to a power), but "hard" the other (finding the power to which a number must be raised to give the desired answer). Problems with this type of asymmetry are very rare, but are of critical usefulness in modern cryptography.
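
"Trying successive powers" can be written down directly. The sketch below is a brute-force search that takes time proportional to p, which is exactly why it is useless for cryptographic sizes; the function name `discrete_log` is an illustrative choice.

```python
def discrete_log(g, y, p):
    """Smallest x >= 0 with g^x = y (mod p), or None if no such x exists."""
    value = 1  # g^0
    for x in range(p - 1):
        if value == y % p:
            return x
        value = value * g % p
    return None


print(discrete_log(2, 3, 13))  # 2 is a primitive root mod 13
print(discrete_log(3, 4, 13))  # the powers of 3 mod 13 only cycle through 1, 3, 9
```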

One-Way Functions: Number Theory meets Complexity Theory!
A function f: D → R is called one-way if:
Computing f(x) is “easy” (i.e. takes polynomial time).
Computing f^(-1)(y) is “hard” for almost all images y.
e.g. (under the Discrete Logarithm assumption): for a prime p and a generator g of Zp*, f(x) = g^x (mod p).

Public key cryptography
Factoring related: RSA, Rabin
Discrete-log related: Diffie-Hellman (ElGamal)
Elliptic curves
Modern, lattice based: Ajtai-Dwork (the only one for which a worst-case to average-case hardness reduction is known), Goldreich-Goldwasser-Halevi

RSA Invented by Rivest, Shamir and Adleman in 1978
Based on the difficulty of factoring.
Used to “hide” the size of the group Zn*, since |Zn*| = φ(n) = (p-1)(q-1) is hard to compute without the factors of n.
Factoring has not been reduced to RSA: an algorithm that generates m from c does not give an efficient algorithm for factoring.
On the other hand, factoring has been reduced to finding the private key: there is an efficient algorithm for factoring given one that can find the private key.

RSA Public-key Cryptosystem
What we need:
p and q, primes of approximately the same size
n = pq, φ(n) = (p-1)(q-1)
e ∈ Zφ(n)* (i.e. gcd(e, φ(n)) = 1)
d = e^(-1) mod φ(n)
Public Key: (e,n)
Private Key: d
Encode: m ∈ Zn, E(m) = m^e mod n
Decode: D(c) = c^d mod n

RSA continued
Why it works: since ed = 1 + k(p-1)(q-1) for some integer k,
D(c) = c^d mod n = c^d mod pq = m^(ed) mod pq = m^(1 + k(p-1)(q-1)) mod pq = m · (m^(p-1))^(k(q-1)) mod pq = m · (m^(q-1))^(k(p-1)) mod pq
By Fermat's Little Theorem: m · (m^(p-1))^(k(q-1)) = m mod p and m · (m^(q-1))^(k(p-1)) = m mod q
Chinese Remainder Theorem: if p and q are relatively prime, and a = b mod p and a = b mod q, then a = b mod pq.
Hence D(c) = m mod pq
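
A toy run of the scheme makes the identities above tangible. The primes, exponent and message below are illustrative values chosen for this sketch (far too small for real use), and no padding is applied.

```python
# Toy parameters: assumptions for illustration only.
p, q = 61, 53
n = p * q                  # 3233
phi_n = (p - 1) * (q - 1)  # 3120
e = 17                     # must satisfy gcd(e, phi_n) = 1
d = pow(e, -1, phi_n)      # modular inverse (Python 3.8+)


def encode(m):
    """E(m) = m^e mod n."""
    return pow(m, e, n)


def decode(c):
    """D(c) = c^d mod n."""
    return pow(c, d, n)


m = 65
c = encode(m)
print(d, c, decode(c))
```

Decoding recovers m for every m in Zn, not just those coprime to n, which matches the CRT argument on the slide.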

RSA computations
To generate the keys, we need to:
Find two primes p and q. Generate candidates and use primality testing to filter them.
Find e^(-1) mod (p-1)(q-1). Use Euclid’s algorithm. Takes time log^2(n).
To encode and decode:
Take m^e or c^d. Use the power method. Takes time log(e) log^2(n) and log(d) log^2(n).
In practice e is selected to be small so that encoding is fast.

Security of RSA
Warning: do not use this or any other algorithm naively!
Possible security holes:
Need to use “safe” primes p and q. In particular p-1 and q-1 should have large prime factors.
p and q should not have the same number of digits (more precisely, they should not be too close to each other); otherwise one can use a “middle attack” starting at sqrt(n).
e cannot be too small.
Don’t use the same n for different e’s.
You should always “pad”.

Algorithm to factor given d and e
If an attacker has an algorithm that generates d from e, then he/she can factor n in probabilistic polynomial time (PPT). The algorithm is a variant of the Rabin-Miller primality test.
Function TryFactor(e,d,n)
  write ed – 1 as 2^s r, r odd
  choose w at random < n
  v = w^r mod n
  if v = 1 then return(fail)
  while v ≠ 1 mod n
    v0 = v
    v = v^2 mod n
  if v0 = n - 1 then return(fail)
  return(pass, gcd(v0 + 1, n))
This is a Las Vegas algorithm: the probability of pass is > .5, and it returns p or q if it passes, so try until you pass.
Why it works: w^(2^s r) = w^(ed-1) = w^(kφ) = 1 mod n, so on exit v0^2 = 1 mod n and (v0 – 1)(v0 + 1) = k’n. The only square roots of 1 modulo a prime are 1 and -1, so a v0 other than 1 or n-1 reveals a factor of n.
This algorithm gives a hint of why you should have a large prime factor for (p-1) and/or (q-1).
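
A runnable sketch of TryFactor follows. Two details are additions made here and not part of the slide: the explicit `rng` argument, and the early return when the random w happens to share a factor with n (in which case the gcd is already a factor). The toy key (e, d, n) = (17, 2753, 3233) is an illustrative assumption.

```python
import random
from math import gcd


def try_factor(e, d, n, rng):
    """One Las Vegas attempt: return a nontrivial factor of n, or None on fail."""
    # Write ed - 1 as 2^s * r with r odd (only r is needed below).
    r = e * d - 1
    while r % 2 == 0:
        r //= 2
    w = rng.randrange(2, n - 1)
    g = gcd(w, n)
    if g > 1:
        return g  # lucky draw: w itself shares a factor with n
    v = pow(w, r, n)
    if v == 1:
        return None
    # Keep squaring until we hit 1; v0 is then a square root of 1 mod n.
    while v != 1:
        v0 = v
        v = v * v % n
    if v0 == n - 1:
        return None  # trivial square root, no information
    return gcd(v0 + 1, n)


rng = random.Random(0)
factor = None
while factor is None:  # "try until you pass"
    factor = try_factor(17, 2753, 3233, rng)
print(factor, 3233 // factor)
```

The loop terminates because w^(ed-1) = 1 mod n whenever w is coprime to n, so repeated squaring of w^r must reach 1.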

RSA in the “Real World”
Part of many standards: PKCS, ITU X.509, ANSI X9.31, IEEE P1363
Used by: SSL, PEM, PGP, Entrust, …
The standards specify many details on the implementation, e.g.
e should be selected to be small, but not too small
“multi prime” versions make use of n = pqr…; this makes it cheaper to decode, especially in parallel (uses the Chinese remainder theorem).

Factoring in the Real World
Quadratic Sieve (QS): used in 1994 to factor a 129 digit (428-bit) number. Many machines, 8 months.
Number Field Sieve (NFS): used in 1999 to factor a 155 digit (512-bit) number. 35 CPU years. At least 4x faster than QS.
The RSA Challenge numbers

ElGamal
Based on the difficulty of the discrete log problem.
Invented in 1985
Digital signature and key-exchange variants
DSA (part of the Digital Signature Standard) is based on ElGamal
Incorporated in SSL (as is RSA)
Public key used by TRW (avoided RSA patent)
Works over various groups: Zp*, the multiplicative group of GF(p^n), elliptic curves

ElGamal Public-key Cryptosystem
(G,*) is a group, selected so that it is hard to solve the discrete log problem
α is a generator for G
a ∈ Z|G|, β = α^a
Public Key: (α, β) and some description of G
Private Key: a
Encode: pick random k ∈ Z|G|, E(m) = (y1, y2) = (α^k, m * β^k)
Decode: D(y) = y2 * (y1^a)^(-1) = (m * β^k) * (α^(ka))^(-1) = m * β^k * (β^k)^(-1) = m
You need to know a to easily decode y!
Can you give an example in which solving the discrete log is simple? All finite cyclic groups of order n are isomorphic to (Zn,+), so why isn’t it always easy?

ElGamal: Example
G = Z11*, α = 2, a = 8, β = 2^8 (mod 11) = 3
Public Key: (2, 3), Z11*
Private Key: a = 8
Encode: 7
Pick random k = 4
E(m) = (2^4, 7 * 3^4) = (5, 6) (mod 11)
Decode: (5, 6)
D(y) = 6 * (5^8)^(-1) = 6 * 4^(-1) = 6 * 3 (mod 11) = 7
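
The worked example can be replayed in code. This sketch specialises the scheme to the group Zp* and uses the slide's numbers; the function names are illustrative, and in practice k must be fresh and random for every message.

```python
def elgamal_encrypt(m, k, alpha, beta, p):
    """E(m) = (alpha^k, m * beta^k) in Zp*."""
    return pow(alpha, k, p), m * pow(beta, k, p) % p


def elgamal_decrypt(y1, y2, a, p):
    """D(y) = y2 * (y1^a)^(-1) in Zp* (modular inverse needs Python 3.8+)."""
    return y2 * pow(pow(y1, a, p), -1, p) % p


p, alpha, a = 11, 2, 8
beta = pow(alpha, a, p)
cipher = elgamal_encrypt(7, 4, alpha, beta, p)
print(beta, cipher, elgamal_decrypt(*cipher, a, p))
```

Re-encrypting the same message with a different k gives a different ciphertext, which is the property discussed under probabilistic encryption.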

Probabilistic Encryption
For RSA one message goes to one cipher word. This means an attacker who guesses M might gain information by running Epublic(M) and comparing the result with C.
Probabilistic encryption maps every M to many C randomly.
Cryptanalysts can’t tell whether C = Epublic(M).
ElGamal is an example (based on the random k), but it doubles the size of the message.

Digital Signatures
We focus on electronic signatures that use public-key cryptography. E.g. (based on RSA):
A key generation algorithm: same as in RSA encryption.
A signing algorithm: same as decryption of M ∈ ZN* by C = D(M) = M^d mod N.
A verification algorithm: same as encryption of C ∈ ZN* by M = E(C) = C^e mod N. Can be calculated and verified by anyone.
Concept of Blind Signatures …

Secret Sharing
Based on the following problem: assuming that there are N players, how can a dealer share a secret in such a way that any group of t (< N) or more players can recreate the secret, but any group of fewer than t players cannot?
Such schemes are called (t,N)-threshold secret sharing schemes.

Shamir Secret Sharing Scheme
The dealer selects t-1 random integers as coefficients, which together with the secret S form a polynomial f(x) of degree t-1 such that f(0) = S.
The dealer calculates f(i) for each player i. Those are their private shares.
Any group of t or more players can recreate the polynomial and S (using Lagrange interpolation).
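
A minimal sketch of the scheme follows. One assumption is made beyond the slide: all arithmetic is done modulo a fixed prime (here the Mersenne prime 2^61 - 1), which is the usual way to make the shares leak nothing about S. The names `make_shares` and `recover` are illustrative.

```python
import random

PRIME = 2**61 - 1  # field modulus (an assumption of this sketch)


def make_shares(secret, t, n, rng):
    """Deal n shares (i, f(i)) of a random degree t-1 polynomial with f(0) = secret."""
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(t - 1)]

    def f(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            acc = (acc * x + c) % PRIME
        return acc

    return [(i, f(i)) for i in range(1, n + 1)]


def recover(shares):
    """Lagrange interpolation at x = 0 from t (or more) distinct shares."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = make_shares(1234, t=3, n=5, rng=random.Random(0))
print(recover(shares[:3]), recover(shares[2:]))
```

For real use the coefficients would come from a cryptographically secure source such as the `secrets` module rather than `random.Random`.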

Threshold Encryption
In threshold encryption we have N authorities, and we want to encrypt a message in such a way that any t or more authorities can decrypt it. Again, any group of fewer than t authorities will not be able to do so.
No trusted dealer.
Solutions are similar to Shamir’s scheme [CGS, Pedersen].

Zero-knowledge Proofs
Interactive protocols between two players, Prover and Verifier, in which the prover proves to the verifier, with high probability, that some statement is true.
The protocol does not leak any information besides the veracity of this statement.
In the case of honest-verifier ZKP, the protocol can be made non-interactive.

Zero-knowledge Proof Example
Let g1, g2 be generators of Zq*. The Prover claims that log_g1 v = log_g2 w (= x) for publicly known v, w, g1, g2.
P chooses random z ∈ [1..q] and sends a = g1^z, b = g2^z.
V selects random c ∈ [1..q] and sends it.
P sends r = (z + cx).
V verifies that g1^r = a v^c and g2^r = b w^c.
Can be turned into a non-interactive proof by setting c = Hash(a,b,v,w).
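
One round of this protocol can be simulated as below. The sketch makes an assumption that differs slightly from the slide's wording: g1 and g2 generate the subgroup of prime order q = 11 inside Z23*, and the response r is reduced mod q. All concrete numbers and names are illustrative.

```python
import random

P_MOD, Q_ORD = 23, 11  # toy group: subgroup of order 11 in Z_23^*
G1, G2 = 2, 3          # both have order 11 modulo 23


def run_round(x, v, w, rng):
    """Simulate prover (who knows x) and verifier; return the verifier's verdict."""
    z = rng.randrange(1, Q_ORD)              # prover's random nonce
    a, b = pow(G1, z, P_MOD), pow(G2, z, P_MOD)
    c = rng.randrange(1, Q_ORD)              # verifier's challenge
    r = (z + c * x) % Q_ORD                  # prover's response
    ok1 = pow(G1, r, P_MOD) == a * pow(v, c, P_MOD) % P_MOD
    ok2 = pow(G2, r, P_MOD) == b * pow(w, c, P_MOD) % P_MOD
    return ok1 and ok2


x = 7
v, w = pow(G1, x, P_MOD), pow(G2, x, P_MOD)
rng = random.Random(0)
print(run_round(x, v, w, rng))                     # honest statement
print(run_round(x, v, pow(G2, x + 1, P_MOD), rng)) # logs differ, check fails
```

The verifier learns that the two logarithms are equal but sees only a, b, c and r, none of which reveals x on its own.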

The Woo-Lam Authentication Protocol
Alice tries to prove her identity to Bob, but she does not share a key with Bob, only with Trent. The protocol goes as follows:
Step 1: Alice declares her identity
Step 2: Bob provides a nonce challenge
Step 3: Alice returns the challenge encrypted with KAT
Step 4: Bob passes this encrypted information to Trent for translation
Step 5: Trent translates the nonce and returns it to Bob; then Bob verifies the nonce

A weakness …
There is a protocol failure in Woo-Lam that comes from the fact that the connection between Bob-to-Trent’s message and Trent-to-Bob’s message is not strong enough. The only “connection” comes from the fact that message 4 and message 5 happen shortly one after another.
This weak association can be used in an attack where Eve impersonates Alice: Eve tries to authenticate herself to Bob (or Bob’s computer) at about the same time as Alice. Trent will respond to each at roughly the same time. Eve intercepts both responses, and swaps them.
Let us see how in a step-by-step description

Details of the impersonation attack
Step 1: Eve, acting as both herself and Alice, attempts to authenticate herself to Bob as both herself and Alice.
Step 2: Bob, as he should, replies with two nonce challenges. Eve gets her nonce but, at the same time, intercepts the nonce directed to Alice.
Step 3: Eve answers both challenges. Eve, naturally, can only send a wrong reply on behalf of Alice. She can, however, swap her response with Alice’s before contacting Bob.
Step 4: Bob receives both responses and contacts Trent for translation.
Step 5: Trent responds. One response consists, as expected, of garbage. The other response, for Alice, is of course correct. Bob correctly gets back the challenge he issued for Alice and then authenticates Eve as Alice!

A way round this problem
The problem was (again) that the last message was not tied to the identity of the party it corresponded to. One simple fix is to make message 5 include Alice’s identity, so that Trent tells Bob whom the response corresponds to. Then Bob will be able to tell that message 5’ does not correspond to Eve’s nonce!
One remaining problem is that Trent does not know which host Alice is trying to log onto. Eve might get Alice to log onto Eve’s computer. Then Eve can start a logon in Alice’s name to Bob’s machine, and get Alice to answer Bob’s challenges to Eve…

The Needham-Schroeder Key Exchange Protocol
Step 1: Alice tells Trent what she is requesting
Step 2: Trent gives Alice the session key and a package to deliver to Bob
Step 3: Bob can get the session key, and the identity of who he is talking with (verified because it came from Trent)
Step 4: Bob sends Alice a challenge
Step 5: Alice answers the challenge

An attack on Needham-Schroeder
In 1981, Denning and Sacco showed that if the session key is compromised, then Eve can make Bob think that he is communicating with Alice.
Assume the NS protocol took place, and that Eve has recorded the first 3 steps. Also, assume that Eve has obtained the session key. The following steps subvert NS:
Step 1: Eve replays step 3 from NS as if she were Alice.
Step 2: Bob gets this message and issues a challenge to Alice in the form of a new nonce. This challenge is intercepted by Eve.
Step 3: Since Eve knows the session key, she can respond correctly to the challenge.
The basic problem: messages can be replayed once the session key is compromised!

The moral?
Security is notoriously subtle!

Even more so, when we leave mathematics and go to complex hardware/software systems!
We will look into how theory and practice meet using two working systems: e-Lotteries! e-Voting!

A protocol for the support of large-scale national lotteries
A real nationwide electronic lottery
Frequent drawings per day
Strict drawing times
Large number of expected players
Preclusion of any participation in the number generation and winner identification processes

Special System Characteristics
Cryptographic robustness
Protection against various (premature & future) manipulations
Extensive real-time auditing facilities
Performance (time constraint) requirements
Incorporation of security mechanisms
System with high availability

An overview of the system
[Figure: system overview. Gen1, Gen2 and the Verifier are connected in a high availability configuration; the Lottery Organization Computer is linked to the Agencies over telephone lines and holds the Coupon File; a Data to Optical Signal Converter feeds the TV Station over optical fibre; audit information is recorded at each component.]

Operational Requirements
Uniformly distributed numbers
Unpredictable results
Prevention of internal/external interference with the drawing mechanism and with the choice of winners
Constant monitoring towards early detection of interference attempts

Security & Safety Requirements
Confidentiality: no leaks of information
  Encryption methods
  Secure random number sources
Integrity: authentication request for any step
  Use of Hash and MAC functions
State Stamping: detection of any past or future modification (e.g. of the coupon file)
  Mainly through cryptographic tools (e.g. Hash functions)

Security & Safety Requirements
Availability: service all the authorized requests
  Component and data path replication
Accountability: detection of any unauthorized access to or modification of the system
  Authentication schemes are necessary
  Use of mechanisms for signing and commitment

Design considerations
Randomness Sources
Seed Commitment & number reproduction
State Stamping
Seed processing
Signing & Authenticating

Design Considerations Randomness Sources
Common pseudorandom number generators (e.g. as given by Java)
  Advantages: uniformly distributed numbers
  Disadvantages: the algorithm is susceptible to clever attacks
Cryptographically secure PRNGs
  Advantages: handle the disadvantage above; guessing is intractable
  Disadvantages: based on deterministic algorithms, so in principle the output could be guessed, given the initial state
Truly random number generators
  Advantages: non-deterministic method, truly random output
  Disadvantages: physical processes often obey specific distribution laws; they depend on environmental parameters (e.g. temperature); it is hard to reproduce their output

Design Considerations Seed Commitment & Reproduction of received numbers
Elimination of any modification of the seeds from the time they are produced until the time they are used.
A bit-commitment protocol certifies integrity and accountability on the connection between the Generator and the Verifier.
The Verifier reproduces the numbers, with additional information from the Generator, for a final check.

Design Considerations State Stamping
Prevention of post-betting
Elimination of any coupon file modification
Fingerprint (hash value, e.g. rmd160) of the coupon file
Check whether the hash function has the same value before and after the draw.
If the check fails, the protocol should terminate immediately and report the modification with the highest priority.
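
The before/after check can be sketched with Python's `hashlib`. Note the substitution: the slide names rmd160 (RIPEMD-160), but SHA-256 is used here because it is always available in `hashlib`; the function names and the sample coupon data are illustrative assumptions.

```python
import hashlib
import io


def fingerprint(stream, chunk_size=65536):
    """Hex digest of a binary stream (SHA-256 stands in for rmd160)."""
    h = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


def unchanged(stream, stamped_digest):
    """True if the stream still matches the digest taken before the draw."""
    return fingerprint(stream) == stamped_digest


# Usage: stamp the coupon file before the draw, re-check afterwards.
coupons = b"coupon 1: 3 14 15\ncoupon 2: 9 26 5\n"
stamp = fingerprint(io.BytesIO(coupons))
print(unchanged(io.BytesIO(coupons), stamp))
print(unchanged(io.BytesIO(coupons + b"coupon 3: 1 2 3\n"), stamp))
```

A real coupon file would be opened with `open(path, "rb")` and passed in the same way.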

Design Considerations Seed Processing
[Figure: seed processing. Seed1 is produced from the physical generator. The hash value of the coupon file is fed to the Naor-Reingold pseudorandom function (inputs labelled Input(1) and Input(2)). Labels: Seed2, Final Seed.]
The NR function is initially seeded with a strong random key
Seed2 does not depend on the (online drawn) physical bits

Seed Processing Naor-Reingold function
The NR function key is a tuple <P,Q,g,a>, where
P is a large prime (1000 bits)
Q is a large prime divisor of P-1 (200 bits)
g is an element of order Q in ZP*
a = <a0,a1,…,an> is a uniformly distributed sequence of n+1 elements of ZQ
For every input x of n bits, x = x1…xn, the NR function is: f_a(x) = g^(a0 · ∏_{i: xi = 1} ai) mod P
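
The Naor-Reingold function is short to write down once the key is fixed. The sketch below uses the standard definition f_a(x) = g^(a0 · ∏ ai over the set bits of x) mod P with toy parameters (P = 23, Q = 11, g = 2), which are illustrative assumptions far below the 1000-bit and 200-bit sizes on the slide.

```python
def naor_reingold(x_bits, P, Q, g, a):
    """Evaluate the NR function on the bit list x_bits with key <P, Q, g, a>."""
    if len(a) != len(x_bits) + 1:
        raise ValueError("key needs n + 1 elements for an n-bit input")
    exponent = a[0]
    for bit, a_i in zip(x_bits, a[1:]):
        if bit:
            # g has order Q, so exponents can be reduced modulo Q.
            exponent = exponent * a_i % Q
    return pow(g, exponent, P)


# Toy key: 2 has order 11 in Z_23^*.
P, Q, g = 23, 11, 2
a = [3, 5, 7, 2]
print(naor_reingold([1, 0, 1], P, Q, g, a))
print(naor_reingold([0, 0, 0], P, Q, g, a))
```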

Design Considerations Signing and Authenticating
To boost confidentiality and accountability: after the numbers are generated, the numbers and seeds pass through the encryption scheme and the signing process before being sent to the Verifier.

A high-level description of the protocol
[Flowchart: protocol between GEN1 and the VERIFIER]
Setup: exchange keys for encryption and a private/public key pair for signatures
GEN1: idle until the drawing initiation signal; take random bits from the TRNG (Seed1) and the hash value of the coupon file; apply the NR function (Seed2) and XOR; produce the bit-commitment and signature; generate the numbers from the PRNG; encrypt and sign the seeds and numbers
VERIFIER: verify and decrypt the seeds and numbers; verify that Gen1 committed on the true seeds; regenerate the numbers from the retrieved seeds; check the numbers
Outcome: SUCCESS! or System Failed

Time Table
6 min before the draw time: the Verifier sends the draw initiation signal to GEN1
3 min later: if the Verifier hasn’t received the numbers, it sends an initiation signal to GEN2
GEN2 produces the numbers in 3 minutes, on time, with the same process as GEN1

Software random number generators
2 algebraic generators:
  BBS (proposed by Blum, Blum and Shub), one of the most frequently used cryptographically strong PRNGs
  RSA/Rabin generator, based on the RSA function
2 block cipher based generators: DES and AES

Physical random number generators
We combine three physical generators with XOR:
  A generator based on the phase differences of the two motherboard clocks (the VonNeumannBytes function)
  ZRANDOM hardware generator
  SG100 hardware generator

Output Processing Outputs combined with two shuffling algorithms:
Algorithm M (proposed by MacLaren and Marsaglia): takes two input sequences Xn and Yn, and shuffles the sequence Xn using elements of the sequence Yn as indexes into the sequence Xn
Algorithm B (proposed by Bays and Durham): similar to M, but with one input sequence; the output is a shuffled instance of the input
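
Algorithm M can be sketched as a generator that keeps a small table of X values and lets each Y value pick which table slot to emit and refill. The table size k, the bound m on the Y values and the function name are parameters assumed for this illustration; the lottery system's actual settings are not given on the slide.

```python
def algorithm_m(xs, ys, k, m):
    """Shuffle the sequence xs using ys (values in 0..m-1) as table indexes."""
    xs, ys = iter(xs), iter(ys)
    table = [next(xs) for _ in range(k)]  # fill the table with the first k X values
    for x, y in zip(xs, ys):
        j = (k * y) // m                  # map y into a table index 0..k-1
        yield table[j]                    # output the selected entry
        table[j] = x                      # and replace it with the next X value


out = list(algorithm_m(range(10), [0, 3, 1, 2, 0, 3], k=4, m=4))
print(out)
```

Every output element comes from the X sequence; only the order depends on Y, which is what makes this a shuffle rather than a new generator.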

Output Processing Combine the output with XOR operation
The four generators are combined with bit-wise XOR The protocol moves periodically to different combinations of the generators

Output Testing Statistical tests are applied (Diehard Battery of tests) on: The produced random numbers The hardware random number generators On line tests

Considerations Many factors should be considered for a robust protocol designed to support an electronic lottery The generation of sequences that are exceptionally difficult to guess The measures against many possible attacks on the generation and on the entire system operation Business management process

The Issue of Trust
Trust plays a major role in the way people view and use information systems. Trust should be the first priority for eGovernment applications. Trust is of great importance for the success of eVoting.

Our Goal
Propose and apply a “trust preserving” approach for handling the increasingly difficult complexity issues of building eVoting systems and, in general, trust-critical eGovernment applications. Design and implement a secure and efficient eVoting platform with a focus on trust establishment.

Our approach
Decomposing eVoting into layers that contain basic trust components facilitates the management of trust in each component. A concrete notion of trust components should be taken into consideration by designers of security-critical applications in general.

Pragmatic Trust Pragmatic approach to security critical applications should be based on layering. The layered approach to trust reflects the “trust engineering” phases by combining technology, policy and public awareness issues.

The trust-centered approach

Layers of the architecture
Scientific Soundness: Crypto-based justification of all components (e.g. cryptographically secure random number generators, homomorphic functions)

Implementation Soundness: Formal methodology for the verification of the implementation (applied periodically)

Internal Operational Soundness: High availability and fault tolerance (self-auditing, self-checking, self-recovery from malfunction)

Externally Visible Operational Soundness: Impossible for someone to interfere with the system from the outside (quickly detectable)

Convincing the Public: Crucial for the success of the eVoting system (details available to the public, organize campaigns etc)

Scientific Soundness: Crypto-based justification of all components (e.g. cryptographically secure random number generators, homomorphic functions)

Some basic requirements for a general e-Voting scheme
Privacy: only the final result is made public; no additional information about the votes leaks.
Robustness: the result correctly reflects all submitted and well-formed ballots, even if some voters and/or some of the entities running the election cheat.
Universal verifiability: after the election, the result can be verified by anyone.

How to meet these requirements?
We obviously need cryptographic techniques, but tamper-resistant devices as well. We also need appropriate protocols and mechanisms to meet these requirements, which we will be discussing: digital signatures to identify voters, data correctness and integrity proofs, etc.

Mixnets
A mechanism for destroying the relationship between a voter and his vote through the application of consecutive vote permutations.
Permutations without fixed points: derangements.
Random walks in permutation groups: how many steps until the uniform distribution appears (random walk mixing time)?
Votes are fully decrypted in the last step, but by then their link to the voters has disappeared.
Parallelizing the process efficiently seems hard: we conjecture that the following problem is P-complete (reduction from CVP): given n inputs in some particular order, is input i led to output j after the application of all the permutation stages of the Mixnet?

Homomorphic functions
Another mechanism for destroying the relationship between a voter and his vote, based on homomorphic functions (e.g. ElGamal encryption). It relies on the computational difficulty of inverting these functions. Votes are never decrypted, but they are added, homomorphically, in their encrypted form! The vote outcome is in encrypted form too and needs to be decrypted (this is not hard, since the number of voters is usually small and a brute-force inversion suffices; Pollard rho, baby-step giant-step etc. can also be used). Efficient parallelization:

Registering voters
It is not imperative that we have an independent X.509 PKI system in place (if a PKI is available, that's fine!), but we will assume that an existing registration scheme is in place. Thus, we can simply send something out to a voter by mail, like a PIN-mailer, which he may use for electronic registration. At that stage a public key pair is generated for his use, and the private key is stored securely in a central server, all using HSMs: the private key never leaves the HSM-controlled environment.

This registration could take place
at home, from the voter's own workstation, or at a polling station, where he presents a fairly traditional voting card received in the mail for proper identification and counting, and uses an additional small slip with a PIN or similar to vote, as in the vote-from-home scenario, using the PIN for identification.

Counting the votes P(m1,r1)+P(m2,r2) =P(m1+m2,R)
Leaving aside the issues of anonymity etc., adding up votes electronically could be virtually instant. In order to meet some or all of our requirements, the following property would be extremely useful: given any two votes m1 and m2 and their encryptions P(m1), P(m2), assume P(m1) + P(m2) = P(m1 + m2). Even better, if we can “randomise”, to anonymise, using an individual random number ri for each vote, and we have the property P(m1,r1) + P(m2,r2) = P(m1+m2,R) for some number R (actually, R = r1 + r2), then

Counting by exploiting the homomorphism property
we call P(.,.) a homomorphic public key if, for any set of votes, there always exists some R (which will vary with the votes) with ∑P(xi,ri) = P(∑xi,R). Now we have it (assuming that such a function exists, of course!):
the voter casts the electronic vote x;
the application chooses a random number r, calculates P(x,r), signs it and forwards SA(P(x,r));
the authenticating server verifies the signature and forwards P(x,r) for counting;
the counting server calculates ∑P(xi,ri) = P(∑xi,R) and decrypts to recover ∑xi, while R is discarded;
the result is available less than 1 minute after the closing of the polling stations.

CGS97 - The Protocol

Initialization: all authorities publish their shares; a threshold public key S; another generator h of the multiplicative group. The legal votes will be h^{-1} and h^{1}.
Voting: a voter encrypts his vote b_i using E(h^{b_i}, S; r) and publishes it on a public board, along with a non-interactive proof of validity of the vote.
Verification: all voters' non-interactive proofs are verified (publicly) and invalid votes are deleted.

Tallying: after the election ends, t authorities calculate E(h^{total}, S; r_total) = ∏_i E(h^{b_i}, S; r_i) and publicly decrypt it to get h^{total}. Now anyone can find total (using a linear-time exhaustive search), which is the difference between the numbers of votes for the two candidates. These calculations can also be verified using a non-interactive zero-knowledge proof of equality of discrete logarithms.
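The homomorphic tally can be illustrated with a toy ElGamal instance in Python. This is only a sketch under simplifying assumptions: a single secret key `s` stands in for the threshold-shared key, the validity proofs are omitted, and the group parameters `P`, `Q`, `G`, `H` are tiny hypothetical values chosen for readability (in particular, the relation between the two generators is not hidden here, as it would have to be in a real election).

```python
import secrets

P = 467   # toy safe prime, P = 2*Q + 1 (far too small for real use)
Q = 233   # prime order of the subgroup of squares modulo P
G = 4     # generator of the order-Q subgroup
H = 9     # second generator, used to encode the votes

def keygen():
    """Return (secret key s, public key S = G^s)."""
    s = secrets.randbelow(Q - 1) + 1
    return s, pow(G, s, P)

def encrypt(b, S):
    """Encrypt a vote b in {-1, +1} as E(h^b, S; r) = (g^r, h^b * S^r)."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (pow(H, b % Q, P) * pow(S, r, P)) % P

def combine(ciphertexts):
    """Multiply ciphertexts component-wise; the plaintexts h^b multiply too."""
    a, c = 1, 1
    for x, y in ciphertexts:
        a, c = (a * x) % P, (c * y) % P
    return a, c

def decrypt(ct, s):
    """Recover h^total from the combined ciphertext."""
    a, c = ct
    return (c * pow(a, Q - s, P)) % P      # a^(Q-s) equals a^(-s) in the subgroup

def recover_total(h_total, n_voters):
    """Linear-time exhaustive search for total in [-n_voters, n_voters]."""
    for t in range(-n_voters, n_voters + 1):
        if pow(H, t % Q, P) == h_total:
            return t
    raise ValueError("total not found in the expected range")

votes = [1, 1, -1, 1, -1, 1, 1]
s, S = keygen()
tally = combine(encrypt(b, S) for b in votes)
print(recover_total(decrypt(tally, s), len(votes)))
```

No individual ciphertext is ever decrypted; only the product is.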

More on Scientific Soundness: Randomness
Cryptographically strong pseudorandom generators: 1. Generators based on number-theoretic problems (BBS, RSA/Rabin, Discrete Log). 2. Generators employing symmetric (block) ciphers or secure hash functions (DES, AES, SHA, MD5). In order to confuse cryptanalysts, the generation process can periodically use a different combination of algorithms, together with shuffling algorithms (Algorithms M and B) and the XOR operation.

More on Scientific Soundness: Randomness
Physical random number generators: 1. The seed of any software random number generator must be drawn from a source of true randomness. 2. Combine more than one such generator (for example with XOR) to avoid problems if some of the generators fail. 3. Use a pseudorandom function (Naor-Reingold) for processing the combination of the seeds.

Implementation Soundness: Formal methodology for the verification of the implementation (applied periodically)

Implementation Soundness
The theoretically established cryptographic security disappears if even a simple error occurs in the implementation code. Testing the implementation is a crucial step in building a secure and trustworthy eVoting system.

Implementation Soundness
There are a number of verification methodologies and tools that can be applied, based on various statistical tests.

Risk Analysis and Management (2/11)
The CORAS Methodology
A methodology for security risk analysis.
A customised language for threat and risk modelling (UML based), plus extended documentation (diagrams, tables).
Provides detailed guidelines for five steps: Context Identification, Risk Identification, Risk Analysis, Risk Evaluation, Risk Treatment.
Proposes different tools and techniques for each step, plus a software tool to integrate the tools and document the results.

Basic steps of CORAS
Context Identification: application scenario, assets, data flows; UML modeling language.
Risk Identification: identification of threats; Threat Diagrams; HazOp Analysis; Fault Tree Analysis.
Risk Analysis: specification of Likelihood, Consequence and Risk levels; assessment of risks (Likelihood of occurrence and Consequence), qualitative or quantitative (through Fault Tree Analysis).
Risk Evaluation: risk categorization matrix.
Risk Treatment: countermeasures for critical risks.

Risk Analysis and Management (3/11)
Step 1: Context Identification (figures: Abstract Class Diagram, Use Case Diagram, Activity Diagram)

Step 1 (continues) Example of Time Sequence Diagram (Decryption and Calculation of Result)

Risk Analysis and Management (5/11)
Step 2: Risk Identification. Part of the high-level risk table:
Who/what causes it? | How? What is the incident? What does it harm? | What makes it possible?
Keyholders | Disclosure of secret keys | Corrupted keyholders (software)
Voter | Disclosure of credentials (id, password, certificate) to another person | Malicious voter
EA | Vote alteration | Corrupted EA
EA | Vote disclosure |
EA | Tallying error | Software error
EA | Result alteration |
Coercer | Voter coercing | Lack of monitoring during remote vote casting
Hacker | Final result alteration | Insufficient security

Step 2 (continues): part of the HazOp table. Asset: Keys Ki (step 1)
Guideword | Threats | Likelihood | Consequence | Countermeasures
Manipulation | Alteration of key generator operation by an authorized person | Small | Keys are not secret or are not random | Testing of the key generator before elections. Restricted access to software
Disclosure | Disclosure of some Ki by their holders | Medium | Corruption in elections is possible | Key sharing (k out of k): in order for the overall key to be disclosed, all keyholders need to disclose their keys
Programming errors | Errors in generator software | | The keys are not randomly generated (fake randomness). The keys do not satisfy the requirements (e.g. length) | Application of good programming practices. Extensive testing and debugging. Use of secure random number generators

Risk Analysis and Management (7/11)
Step 2 (continues): Fault Tree Diagram (ITEM Toolkit)

Risk Analysis and Management (8/11)
Step 3: Risk Analysis. Assessment of the likelihood of occurrence of unwanted incidents.
Event | Description | Likelihood
Disclosure by Voter
1 | Disclosure of vote by voter | 0.05
2 | Voter software error | 0.1
3 | Malicious software in voter's PC |
Stolen while in transit
4 | SSL failure |
Disclosure by Vote Manager
5 | Malicious Election Authority (vote manager) |
6 | Malicious software in Election Authority (vote manager) |
Calculation of threat occurrence likelihood:
Threat ID | Description | Events involved | Likelihood
1 | Disclosure of vote | 1-6 | 0.38 (Medium)

Risk Analysis and Management (9/11)
Step 3 (continues): qualitative assessment of Consequence using FMEA
ID | Function/Entity | Failure mode | Local effects | System-wide effects | Causes | Consequences
1 | GenerateElGamalParameters(size) | Size parameter is not available in the system config file | The public parameters may not be created | System initialization is not possible | Config file is not properly updated by the system administrator. Access to config file/database is not possible | Voting process may not begin
2 | Publish(elGamalParameters) | Bulletin Board is not updated with the public parameters | Keyholders may not produce keys | | Connection to database is not possible |

Risk Analysis and Management (10/11)
Step 4: Risk Assessment. Risk Categorization Matrix
Likelihood values (columns): Rare, Unlikely, Possible, Likely, Certain
Consequence value | Risk IDs (cell groups in left-to-right order; empty cells omitted)
Insignificant |
Minor | 4, 10, 12, 30, 31 / 29, 32, 34, 35, 36, 39, 40 / 14
Moderate | 3 / 8, 22
Major | 1, 9, 21, 23, 26, 27 / 7, 17, 20, 24, 25, 28, 33, 37 / 13
Catastrophic | 2, 5, 11, 47 / 6, 15, 16, 18, 19, 41, 43, 44, 45, 46 / 38, 48, 49 / 42

Step 5: Risk Treatment (taken into account in the design/implementation phases)
Risk ID | Description | Risk level | Treatment options and measures
Risks with regard to partial key disclosure or non-availability
2 | Disclosure of some of the Ki by their keyholders | Extreme | The disclosure of partial keys would be catastrophic, as it would allow the decryption of individual votes and the final result by unauthorized parties (or even the EA). Threshold cryptography techniques are used as a countermeasure. Such techniques require at least t out of n keyholders to cooperate for the conduct of the elections. Moreover, conflicting interests of the keyholders discourage potential alliances among them. For ultimate security, we suggest that t = n, which means that all keyholders need to cooperate.
5 | Some of the Ki are not available | |

Layers of the Architecture
Internal Operational Soundness: High availability and fault tolerance (self-auditing, self-checking, self-recovery from malfunction)

Internal Operation Soundness
One of the most important issues in an eVoting application is the ability to self-check its internal operation and give warnings when needed. Self-checking reduces human intervention and increases the responsibility of the system in case of a non-normal operation. Self-checking approaches include: Intrusion Detection Systems, hardware-based software bootloaders for secure start-up (embedded systems)

Internal Operation Soundness
All the internal activity of the system must be supervised by authorized personnel. A personnel security plan must be deployed so that every person in the eVoting is responsible for a different action. The computer room where the servers are kept must be isolated: 1. Biometric access control system is needed. 2. The access control system must use cameras and movement detectors.

Layers of the Architecture
Externally Visible Operational Soundness: Impossible for someone to interfere with the system from the outside (quickly detectable)

Externally Visible Operational Soundness
It should be possible to detect erratic behavior or to ascertain that everything is as expected: detect frequent eVoting system failures and attacks as fast as possible. Possible failures and attacks: failure of a random number generator; system database damage; forging votes; “bogus” voting servers.

Operational physical security: system operators' actions should be subject to monitoring and logging; visual monitoring of the system and strict access control; a strict maintenance process is needed for modifications of any part of the system. Forging votes: not possible, since no double or non-authenticated votes are accepted by the system.

“Bogus” servers: the system should be protected from intrusions; a third party is needed to operate as a firewall between the servers and the vote database. The third party (central Election Authority): 1. Is responsible for monitoring the operation of the voting servers. 2. Re-tallies to make sure that local EAs have valid local tallies. 3. Analyzes IDS information.

Convincing the Public: Crucial for the success of the eVoting system (details available to the public, organize campaigns etc)

“Reassure the public that all measures have been taken in order to produce an error-free, secure and useful application.” Such measures include: 1. Trust by increasing awareness (educate the public about security and data protection issues in non technical terms). 2. Trust by continual evaluation and accreditation (continual evaluation and certification of system’s operation, results of the evaluation publicly available). 3. Trust by independence of evaluators (the system must be verified by experts outside the organization). 4. Trust by open challenges (call for hackers).

Convincing the public 5. Trust by extensive logging and auditing of system activities (logging and auditing activities are scheduled on daily basis, results available for public scrutiny). 6. Trust by contingency planning (failures in system that offer e-services are not acceptable, contingency plan publicly available). 7. Trust by regulation and laws (system operator introduces suitable legislation for the protection of the public in case of mishaps). 8. Trust by reputation and past experience (the involvement of engineers and experts should be accompanied by credentials that prove their expertise).

System and implementation related aspects
Bouncy Castle Java crypto library; OpenCA; OpenVPN; Apache Tomcat; SSL; NTP for obtaining time; PostgreSQL; HELENA IDS; hardware RNGs for seeding; ATMEL's ATmega8 microcontroller for secure bootstrapping of parameters and startup code.

Application server: Apache Tomcat
Application tier of the Election Authorities (EAs). Execution of Java servlets (servlet container). Responsible for:
The presentation of the web interfaces to voters who connect to the EA.
The recognition of the web page for which a request for an http (or https) connection was made by a voter's web browser (supported web browsers include Internet Explorer, Mozilla Firefox, Netscape Navigator, Opera, and Safari).
The identification and activation of the requested page, including the activation of all Java scripts linked to it (Tomcat has an internal compiler that transforms Java Server Pages into Java servlets, whose output is suitable for presentation by a voter's web browser).
The execution of the requests contained in the servlets (e.g. PostgreSQL requests).
The implementation of the secure https connections through the activation of the SSL module (mod_ssl).
The activation of load balancing support (JK native connector).

Intrusion Detection System:
HELENA
Developed by RACTI. Constantly gathers and analyzes incoming and outgoing traffic of a target network (in our case, the network with the central EAs). Components: local computer agent; master console agent; “not-used” request database. Threshold values and updates: the target network is modeled as a directed graph of connections (vertices: computers + ports; edges: connection requests).

Voter authentication:
OpenCA
Used for the identification of legal voters. Installed to operate with Linux Ubuntu 6.10 (Edgy Eft). Implements a Certification Authority and a Registration Authority (CA and RA); the CA and RA operate on the same server and use a PostgreSQL database. The voter submits a request for a certificate; if the voter is entitled to vote, the certificate is issued and the voter installs it in the web browser. The voter is then allowed to access the local EA. Apache Tomcat receives and validates the certificates using SSL-based authentication protocols.

Ensuring privacy in the network:
OpenVPN
Installed at the Central EAs using the client-server model. The VPN server has a static IP address and is accessible from the Internet. If the VPN server is behind NAT (Network Address Translation), the NAT router should be configured to route traffic directed to the OpenVPN connection port (default 1194 UDP) to the VPN server. After the installation of OpenVPN, certificates are constructed that allow clients (i.e. Local EAs) to request VPN connections. After installing their certificates, the clients can request and establish secure VPN connections to the VPN server.

High availability and fault tolerance: mon, heartbeat, and coda (1/2)
The "mon", "heartbeat", and "coda" tools from Linux Virtual Server. Mon monitors the state of the servers and the network, heartbeat sends frequent signals to signify the availability of the servers, and coda implements a fault-tolerant distributed file storage system (in our case actually implemented by Slony-I, see below). There is also fake, an IP take-over module that employs ARP spoofing.

High availability and fault tolerance: mon, heartbeat, and coda (1/2)

Database replication:
Slony-I (1/2) An asynchronous data replication platform (with periodic updates) for PostgreSQL that supports cascading and failover. It creates a cluster of local databases (in our case, the local databases of votes in each Local EA and in the Central EAs) It creates mirrors, at a master database, of databases kept at slave databases

Database replication:
Slony-I (2/2)

Heartbeat and Slony-I:
An architecture for high availability and fault tolerance

Secure EA bootstrapping: MCUs with protected memory
Secure storage of keys, voting parameters and bootstrapping code. Secure code execution and authentication of external applications. A low-cost and easy-to-develop solution (as opposed to TPM-based ones) that easily fits legacy hardware and software. New versions of code and new keys can be dispatched over any insecure communication means in encrypted form; decryption takes place within the MCU.

Performance aspects/ System simulation
Network architecture: Directed Acyclic Graph (DAG).
Traffic: open Jackson network of M/M/1 queues (Poisson arrivals, exponentially distributed service times, one server, unlimited queue size).
Voters' arrival behavior: Weibull distributed, with a peak around noon.
Simulation tool: uses the CSIM 19 (C and C++) simulation library.
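For a single M/M/1 queue of the kind used in this traffic model, the standard steady-state quantities follow directly from the arrival rate λ and the service rate μ. The Python sketch below is only an illustration of those textbook formulas; the function name and the example rates are assumptions, not values from the simulation.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state metrics of one M/M/1 queue (requires arrival_rate < service_rate)."""
    if arrival_rate >= service_rate:
        raise ValueError("the queue is unstable unless arrival_rate < service_rate")
    rho = arrival_rate / service_rate                  # server utilization
    return {
        "utilization": rho,
        "mean_in_system": rho / (1.0 - rho),           # mean number of jobs in the system
        "mean_time_in_system": 1.0 / (service_rate - arrival_rate),
    }

# Hypothetical rates: 2 votes/s arriving at a server that handles 4 votes/s.
print(mm1_metrics(2.0, 4.0))
```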

Performance aspects/ System simulation
Shifted Weibull distribution with parameters α = 2.5, b = 5 and t0 = 8
Time interval | s_i (incoming vote rate) | λ s_i
[8:00,10:00) | 0.11 | 5.67
[10:00,12:00) | 0.20 | 10.32
[12:00,14:00) | 0.13 | 6.70
[14:00,16:00) | 0.039 | 2
[16:00,18:00) | 0.005 | 0.26
[18:00,20:00) | 0.0005 | 0.026
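The density of a shifted Weibull distribution can be evaluated with a few lines of Python. The sketch assumes the usual parametrization (α as the shape, b as the scale in hours, t0 as the shift); the slide does not state how the tabulated s_i were derived from the distribution, so the code only evaluates the density and makes no claim to reproduce the table.

```python
import math

def shifted_weibull_pdf(t, alpha=2.5, b=5.0, t0=8.0):
    """Density of a Weibull distribution with shape alpha and scale b, shifted to start at t0."""
    if t <= t0:
        return 0.0
    z = (t - t0) / b
    return (alpha / b) * z ** (alpha - 1) * math.exp(-(z ** alpha))

# Density at two-hour steps over the polling day (t in hours).
for hour in range(8, 21, 2):
    print(hour, round(shifted_weibull_pdf(hour), 4))
```

With these parameters the density rises from zero at 8:00 and is highest around noon, in line with the arrival behavior described for the voters.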

Performance aspects/ System simulation

Summary
We have presented a general, trust-centered, layered approach towards trust building in eVoting and, generally, eGovernment applications.
This approach is based on a design process that incorporates risk analysis/management methodologies for security-critical systems (e.g. CORAS).
Large-scale simulation results were used to evaluate the architecture's efficiency as a function of the voter population size.
The platform was evaluated during a mock-up election for the members of the Western Greece sector of the Technical Chamber of Greece, which gave useful feedback that was incorporated in the current version of the eVoting platform.
Project site:

Elliptic Curve Cryptography
Based on groups which are defined on elliptic curves. Elliptic curve: defined over a prime field (Fp) or a binary field. EC over Fp (E(Fp)): the set of solutions (x,y) in Fp to y^2 = x^3 + ax + b, along with a special point denoted by O, called the point at infinity.

Example: y^2 = x^3 - 4x, solutions (x,y) in F23

Generation of a key pair (private-public)
Elliptic curve cryptosystems based on Fp: 1. Choose at random a private key d in {1, ..., m-1}. 2. Find a random point G on the EC. 3. Calculate the public key e = dG mod p.
Conventional cryptosystems based on Fp: 1. Choose at random a private key d in {1, ..., p-1}. 2. Find a generator g of the field. 3. Calculate the public key e = g^d mod p.
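The elliptic curve side of this key generation can be tried out on the small example curve y^2 = x^3 - 4x over F23. The Python sketch below is purely illustrative: the base point G = (5, 6), which has order 12 on this curve, and all function names are choices made for the example, and a curve this small offers no security.

```python
import secrets

P_FIELD = 23            # the prime p of the field F_p
A, B = -4, 0            # curve coefficients: y^2 = x^3 - 4x
G = (5, 6)              # example base point (order 12 on this curve)
ORDER_G = 12

def on_curve(pt):
    """True if pt is the point at infinity (None) or satisfies the curve equation."""
    if pt is None:
        return True
    x, y = pt
    return (y * y - (x ** 3 + A * x + B)) % P_FIELD == 0

def add(p1, p2):
    """Add two points with the chord-and-tangent rule; None is the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_FIELD == 0:
        return None                                    # p2 = -p1
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P_FIELD, -1, P_FIELD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_FIELD, -1, P_FIELD)
    lam %= P_FIELD
    x3 = (lam * lam - x1 - x2) % P_FIELD
    y3 = (lam * (x1 - x3) - y1) % P_FIELD
    return x3, y3

def scalar_mult(d, pt):
    """Compute d*pt by double-and-add."""
    result = None
    while d:
        if d & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        d >>= 1
    return result

def generate_keypair():
    """Private key d in {1, ..., ORDER_G - 1}; public key e = d*G."""
    d = secrets.randbelow(ORDER_G - 1) + 1
    return d, scalar_mult(d, G)

d, e = generate_keypair()
print("private key:", d, "public key:", e)
```

The modular inverse via `pow(x, -1, p)` needs Python 3.8 or later.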

EC Cryptosystems vs. Conventional Systems
Same level of security: N ~ M^{1/3} (ln(M ln 2))^{2/3}

Advantages of ECC
More efficient (smaller parameters); faster; less power and computational consumption; cheaper hardware (less silicon area, less storage memory).

Generation of secure ECs
Cryptographic strength requires a suitable order m: m = nq, where q is a prime > 2^160; m ≠ p; p^k ≢ 1 (mod m) for all 1 ≤ k ≤ 20. The above conditions guarantee resistance to all known attacks for solving the ECDLP.
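The three conditions on the order can be checked mechanically. In the Python sketch below, the function name and the keyword parameters `q_min` and `max_k` are assumptions introduced so that the check can be tried on toy numbers; the prime factor q is taken as given by the caller and its primality is not re-checked.

```python
def satisfies_order_conditions(p, m, q, q_min=1 << 160, max_k=20):
    """Check the conditions on the order m of a curve over F_p.

    q is assumed to be a prime supplied by the caller (not verified here).
    """
    if m % q != 0 or q <= q_min:
        return False                 # need m = n*q with the prime q > q_min
    if m == p:
        return False                 # the order must differ from p
    pk = 1
    for _ in range(max_k):
        pk = (pk * p) % m
        if pk == 1:
            return False             # p^k = 1 (mod m) for some k <= max_k
    return True

# Toy check: the example curve over F_23 has 24 points, and 23^2 = 1 (mod 24),
# so it fails the third condition even with the size bound relaxed.
print(satisfies_order_conditions(23, 24, 3, q_min=2))
```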

Generation of ECs
The goal is to determine the defining parameters of an EC y^2 = x^3 + ax + b: the order p of the finite field Fp; the order m of the elliptic curve; the coefficients a and b.

Generation of ECs-Known Methods
Constructive Weil descent: samples from a rather limited subset of ECs. Point counting: rather slow. The Complex Multiplication method: rather involved, but efficient for generating secure ECs.

The Complex Multiplication Method
1. Input: an integer D.
2. Choose a prime p = x^2 + Dy^2 and find the integers (x,y).
3. Possible orders: m = p + 1 ± 2x. Is one of them suitable? If NO, return to step 2; if YES, continue.
4. Calculate the Hilbert polynomial H_D(x).
5. Calculate the roots of the Hilbert polynomial.
6. From every root generate a pair of ECs.
7. Find the EC which has order m.

Shortcomings of the CM method
Time-consuming construction of Hilbert polynomials as D increases, because of the huge polynomial coefficients. Improvements are needed, especially for hardware devices where memory and speed are limited resources.

A practical approach A variant of the CM method
On-line computation (or precomputation) of Weber polynomials. The roots of these polynomials can be transformed into the roots of the corresponding Hilbert polynomials, but no Hilbert polynomial is actually constructed. But why use Weber polynomials?

Weber vs. Hilbert Polynomials
The construction of both types of polynomials requires high precision complex, floating point arithmetic. Drawback of Hilbert polynomials: their fast growing (with D) coefficients - time consuming construction and difficult to implement in limited resources devices. Weber polynomials on the other hand, have much smaller coefficients.

An Example (D = 292)
W_292(x) = x^4 - 5x^3 - 10x^2 - 5x + 1
H_292(x) = x^4 … x^3 - … x^2 + … x …

Implementation
Algorithms for the basic algebraic operations; generation of secure ECs; EC protocols. Implemented in ANSI C using the GNU Multiple Precision Library.

Implementation Considerations
Choice of prime fields: simplicity in number representation and in basic algebraic operations. GNU MP had to be enhanced to include: high-precision implementation of useful functions (factorization, primitive root location, etc.); high-precision complex number arithmetic; high-precision floating-point arithmetic for various functions, e.g. cos(x), sin(x), exp(x), ln(x), arctan(x) [Taylor series expansions, suitably truncated].

Architecture

Attacks on ECC
The security of ECC is based on the difficulty of solving the ECDLP (Elliptic Curve Discrete Logarithm Problem). ECDLP: find m for which Q = mP, where Q, P are two known points on the EC. An attack on ECC is an algorithm for solving the ECDLP (exponential time).

Signatures: from “syntax” to “semantics”
A bit-sequence may be looked upon from two different aspects:
its pattern (i.e. its “syntax”): this is simply the sequence of 0s and 1s;
its content (i.e. its “semantics”): the string may represent some other object (e.g. a Boolean formula, a graph, or an automaton under a suitable encoding).
We could use the knowledge of a property of the object represented by a bit-sequence in order to prove that we have created or own the sequence. If this knowledge is hard to come up with or to deduce, then: knowledge of the property of the object (bit-sequence) = proof of identity. The tools are already here: computational complexity and threshold phenomena!

The methodology
Find a class of objects and identify some property of theirs such that: it is hard to deduce or compute the property if it is not known in advance; it is easy to construct an object having the property. TOOL: combinatorial threshold phenomena.
Construct an “ownership proof” procedure with which you can prove knowledge of the property without divulging it. TOOL: Zero Knowledge Interactive Proofs (ZKIPs).
Use suitably produced objects, encoded as bit-sequences, as signatures!

The 3-coloring problem
We are given an undirected graph. We are asked to color the vertices of the graph using at most 3 colors so that no two adjacent vertices are assigned the same color.

The complexity of 3-coloring
The founders of modern complexity theory: Cook (1971), Karp (1972), and Levin (1973). SAT: the “drosophila” of computational complexity. 3-Coloring, like SAT, is computationally intractable (technically, NP-complete); thousands of other problems share this property! This means that if we are given a graph and asked to find a 3-coloring of its vertices, the number of steps required may be prohibitively large. Thus, 3-colorings of graphs are hard to find. IDEA: use bit-sequences that represent graphs, so that proof of ownership is equivalent to the ability to readily exhibit a 3-coloring of the graph.

The “hard”-instance region for 3-coloring
G: a graph with m edges and n vertices, with r the ratio m/n. Cheeseman, Kanefsky, and Taylor [1991]: for values of r around 2.3, randomly generated graphs with rn edges were either almost all 3-colorable or almost all non-3-colorable, depending on whether r < 2.3 or r > 2.3 respectively. Thus, we have a transition from almost certain 3-colorability to almost certain non-3-colorability. What is more, graphs with ratio r around the value r0 = 2.3 were the most difficult to handle, even for the best algorithms! This implies that one can use such graphs to create graphs whose colorings are hard to find!

Threshold phenomena in other problems: 3-SAT
Many combinatorial problems exhibit a threshold behavior: Instances generated with their critical parameter (clause/variable ratio in 3-SAT) around the value (4.2 in 3-SAT) that marks the transition from almost certain solubility (satisfiability in 3-SAT) to almost certain insolubility, seem to be among the hardest to solve with the best of algorithms available PROBLEM: Proof of existence and calculation of the critical value

Producing random 3-colorable graphs
Let p1, p2, and p3 be real numbers such that p1 + p2 + p3 = 1 and p1, p2, p3 > 0. For each j = 1, …, n, vertex vj is assigned to color class Ck with probability pk, k = 1, 2, 3. For each pair u, v of vertices that do not belong to the same color class, introduce the undirected edge (u,v) with probability p. The above algorithm is simple and very fast. It produces a random graph with a specified 3-coloring known only to the owner of the graph (i.e. the signature).

Targeting at the “hard” instances region
Set r = E[m]/n (expected number of edges / number of vertices). This gives r = p(p1p2 + p1p3 + p2p3)n. Set r ≈ 2.3 and p1 = p2 = p3 = 1/3 (color classes of equal size give, in general, more difficult instances). Then, solving for p, we obtain p = r / ((p1p2 + p1p3 + p2p3)n) = 3r/n ≈ 6.9/n.
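The generation procedure, with the edge probability chosen to aim at the hard region, can be sketched in Python as follows. The function name, the use of colors 0, 1, 2 and the optional `seed` parameter are assumptions of this sketch; the edge probability is obtained by solving r = p(p1p2 + p1p3 + p2p3)n for p.

```python
import random
from itertools import combinations

def random_3colorable_graph(n, r=2.3, class_probs=(1/3, 1/3, 1/3), seed=None):
    """Return (edges, coloring) for a random graph with a planted 3-coloring."""
    rng = random.Random(seed)
    p1, p2, p3 = class_probs
    # Solve r = p * (p1*p2 + p1*p3 + p2*p3) * n for the edge probability p.
    p = min(1.0, r / ((p1 * p2 + p1 * p3 + p2 * p3) * n))
    # Assign each vertex to a color class with the given probabilities.
    coloring = rng.choices(range(3), weights=class_probs, k=n)
    # Introduce edges only between vertices of different color classes.
    edges = [(u, v) for u, v in combinations(range(n), 2)
             if coloring[u] != coloring[v] and rng.random() < p]
    return edges, coloring

edges, coloring = random_3colorable_graph(300, seed=1)
print(len(edges) / 300)   # edges-to-vertices ratio, expected to be near 2.3
```

The returned coloring is the secret; only the edge list would be published.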

Zero Knowledge Interactive Proof Protocols (ZKIP)
Introduced by Goldwasser et al. (1985) and Babai (1985). Convince someone that you hold a piece of (generally) hard-to-acquire knowledge without disclosing it! A “graphical” description of a ZKIP for 3-coloring:
Secretly permute, at random, the 3 colors.
Spread the graph on the floor with vertices hidden.
The other party chooses at random a pair of adjacent vertices.
Expose their colors, showing that they are, indeed, different.
The above procedure is repeated until the other party is convinced that we really know the 3-coloring.

The “gory” details …
Setting: G = (V,E), where a Prover P knows a 3-coloring of G and a Verifier V needs a proof of this knowledge (Goldreich et al. (1991)).
P does the following (“commitment”):
Chooses a random permutation π of {1,2,3}.
For each v in V, applies the color permutation π and expresses the result using two binary bits k_{v,0} and k_{v,1}.
Chooses two random values r_{v,0}, r_{v,1} ≤ |V|/2.
Computes (“<<” is the “left shift” operator): R_{v,0} = RSA(<<r_{v,0} + k_{v,0}) and R_{v,1} = RSA(<<r_{v,1} + k_{v,1}).
Sends to V {R_{v,0}, R_{v,1} for all v in V}.

Challenge by V: selects an edge (u,v) at random and sends it to P.
Response by P: sends the RSA decryption keys for u and v to V.
Checking by V: if the revealed colors are the same, V rejects. Otherwise, V accepts.
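One round of the commit, challenge, response and check cycle can be simulated in Python. Note the substitution: this sketch uses salted SHA-256 hash commitments in place of the RSA-based commitments of the protocol above, purely to keep the example short; the function names and the toy graph are likewise assumptions.

```python
import hashlib
import random
import secrets

def commit(color):
    """Commit to a color with a salted hash; returns (commitment, nonce)."""
    nonce = secrets.token_bytes(16)
    return hashlib.sha256(nonce + bytes([color])).hexdigest(), nonce

def zkip_round(edges, coloring, rng=random):
    """Run one round; returns True if the verifier accepts."""
    # Prover: randomly permute the three colors and commit to every vertex.
    perm = [0, 1, 2]
    rng.shuffle(perm)
    commitments, openings = {}, {}
    for v, c in coloring.items():
        commitments[v], nonce = commit(perm[c])
        openings[v] = (nonce, perm[c])
    # Verifier: challenge with a random edge.
    u, v = rng.choice(edges)
    # Prover opens only the two endpoints; verifier checks the openings.
    for w in (u, v):
        nonce, c = openings[w]
        if hashlib.sha256(nonce + bytes([c])).hexdigest() != commitments[w]:
            return False
    # Verifier accepts only if the two revealed colors differ.
    return openings[u][1] != openings[v][1]

triangle = [(0, 1), (1, 2), (0, 2)]
print(all(zkip_round(triangle, {0: 0, 1: 1, 2: 2}) for _ in range(50)))
```

Because the colors are re-permuted in every round, the two revealed colors tell the verifier nothing beyond the fact that they differ.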

Why the ZKIP for 3-coloring works?
If we really did not know a 3-coloring (i.e. we tried to impersonate the legal owner), then at each interrogation by the other party there is some fixed probability r that the chosen pair is not properly colored. The probability that, over a sequence of n trials, we will manage to fool the other party is at most (1-r)^n, which tends to 0 exponentially fast, since 1-r is a constant less than 1. This means that we are doomed to get caught lying as the number of rounds gets larger and larger!

More formally …
Completeness: if G is indeed 3-colorable, P knows a 3-coloring, and both P and V follow the protocol, then V will be convinced that P knows a 3-coloring.
Soundness: if P does not know a 3-coloring, then there is at least one edge (u,v) that P will have colored illegally. V, on the other hand, will pick such an edge with probability at least 1/|E| in each round, so the probability of catching P can be brought arbitrarily close to 1 by repeating the protocol sufficiently many times.
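The number of repetitions needed for a given confidence follows from the worst case, in which a cheating prover has exactly one badly colored edge and so escapes a single round with probability 1 - 1/|E|. The small Python helper below (its name and parameters are assumptions of this sketch) computes the smallest number of rounds k with (1 - 1/|E|)^k at most a chosen bound.

```python
import math

def rounds_needed(num_edges, max_cheat_prob):
    """Smallest k with (1 - 1/num_edges)**k <= max_cheat_prob."""
    escape = 1.0 - 1.0 / num_edges         # per-round escape probability (worst case)
    if escape == 0.0:
        return 1                           # a single edge: one round always catches the cheat
    return math.ceil(math.log(max_cheat_prob) / math.log(escape))

# Example: a graph with 100 edges and a tolerated cheating probability of 1%.
print(rounds_needed(100, 0.01))
```

The count grows roughly linearly with the number of edges for a fixed bound.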

Current research efforts
How to produce graphs that, with high probability, have a small number of colorings, since solved 3-coloring instances (i.e. instances constructed to have a specific coloring) can have a very large number of additional colorings.
Identify classes of hard 3-coloring instances.
Give a partial effective characterization of hard instances: Instance Complexity, stemming from the work of Kolmogorov (1965), Solomonoff (1964), and Chaitin (1966), and Average Case complexity by Levin (1986).
Build an integrated smart card application that includes the ZKIP protocol for identity verification, and do the same for the graph generation algorithm (i.e. the signature construction algorithm).
Arrive at a standard.

Thank you!
