1 Chapter 15 System Errors Revisited Ali Erol 10/19/2005.



2 System Errors Revisited
Quantify the accuracy of FAR and FRR estimates.
Confidence intervals are a well-known technique used in statistical analysis. See references [22], [23].
The algorithm of the first three authors [23] has been shown experimentally to provide better confidence interval estimates.

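3 FAR/FRR
Definition:
$\mathrm{FRR}(x) = \mathrm{Prob}(s_m \le x \mid H_0) = F(x)$
$\mathrm{FAR}(y) = \mathrm{Prob}(s_n > y \mid H_a) = 1 - \mathrm{Prob}(s_n \le y \mid H_a) = 1 - G(y)$
We need
– $F(x) = \mathrm{Dist}(x)$: genuine (matching) score distribution function (DF)
– $G(y) = \mathrm{Dist}(y)$: imposter (non-matching) score DF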

4 FAR/FRR
Instead we have
– Set of genuine scores $X = \{X_1, X_2, \ldots, X_M\}$
– Set of imposter scores $Y = \{Y_1, Y_2, \ldots, Y_N\}$
We estimate $F(x)$ and $G(y)$ from these sets.
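The estimate formulas did not survive in this transcript. A minimal sketch, assuming the usual empirical (counting) estimates of the definitions FRR(x) = Prob(s_m ≤ x | H_0) and FAR(y) = Prob(s_n > y | H_a); the function names and the score values are made up for illustration:

```python
def frr(genuine_scores, x):
    """Empirical FRR(x): fraction of genuine scores at or below x."""
    return sum(s <= x for s in genuine_scores) / len(genuine_scores)


def far(imposter_scores, y):
    """Empirical FAR(y): fraction of imposter scores above y."""
    return sum(s > y for s in imposter_scores) / len(imposter_scores)


# Made-up scores, for illustration only.
X = [0.9, 0.8, 0.75, 0.6, 0.4]   # genuine scores, M = 5
Y = [0.1, 0.2, 0.3, 0.5, 0.7]    # imposter scores, N = 5
t = 0.5                          # decision threshold

print(frr(X, t), far(Y, t))
```

With only five scores per set, each estimate can move by 0.2 when a single score changes, which is the accuracy question the following slides address.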

5 Problem
What is the accuracy of these error rates? It depends on
– The number of biometric samples
– The quality of the samples
– The data collection procedure (e.g. 10 consecutive samples)
– The subjects involved, the acquisition device, etc.

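6 An Estimation Problem
Given
– $x$: a random variable ($F(x)$ denotes $\mathrm{Dist}(x)$)
– $X = \{X_1, X_2, \ldots, X_M\}$: sample set
Estimate $\theta = E(x)$
Solution: the sample mean $\hat\theta = \frac{1}{M}\sum_{i=1}^{M} X_i$ (unbiased estimator*)
Error: the difference between $\hat\theta$ and $\theta$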

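7 Biased/Unbiased Estimators
For an unbiased estimator we have $E(\hat\theta) = \theta$.
Example: Gaussian model. Estimate the mean $\theta_1$ and the variance $\theta_2$ using the maximum likelihood criterion, i.e. maximize $\mathrm{Prob}(X \mid \theta_1, \theta_2)$.
– The maximum likelihood estimate of the mean is an unbiased estimator.
– The maximum likelihood estimate of the variance is a biased estimator.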
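A small simulation can make the bias visible. This is a sketch under assumed settings (a Gaussian with variance 4, sample size M = 5, 5000 trials, all chosen here for illustration): the maximum likelihood variance estimate divides by M and averages to roughly (M-1)/M of the true variance, while dividing by M-1 removes the bias.

```python
import random
import statistics

random.seed(0)
M = 5             # small sample size makes the bias visible (assumed value)
TRIALS = 5000     # number of simulated sample sets (assumed value)
TRUE_VAR = 4.0    # variance of the simulated Gaussian, sigma = 2

ml_estimates = []        # maximum likelihood: divide by M
unbiased_estimates = []  # corrected: divide by M - 1
for _ in range(TRIALS):
    sample = [random.gauss(0.0, 2.0) for _ in range(M)]
    ml_estimates.append(statistics.pvariance(sample))
    unbiased_estimates.append(statistics.variance(sample))

avg_ml = statistics.fmean(ml_estimates)
avg_unbiased = statistics.fmean(unbiased_estimates)
print("ML average:", avg_ml, "expected about", TRUE_VAR * (M - 1) / M)
print("Unbiased average:", avg_unbiased, "expected about", TRUE_VAR)
```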

8 Confidence Interval
Assume $F(x)$ is given; then $\mathrm{Dist}(r)$ can be calculated.
– $r$ is a function of $\hat\theta$, which is a function of $x$
Calculate, with $(1-\alpha)100\%$ certainty (next slide), $r \in [\eta_1(\alpha, X), \eta_2(\alpha, X)]$,
which leads to a $(1-\alpha)100\%$ confidence interval for $\theta$.
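The interval formula itself is missing from the transcript. As a sketch of the step, assume the error is defined as $r = \hat\theta - \theta$ (the sign convention is an assumption) and write the two bounds on $r$ as $\eta_1$ and $\eta_2$ (assumed notation). Rearranging the inequality then gives an interval for $\theta$:

```latex
\begin{align*}
\mathrm{Prob}\bigl(\eta_1 \le r \le \eta_2\bigr) &= 1-\alpha, \qquad r = \hat\theta - \theta \\
\eta_1 \le \hat\theta - \theta \le \eta_2 \;&\Longleftrightarrow\; \hat\theta - \eta_2 \le \theta \le \hat\theta - \eta_1 \\
\theta &\in \bigl[\hat\theta - \eta_2,\; \hat\theta - \eta_1\bigr] \quad \text{with } (1-\alpha)100\% \text{ confidence}
\end{align*}
```

Note that the upper bound on $r$ produces the lower bound on $\theta$ under this sign convention.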

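9 Confidence Interval
Example
– Discard $\alpha/2$ at the lower and higher ends of $\mathrm{Dist}(r)$
– Find the $r$ values corresponding to the interval boundaries (called quantiles)
[Figure: $\mathrm{Dist}(r)$ plotted against $r$]
$\mathrm{Prob}(q(\alpha/2) \le r \le q(1-\alpha/2)) = 1 - \alpha$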

10 Confidence Interval
Interpretation:
– Generate sample sets $X$ from $F(x)$
– Calculate a confidence interval for each $X$
– $(1-\alpha)100\%$ of these intervals contain $\theta$.

11 Parametric Method
The $X_i$ are identically distributed.
Assume the $X_i$ are independent (not true in general).
Then the distribution of the estimate can be taken to be normal by the central limit theorem (large $M$).
Result: a confidence interval whose width is set by a normal quantile $z$, e.g. $z = 1.96$ for 95% confidence.
The interval gets smaller with increasing $M$ and $\alpha$.
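A minimal sketch of the parametric interval, assuming the standard normal-theory form mean ± z·σ/√M (the slide's own formula is not preserved in the transcript). The function name `normal_ci` and the scores are made up for illustration:

```python
import math
import statistics


def normal_ci(sample, alpha=0.05):
    """Normal-theory (1 - alpha) confidence interval for the mean of `sample`."""
    m = len(sample)
    mean = statistics.fmean(sample)
    sigma = statistics.stdev(sample)                     # sample standard deviation
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    half_width = z * sigma / math.sqrt(m)
    return mean - half_width, mean + half_width


# Made-up scores, for illustration only.
scores = [0.62, 0.71, 0.55, 0.80, 0.67, 0.73, 0.59, 0.66]
lo95, hi95 = normal_ci(scores, alpha=0.05)
lo90, hi90 = normal_ci(scores, alpha=0.10)
print((lo95, hi95), (lo90, hi90))
```

The 90% interval comes out narrower than the 95% one, which is the "smaller interval with increasing α" effect; the √M in the denominator gives the dependence on M.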

12 Non-Parametric Method
Assume $F(x)$ is available.
[Figure: the density $f(x)$ of the random variable, from which the sample set $X$ and additional sample sets are drawn]

13 Non-Parametric Method
FACT: For large $B$ we have
Define error to be
Calculate $\mathrm{Dist}(r)$
Solution:

14 Non-Parametric Method
Interval calculation: sorting and counting.
[Figure: $\mathrm{Dist}(r)$ plotted against $r$, with $\alpha/2$ marked]

15 Bootstrap Method
$F(x)$ is not available; all we have is $X$.
How do we generate the additional sample sets?
Solution (i.e. the bootstrap method): sampling with replacement from $X$.
– Put the samples in a bag; draw one, record it, and put it back.
– Draw $M$ samples from $X$, $B$ times.
– Some samples $X_i$ may not appear in a given set.
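The bag-drawing procedure plus the sorting-and-counting step can be sketched as follows. This is an illustrative percentile bootstrap for the mean, not necessarily the exact algorithm of [23]; the function name, the default B = 1000, and the scores are assumptions:

```python
import random
import statistics


def bootstrap_ci(sample, alpha=0.10, B=1000, seed=0):
    """Percentile bootstrap (1 - alpha) confidence interval for the mean."""
    rng = random.Random(seed)
    M = len(sample)
    estimates = []
    for _ in range(B):
        # Draw M samples with replacement: some X_i repeat, some are left out.
        resample = rng.choices(sample, k=M)
        estimates.append(statistics.fmean(resample))
    estimates.sort()
    # Sorting and counting: discard alpha/2 of the estimates at each end.
    lower = estimates[round(alpha / 2 * B)]
    upper = estimates[round((1 - alpha / 2) * B) - 1]
    return lower, upper


# Made-up scores, for illustration only.
scores = [0.62, 0.71, 0.55, 0.80, 0.67, 0.73, 0.59, 0.66, 0.90, 0.48]
lower, upper = bootstrap_ci(scores)
print(lower, upper)
```

This version treats the scores as independent, so it inherits the limitation discussed for the plain bootstrap.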

16 Bootstrap Method (Imperfections)
The $X_i$ are not independent.
– In SR (sampling with replacement) the dependence between samples is not replicated.
Effect of treating dependent samples as independent
– The variance of the estimate is smaller
– This leads to smaller CIs
[Figure: $\alpha/2$ marked]

17 Subset Bootstrap
Potential sources of dependency
– All samples from the same person (e.g. multiple fingers)
– All samples from the same biometric (e.g. finger)
Partition $X$ into independent subsets.
Apply sampling with replacement to the subsets.
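A minimal sketch of the idea, assuming the scores have already been grouped into one list per person (or per finger): whole subsets are drawn with replacement, so scores that depend on each other always travel together. The function name and the data are hypothetical:

```python
import random
import statistics


def subset_bootstrap_ci(subsets, alpha=0.10, B=1000, seed=0):
    """Percentile CI for the mean, resampling whole subsets instead of single scores."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(B):
        # Draw as many subsets as there are, with replacement.
        drawn = rng.choices(subsets, k=len(subsets))
        pooled = [score for subset in drawn for score in subset]
        estimates.append(statistics.fmean(pooled))
    estimates.sort()
    lower = estimates[round(alpha / 2 * B)]
    upper = estimates[round((1 - alpha / 2) * B) - 1]
    return lower, upper


# Made-up genuine scores grouped by person; scores within a person are similar.
by_person = [
    [0.91, 0.89, 0.93],
    [0.55, 0.60, 0.58],
    [0.75, 0.72, 0.78],
    [0.40, 0.45, 0.43],
    [0.82, 0.85, 0.80],
]
lower, upper = subset_bootstrap_ci(by_person)
print(lower, upper)
```

Because the between-person spread is kept intact, the resulting interval is typically wider than one obtained by resampling the pooled scores individually.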

18 Subset Bootstrap (An Example)
Fingerprint database
– $P$ persons
– $c$ fingers per person, so $D = cP$ fingers
– $d$ samples per finger
– DB size $= cPd$
Matching pairs
– $d(d-1)$ per finger
– $cd(d-1)$ per person
– $cPd(d-1) = Dd(d-1)$ total
Using a symmetric or an asymmetric matcher does not make any difference [23].

19 Subset Bootstrap (An Example)
$X_1 \subset X_2$
– $X_1$: $P=10$, $c=2$, $D=20$, $d=8$, so $M=1120$
– $X_2$: $P=50$, $c=2$, $D=100$, $d=8$, so $M=5600$
Finger-based partition: the subsets are the samples from the same finger (i.e. $D$ subsets of $d(d-1)$ matching scores).
Person-based partition: the subsets are the samples from the same person (i.e. $P$ subsets of $cd(d-1)$ matching scores).
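The set sizes follow directly from the pair counts d(d-1) per finger and cd(d-1) per person. A quick arithmetic check (the function name is made up for illustration):

```python
def matching_pairs(P, c, d):
    """Return (pairs per finger, pairs per person, total M) for ordered matching pairs."""
    D = c * P                     # number of fingers
    per_finger = d * (d - 1)      # size of each finger-based subset
    per_person = c * per_finger   # size of each person-based subset
    total = D * per_finger        # M, the number of genuine scores
    return per_finger, per_person, total


print(matching_pairs(P=10, c=2, d=8))   # X_1: (56, 112, 1120)
print(matching_pairs(P=50, c=2, d=8))   # X_2: (56, 112, 5600)
```

So the finger-based partition of X_1 has 20 subsets of 56 scores, and the person-based partition has 10 subsets of 112 scores.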

20 Subset Bootstrap (An Example)
We expect
– $CI_1$ (light gray) to be larger than $CI_2$ (dark gray), because $X_1$ has a smaller number of samples
– $CI_2$ (dark gray) to be contained in $CI_1$ (light gray), because $X_1 \subset X_2$
The intervals are larger for person-based partitioning
– There is dependency between fingers of the same person

21 CIs for FAR/FRR
Calculate the CIs for each threshold $T = t_0$ and a given $\alpha$.

22 CI for FRR
Given the genuine score set $X$
– Generate bootstrap sets from $X$
– Calculate the FRR estimate at $t_0$ for each set
– Sort and count

23 CI for FAR
Given the imposter score set $Y$
– Generate bootstrap sets from $Y$
– Calculate the FAR estimate at $t_0$ for each set
– Sort and count
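The generate, calculate, sort-and-count recipe is the same for both error rates; only the error condition differs. A sketch under assumptions: plain (not subset) resampling, simulated Gaussian scores, and a hypothetical helper `error_rate_ci` that takes the error condition as a function:

```python
import random


def error_rate_ci(scores, is_error, alpha=0.10, B=1000, seed=0):
    """Percentile bootstrap CI for the fraction of scores that count as errors."""
    rng = random.Random(seed)
    M = len(scores)
    rates = []
    for _ in range(B):
        resample = rng.choices(scores, k=M)                   # generate
        rates.append(sum(is_error(s) for s in resample) / M)  # calculate
    rates.sort()                                              # sort and count
    return rates[round(alpha / 2 * B)], rates[round((1 - alpha / 2) * B) - 1]


# Simulated scores, for illustration only.
data_rng = random.Random(1)
genuine = [data_rng.gauss(0.7, 0.15) for _ in range(200)]
imposter = [data_rng.gauss(0.3, 0.15) for _ in range(200)]
t0 = 0.5  # threshold at which the CIs are computed

frr_ci = error_rate_ci(genuine, lambda s: s <= t0)   # false reject: genuine score at or below t0
far_ci = error_rate_ci(imposter, lambda s: s > t0)   # false accept: imposter score above t0
print("FRR CI:", frr_ci, "FAR CI:", far_ci)
```

Repeating the two calls over a grid of thresholds gives a confidence band around each error curve.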

24 Subset Bootstrap for FAR
Imposter scores $Y$ are not independent: we are using multiple impressions of the same finger.
Let $I_{xk}$ be the $k$-th finger impression from subject $x$; then $\mathrm{sim}(I_{a1}, I_{b1})$, $\mathrm{sim}(I_{a1}, I_{b2})$, $\mathrm{sim}(I_{a2}, I_{b3})$ are not statistically independent.
If each finger is used only once, $D$ fingers give only $D/2$ such pairs.
There is actually dependency between $X$ and $Y$ as well.

25 Subset Bootstrap for FAR
Fingerprint database
– $P$ persons
– $c$ fingers per person, so $D = cP$ fingers
– $d$ samples per finger
– DB size $= cPd$
Non-matching pairs
– $N = d^2 D(D-1) = P[(dc)^2(P-1) + d^2 c(c-1)]$
– $d^2(D-1)$ per finger
– $(dc)^2(P-1) + d^2 c(c-1)$ per person
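The two expressions for N can be checked numerically. The values P = 10, c = 2, d = 8 are borrowed from the X_1 matching example purely for illustration, and the function name is made up:

```python
def non_matching_pairs(P, c, d):
    """Return (pairs per finger, pairs per person, total N) for non-matching pairs."""
    D = c * P
    per_finger = d**2 * (D - 1)
    per_person = (d * c)**2 * (P - 1) + d**2 * c * (c - 1)
    total = d**2 * D * (D - 1)
    # Both partitions must account for every non-matching pair exactly once.
    assert total == D * per_finger == P * per_person
    return per_finger, per_person, total


print(non_matching_pairs(P=10, c=2, d=8))   # (1216, 2432, 24320)
```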

26 Subset Bootstrap for FAR
[Figure: the DB partitioned into $I_1, \ldots, I_i, \ldots, I_N$; matching $I_i$ against the others gives $Y_1 = I_i \times I_1, \ldots, Y_{N-1} = I_i \times I_N$]
Finger ($N = D$): take $I_i$ ($d$ elements) and match it against each $I_k$, $k \ne i$ ($d^2$ pairs each); this gives $d^2(D-1)$ pairs. Repeat with all $I_i$ to construct the subsets $Y_k$.
Person ($N = P$): take $I_i$ ($cd$ elements) and match it against each $I_k$, $k \ne i$ ($(dc)^2$ pairs each); this gives $(dc)^2(P-1)$ pairs. Inside $I_i$ we have $d^2 c(c-1)$ pairs. Repeat with all $I_i$ to construct the subsets $Y_k$.
Not completely independent: we use each $I_i$ many times.

27 Subset Bootstrap for FRR
The person-based subset gives a better estimate.

28 How good are the CIs?
There exists a true confidence interval (at the beginning we assumed $F(x)$ is known).
The CI we calculate is just one estimate. How accurate is that estimate?

29 How good are the CIs?
We estimate $E(x)$.
Ideal test: assume $F(x)$ is available
– Generate
– Calculate
– Assume and test if

30 How good are the CIs?
Practical test (for comparison)
1. Randomly split $X$ into two subsets $X_a$ and $X_b$
2. Calculate the estimate from $X_b$ and $CI_a$ from $X_a$
3. Test whether the estimate falls into $CI_a$
4. Repeat 1-3 many times and count the number of hits, i.e. estimate the probability of falling into $CI_a$
The hit rate is not equal to the confidence. Assume the estimates have a normal distribution. The higher the hit rate, the better the estimates.
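The split-and-test loop can be sketched as below. Several details are assumptions because the transcript drops the formulas: the estimate is taken to be the mean, it is computed on X_b and tested against a plain percentile bootstrap CI from X_a, and the scores are simulated. All names are hypothetical:

```python
import random
import statistics


def bootstrap_ci(sample, alpha, B, rng):
    """Percentile bootstrap CI for the mean (plain resampling, assumed here)."""
    means = sorted(
        statistics.fmean(rng.choices(sample, k=len(sample))) for _ in range(B)
    )
    return means[round(alpha / 2 * B)], means[round((1 - alpha / 2) * B) - 1]


def hit_rate(X, alpha=0.10, trials=100, B=200, seed=0):
    """Fraction of random splits where the X_b estimate falls inside CI_a."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        shuffled = X[:]
        rng.shuffle(shuffled)                      # step 1: random split
        half = len(shuffled) // 2
        X_a, X_b = shuffled[:half], shuffled[half:]
        lo, hi = bootstrap_ci(X_a, alpha, B, rng)  # step 2: CI_a from X_a
        if lo <= statistics.fmean(X_b) <= hi:      # step 3: test the X_b estimate
            hits += 1
    return hits / trials                           # step 4: count the hits


# Simulated independent scores, for illustration only.
data_rng = random.Random(1)
X = [data_rng.gauss(0.7, 0.1) for _ in range(100)]
rate = hit_rate(X)
print(rate)
```

Even for independent data the hit rate stays below 1-α, because the estimate from X_b is itself noisy; this matches the remark that the hit rate is not equal to the confidence.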

31 How good are the CIs?
$\alpha = 0.1$
Person-based partitioning provides more accurate confidence intervals.
73.10% is very close to the expected value.

