
1 Universal and composite hypothesis testing via Mismatched Divergence
Jayakrishnan Unnikrishnan, LCAV, EPFL
Collaborators: Dayu Huang, Sean Meyn, Venu Veeravalli (University of Illinois); Amit Surana (UTRC)
IPG seminar, 2 March 2011

2 Outline
Universal hypothesis testing – Hoeffding test
Problems with large alphabets – mismatched test: dimensionality reduction, improved performance
Extensions – composite null hypotheses, model-fitting with outliers, rate-distortion test, source coding with training
Conclusions

3 Universal Hypothesis Testing
Given a sequence of i.i.d. observations $X_1, \dots, X_n$, test the hypothesis $H_0: X_i \sim \pi$ against an unknown alternative
Focus on finite alphabets, i.e., PMFs
Applications: anomaly detection, spam filtering, etc.

4 Sufficient statistic
Empirical distribution: $\Gamma^n(a) = \frac{n(a)}{n}$, where $n(a)$ denotes the number of times letter $a$ appears in $X_1, \dots, X_n$
$\Gamma^n$ is a random vector
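A minimal sketch of the type computation (function and variable names here are illustrative, not from the talk):

```python
# Empirical distribution (type) of an i.i.d. sample over the finite
# alphabet {0, ..., A-1}; names are illustrative.
import numpy as np

def empirical_distribution(x, A):
    """Gamma^n(a) = n(a)/n, where n(a) counts occurrences of letter a."""
    counts = np.bincount(x, minlength=A)
    return counts / len(x)

# Example: 100 samples from the uniform PMF on 20 letters
rng = np.random.default_rng(0)
x = rng.integers(0, 20, size=100)
print(empirical_distribution(x, 20))
```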

5 Hoeffding’s Universal Test
Hoeffding test [1965]: reject $H_0$ when $D(\Gamma^n \| \pi) \geq \eta$
Uses the KL divergence between the empirical distribution $\Gamma^n$ and the null PMF $\pi$ as the test statistic
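A sketch of the test under these definitions, assuming $\pi(a) > 0$ for every letter and the convention $0 \log 0 = 0$:

```python
# Hoeffding test: reject H0 when D(Gamma^n || pi) >= eta.
import numpy as np

def kl_divergence(p, q):
    # convention 0 * log(0/q) = 0; assumes q > 0 everywhere
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def hoeffding_test(x, pi, eta):
    gamma = np.bincount(x, minlength=len(pi)) / len(x)
    return kl_divergence(gamma, pi) >= eta   # True = reject H0
```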

6–7 Hoeffding’s Universal Test
Hoeffding test is optimal in the error-exponent sense – Sanov’s Theorem in Large Deviations implies the false-alarm probability decays as $P_0\{D(\Gamma^n \| \pi) \geq \eta\} \approx e^{-n\eta}$
Better approximation of the false alarm probability via weak convergence under $H_0$: $2n\,D(\Gamma^n \| \pi) \xrightarrow{d} \chi^2_{|A|-1}$

8 Error exponents are inaccurate
[figure: simulated vs. exponent-predicted error probabilities; alphabet size A = 20]

9–10 Large Alphabet Regime
Hoeffding test performs poorly for large alphabet size $|A|$ – the statistic suffers from high bias and variance
A popular fix: merging low-probability bins (see the sketch after the next two slides)

11 Binning [figure]

12 Quantization [figure]
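A sketch of the bin-merging fix mentioned on slide 10 (the probability threshold below is an illustrative choice):

```python
# Merge letters with low null probability into one "rare" super-letter,
# then test on the coarser alphabet. The threshold is illustrative only.
import numpy as np

def merge_low_prob_bins(pi, thresh=0.01):
    """Map each letter to a bin index; rare letters share the last bin."""
    keep = np.where(pi >= thresh)[0]
    bin_of = np.full(len(pi), len(keep))     # rare letters -> last bin
    bin_of[keep] = np.arange(len(keep))
    return bin_of

def binned_pmf(p, bin_of):
    return np.bincount(bin_of, weights=p, minlength=bin_of.max() + 1)
```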

13 General principle
Dimensionality reduction: essentially we compromise on universality but improve performance against typical alternatives
Generalization: a parametric family of typical alternatives

14 Hoeffding test [figure]
15–18 Mismatched test [figure build-up]

19 Mismatched test
Use the mismatched divergence $D^{\mathrm{MM}}(\Gamma^n \| \pi) = \sup_{\theta} \{\Gamma^n(f_\theta) - \log \pi(e^{f_\theta})\}$ instead of the KL divergence – interpretable as a lower bound to the KL divergence
Idea in short: replace the unknown alternative with its ML estimate from the parametric family, i.e., it is a GLRT

20–21 Exponential family example
Mismatched divergence is the solution to a convex problem: maximize $\theta^T \Gamma^n(\psi) - \log \pi(e^{\theta^T \psi})$ over $\theta$, for basis functions $\psi$
Binning is recovered when the $\psi_i$ are indicator functions of the bins
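A sketch of the exponential-family computation, using the variational form from slide 19; the solver choice and all names are mine:

```python
# D_MM(mu || pi) = sup_theta { theta . mu(psi) - log pi(exp(theta . psi)) }.
# The objective is concave in theta, so a generic smooth optimizer works.
import numpy as np
from scipy.optimize import minimize

def mismatched_divergence(mu, pi, psi):
    """mu, pi: length-A PMFs; psi: (d, A) array of basis functions."""
    def neg_obj(theta):
        f = theta @ psi                      # f_theta(a) for each letter a
        return -(f @ mu - np.log(pi @ np.exp(f)))
    res = minimize(neg_obj, np.zeros(psi.shape[0]))
    return -res.fun
```

Choosing the rows of `psi` as indicator functions of a partition recovers the binned test as a special case.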

22 Mismatched test properties
+ Addresses the high-variance issue
− However, not universally optimal in the error-exponent sense
+ Optimal when the alternate distribution lies in the parametric family: achieves the same error exponents as the Hoeffding test, which implies optimality of the GLRT for composite hypotheses

23 Performance comparison
[figure: error probabilities of the Hoeffding and mismatched tests; A = 19, n = 40]

24–25 Weak convergence
When observations are drawn under $H_0$: $2n\,D^{\mathrm{MM}}(\Gamma^n \| \pi) \xrightarrow{d} \chi^2_d$, with $d$ the dimension of the family – approximate thresholds for a target false alarm probability
When observations are drawn under the alternative: a Gaussian limit – approximate the power of the test
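A sketch of threshold selection from the chi-squared limit ($d$ = dimension of the function class, `alpha` = target false-alarm level; the numbers in the example are illustrative):

```python
# Under H0, 2n * D_MM(Gamma^n || pi) -> chi-squared with d degrees of
# freedom, so eta = chi2.ppf(1 - alpha, d) / (2n) targets level alpha.
from scipy.stats import chi2

def mm_threshold(n, d, alpha=0.05):
    return chi2.ppf(1.0 - alpha, df=d) / (2.0 * n)

print(mm_threshold(n=40, d=5))
```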

26 EXTENSIONS AND APPLICATIONS

27–29 Composite null hypotheses
Composite null hypotheses / model fitting: test $H_0: \pi \in \{\pi_\theta\}$, e.g., via $\inf_\theta D(\Gamma^n \| \pi_\theta)$ [figure build-up]

30–32 Weak convergence
When observations are drawn under the composite null – approximate thresholds for a target false alarm – approximate the power of the test – study outlier effects

33–37 Outliers in model-fitting
Data corrupted by outliers or model-mismatch – contamination mixture model, e.g., $(1-\epsilon)\pi_\theta + \epsilon\,\mu$
Goodness-of-fit metric – its limiting behavior is used to quantify the goodness of fit
The limiting behavior of the goodness-of-fit metric changes under contamination
Sensitivity of the goodness-of-fit metric to outliers (see the simulation sketch below)
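A small simulation sketch of the contamination model; epsilon, the outlier distribution, and the nominal model are all illustrative choices, not the talk's:

```python
# Draw from (1 - eps) * pi + eps * mu and evaluate the goodness-of-fit
# statistic 2n * D(Gamma^n || pi); contamination inflates it relative
# to its nominal chi-squared(A - 1) behavior.
import numpy as np

rng = np.random.default_rng(1)
A, n, eps = 20, 500, 0.05
pi = np.full(A, 1.0 / A)                  # nominal model
mu = np.zeros(A); mu[0] = 1.0             # outliers hit letter 0
x = rng.choice(A, size=n, p=(1 - eps) * pi + eps * mu)

gamma = np.bincount(x, minlength=A) / n
mask = gamma > 0
stat = 2 * n * np.sum(gamma[mask] * np.log(gamma[mask] / pi[mask]))
print(stat)
```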

38 Rate-distortion test
A different generalization of binning – rate-distortion-optimal compression
Test based on optimally compressed observations [P. Harremoës 09] – results on the limiting distribution of the test statistic

39–41 Source coding with training
A wants to encode and transmit a source to B – unknown distribution on a known alphabet – given training samples
Choose codelengths based on empirical frequencies
The expected excess codelength has a chi-squared limit
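A sketch of codelength selection from training data; the Laplace smoothing is my assumption (it avoids infinite lengths for unseen letters), not necessarily the talk's scheme:

```python
# Shannon codelengths -log2 p_hat(a) from empirical training frequencies.
# The excess of these lengths over the ideal -log2 pi(a) is the quantity
# whose expectation has the chi-squared characterization above.
import numpy as np

def trained_codelengths(train, A):
    counts = np.bincount(train, minlength=A)
    p_hat = (counts + 1) / (len(train) + A)   # Laplace-smoothed PMF
    return np.ceil(-np.log2(p_hat))           # integer codelengths
```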

42–43 CLT vs LDP
Empirical distribution (type) of $X_1, \dots, X_n$:
Obeys an LDP (Sanov’s theorem): $P\{\Gamma^n \in E\} \approx e^{-n \inf_{\mu \in E} D(\mu \| \pi)}$
Obeys a CLT: $\sqrt{n}\,(\Gamma^n - \pi) \xrightarrow{d} N(0, \Sigma)$

44 CLT vs LDP
LDP: good for large deviations – approximates the asymptotic slope of the log-probability – the pre-exponential factor may be significant
CLT: good for moderate deviations – approximates the probability itself

45 Conclusions
Error exponents do not tell the whole story: they are not a good indicator of exact probabilities, and tests with identical error exponents can differ drastically over finite samples
Weak convergence results give better approximations than error exponents (LDPs)
Compromising universality yields performance improvements against typical alternatives
Applications: threshold selection, outlier sensitivity, source coding with training

46 References
J. Unnikrishnan, D. Huang, S. Meyn, A. Surana, and V. V. Veeravalli, “Universal and Composite Hypothesis Testing via Mismatched Divergence,” IEEE Trans. Inf. Theory, to appear.
J. Unnikrishnan, S. Meyn, and V. Veeravalli, “On Thresholds for Robust Goodness-of-Fit Tests,” presented at the IEEE Information Theory Workshop, Dublin, Aug. 2010.
J. Unnikrishnan, “Model-fitting in the presence of outliers,” submitted to ISIT 2011. Available at http://lcavwww.epfl.ch/~unnikris/

47 Thank You!

