
1 HW1 Beta(p;4,10), Beta(p;9,15), Beta(p;6,6) likelihood, Beta(p;1,1)

2 FDR, Evidence Theory, Robustness

3 General dependency test

function PV=testdep(D,N)
% General dependency test TESTDEP(D,N)
% D: n by d data matrix
% N: Monte Carlo comparison sample size
F=[];
[n,d]=size(D);
mv=mean(D);
D=D-repmat(mv,n,1);    % remove mean
st=std(D);
D=D./repmat(st,n,1);   % standardize variance
for i=1:d
  for j=1:i-1
    q=mean(D(:,i).*D(:,j));   % mean pairwise product = sample correlation
    F=[F q];
  end
end

4 Empirical no-dependency distribution

EE=[];
for iN=1:N-1
  E=[];
  for i=1:d
    q=[];
    if i>1
      D(:,i)=D(randperm(n),i);   % permute column i to break any dependency
    end
    for j=1:i-1
      q=mean(D(:,i).*D(:,j));
      E=[E q];
    end
  end
  EE=[EE;E];
end

5 Computing P-value

% Sorting twice gives value ranks of EE - test statistics
EE=[F ; EE];
[EEs,iix]=sort(EE);
[EEs,iix]=sort(iix);
% p-value is proportional to value rank
PV=iix(1,:)/N;
% reshuffle to matrix: PVM(ix)=PV

6 Correlation coefficient

>> D=[1:100]';
>> D=[D -D D.^2 D+200*rand(size(D)) randn(size(D))];
>> [c pv]=corrcoef(D)
c =
    1.0000   -1.0000    0.9689    0.2506   -0.0977
   -1.0000    1.0000   -0.9689   -0.2506    0.0977
    0.9689   -0.9689    1.0000    0.2959   -0.0540
    0.2506   -0.2506    0.2959    1.0000   -0.0242
   -0.0977    0.0977   -0.0540   -0.0242    1.0000

7 Correlation coefficient

>> [c pv]=corrcoef(D)
pv =
    1.0000         0    0.0000    0.0119    0.3335
>> pv=testdep(D,N)
pv =
         0         0    1.0000    0.9936    0.1668

8 Multiple testing

The probability of rejecting a true null hypothesis at the 99% level is 1%. Thus, if you repeat the test 100 times, each time with new data, you will reject at least once with probability 1 - 0.99^100, approximately 0.63. Bonferroni correction (FWE control): to reach significance level 1% in an experiment involving 1000 tests, each individual test should be checked at significance level (1/1000)%.
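A minimal MATLAB sketch (mine, not from the slides) reproducing both numbers quoted above:

alpha = 0.01;                 % per-test significance level
p_any = 1 - (1-alpha)^100     % P(at least one false rejection in 100 tests), ~0.63
m = 1000;                     % number of tests in the experiment
alpha_bonf = alpha/m          % Bonferroni per-test level: 1e-5 = (1/1000)%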

9 Multiple testing

Several approaches try to detect an excess of small p-values. Sort the set of p-values and test whether there is an excess of small values - such an excess is an indication of false null hypotheses.

10 Approaches to multiple testing

11 Definition of FDR, positive correlation
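The slide's formula is not preserved in this transcript. For reference, here is a minimal MATLAB sketch of the Benjamini-Hochberg step-up rule, in the spirit of the Fdrex(pv,0.05,dep) calls on slides 16-17 (Fdrex itself is not reproduced here; bh_fdr and its signature are my naming; dep=1 adds the Benjamini-Yekutieli correction term for dependent tests):

function sel = bh_fdr(pv, q, dep)
% pv: vector of p-values, q: target FDR level, dep: 1 = allow dependence
m = numel(pv);
[ps, ix] = sort(pv(:)');             % order statistics p_(1) <= ... <= p_(m)
c = 1;
if dep, c = sum(1./(1:m)); end       % correction term sum(1/i) under dependence
k = find(ps <= (1:m)/m*q/c, 1, 'last');   % largest i with p_(i) <= i*q/(m*c)
sel = [];
if ~isempty(k), sel = ix(1:k); end   % reject the k smallest p-values
end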

12 No significance (figure: sorted p-values; legend: lower envelope, FDR corrected)

13 Some significance: one of the first 15 tests is non-null, at 5% significance

14 More significance. FDR: 95% of the first 3 tests are non-null

15 Even more significance: 95% of the first 14 tests are non-null - worth the effort to investigate all of them

16 FDR example - independence. Fdrex(pv,0.05,0): 10 signals suggested. The smallest p-value is not significant under Bonferroni correction (0.019 vs 0.013).

17 FDR example - dependency. Fdrex(pv,0.05,1): the 10 signals suggested under the independence assumption all disappear with the dependency correction term.

18 Ed Jaynes devoted a large part of his career to promoting Bayesian inference. He also championed the use of Maximum Entropy in physics. Outside physics he met resistance from people who had already invented other methods. Why should statistical mechanics say anything about our daily human world?

19 Generalisation of Bayes/Kalman: What if:
- You have no prior?
- The likelihood is infeasible to compute (imprecision)?
- The parameter space is vague, i.e., not the same for all likelihoods (fuzziness, vagueness)?
- The parameter space has complex structure (a simple structure is, e.g., a Cartesian product of the reals R and some finite sets)?

20 Philippe Smets (1938-2005) developed Dempster's and Shafer's method of uncertainty management into the Transferable Belief Model, which combines imprecise 'evidence' (likelihood or prior) using Dempster's rule and uses the pignistic transformation to get a sharp decision criterion.

21 Some approaches...
- Robust Bayes: replace distributions by convex sets of distributions (Berger et al.)
- Dempster/Shafer/TBM: describe imprecision with random sets
- DSm: transform the parameter space to capture vagueness (Dezert/Smarandache, controversial)
- FISST (FInite Set STatistics): generalises the observation and parameter spaces to products of spaces described as random sets (Goodman, Mahler, Nguyen)

22 Combining Evidence


26 Robust Bayes

Priors and likelihoods are convex sets of probability distributions (Berger, de Finetti, Walley, ...): imprecise probability. Every member of the posterior is a 'parallel combination' of one member of the likelihood and one member of the prior. For decision making, Jaynes recommends using the member of the posterior with maximum entropy (MaxEnt estimate).
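A minimal sketch (assumptions mine: a coin-flipping example, with the prior ranging over a grid of Beta(p;a,b) distributions) of how an imprecise prior propagates to an interval of posterior summaries:

s = 7; f = 3;                        % observed successes and failures
as = 1:0.5:3; bs = 1:0.5:3;          % convex set of Beta priors, sampled on a grid
pmean = [];
for a = as
  for b = bs
    pmean(end+1) = (a+s)/(a+b+s+f);  % posterior mean for this prior member
  end
end
fprintf('posterior mean in [%.3f, %.3f]\n', min(pmean), max(pmean))

Each (a,b) pair is one 'parallel combination' of a prior member with the likelihood; the decision maker then faces an interval rather than a point.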

27 Ellsberg's Paradox: Ambiguity Avoidance

Urn A contains 4 white and 4 black balls, and 4 balls of unknown colour (black or white). Urn B contains 6 white and 6 black balls. You get one krona if you draw a black ball. From which urn do you want to draw? A precise Bayesian should first make an assumption about how the unknown balls are coloured and then answer. But a majority prefer urn B, even if black is exchanged for white.

28 Prospect Theory: Kahneman, Tversky

'Safety belts eliminate car collision injuries at low speed completely.' (I BUY IT!!!) 'Safety belts eliminate 90% of injuries in car accidents; in the remaining 10% the speed is too high.' (So belts are not that good!???) The framing of the same risk information changes its persuasive effect.

29 How are imprecise probabilities used?

The expected utility of a decision alternative becomes an interval instead of a point: maximax, maximin, maximean? (figure: utility u versus alternative a, with pessimist, optimist, and Bayesian choices)

30 Dempster/Shafer/Smets

Evidence is a random set over Θ, i.e., a probability distribution over 2^Θ. Probability of a singleton: 'belief' allocated to that alternative, i.e., probability. Probability of a non-singleton: 'belief' allocated to the set of alternatives, but not to any particular part of it. Evidences are combined by random-set intersection conditioned to be non-empty (Dempster's rule).
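A minimal MATLAB sketch (mine, not from the slides) of Dempster's rule on a 3-element frame Θ={A,B,C}, with subsets indexed 1..7 by bitmask (1=A, 2=B, 4=C, so 3=AB, 5=AC, 6=BC, 7=ABC):

function m = dempster(m1, m2)
% m1, m2: basic belief assignments over the 7 non-empty subsets of Θ
m = zeros(1,7);
K = 0;                        % mass falling on empty intersections (conflict)
for s1 = 1:7
  for s2 = 1:7
    s = bitand(s1,s2);        % random-set intersection
    w = m1(s1)*m2(s2);
    if s == 0
      K = K + w;              % empty: collect as conflict
    else
      m(s) = m(s) + w;
    end
  end
end
m = m/(1-K);                  % condition on non-emptiness
end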

31 Logic of Dempster's rule

Each observer has a private state space and assesses the posterior over it. Each private state can correspond to one or more global or common states (a multivalued mapping). The observers' state spaces are assumed independent.

32 Correspondence DS-structure -- set of probability distributions

For a pdf (bba) m over 2^Θ, consider all ways of reallocating the probability mass of non-singletons to their member atoms. This gives a convex set of probability distributions over Θ. Example, for Θ={A,B,C}:

bba:          A: 0.1    B: 0.3    C: 0.1    AB: 0.5
set of pdfs:  A: 0.1+0.5x    B: 0.3+0.5(1-x)    C: 0.1,    for all x in [0,1]

Can we regard any set of pdfs as a bba? The answer is NO: there are more convex sets of pdfs than DS-structures.
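A sketch (mine, using the same bitmask indexing as the dempster sketch above) computing the lower and upper probabilities Bel and Pl that bound this set for each singleton:

m = zeros(1,7);
m(1)=0.1; m(2)=0.3; m(4)=0.1; m(3)=0.5;      % A, B, C, AB
for x = [1 2 4]                              % singletons A, B, C
  Bel = 0; Pl = 0;
  for s = 1:7
    if bitand(s,x)==s, Bel = Bel + m(s); end % s contained in the singleton
    if bitand(s,x)~=0, Pl  = Pl  + m(s); end % s intersects the singleton
  end
  fprintf('state %d: P in [%.2f, %.2f]\n', x, Bel, Pl)
end

For A this prints [0.10, 0.60], matching 0.1+0.5x as x ranges over [0,1].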

33 Representing a probability set as a bba: 3-element universe

Rounding up: use the lower envelope. Rounding down: linear programming. Rounding is not unique! (figure: black = convex set, blue = rounded up, red = rounded down)

34 Another appealing conjecture

A precise pdf can be regarded as a (singleton) random set, and Bayesian combination of precise pdfs corresponds to random-set intersection (conditioned on non-emptiness). A DS-structure corresponds to a Choquet capacity (a set of pdfs). Is it then reasonable to combine Choquet capacities by non-empty random-set intersection, i.e., Dempster's rule? The answer is NO. Counterexample: Dempster's combination cannot be obtained by combining members of the prior and likelihood (Arnborg, JAIF vol 1, no 1, 2006).

35 Consistency of fusion operators

(figure: operands (evidence); robust fusion; Dempster's rule; modified Dempster's rule; rounded robust. Axes are the probabilities of A and B in a 3-element universe; P(C) = 1 - P(A) - P(B))

36 Deciding target type

Attack aircraft: small, dynamic. Bomber aircraft: large, dynamic. Civilian: large, slow dynamics. Prior: (0.5, 0.4, 0.1). Observer 1: probably small, likelihood (0.8, 0.1, 0.1). Observer 2: probably fast, likelihood (0.4, 0.4, 0.2).
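As a precise Bayesian baseline, the prior and the two likelihoods combine by elementwise multiplication and normalization - a quick check (sketch, mine):

prior = [0.5 0.4 0.1];        % attack, bomber, civilian
l1 = [0.8 0.1 0.1];           % observer 1: probably small
l2 = [0.4 0.4 0.2];           % observer 2: probably fast
post = prior.*l1.*l2;
post = post/sum(post)         % approx [0.90 0.09 0.01]: attack aircraft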


40 Estimators

(figure: pignistic, MaxEnt, and centre-of-enclosing-sphere estimates; 3 states, P(C) = 1 - P(A) - P(B))
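Of these, the pignistic estimate is defined directly on a bba: BetP(x) is the sum over sets S containing x of m(S)/|S|. A sketch (mine, bitmask indexing as above), applied to the slide-32 example bba:

m = zeros(1,7);
m(1)=0.1; m(2)=0.3; m(4)=0.1; m(3)=0.5;    % A, B, C, AB
BetP = zeros(1,3);
for s = 1:7
  members = find(bitget(s,1:3));           % atoms of subset s
  BetP(members) = BetP(members) + m(s)/numel(members);
end
BetP                                       % [0.35 0.55 0.10]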


48 What about Smets' TBM?

TBM combines the original Dempster's rule with the pignistic transformation. This is not compatible with precise Bayesian analysis. However, there is nothing against claiming TBM to be some kind of robust Bayesian scheme. The main problem: Dempster's rule and its motivation via multi-valued mappings go against the dominant argumentation used in introductions and tutorials, so TBM is incompatible with the capacity interpretation of DS structures.

