# Introduction to Evidential Reasoning Belief Functions.

## Presentation transcript

### Ignorance types

- **Incompleteness**
  - Combinatory: John is married, but his wife's name is not given.
  - Combinatory: All computer scientists like pizza, but their names are not available.
- **Imprecision**
  - Combinatory: John's wife is Jill or Joan.
  - Combinatory: Jill is not John's wife.
  - Interval theory: Paul's height is between 170 and 180.
  - Fuzzy sets: Paul is tall.
  - Possibility theory (physical form): the possibility for Paul's height to be about 175 cm.
- **Uncertainty**
  - Probability theory, upper-lower probabilities: the chance of it being "heads" when tossing a coin.
  - Possibility theory (epistemic form): the possibility that Paul's height is about 175 cm.
  - Subjective probabilities, belief functions (credibility): my degree of belief that cancer X is due to a virus.

### A murder case

Is the killer male? Let M = "the killer is a male".

- A somewhat reliable witness testifies that the killer is a male. The testimony reliability is 0.7.
- A priori, there is equal belief that the killer is a male or a female.

Classical probability analysis:

$$P(M) = P(M \mid \text{reliable})\,P(\text{reliable}) + P(M \mid \neg\text{reliable})\,P(\neg\text{reliable}) = 1 \times 0.7 + 0.5 \times 0.3 = 0.85$$

Of this 0.85, the part 0.7 is the justified component of the probability given to M (called the belief or the support), and the remaining 0.15 is the aleatory component of that probability.

Evidence theory (Dempster–Shafer theory): bel(M) = 0.7.
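The arithmetic of the murder case can be checked with a few lines of Python. This is only a sketch: the variable names are made up for illustration, and the plausibility line assumes the usual reading in which the remaining 0.3 of mass stays on the whole frame {male, female}, which the slide does not state explicitly.

```python
# Numbers taken from the slide; all names are illustrative.
reliability = 0.7            # P(witness is reliable)
p_male_if_reliable = 1.0     # a reliable witness is right
p_male_if_unreliable = 0.5   # otherwise fall back on the equal prior

# Classical (Bayesian) analysis: total probability over reliability.
p_male = (p_male_if_reliable * reliability
          + p_male_if_unreliable * (1 - reliability))

# Evidence-theory reading: only the justified part counts as belief.
bel_male = reliability
aleatory = p_male - bel_male

# Assumption: the unassigned 0.3 sits on the whole frame, so it is
# compatible with "male" and counts towards plausibility.
pl_male = bel_male + (1 - reliability)

print(round(p_male, 2), round(bel_male, 2), round(aleatory, 2), round(pl_male, 2))
```

The gap between bel(M) = 0.7 and P(M) = 0.85 is exactly the share that the Bayesian analysis obtains from the equi-probability assumption rather than from the testimony.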

### Bayesian approach

The Bayesian approach answers the question "what is the belief in A, as expressed by the unconditional probability that A is true given the evidence e?"

Assumption: precise probabilities can be assessed for all events. In practice this is too rarely the case.

- Why believe one hypothesis other than that provided by the evidence?
- The evidence has to be reorganised so that the probabilities sum to unity.

Pros:

- The rules of probability calculus are uncontroversial and give conclusions that are consistent with the probability assessments.
- Bayesian theory is easy to understand.

Cons:

- It is least suited to problems where there is partial or complete ignorance, or limited or conflicting information, because of the assumptions it makes (e.g. equi-probability).
- It cannot deal with imprecise, qualitative or natural-language judgements such as "if A then probably B".

### Dempster–Shafer approach

The Dempster–Shafer approach answers the question "what is the belief in A, as expressed by the probability that the proposition A is provable given the evidence?"

It is an alternative to traditional probability theory for the mathematical representation of uncertainty. Whereas a Bayesian approach assesses probabilities directly for the answer, the Dempster–Shafer approach assesses evidence for related questions. Its main features are:

- combination of evidence obtained from multiple sources, and modeling of the conflict between them;
- allocation of a probability mass to sets or intervals.

Pros:

- Ability to model various types of partial ignorance and limited or conflicting evidence.
- A more flexible model than Bayes' theorem.
- Computationally simpler than Bayes' theorem.
- No assumption is made about the probability of the individual constituents of the set or interval.
- Suitable for evaluating risk and reliability in engineering applications when a precise measurement cannot be obtained from experiments, or when knowledge is obtained from expert elicitation.

Cons:

- It can produce conclusions that are counter-intuitive.

Dempster–Shafer is most suited to situations where beliefs are numerically expressed and where there is some degree of ignorance, i.e. the model is incomplete.

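### Belief functions

Ω is the frame of discernment (the elements of the set Ω are called "worlds"). One of them is the "actual world" ω₀, but which one? An agent can only express the strength of his or her opinion (called the degree of belief) that the actual world belongs to this or that subset of Ω.

A Shafer belief function is a function bel : 2^Ω → [0, 1], where bel(A) denotes the strength of the agent's belief that ω₀ ∈ A. bel satisfies the following inequalities, for all n ≥ 1 and all A₁, …, Aₙ ⊆ Ω:

$$\mathrm{bel}(A_1 \cup A_2 \cup \dots \cup A_n) \ge \sum_i \mathrm{bel}(A_i) - \sum_{i>j} \mathrm{bel}(A_i \cap A_j) - \dots - (-1)^n\, \mathrm{bel}(A_1 \cap A_2 \cap \dots \cap A_n)$$

Other useful functions (in one-to-one correspondence with bel):

1. The basic belief assignment (bba) m : 2^Ω → [0, 1], defined as:

   $$m(A) = \sum_{\emptyset \ne B \subseteq A} (-1)^{|A| - |B|}\, \mathrm{bel}(B) \quad \text{for } A \subseteq \Omega,\ A \ne \emptyset, \qquad m(\emptyset) = 1 - \mathrm{bel}(\Omega)$$

   m(A) for A ⊆ Ω is called the basic belief mass (bbm) given to A. It may happen that m(∅) > 0. The relation from m to bel is given by:

   $$\mathrm{bel}(A) = \sum_{\emptyset \ne B \subseteq A} m(B)$$

2. The plausibility function pl : 2^Ω → [0, 1], defined as:

   $$\mathrm{pl}(A) = \mathrm{bel}(\Omega) - \mathrm{bel}(\bar{A}) = \sum_{B \cap A \ne \emptyset} m(B)$$

For Shafer, bel is "normalized", which corresponds to the closed-world assumption: bel(Ω) = 1, pl(Ω) = 1 and m(∅) = 0.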
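The relation between m, bel and pl can be made concrete with a small Python sketch. The frame {x, y, z} and the masses below are invented for illustration; subsets are represented as `frozenset` objects, and a non-zero mass on the empty set is allowed, as in the unnormalized case mentioned on the slide.

```python
from fractions import Fraction

def bel(m, a):
    """Belief: total mass of the non-empty subsets of a."""
    a = frozenset(a)
    return sum(v for b, v in m.items() if b and b <= a)

def pl(m, a):
    """Plausibility: total mass of the sets that intersect a."""
    a = frozenset(a)
    return sum(v for b, v in m.items() if b & a)

# Hypothetical basic belief assignment on the frame {x, y, z}.
omega = frozenset("xyz")
m = {
    frozenset(): Fraction(1, 10),      # m(empty set) > 0 (open world)
    frozenset("x"): Fraction(4, 10),
    frozenset("xy"): Fraction(3, 10),
    omega: Fraction(2, 10),
}

print("bel(x) =", bel(m, "x"), " pl(x) =", pl(m, "x"))
print("bel(y) =", bel(m, "y"), " pl(y) =", pl(m, "y"))
print("bel(Omega) =", bel(m, omega))

# pl(A) = bel(Omega) - bel(complement of A)
assert pl(m, "x") == bel(m, omega) - bel(m, omega - frozenset("x"))
```

Here bel({y}) is 0 even though the mass on {x, y} could end up supporting y: that mass is not specific to y, so it shows up only in pl({y}).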

### Entertained beliefs and beliefs in a decision context

Uncertainty induces beliefs, i.e. "graded dispositions that guide our behavior". The behavior of a "rational" agent is described within decision contexts:

"It has been argued that decisions are 'rational' only if we use a probability measure over the various possible states of the nature and compute with it the expected utility of each possible act, the optimal act being the one that maximizes these expected utilities (DeGroot, 1970; Savage, 1954)".

If beliefs can only be observed through our decisions, this leads to the use of probability functions to represent quantified beliefs.

There are two categories of beliefs:

- Entertained beliefs provide the quantified belief of the agent (based on justified evidence).
- Beliefs in a decision context provide a method for rational decision making (a probability function).

### Observations in belief functions

- The basic belief mass m(A) supports A without supporting any strictly more specific proposition.
- A basic belief mass given to a set A also supports that the actual world is in every subset of Ω that contains A.
- The degree of belief bel(A) for A ⊆ Ω quantifies the total amount of justified specific support given to A.
- We say "justified" because bel(A) includes only the basic belief masses given to subsets of A. The mass m({x,y}) given to {x,y} could support x if further information indicated this. However, given the available information, the basic belief mass can only be given to {x,y}.
- We say "specific" because the basic belief mass m(∅) is not included in bel(A), as it is given to the subset ∅.

### Dempster's rule of combination

Zadeh provides a compelling example of erroneous results: one patient with neurological symptoms, examined by two physicians.

- Doctor 1: meningitis 0.99, brain tumor 0.01
- Doctor 2: concussion 0.99, brain tumor 0.01

Using Dempster's rule, m(brain tumor) = Bel(brain tumor) = 1. The rule gives complete support to a diagnosis that both physicians consider very unlikely. This is the problem with Dempster's rule when the evidence is strongly conflicting.
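Zadeh's example can be reproduced with a short Python sketch of Dempster's rule: multiply the masses pairwise, keep the products whose focal sets intersect, and renormalize by one minus the conflict. The function name `dempster` and the dictionary layout are illustrative choices, not part of the presentation.

```python
from fractions import Fraction
from itertools import product

def dempster(m1, m2):
    """Combine two mass functions; return (combined masses, conflict K)."""
    joint = {}
    conflict = Fraction(0)
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        c = a & b
        if c:                          # non-empty intersection: agreement
            joint[c] = joint.get(c, 0) + x * y
        else:                          # empty intersection: conflict
            conflict += x * y
    if conflict == 1:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {c: v / (1 - conflict) for c, v in joint.items()}, conflict

doctor1 = {frozenset({"meningitis"}): Fraction(99, 100),
           frozenset({"brain tumor"}): Fraction(1, 100)}
doctor2 = {frozenset({"concussion"}): Fraction(99, 100),
           frozenset({"brain tumor"}): Fraction(1, 100)}

combined, conflict = dempster(doctor1, doctor2)
print("K =", conflict)                 # 9999/10000
for focal, mass in combined.items():
    print(sorted(focal), mass)         # ['brain tumor'] 1
```

Exact fractions are used so that the division by 1 − K = 0.0001 does not pick up floating-point noise.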

### Dempster's rule: a system failure example

A system has failed and two experts are consulted. The failure is caused by Component A, B or C.

- Expert 1: m1(A) = 0.99 (failure due to Component A), m1(B) = 0.01 (failure due to Component B)
- Expert 2: m2(B) = 0.01 (failure due to Component B), m2(C) = 0.99 (failure due to Component C)

Combination of the masses with Dempster's rule:

1. To calculate the combined basic probability assignment for a particular cell of the combination matrix, multiply the masses from the associated column and row.
2. Where the intersection is non-empty, the masses for a particular set from each source are multiplied, e.g. the unnormalized product for B is m1(B) m2(B) = (0.01)(0.01) = 0.0001.
3. Where the intersection is empty, this represents conflicting evidence and should be calculated as well. For the empty intersection of the sets A and C, associated with Expert 1 and Expert 2 respectively, the mass is m1(A) m2(C) = (0.99)(0.99) = 0.9801.
4. Then sum the masses for all sets and for the conflict.
5. The only non-zero value is for the combination of B, 0.0001. In this example only one intersection yields B, but in a more complicated example several intersections may yield B.
6. For K, three cells contribute to the conflict, represented by empty intersections: K = (0.99)(0.01) + (0.99)(0.01) + (0.99)(0.99) = 0.9999.
7. Calculate the joint mass by normalizing: m12(B) = (0.01)(0.01) / (1 − 0.9999) = 1.

So Bel(B) = 1. This is again the problem with Dempster's rule when the evidence is highly conflicting.
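The seven steps can be traced in Python by building the matrix of products explicitly. This sketch assumes that every hypothesis is a singleton (A, B or C), so two cells agree only when their row and column labels match; zero masses are added for the components an expert does not mention so that the matrix is square.

```python
from fractions import Fraction

# Masses from the example, with zero entries added for the
# components an expert did not mention.
m1 = {"A": Fraction(99, 100), "B": Fraction(1, 100), "C": Fraction(0)}
m2 = {"A": Fraction(0), "B": Fraction(1, 100), "C": Fraction(99, 100)}

# Step 1: one cell per (row, column) pair, holding the product.
cells = {(r, c): m1[r] * m2[c] for r in m1 for c in m2}

# Step 2: diagonal cells have a non-empty intersection.
agree = {h: cells[(h, h)] for h in m1}

# Steps 3 and 6: off-diagonal cells are empty intersections (conflict).
K = sum(v for (r, c), v in cells.items() if r != c)

# Step 7: normalize by 1 - K.
m12 = {h: v / (1 - K) for h, v in agree.items()}

for r in m1:
    print(r, [str(cells[(r, c)]) for c in m2])
print("K =", K, " m12 =", {h: str(v) for h, v in m12.items()})
```

Only three off-diagonal cells are non-zero, which matches the three terms in the sum for K.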

### Other combination rules

**Yager's rule.** It uses almost the same matrix as Dempster's rule. The exceptions are in the nomenclature and in the allocation of conflict:

1. Ground probability assignments (q) are used instead of basic probability assignments (m).
2. q(∅) is used instead of K (but q(∅) = K).

There is no normalization by the factor (1 − K). The effects are a significant reduction of the value of Belief, which sometimes gives counterintuitive results, and a large expansion of Plausibility.

**Inagaki's rule.** The matrix is the same as Dempster's, and ground probability functions are used as in Yager's rule. m12(B) depends on the value of k, which is now a parameter chosen experimentally or by expert expectation.

- When k = 0, the rule reduces to Yager's rule.
- When k = 1/(1 − q(∅)), the rule reduces to Dempster's rule.
- m12(B) ≤ 1, because the sum of all masses must be equal to 1.
- The value of k controls how strongly the evidence is filtered.
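The two special cases of Inagaki's parameter k can be checked numerically on the two-expert example. The sketch below assumes the commonly quoted form of Inagaki's unified rule, m(C) = [1 + k q(∅)] q(C) for C other than ∅ and Ω, and m(Ω) = [1 + k q(∅)] q(Ω) + [1 + k q(∅) − k] q(∅). That formula does not appear in the transcript, so treat it and the function names as assumptions.

```python
from fractions import Fraction
from itertools import product

OMEGA = frozenset("ABC")
EMPTY = frozenset()

def ground_assignment(m1, m2):
    """q(C): sum of products over all pairs whose intersection is C."""
    q = {}
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        c = a & b
        q[c] = q.get(c, 0) + x * y
    return q

def inagaki(q, k, omega=OMEGA):
    """Assumed form of Inagaki's unified rule with parameter k."""
    q_empty = q.get(EMPTY, 0)
    scale = 1 + k * q_empty
    m = {c: scale * v for c, v in q.items() if c and c != omega}
    m[omega] = scale * q.get(omega, 0) + (scale - k) * q_empty
    return m

m1 = {frozenset("A"): Fraction(99, 100), frozenset("B"): Fraction(1, 100)}
m2 = {frozenset("B"): Fraction(1, 100), frozenset("C"): Fraction(99, 100)}
q = ground_assignment(m1, m2)

yager_m = inagaki(q, 0)                         # k = 0: Yager's rule
dempster_m = inagaki(q, 1 / (1 - q[EMPTY]))     # k = 1/(1 - q(empty)): Dempster

print("Yager:   ", {"".join(sorted(c)): str(v) for c, v in yager_m.items()})
print("Dempster:", {"".join(sorted(c)): str(v) for c, v in dempster_m.items()})
```

With k = 0 the conflict mass 0.9999 moves to Ω, so Bel(B) drops to 0.0001 while Pl(B) rises to 1, which illustrates the reduction of Belief and expansion of Plausibility noted for Yager's rule.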

### Other combination rules (continued)

**Zhang's rule.** It uses a measure of intersection based on the cardinality of the sets. Problems with Zhang's measure of intersection:

1. It is equivalent to Dempster's rule when the cardinality is 1 for all relevant sets, or when |C| = |A||B|, in the circumstance of conflicting evidence. (This should not pose a problem if there is no significant conflict.)
2. If the cardinality of B is greater than 1, even completely overlapping sets will be scaled.

**Mixing.** The formulation for mixing in this case corresponds to the sum of m1(B)(1/2) and m2(B)(1/2):

- m12(A) = (1/2)(0.99) = 0.495
- m12(B) = (1/2)(0.01) + (1/2)(0.01) = 0.01
- m12(C) = (1/2)(0.99) = 0.495

**Dubois and Prade's disjunctive consensus pooling.** It is based on the unions of multiple sets.
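Mixing and disjunctive pooling are both easy to sketch in Python on the two-expert example. The equal weights follow the slide. The disjunctive rule is written in its usual form, where the product m1(A) m2(B) is assigned to the union A ∪ B. That formula is not spelled out in the transcript, so it is an assumption here, as are the function names.

```python
from fractions import Fraction
from itertools import product

def mixing(masses, weights=None):
    """Weighted average of mass functions (equal weights by default)."""
    n = len(masses)
    weights = weights or [Fraction(1, n)] * n
    out = {}
    for m, w in zip(masses, weights):
        for a, v in m.items():
            out[a] = out.get(a, 0) + w * v
    return out

def disjunctive(m1, m2):
    """Disjunctive pooling: products go to the union of the focal sets."""
    out = {}
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        c = a | b
        out[c] = out.get(c, 0) + x * y
    return out

m1 = {frozenset("A"): Fraction(99, 100), frozenset("B"): Fraction(1, 100)}
m2 = {frozenset("B"): Fraction(1, 100), frozenset("C"): Fraction(99, 100)}

mixed = mixing([m1, m2])
pooled = disjunctive(m1, m2)

print({"".join(sorted(c)): float(v) for c, v in mixed.items()})
print({"".join(sorted(c)): float(v) for c, v in pooled.items()})
```

Neither rule produces an empty intersection, so there is no conflict term and no normalization step. The price is that disjunctive pooling puts most of the mass (0.9801) on the less specific set {A, C}.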

Which model to use depends on the specific application