Lecture 3: Uncertainty management in rule-based expert systems
Slides are based on Negnevitsky, Pearson Education, 2005


Lecture 3: Uncertainty management in rule-based expert systems
- Introduction, or what is uncertainty?
- Basic probability theory
- Bayesian reasoning
- Bias of the Bayesian method
- Certainty factors theory and evidential reasoning
- Summary

Introduction, or what is uncertainty?
- Information can be incomplete, inconsistent, uncertain, or all three. In other words, information is often unsuitable for solving a problem.
- Uncertainty is defined as the lack of the exact knowledge that would enable us to reach a perfectly reliable conclusion. Classical logic permits only exact reasoning: it assumes that perfect knowledge always exists and that the law of the excluded middle can always be applied:
  IF A is true THEN A is not false
  IF A is false THEN A is not true

Variety of logics
- Propositional logics: exclusively concerned with the logic of sentential operators such as "and", "or", "not".
- Predicate logics: usually cover propositional logic plus the logic of quantifiers such as "all" and "some".
- Modal logics: usually propositional logics with added propositional operators concerning, for example, necessity, time, knowledge, belief, provability.
- Non-monotonic logics: logics (devised for AI) in which adding premises may diminish the set of derivable conclusions.

Sources of uncertain knowledge
- Weak implications. Domain experts and knowledge engineers have the painful task of establishing concrete correlations between the IF (condition) and THEN (action) parts of rules. Therefore, expert systems need the ability to handle vague associations, for example by accepting degrees of correlation as numerical certainty factors. For example:
  IF today is raining
  THEN tomorrow will rain {with probability p}

- Imprecise language. Our natural language is ambiguous and imprecise. We describe facts with such terms as often and sometimes, frequently and hardly ever. As a result, it can be difficult to express knowledge in the precise IF-THEN form of production rules. However, if the meaning of such terms is quantified, they can be used in expert systems.
- In 1944, Ray Simpson asked 355 high school and college students to place 20 terms like often on a scale between 1 and 100. In 1968, Milton Hakel repeated this experiment.

[Table: quantification of ambiguous and imprecise terms on a time-frequency scale (not reproduced)]

- Unknown data. When the data is incomplete or missing, the only solution is to accept the value "unknown" and proceed with approximate reasoning using this value.
- Combining the views of different experts. Large expert systems usually combine the knowledge and expertise of a number of experts. Experts often have contradictory opinions and produce conflicting rules. To resolve the conflict, the knowledge engineer has to attach a weight to each expert and then calculate the composite conclusion. However, no systematic method exists to obtain these weights.

Basic probability theory
- The concept of probability has a long history, going back thousands of years to when words like "probably", "likely", "maybe", "perhaps" and "possibly" were introduced into spoken languages. However, the mathematical theory of probability was formulated only in the 17th century.
- The probability of an event is the proportion of cases in which the event occurs. Probability can also be defined as a scientific measure of chance.

- Probability can be expressed mathematically as a numerical index ranging from zero (an absolute impossibility) to unity (an absolute certainty).
- Most events have a probability index strictly between 0 and 1, which means that each event has at least two possible outcomes: a favourable outcome, or success, and an unfavourable outcome, or failure.

- If s is the number of times success can occur, and f is the number of times failure can occur, then
  p = p(success) = s / (s + f)
  q = p(failure) = f / (s + f)
  and p + q = 1.
- If we throw a coin, the probability of getting a head is equal to the probability of getting a tail. In a single throw, s = f = 1, and therefore the probability of getting a head (or a tail) is 0.5.

Conditional probability
- Let A be an event in the world and B be another event. Suppose that events A and B are not mutually exclusive, but occur conditionally on the occurrence of the other. The probability that event A will occur if event B occurs is called the conditional probability. It is denoted mathematically as p(A|B), in which the vertical bar represents GIVEN and the complete probability expression is interpreted as "the conditional probability of event A occurring given that event B has occurred".

- The number of times A and B can occur, or the probability that both A and B will occur, is called the joint probability of A and B. It is represented mathematically as p(A ∩ B). The number of ways B can occur is the probability of B, p(B), and thus
  p(A|B) = p(A ∩ B) / p(B)
- Similarly, the conditional probability of event B occurring given that event A has occurred equals
  p(B|A) = p(B ∩ A) / p(A)

Hence,
  p(A ∩ B) = p(A|B) × p(B)  and  p(B ∩ A) = p(B|A) × p(A)
Since p(A ∩ B) = p(B ∩ A), substituting the second equation into the first yields the Bayesian rule:
  p(A|B) = p(B|A) × p(A) / p(B)

Bayesian rule
  p(A|B) = p(B|A) × p(A) / p(B)
where:
  p(A|B) is the conditional probability that event A occurs given that event B has occurred;
  p(B|A) is the conditional probability of event B occurring given that event A has occurred;
  p(A) is the probability of event A occurring;
  p(B) is the probability of event B occurring.

The joint probability
If event A can occur only together with one of n mutually exclusive and exhaustive events B1, B2, ..., Bn, then
  p(A) = Σ(i=1..n) p(A ∩ Bi) = Σ(i=1..n) p(A|Bi) × p(Bi)

If the occurrence of event A depends on only two mutually exclusive events, B and NOT B, we obtain:
  p(A) = p(A|B) × p(B) + p(A|¬B) × p(¬B)
where ¬ is the logical function NOT. Similarly,
  p(B) = p(B|A) × p(A) + p(B|¬A) × p(¬A)
Substituting this equation into the Bayesian rule yields:
  p(A|B) = p(B|A) × p(A) / [p(B|A) × p(A) + p(B|¬A) × p(¬A)]

Bayesian reasoning
Suppose all rules in the knowledge base are represented in the following form:
  IF H is true
  THEN E is true {with probability p}
This rule implies that if event H occurs, then the probability that event E will occur is p. In expert systems, H usually represents a hypothesis and E denotes evidence to support this hypothesis.

The Bayesian rule expressed in terms of hypotheses and evidence looks like this:
  p(H|E) = p(E|H) × p(H) / [p(E|H) × p(H) + p(E|¬H) × p(¬H)]
where:
  p(H) is the prior probability of hypothesis H being true;
  p(E|H) is the probability that hypothesis H being true will result in evidence E;
  p(¬H) is the prior probability of hypothesis H being false;
  p(E|¬H) is the probability of finding evidence E even when hypothesis H is false.
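As a quick illustration of this formula, here is a minimal Python sketch (the function name and example numbers are mine, not from the source) that computes the posterior from the three expert-supplied quantities:

```python
def posterior(p_e_given_h, p_e_given_not_h, p_h):
    """p(H|E) computed by Bayes' rule, with p(E) expanded
    over the two mutually exclusive cases H and NOT H."""
    p_not_h = 1.0 - p_h
    p_e = p_e_given_h * p_h + p_e_given_not_h * p_not_h  # total probability of E
    return p_e_given_h * p_h / p_e

# A fairly diagnostic piece of evidence raises a 0.3 prior to about 0.66:
print(posterior(0.9, 0.2, 0.3))  # 0.27 / (0.27 + 0.14) = 0.659
```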

- In expert systems, the probabilities required to solve a problem are provided by experts. An expert determines the prior probabilities for possible hypotheses, p(H) and p(¬H), and also the conditional probabilities for observing evidence E if hypothesis H is true, p(E|H), and if hypothesis H is false, p(E|¬H).
- Users provide information about the evidence observed, and the expert system computes p(H|E) for hypothesis H in light of the user-supplied evidence E. The probability p(H|E) is called the posterior probability of hypothesis H upon observing evidence E.

- We can take into account both multiple hypotheses H1, H2, ..., Hm and multiple evidences E1, E2, ..., En. The hypotheses as well as the evidences must be mutually exclusive and exhaustive.
- Single evidence E and multiple hypotheses:
  p(Hi|E) = p(E|Hi) × p(Hi) / Σ(k=1..m) [p(E|Hk) × p(Hk)]
- Multiple evidences and multiple hypotheses:
  p(Hi|E1 E2 ... En) = p(E1 E2 ... En|Hi) × p(Hi) / Σ(k=1..m) [p(E1 E2 ... En|Hk) × p(Hk)]

- The latter equation requires us to obtain the conditional probabilities of all possible combinations of evidences for all hypotheses, which places an enormous burden on the expert.
- Therefore, in expert systems, conditional independence among different evidences is assumed. Thus, instead of the unworkable equation, we obtain:
  p(Hi|E1 E2 ... En) = p(E1|Hi) × p(E2|Hi) × ... × p(En|Hi) × p(Hi) / Σ(k=1..m) [p(E1|Hk) × p(E2|Hk) × ... × p(En|Hk) × p(Hk)]
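Under this independence assumption, ranking hypotheses reduces to multiplying the per-evidence likelihoods into each prior and normalising. A sketch of that computation in Python (structure and names are mine):

```python
def rank_hypotheses(priors, likelihoods, observed):
    """Posteriors over mutually exclusive, exhaustive hypotheses given
    conditionally independent evidences.

    priors[i]         -- p(Hi)
    likelihoods[i][k] -- p(Ek | Hi)
    observed          -- zero-based indices of the evidences seen so far
    """
    scores = []
    for p_h, like in zip(priors, likelihoods):
        s = p_h
        for k in observed:
            s *= like[k]              # multiply in p(Ek | Hi)
        scores.append(s)
    total = sum(scores)               # normalising denominator
    return [s / total for s in scores]
```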

Ranking potentially true hypotheses
Let us consider a simple example. Suppose an expert, given three conditionally independent evidences E1, E2 and E3, creates three mutually exclusive and exhaustive hypotheses H1, H2 and H3, and provides prior probabilities for these hypotheses: p(H1), p(H2) and p(H3), respectively. The expert also determines the conditional probabilities of observing each evidence under every possible hypothesis.

[Table: the prior and conditional probabilities (not reproduced)]
Assume that we first observe evidence E3. The expert system computes the posterior probabilities for all hypotheses as
  p(Hi|E3) = p(E3|Hi) × p(Hi) / Σ(k=1..3) [p(E3|Hk) × p(Hk)],  i = 1, 2, 3

After evidence E3 is observed, belief in hypothesis H2 increases and becomes equal to belief in hypothesis H1. Belief in hypothesis H3 also increases and nearly reaches the beliefs in hypotheses H1 and H2.

Suppose now that we also observe evidence E1. The posterior probabilities are recalculated as
  p(Hi|E1 E3) = p(E1|Hi) × p(E3|Hi) × p(Hi) / Σ(k=1..3) [p(E1|Hk) × p(E3|Hk) × p(Hk)],  i = 1, 2, 3
Hypothesis H2 has now become the most likely one.

After observing evidence E2 as well, the final posterior probabilities for all hypotheses are calculated in the same way. Although the initial ranking was H1, H2 and H3, only hypotheses H1 and H3 remain under consideration after all evidences (E1, E2 and E3) have been observed; the numbers are reproduced in the sketch below.
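The walkthrough above can be reproduced with the rank_hypotheses sketch from earlier. The priors and conditional probabilities below are illustrative values that match the ranking behaviour described in the text; they are my reconstruction of the missing example table, so treat them as assumptions rather than the source's data:

```python
priors = [0.40, 0.35, 0.25]   # p(H1), p(H2), p(H3) -- assumed illustrative values
likelihoods = [               # rows H1..H3; columns p(E1|Hi), p(E2|Hi), p(E3|Hi)
    [0.3, 0.9, 0.6],
    [0.8, 0.0, 0.7],
    [0.5, 0.7, 0.9],
]

print(rank_hypotheses(priors, likelihoods, [2]))        # E3 only:      [0.338, 0.345, 0.317]
print(rank_hypotheses(priors, likelihoods, [2, 0]))     # E3 then E1:   [0.189, 0.515, 0.296]
print(rank_hypotheses(priors, likelihoods, [2, 0, 1]))  # all evidence: [0.451, 0.000, 0.549]
```

With these values, H1 and H2 indeed tie after E3, H2 leads after E1, and H2 is eliminated once E2 is observed (because p(E2|H2) = 0), leaving only H1 and H3.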

A second example (1)
- Suppose there is a school having 60% boys and 40% girls as students. The female students wear trousers or skirts in equal numbers; the boys all wear trousers. An observer sees a (random) student from a distance; all the observer can see is that this student is wearing trousers. What is the probability that this student is a girl?

A second example (2)
- The event A is that the student observed is a girl, and the event B is that the student observed is wearing trousers.
- To compute P(A|B), we first need to know:
  – P(A), the probability that the student is a girl regardless of any other information. Since the observer sees a random student, meaning that all students have the same probability of being observed, and the fraction of girls among the students is 40%, this probability equals 0.4.

A second example (3)
  – P(B|A), the probability of the student wearing trousers given that the student is a girl. As girls are as likely to wear skirts as trousers, this is 0.5.
  – P(B), the probability of a (randomly selected) student wearing trousers regardless of any other information. Since P(B) = P(B|A)P(A) + P(B|A')P(A'), this is 0.5 × 0.4 + 1.0 × 0.6 = 0.8.

A second example (4)
- Given all this information, the probability of the observer having spotted a girl, given that the observed student is wearing trousers, can be computed by substituting these values into the formula:
  P(A|B) = P(B|A) × P(A) / P(B) = 0.5 × 0.4 / 0.8 = 0.25
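The same arithmetic as a quick check in Python (numbers taken directly from the example above):

```python
p_a, p_b_given_a = 0.4, 0.5      # p(girl), p(trousers | girl)
p_b = 0.5 * 0.4 + 1.0 * 0.6      # p(trousers) = 0.8
print(p_b_given_a * p_a / p_b)   # 0.25
```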

Forecast: Bayesian accumulation of evidence
- Based on London weather for March 1982.
- The data gives the minimum and maximum temperatures, rainfall and sunshine for each day.
  – If rainfall is zero, it is a dry day.
- The expert system should display two possible outcomes (i.e. hypotheses H), tomorrow is rain and tomorrow is dry, with their respective likelihoods.

- H: rain tomorrow.
- LS = p(E|H) / p(E|¬H) represents a measure of the expert's belief in hypothesis H if evidence E is present. It is called the likelihood of sufficiency.
  – In this case, LS is the probability of getting rain today if we have rain tomorrow, divided by the probability of getting rain today if there is no rain tomorrow.

- LN = p(¬E|H) / p(¬E|¬H) represents a measure of discredit to hypothesis H if evidence E is missing. It is called the likelihood of necessity.
  – In this case, LN is the probability of not getting rain today if we have rain tomorrow, divided by the probability of not getting rain today if there is no rain tomorrow.

- The expert must provide both LN and LS independently.
- High values of LS (LS >> 1) indicate that the rule strongly supports the hypothesis if the evidence is observed.
- Low values of LN (LN << 1) suggest that the rule strongly opposes the hypothesis if the evidence is missing.

- Rule 1: IF today is rain {LS 2.5, LN 0.6}
          THEN tomorrow is rain {prior 0.5}
- Rule 2: IF today is dry {LS 1.6, LN 0.4}
          THEN tomorrow is dry {prior 0.5}

- Rule 1 tells us that if it is raining today, there is a high probability of rain tomorrow (LS = 2.5). But even if there is no rain today, i.e. today is dry, there is still some chance of having rain tomorrow (LN = 0.6).

- Prior odds:
  O(H) = p(H) / (1 - p(H))
- To obtain the posterior odds, the prior odds are updated by LS if the evidence is true and by LN if it is false:
  O(H|E) = LS × O(H)
  O(H|¬E) = LN × O(H)

- The posterior odds are converted back into probabilities:
  p(H|E) = O(H|E) / (1 + O(H|E))
  p(H|¬E) = O(H|¬E) / (1 + O(H|¬E))
- Now suppose that today is rain. Rule 1 fires, and the prior probability of "tomorrow is rain" is converted into the prior odds:
  O(tomorrow is rain) = 0.5 / (1 - 0.5) = 1
  – The evidence "today is rain" is true, therefore:
    O(tomorrow is rain | today is rain) = LS × O(H) = 2.5 × 1 = 2.5
    p(tomorrow is rain | today is rain) = 2.5 / (1 + 2.5) = 0.71
  – The evidence has increased the probability from 0.5 to 0.71.

- Rule 2 also fires, and the prior probability of "tomorrow is dry" is converted into the prior odds:
  O(tomorrow is dry) = 0.5 / (1 - 0.5) = 1
  – The evidence "today is rain" makes Rule 2's antecedent false (which reduces the odds), therefore:
    O(tomorrow is dry | today is rain) = LN × O(H) = 0.4 × 1 = 0.4
    p(tomorrow is dry | today is rain) = 0.4 / (1 + 0.4) = 0.29
  – The probability of "tomorrow is dry" diminishes from 0.5 to 0.29 because of the evidence "today is rain".

- Now suppose instead that today is dry.
- By following the same procedure, we find that:
  – there is a 62% chance of it being dry tomorrow, and
  – a 38% chance of it raining tomorrow.
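The whole odds-updating procedure fits in a few lines of Python. The self-contained sketch below (helper names are mine) reproduces all four figures: 0.71 and 0.29 when today is rain, 0.62 and 0.38 when today is dry.

```python
def to_odds(p):
    return p / (1.0 - p)

def to_prob(odds):
    return odds / (1.0 + odds)

def update(prior, ls, ln, evidence_true):
    """Posterior probability: prior odds scaled by LS when the rule's
    evidence is observed, by LN when it is absent."""
    factor = ls if evidence_true else ln
    return to_prob(factor * to_odds(prior))

# Today is rain: Rule 1's evidence is true, Rule 2's is false.
print(update(0.5, 2.5, 0.6, True))   # p(tomorrow is rain) = 0.71
print(update(0.5, 1.6, 0.4, False))  # p(tomorrow is dry)  = 0.29
# Today is dry: Rule 2's evidence is true, Rule 1's is false.
print(update(0.5, 1.6, 0.4, True))   # p(tomorrow is dry)  = 0.62
print(update(0.5, 2.5, 0.6, False))  # p(tomorrow is rain) = 0.38
```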

Bias of the Bayesian method
- The framework for Bayesian reasoning requires probability values as primary inputs. The assessment of these values usually involves human judgement. However, psychological research shows that humans cannot reliably elicit probability values consistent with the Bayesian rules.
- This suggests that the conditional probabilities may be inconsistent with the prior probabilities given by the expert.

- Consider, for example, a car that does not start and makes odd noises when you press the starter. The conditional probability of the starter being faulty if the car makes odd noises may be expressed as:
  IF the symptom is "odd noises"
  THEN the starter is bad {with probability 0.7}
- The conditional probability that the starter is not bad is then:
  p(starter is not bad | odd noises) = p(starter is good | odd noises) = 1 - 0.7 = 0.3

- Therefore, we can obtain a companion rule that states:
  IF the symptom is "odd noises"
  THEN the starter is good {with probability 0.3}
- Domain experts do not deal with conditional probabilities and often deny the very existence of the hidden implicit probability (0.3 in our example).
- We would also use available statistical information and empirical studies to derive the following rules:
  IF the starter is bad
  THEN the symptom is "odd noises" {probability 0.85}
  IF the starter is bad
  THEN the symptom is not "odd noises" {probability 0.15}

- To use the Bayesian rule, we still need the prior probability, the probability that the starter is bad if the car does not start. Suppose the expert supplies a value of 5 per cent. Now we can apply the Bayesian rule to obtain:
  p(starter is bad | odd noises) = 0.85 × 0.05 / (0.85 × 0.05 + 0.15 × 0.95) = 0.23
- The number obtained is significantly lower than the expert's estimate of 0.7 given at the beginning of this section.
- The reason for the inconsistency is that the expert made different assumptions when assessing the conditional and prior probabilities.
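The inconsistency is easy to verify numerically. A small sketch; note it takes 0.15 as p("odd noises" | starter is good), which is what the 0.23 figure above implies:

```python
p_noises_given_bad  = 0.85  # p(odd noises | starter is bad)
p_noises_given_good = 0.15  # p(odd noises | starter is good) -- assumed, see lead-in
p_bad = 0.05                # expert's prior for a bad starter

p_noises = p_noises_given_bad * p_bad + p_noises_given_good * (1 - p_bad)
print(p_noises_given_bad * p_bad / p_noises)  # 0.23, far below the expert's 0.7
```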

Certainty factors theory and evidential reasoning
- Certainty factors theory is a popular alternative to Bayesian reasoning.
- A certainty factor (cf) is a number used to measure the expert's belief. The maximum value of the certainty factor is, say, +1.0 (definitely true) and the minimum is -1.0 (definitely false). For example, if the expert states that some evidence is almost certainly true, a cf value of 0.8 would be assigned to this evidence.

[Table: uncertain terms and their interpretation in MYCIN (not reproduced)]

- In expert systems with certainty factors, the knowledge base consists of a set of rules that have the following syntax:
  IF <evidence>
  THEN <hypothesis> {cf}
where cf represents belief in hypothesis H given that evidence E has occurred.

- The certainty factors theory is based on two functions: the measure of belief MB(H,E) and the measure of disbelief MD(H,E):
  MB(H,E) = 1 if p(H) = 1; otherwise [max(p(H|E), p(H)) - p(H)] / (1 - p(H))
  MD(H,E) = 1 if p(H) = 0; otherwise [min(p(H|E), p(H)) - p(H)] / (0 - p(H))
where:
  p(H) is the prior probability of hypothesis H being true;
  p(H|E) is the probability that hypothesis H is true given evidence E.

- The values of MB(H,E) and MD(H,E) range between 0 and 1. The strength of belief or disbelief in hypothesis H depends on the kind of evidence E observed. Some facts may increase the strength of belief, while others increase the strength of disbelief.
- The total strength of belief or disbelief in a hypothesis is:
  cf = [MB(H,E) - MD(H,E)] / (1 - min[MB(H,E), MD(H,E)])
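Translating these definitions directly into Python gives a compact sketch (the edge-case handling reflects my reading of the standard MYCIN-style formulas reconstructed above):

```python
def mb(p_h, p_h_given_e):
    """Measure of belief MB(H, E)."""
    if p_h == 1.0:
        return 1.0
    return (max(p_h_given_e, p_h) - p_h) / (1.0 - p_h)

def md(p_h, p_h_given_e):
    """Measure of disbelief MD(H, E)."""
    if p_h == 0.0:
        return 1.0
    return (min(p_h_given_e, p_h) - p_h) / (0.0 - p_h)

def total_cf(p_h, p_h_given_e):
    """Total strength of belief or disbelief in H given E."""
    b, d = mb(p_h, p_h_given_e), md(p_h, p_h_given_e)
    return (b - d) / (1.0 - min(b, d))

print(total_cf(0.4, 0.8))  # evidence raises belief:   0.67
print(total_cf(0.4, 0.1))  # evidence lowers belief:  -0.75
```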

- Example: Consider a simple rule:
  IF A is X
  THEN B is Y
An expert may not be absolutely certain that this rule holds. Suppose it has also been observed that in some cases, even when the IF part of the rule is satisfied and object A takes on value X, object B can acquire some different value Z:
  IF A is X
  THEN B is Y {cf 0.7};
       B is Z {cf 0.2}

- The certainty factor assigned by a rule is propagated through the reasoning chain. This involves establishing the net certainty of the rule consequent when the evidence in the rule antecedent is uncertain:
  cf(H,E) = cf(E) × cf
For example, given
  IF sky is clear
  THEN the forecast is sunny {cf 0.8}
and a current certainty factor of sky is clear of 0.5, then
  cf(H,E) = 0.5 × 0.8 = 0.4
This result can be interpreted as "It may be sunny".

- For conjunctive rules such as
  IF E1 AND E2 AND ... AND En
  THEN H {cf}
the certainty of hypothesis H is established as follows:
  cf(H, E1 ∩ E2 ∩ ... ∩ En) = min[cf(E1), cf(E2), ..., cf(En)] × cf
For example, given
  IF sky is clear
  AND the forecast is sunny
  THEN the action is 'wear sunglasses' {cf 0.8}
with the certainty of sky is clear at 0.9 and the certainty of the forecast of sunny at 0.7, then
  cf(H, E1 ∩ E2) = min[0.9, 0.7] × 0.8 = 0.7 × 0.8 = 0.56

- For disjunctive rules such as
  IF E1 OR E2 OR ... OR En
  THEN H {cf}
the certainty of hypothesis H is established as follows:
  cf(H, E1 ∪ E2 ∪ ... ∪ En) = max[cf(E1), cf(E2), ..., cf(En)] × cf
For example, given
  IF sky is overcast
  OR the forecast is rain
  THEN the action is 'take an umbrella' {cf 0.9}
with the certainty of sky is overcast at 0.6 and the certainty of the forecast of rain at 0.8, then
  cf(H, E1 ∪ E2) = max[0.6, 0.8] × 0.9 = 0.8 × 0.9 = 0.72
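Both propagation rules, plus the single-antecedent case, can be captured in one small helper (a sketch; the function name is mine):

```python
def propagate(cf_rule, cf_evidences, mode="and"):
    """Net certainty of a rule consequent from uncertain antecedents:
    min over the antecedent cfs for conjunctions, max for disjunctions,
    scaled by the rule's own certainty factor."""
    pick = min if mode == "and" else max
    return pick(cf_evidences) * cf_rule

print(propagate(0.8, [0.5]))              # single antecedent: 0.4 ("may be sunny")
print(propagate(0.8, [0.9, 0.7], "and"))  # min(0.9, 0.7) * 0.8 = 0.56
print(propagate(0.9, [0.6, 0.8], "or"))   # max(0.6, 0.8) * 0.9 = 0.72
```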

- When the same consequent is obtained as a result of the execution of two or more rules, the individual certainty factors of these rules must be merged to give a combined certainty factor for the hypothesis. Suppose the knowledge base consists of the following rules:
  Rule 1: IF A is X THEN C is Z {cf 0.8}
  Rule 2: IF B is Y THEN C is Z {cf 0.6}
What certainty should be assigned to object C having value Z if both Rule 1 and Rule 2 are fired?

Common sense suggests that, if we have two pieces of evidence (A is X and B is Y) from different sources (Rule 1 and Rule 2) supporting the same hypothesis (C is Z), then the confidence in this hypothesis should increase and become stronger than if only one piece of evidence had been obtained.

To calculate a combined certainty factor we can use the following equation:
  cf(cf1, cf2) = cf1 + cf2 × (1 - cf1)                  if cf1 >= 0 and cf2 >= 0
  cf(cf1, cf2) = (cf1 + cf2) / (1 - min[|cf1|, |cf2|])  if cf1 and cf2 have opposite signs
  cf(cf1, cf2) = cf1 + cf2 × (1 + cf1)                  if cf1 < 0 and cf2 < 0
where:
  cf1 is the confidence in hypothesis H established by Rule 1;
  cf2 is the confidence in hypothesis H established by Rule 2;
  |cf1| and |cf2| are the absolute magnitudes of cf1 and cf2, respectively.
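A direct implementation of this combination rule, checked against the Rule 1/Rule 2 example (a sketch; the function name is mine):

```python
def combine(cf1, cf2):
    """Combine the certainty factors of two rules that support
    (or oppose) the same hypothesis."""
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))

# Rule 1 {cf 0.8} and Rule 2 {cf 0.6} both conclude "C is Z":
print(combine(0.8, 0.6))  # 0.8 + 0.6 * (1 - 0.8) = 0.92, stronger than either alone
```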

The certainty factors theory provides a practical alternative to Bayesian reasoning. The heuristic manner of combining certainty factors differs from the manner in which they would be combined if they were probabilities. Certainty theory is not "mathematically pure", but it does mimic the thinking process of a human expert.

Comparison of Bayesian reasoning and certainty factors
- Probability theory is the oldest and best-established technique for dealing with inexact knowledge and random data. It works well in areas such as forecasting and planning, where statistical data is usually available and accurate probability statements can be made.

- However, in many areas of possible application of expert systems, reliable statistical information is not available, or we cannot assume the conditional independence of evidence. As a result, many researchers have found the Bayesian method unsuitable for their work. This dissatisfaction motivated the development of the certainty factors theory.
- Although the certainty factors approach lacks the mathematical correctness of probability theory, it appears to outperform subjective Bayesian reasoning in areas such as diagnostics, particularly in medicine.

- Certainty factors are used in cases where the probabilities are not known or are too difficult or expensive to obtain. The evidential reasoning mechanism can manage incrementally acquired evidence, the conjunction and disjunction of hypotheses, as well as evidence with different degrees of belief.
- The certainty factors approach also provides better explanations of the control flow through a rule-based expert system.

- The Bayesian method is likely to be the most appropriate if reliable statistical data exists, the knowledge engineer is able to lead, and the expert is available for serious decision-analytical conversations.
- In the absence of any of these conditions, the Bayesian approach might be too arbitrary and even biased to produce meaningful results.
- Bayesian belief propagation is of exponential complexity, and is thus impractical for large knowledge bases.