# STOCHASTIC METHODS Misbah Arif, Kayras, Abid ARTIFICIAL INTELLIGENCE

## Presentation on theme: "STOCHASTIC METHODS Misbah Arif, Kayras, Abid ARTIFICIAL INTELLIGENCE"— Presentation transcript:

STOCHASTIC METHODS Misbah Arif, Kayras, Abid ARTIFICIAL INTELLIGENCE
5.0 Introduction 5.1 The Elements of Counting 5.2 Elements of Probability Theory 5.3 Applications of the Stochastic Methodology 5.4 Bayes’ Theorem Presented By: Misbah Arif, Kayras, Abid George F Luger ARTIFICIAL INTELLIGENCE Structures and Strategies for Complex Problem Solving Luger: Artificial Intelligence, 5th edition. © Pearson Education Limited, 2005

5.0. Introduction Stochastic Methods- Next problem solving methodology after heuristics "Stochastic" means being or having a random variable or “pertaining to chance” or probabilistic reasoning. Definition “A stochastic process is one whose behavior is non-deterministic and is determined both by the process's predictable actions and by a random element (non predicatble actions)” By definition “A stochastic process is one whose behavior is non-deterministic and is determined both by the process's predictable actions and by a random element (non predicatble actions). Stochastic process are complex systems whose practitioners, even if experts, acknowledge that outcomes result from both known and unknown causes.” Example of a Teacher - An instructor may apply methodology to teaching which when practiced is the same for all students in the course; however, all students absorb knowledge & information in different ways and thus a strict pattern of teaching can have different effects on the overall outcome of each student.

Applications of Stochastic Methodology
Diagnostic Reasoning Decides symptoms and their causes Example: Fever caused either by flu or an infection Natural Language Understanding Supports understanding of language Planning and Scheduling During planning, no deterministic sequence of operation is guaranteed to succeed. To solve problems, as in simulated annealing, stochastic neural networks, stochastic optimization, and genetic algorithms Applications of Stochastic Methodology Diagnostic Reasoning (in which cause/ effect relationships are not always captured in a purely deterministic fashion). A Diagnostic

5.1. Elements of Counting 5.1.1. Addition Rules
|A|= no of elements in set A U= Universal Sets A’ =U-A A ⊆ B (A is the subset of B) AUB (A union B) A B (A intersection B) |AUC |= |A|+|C|- | A C | |AUBUC| = ?

Elements of Counting--- cont
Multiplication Rules – If we have two sets of elements A and B of size a and b respectively, then there are a*b unique ways of combining the elements of sets together Cartesian Product |A*B|=|A|*|B| Permutations and Combinations P (n, r) = n! / (n − r)! C (n, r) = P (n, r)/ r! = n! / (n − r)! r!

5.2. Elements of Probability Theory

Probability Probability
the Likelihood of occurrence of a specified event, often represented as a number between 0 (never) and 1 (always) a mathematic ratio of the number of times something will occur to the total number of possible outcomes P(A) =   The Number Of Ways Event A Can Occur The total number Of Possible Outcomes

Example of a Spinner.. Problem 1:   A spinner has 4 equal sectors colored yellow, blue, green and red. What are the chances of landing on blue after spinning the spinner? Outcomes:  The possible outcomes of this experiment are yellow, blue, green, and red. Probabilities:  P(blue)  =  # of ways to land on blue total # of colors

Can You Solve this?? Problem 2:
A single 6-sided die is rolled. What is the probability of each outcome? What is the probability of rolling an even number? of rolling an odd number???

Important Terms Event - An outcome to a random occurrence. For example, the following are events: drawing a black card from a 52-card deck, a coin landing on heads, rolling an even number on a 6-sided die. Probability, P(A) - The likelihood of event A occurring. For example, the probability of a coin landing on heads is 0.5 since there is a 50% chance (i.e., probability) of the coin turning up heads when flipped. 10

Example Problem 3:   What is the probability that a 7 or an 11 is the result of the roll of two fair dice ? Solution:   1. First determine the Sample Space and Event 2. Determine the atomic event i.e. combinations of two dice that can give 7 3. Determine their probability 3. Determine the probability of rolling 11 and then both 7 and 11

So……. Atomic event: (1,6), (2,5),(3,4),(4,3),(5,2),(6,1) Event: 7
Sample Space: 36 (6 *6) Probability : 6/36= 0.16

Axioms The probability of any event E from the sample space S is:
The sum of the probabilities of all possible outcomes is 1 The probability of the compliment of an event is Intersection of Events, P(A∩B) - When two events are fulfilled simultaneously. For example, the intersection of the events "rolling an even number" and "rolling a number less than three" is rolling a 2 since rolling a 2 fulfills both events (i.e., 2 is both even and less than 3). Union of Events, P(A∪B) - When either of two events is fulfilled. For example, the union of the events "rolling an even number" and "rolling a number less than three" is rolling a 2, 4, 6 (an even number) or rolling a 1, 2 (a number less than three). The union of events includes not only the probability of A and B, but also the probability of only A and the probability of only B. P(A∪B) = P(A) + P(B) - P(A∩B) The last term is important since it removed items that appear in both A and B, thereby avoiding double counting. Consider the following example, meant to clarify the importance of the - P(A∩B) term in the formula for P(A∪B). When rolling a die, what is the probability of rolling a number greater than 1 or a number that is even? The presence of the word "or" is a clue that the question is asking about the union of events. Let Event A = rolling a number greater than 1 Let Event B = rolling a number that is even P(A) = 5/6 since there are five possibilities and six total numbers that could be rolled P(B) = 3/6 since there are three possibilities (i.e., 2, 4, 6) and six total numbers that could be rolled (i.e., 1, 2, 3, 4, 5, 6) P(A∪B) ≠ 5/6 + 3/6 since: (1) this number is greater than one, which is impossible for a probability (2) this number double counts the even numbers greater than 1 (i.e., 2, 4, 6) P(A∩B) = rolling a number greater than 1 and even = 3/6 since there are three possibilities (i.e., 2, 4, 6) and six total numbers that could be rolled (i.e., 1, 2, 3, 4, 5, 6) P(A∪B) = 5/6 + 3/6 - 3/6 = 5/6The union of events includes not only the probability of A and B, but also the probability of only A and the probability of only B. When rolling a die, what is the probability of rolling a number greater than 1 or a number that is even? The presence of the word "or" is a clue that the question is asking about the union of events. Let Event A = rolling a number greater than 1 Let Event B = rolling a number that is even P(A) = 5/6 since there are five possibilities and six total numbers that could be rolled P(B) = 3/6 since there are three possibilities (i.e., 2, 4, 6) and six total numbers that could be rolled (i.e., 1, 2, 3, 4, 5, 6) P(A∪B) ≠ 5/6 + 3/6 since: (1) this number is greater than one, which is impossible for a probability (2) this number double counts the even numbers greater than 1 (i.e., 2, 4, 6) P(A∩B) = rolling a number greater than 1 and even = 3/6 since there are three possibilities (i.e., 2, 4, 6) and six total numbers that could be rolled (i.e., 1, 2, 3, 4, 5, 6) P(A∪B) = 5/6 + 3/6 - 3/6 = 5/6 The probability of the contradictory or false outcome of an event 13

Explanation with Example
P(Blue) ¼= 0.25 P(Red) P(Yellow) P(Green) P (S) 1

Overview of Previous Lecture
Stochastic Process- non deterministic process Stochastic Process are related to probabilistic reasoning Probability- Likelihood of occurrence of a specified event P(A) =   The Number Of Ways Event A Can Occur The total number Of Possible Outcomes Sample Space- Set of all possible outcomes Sum of Probabilities of all possible outcomes = 1

Today’s Lecture Multiple Events
Mutually Exclusive and Non-Mutually Exclusive Events Independent and Dependent Events Types of Probability Unconditional or Prior Probability Conditional or Posterior Probability Bayes’ Thoerom

Probability of Multiple Events
For multiple events, the answer combines the probabilities for each event in 2 different ways If two events have to occur together, generally an "and" is used. Statement 1: "I will only be happy today if I get and win the lottery." The "and" means that both events are expected to happen together. If both events do not necessarily have to occur together, an "or" may be used as in Statement 2, "I will be happy today if I win the lottery or have ." For questions involving single events, the formula for simple probability is sufficient. For questions involving multiple events, the answer combines the probabilities for each event in ways that may seem counter-intuitive. The following strategy is excellent for acquiring a better feel for probability questions involving multiple events or for making a quick guess if time is short. We will focus on questions involving two events. If two events have to occur together, generally an "and" is used. Take a look at Statement 1: "I will only be happy today if I get and win the lottery." The "and" means that both events are expected to happen together. If both events do not necessarily have to occur together, an "or" may be used as in Statement 2, "I will be happy today if I win the lottery or have ." Consider Statement 1. Your chances of getting may be relatively high compared to your chances of winning the lottery, but if you expect both to happen, your chances of being happy are slim. Like placing all your bets at a race on one horse, you've decreased your options, and therefore you've decreased your chances. The odds are better if you have more options, say if you choose horse 1 or horse 2 or horse 3 to win. In Statement 2, we have more options; in order to be happy we can either win the lottery or get . Source:

Multiple Events- cont.. These two types of probability are formulated as follows: Probability of A and B P(A and B) = P(A) × P(B) {probability is smaller than the individual probabilities of either A or B} Probability of A or B P(A or B) = P(A) + P(B) – P(A and B) {probability is greater than the individual probabilities of either A or B } These two types of probability are formulated as follows: Probability of A and B P(A and B) = P(A) × P(B). In other words, the probability of A and B both occurring is the product of the probability of A and the probability of B. Probability of A or B P(A or B) = P(A) + P(B). In other words, the probability of A or B occurring is the sum of the probability of A and the probability of B (this assumes A + B cannot both occur). If there is a probabiilty of A and/or B occuring, then you must subtract the overlap. The issue here is that if a question states that event A and event B must occur, you should expect that the probability is smaller than the individual probabilities of either A or B. If the question states that event A or event B must occur, you should expect that the probability is greater than the individual probabilities of either A or B. This is an excellent strategy for eliminating certain answer choices.

Mutually Exclusive Events
Two or more events are said to be mutually exclusive if the occurrence of any one of them means the others will not occur (That is, we cannot have 2 events occurring at the same time) For example, throwing a die once can yield a 5 or 6, but not both, in the same toss. Thus if E1 and E2 are mutually exclusive events, then P(E1 and E2) = 0. Example Male students and Female Students If E1 and E2 are mutually exclusive events: P(E1 or E2) = P(E1) + P(E2)

Non-Mutually Exclusive
Two or more events are said to be non- mutually exclusive if the occurrence of any one of them means the others will also occur (That is, we can have 2 events occurring at the same time) An example for non-mutually exclusive events could be: E1 = students in the swimming team E2 = students in the debating team If E1 and E2 are not mutually exclusive events: P(E1 or E2) = P(E1) + P(E2) − P(E1 and E2) Or P(E1 U E2) = P(E1) U P(E2) − P(E1 ∩ E2)

Example Problem 1: It is known that the probability of obtaining zero defectives in a sample of 40 items is 0.34 whilst the probability of obtaining 1 defective item in the sample is What is the probability of obtaining not more than 1 defective item in a sample? Problem 2: The probability that a student passes Mathematics is 2/3 and the probability that he passes English is 4/9 . If the probability that he will pass at least one subject is 4/5 , What is the probability that he will pass both subjects? It is known that the probability of obtaining zero defectives in a sample of 40 items is 0.34 whilst the probability of obtaining 1 defective item in the sample is What is the probability of (a) obtaining not more than 1 defective item in a sample? Answer (a) Mutually exclusive, so P(E1 or E2) = P(E1) + P(E2) = = 0.8 It is possible for a student to either: Pass math only Pass English only Pass both math and English So we conclude that these are not mutually exclusive events. We have: P(E1 or E2) = P(E1) + P(E2) − P(E1 and E2) So

Independent Events Two events are independent if the occurrence of one event does not affect the probability of the occurrence of the other. For example, the probability of flipping a coin twice and the coin landing on heads the second time is not affected by (i.e. is independent of) whether the first coin flip turned up heads or tails. P(A and B) = P(A) × P(B) Independent Events - Two events are independent if the occurrence of one event does not affect the probability of the occurrence of the other. For example, the probability of flipping a coin twice and the coin landing on heads the second time is not affected by (i.e., is independent of) whether the first coin flip turned up heads or tails. An independent event is the opposite of a dependent event. 23

What is the difference between Independent and Mutually Exclusive Events?
Mutually Exclusive- If A occurs, B cannot occur Independent Events- Outcome of A doesn't affect outcome of B If result of P(A) * P (B) = 0, then A and B are mutually exclusive and independent events

Dependent events If the outcome of one event affects the outcome of another, then the events are said to be Dependent Events. Real-world Connections for Dependent Events If we wake up late, we will be late to office. If it rains, we use an umbrella. For dependent events P(A and B) = P(A) × P(B\A) Trick is to figure out ahead of time if the events are independent or dependent, and then use the formula Now for dependent events. A dependent event is one where the outcome of the second event is influenced by the outcome of the first event. For example, let's say we have a box with 6 marbles: 3 red, 1 blue, 1 green and 1 yellow. What's the probability of picking a yellow marble? We know that probablity is 1/6. What's the probability of picking a blue marble? Can it be 1/6 also? Well, it could be if we put back the first marble we picked. But if we don't put back the first marble, our sample space will have changed. We started with six marbles, picked one, and now we only have five marbles in the sample space, so the probability of picking a blue marble is now 1/5. And in such a case we have dependent events, because something about the first one changed the second one. The probability of two dependent events occurring, one right after the other, is still found by using the same formula: P(A,B) = P(A) * P(B) The big difference is that the individual probabilities won't have the same sample spaces. So from our example, what is the probability of picking a yellow marble and then a blue marble, without putting the first marble back? P(Yellow) = 1/6 P(Blue) = 1/5 P(Y,B) = 1/6 * 1/5 = 1/30 This is a very different number from what we would get if the events were independent, that is if the sample space remained the same because we put the first marble we picked back into the box. Then: P(Yellow) = 1/6 P(Blue) = 1/6 P(Y,B) = 1/6 * 1/6 = 1/36 So the trick is to figure out ahead of time if the events are independent or dependent, and then use the formula: P(A,B) = P(A) * P(B) I hope this answers your question. Please write back if you need more help. 25

Can You Tell?? Choices: 2 2 and 3 1 and 3 4 3 and 4
Which of the following are dependent events Getting an even number in the first roll of a number cube and getting an even number in the second roll Getting an odd number on the number cube and spinning blue color on the spinner. Fair die is tossed twice. Find the probability of getting a 4 or 5 on the first toss and a 1, 2, or 3 in the second toss Getting a face card in the first draw from a deck of playing cards and getting a face card in the second draw. (The first card is not replaced.) Choices: 2 2 and 3 1 and 3 4 3 and 4

Step 1: In (1), rolling a number cube two times are two independent events. Step 2: In (2), rolling an odd number and spinning blue color are two independent events. Step 3: In (3), getting 1, 2, or 3 doesn’t depends on 4 or 5, so they are independent events Step 4: In (4), since the first card is not replaced back, the probability of the second draw depends on the first draw. Step 5: So, the two events in (4) are dependent events.

Example Problem 4:   For combination of 3 bits, determine whether event containing odd number of 1s is independent of and event of bit strings ends at 0? Solution:   1. Calculate atomic events of Event containing odd number of 1s and Atomic events of Event of bit strings ends at 0 2. Calculate atomic events of Event containing both odd number of 1s and bit strings ends at 0 See buk page 173 28

Example Problem 5 A box contains 3 white marbles and 4 black marbles. What is the probability of picking 2 black marbles and 1 white marble in succession without replacement? Solution p (A)= Probability of picking 1st black marble: 4/7 p (B\A) = On the second draw the probability of picking 2nd black marble: 3/6= ½ p (C \ B and A) = On the third draw the probability of picking a white marble is: 3/5 Therefore, the probability of drawing 2 black marbles and 1 white marble is: p (A) * p (B\A) * p (C\B and A) = 4/7 * 1/2 * 3/5 = 6/35

Types of Probability Prior Probability or Unconditional Probability
Probabilities that worked out prior to have any new information about the expected outcome of events in a particular situation. Prior probability of a person having a disease is the number of people with the disease divided by the total number of people in the domain Symbolized by: p (event)

Types of Probability- cont..
Posterior or Conditional Probability Probability of an event given some new evidence or information on that event If E1 and E2 are two events, the probability that E2 occurs given that E1 has occurred is called the conditional probability and is denoted by P(E2|E1). where, P(E2|E1) of E2 given that E1 has occurred.

Example of Conditional Probability
Let A denote the event `student is female' and let B denote the event `student is French'. In a class of 100 students suppose 60 are French, and suppose that 10 of the French students are females. Find the probability that if I pick a French student, it will be a girl, that is, find P(A|B). Since 10 out of 100 students are both French and female, then P(A and B) = 10/100 Also, 60 out of the 100 students are French, so P(B) = 60/100 So the required probability is:

5.3. Applications of Stochastic Methodology

Probabilistic Finite State Machine
In this section, examples are there to use probability measures to reason about the interpretation of ambiguous information. Probabilistic Finite State Machine Probabilistic Finite State Machine is a Finite State Machine where the next state function is a probability distribution over the full set of states of the machine. Probabilistic Finite State Acceptor Probabilistic Finite State Machine is an acceptor, when one or more states are indicated as the start states or one or more as the accept states. Addition of non determinism Next state function is a probability distribution over al possible next states

Example – Pronunciation of Tomato

Example- Hidden Markov Models
The Hidden Markov Model is a finite set of states, each of which is associated with a (generally multidimensional) probability distribution. Transitions among the states are governed by a set of probabilities called transition probabilities. In a particular state an outcome or observation can be generated, according to the associated probability distribution. It is only the outcome, not the state visible to an external observer and therefore states are ``hidden'' to the outside; hence the name Hidden Markov Model. In a hidden Markov model, the state is not directly visible, but output dependent on the state is visible. Each state has a probability distribution over the possible output tokens.

5.4. Bayes’ Theorem

Definitions Prior Probability Posterior Probability Likelihood
Unconditional probabilities of our hypothesis before we get any data or any NEW evidence. Simply speaking, it is the state of our knowledge before the data is observed Posterior Probability A conditional probability about our hypothesis (our state of knowledge) based on the new data Likelihood The conditional probability based on our observation data given that our hypothesis holds Prior probability: Unconditional probabilities of our hypothesis before we get any data or any NEW evidence. Simply speaking, it is the state of our knowledge before the data is observed. posterior probability: A conditional probability about our hypothesis (our state of knowledge) after we revised based on the new data. Likelihood is the conditional probability based on our observation data given that our hypothesis holds. 38

Bayes’ Theorem Bayes’ Solution to a problem of “inverse probability”
The Theorem – Relates cause & effect in such a way that by understanding the effect we can learn the probability of its causes. Importance – Important for determining the causes of diseases such as cancer. Useful for determining the effects of some particular medication on that disease Thomas Bayes ( )

Understanding Bayes’ Theorem
Marie is getting married tomorrow, at an outdoor ceremony in the desert. In recent years, it has rained only 5 days each year. Unfortunately, the weatherman has predicted rain for tomorrow. When it actually rains, the weatherman correctly forecasts rain 90% of the time. When it doesn't rain, he incorrectly forecasts rain 10% of the time. What is the probability that it will rain on the day of Marie's wedding?

Solution: The sample space is defined by two mutually-exclusive events - it rains or it does not rain. Additionally, a third event occurs when the weatherman predicts rain. Notation for these events appears below: Event A1. It rains on Marie's wedding. Event A2. It does not rain on Marie's wedding Event B. The weatherman predicts rain.

In Probabilistic Terms …
We know the following: P( A1 ) = 5/365 = [It rains 5 days out of the year.] P( A2 ) = 360/365 = [It does not rain 360 days out of the year.] P( B | A1 ) = 0.9 [When it rains, the weatherman predicts rain 90% of the time.] P( B | A2 ) = 0.1 [When it does not rain, the weatherman predicts rain 10% of the time.]

Cont.. We want to know P( A1 | B ), the probability it will rain on the day of Marie's wedding, given a forecast for rain by the weatherman. The answer can be determined from Bayes' theorem, as shown below.

Note the somewhat unintuitive result
Note the somewhat unintuitive result. Even when the weatherman predicts rain, it only rains only about 11% of the time. Despite the weatherman's gloomy prediction, there is a good chance that Marie will not get rained on at her wedding.

The general form of Bayes’ theorem where we assume the set of hypotheses H partition the evidence set E: Luger: Artificial Intelligence, 5th edition. © Pearson Education Limited, 2005 22 47

Application of Bayes Theorem in AI

Basic rules Conditional probability: P(A|B)= P(AB) / P(B) if P(B)≠0 Product rule: P(AB) = P(A|B) P(B) Bayes’ Rule: P(A|B)= P(B|A)P(A) / P(B) Why important?? Assesing diagnostic probability from causal probability: P(Cause|Effect)=P(Effect|Cause)P(Cause)/P(Effect) E.g., if M is meningitis, S is a stiff neck, then we can compute: P(M|S)=P(S|M)P(M)/P(S)=0,8*0,0001/0,1=0,0008 (posterior probability of meningitis is still very small!) Makes connection between diagnostic and causal probability. 49

Non deterministic Algorithm
In computer science , a nondeterministic algorithm is an algorithm with one or more choice points where multiple different continuations are possible, without any specification of which one will be taken. Use In algorithm design, nondeterministic algorithms are often used as specifications. This is natural when the problem solved by the algorithm inherently allows multiple outcomes, or when there is a single outcome but there are multiple ways to get there and we simply don't care which of them is chosen. What these cases have in common is that the nondeterministic algorithm always arrives at a valid solution, no matter which choices are made at the choice points encountered underway. 50

Example Example 1: Shopping list
Consider a shopping list: a list of items to buy. It can be interpreted in two ways: The instruction to buy all of those items, in any order. This is a (nondeterministic algorithm) The instruction to buy all of those items, in the order given. This is a deterministic algorithm. 51

Monte Carlo Methods Monte Carlo methods are a class of computational algorithms that rely on repeated random sampling to compute their results. Monte Carlo methods are often used when simulating physical and mathematical systems. Because of their reliance on repeated computation of random or pseudo-random numbers, Monte Carlo methods are most suited to calculation by a computer. Monte Carlo methods tend to be used when it is unfeasible or impossible to compute an exact result with a deterministic algorithm. 52

Monte Carlo Methods – cont..
More broadly, Monte Carlo methods are useful for modeling phenomena with significant uncertainty in inputs, such as the calculation of risk in business. It is a widely successful method in risk analysis when compared with alternative methods or human intuition. When Monte Carlo simulations have been applied in space exploration and oil exploration, actual observations of failures, cost overruns and schedule overruns are routinely better predicted by the simulations than by human intuition or alternative "soft" methods 53