
Slide 1: Outline
– Introduction to Bayesian theory
  – Bayesian probability
  – Bayes’ rule
  – Bayesian inference
  – Historical note
– Coin trials example
– Bayes’ rule and inference
  – Elementary definition: Mahadevan, Probability, Reliability, and Statistical Methods in Engineering Design, Wiley, 2000.
  – Modern definition: Gelman, Andrew, et al., Bayesian Data Analysis, CRC Press, 2003.
– Example: inference about a genetic probability

Slide 2: Bayesian Probability
What is Bayesian probability?
– Classical (frequentist) versus Bayesian (alternative) definition:
  – Classical: the relative frequency of an event, given many repeated trials.
  – Bayesian: the degree of belief that the event is true, based on the evidence at hand.
– Saturn mass estimation:
  – Classical approach: the mass is fixed but unknown. Would many trials always yield a single value? This view is not well suited to a quantity we cannot repeatedly sample.
  – Bayesian approach: the mass is described probabilistically based on the observations, which represents our degree of belief.
Reference: Sivia, D. S., Data Analysis: A Bayesian Tutorial, 1996.

Slide 3: Bayes’ Rule
What is Bayes’ rule?
– A probability distribution (degree of belief) is obtained by combining prior knowledge (subjective) with observed data (objective):

    posterior distribution ∝ likelihood function (of the observed data) × prior distribution

Features
– An integrated framework for uncertainty: aleatory uncertainty (classical statistics) and epistemic uncertainty (insufficient data, vague knowledge).
– Bayesian updating: as more observed data are added, the posterior PDF serves as the prior for the next update, and our estimates become more confident. A small sketch of this loop follows.
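A minimal sketch of this updating loop in Python, assuming a Beta–Binomial (coin-flip) model; the conjugate Beta prior and the two data batches are illustrative choices, not taken from the slide:

    # Beta-Binomial updating: with a Beta(a, b) prior and h heads in
    # n tosses, the posterior is Beta(a + h, b + n - h).
    def update(a, b, heads, tosses):
        """Return updated Beta(a, b) parameters after observing new tosses."""
        return a + heads, b + (tosses - heads)

    # Start from a flat prior Beta(1, 1); the posterior from the first
    # batch of data becomes the prior for the second batch.
    a, b = 1.0, 1.0
    a, b = update(a, b, heads=4, tosses=5)
    a, b = update(a, b, heads=78, tosses=100)
    print("posterior mean =", a / (a + b))   # ~0.776, pulled toward the data

Because the model is conjugate, each update is just parameter arithmetic; the same prior-to-posterior loop applies with any likelihood.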

Slide 4: Bayesian Inference
Deductive vs. inductive process
– Deductive process (forward): deduce the outcome from a hypothesis. If A, then B. Example: given μ = 10, σ = 2, generate data x with X ~ N(μ, σ).
– Inductive process (reverse): infer the hypothesis (cause) from the outcome (observation). B and C are observed; thus A is supported. Example: observe data x and estimate μ, using a prior for μ if available.
– In the Bayesian approach, the inductive problem is solved directly by Bayes’ rule, as in the sketch below.
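A minimal grid-approximation sketch of the inductive step, assuming σ = 2 is known and using hypothetical data and a flat prior for μ (the slide gives no data values):

    import numpy as np

    sigma = 2.0
    x = np.array([9.1, 11.3, 10.4, 8.7, 10.9])   # hypothetical observations

    mu_grid = np.linspace(5.0, 15.0, 1001)        # candidate values of mu
    d_mu = mu_grid[1] - mu_grid[0]
    prior = np.ones_like(mu_grid)                 # flat prior: no information

    # Likelihood of the whole sample for each candidate mu (product over
    # data points; the 1/(sigma*sqrt(2*pi)) constant cancels on normalizing).
    like = np.prod(np.exp(-0.5 * ((x[:, None] - mu_grid) / sigma) ** 2), axis=0)

    post = prior * like
    post /= post.sum() * d_mu                     # normalize to a proper PDF
    print("posterior mean of mu =", np.sum(mu_grid * post) * d_mu)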

Slide 5: Historical Note
Birth of Bayesian theory
– Thomas Bayes (a clergyman) proposed Bayes’ theorem (published 1763): the parameter θ of a binomial distribution is estimated using observed data. Laplace rediscovered it, attached his own name (1812), and generalized it to many problems.
– For more than 100 years, the Bayesian "degree of belief" was rejected as vague and subjective; the objective "frequency" view was the accepted basis of statistics.
– Jeffreys (1939) rediscovered the approach and built the modern theory (1961). Until the 1980s it remained largely a theory because of its computational requirements.
Flourishing of Bayesian methods
– From about 1990, thanks to rapid growth in hardware and software, the approach became practical.
– Bayesian techniques for practical use were developed by mathematicians and statisticians and applied to areas of science (economics, medicine) and engineering.

Slide 6: Coin Trials Example
Problem
– For a weighted (uneven) coin, the probability θ of landing front side up is to be determined from experiments. This θ is the parameter to be estimated.
– Assume the true θ is 0.78, the value that would be obtained after infinitely many trials. We do not know this; we can only infer it from experiments.
Bayesian parameter estimation
– Experiment data: x fronts out of n trials.
  – 4 out of 5 trials
  – 78 out of 100 trials
– Prior knowledge p0(θ):
  1. No prior information
  2. Normal distribution centered at 0.5 with σ = 0.05
  3. Uniform distribution on [0.5, 0.7]

Slide 7: Coin Trials Example
Estimated results
– The following are the degrees of belief in the "probability of front side θ," represented as posterior PDFs.
– Case 1 (no prior): 4 out of 5 trials; 78 out of 100 trials.
– Case 2 (normal prior N(0.5, 0.05)): 4 out of 5 trials; 78 out of 100 trials. Due to the wrong prior, only a great number of trials will approach the true answer.
– Case 3 (uniform prior on [0.5, 0.7]): 4 out of 5 trials; 78 out of 100 trials. The posterior cannot exceed the barrier at 0.7 because of the incorrect prior.
– A sketch that reproduces these three cases follows.
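A numerical sketch of the three cases using a grid approximation; the binomial likelihood and the three priors follow the slides, while the grid resolution is an implementation choice:

    import numpy as np
    from scipy.stats import binom, norm

    theta = np.linspace(0.0, 1.0, 2001)
    d_theta = theta[1] - theta[0]

    priors = {
        "1: no prior (flat)":       np.ones_like(theta),
        "2: normal N(0.5, 0.05)":   norm.pdf(theta, loc=0.5, scale=0.05),
        "3: uniform on [0.5, 0.7]": ((theta >= 0.5) & (theta <= 0.7)).astype(float),
    }

    for x, n in [(4, 5), (78, 100)]:
        like = binom.pmf(x, n, theta)        # likelihood of x fronts in n trials
        for name, p0 in priors.items():
            post = p0 * like
            post /= post.sum() * d_theta     # normalized posterior PDF of theta
            mean = np.sum(theta * post) * d_theta
            print(f"{x}/{n} trials, prior {name}: posterior mean = {mean:.3f}")

Running this shows the behavior described above: case 2 is dragged toward 0.5 by the wrong prior, and case 3 stays pinned below 0.7.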

Slide 8: Coin Trials Example
Posterior prediction (Monte Carlo simulation)
– Once we have the posterior PDF of θ, we can predict future trials with it.
– Estimation process: say we have obtained the posterior PDF of θ from 78 fronts out of 100 trials.
– Posterior prediction process: what is the predicted probability of getting all 5 fronts in 5 tries?
  – Draw 10,000 random samples of θ from the posterior PDF.
  – For each sampled θ, compute p = binom(5, 5, θ).
  – Result over the 10,000 samples of predicted p: median 0.282, 5% CI 0.172, 95% CI 0.416.
– If we knew the exact value θ = 0.78, then binom(5, 5, 0.78) = 0.78^5 = 0.289.
– A Monte Carlo sketch of this process follows.
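A minimal sketch of the Monte Carlo prediction, assuming the posterior of θ after 78 fronts in 100 trials is Beta(79, 23), i.e., a flat prior updated by the data (consistent with case 1 above):

    import numpy as np

    rng = np.random.default_rng(0)
    theta = rng.beta(79, 23, size=10_000)   # 10,000 posterior samples of theta
    p = theta ** 5                          # binom(5, 5, theta) = theta^5

    print("median =", np.median(p))
    print("5% / 95% =", np.percentile(p, [5, 95]))
    print("exact theta = 0.78 would give", 0.78 ** 5)   # ~0.289

The median and interval printed here land close to the slide's values (0.282, 0.172, 0.416), with small differences attributable to sampling noise and the assumed flat prior.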

Slide 9: Bayes Rule and Inference
Axioms of probability
– P(E): probability of an event E. Consider two events E1 and E2.
– When the occurrence of one event precludes the other, the events are mutually exclusive. For mutually exclusive events:
    P(E1 ∪ E2) = P(E1) + P(E2)
– When the joint occurrence E1 ∩ E2 = E1E2 exists:
    P(E1 ∪ E2) = P(E1) + P(E2) − P(E1E2)
Conditional probability
– When a joint occurrence exists:
    P(E1 | E2) = P(E1E2) / P(E2)
– The symbol | denotes conditional probability: the probability of one event given that the other has occurred.
– For statistically independent events:
    P(E1E2) = P(E1) P(E2), so P(E1 | E2) = P(E1)
– If the events are mutually exclusive, there is no joint occurrence: P(E1E2) = 0. A small numerical check of these rules follows.
Reference: Mahadevan, Probability, Reliability, and Statistical Methods in Engineering Design, Wiley, 2000.
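A small numerical check of these rules on a hypothetical experiment (one fair die; the events are illustrative, not from the slide):

    from fractions import Fraction

    omega = range(1, 7)                      # sample space of one die roll
    E1 = {s for s in omega if s % 2 == 0}    # even: {2, 4, 6}
    E2 = {s for s in omega if s >= 5}        # at least 5: {5, 6}

    def P(event):
        return Fraction(len(event), 6)

    # Union rule with a joint occurrence E1E2 = {6} (| and & below are
    # Python's set union and intersection):
    assert P(E1 | E2) == P(E1) + P(E2) - P(E1 & E2)

    # Conditional probability P(E1|E2) = P(E1E2) / P(E2):
    print("P(E1|E2) =", P(E1 & E2) / P(E2))   # 1/2 = P(E1): independent here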

Slide 10: Bayes Rule and Inference
Example 1
– E1: 80; E2: ≥ 90. P(E1 ∩ E2) = ? P(E1 ∪ E2) = ? Are E1 and E2 mutually exclusive?
– E1: 80 or 100; E2: ≥ 90. P(E1 ∩ E2) = ? P(E1 ∪ E2) = ? Are E1 and E2 mutually exclusive?

Slide 11: Bayes Rule and Inference
Example 2
– P(F ∪ E) = ?
– Are F and E mutually exclusive? Are F and E statistically independent?
Example 3
– P(F ∪ S) = ?
– Can we draw a Venn diagram?
– P(F ∪ S) when F and S are independent?
– Which probability is higher, i.e., more conservative?

Slide 12: Bayes Rule and Inference
Total probability (forward problem)
– Consider damage D caused by any of 3 events: fire (F), wind (W), earthquake (E).
– Given P(F), P(W), P(E), the probability of D is
    P(D) = P(D|F) P(F) + P(D|W) P(W) + P(D|E) P(E)
  where P(D|Ei) is the likelihood of D given Ei.
Bayes rule (inverse problem)
– Given that D occurred, the probability that F was the cause is
    P(F|D) = P(D|F) P(F) / P(D)
– This is evident from P(F|D) P(D) = P(D|F) P(F) = P(FD).
– A numerical sketch of both problems follows.
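A numerical sketch of the forward and inverse problems; all probability values below are hypothetical, since the slide gives none:

    # Hypothetical annual probabilities of each cause and of damage
    # given each cause.
    P_cause = {"F": 0.01, "W": 0.05, "E": 0.002}    # P(F), P(W), P(E)
    P_D_given = {"F": 0.30, "W": 0.10, "E": 0.50}   # likelihoods P(D|Ei)

    # Forward problem: total probability of damage.
    P_D = sum(P_D_given[c] * P_cause[c] for c in P_cause)
    print("P(D) =", P_D)                            # 0.009

    # Inverse problem: probability that fire was the cause, given damage.
    P_F_given_D = P_D_given["F"] * P_cause["F"] / P_D
    print("P(F|D) =", P_F_given_D)                  # 1/3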

Slide 13: Bayes Rule and Inference
Example 4
– P(D) = ?
– P(F|D) = ?

Slide 14: Bayes Rule and Inference
Bayes rule for modern applications
– The original rule is modified to
    p(θ|y) = p(y|θ) p(θ) / p(y)
– In Gelman, p(y) = Σθ p(y|θ) p(θ), often called the marginal distribution of y. It represents the sum over all possible values of θ (an integral when θ is continuous).
– Interpretations
  – The prior probability p(θ) is our state of knowledge about the truth of the hypothesis before we experience the data.
  – This belief is modified by the likelihood p(y|θ) of the experienced observations.
  – The posterior probability p(θ|y) is then obtained, representing the updated state of knowledge about the truth of the hypothesis.
  – In this sense, Bayes’ theorem describes a process of learning. A generic discrete implementation follows.
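A generic sketch of this discrete form; the helper function and its example numbers are illustrative, not from Gelman:

    def bayes_update(prior, likelihood):
        """prior: {theta: p(theta)}; likelihood: {theta: p(y|theta)}.
        Returns the posterior {theta: p(theta|y)}."""
        marginal = sum(prior[t] * likelihood[t] for t in prior)   # p(y)
        return {t: prior[t] * likelihood[t] / marginal for t in prior}

    # Usage: two hypotheses with equal prior; the data are twice as
    # likely under "A", so its posterior doubles relative to "B".
    post = bayes_update({"A": 0.5, "B": 0.5}, {"A": 0.2, "B": 0.1})
    print(post)   # {'A': 2/3, 'B': 1/3}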

Slide 15: Bayes Rule and Inference
Posterior prediction
– Make an inference about an unknown observable. Bayes’ rule makes an inference about θ; posterior prediction makes an inference about another observation, denoted ỹ, conditional on the observed data y:
    p(ỹ|y) = ∫ p(ỹ|θ) p(θ|y) dθ
  (a sum over θ when θ is discrete).
– Interpretations
  – Posterior, because it is conditional on the observed y.
  – Predictive, because it is a prediction for a new observation ỹ.

Slide 16: Bayes Rule and Inference
Example 5
– P(F|D), P(W|D), P(E|D) without knowing P(D)?
– Hint: P(Ei|D) ∝ P(D|Ei) P(Ei), and the posteriors over all causes must sum to 1, so P(D) can be recovered as the normalizing constant.

Slide 17: 1.4 Example: Inference about a Genetic Probability
Problem statement
– Let Xg be the normal gene and Xb the affected gene. A woman has the disease if she carries two Xb's; a man has the disease if he carries one Xb. We wish to estimate the probability that an outwardly normal woman carries an Xb.
Observations
– The woman's father is normal: Xg Y. Her brother has the disease: Xb Y.
– The woman's mother is normal, but must be Xg Xb because of the brother.
– The woman has two sons, y1 and y2, neither of whom is affected. Her husband is normal: Xg Y.
Prior knowledge
– Regarding the woman's Xb, we have no prior knowledge beyond the family history.
Bayesian inference
– Unknown parameter and observations:
  – Introduce θ for the woman's gene status: θ = 0 if X = Xg, θ = 1 if X = Xb.
  – Introduce y for a son's outcome: y = 0 if normal, y = 1 if diseased.
  – The observations are y1 = 0 and y2 = 0.

Slide 18: 1.4 Example: Inference about a Genetic Probability
Formulation
– Prior distribution: from the parents, P(θ = 1) = P(θ = 0) = 1/2.
– Likelihood from the observations:
    P(y1 = 0, y2 = 0 | θ = 1) = (0.5)(0.5) = 0.25
    P(y1 = 0, y2 = 0 | θ = 0) = (1)(1) = 1
– Posterior distribution:
    P(θ = 1 | y) = (0.25)(0.5) / [(0.25)(0.5) + (1)(0.5)] = 0.125 / 0.625 = 0.2
Estimation result
– The probability that she does not have the bad gene is P(θ = 0 | y) = 0.8, i.e., 80%.
– A short computational sketch follows.
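A short sketch of this computation, reusing the prior and likelihood above:

    prior = {1: 0.5, 0: 0.5}            # theta = 1: carrier; theta = 0: not
    # Each unaffected son has probability 0.5 if she is a carrier, 1 otherwise;
    # the two sons are independent given theta.
    likelihood = {1: 0.5 * 0.5, 0: 1.0 * 1.0}   # y1 = y2 = 0

    marginal = sum(prior[t] * likelihood[t] for t in prior)
    posterior = {t: prior[t] * likelihood[t] / marginal for t in prior}
    print(posterior)   # {1: 0.2, 0: 0.8} -> 80% she does not carry Xb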

Slide 19: 1.4 Example: Inference about a Genetic Probability
Posterior prediction
– What is the probability of disease if she has a third son?
    P(y3 = 1 | y) = P(y3 = 1 | θ = 1) P(θ = 1 | y) + P(y3 = 1 | θ = 0) P(θ = 0 | y)
                  = (0.5)(0.2) + (0)(0.8) = 0.10
Bayesian updating by adding more data
– The third son is found to be normal. How is θ updated?
– In this case, the previous posterior becomes the prior, and the observation is the single outcome y3 = 0:
    P(θ = 1 | y1, y2, y3) = (0.5)(0.2) / [(0.5)(0.2) + (1)(0.8)] = 0.1 / 0.9 ≈ 0.111
– A sketch of both steps follows.
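A sketch of both steps, continuing from the slide-18 numbers:

    posterior = {1: 0.2, 0: 0.8}             # after two unaffected sons

    # Posterior prediction: probability the third son is diseased.
    p_disease = {1: 0.5, 0: 0.0}             # P(y3 = 1 | theta)
    p_y3 = sum(p_disease[t] * posterior[t] for t in posterior)
    print("P(third son diseased) =", p_y3)   # 0.10

    # Updating: the old posterior becomes the prior; observe y3 = 0.
    like = {1: 0.5, 0: 1.0}                  # P(y3 = 0 | theta)
    marginal = sum(posterior[t] * like[t] for t in posterior)
    updated = {t: posterior[t] * like[t] / marginal for t in posterior}
    print("P(theta = 1 | three normal sons) =", updated[1])   # ~0.111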

