
1 Bayesian Belief Network AI4190

2 Contents
- Introduction
- Bayesian Network
- KDD Data

3 Introduction
- Bayesian network: a graphical model for probabilistic relationships among a set of variables.
- The difference between physical probability and personal probability:
  - Physical probability: based on frequency; requires repeated trials.
  - Personal probability: a person's degree of belief; does not require repeated trials.

4 Introduction
- (Example) Flipping a thumbtack: (Q) What is P(heads) after N flips?
  - Physical probability of heads: an unknown physical probability is assumed to exist and is estimated from the N observations, using criteria such as low bias and low variance.
  - Bayesian probability of heads: let $\xi$ denote the state of personal information. Find $P(X = \text{heads} \mid \xi)$.
    - The uncertainty about the parameter $\theta$ is expressed as $p(\theta \mid \xi)$.
    - Compute $p(X_{N+1} = \text{heads} \mid D, \xi)$ from the prior $p(\theta \mid \xi)$.

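5 Introduction
- The prediction is a posterior expectation: $p(X_{N+1} = \text{heads} \mid D, \xi) = E[\theta]$ with respect to the posterior $p(\theta \mid D, \xi)$.
- Bayes' rule, for data $D$ with $h$ heads and $t$ tails ($N = h + t$):
  $$p(\theta \mid D, \xi) = \frac{p(\theta \mid \xi)\, p(D \mid \theta, \xi)}{p(D \mid \xi)} = \frac{p(\theta \mid \xi)\, \theta^{h} (1-\theta)^{t}}{p(D \mid \xi)}$$
  where $p(D \mid \xi) = \int p(D \mid \theta, \xi)\, p(\theta \mid \xi)\, d\theta$.
- For a beta prior $\mathrm{Beta}(\theta \mid a_h, a_t)$, the posterior is also a beta distribution, $\mathrm{Beta}(\theta \mid a_h + h, a_t + t)$ (the beta prior is conjugate to the binomial likelihood).
- Ex) $p(X_{N+1} = \text{heads} \mid D, \xi) = E[\theta]$, which is $a_h / (a_h + a_t)$ under the prior and $(a_h + h) / (a_h + a_t + N)$ under the posterior.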
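
The beta update on this slide can be checked numerically. The following is a minimal sketch, assuming hypothetical prior pseudo-counts a(h) = a(t) = 2 and a hypothetical data set of 7 heads and 3 tails; the function names are illustrative and do not come from the slides.

```python
from fractions import Fraction

def beta_posterior(a_h, a_t, heads, tails):
    """Posterior hyperparameters: Beta(a_h + h, a_t + t)."""
    return a_h + heads, a_t + tails

def prob_heads(a_h, a_t):
    """Mean of Beta(a_h, a_t), i.e. the predictive probability of heads."""
    return Fraction(a_h, a_h + a_t)

prior = (2, 2)        # hypothetical a(h), a(t)
heads, tails = 7, 3   # hypothetical data D, N = 10
posterior = beta_posterior(*prior, heads, tails)

print(prob_heads(*prior))      # 1/2
print(prob_heads(*posterior))  # 9/14
```

Exact fractions are used so that the result can be compared directly with [a(h)+h]/[a(h)+a(t)+N] = 9/14.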

6 Bayesian Network
- Bayesian network
  - A DAG (directed acyclic graph).
  - Expresses dependence relations between variables.
  - Can use prior knowledge about the data (parameters).
- Example (figure: DAG over the nodes A, B, C, D, E):
  $P(A,B,C,D,E) = P(A)\,P(B \mid A,D)\,P(C \mid B)\,P(D \mid A)\,P(E \mid C,D)$
- Examples of conjugate priors: Dirichlet for multinomial data, Normal-Wishart for normal data.
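
The factorization P(A,B,C,D,E) = P(A)P(B|A,D)P(C|B)P(D|A)P(E|C,D) can be sketched in code as follows. The parent sets are read off that factorization; the variables are assumed to be binary and every number in the conditional probability tables is made up for illustration.

```python
from itertools import product

# Parents of each node, taken from the factorization on the slide.
parents = {"A": (), "D": ("A",), "B": ("A", "D"), "C": ("B",), "E": ("C", "D")}

# Hypothetical P(node = 1 | parent values); all variables assumed binary.
p_true = {
    "A": {(): 0.3},
    "D": {(0,): 0.2, (1,): 0.7},
    "B": {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.9},
    "C": {(0,): 0.25, (1,): 0.6},
    "E": {(0, 0): 0.05, (0, 1): 0.5, (1, 0): 0.3, (1, 1): 0.8},
}

def joint(assignment):
    """Joint probability as the product of the local conditional probabilities."""
    prob = 1.0
    for node, pa in parents.items():
        p1 = p_true[node][tuple(assignment[p] for p in pa)]
        prob *= p1 if assignment[node] == 1 else 1.0 - p1
    return prob

nodes = list(parents)
# Summing the joint over all 2^5 assignments should give 1.
total = sum(joint(dict(zip(nodes, values)))
            for values in product((0, 1), repeat=len(nodes)))
all_ones = joint({n: 1 for n in nodes})
```

Only 1 + 2 + 4 + 2 + 4 = 13 numbers are stored here, against the 31 free parameters of an unrestricted joint distribution over five binary variables.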

7 Bayesian Network
- Bayesian network: network structure (S) + local probabilities (P).
- D: data of N cases.
  - Each case X falls into one of r categories (univariate): use $(\theta_1, \theta_2, \ldots, \theta_r) \sim \mathrm{Dir}(a_1, a_2, \ldots, a_r)$, a Dirichlet prior, which is conjugate to the multinomial.
  - $E(\theta_i) = a_i / \sum_{k=1}^{r} a_k$

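8 Bayesian Network
- The probability of observing a new case $x_{N+1}$ in the k-th category:
  $$p(X_{N+1} = x^k \mid D) = \int p(X_{N+1} = x^k \mid \theta)\, p(\theta \mid D)\, d\theta = \frac{a_k + N_k}{a + N}$$
  where $N_k$ is the frequency of the k-th category in D and $a = \sum_k a_k$.
- For multivariate $X = (X_1, X_2, \ldots, X_n)$:
  $$p(X) = \prod_{i} p(X_i \mid \mathrm{Pa}(X_i))$$
- For the k-th category of $X_i$ and the j-th category of $\mathrm{Pa}(X_i)$, the parameters $\theta_{ijk}$ get a Dirichlet prior:
  $(\theta_{ij1}, \theta_{ij2}, \ldots, \theta_{ij r_i}) \sim \mathrm{Dir}(a_{ij1}, a_{ij2}, \ldots, a_{ij r_i})$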
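
As an illustration of the predictive probability [a(k) + N(k)] / [a + N], here is a small sketch with hypothetical category names, a hypothetical uniform prior and made-up data. It assumes every observed value is one of the categories listed in the prior.

```python
from collections import Counter
from fractions import Fraction

def predictive(prior, data):
    """Return p(X_{N+1} = k | D) = (a_k + N_k) / (a + N) for every category k."""
    counts = Counter(data)                   # N_k
    total = sum(prior.values()) + len(data)  # a + N
    return {k: Fraction(a_k + counts[k], total) for k, a_k in prior.items()}

prior = {"low": 1, "medium": 1, "high": 1}        # hypothetical a(k)
data = ["low", "low", "high", "medium", "low"]    # hypothetical D, N = 5
pred = predictive(prior, data)
# pred["low"] is (1 + 3) / (3 + 5) = 1/2
```

The same update applies to each row of a conditional probability table: fix a variable i and a parent category j, then use a(i,j,k) and N(i,j,k) in place of a(k) and N(k).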

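9 Bayesian Network
- $N_{ijk}$: frequency of $X_i = x^k$ under the j-th category of $\mathrm{Pa}(X_i)$ in data D.
- $X_i \sim \mathrm{MN}(\theta_{ij1}, \theta_{ij2}, \ldots, \theta_{ij r_i})$
- Corresponding posterior of $\theta_{ijk}$:
  $(\theta_{ij1}, \theta_{ij2}, \ldots, \theta_{ij r_i}) \sim \mathrm{Dir}(a_{ij1} + N_{ij1}, a_{ij2} + N_{ij2}, \ldots, a_{ij r_i} + N_{ij r_i})$
- BDe score: used to calculate the marginal likelihood (evidence),
  $$p(D \mid S) = \int p(D \mid \theta, S)\, p(\theta \mid S)\, d\theta = \frac{\Gamma(a)}{\Gamma(a + N)} \prod_{k} \frac{\Gamma(a_k + N_k)}{\Gamma(a_k)}$$
  (shown for a single multinomial variable; for a network, the same term is multiplied over every variable i and parent category j).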
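
The marginal likelihood is usually evaluated in log space, because the gamma functions overflow quickly. The sketch below uses the standard Dirichlet-multinomial evidence Γ(a)/Γ(a+N) · Π_k Γ(a_k+N_k)/Γ(a_k) for a single variable; the function name and the example counts are assumptions made for illustration.

```python
from math import exp, lgamma

def log_marginal_likelihood(alphas, counts):
    """log p(D) for one multinomial variable with a Dirichlet(alphas) prior."""
    a, n = sum(alphas), sum(counts)
    result = lgamma(a) - lgamma(a + n)
    for a_k, n_k in zip(alphas, counts):
        result += lgamma(a_k + n_k) - lgamma(a_k)
    return result

# Hypothetical example: uniform prior a = (1, 1), data with 2 heads and 1 tail.
# Closed form: Gamma(2)/Gamma(5) * Gamma(3)*Gamma(2) / (Gamma(1)*Gamma(1)) = 1/12.
evidence = exp(log_marginal_likelihood([1, 1], [2, 1]))
```

For a full network, the log score is the sum of such terms over every variable i and every parent category j, using a(i,j,k) and N(i,j,k).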

10 Bayesian Network
- Methods of searching: greedy, reverse, exhaustive (the prior order or structure of the nodes is given).
- For missing values:
  - Gibbs sampling
  - Gaussian approximation
  - EM
  - Bound and Collapse, etc.
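
One way a greedy search with a given node order might look is sketched below. This is a K2-style illustration, not necessarily the procedure used in the slides: it assumes binary variables coded 0/1, complete data (no missing values), a uniform Dirichlet prior with a(i,j,k) = 1, and a hypothetical cap on the number of parents. All names and the toy data are made up.

```python
from collections import Counter
from math import lgamma

def local_score(data, node, parents, arity=2, alpha=1.0):
    """Log marginal likelihood of `node` given `parents` (uniform Dirichlet prior)."""
    counts = Counter((tuple(row[p] for p in parents), row[node]) for row in data)
    configs = {cfg for cfg, _ in counts}  # unobserved configurations contribute 0
    score = 0.0
    for cfg in configs:
        n_j = sum(counts[(cfg, k)] for k in range(arity))
        score += lgamma(arity * alpha) - lgamma(arity * alpha + n_j)
        for k in range(arity):
            score += lgamma(alpha + counts[(cfg, k)]) - lgamma(alpha)
    return score

def greedy_parents(data, order, max_parents=2):
    """For each node, keep adding the best-scoring predecessor while the score improves."""
    structure = {}
    for i, node in enumerate(order):
        parents, best = [], local_score(data, node, [])
        candidates = list(order[:i])  # only earlier nodes may be parents
        while candidates and len(parents) < max_parents:
            top_score, top = max((local_score(data, node, parents + [c]), c)
                                 for c in candidates)
            if top_score <= best:
                break
            parents.append(top)
            candidates.remove(top)
            best = top_score
        structure[node] = parents
    return structure

# Toy data: B copies A, C is unrelated to both.
a_vals = [0, 0, 0, 0, 1, 1, 1, 1]
c_vals = [0, 1, 0, 1, 0, 1, 0, 1]
data = [{"A": a, "B": a, "C": c} for a, c in zip(a_vals, c_vals)]
structure = greedy_parents(data, ["A", "B", "C"])
```

On the toy data the search attaches A as the parent of B and leaves C without parents, because adding a parent to C lowers its local score.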

11 Bayesian Network
- Interpretations:
  - depend on the prior order of nodes or the prior structure
  - local conditional probabilities
  - choice of nodes
  - the overall nature of the data

12 KDD Data
- KDD Data
  - Data: 465 features over 1700 customers
    - Features include friend promotion rate, date visited, weight of items, price of house, discount rate, …
    - Data was collected during Jan. 30 – March 30, 2000.
    - Friend promotion started on Feb. 29 with TV advertisement.
  - Aims: description of heavy/low spenders


14 KDD Data
- Features selected by various methods
  - Decision Tree + Factor Analysis:
    - V368 (Weight Average)
    - V243 (OrderLine Quantity Sum)
    - V245 (OrderLine Quantity Maximum)
    - F1 = 0.94*V324 + 0.868*V374 + 0.898*V412
    - F2 = 0.829*V234 + 0.857*V240
    - F3 = -0.795*V237 + 0.778*V304
  - Decision Tree:
    - V13 (SendEmail)
    - V234 (OrderItemQuantity Sum% HavingDiscountRange(5. 10))
    - V237 (OrderItemQuantitySum% Having DiscountRange(10.))
    - V240 (Friend)
    - V243 (OrderLineQuantitySum)
    - V245 (OrderLineQuantity Maximum)
    - V304 (OrderShippingAmtMin)
    - V324 (NumLegwearProduct Views)
    - V368 (Weight Average)
    - V374 (NumMainTemplateViews)
    - V412 (NumReplenishable Stock Views)
  - Discriminant Model:
    - V240 (Friend)
    - V229 (Order-Average)
    - V304 (OrderShippingAmtMin.)
    - V368 (Weight Average)
    - V43 (Home Market Value)
    - V377 (NumAcountTemplate Views)
    - plus: V11 (Which DoYouWearMostFrequent), V13 (SendEmail), V17 (USState), V45 (VehicleLifeStyle), V68 (RetailActivity), V19 (Date)


16 KDD Data
- A Bayesian net of KDD data
  - V229 (Order-Average) and V240 (Friend) directly influence V312 (Target).
  - V19 (Date) was influenced by V240 (Friend), reflecting the TV advertisement.

