
1 Probabilistic networks: Inference and Other Problems. Hans L. Bodlaender, Utrecht University

2 Overview: probabilistic networks; the inference problem; tree decompositions and an algorithm for probabilistic inference; the Maximum Probable Assignment (MAP) problem; monotonicity.

3 Decision support systems and reasoning with uncertainty. Decision support systems (and/or expert systems) reason with uncertainty over a set of stochastic variables: observations, other variables, and variable(s) of interest. In the 1980s the probabilistic network model was proposed, also called Bayesian networks, belief networks, or graphical models.

4 Probabilistic networks. A directed acyclic graph in which each node is a (discrete, e.g. Boolean) stochastic variable. Given for each variable is its conditional probability distribution, conditional on the values of the parents of the node. Example (figure: nodes x1, ..., x5): Pr(x1) = 0.7, Pr(¬x1) = 0.3; Pr(x5 | x3) = 0.6, Pr(¬x5 | x3) = 0.4, Pr(x5 | ¬x3) = 0.2, Pr(¬x5 | ¬x3) = 0.8; Pr(x4 | x2 and x3) = 0.12, etc.



7 A probabilistic network consists of: a directed acyclic graph G = (V,E), and, for each vertex (variable) v, a table with conditional probabilities (conditional on the values of the parents of v). Example (figure: node v with parents w and x): the table for v contains Pr(v | w and x), Pr(¬v | w and x), Pr(v | ¬w and x), Pr(¬v | ¬w and x), Pr(v | w and ¬x), Pr(¬v | w and ¬x), Pr(v | ¬w and ¬x), Pr(¬v | ¬w and ¬x).
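
One possible in-memory representation of such a network is sketched below: the DAG as a parent list plus one table per variable giving Pr(v = T | parent values). The network, the names, and the numbers are hypothetical, roughly matching the example on this slide; this is a sketch, not an implementation from the talk.

```python
# Hypothetical representation: parents plus a CPT giving Pr(v = T | parent values);
# Pr(v = F | ...) is the complement.
network = {
    "w": {"parents": [], "cpt": {(): 0.6}},
    "x": {"parents": [], "cpt": {(): 0.25}},
    "v": {"parents": ["w", "x"],
          "cpt": {(True, True): 0.9, (True, False): 0.7,
                  (False, True): 0.4, (False, False): 0.05}},
}

def cond_pr(var, value, assignment):
    """Pr(var = value | values of var's parents taken from `assignment`)."""
    node = network[var]
    key = tuple(assignment[p] for p in node["parents"])
    p_true = node["cpt"][key]
    return p_true if value else 1.0 - p_true

print(cond_pr("v", False, {"w": True, "x": False}))   # 1 - Pr(v | w and not x) = 0.3
```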

8 Configuration. A configuration c is an assignment of a value to each variable (node). For a set W of variables, or a single variable v, and a configuration c, write c_W and c_v for the restrictions (partial configurations). The probability of configuration c is Pr(c) = the product, over all variables v, of Pr(c_v | c_parents(v)).
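
A minimal sketch of this product formula, on a hypothetical two-parent network (the names a, b, c and all numbers are made up for illustration):

```python
# Pr(c) as the product over all variables of the conditional probability of the
# value c assigns, given the values c assigns to the parents.
parents = {"a": [], "b": [], "c": ["a", "b"]}

def cond_pr(v, value, config):                  # Pr(v = value | parents' values in config)
    if v in ("a", "b"):
        p_true = 0.7 if v == "a" else 0.4
    else:                                       # Pr(c = T | a, b): an arbitrary example table
        table = {(True, True): 0.9, (True, False): 0.5,
                 (False, True): 0.3, (False, False): 0.1}
        p_true = table[(config["a"], config["b"])]
    return p_true if value else 1.0 - p_true

def pr_configuration(config):
    prob = 1.0
    for v in parents:
        prob *= cond_pr(v, config[v], config)
    return prob

print(pr_configuration({"a": True, "b": False, "c": True}))   # 0.7 * 0.6 * 0.5 = 0.21
```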

9 Topological sort of a directed acyclic graph. An order of the vertices such that all edges go from left to right: list the vertices v1, …, vn such that for each arc (vi, vj): i < j. Such an order always exists for a DAG and can be found in O(|V|+|E|) time.
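
A minimal sketch of one such O(|V|+|E|) procedure (Kahn's algorithm, repeatedly removing a vertex of in-degree zero); the edge list is a hypothetical example:

```python
from collections import deque

edges = [("x1", "x2"), ("x2", "x3"), ("x2", "x4"), ("x3", "x4"), ("x3", "x5")]
vertices = {"x1", "x2", "x3", "x4", "x5"}

indegree = {v: 0 for v in vertices}
succ = {v: [] for v in vertices}
for u, w in edges:
    succ[u].append(w)
    indegree[w] += 1

queue = deque(v for v in vertices if indegree[v] == 0)
order = []
while queue:
    v = queue.popleft()
    order.append(v)            # every arc out of v now goes "to the right"
    for w in succ[v]:
        indegree[w] -= 1
        if indegree[w] == 0:
            queue.append(w)

assert len(order) == len(vertices)   # fails only if the graph has a cycle
print(order)
```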

10 Generating a random configuration. Make a topological sort of G. For i = 1 to n: generate a value for v_i using the probabilities dictated by the values already generated for the parents of v_i. Example (figure: nodes v1, ..., v5): Pr(v1) = 0.7, Pr(¬v1) = 0.3; Pr(v2 | v1) = 0.3, Pr(¬v2 | v1) = 0.7, Pr(v2 | ¬v1) = 0.4, Pr(¬v2 | ¬v1) = 0.6; Pr(v5 | v3) = 0.6, Pr(¬v5 | v3) = 0.4, Pr(v5 | ¬v3) = 0.2, Pr(¬v5 | ¬v3) = 0.8.
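
A minimal sketch of this sampling loop, on a hypothetical Boolean chain x1 -> x2 -> x3 (network and numbers assumed for illustration):

```python
import random

# Walk through a topological order and draw each variable from its CPT, given
# the already-drawn values of its parent.
order = ["x1", "x2", "x3"]

def pr_true(v, config):
    if v == "x1":
        return 0.7
    if v == "x2":
        return 0.3 if config["x1"] else 0.4
    return 0.6 if config["x2"] else 0.2          # x3 depends on x2

def sample_configuration():
    config = {}
    for v in order:                              # parents precede v in the order
        config[v] = random.random() < pr_true(v, config)
    return config

print(sample_configuration())
```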

11 Inference problem. Given: values for some variables (observations) c_O. Question: the probability distribution of one variable v conditional on the observations, i.e., Pr(v = x | c_O) = Pr(v = x and c_O) / Pr(c_O) for each value x of v.
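
The definition can be made concrete by brute force: sum the probabilities of all full configurations consistent with the observations. This sketch (hypothetical Boolean chain x1 -> x2 -> x3) is exponential in n, which is exactly what the tree decomposition algorithm later avoids:

```python
from itertools import product

order = ["x1", "x2", "x3"]

def pr_true(v, config):
    if v == "x1":
        return 0.7
    if v == "x2":
        return 0.3 if config["x1"] else 0.4
    return 0.6 if config["x2"] else 0.2

def pr_config(config):
    p = 1.0
    for v in order:
        p_t = pr_true(v, config)
        p *= p_t if config[v] else 1.0 - p_t
    return p

def infer(var, observations):
    """Distribution of `var` conditional on the observed values."""
    joint = {True: 0.0, False: 0.0}
    for values in product([True, False], repeat=len(order)):
        config = dict(zip(order, values))
        if all(config[o] == val for o, val in observations.items()):
            joint[config[var]] += pr_config(config)
    total = joint[True] + joint[False]
    return {value: p / total for value, p in joint.items()}

print(infer("x3", {"x1": True}))
```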

12 Use of the inference problem. The network models information from an application domain (medical, agricultural, weather forecasting, …). The user gives values for some variables (symptoms of a patient, observed values) and wants to know the distribution of other variables (likelihood of success of a treatment, a diagnosis). Used nowadays in many applications.


14 The inference problem is #P-complete. #P-completeness implies NP-hardness. Proof of #P-hardness: counting the satisfying truth assignments of a 3CNF formula, e.g. (x1 or x2 or ¬x4) and (x5 or ¬x1 or ¬x3) and …, is #P-complete; transform such a formula to a probabilistic network.

15 Example transformation for (x1 or x2 or ¬x4) and … (figure: variable nodes x1, …, x6 feeding into clause and conjunction nodes). Each variable node: Pr(xi) = 0.5, Pr(¬xi) = 0.5. Each clause node is T with probability 1 when its clause is satisfied, otherwise 0. Each conjunction node is T with probability 1 when both parents are true. The probability that the output node is T equals #sat / 2^n.
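
A hedged sanity check of the identity the reduction relies on: because the clause and conjunction nodes are deterministic, the output node is T exactly when the uniformly random variable assignment satisfies every clause. The example formula below is an assumption for illustration:

```python
from itertools import product

cnf = [[(1, True), (2, True), (4, False)],      # (x1 or x2 or not x4)
       [(5, True), (1, False), (3, False)]]     # (x5 or not x1 or not x3)
n = 5

def satisfied(assignment):
    """True iff every clause node, and hence the final conjunction node, would be T."""
    return all(any(assignment[i] == positive for i, positive in clause)
               for clause in cnf)

num_sat = sum(satisfied(dict(enumerate(vals, start=1)))
              for vals in product([True, False], repeat=n))
print(num_sat, num_sat / 2 ** n)   # number of satisfying assignments, Pr(output = T)
```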

16 Lauritzen-Spiegelhalter algorithm. Uses tree decompositions to solve the inference problem; fast when the width of the tree decomposition is small. Tree decomposition of the moralisation of G: a tree in which each node has a bag (a set of variables), such that for every variable v the bags containing v form a connected subtree, and there is a bag containing v together with its parents (that bag covers v). Figure: a network on x1, …, x5 with bags {x1,x2}, {x2,x3}, {x3,x4}, {x4,x5}.
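
A minimal sketch of checking the two conditions for a candidate decomposition; the chain network, bags, and tree below are hypothetical (they loosely mirror the figure):

```python
parents = {"x1": [], "x2": ["x1"], "x3": ["x2"], "x4": ["x3"], "x5": ["x4"]}
bags = {0: {"x1", "x2"}, 1: {"x2", "x3"}, 2: {"x3", "x4"}, 3: {"x4", "x5"}}
tree_edges = [(0, 1), (1, 2), (2, 3)]

# Condition: some bag contains v together with all its parents (the bag covers v).
covers_ok = all(any({v, *parents[v]} <= bag for bag in bags.values())
                for v in parents)

# Condition: for every v, the bags containing v induce a connected subtree.
# In a tree, an induced subgraph on k nodes is connected iff it has k - 1 edges.
def connected(v):
    nodes = {i for i, bag in bags.items() if v in bag}
    edges = [e for e in tree_edges if e[0] in nodes and e[1] in nodes]
    return len(edges) == len(nodes) - 1

subtree_ok = all(connected(v) for v in parents)
print(covers_ok, subtree_ok)
```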

17 LS algorithm (described here without observations). Take a node whose bag contains the variable of interest as the root. Compute for each node i with bag X of T a table: for each assignment c_X of values to the variables in X, with Y the variables covered in the subtree rooted at i, compute v_i(c_X) = the sum, over all configurations of X and Y that extend c_X, of the product over y in Y of Pr(c_y | c_parents(y)).

18 Computing tables bottom up. A table for a node can be computed once the tables of its children are known; e.g., compute the tables in postorder (bottom up).

19 Example: a node i with two children j1 and j2 whose bags are all identical to X, where the children's subtrees cover the variable sets Y1 and Y2. The table of i combines the children's tables: v_i(c_X) is the product of v_j1(c_X), v_j2(c_X), and the conditional probabilities of the variables covered at node i itself (each variable being covered at exactly one node, so nothing is counted twice).

20 LS algorithm. For the other types of nodes the table can be computed similarly. The time for one table is linear in its size: bag size k gives time O(2^k) for binary variables, so the algorithm runs in linear time when the bag size is bounded by a constant (bounded treewidth), which happens often in practice. The table of the root allows us to compute the distribution of the variables in the root bag. A similar scheme works when observations are given, and when the variables are discrete but not all binary. A scheme that also moves downwards in the tree computes the distribution of all variables, again in linear time for bounded treewidth.
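
The bottom-up pass of slides 16-20 can be sketched on a toy instance. Everything below is an assumption for illustration: a Boolean chain x1 -> x2 -> x3 with bags {x1,x2} (leaf) and {x2,x3} (root), each variable covered at exactly one node; children's tables are combined after summing out the variables that do not occur in the parent's bag. A sketch in the spirit of the algorithm, not the authors' implementation:

```python
from itertools import product

def pr(var, value, assignment):
    """Pr(var = value | value of its parent in `assignment`)."""
    if var == "x1":
        p_true = 0.7
    elif var == "x2":
        p_true = 0.3 if assignment["x1"] else 0.4
    else:                                   # x3 depends on x2
        p_true = 0.6 if assignment["x2"] else 0.2
    return p_true if value else 1.0 - p_true

nodes = {
    "leaf": {"bag": ["x1", "x2"], "children": [], "covers": ["x1", "x2"]},
    "root": {"bag": ["x2", "x3"], "children": ["leaf"], "covers": ["x3"]},
}

def table(name):
    """v_i(c_X) for every assignment c_X to the bag of node `name`."""
    info = nodes[name]
    bag = info["bag"]
    messages = []
    for child in info["children"]:
        child_bag = nodes[child]["bag"]
        shared = [v for v in child_bag if v in bag]
        msg = {}
        for vals, p in table(child).items():          # recurse: postorder
            key = tuple(dict(zip(child_bag, vals))[v] for v in shared)
            msg[key] = msg.get(key, 0.0) + p          # sum out vars leaving the bag
        messages.append((shared, msg))
    tab = {}
    for values in product([True, False], repeat=len(bag)):
        c_x = dict(zip(bag, values))
        v = 1.0
        for var in info["covers"]:                    # variables covered at this node
            v *= pr(var, c_x[var], c_x)
        for shared, msg in messages:
            v *= msg[tuple(c_x[s] for s in shared)]
        tab[values] = v
    return tab

root_table = table("root")                            # bag order: (x2, x3)
pr_x3_true = sum(p for vals, p in root_table.items() if vals[1])
print(root_table, pr_x3_true)                         # Pr(x3 = T) = 0.332 here
```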

21 MAP problem. Given: a probabilistic network and some observations. Question: the most likely configuration given the observations. Applications: most likely explanation, verification of the design of probabilistic networks.

22 MAP is NP-hard (Shimony, 1994). Reduction from satisfiability of (x1 or x2 or ¬x4) and … (figure: variable nodes x1, …, x6 and clause nodes). Each variable node: Pr(xi) = 0.5, Pr(¬xi) = 0.5. Each clause node is T with probability 1 when its clause is satisfied or Y is T, and with probability 1/2 otherwise.

23 MAP with tree decompositions. An algorithm similar to the one for inference solves MAP in linear time when a tree decomposition of the moralisation with bounded bag size (treewidth) is given. Compute for each c_X the maximum, over all configurations of the covered variables Y that extend c_X, of the product of the conditional probabilities; i.e., replace the sum in the inference tables by a max.
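
A hedged, minimal illustration of that single change (sum becomes max, remembering the maximiser so the configuration can be recovered), on a hypothetical two-variable Boolean chain x1 -> x2:

```python
pr_x1 = {True: 0.7, False: 0.3}

def pr_x2(x2, x1):
    p = 0.3 if x1 else 0.4                     # Pr(x2 = T | x1)
    return p if x2 else 1.0 - p

# Inference-style table for root bag {x2}: sum out x1.
sum_table = {x2: sum(pr_x1[x1] * pr_x2(x2, x1) for x1 in (True, False))
             for x2 in (True, False)}

# MAP-style table: maximise over x1 instead, keeping the best extension.
max_table = {x2: max((pr_x1[x1] * pr_x2(x2, x1), x1) for x1 in (True, False))
             for x2 in (True, False)}

best_x2, (best_p, best_x1) = max(max_table.items(), key=lambda kv: kv[1][0])
print(sum_table)                                 # marginal distribution of x2
print(best_p, {"x1": best_x1, "x2": best_x2})    # most likely configuration
```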

24 Fixed parameter variant of MAP. MAP(p): given a probabilistic network, is there a configuration with probability at least p? Can be solved in O(f(p) n) time, i.e., linear for fixed p. Joint work with van der Gaag and van den Eijkhof. A similar result holds when there are observations (values for some variables) and we look for a configuration consistent with the observations.

25 The algorithm uses branch and bound. Look at the variables in the order of a topological sort. Recursive process: branch on the assignment of a value to the next variable, plus a bounding mechanism. Figure: the branching tree starts at the empty assignment and branches on v1 = T / v1 = F, then on v2, then on v3, and so on.

26 Bounding. Recall: the parents of v come before v in the topological sort. For a node z of the branch and bound tree, with values assigned to the first i variables, compute P(z) = the product over j = 1, …, i of Pr(vj = assigned value | assigned values of the parents of vj). P(z) can be computed from P(parent(z)) and the choice for the i-th variable. Bound when P(z) < p: such a node can never lead to a solution.

27 Recursive scheme. E-MPA-p(values for the first i variables, p, pz): if i = n (all variables have been assigned), then return true (output the configuration) and stop. Else, for each possible value x for v_{i+1}: compute pznew = pz * Pr(v_{i+1} = x | values for the first i variables); if pznew ≥ p, then call E-MPA-p(values for the first i variables followed by x, p, pznew).
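
A runnable sketch of this scheme on a hypothetical Boolean chain v1 -> v2 -> v3 (network and numbers assumed; the real algorithm would read them from the given probabilistic network):

```python
def pr_true(i, config):
    """Pr(v_i = T | value of its parent), 0-indexed; hypothetical chain CPTs."""
    if i == 0:
        return 0.7
    return 0.8 if config[i - 1] else 0.1

N = 3

def e_mpa_p(config, p, pz, solutions):
    """Extend the assignment of the first len(config) variables; prune as soon
    as the probability of the partial configuration drops below p."""
    i = len(config)
    if i == N:                       # all variables assigned: report the configuration
        solutions.append((tuple(config), pz))
        return
    for value in (True, False):      # branch on the next variable in topological order
        p_t = pr_true(i, config)
        pz_new = pz * (p_t if value else 1.0 - p_t)
        if pz_new >= p:              # bound: otherwise no extension can reach p
            e_mpa_p(config + [value], p, pz_new, solutions)

found = []
e_mpa_p([], 0.3, 1.0, found)
print(found)    # configurations with probability at least 0.3, if any
```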

28 Time analysis. If a node of the branch and bound tree has at least two children, then for each child pznew ≥ p, and hence Pr(v_{i+1} = x | values for the first i variables) ≥ p for each branched value x. Since these conditional probabilities sum to at most 1, each child satisfies pznew ≤ pz * (1 − Pr(v_{i+1} = x' | values for the first i variables)) ≤ pz * (1 − p), where x' is the value taken in a sibling branch. So after a node with two children, the value of pz is a factor of at least (1 − p) smaller. The tree has at most log p / log(1 − p) leaves. (How often can you multiply 1 by (1 − p) until you drop below p?) The time is O(f(p) * n).
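
As a worked instance of that multiplication question (numbers chosen for illustration): with p = 0.1, log 0.1 / log 0.9 ≈ 2.303 / 0.105 ≈ 21.9; indeed 0.9^21 ≈ 0.109 is still at least 0.1 while 0.9^22 ≈ 0.098 drops below it, so pz can shrink by the factor (1 − p) at most 21 times along any branch before the bound prunes it.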

29 Partial MAP. Variant of MAP where we ask for the values of a subset of the variables with maximum probability, given some observations. Park: NP^PP-complete, and NP-complete when G is a polytree (the underlying undirected graph is a tree).

30 Monotonicity (joint work with Linda van der Gaag and Ad Feelders). Monotonicity is often a requested property of a probabilistic network; e.g., if a patient has more severe symptoms, one expects the diagnosis to be more severe. Take an ordering on the values of the variables: c_X ≤ c'_X if for all x in X: c_X(x) ≤ c'_X(x). Two observations that are ordered should imply an ordering of the probabilities of the values of the variable of interest (formal definition follows).

31 Monotonicity in mode. Let z be the output variable. The mode of z given values c_X for some other variables X, T(z | c_X), is the value for z such that Pr(z | c_X) is maximal (plus a tie-breaking rule). Take an ordering on the values of each variable. The probabilistic network with observable nodes X and output variable z is isotone when for each pair of value assignments c_X, c'_X to X one has: c_X ≤ c'_X implies T(z | c_X) ≤ T(z | c'_X). Antitone: c_X ≤ c'_X implies T(z | c_X) ≥ T(z | c'_X). Monotone: isotone or antitone. Monotone in distribution: similar, but using the cumulative distribution; identical to monotonicity in mode when all variables are binary.
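
The isotone condition can be checked by brute force over all comparable pairs of observations, as in the sketch below. The function mode_of_z is a hypothetical stand-in for exact inference (it only has to return T(z | c_X)); the enumeration is exponential in |X|, which is consistent with the hardness results on the next slides:

```python
from itertools import product

observables = ["x1", "x2"]

def mode_of_z(c_x):
    # hypothetical mode: z is most likely true iff at least one observable is true
    return any(c_x.values())

def leq(c, d):                                   # c_X <= c'_X componentwise (F < T)
    return all(c[v] <= d[v] for v in observables)

def is_isotone():
    configs = [dict(zip(observables, vals))
               for vals in product([False, True], repeat=len(observables))]
    return all(mode_of_z(c) <= mode_of_z(d)
               for c in configs for d in configs if leq(c, d))

print(is_isotone())
```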

32 Results. Testing whether a network is monotone (isotone, antitone) in mode (in distribution) is coNP^PP-complete, and coNP-complete for polytrees.

33 Hardness proof (sketch). Transformation from a variant of the Partial MAP problem: can we set values c_M for the variables M such that Pr(E=T | c_M) > p? Construction (figure: the Partial MAP instance G with node E, plus new nodes A, B and C): Pr(A=T | E=T) = 1; Pr(A=T | E=F) = (1/2 − p)/(1 − p); Pr(C=T | A, B) = 1 if A and B F, otherwise 0. M ∪ B is the set of observable variables; C is the variable of interest. The proof shows that the new network is monotone in mode, and monotone in distribution, if and only if there is a c_M with Pr(E=T | c_M) > p.

34 Conclusions. Probabilistic (belief, Bayesian) networks form a mathematically precise model, used in several decision support systems. The use and design of these networks pose interesting challenges, many of them algorithmic. Sometimes special structures help (tree decompositions), also in practice.

