Kansas State University Department of Computing and Information Sciences CIS 890: Special Topics in Real-Time AI Wednesday, March 14, 2001 Haipeng Guo.


1 Real Time Bayesian Networks Inference (KDD Group Presentation)
Haipeng Guo, KDD Research Group, Department of Computing and Information Sciences, Kansas State University
CIS 890: Special Topics in Real-Time AI, Wednesday, March 14, 2001

2 Presentation Outline
–Bayesian Networks Introduction
–Bayesian Networks Inference Algorithms Review
–Real Time Related Issues
–A Distributed Anytime Architecture for Probabilistic Reasoning, from Santos' paper [Santos 1995]
–Summary

3 Bayesian Networks Introduction
–Definition
–Why is it important?
–Examples
–Applications

4 Bayesian Networks
Bayesian networks, also called Bayesian belief networks (BBNs), causal networks, or probabilistic networks, are a network-based framework for representing and analyzing causal models involving uncertainty.
A BBN is a directed acyclic graph (DAG) with conditional probabilities for each node.
–Nodes represent random variables in a problem domain.
–Arcs represent conditional dependence relationships among these variables.
–Each node contains a CPT (Conditional Probability Table) that gives the probability of this node taking each of its values given the values of its parent nodes.

5 Family-Out Example
"Suppose when I go home at night, I want to know if my family is home before I try the doors. (Perhaps the most convenient door to enter is double locked when nobody is home.) Now, often when my wife leaves the house, she turns on an outdoor light. However, she sometimes turns on the lights if she is expecting a guest. Also, we have a dog. When nobody is home, the dog is put in the back yard. The same is true if the dog has bowel problems. Finally, if the dog is in the back yard, I will probably hear her barking (or what I think is her barking), but sometimes I can be confused by other dogs."
[Figure: network with nodes Family-Out, Light-On, Bowel-Problem, Dog-Out, Hear-Bark]
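
The definition and the Family-Out story can be made concrete with a minimal sketch. The arcs follow the story (Family-Out and Bowel-Problem both influence Dog-Out, and so on); the CPT numbers are illustrative assumptions, not values taken from the slides, and the names `PARENTS`, `P_TRUE`, and `joint` are hypothetical.

```python
from itertools import product

# Parents of each node (the DAG). Structure follows the Family-Out story.
PARENTS = {
    "family_out": [],
    "bowel_problem": [],
    "light_on": ["family_out"],
    "dog_out": ["family_out", "bowel_problem"],
    "hear_bark": ["dog_out"],
}

# CPTs: P(node = True | parent values). All numbers are assumed for illustration.
P_TRUE = {
    "family_out": {(): 0.15},
    "bowel_problem": {(): 0.01},
    "light_on": {(True,): 0.60, (False,): 0.05},
    "dog_out": {
        (True, True): 0.99, (True, False): 0.90,
        (False, True): 0.97, (False, False): 0.30,
    },
    "hear_bark": {(True,): 0.70, (False,): 0.01},
}

def joint(assignment):
    """Joint probability as the product of one CPT entry per node."""
    p = 1.0
    for var, parents in PARENTS.items():
        p_true = P_TRUE[var][tuple(assignment[q] for q in parents)]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p

def all_assignments():
    """Enumerate every complete True/False assignment to the five nodes."""
    names = list(PARENTS)
    for values in product([True, False], repeat=len(names)):
        yield dict(zip(names, values))

total = sum(joint(a) for a in all_assignments())     # a valid joint sums to 1
n_params = sum(len(t) for t in P_TRUE.values())      # CPT entries stored
n_full_joint = 2 ** len(PARENTS) - 1                 # free parameters of a full table
```

Under these assumptions the network stores 10 numbers where an explicit joint table over five binary variables would need 31, which is the compactness argument in miniature.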

6 Asia Example from Medical Diagnostics

7 Why is BBN important?
–Offers a compact, intuitive, and efficient graphical representation of dependence relations between entities of a problem domain (models the world in a more natural way than rule-based systems and neural networks)
–Handles uncertain knowledge in a mathematically rigorous yet efficient and simple way
–Provides a computational architecture for computing the impact of evidence nodes on the beliefs (probabilities) of query nodes of interest
–Growing number of creative applications

8 Alarm Example: the power of BBN
[Figure: graph of the ALARM network]
The ALARM network: 37 variables, 509 parameters (instead of 2^37)
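
The saving can be written out as a short calculation. This is a sketch under the usual counting convention, where $r_i$ is the number of values of node $X_i$ and $\mathrm{Pa}(X_i)$ is its parent set (symbols introduced here, not on the slide); the $2^{37}$ figure corresponds to treating all 37 variables as binary.

```latex
% Independent parameters stored by a Bayesian network:
N_{\text{BN}} \;=\; \sum_{i=1}^{n} (r_i - 1) \prod_{X_j \in \mathrm{Pa}(X_i)} r_j

% Explicit joint table over n binary variables:
N_{\text{joint}} \;=\; 2^{n} - 1,
\qquad
2^{37} = 137\,438\,953\,472 \approx 1.4 \times 10^{11}
```

The first sum grows with the size of each node's parent set rather than with the total number of variables, which is why a sparse 37-node network needs only a few hundred numbers.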

9 Applications
–Medical diagnostic systems
–Real-time weapons scheduling
–Jet-engine fault diagnosis
–Intel processor fault diagnosis (Intel)
–Generator monitoring expert system (General Electric)
–Software troubleshooting (Microsoft Office Assistant, Win98 print troubleshooting)
–Space shuttle engine monitoring (Vista project)
–Biological sequence analysis and classification
–……

10 Bayesian Networks Inference
Given observed evidence, perform computation to answer queries.
Evidence e is an assignment of values to a set of variables E in the domain, E = {X_{k+1}, …, X_n}
–For example, E = e: {Visit Asia = True, Smoke = True}
Queries:
–The posterior belief: compute the conditional probability of a variable given the evidence, P(Lung Cancer | Visit Asia = TRUE AND Smoke = TRUE) = ? This kind of inference task is called belief updating.
–MPE: compute the Most Probable Explanation given the evidence. An explanation for the evidence is a complete assignment {X_1 = x_1, …, X_n = x_n} that is consistent with the evidence. Computing an MPE means finding an explanation such that no other explanation has higher probability. This kind of inference task is called belief revision.
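
Both query types can be illustrated by brute-force enumeration on a tiny network. The sketch below assumes a hypothetical three-node chain Smoke → Cancer → XRay with made-up CPT values; it is not the Asia network, and enumeration is only feasible for very small models.

```python
from itertools import product

ORDER = ["Smoke", "Cancer", "XRay"]                 # hypothetical chain
PARENTS = {"Smoke": [], "Cancer": ["Smoke"], "XRay": ["Cancer"]}
# P(var = 1 | parent values); numbers are assumptions for illustration.
P_TRUE = {
    "Smoke": {(): 0.3},
    "Cancer": {(0,): 0.01, (1,): 0.1},
    "XRay": {(0,): 0.2, (1,): 0.9},
}

def joint(a):
    """Product of CPT entries for a complete assignment a (dict var -> 0/1)."""
    p = 1.0
    for var in ORDER:
        p1 = P_TRUE[var][tuple(a[q] for q in PARENTS[var])]
        p *= p1 if a[var] == 1 else 1.0 - p1
    return p

def consistent(evidence):
    """Yield every complete assignment that agrees with the evidence."""
    for values in product((0, 1), repeat=len(ORDER)):
        a = dict(zip(ORDER, values))
        if all(a[v] == val for v, val in evidence.items()):
            yield a

def belief_update(query_var, evidence):
    """Belief updating: P(query_var = 1 | evidence)."""
    num = sum(joint(a) for a in consistent(evidence) if a[query_var] == 1)
    den = sum(joint(a) for a in consistent(evidence))
    return num / den

def belief_revise(evidence):
    """Belief revision: the most probable explanation and its joint probability."""
    best = max(consistent(evidence), key=joint)
    return best, joint(best)
```

For example, `belief_update("Cancer", {"Smoke": 1, "XRay": 1})` returns a single posterior probability, while `belief_revise({"XRay": 1})` returns a complete assignment to all three variables.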

11 Belief Updating
The problem is to compute P(X=x|E=e): the probability of query nodes X, given the observed value of evidence nodes E = e.
For example: suppose that a patient arrives and it is known for certain that he has recently visited Asia and has dyspnea.
–What impact does this evidence have on the probabilities of the other variables in the network? P(Lung Cancer) = ?
[Figure: the Asia network, with nodes Visit to Asia, Smoking, Lung Cancer, Tuberculosis, Tub. or Lung Cancer, Bronchitis, X-Ray, Dyspnea]

12 Belief Revision
Let W be the set of all nodes in our given Bayesian network.
Let the evidence e be the observation that the roses are okay. Our goal is now to determine the assignment to all nodes which maximizes P(w|e).
We only need to consider assignments where the node roses is set to okay and maximize P(w), i.e. find the most likely "state of the world" given the evidence that the roses are okay in "this world".
The best solution then becomes:
–P(sprinklers = F, rain = T, street = wet, lawn = wet, soil = wet, roses = okay) = 0.2646

13 Complexity of BBN Inference
Probabilistic inference using belief networks is NP-hard. [Cooper 1990]
Approximating probabilistic inference in Bayesian belief networks is NP-hard. [Dagum 1993]
Hardness does not mean we cannot solve inference. It implies that:
–We cannot find a general procedure that works efficiently for all networks
–However, for particular families of networks, we can have provably efficient algorithms, either exact or approximate
–Instead of a general exact algorithm, we seek special-case, average-case, and approximate algorithms
–A variety of approximate, heuristic, hybrid, and special-case algorithms should be taken into consideration

14 BBN Inference Algorithms
Exact algorithms
–Pearl's message propagation algorithm (for singly connected networks only)
–Variable elimination
–Cutset conditioning
–Clique tree clustering
–SPI (Symbolic Probabilistic Inference)
Approximate algorithms
–Partial evaluation methods, which perform exact inference partially
–Variational approaches, which exploit averaging phenomena in dense networks (law of large numbers)
–Search-based algorithms, which convert the inference problem to an optimization problem and then use heuristic search to solve it
–Stochastic sampling, also called Monte Carlo algorithms

15 PolyTree
Singly Connected Networks (or Polytrees)
Definition: a directed acyclic graph (DAG) in which at most one undirected path exists between any two nodes.
[Figure: example graphs that do not satisfy the definition, and a polytree structure that satisfies it, with multiple parents and/or multiple children]
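
The definition translates directly into a test on the undirected skeleton: a DAG is singly connected exactly when ignoring arc directions leaves no cycle. A minimal sketch using union-find follows; the function name and the example node labels are hypothetical, and the input is assumed to be a DAG with no duplicate arcs.

```python
def is_polytree(nodes, arcs):
    """Return True if at most one undirected path joins any two nodes.

    nodes: iterable of hashable node names
    arcs:  iterable of (parent, child) pairs of a DAG, without duplicates
    """
    root = {n: n for n in nodes}

    def find(x):
        # Follow links to the component representative, compressing the path.
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for u, v in arcs:
        ru, rv = find(u), find(v)
        if ru == rv:
            # u and v were already connected, so this arc adds a second path.
            return False
        root[ru] = rv
    return True

# Family-Out style structure: two parents feeding one child is still a polytree.
family = [("FO", "LO"), ("FO", "DO"), ("BP", "DO"), ("DO", "HB")]
# Diamond: A reaches D through both B and C, so it is multiply connected.
diamond = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
```

Here `is_polytree(["FO", "LO", "DO", "BP", "HB"], family)` is true and `is_polytree("ABCD", diamond)` is false.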

16 Propagation Algorithm Objective
The algorithm's purpose is "… fusing and propagating the impact of new evidence and beliefs through Bayesian networks so that each proposition eventually will be assigned a certainty measure consistent with the axioms of probability theory." (Pearl, 1988, p 143)
[Figure: network with data (evidence) nodes]

17 PolyTree Propagation Example
"The impact of each new piece of evidence is viewed as a perturbation that propagates through the network via message-passing between neighboring variables..." (Pearl, 1988, p 143)
[Figure: data nodes, with messages to parent and messages from parent]
–Exact algorithm, for polytrees only, linear in the size of the network

18 Cutset Conditioning Algorithm
Transform the network into several simpler polytrees by conditioning on the cutset, then call the polytree propagation algorithm on each. Each simpler network has one or more variables instantiated to a definite value. P(X|E) is computed as a weighted average over the values computed by each polytree. [Pearl 1988]
A cutset is a set of nodes that, when instantiated, renders the network singly connected.
–First exact algorithm for multiply connected networks; time complexity is exponential in the size of the cutset.
–There are exponentially many such cutset instantiations.
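
The weighted average can be stated explicitly. In this sketch $C$ denotes the cutset, $c$ ranges over its instantiations, and $r_i$ is the number of values of cutset variable $C_i$ (notation introduced here for illustration):

```latex
P(x \mid e) \;=\; \sum_{c} P(x \mid c, e)\, P(c \mid e),
\qquad
P(c \mid e) \;\propto\; P(e \mid c)\, P(c)

% Number of polytree problems to solve:
\#\{c\} \;=\; \prod_{C_i \in C} r_i
```

Each term $P(x \mid c, e)$ is one polytree propagation, and the number of terms is the product of the cutset variables' cardinalities, which is where the exponential dependence on cutset size comes from.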

19 Clique Tree Clustering Algorithm
Transform the network into a tree of cliques, then compute probabilities for the cliques during two-way message passing; the individual node probabilities P(X|E) are calculated from the probabilities of the cliques.
A clique W of G is a maximal complete subset of G, that is, there is no other complete subset of G which properly contains W.
–The most commonly used exact inference algorithm for general networks
–Efficient for sparse networks, but can perform very badly on more general, dense networks
–Exact, for multiply connected networks, exponential time complexity in the size of the network
[Figure: example graph over nodes A, B, C, D, E with Clique 1: {A, B, C, D} and Clique 2: {B, D, E}]

20 Clique tree clustering
[Figure: an eight-node example network (A to H) carried through the stages below, yielding six cliques, Clique 1 to Clique 6]
Moralization → Triangulation → Identify Cliques → Form Clique Tree → Message passing → P(Clq_i) and P(X|E)
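
The first stage of the pipeline, moralization, is simple enough to sketch: connect ("marry") all parents of each node, then drop arc directions. The function name and the small example graph below are hypothetical; the later stages (triangulation, clique identification, tree construction) are not shown.

```python
from itertools import combinations

def moralize(parents):
    """Build the moral graph of a DAG.

    parents: dict mapping each node to a list of its parents
    returns: dict mapping each node to the set of its undirected neighbours
    """
    nodes = set(parents)
    for ps in parents.values():
        nodes.update(ps)
    adj = {n: set() for n in nodes}
    for child, ps in parents.items():
        for p in ps:                       # keep every arc, without direction
            adj[child].add(p)
            adj[p].add(child)
        for a, b in combinations(ps, 2):   # marry co-parents of the same child
            adj[a].add(b)
            adj[b].add(a)
    return adj

# Hypothetical DAG: A -> C <- B, C -> D
moral = moralize({"A": [], "B": [], "C": ["A", "B"], "D": ["C"]})
```

In the result, A and B become neighbours although no arc joined them in the DAG, so {A, B, C} forms a clique that can hold the CPT of C.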

21 Variable Elimination Algorithm
General idea:
Write the query as a sum, over the non-query variables, of the product of the network's conditional probabilities.
Iteratively:
–Move all irrelevant terms outside of the innermost sum
–Perform the innermost sum, getting a new term
–Insert the new term into the product
Computation depends on the order of elimination; a good elimination ordering can reduce complexity.
The size of the largest clique in the induced graph is an indicator of the complexity of variable elimination. This quantity is called the induced width of the graph according to the specified ordering.
Finding an ordering that minimizes the induced width is NP-hard.
Exact, for all networks, exponential time complexity, inefficient
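
The three iterative steps can be sketched with a small factor representation. Everything here is an illustrative assumption: variables are binary, a factor is a pair `(variables, table)`, and the example is a hypothetical chain A → B → C with made-up CPT values.

```python
from itertools import product

def multiply(f, g):
    """Pointwise product of two factors over the union of their variables."""
    fv, ft = f
    gv, gt = g
    vars_ = tuple(dict.fromkeys(fv + gv))          # ordered union
    table = {}
    for vals in product((0, 1), repeat=len(vars_)):
        a = dict(zip(vars_, vals))
        table[vals] = ft[tuple(a[v] for v in fv)] * gt[tuple(a[v] for v in gv)]
    return vars_, table

def sum_out(var, f):
    """Perform the innermost sum: marginalize var out of factor f."""
    fv, ft = f
    keep = tuple(v for v in fv if v != var)
    table = {}
    for vals, p in ft.items():
        a = dict(zip(fv, vals))
        key = tuple(a[v] for v in keep)
        table[key] = table.get(key, 0.0) + p
    return keep, table

def eliminate(factors, order):
    """Eliminate variables in the given order; return the remaining factor."""
    factors = list(factors)
    for var in order:
        touching = [f for f in factors if var in f[0]]
        if not touching:
            continue
        # Factors not mentioning var stay outside the sum untouched.
        rest = [f for f in factors if var not in f[0]]
        prod = touching[0]
        for f in touching[1:]:
            prod = multiply(prod, f)
        factors = rest + [sum_out(var, prod)]      # insert the new term
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f)
    return result

# Hypothetical chain A -> B -> C; table keys follow the variable order shown.
f_a = (("A",), {(0,): 0.7, (1,): 0.3})
f_b = (("A", "B"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8})
f_c = (("B", "C"), {(0, 0): 0.6, (0, 1): 0.4, (1, 0): 0.1, (1, 1): 0.9})

p_c = eliminate([f_a, f_b, f_c], order=["A", "B"])   # marginal over C
```

With the order A, B no intermediate factor ever mentions more than two variables; eliminating B first would create a factor over A and C, a small instance of how the ordering drives the cost.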

22 SPI (Symbolic Probabilistic Inference)
General idea: transform the BBN inference problem into a well-defined combinatorial optimization problem, the Optimal Factoring Problem (OFP). The problem thus becomes finding an optimal factoring given a set of probability distributions. The solution of the OFP is then used to combine the CPTs that describe the BBN and extract the desired marginal distribution.
OFP itself is NP-hard.
Exact, for all networks, exponential time complexity, inefficient
[Figure: two factorings of the same expression. Factoring 1 needs 72 multiplications; Factoring 2 needs only 28 multiplications]

23 Approximate Algorithms
Exact inference for large-scale networks is often infeasible. Real-life networks can have up to thousands of nodes.
For example: QMR (Quick Medical Reference) consists of a combination of statistical and expert knowledge for approximately 600 significant diseases and 4000 findings. The median size of the maximal clique of the moralized graph is 151.5 nodes. It is intractable for all exact inference algorithms.
Approximate algorithms can be categorized into:
–Partial evaluation methods, which perform exact inference partially
–Variational approaches, which exploit averaging phenomena in dense networks (law of large numbers)
–Search-based algorithms, which convert the inference problem to an optimization problem and then use heuristic search to solve it
–Stochastic sampling, also called Monte Carlo algorithms

24 Perform Exact Algorithm Partially
General idea: reduce the complexity by reducing the solution space
–Partial sets of node instantiations
–Partial sets of hypotheses
–Partial sets of nodes
Methods:
–"Bounded conditioning" [Cooper 1991]
–"Localized partial evaluation" [Draper 1994]
–"Incremental SPI" [D'Ambrosio 1993]
–"Probabilistic partial evaluation" [Poole 1997]
–"Mini-buckets" algorithm [Dechter 1997]
Approximate, for all networks, complexity not clear

25 Variational Method
General idea: exploit averaging phenomena in dense graphs.
A sum can be avoided if it contains a sufficient number of terms such that a law of large numbers can be invoked.
Graphically, the model is transformed into a sub-graph of the original model in which some of the finding nodes are delinked until it is possible to run an exact algorithm on the resulting graph. [Jaakkola & Jordan 1999]
Approximate, efficient, for dense graphs only

26 Search based algorithms
General idea: convert the problem into an optimization problem, then use heuristic search to solve it.
Consider node instantiations across the entire graph.
Exploit characteristics of the problem domain to help the search.
A general hope is that a relatively small fraction of the exponentially many node instantiations contains a majority of the probability mass, and that by exploring the high-probability instantiations (bounding the unexplored probability mass) one can obtain reasonable bounds on posterior probabilities.
–Cooper 1985, Peng & Reggia 1987, Henrion 1991
–Best-first search (A*), linear programming, genetic algorithm
–Charniak 1994, Santos 1993, Carlos 1993
Approximate, heuristic, may fail

27 Stochastic Sampling Algorithms
General idea: run repeated simulations according to the BBN; the probability of an event of interest is estimated using the frequency with which that event occurs in a set of samples.
–Logic sampling [Henrion 1988]
–Forward sampling
–Backward sampling [Fung 1994]
–Likelihood weighting [Fung & Chang 1990]
–Importance sampling [Shachter 1990]
Approximate, performance depends only on the CPTs, can handle very large networks, but has difficulty with extremely unlikely events.
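
Two of the listed methods can be sketched on a minimal example. The network below is a hypothetical two-node model A → B with assumed CPT values; the query is P(A=1 | B=1). Logic sampling discards samples that contradict the evidence, while likelihood weighting clamps the evidence and weights each sample by the evidence likelihood.

```python
import random

P_A = 0.3                        # P(A = 1), assumed
P_B_GIVEN_A = {1: 0.8, 0: 0.1}   # P(B = 1 | A), assumed

def sample_bernoulli(p, rng):
    """Return 1 with probability p, else 0."""
    return 1 if rng.random() < p else 0

def logic_sampling(n, rng):
    """Forward-sample (A, B); keep only samples with B = 1; count A = 1."""
    kept = hits = 0
    for _ in range(n):
        a = sample_bernoulli(P_A, rng)
        b = sample_bernoulli(P_B_GIVEN_A[a], rng)
        if b == 1:
            kept += 1
            hits += a
    return hits / kept if kept else float("nan")

def likelihood_weighting(n, rng):
    """Sample A from its prior; weight each sample by P(B = 1 | A)."""
    num = den = 0.0
    for _ in range(n):
        a = sample_bernoulli(P_A, rng)
        w = P_B_GIVEN_A[a]       # likelihood of the clamped evidence B = 1
        den += w
        num += w * a
    return num / den

# Exact answer by Bayes' rule, for comparison with the estimates.
EXACT = (P_A * P_B_GIVEN_A[1]) / (P_A * P_B_GIVEN_A[1] + (1 - P_A) * P_B_GIVEN_A[0])
```

If P(B=1) were tiny, logic sampling would reject almost every sample, which illustrates the difficulty with extremely unlikely events; likelihood weighting wastes no samples but can still suffer from highly uneven weights.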

28 Inference Algorithm Conclusions
–The general problem of exact inference is NP-hard.
–The general problem of approximate inference is NP-hard.
–Exact inference works for small, sparse networks only.
–There is no single champion among either exact or approximate inference algorithms.
–The goal of research should be to identify effective approximate techniques that work well on large classes of problems.
–Another direction is the integration of various kinds of approximate and exact algorithms, exploiting the best characteristics of each algorithm.

29 A Distributed Anytime Inference Architecture
"On a Distributed Anytime Architecture for Probabilistic Reasoning", Eugene Santos Jr., Air Force Institute of Technology, 1995

30 Anytime algorithms
To meet the demand for real-time inference, an inference algorithm must have two capabilities:
–Provide a near-optimal solution at any given moment
–Improve upon solutions as more time and resources are allocated
Algorithms which have this property of producing a solution at any point in time are called "anytime" algorithms.

31 Anywhere Algorithms
To exploit parallelism and distributed processing to reduce the time complexity, the tasks in the distributed environment must be able to exploit intermediate results produced by the other components of the system. Algorithms with this property are called "anywhere" algorithms.
When different algorithms having both anytime and anywhere properties are harnessed together into a cooperative system, the resultant architecture can exploit the best characteristics of each algorithm.
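
The anytime and anywhere properties can be sketched as a Python generator. This is an illustrative toy, not the OVERMIND implementation: the search is a random scan over binary assignments, the budget is counted in evaluation steps rather than wall-clock time, and `anytime_search`, `run_with_budget`, `toy_score`, and `WEIGHTS` are hypothetical names.

```python
import itertools
import random

def anytime_search(score, n_vars, rng, seed_solution=None):
    """Yield the best (assignment, score) found so far after every evaluation.

    Anytime:  the caller may stop consuming at any step and keep the incumbent.
    Anywhere: seed_solution lets the search start from another task's result.
    """
    best, best_score = None, float("-inf")
    if seed_solution is not None:
        best, best_score = tuple(seed_solution), score(seed_solution)
        yield best, best_score
    candidates = list(itertools.product((0, 1), repeat=n_vars))
    rng.shuffle(candidates)
    for cand in candidates:
        s = score(cand)
        if s > best_score:
            best, best_score = cand, s
        yield best, best_score

def run_with_budget(search, max_steps):
    """Consume at most max_steps results and return the current incumbent."""
    incumbent = None
    for incumbent in itertools.islice(search, max_steps):
        pass
    return incumbent

WEIGHTS = (0.9, 0.2, 0.7, 0.6)   # assumed per-variable probabilities of value 1

def toy_score(x):
    """Probability of assignment x under independent variables (toy objective)."""
    p = 1.0
    for w, bit in zip(WEIGHTS, x):
        p *= w if bit else 1.0 - w
    return p
```

A call such as `run_with_budget(anytime_search(toy_score, 4, random.Random(1)), 5)` returns whatever incumbent exists after five evaluations; a larger budget can only return an equal or better one, and passing `seed_solution` starts the search from a partner algorithm's answer.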

32 The OVERMIND Architecture
Part of PESKI, an online expert system for engine diagnosis for the Space Shuttle Program.
Three components:
–IRA (Intelligent Resource Allocator): manages and allocates available computing resources
–OVERSEER (Overseer Task Manager): initiates new tasks, directs messages/information
–LOTS (Library of Tasks): a set of BBN inference algorithms suitable for performing various tasks, including an A* search algorithm, a genetic algorithm, an integer linear programming algorithm, and a hybrid stochastic algorithm (HySS)

33 General Idea
The best algorithm to use is problem-instance dependent.
In a set of anywhere algorithms, if each particular algorithm is good at a certain portion of a problem, we can take the partial solution of one algorithm and pass it to another approach which works better on the new portion.
This leads to an anytime, anywhere solution.

34 Genetic Algorithms
–A heuristic search algorithm modeled after natural genetic evolution
–Has the anytime and anywhere properties
–No stopping criterion that guarantees an optimal answer
–Its ability to generate solutions early can serve as a starting point for other, deterministic algorithms where possible

35 Best-First Search (A*)
–A heuristic algorithm searching for an optimal solution from an initial state
–Provides an approximate answer when interrupted
–Allows the algorithm to accept an initial guess from other sources
–Uses best-first search to find the most probable complete instantiation among those compatible with the guess
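
The core of best-first search for the most probable instantiation can be sketched as follows. This is a simplified illustration, not the algorithm from Santos' paper: variables are assigned in topological order, a partial assignment is ranked by the product of its CPT entries (an upper bound on any completion, since the remaining factors are at most 1), and the hypothetical chain A → B → C uses assumed numbers. Accepting an initial guess from another task is not shown.

```python
import heapq

ORDER = ["A", "B", "C"]                       # topological order, hypothetical
PARENTS = {"A": [], "B": ["A"], "C": ["B"]}
# P(var = 1 | parent values); assumed values.
P_TRUE = {
    "A": {(): 0.3},
    "B": {(0,): 0.1, (1,): 0.8},
    "C": {(0,): 0.4, (1,): 0.9},
}

def cond_prob(var, value, assignment):
    """CPT entry P(var = value | parents as set in assignment)."""
    p1 = P_TRUE[var][tuple(assignment[q] for q in PARENTS[var])]
    return p1 if value == 1 else 1.0 - p1

def best_first_mpe(evidence):
    """Return (assignment, probability) of the most probable explanation."""
    # Heap entries: (negated probability of the partial assignment, values so far).
    heap = [(-1.0, ())]
    while heap:
        neg_p, values = heapq.heappop(heap)
        if len(values) == len(ORDER):
            # First complete assignment popped: nothing on the heap can beat it.
            return dict(zip(ORDER, values)), -neg_p
        var = ORDER[len(values)]
        assignment = dict(zip(ORDER, values))
        choices = [evidence[var]] if var in evidence else [0, 1]
        for v in choices:
            p = -neg_p * cond_prob(var, v, assignment)
            heapq.heappush(heap, (-p, values + (v,)))
    return None, 0.0
```

Because the ranking never underestimates what a partial assignment can achieve, the first complete assignment removed from the queue is optimal; if interrupted earlier, the most probable entry in the queue serves as an approximate answer.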

36 IRA (Intelligent Resource Allocator)
Serves to maximize processor use by coordinating requests for resources from OVERSEER and the tasks themselves.
Hardware: a network of workstations
Identifies resource requirements for different tasks:
–GA: single CPU
–ILP: multiprocessing

37 The OVERSEER (Task manager)
Currently plays a simple messenger role.
Advanced capabilities involve deliberation scheduling: employing meta-reasoning to consider which computational tasks to execute. To do this, some estimate of runtime and quality of results should be available for each algorithm.

38 Implementation and results
The strengths of different methods are combined:
–GAs produced reasonable solutions immediately
–A* took those solutions near some maxima
–HySS fine-tuned those maxima
–ILP finished the optimization and generated the optimal solution
Results:
–Initial test: multiple instances of GAs
–GAs: 20% speed-up
–HySS: 3%~5% speed-up
–A* and ILP: 15~25% speed-up

39 Summary
Exploited the anytime and anywhere properties of several inference algorithms such as GAs, ILP, and A*, and unified them into a single model of parallel computation.
The architecture can use the best characteristics of each algorithm.

40 Future Research
–Consider more algorithms
–Study the relationship between the problem domain and the corresponding solution domain to help deliberation scheduling

41 The End. Any Questions?

42 Linear Programming
The problem of finding the most probable explanation is transformed into an integer linear programming problem with a set of constraints to be satisfied.
Efficient algorithms for linear programming can then be used to compute the optimal solution.

