
Slide 1: Logical Bayesian Networks - A Knowledge Representation View on Probabilistic Logical Models
Daan Fierens, Hendrik Blockeel, Jan Ramon, Maurice Bruynooghe
Katholieke Universiteit Leuven, Belgium

Slide 2: Probabilistic Logical Models
- There is a variety of PLMs:
  - Origin in Bayesian networks (knowledge-based model construction):
    - Probabilistic Relational Models (the best known)
    - Bayesian Logic Programs (the most developed)
    - CLP(BN)
    - ...
  - Origin in logic programming:
    - PRISM
    - Stochastic Logic Programs
    - ...
- This talk focuses on the first family (with learning in mind).

Slide 3: Combining PRMs and BLPs
- PRMs: + easy to understand, intuitive; - somewhat restricted (compared to BLPs).
- BLPs: + more general, expressive; - not always intuitive.
- Can the strengths of both models be combined in one model?
- We propose Logical Bayesian Networks (PRMs + BLPs).

Slide 4: Overview of this Talk
- Example
- Probabilistic Relational Models
- Bayesian Logic Programs
- Combining PRMs and BLPs: why and how?
- Logical Bayesian Networks

Slide 5: Example [after Koller et al.]
- University domain: students (with an IQ) and courses (with a rating); students take courses (receiving a grade).
  - A student's grade depends on the student's IQ.
  - A course's rating depends on the sum of the IQs of the students taking it.
- Specific situation: jeff takes ai, pete and rick take lp, no student takes db.

Slide 6: Bayesian Network Structure
[Figure: Bayesian network over the nodes iq(jeff), iq(pete), iq(rick), rating(ai), rating(lp), rating(db), grade(jeff,ai), grade(pete,lp), grade(rick,lp); each grade node has the corresponding student's iq as parent, and each rating node has the iq nodes of the students taking that course as parents.]
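To make this network concrete, here is a minimal Python sketch (not part of the original deck; all names are illustrative) that derives exactly this parent structure from the facts of slide 5:

    takes = [("jeff", "ai"), ("pete", "lp"), ("rick", "lp")]
    students = ["jeff", "pete", "rick"]
    courses = ["ai", "lp", "db"]

    parents = {}  # maps each node of the network to its list of parents

    for s in students:
        parents[f"iq({s})"] = []  # iq nodes have no parents
    for c in courses:
        # rating(c) depends on the iq of every student taking c
        parents[f"rating({c})"] = [f"iq({s})" for s, c2 in takes if c2 == c]
    for s, c in takes:
        # grade(s,c) depends only on the student's iq
        parents[f"grade({s},{c})"] = [f"iq({s})"]

    for node in sorted(parents):
        print(node, "<-", parents[node])

Note that rating(db) ends up with no parents, since no student takes db.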

Slide 7: PRMs [Koller et al.]
- A PRM consists of a relational schema and a dependency structure (plus aggregates and CPDs).
[Figure: schema with classes Student(key, iq), Course(key, rating) and Takes(key, student, course, grade); Takes.grade depends on Student.iq (with a CPT), and Course.rating depends on Student.iq through an aggregate (aggregate + CPT).]

Slide 8: PRMs (2)
- Semantics: a PRM induces a Bayesian network on the relational skeleton.
[Figure: skeleton tables with unknown attribute values. Student: jeff, pete, rick (iq = ?). Course: ai, lp, db (rating = ?). Takes: f1 = (jeff, ai), f2 = (pete, lp), f3 = (rick, lp) (grade = ?).]

Slide 9: PRMs - BN Structure (3)
[Figure: the induced Bayesian network; it is the same network as on slide 6.]

Slide 10: PRMs: Pros & Cons (4)
+ Easy to understand and interpret.
- Limited expressiveness compared to BLPs, etc.:
  - It is not possible to combine selection and aggregation [Blockeel & Bruynooghe, SRL workshop '03]. E.g., with an extra attribute sex for students, one cannot express that rating depends on the sum of the IQs of the female students only.
  - How to specify logical background knowledge? (no functors, constants)

Slide 11: BLPs [Kersting, De Raedt]
- BLPs combine definite logic programs with Bayesian networks:
  - Bayesian predicates (each with a range), e.g. range {low, high}.
  - A random variable is a ground Bayesian atom, e.g. iq(jeff).
  - A BLP is a set of clauses, each with a CPT, e.g.:
      rating(C) | iq(S), takes(S,C).
  - Each predicate also has a combining rule (which can be anything).
- Semantics: a BLP induces a Bayesian network; its random variables are the ground atoms in the least Herbrand model, and its dependencies follow from the grounding of the BLP.

Slide 12: BLPs (2)
Example BLP:
  student(pete). ... course(lp). ... takes(rick,lp).
  rating(C) | iq(S), takes(S,C).
  rating(C) | course(C).
  grade(S,C) | iq(S), takes(S,C).
  iq(S) | student(S).
- BLPs do not distinguish probabilistic from logical/certain/structural knowledge:
  - This affects the readability of the clauses.
  - What about the resulting Bayesian network? (see the sketch below)
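As an illustration of where the extra nodes come from, the following hypothetical Python sketch (not from the deck) grounds one BLP clause against the least Herbrand model:

    # Grounding the BLP clause:
    #   grade(S,C) | iq(S), takes(S,C).
    # In a BLP every ground atom of the least Herbrand model is a random
    # variable, so logical atoms such as takes(jeff,ai) appear as
    # (true/false) parents alongside the probabilistic ones.
    takes = [("jeff", "ai"), ("pete", "lp"), ("rick", "lp")]
    for s, c in takes:
        print(f"grade({s},{c}) | iq({s}), takes({s},{c})")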

Slide 13: BLPs - BN Structure (3)
Fragment of the induced network:
[Figure: student(jeff) is a parent of iq(jeff); iq(jeff) and takes(jeff,ai) are parents of grade(jeff,ai). CPD of iq(jeff) given student(jeff): if true, the distribution for iq/1; if false, ? (undefined).]

Slide 14: BLPs - BN Structure (3)
Fragment of the induced network:
[Figure: CPD of grade(jeff,ai) given takes(jeff,ai): if true, a distribution for grade/2 that is a function of iq(jeff); if false, ? (undefined).]

Slide 15: BLPs: Pros & Cons (4)
+ High expressiveness:
  - definite logic programs (functors, ...)
  - selection and aggregation can be combined (via combining rules)
- Not always easy to interpret:
  - the clauses
  - the resulting Bayesian network

Slide 16: Combining PRMs and BLPs
- Why? One model = intuitive + high expressiveness.
- How?
  - Expressiveness (from BLPs): logic programming.
  - Intuitiveness (from PRMs):
    - distinguish probabilistic from logical/certain knowledge
    - distinct components (in PRMs, the schema determines the random variables and the dependency structure)
    - general vs specific knowledge

Slide 17: Logical Bayesian Networks
- Probabilistic predicates (which define the random variables and their range) vs logical predicates.
- LBN components, mirroring the PRM components:
  - relational schema -> V
  - dependency structure -> DE
  - CPDs + aggregates -> DI
  - relational skeleton -> a logic program Pl (a description of the domain of discourse / the deterministic information)

Slide 18: Logical Bayesian Networks
- Semantics: an LBN induces a Bayesian network on the random variables determined by Pl and V.

Slide 19: Normal Logic Program Pl
  student(jeff).  course(ai).  takes(jeff,ai).
  student(pete).  course(lp).  takes(pete,lp).
  student(rick).  course(db).  takes(rick,lp).
- Semantics: the well-founded model WFM(Pl) (when there is no negation, this is the least Herbrand model).
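Pl above contains only facts, so its model is simply the set of those facts. For a definite program with rules, the least Herbrand model can be computed by iterating the immediate-consequence operator to a fixpoint, as in this minimal sketch (assuming ground rules; all names are illustrative, not the paper's):

    def least_model(facts, rules):
        # rules: list of (head, [body atoms]), all atoms ground
        model = set(facts)
        changed = True
        while changed:
            changed = False
            for head, body in rules:
                if head not in model and all(b in model for b in body):
                    model.add(head)
                    changed = True
        return model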

Slide 20: V
  iq(S) <= student(S).
  rating(C) <= course(C).
  grade(S,C) <= takes(S,C).
- Semantics: V determines the random variables: each ground probabilistic atom in WFM(Pl ∪ V) is a random variable: iq(jeff), ..., rating(lp), ..., grade(rick,lp).
- Non-monotonic negation is allowed (it is not available in PRMs or BLPs), e.g.:
  grade(S,C) <= takes(S,C), not(absent(S,C)).
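A minimal Python sketch (hypothetical representation, not the paper's formalism) of how V determines the random variables in this negation-free example: each V rule declares one probabilistic atom per tuple satisfying its body.

    facts = {("student", ("jeff",)), ("student", ("pete",)), ("student", ("rick",)),
             ("course", ("ai",)), ("course", ("lp",)), ("course", ("db",)),
             ("takes", ("jeff", "ai")), ("takes", ("pete", "lp")), ("takes", ("rick", "lp"))}

    # each V rule: (probabilistic predicate, logical predicate in its body)
    v_rules = [("iq", "student"), ("rating", "course"), ("grade", "takes")]

    random_vars = sorted((p, args) for p, body in v_rules
                         for pred, args in facts if pred == body)
    for p, args in random_vars:
        print(f"{p}({','.join(args)})")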

Slide 21: DE
  grade(S,C) | iq(S).
  rating(C) | iq(S) <- takes(S,C).
- Semantics: DE determines the conditional dependencies: the ground instances whose context holds in WFM(Pl), e.g.:
  rating(lp) | iq(pete) <- takes(pete,lp).
  rating(lp) | iq(rick) <- takes(rick,lp).
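The grounding step can again be sketched in a few lines of Python (illustrative, hypothetical names): the context takes(S,C) is evaluated against the facts, and each satisfying substitution yields one dependency.

    # Grounding the DE clause:
    #   rating(C) | iq(S) <- takes(S,C).
    # The context takes(S,C) is checked against WFM(Pl), here the facts.
    takes = [("jeff", "ai"), ("pete", "lp"), ("rick", "lp")]
    for s, c in takes:
        print(f"rating({c}) | iq({s})   <- takes({s},{c})")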

Slide 22: V + DE
  iq(S) <= student(S).
  rating(C) <= course(C).
  grade(S,C) <= takes(S,C).

  grade(S,C) | iq(S).
  rating(C) | iq(S) <- takes(S,C).

Slide 23: LBNs - BN Structure
[Figure: the induced Bayesian network; it is again the same network as on slide 6.]

Slide 24: DI
- DI is the quantitative component (it plays the role of aggregates + CPDs in PRMs, and of CPDs + combining rules in BLPs).
- For each probabilistic predicate p, DI contains one logical CPD: a function that takes a set of pairs (ground probabilistic atom, value) as input and returns a probability distribution over the range of p.
- Semantics: the logical CPD determines the CPDs for all variables of predicate p.

Slide 25: DI (2)
- E.g., the logical CPD for rating/1 (its inputs are atoms of iq/1):
    If (SUM of Val over the pairs (iq(S), Val)) > 1000
    Then 0.7 high / 0.3 low
    Else 0.5 high / 0.5 low
- This can be written as a logical probability tree (TILDE):
    sum(Val, iq(S,Val), Sum), Sum > 1000 ?
      yes: 0.7 / 0.3
      no:  0.5 / 0.5
  cf. [Van Assche et al., SRL workshop '04]
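As a concrete reading of this logical CPD, here is a minimal Python sketch (an illustration under assumed names, not the paper's formalism): the input is the set of (atom, value) pairs about iq/1, the output a distribution over {high, low}.

    def rating_cpd(iq_values):
        # iq_values: dict from ground iq atoms to their values,
        # e.g. {"iq(pete)": 100, "iq(rick)": 120}
        if sum(iq_values.values()) > 1000:
            return {"high": 0.7, "low": 0.3}
        return {"high": 0.5, "low": 0.5}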

Slide 26: DI (3)
- DI determines the CPDs. E.g., the CPD for rating(lp) is a function of iq(pete) and iq(rick).
- What is the entry of this CPD for iq(pete)=100 and iq(rick)=120? Apply the logical CPD for rating/1 (above) to {(iq(pete),100), (iq(rick),120)}: the sum is 220, which is not > 1000, so the result is the distribution 0.5 high / 0.5 low.
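Using rating_cpd from the sketch above, this slide's computation becomes:

    # entry of the CPD for rating(lp) at iq(pete)=100, iq(rick)=120
    print(rating_cpd({"iq(pete)": 100, "iq(rick)": 120}))
    # -> {'high': 0.5, 'low': 0.5}   (sum 220 is not > 1000)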

Slide 27: DI (4)
- Can selection and aggregation be combined? E.g., rating depends on the sum of the IQs of the female students only:
    sum(Val, (iq(S,Val), sex(S,fem)), Sum), Sum > 1000 ?
      yes: 0.7 / 0.3
      no:  0.5 / 0.5
  again cf. [Van Assche et al., SRL workshop '04]
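In the same illustrative style, the selection becomes a filter inside the aggregation (hypothetical sketch; the sex table is an assumed extra input):

    def rating_cpd_female(iq_values, sex):
        # iq_values: {"pete": 100, ...}; sex: {"pete": "fem", ...}
        total = sum(v for s, v in iq_values.items() if sex.get(s) == "fem")
        if total > 1000:
            return {"high": 0.7, "low": 0.3}
        return {"high": 0.5, "low": 0.5}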

Slide 28: LBNs: Pros & Cons / Conclusion
+ The qualitative part (V + DE) is easy to interpret.
+ High expressiveness:
  - normal logic programs (non-monotonic negation, functors, ...)
  - selection and aggregation can be combined
- This comes at a cost: the quantitative part (DI) is more difficult than for PRMs.

Slide 29: Future Work: Learning LBNs
- Learning algorithms exist for PRMs and BLPs; at a high level, an appropriate mix of the two will probably do for LBNs.
- LBNs vs PRMs: learning the quantitative component is more difficult for LBNs.
- LBNs vs BLPs: LBNs have the separation between V and DE.
- The LBN distinction between probabilistic and logical predicates is a bias (but one that is also used by BLPs in practice).

Slide 30: Questions?

