STATISTICAL RELATIONAL LEARNING
Joint work with Sriraam Natarajan, Kristian Kersting, Jude Shavlik.

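BAYESIAN NETWORKS
Nodes: Burglary, Earthquake, Alarm, JohnCalls, MaryCalls
Priors: P(e) = 0.01; P(b) = (value missing in source)
Conditional probability table for Alarm:
e  b  P(a | e, b)
0  0  0.1
0  1  0.8
1  0  0.6
1  1  0.9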
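The Alarm table on this slide can be read as a lookup keyed by (earthquake, burglary). A minimal Python sketch, assuming the table columns are e, b and P(Alarm = true | e, b), and that 0.01 is the prior P(Earthquake = true). The function name `p_alarm_given_burglary` is illustrative and does not come from the slides:

```python
# CPT from the slide: (earthquake, burglary) -> P(alarm = true | e, b).
# Assumption: the column order on the slide is e, b, then the probability.
P_ALARM = {
    (0, 0): 0.1,
    (0, 1): 0.8,
    (1, 0): 0.6,
    (1, 1): 0.9,
}
P_EARTHQUAKE = 0.01  # prior P(earthquake = true), as given on the slide


def p_alarm_given_burglary(burglary: int) -> float:
    """Marginalise out Earthquake: sum over e of P(e) * P(alarm | e, burglary)."""
    total = 0.0
    for e in (0, 1):
        p_e = P_EARTHQUAKE if e == 1 else 1.0 - P_EARTHQUAKE
        total += p_e * P_ALARM[(e, burglary)]
    return total


print(round(p_alarm_given_burglary(1), 3))  # 0.801
print(round(p_alarm_given_burglary(0), 3))  # 0.105
```

The burglary prior is not needed here because the query conditions on Burglary.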

BAYESIAN NETWORK FOR A CITY
[Figure: the Burglary, Earthquake, Alarm network is repeated once per house, H1 to H5. Each copy has its own Calls nodes for the neighbouring houses, from Calls(H1) to Calls(H6).]

SHARED VARIABLES
[Figure: a single network for the city. Nodes: Earthquake(BL); Burglary(H1) to Burglary(H4); Alarm(H1) to Alarm(H4); Calls(H1) to Calls(H5).]

FIRST ORDER LOGIC
Predicates: Burglary(house), Earthquake(city), Alarm(house), Calls(nhouse), HouseInCity(house, city), Neighbor(house, nhouse)
Rule: Alarm(house) :- HouseInCity(house, city), Earthquake(city), Burglary(house)
Conditional probability table attached to the rule:
e  b  P(a | e, b)
0  0  0.1
0  1  0.8
1  0  0.6
1  1  0.9
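One way to see how a single first-order rule stands in for many propositional networks is to ground it over a set of facts. The sketch below is an illustration only: the house names H1 to H4 and the city BL are taken from the Shared Variables slide, while `house_in_city` and `ground_alarm_rule` are hypothetical names introduced here:

```python
# Assumed HouseInCity(house, city) facts; names borrowed from the earlier slide.
house_in_city = [("H1", "BL"), ("H2", "BL"), ("H3", "BL"), ("H4", "BL")]


def ground_alarm_rule(facts):
    """Instantiate
        Alarm(house) :- HouseInCity(house, city), Earthquake(city), Burglary(house)
    once per HouseInCity fact, returning (child, parents) pairs."""
    groundings = []
    for house, city in facts:
        child = f"Alarm({house})"
        parents = (f"Earthquake({city})", f"Burglary({house})")
        groundings.append((child, parents))
    return groundings


network = ground_alarm_rule(house_in_city)
for child, parents in network:
    print(child, "<-", ", ".join(parents))

# Every ground Alarm node shares the same Earthquake parent.
shared = {parents[0] for _, parents in network}
print(shared)
```

Each ground Alarm node would reuse the same conditional probability table, which is what makes the first-order representation compact.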

LOGIC + PROBABILITY = STATISTICAL RELATIONAL LEARNING MODELS
[Figure: starting from Logic, add probabilities; starting from Probabilities, add relations. Both paths lead to Statistical Relational Learning (SRL). Example nodes: PRating, CRating, Diff.]

ALPHABETIC SOUP
- Knowledge-based model construction [Wellman et al., 1992]
- PRISM [Sato & Kameya, 1997]
- Stochastic logic programs [Muggleton, 1996]
- Probabilistic relational models [Friedman et al., 1999]
- Bayesian logic programs [Kersting & De Raedt, 2001]
- Bayesian logic [Milch et al., 2005]
- Markov logic [Richardson & Domingos, 2006]
- Relational dependency networks [Neville & Jensen, 2007]
- ProbLog [De Raedt et al., 2007]
And many others!

RELATIONAL DATABASE
Tables (columns):
- Prof, Level
- Prof, Course, Rating
- Course, Diff
- Student, Course, Grade
- Student, IQ, Satisfaction

FIRST ORDER LOGIC
Predicates: Prof(P), Level(P,L), Course(C), Diff(C), taughtBy(P,C), ratings(P,C,R), Student(S), IQ(S,I), satis(S,B), takes(S,C), grade(S,C,G)
These correspond to the database tables (Prof, Level), (Prof, Course, Rating), (Course, Diff), (Student, Course, Grade) and (Student, IQ, Satisfaction).

GRAPHICAL MODEL
[Figure: nodes Diff(S, C, D), grades(S, C, G), avgDiff(S, D), avgGrade(S, G) and satisfaction(S, B).]
P(satisfaction(S, B) | avgGrade(S, G), avgDiff(S, D))
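The aggregates avgGrade and avgDiff turn a variable number of related facts per student into fixed-size parents for satisfaction. A minimal sketch of that aggregation, using made-up facts; the data values and the helper names `avg_grade` and `avg_diff` are assumptions for illustration:

```python
from statistics import mean

# Hypothetical ground facts (values invented for illustration).
grades = [("s1", "c1", 4.0), ("s1", "c2", 3.0), ("s2", "c1", 2.0)]  # (S, C, G)
diff = {"c1": 2.0, "c2": 4.0}  # course -> difficulty


def avg_grade(student: str) -> float:
    """avgGrade(S, G): mean grade over all courses the student has a grade in."""
    return mean(g for s, _, g in grades if s == student)


def avg_diff(student: str) -> float:
    """avgDiff(S, D): mean difficulty of the courses the student has a grade in."""
    return mean(diff[c] for s, c, _ in grades if s == student)


for s in ("s1", "s2"):
    print(s, avg_grade(s), avg_diff(s))
```

With the aggregates in place, the conditional distribution for satisfaction has two parents no matter how many courses a student takes.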

RELATIONAL DECISION TREE
Tests in the tree, each with "no" and "yes" branches: speed(X,S), S>120; job(X, politician); knows(X,Y); job(Y, politician). Leaves are labelled N or Y.

Name   Speed  Job         Fine
Bob    120    Teacher     N
Alice  150    Writer      N
John   180    Politician  N
Mary   160    Student     Y
Mike   140    Engineer    Y

Person1  Person2
Alice    John
Mary     Mike
Mary     Alice
Bob      Mike
Bob      Mary

RELATIONAL DECISION TREE (same tables and tree, evaluated for Alice one test at a time)
1. speed(Alice,150), 150>120
2. job(Alice, politician)
3. knows(Alice,John)
4. job(John, politician)
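The walk through the tree can be sketched in Python. The tables are copied from the slides; the mapping from branches to leaves is an assumption (the figure did not survive), chosen because it reproduces the Fine column. The function name `fined` is hypothetical:

```python
# Tables from the slides.
people = {
    "Bob": {"speed": 120, "job": "teacher"},
    "Alice": {"speed": 150, "job": "writer"},
    "John": {"speed": 180, "job": "politician"},
    "Mary": {"speed": 160, "job": "student"},
    "Mike": {"speed": 140, "job": "engineer"},
}
knows = [("Alice", "John"), ("Mary", "Mike"), ("Mary", "Alice"),
         ("Bob", "Mike"), ("Bob", "Mary")]


def fined(x: str) -> str:
    """Assumed tree: fined only if speeding, not a politician,
    and not acquainted with a politician."""
    # Test 1: speed(X, S), S > 120
    if not people[x]["speed"] > 120:
        return "N"
    # Test 2: job(X, politician)
    if people[x]["job"] == "politician":
        return "N"
    # Tests 3 and 4: knows(X, Y), job(Y, politician) for some Y
    if any(people[y]["job"] == "politician" for p, y in knows if p == x):
        return "N"
    return "Y"


predictions = {name: fined(name) for name in people}
print(predictions)
```

The existential test on knows(X,Y) is what makes the tree relational: the outcome for X depends on an attribute of a related object Y, not only on X's own row.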

RELATIONAL PROBABILITY TREES
- Use probabilities on the leaves
- Can be used to represent the conditional distributions
- Can use regression values on the leaves to represent regression functions
Tests in the tree, each with "no" and "yes" branches: speed(X,S), S>120; job(X, politician); knows(X,Y), job(Y, politician). Leaf values: 0.1, 0.2, 0.4, 0.8.

STRUCTURE LEARNING PROBLEM
- Learn the structure of the conditional distributions
- Find the parents and the distribution for the target concept
[Figure: target satisfaction(S, B); candidate parents avgGrade(S, G), avgDiff(S, D), IQ(S, I), level(P, L).]

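RELATIONAL TREE LEARNING
Examples with regression values (X, Δ): x1 0.7; x2 -0.2; x3 -0.9
Facts: paper(X, Y): (x1, y1), (x1, y2), (x3, y1). student(X): x1, x2.
Candidate tests: paper(X, Y), student(X), adviser(X)
Tree shown on the slide:
- student(X) = T: examples x1 (0.7) and x2 (-0.2), average 0.25
  - paper(X,Y) = T: x1, value 0.7
  - paper(X,Y) = F: x2, value -0.2
- student(X) = F: x3, value -0.9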
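The choice of student(X) as the root can be checked with a small sketch. It assumes candidate tests are scored by the sum of squared errors around the branch means, which is a common regression-tree criterion and not necessarily the exact one used by the authors. adviser(X) is left out because the slide lists no facts for it:

```python
from statistics import mean

gradients = {"x1": 0.7, "x2": -0.2, "x3": -0.9}      # example -> regression value
student = {"x1", "x2"}                                # student(X) facts
paper = {("x1", "y1"), ("x1", "y2"), ("x3", "y1")}    # paper(X, Y) facts

# A test succeeds for X if some grounding of it is true.
tests = {
    "student(X)": lambda x: x in student,
    "paper(X,Y)": lambda x: any(a == x for a, _ in paper),
}


def sse(values):
    """Sum of squared errors around the mean (0 for an empty branch)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def score(examples, test):
    """Total squared error after splitting the examples on the test."""
    yes = [gradients[x] for x in examples if test(x)]
    no = [gradients[x] for x in examples if not test(x)]
    return sse(yes) + sse(no)


examples = list(gradients)
best = min(tests, key=lambda name: score(examples, tests[name]))
yes_mean = mean(gradients[x] for x in examples if tests[best](x))
print(best, round(yes_mean, 2))
```

Under this criterion student(X) scores about 0.405 against about 1.28 for paper(X,Y), so it is picked first, and the mean of its "true" branch is the 0.25 shown on the slide.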

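FUNCTIONAL GRADIENT BOOSTING
Sequentially learn models where each subsequent model corrects the previous model.
[Figure: the initial model produces predictions; Data - Predictions = Residues; a new model is induced from the residues and added to the current one; iterate. The final model is the sum of the induced models, up to ψm.]
Natarajan et al., MLJ'12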

BOOSTING ALGORITHM
For each gradient step m = 1 to M
    For each query predicate, P
        Generate trainset using previous model, F_{m-1}
            For each example, x
                Compute gradient for x
                Add to trainset
        Learn a regression function, T_{m,p}
        Add T_{m,p} to the model, F_m
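The loop above can be sketched in a simplified, propositional form. The assumptions are significant: a single query predicate, one numeric feature, regression stumps in place of relational regression trees, and the gradient I(y = 1) - P(y = 1 | x) for a sigmoid model. All names (`fit_stump`, `boost`, the toy data) are illustrative:

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(xs, targets):
    """Least-squares regression stump on one numeric feature.
    Assumes xs contains at least two distinct values."""
    best = None
    for threshold in sorted(set(xs)):
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or err < best[0]:
            best = (err, threshold, lm, rm)
    _, threshold, lm, rm = best
    return lambda x: lm if x <= threshold else rm


def boost(xs, ys, steps=20):
    trees = []  # the model F_m is the sum of all trees learned so far

    def F(x):
        return sum(tree(x) for tree in trees)

    for _ in range(steps):
        # Generate the trainset from the previous model F_{m-1}:
        # the gradient for x is I(y = 1) - P(y = 1 | x).
        gradients = [y - sigmoid(F(x)) for x, y in zip(xs, ys)]
        # Learn a regression function T_m on the gradients and add it to F_m.
        trees.append(fit_stump(xs, gradients))
    return F


xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
model = boost(xs, ys)
print(round(sigmoid(model(2)), 3), round(sigmoid(model(5)), 3))
```

Each new stump is fitted to what the current model still gets wrong, so the summed model moves toward the labels one small step at a time.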

UW-CSE
- Predict the advisedBy relation
- Given student, professor, courseTA, courseProf, etc. relations
- 5-fold cross validation

Method    AUC-ROC  AUC-PR  Likelihood  Training time
Boosting  0.96     0.93    0.81        9 s
RDN       0.88     0.78    0.80        1 s
Alchemy   0.53     0.62    0.73        93 hrs

http://pages.cs.wisc.edu/~tushar/rdnboost/index.html

CARDIA
- Family history, medical history, physical activity, nutrient intake, obesity questions, psychosocial, pulmonary function, etc.
- Goal is to identify risk factors in early adulthood that cause serious cardiovascular issues in older adults
- Extremely rich dataset with 25 years of information
S. Natarajan, J. Carr

RESULTS

IMITATION LEARNING
- Expert agent performs actions (trajectories)
- Goal: learn a policy from these trajectories to suggest actions based on the current state
Natarajan et al., IJCAI'11
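To make the goal concrete, here is a deliberately simple stand-in: a tabular policy that picks, for each state, the action the expert chose most often. This is not the relational method of the cited paper; the trajectories, state names and `learn_policy` are hypothetical:

```python
from collections import Counter, defaultdict

# Hypothetical expert trajectories: lists of (state, action) pairs.
trajectories = [
    [("start", "right"), ("middle", "right"), ("near_goal", "up")],
    [("start", "right"), ("middle", "up"), ("near_goal", "up")],
    [("start", "right"), ("middle", "right"), ("near_goal", "up")],
]


def learn_policy(trajs):
    """Map each observed state to the expert's most frequent action there."""
    counts = defaultdict(Counter)
    for traj in trajs:
        for state, action in traj:
            counts[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


policy = learn_policy(trajectories)
print(policy)
```

A table like this cannot generalise to unseen states, which is one motivation for learning the policy as a relational model over state features instead.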

[Figures: Gridworld domain; Robocup domain]

ALZHEIMER'S RESEARCH
- AD: progressive neurodegenerative condition resulting in loss of cognitive abilities and memory
- MRI: neuroimaging method for visualization of brain anatomy
- Humans are not very good at identifying people with AD, especially before cognitive decline
- MRI data: major source for distinguishing AD vs CN (cognitively normal) or MCI vs CN
Natarajan et al., under review

PROPOSITIONAL MODELS (WITH AAL)

CONCLUSION
- Statistical Relational Learning combines first-order logic with probabilistic models
- Relational trees are used to represent conditional distributions
- Boosting trees can be used to efficiently learn the structure of SRL models
