CSE 574 – Artificial Intelligence II Statistical Relational Learning Instructor: Pedro Domingos.

Logistics
Instructor: Pedro Domingos
Office: 648 Allen Center
Office hours: Wednesdays 4:30-5:30
TA: Stanley Kok
Office: 216 Allen Center
Office hours: Mondays 4:30-5:30
Web:
Mailing list: cse574

Evaluation
Seminar (Pass/Fail)
Project (100% of grade)
–Proposals due April 8
–Progress report due May 6
–Presentation in class
–Final report due June 3

Materials
L. Getoor & B. Taskar (eds.), Statistical Relational Learning, MIT Press (to appear).
–Draft chapters
–Feedback for authors
Papers

Topics
Background
SRL approaches
SRL problems and applications

Background
Statistical learning
Inductive logic programming
Sequential and spatial models

SRL Approaches
Probabilistic relational models
Stochastic logic programs
Bayesian logic programs
Relational Markov networks
Markov logic networks
Etc.

SRL Problems and Applications
Aggregation
Autocorrelation
Information extraction and NLP
Biology and medicine
Relational reinforcement learning
Etc.

Today: Introduction
Motivation
–The AI view
–The data mining view
–The statistical view
–The computer science view
Applications
Major problem types
A map of the field

The AI View
Propositional Logic
Probability
First-Order Logic
Statistical Relational AI

The Data Mining View
Most databases contain multiple tables
Data mining algorithms assume one table
Manual conversion: slow, costly bottleneck
Important patterns may be missed
Solution: Multi-relational data mining
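
A minimal sketch of the manual conversion this slide describes, assuming a hypothetical two-table database (`customers`, `orders`) that is not part of the course material. The `flatten` helper and all field names are illustrative. It joins the tables and aggregates to one row per customer, which is the single-table form most data mining algorithms expect.

```python
from collections import defaultdict

# Hypothetical two-table database (illustrative data, not from the slides).
customers = [
    {"customer_id": 1, "region": "west"},
    {"customer_id": 2, "region": "east"},
    {"customer_id": 3, "region": "west"},
]
orders = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.0},
    {"customer_id": 2, "amount": 5.0},
]


def flatten(customers, orders):
    """Join orders onto customers and aggregate to one row per customer."""
    amounts = defaultdict(list)
    for order in orders:
        amounts[order["customer_id"]].append(order["amount"])

    rows = []
    for customer in customers:
        spent = amounts[customer["customer_id"]]
        rows.append({
            "customer_id": customer["customer_id"],
            "region": customer["region"],
            "n_orders": len(spent),
            "total_amount": sum(spent),
            # default=0.0 covers customers with no orders
            "max_amount": max(spent, default=0.0),
        })
    return rows


flat_table = flatten(customers, orders)
for row in flat_table:
    print(row)
```

The flat table keeps only the chosen summaries. The individual orders, and any pattern that depends on them, are no longer visible, which is one way important patterns may be missed.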

The Statistical View
Most statistical models assume i.i.d. data (independent and identically distributed)
A few assume simple regular dependence (e.g., Markov chain)
This is a huge restriction – Let’s remove it!
–Allow dependencies between samples
–Allow samples with different distributions
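
To make the i.i.d. assumption concrete, here is a small sketch that scores the same sequence under an i.i.d. model and under a first-order Markov chain. The sequence and all probabilities are assumptions chosen for illustration.

```python
import math


def iid_log_likelihood(seq, p):
    """Log-likelihood when every symbol is drawn independently from p."""
    return sum(math.log(p[x]) for x in seq)


def markov_log_likelihood(seq, initial, transition):
    """Log-likelihood under a first-order Markov chain:
    each symbol depends only on the one before it."""
    ll = math.log(initial[seq[0]])
    for prev, cur in zip(seq, seq[1:]):
        ll += math.log(transition[prev][cur])
    return ll


# Illustrative data: symbols come in runs, so neighbours are dependent.
seq = "aaaabbbb"
p = {"a": 0.5, "b": 0.5}
initial = {"a": 0.5, "b": 0.5}
transition = {
    "a": {"a": 0.9, "b": 0.1},
    "b": {"a": 0.1, "b": 0.9},
}

iid_ll = iid_log_likelihood(seq, p)
markov_ll = markov_log_likelihood(seq, initial, transition)
print(f"i.i.d. log-likelihood: {iid_ll:.3f}")
print(f"Markov log-likelihood: {markov_ll:.3f}")
```

On this sequence the chain model assigns a higher likelihood because it can express that neighbouring samples tend to agree. Relational models generalize this from a chain to arbitrary dependency structures between samples.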

The Computer Science View
CS faces a complexity bottleneck
–Cost of hand-coding
–Brittleness
Machine learning and probability overcome this
But they mostly apply only to attribute vectors
Let’s extend them to handle structured objects, class hierarchies, relational databases, etc.

Applications
Bottom line: Using statistical and relational information gives better results
–Web search (Brin & Page, WWW-98)
–Text classification (Chakrabarti et al, SIGMOD-98)
–Marketing (Domingos & Richardson, KDD-01)
–Record linkage (Pasula et al, NIPS-02)
–Gene expression (Segal et al, UAI-03)
–Information extraction (McCallum & Wellner, NIPS-04)
–Etc.

Major Problem Types
Collective classification
Link discovery
Link-based search
Link-based clustering
Social network analysis
Object identification
Transfer learning
Etc.

A Map of the Field
There are many approaches (“Alphabet soup”)
Every year new ones are proposed (and for good reason)
Key is to understand the major dimensions along which approaches can differ

Major Dimensions
Probabilistic language
Logical language
Type of learning
Type of inference
Aggregation

Probabilistic Language
Bayesian networks
Markov networks (aka Markov random fields)
Restrictions of these (e.g., logistic regression)
Probabilistic context-free grammars
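
For reference, the standard factorizations behind the first two options can be written as follows. The notation is an assumption, not taken from the slides: $\mathrm{pa}(x_i)$ denotes the parents of $x_i$ in the directed graph, $\phi_k$ is a non-negative potential over the variables $x_{\{k\}}$ of clique $k$, and $Z$ is the normalizing constant.

```latex
% Bayesian network: product of local conditional distributions
P(x_1, \dots, x_n) = \prod_{i=1}^{n} P\bigl(x_i \mid \mathrm{pa}(x_i)\bigr)

% Markov network: normalized product of clique potentials
P(x_1, \dots, x_n) = \frac{1}{Z} \prod_{k} \phi_k\bigl(x_{\{k\}}\bigr),
\qquad
Z = \sum_{x_1, \dots, x_n} \prod_{k} \phi_k\bigl(x_{\{k\}}\bigr)
```

The Bayesian network needs no global normalization because each factor is already a conditional distribution. The Markov network trades that away for undirected structure and must compute or approximate $Z$.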

Logical Language
Prolog / Horn clauses
Frame systems / Description logics
Conjunctive database queries
Full first-order logic

Type of Learning
Generative vs. discriminative
Structure vs. parameters
Knowledge-poor vs. knowledge-rich

Type of Inference
Marginal/conditional vs. MAP
–Marg./cond.: MCMC, belief propagation, etc.
–MAP: Graph cuts, weighted satisfiability, etc.
Full grounding vs. KBMC
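
The difference between the two query types can be sketched on a tiny Markov network. The example below uses brute-force enumeration, not MCMC, belief propagation, graph cuts, or weighted satisfiability, and the three-variable chain and its potential values are assumptions made up for illustration.

```python
from itertools import product

# Hypothetical binary chain A - B - C (illustrative, not from the slides).
variables = ["A", "B", "C"]


def weight(world):
    """Unnormalized weight of a full assignment: product of potentials."""
    a, b, c = world["A"], world["B"], world["C"]
    w = 1.0
    w *= 2.0 if a == b else 1.0  # A and B prefer to agree
    w *= 2.0 if b == c else 1.0  # B and C prefer to agree
    w *= 3.0 if a == 1 else 1.0  # unary preference for A = 1
    return w


# Enumerate every possible world; feasible only for very small models.
worlds = [dict(zip(variables, values)) for values in product([0, 1], repeat=3)]


def consistent_worlds(evidence):
    return [w for w in worlds if all(w[k] == v for k, v in evidence.items())]


def marginal(var, value, evidence=None):
    """P(var = value | evidence), summing over all other variables."""
    candidates = consistent_worlds(evidence or {})
    numerator = sum(weight(w) for w in candidates if w[var] == value)
    denominator = sum(weight(w) for w in candidates)
    return numerator / denominator


def map_assignment(evidence=None):
    """Single most probable full assignment given the evidence."""
    return max(consistent_worlds(evidence or {}), key=weight)


print("P(C=1)       =", marginal("C", 1))
print("P(C=1 | A=0) =", marginal("C", 1, {"A": 0}))
print("MAP          =", map_assignment())
print("MAP given A=0 =", map_assignment({"A": 0}))
```

A marginal query sums over the other variables and returns a probability. A MAP query maximizes and returns one joint assignment. Enumeration grows exponentially in the number of variables, which is why the approximate and specialized methods named on the slide are needed in practice.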

Aggregation
Quantifiers
SQL-like aggregators (MAX, AVG, SUM, COUNT, MODE, etc.)
Noisy-OR
Logistic regression
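
A short sketch of two of these options: SQL-like aggregators over a set of related values, and noisy-OR as a way to combine a variable number of causes. The `noisy_or` helper, the `leak` parameter, and the example numbers are illustrative assumptions.

```python
from statistics import mean, mode


def noisy_or(parent_states, activation_probs, leak=0.0):
    """P(child = 1) under noisy-OR.

    Each active parent independently fails to trigger the child with
    probability 1 - p_i; `leak` is the chance the child fires anyway.
    """
    p_all_fail = 1.0 - leak
    for active, p in zip(parent_states, activation_probs):
        if active:
            p_all_fail *= 1.0 - p
    return 1.0 - p_all_fail


# SQL-like aggregators summarize a variable-size set of related values,
# e.g. the grades of all courses linked to one student (made-up numbers).
grades = [3, 4, 4, 2]
aggregates = {
    "MAX": max(grades),
    "AVG": mean(grades),
    "SUM": sum(grades),
    "COUNT": len(grades),
    "MODE": mode(grades),
}
print(aggregates)

# Two of three possible causes are active.
p_child = noisy_or([True, True, False], [0.8, 0.5, 0.9])
print("noisy-OR P(child=1) =", p_child)
```

Both approaches map a set of any size to a fixed-size summary, so the same model parameters apply whether an object has two related objects or two hundred.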

Examples
Probabilistic relational models (Friedman et al, IJCAI-99)
Stochastic logic programs (Muggleton, SRL-00)
Bayesian logic programs (Kersting & De Raedt, ILP-01)
Relational Markov networks (Taskar et al, UAI-02)
Markov logic networks (Richardson & Domingos, SRL-04)