Research at the Decision Making Lab
Fabio Cozman, Universidade de São Paulo

Decision Making Lab (2002)

Research tree
- Robotics (a bit)
- Bayes nets
  - Algorithms: anytime, anyspace (embedded systems); MCMC algorithms
  - Applications: medical decisions; classification
- Sets of probabilities
  - Algorithms: independence; inference & testing
  - Applications: MDPs, robustness analysis, auctions

Some (bio)robotics

Bayesian networks

Decisions in medical domains (with the University Hospital)
- Idea: improve decisions at medical posts in poor urban areas
- We are building networks that represent cardiac arrest, which can be caused by stress, cardiac problems, respiratory problems, etc.
- Supported by FAPESP

The HU-network

A better interface for teaching

Embedded Bayesian networks
- Challenge: implement inference algorithms compactly and efficiently
- Real challenge: develop anytime, anyspace inference algorithms
- Idea: decompose networks and apply several algorithms (UAI 2002 workshop on RT)
- Supported by HP Labs

Decomposing networks: how to decompose a network and assign algorithms to its pieces so as to meet space and time constraints with reasonable accuracy?

Application: Failure analysis in car-wash systems

The car-wash network

Generating random networks
- The problem is easy to state but hard to solve: critical properties of DAGs are not known
- Method based on MCMC simulation, with constraints on induced width and degree (a sketch of this style of chain follows)
- Supported by FAPESP
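A minimal sketch of this style of chain, assuming the simplest setting: a Metropolis walk over DAGs with only a degree bound. All names are illustrative, and the published method's induced-width constraint is harder and not shown here.

```python
import random

def random_dag_mcmc(n, max_degree, n_steps=10000, seed=0):
    """Markov chain over DAGs on n nodes: each step proposes to toggle one arc;
    proposals that break acyclicity or the degree bound are rejected, so the
    chain never leaves the constrained space."""
    rng = random.Random(seed)
    arcs = set()  # (parent, child) pairs; start from the empty DAG

    def creates_cycle(u, v):
        # Adding u -> v creates a cycle iff a directed path v ~> u already exists.
        stack, seen = [v], set()
        while stack:
            x = stack.pop()
            if x == u:
                return True
            if x not in seen:
                seen.add(x)
                stack.extend(w for (p, w) in arcs if p == x)
        return False

    def degree(x):
        return sum(1 for a in arcs if x in a)

    for _ in range(n_steps):
        u, v = rng.sample(range(n), 2)
        if (u, v) in arcs:
            arcs.remove((u, v))  # deleting an arc always preserves validity
        elif (not creates_cycle(u, v)
              and degree(u) < max_degree and degree(v) < max_degree):
            arcs.add((u, v))
        # otherwise: reject and stay put
    return arcs
```

Because the proposal (a uniformly chosen ordered pair) is symmetric and every valid move is accepted, the chain's stationary distribution is uniform over the constrained DAGs; after enough steps the final graph is an approximately uniform sample.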

Research tree (again)
- Biorobotics (a bit of it)
- Bayes nets
  - Algorithms: anytime, anyspace (embedded systems); MCMC algorithms
  - Applications: medical decisions; classification
- Sets of probabilities
  - Algorithms: independence; inference & testing
  - Applications: MDPs, robustness analysis, auctions

Bayesian network classifiers
- Goal: use probabilistic models for classification, "learning" classifiers from labeled and unlabeled data
- Joint work with Ira Cohen, Alex Bronstein, and Marsha Duro (UIUC and HP Labs)

Using Bayesian networks to learn from labeled and unlabeled data Suppose we want to classify events based on observations; we have recorded data that are sometimes labeled and sometimes unlabeled What is the value of unlabeled data?

The Naïve Bayes classifier: a Bayesian-network-like classifier with excellent credentials. Use Bayes rule to get the classification: $p(\mathrm{Class} \mid \mathrm{attributes}) \propto p(\mathrm{Class}) \prod_{i=1}^{N} p(\mathrm{Attribute}_i \mid \mathrm{Class})$. [Diagram: the Class node points to Attribute 1, Attribute 2, …, Attribute N.]
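A minimal runnable sketch of this rule, assuming discrete attributes and add-one smoothing; all names are illustrative, and this is a generic implementation rather than the lab's own code.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Estimate class counts and per-class attribute-value counts from labeled data."""
    class_counts = Counter(labels)
    attr_counts = defaultdict(Counter)  # attr_counts[(i, c)][v]: count of attribute i == v under class c
    values = defaultdict(set)           # values[i]: observed values of attribute i
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            attr_counts[(i, c)][v] += 1
            values[i].add(v)
    return class_counts, attr_counts, values

def classify(row, class_counts, attr_counts, values):
    """Return argmax over classes of p(c) * prod_i p(attr_i | c), computed in
    log space, with add-one smoothing so an unseen value cannot zero out a class."""
    total = sum(class_counts.values())
    def log_score(c):
        s = math.log(class_counts[c] / total)
        for i, v in enumerate(row):
            s += math.log((attr_counts[(i, c)][v] + 1)
                          / (class_counts[c] + len(values[i])))
        return s
    return max(class_counts, key=log_score)

# Tiny usage example, borrowing the (hypothetical) data of the slide below:
rows = [("baseball", "hamburger"), ("soccer", "rice and beans"), ("golf", "apple pie")]
labels = ["American", "Brazilian", "American"]
model = train_naive_bayes(rows, labels)
print(classify(("golf", "rice and beans"), *model))  # -> "American" for these counts
```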

The TAN classifier. [Diagram: as in Naïve Bayes, the Class node points to attributes X_1 through X_N; in addition, the attributes are connected among themselves by a tree of augmenting arcs.]

Now, let's consider unlabeled data. Our database:

Class (nationality) | Sport         | Food
American            | baseball      | hamburger
Brazilian           | soccer        | rice and beans
American            | golf          | apple pie
?                   | saloon soccer | rice and beans
?                   | golf          | rice and beans

Question: how can we use the unlabeled data? (One standard recipe is sketched below.)
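One standard answer is expectation-maximization: treat the missing labels as hidden variables, fill them in with the current model's posterior, and re-estimate. A minimal sketch under the Naïve Bayes model above; all names are illustrative, this is the generic EM recipe rather than the exact procedure behind the slides, and it assumes every class appears at least once in the labeled set.

```python
import math
from collections import defaultdict

def em_naive_bayes(labeled, unlabeled, classes, n_iters=20):
    """Semi-supervised Naïve Bayes via EM: the missing labels are hidden variables.
    labeled: list of (row, class); unlabeled: list of rows; rows are tuples of values."""
    # Soft labels: labeled rows put full weight on their class; unlabeled rows start uniform.
    soft = [(row, {c: 1.0 if c == y else 0.0 for c in classes}) for row, y in labeled]
    soft += [(row, {c: 1.0 / len(classes) for c in classes}) for row in unlabeled]

    for _ in range(n_iters):
        # M-step: weighted maximum-likelihood estimates, with add-one smoothing.
        class_w = {c: sum(w[c] for _, w in soft) for c in classes}
        attr_w = defaultdict(float)  # attr_w[(i, v, c)]: weighted count
        values = defaultdict(set)
        for row, w in soft:
            for i, v in enumerate(row):
                values[i].add(v)
                for c in classes:
                    attr_w[(i, v, c)] += w[c]
        total = sum(class_w.values())

        def log_joint(row, c):
            s = math.log(class_w[c] / total)
            for i, v in enumerate(row):
                s += math.log((attr_w[(i, v, c)] + 1) / (class_w[c] + len(values[i])))
            return s

        # E-step: recompute the posterior over classes for each hidden label.
        for k in range(len(labeled), len(soft)):
            row, _ = soft[k]
            logp = {c: log_joint(row, c) for c in classes}
            m = max(logp.values())
            z = sum(math.exp(lp - m) for lp in logp.values())
            soft[k] = (row, {c: math.exp(logp[c] - m) / z for c in classes})
    return soft
```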

Unlabeled data can help… Learning a Naïve Bayes classifier from data generated by a Naïve Bayes model (10 attributes). [Plot omitted in transcript.]

… but unlabeled data may degrade performance! Surprising fact: more data may not help; more data may hurt

Some math: asymptotic analysis. Asymptotic bias: [equation omitted in transcript]. Variance decreases with more data.

A very simple example. Consider the following situation: [Diagrams: the "real" model and the "assumed" model over Class, X, and Y.] X and Y are Gaussian given Class.

Effect of unlabeled data – a different perspective

Searching for structures. Previous tests suggest that we should pay attention to modeling assumptions when dealing with unlabeled data. In the context of Bayesian network classifiers, we must search for structures. This is not easy; worse, existing algorithms do not focus on classification.

Stochastic Structure Search (SSS)
- Idea: search for structures using classification error
- Hard: the search space is too messy
- Solution: Metropolis-Hastings sampling with an underlying measure proportional to $1/p_{\mathrm{error}}$ (sketched below)
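A sketch of this sampler under the stated target measure. The `neighbors` and `error_rate` callables are placeholders (e.g., single-arc changes and cross-validated error), and the proposal is assumed symmetric; the published SSS procedure may differ in its details.

```python
import random

def sss(initial_structure, neighbors, error_rate, n_steps=1000, seed=0):
    """Metropolis-Hastings over classifier structures with stationary measure
    proportional to 1 / p_error. neighbors(s, rng) proposes a random neighboring
    structure; error_rate(s) estimates the classification error of structure s."""
    rng = random.Random(seed)
    s, err = initial_structure, error_rate(initial_structure)
    best, best_err = s, err
    for _ in range(n_steps):
        s_new = neighbors(s, rng)
        err_new = error_rate(s_new)
        # Acceptance ratio for target 1/p_error with a symmetric proposal:
        # (1/err_new) / (1/err) = err / err_new.
        if rng.random() < min(1.0, err / max(err_new, 1e-12)):
            s, err = s_new, err_new
        if err < best_err:       # track the best structure visited so far
            best, best_err = s, err
    return best, best_err
```

Low-error structures thus receive proportionally more visits, while the chain can still escape local minima by occasionally accepting worse structures.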

Some classification results

Some words on unlabeled data. Unlabeled data can improve performance but can also degrade it; this is a really hard problem! Current understanding of the problem is shaky: people think outliers, or mismatches between the labeled and unlabeled data, cause it.

Research tree (once again)
- Biorobotics (a bit of it)
- Bayes nets
  - Algorithms: anytime, anyspace (embedded systems); MCMC algorithms
  - Applications: medical decisions; classification
- Sets of probabilities
  - Algorithms: independence; inference & testing
  - Applications: MDPs, robustness analysis, auctions

Sets of probabilities. Instead of "the probability of rain is 0.2," say "the probability of rain is in [0.1, 0.3]." Instead of "the expected value of the stock is 10," admit "the expected value of the stock is in [0, 1000]."

An example. Consider a set of probability distributions, each specified by values $p(\theta_1)$, $p(\theta_2)$, $p(\theta_3)$. [Diagram: the set of probabilities drawn as a region in the probability simplex.]
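To make the interval reading concrete: for a closed convex set of distributions, lower and upper expectations are attained at extreme points, so they can be computed by enumerating the set's vertices. A small sketch; representing the set by its vertices is an assumption made here for illustration.

```python
def expectation_bounds(vertices, f):
    """Lower and upper expectation of f over a credal set given by the extreme
    points of a closed convex set of probability mass functions.
    vertices: list of distributions, each a list p with p[k] = p(theta_k);
    f: list with f[k] = f(theta_k). By linearity, the bounds sit at the vertices."""
    exps = [sum(p_k * f_k for p_k, f_k in zip(p, f)) for p in vertices]
    return min(exps), max(exps)

# The rain example from the previous slide: p(rain) anywhere in [0.1, 0.3].
vertices = [[0.1, 0.9], [0.3, 0.7]]      # [p(rain), p(no rain)] at the two extremes
f = [1.0, 0.0]                           # indicator function of rain
print(expectation_bounds(vertices, f))   # -> (0.1, 0.3)
```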

Why?
- More realistic, and quite expressive as a representation language
- An excellent tool for:
  - robustness/sensitivity analysis
  - modeling incomplete beliefs (probabilistic logic)
  - group decision-making
  - analysis of economic interactions, for example to study arbitrage and design auctions

What we have been doing
- Trying to formalize and apply "interval" reasoning, particularly independence
- Building algorithms for manipulating these intervals and sets, to deal with independence and networks
- JavaBayes is the only available software that can deal with this (to some extent!)

Credal networks
- Using graphical models to represent sets of joint probabilities
- Question: what exactly do these networks represent?
- Several open questions, and a need for algorithms
[Diagram: example network with nodes Family In?, Dog Sick?, Lights On?, Dog Barking?, Dog Out?]

Concluding. To summarize: we want to understand how to use probabilities in AI, and then we add a bit of robotics. Support from FAPESP and HP Labs has been generous. Visit the lab on your next trip to São Paulo!