CPS 270: Artificial Intelligence. Machine learning. Instructor: Vincent Conitzer.


Why is learning important?
So far we have assumed we know how the world works:
– Rules of queens puzzle
– Rules of chess
– Knowledge base of logical facts
– Actions’ preconditions and effects
– Probabilities in Bayesian networks, MDPs, POMDPs, …
– Rewards in MDPs
At that point “just” need to solve/optimize
In the real world this information is often not immediately available
AI needs to be able to learn from experience

Different kinds of learning…
Supervised learning:
– Someone gives us examples and the right answer for those examples
– We have to predict the right answer for unseen examples
Unsupervised learning:
– We see examples but get no feedback
– We need to find patterns in the data
Reinforcement learning:
– We take actions and get rewards
– Have to learn how to get high rewards

Example of supervised learning: classification
We lend money to people. We have to predict whether they will pay us back or not.
People have various (say, binary) features:
– do we know their Address? do they have a Criminal record? high Income? Educated? Old? Unemployed?
We see examples (Y = paid back, N = not):
+a, -c, +i, +e, +o, +u: Y
-a, +c, -i, +e, -o, -u: N
+a, -c, +i, -e, -o, -u: Y
-a, -c, +i, +e, -o, -u: Y
-a, +c, +i, -e, -o, -u: N
-a, -c, +i, -e, -o, +u: Y
+a, -c, -i, -e, +o, -u: N
+a, +c, +i, -e, +o, -u: N
Next person is +a, -c, +i, -e, +o, -u. Will we get paid back?
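These eight training examples can also be written as binary feature vectors; below is a minimal Python encoding of the data (the variable names and the 0/1 encoding are just one possible choice, not something the slides prescribe), which the later sketches reuse.

```python
# Feature order: (Address known, Criminal record, high Income, Educated, Old, Unemployed)
# '+' is encoded as 1 and '-' as 0; the label is 'Y' (paid back) or 'N' (not).
examples = [
    ((1, 0, 1, 1, 1, 1), 'Y'),
    ((0, 1, 0, 1, 0, 0), 'N'),
    ((1, 0, 1, 0, 0, 0), 'Y'),
    ((0, 0, 1, 1, 0, 0), 'Y'),
    ((0, 1, 1, 0, 0, 0), 'N'),
    ((0, 0, 1, 0, 0, 1), 'Y'),
    ((1, 0, 0, 0, 1, 0), 'N'),
    ((1, 1, 1, 0, 1, 0), 'N'),
]

# The next person from the slide: +a, -c, +i, -e, +o, -u
next_person = (1, 0, 1, 0, 1, 0)
```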

Classification…
We want some hypothesis h that predicts whether we will be paid back:
+a, -c, +i, +e, +o, +u: Y
-a, +c, -i, +e, -o, -u: N
+a, -c, +i, -e, -o, -u: Y
-a, -c, +i, +e, -o, -u: Y
-a, +c, +i, -e, -o, -u: N
-a, -c, +i, -e, -o, +u: Y
+a, -c, -i, -e, +o, -u: N
+a, +c, +i, -e, +o, -u: N
Lots of possible hypotheses: will be paid back if…
– Income is high (wrong on 2 occasions in training data)
– Income is high and no Criminal record (always right in training data)
– (Address is known AND ((NOT Old) OR Unemployed)) OR ((NOT Address is known) AND (NOT Criminal Record)) (always right in training data)
Which one seems best? Anything better?
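The three candidate hypotheses can be checked against the training data directly; a small sketch reusing the `examples` list above (the helper name and index-based encoding are illustrative assumptions):

```python
def training_errors(hypothesis, data):
    """Count how many training examples the hypothesis misclassifies."""
    return sum(('Y' if hypothesis(x) else 'N') != y for x, y in data)

# Feature indices: 0=Address, 1=Criminal, 2=Income, 3=Educated, 4=Old, 5=Unemployed
h1 = lambda x: x[2] == 1                         # Income is high
h2 = lambda x: x[2] == 1 and x[1] == 0           # Income is high AND no Criminal record
h3 = lambda x: ((x[0] == 1 and (x[4] == 0 or x[5] == 1))
                or (x[0] == 0 and x[1] == 0))    # the third, more complicated hypothesis

print(training_errors(h1, examples))  # 2 errors on the training data
print(training_errors(h2, examples))  # 0 errors
print(training_errors(h3, examples))  # 0 errors
```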

Occam’s Razor
Occam’s razor: simpler hypotheses tend to generalize to future data better
Intuition: given limited training data,
– it is likely that there is some complicated hypothesis that is not actually good but that happens to perform well on the training data
– it is less likely that there is a simple hypothesis that is not actually good but that happens to perform well on the training data
There are fewer simple hypotheses
Computational learning theory studies this in much more depth

Decision trees
high Income?
– no: NO
– yes: Criminal record?
  – yes: NO
  – no: YES
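This tree can equivalently be written as two nested tests; a minimal sketch (the function name is made up for illustration):

```python
def predict_payback(address, criminal, income, educated, old, unemployed):
    """The decision tree above: split on high Income first, then Criminal record."""
    if not income:
        return 'N'   # low income -> predict not paid back
    if criminal:
        return 'N'   # high income but a criminal record -> predict not paid back
    return 'Y'       # high income and no criminal record -> predict paid back

# The next person from earlier, +a, -c, +i, -e, +o, -u:
print(predict_payback(address=True, criminal=False, income=True,
                      educated=False, old=True, unemployed=False))  # 'Y'
```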

Constructing a decision tree, one step at a time
Split on address? first:
yes:
+a, -c, +i, +e, +o, +u: Y
+a, -c, +i, -e, -o, -u: Y
+a, -c, -i, -e, +o, -u: N
+a, +c, +i, -e, +o, -u: N
no:
-a, +c, -i, +e, -o, -u: N
-a, -c, +i, +e, -o, -u: Y
-a, +c, +i, -e, -o, -u: N
-a, -c, +i, -e, -o, +u: Y
Each branch is still an even mix of Y and N, so we split again: criminal? on the no-address branch and income? on the address branch.
Address was maybe not the best attribute to start with…

Starting with a different attribute
Split on criminal? first:
yes:
-a, +c, -i, +e, -o, -u: N
-a, +c, +i, -e, -o, -u: N
+a, +c, +i, -e, +o, -u: N
no:
+a, -c, +i, +e, +o, +u: Y
+a, -c, +i, -e, -o, -u: Y
-a, -c, +i, +e, -o, -u: Y
-a, -c, +i, -e, -o, +u: Y
+a, -c, -i, -e, +o, -u: N
Seems like a much better starting point than address:
– Each node almost completely uniform
– Almost completely predicts whether we will be paid back
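The slides judge a split by how uniform its branches are; one standard way to make this precise (an assumption here, since the slides do not name a criterion) is information gain, sketched below using the `examples` list from the earlier encoding.

```python
import math

def entropy(labels):
    """Entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(data, feature_index):
    """How much splitting on one binary feature reduces label entropy."""
    before = entropy([y for _, y in data])
    branches = {0: [], 1: []}
    for x, y in data:
        branches[x[feature_index]].append(y)
    after = sum(len(b) / len(data) * entropy(b) for b in branches.values() if b)
    return before - after

print(information_gain(examples, 1))  # Criminal record: about 0.55 bits
print(information_gain(examples, 0))  # Address known: 0.0 bits
```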

Different approach: nearest neighbor(s)
Next person is -a, +c, -i, +e, -o, +u. Will we get paid back?
Nearest neighbor: simply look at the most similar example in the training data and see what happened there:
+a, -c, +i, +e, +o, +u: Y (distance 4)
-a, +c, -i, +e, -o, -u: N (distance 1)
+a, -c, +i, -e, -o, -u: Y (distance 5)
-a, -c, +i, +e, -o, -u: Y (distance 3)
-a, +c, +i, -e, -o, -u: N (distance 3)
-a, -c, +i, -e, -o, +u: Y (distance 3)
+a, -c, -i, -e, +o, -u: N (distance 5)
+a, +c, +i, -e, +o, -u: N (distance 5)
The nearest neighbor is the second example, so predict N.
k nearest neighbors: look at the k nearest neighbors and take a vote
– E.g., the 5 nearest neighbors have 3 Ys and 2 Ns, so predict Y
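A minimal sketch of this, using Hamming distance over the binary encoding and the `examples` list from before (the helper names are illustrative):

```python
def hamming(x1, x2):
    """Number of features on which two examples disagree."""
    return sum(a != b for a, b in zip(x1, x2))

def knn_predict(data, query, k):
    """Vote among the k training examples closest to the query."""
    neighbors = sorted(data, key=lambda ex: hamming(ex[0], query))[:k]
    votes = [y for _, y in neighbors]
    return max(set(votes), key=votes.count)

query = (0, 1, 0, 1, 0, 1)              # -a, +c, -i, +e, -o, +u from this slide
print(knn_predict(examples, query, 1))  # 'N' (the single nearest neighbor, at distance 1)
print(knn_predict(examples, query, 5))  # 'Y' (3 Y votes vs. 2 N votes)
```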

Another approach: perceptrons
Place a weight on every attribute, indicating how important that attribute is (and in which direction it affects things)
E.g., w_a = 1, w_c = -5, w_i = 4, w_e = 1, w_o = 0, w_u = -1
+a, -c, +i, +e, +o, +u: Y (score 1+4+1+0-1 = 5)
-a, +c, -i, +e, -o, -u: N (score -5+1 = -4)
+a, -c, +i, -e, -o, -u: Y (score 1+4 = 5)
-a, -c, +i, +e, -o, -u: Y (score 4+1 = 5)
-a, +c, +i, -e, -o, -u: N (score -5+4 = -1)
-a, -c, +i, -e, -o, +u: Y (score 4-1 = 3)
+a, -c, -i, -e, +o, -u: N (score 1+0 = 1)
+a, +c, +i, -e, +o, -u: N (score 1-5+4+0 = 0)
Need to set some threshold above which we predict to be paid back (say, 2)
May care about combinations of things (nonlinearity) – generalization: neural networks
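A minimal sketch of this weighted-sum-and-threshold rule with the weights and threshold from the slide (the function name is made up):

```python
def perceptron_predict(x, weights, threshold):
    """Predict 'Y' if the weighted sum of the active features exceeds the threshold."""
    score = sum(w * xi for w, xi in zip(weights, x))
    return 'Y' if score > threshold else 'N'

weights = (1, -5, 4, 1, 0, -1)   # w_a, w_c, w_i, w_e, w_o, w_u
threshold = 2

# First training example, +a, -c, +i, +e, +o, +u (score 5):
print(perceptron_predict((1, 0, 1, 1, 1, 1), weights, threshold))  # 'Y'
# Last training example, +a, +c, +i, -e, +o, -u (score 0):
print(perceptron_predict((1, 1, 1, 0, 1, 0), weights, threshold))  # 'N'
```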

Reinforcement learning
There are three routes you can take to work: A, B, C
The times you took A, it took: 10, 60, 30 minutes
The times you took B, it took: 32, 31, 34 minutes
The time you took C, it took: 50 minutes
What should you do next?
Exploration vs. exploitation tradeoff
– Exploration: try to explore underexplored options
– Exploitation: stick with options that look best now
Reinforcement learning usually studied in MDPs
– Take action, observe reward and new state
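The slide only states the exploration/exploitation tradeoff; one common and very simple way to manage it is an epsilon-greedy rule, sketched below with the travel times from the slide (the rule itself is an assumption, not something the slides propose).

```python
import random

# Observed travel times so far (lower is better)
times = {'A': [10, 60, 30], 'B': [32, 31, 34], 'C': [50]}

def choose_route(observed_times, epsilon=0.1):
    """Epsilon-greedy: with probability epsilon explore a random route,
    otherwise exploit the route with the best (lowest) average time so far."""
    if random.random() < epsilon:
        return random.choice(list(observed_times))
    return min(observed_times,
               key=lambda r: sum(observed_times[r]) / len(observed_times[r]))

print(choose_route(times))  # usually 'B' (average about 32.3), occasionally a random route
```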

Bayesian approach to learning
Assume we have a prior distribution over the long term behavior of A:
– With probability .6, A is a “fast route” which:
  With prob. .25, takes 20 minutes
  With prob. .5, takes 30 minutes
  With prob. .25, takes 40 minutes
– With probability .4, A is a “slow route” which:
  With prob. .25, takes 30 minutes
  With prob. .5, takes 40 minutes
  With prob. .25, takes 50 minutes
We travel on A once and see it takes 30 minutes.
P(A is fast | observation) = P(observation | A is fast)*P(A is fast) / P(observation) = .5*.6 / (.5*.6 + .25*.4) = .3/(.3+.1) = .75
Convenient approach for decision theory, game theory
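The posterior calculation follows directly from Bayes' rule; a minimal sketch with the numbers from this slide (the dictionary layout is just one way to organize it):

```python
# Prior over route A's type, and travel-time distribution under each type
prior = {'fast': 0.6, 'slow': 0.4}
time_dist = {
    'fast': {20: 0.25, 30: 0.5, 40: 0.25},
    'slow': {30: 0.25, 40: 0.5, 50: 0.25},
}

observed = 30  # we travel on A once and it takes 30 minutes

# P(type | obs) = P(obs | type) * P(type) / P(obs)
joint = {t: time_dist[t].get(observed, 0.0) * prior[t] for t in prior}
evidence = sum(joint.values())
posterior = {t: joint[t] / evidence for t in prior}

print(posterior)  # fast: 0.75, slow: 0.25 (up to floating-point rounding)
```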

Learning in game theory
Like the 2/3-of-the-average game
Very tricky because other agents learn at the same time
From one agent’s perspective, the environment is changing
– Taking the average of past observations may not be a good idea