
Machine Learning via Advice Taking Jude Shavlik

Thanks To... Rich Maclin, Lisa Torrey, Trevor Walker, Prof. Olvi Mangasarian, Glenn Fung, Ted Wild, DARPA

Quote (2002) from DARPA: "Sometimes an assistant will merely watch you and draw conclusions. Sometimes you have to tell a new person, 'Please don't do it this way' or 'From now on when I say X, you do Y.' It's a combination of learning by example and by being guided."

Widening the “Communication Pipeline” between Humans and Machine Learners
[diagram: a teacher instructing a pupil, with the machine learner as the pupil]

Our Approach to Building Better Machine Learners
- Human partner expresses advice “naturally” and without knowledge of the ML agent’s internals
- Agent incorporates advice directly into the function it is learning
- Additional feedback (rewards, I/O pairs, inferred labels, more advice) is used to continually refine the learner

“Standard” Machine Learning vs. Theory Refinement
- Positive examples (“should see doctor”):
    temp = 102.1, age = 21, sex = F, ...
    temp = 101.7, age = 37, sex = M, ...
- Negative examples (“take two aspirins”):
    temp = 99.1, age = 43, sex = M, ...
    temp = 99.6, age = 24, sex = F, ...
- Approximate domain knowledge:
    if temp = high and age = young ... then negative example
Related work by the labs of Mooney, Pazzani, Cohen, Giles, etc.

Rich Maclin’s PhD (1995)
    IF   a Bee is (Near and West)
         AND an Ice is (Near and North)
    THEN BEGIN
         Move East
         Move North
    END

Sample Results
[plot: learning curves with advice and without advice]

Our Motto: Give advice, rather than commands, to your computer.

Outline
- Prior Knowledge and Support Vector Machines
  - Intro to SVMs
  - Linear Separation
  - Non-Linear Separation
  - Function Fitting (“Regression”)
- Advice-Taking Reinforcement Learning
- Transfer Learning via Advice Taking

Support Vector Machines: Maximizing the Margin between Bounding Planes
[diagram: A+ and A− point sets separated by two bounding planes; the support vectors lie on the planes, with the margin between them]

Linear Algebra for SVMs
- Given p points in n-dimensional space
- Represent them by a p-by-n matrix A of reals
- Each row A_i is in class +1 or −1
- Separate the classes by two bounding planes; more succinctly, D(Aw − eγ) ≥ e, where e is a vector of ones and D is the diagonal matrix of ±1 class labels (equations reconstructed below)
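The slide’s equations did not survive the transcript; the following block reconstructs the standard bounding-plane formulation these bullets describe (the usual Mangasarian-style notation, not copied from the slide):

```latex
% Bounding planes x'w = \gamma \pm 1 separating the two classes:
\begin{aligned}
A_i w &\ge \gamma + 1 \quad \text{for rows } A_i \text{ in class } +1,\\
A_i w &\le \gamma - 1 \quad \text{for rows } A_i \text{ in class } -1,\\
\text{more succinctly:}\quad D(Aw - e\gamma) &\ge e,
\end{aligned}
```

where e is a vector of ones and D is diagonal with D_ii = ±1.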

“Slack” Variables: Dealing with Data that is not Linearly Separable
[diagram: overlapping A+ and A− point sets; slack variables y measure how far points fall on the wrong side of their bounding plane; support vectors marked]

Support Vector Machines: Quadratic Programming Formulation
Solve this quadratic program:
    min_{w,γ,y}  (1/2)||w||² + C e'y
    s.t.         D(Aw − eγ) + y ≥ e,  y ≥ 0
Maximize the margin by minimizing ||w||²; minimize the sum of the slack variables y, weighted by C.
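As a concrete illustration (not the authors’ code), here is a minimal sketch of this quadratic program in Python with the cvxpy modeling library, on made-up toy data:

```python
import numpy as np
import cvxpy as cp

# Toy data (made up): rows of A are points, labels are +/-1.
A = np.array([[2.0, 2.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
labels = np.array([1.0, 1.0, -1.0, -1.0])
D = np.diag(labels)
p, n = A.shape
e = np.ones(p)
C = 1.0                      # weight on the slack variables

w = cp.Variable(n)           # normal of the separating plane
gamma = cp.Variable()        # plane offset
y = cp.Variable(p)           # slack variables

# min (1/2)||w||^2 + C e'y   s.t.  D(Aw - e*gamma) + y >= e, y >= 0
objective = cp.Minimize(0.5 * cp.sum_squares(w) + C * cp.sum(y))
constraints = [D @ (A @ w - gamma * e) + y >= e, y >= 0]
cp.Problem(objective, constraints).solve()
print("w =", w.value, " gamma =", gamma.value)
```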

Support Vector Machines: Linear Programming Formulation
Use the 1-norm instead of the 2-norm (typically runs faster; better feature selection; might generalize better, NIPS ’03):
    min_{w,γ,y}  ||w||₁ + C e'y
    s.t.         D(Aw − eγ) + y ≥ e,  y ≥ 0
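Continuing the sketch above, the 1-norm variant changes only the objective (cvxpy handles ||w||₁ directly, so the LP’s auxiliary variables stay implicit):

```python
# Reuses A, D, e, C, w, gamma, y from the previous sketch.
objective = cp.Minimize(cp.norm(w, 1) + C * cp.sum(y))
constraints = [D @ (A @ w - gamma * e) + y >= e, y >= 0]
cp.Problem(objective, constraints).solve()
```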

Knowledge-Based SVMs: Generalizing an “Example” from a POINT to a REGION
[diagram: A+ and A− point sets plus a polyhedral advice region]

Incorporating “Knowledge Sets” into the SVM Linear Program
Suppose the knowledge set {x | Bx ≤ d} belongs to class A+. Hence it must lie in the halfspace {x | x'w ≥ γ + 1}. We therefore have the implication
    Bx ≤ d  ⇒  x'w ≥ γ + 1
This implication is equivalent to a set of linear constraints (proof in the NIPS ’02 paper): there exists u ≥ 0 such that B'u + w = 0 and d'u + γ + 1 ≤ 0.

Resulting LP for KBSVMs
We get this linear program (LP), where i ranges over the number of advice regions:
    min   ||w||₁ + C e'y
    s.t.  D(Aw − eγ) + y ≥ e,  y ≥ 0
          B_i'u_i + w = 0,  d_i'u_i + γ + 1 ≤ 0,  u_i ≥ 0
(shown for advice regions in class +1; the signs of w and γ flip for class −1 regions)
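A hedged continuation of the cvxpy sketch: encoding one advice region {x : Bx ≤ d} asserted to lie in class +1. B and d below are made-up numbers describing the region x1 ≥ 1 and x2 ≥ 1:

```python
# Reuses A, D, e, C, w, gamma, y from the earlier sketches.
B = np.array([[-1.0, 0.0],      # -x1 <= -1,  i.e.  x1 >= 1
              [0.0, -1.0]])     # -x2 <= -1,  i.e.  x2 >= 1
d = np.array([-1.0, -1.0])

u = cp.Variable(B.shape[0])     # multipliers certifying the implication
advice = [B.T @ u + w == 0, d @ u + gamma + 1 <= 0, u >= 0]

objective = cp.Minimize(cp.norm(w, 1) + C * cp.sum(y))
base = [D @ (A @ w - gamma * e) + y >= e, y >= 0]
cp.Problem(objective, base + advice).solve()
```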

KBSVM with Slack Variables
Soften the advice: the advice constraints (whose right-hand sides were 0 in the LP above) get their own slack variables, penalized in the objective, so the advice need only hold approximately.

SVMs and Non-Linear Separating Surfaces
[diagram: + and − points in (f1, f2) space, mapped via h(f1, f2) and g(f1, f2) to a new space where the classes separate linearly]
- Non-linearly map to a new space
- Linearly separate in the new space (using kernels)
- The result is a non-linear separator in the original space
Fung et al. (2003) present knowledge-based non-linear SVMs.

Support Vector Regression (aka Kernel Regression)
Linearly approximate a function, given an array A of inputs and a vector y of (numeric) outputs:
    f(x) ≈ x'w + b
- Find weights such that Aw + be ≈ y
- In the dual space, w = A'α, so we get (AA')α + be ≈ y
- “Kernelizing” (to get a non-linear approximation): K(A, A')α + be ≈ y

What to Optimize?
Linear program to optimize:
    min_{α,b,s}  ||α||₁ + C ||s||₁
    s.t.         −s ≤ K(A, A')α + be − y ≤ s
- The 1st term (||α||₁) is a “regularizer” that minimizes model complexity
- The 2nd term is the approximation error, weighted by the parameter C
- Reduces to the classical “least squares” fit if one uses the quadratic version and ignores the first term
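A minimal cvxpy sketch of this linear program on made-up data, using a linear kernel K(A, A') = AA' (all names here are illustrative):

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
A = rng.random((20, 3))                        # toy inputs (made up)
yvec = A @ np.array([1.0, -2.0, 0.5]) + 0.3    # toy outputs
K = A @ A.T                                    # linear kernel K(A, A')
p = K.shape[0]
e = np.ones(p)
C = 10.0

alpha = cp.Variable(p)
b = cp.Variable()
s = cp.Variable(p)                             # approximation-error slacks

residual = K @ alpha + b * e - yvec
prob = cp.Problem(cp.Minimize(cp.norm(alpha, 1) + C * cp.sum(s)),
                  [residual <= s, -s <= residual])
prob.solve()
```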

Predicting Y for New X
    y = K(x', A')α + b
- Use the kernel to compute a “distance” to each training point (i.e., row in A)
- Weight by α_i (hopefully many α_i are zero) and sum
- Add b (a scalar)
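A sketch of that prediction step in Python, assuming a Gaussian kernel (the kernel choice and all names are illustrative, not from the slides):

```python
import numpy as np

def gaussian_kernel(X, Z, sigma=1.0):
    """K[i, j] = exp(-||X_i - Z_j||^2 / (2 sigma^2))."""
    sq_dists = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def predict(x, A, alpha, b, sigma=1.0):
    # y = K(x', A') alpha + b: kernel "distance" from x to each
    # training row, weighted by alpha_i, summed, plus the scalar b.
    return gaussian_kernel(x[None, :], A, sigma)[0] @ alpha + b
```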

Knowledge-Based SVR (Mangasarian, Shavlik, & Wild, JMLR ’04)
Add soft constraints to the linear program (so the learner need only follow the advice approximately):
    minimize   ||w||₁ + C ||s||₁ + a penalty for violating the advice
    such that  y − s ≤ Aw + be ≤ y + s, plus a “slacked” match to the advice
Example advice: “in this region, y should exceed 4”
[plot: a fitted curve and the region over which the advice requires y ≥ 4]
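A hedged sketch of adding such soft advice to the regression LP, in the linear (primal) setting f(x) = x'w + b with advice Bx ≤ d ⇒ x'w + b ≥ h'x + β. The slacked Farkas-style encoding below follows the KBSVM pattern and is our illustration on made-up data, not code from the paper:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
A = rng.random((20, 2))
yvec = A @ np.array([2.0, 1.0])                # toy data (made up)
p, n = A.shape
e = np.ones(p)
C, mu = 10.0, 10.0                             # fit / advice penalties

w = cp.Variable(n); b = cp.Variable(); s = cp.Variable(p)
fit = [A @ w + b * e - yvec <= s, -s <= A @ w + b * e - yvec]

# Advice: in the region x1 >= 1, x2 >= 1 (i.e. Bx <= d), y should exceed 4.
B = np.array([[-1.0, 0.0], [0.0, -1.0]]); d = np.array([-1.0, -1.0])
h = np.zeros(n); beta = 4.0
u = cp.Variable(2); z = cp.Variable(n); zeta = cp.Variable()
advice = [B.T @ u + w - h <= z, -z <= B.T @ u + w - h,   # was  = 0
          d @ u + beta - b <= zeta,                      # was <= 0
          u >= 0, z >= 0, zeta >= 0]

penalty = mu * (cp.sum(z) + zeta)
cp.Problem(cp.Minimize(cp.norm(w, 1) + C * cp.sum(s) + penalty),
           fit + advice).solve()
```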

Testbeds: Subtasks of RoboCup
- Mobile KeepAway: keep the ball from opponents [Stone & Sutton, ICML 2001]
- BreakAway: score a goal [Maclin et al., AAAI 2005]

Reinforcement Learning Overview
- Loop: receive a state (described by a set of features), take an action, receive a reward
- Use the rewards to estimate the Q-values of actions in states
- Policy: choose the action with the highest Q-value in the current state
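For concreteness, here is a generic tabular Q-learning sketch of the cycle just described (illustrative only; the system in these slides learns Q-functions with kernel regression, not a table):

```python
import random
from collections import defaultdict

Q = defaultdict(float)                  # Q[(state, action)] estimates
actions = ["shoot", "pass", "move"]     # example RoboCup-style actions
lr, discount, epsilon = 0.1, 0.9, 0.1   # learning/discount/exploration

def choose_action(state):
    # Policy: usually take the highest-Q action in the current state.
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state):
    # Use the observed reward to refine the Q-value estimate.
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += lr * (reward + discount * best_next
                                - Q[(state, action)])
```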

Incorporating Advice in KBKR
Advice format:
    Bx ≤ d  ⇒  f(x) ≥ hx + β
Example:
    If distanceToGoal ≤ 10 and shotAngle ≥ 30
    Then Q(shoot) ≥ 0.9
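To make the format concrete, here is a hedged sketch of how the example above could be encoded as (B, d, h, β) over the feature vector x = [distanceToGoal, shotAngle]; the encoding is ours, for illustration:

```python
import numpy as np

# distanceToGoal <= 10   ->  [ 1,  0] x <= 10
# shotAngle >= 30        ->  [ 0, -1] x <= -30
B = np.array([[1.0, 0.0],
              [0.0, -1.0]])
d = np.array([10.0, -30.0])

# Q(shoot) >= 0.9 is a constant lower bound: h = 0, beta = 0.9,
# so the advice reads  Bx <= d  =>  f(x) >= h.x + beta = 0.9.
h = np.zeros(2)
beta = 0.9
```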

Giving Advice About the Relative Values of Multiple Functions (Maclin et al., AAAI ’05)
    When the input satisfies preconditions(input)
    Then f1(input) > f2(input)

Sample Advice-Taking Results
    if distanceToGoal ≤ 10 and shotAngle ≥ 30
    then prefer shoot over all other actions
    (i.e., Q(shoot) > Q(pass) and Q(shoot) > Q(move))
[plot: advice vs. standard RL learning curves on 2-vs-1 BreakAway, rewards +1/−1]

Transfer Learning
- The agent learns Task A (the source)
- The agent then encounters a related Task B (the target)
- The agent uses knowledge from Task A to learn Task B faster
- How does the agent discover how the tasks are related? Here, a user-provided mapping tells it.

Transfer Learning: The Goal for the Target Task
[plot: performance vs. training, with transfer and without transfer]
Desired benefits: a better start, a faster rise, and a better asymptote.

Our Transfer Algorithm
1. Observe source-task games to learn skills, using ILP
2. Translate the learned skills into transfer advice for the target task
3. If there is user advice, add it in
4. Learn the target task with KBKR

Learning Skills by Observation
- Source-task games are sequences of (state, action) pairs
- Learning skills is like learning to classify states by their correct actions
- We use ILP (Inductive Logic Programming)
Example state:
    distBetween(me, teammate2) = 15
    distBetween(me, teammate1) = 10
    distBetween(me, opponent1) = 5
    ...
    action = pass(teammate2)
    outcome = caught(teammate2)

ILP: Searching for First-Order Rules
[diagram: a general-to-specific search tree over clauses, e.g. P :- true refined to P :- Q, P :- R, P :- S, then P :- R, Q and P :- R, S, down to P :- R, S, V, W, X]
We also use a random-sampling approach.

Advantages of ILP
- Can produce first-order rules for skills, e.g. pass(Teammate) rather than separate pass(teammate1), ..., pass(teammateN) rules
- First-order rules capture only the essential aspects of the skill, and we expect these aspects to transfer better
- Can incorporate background knowledge

Example of a Skill Learned by ILP from KeepAway
    pass(Teammate) :-
        distBetween(me, Teammate) > 14,
        passAngle(Teammate) > 30,
        passAngle(Teammate) < 150,
        distBetween(me, Opponent) < 7.
We also gave “human” advice about shooting, since that is a new skill in BreakAway.

TL Level 7: KA to BA
[figure: raw learning curves]

TL Level 7: KA to BA
[figure: averaged learning curves]

TL Level 7: Statistics
[table: TL metrics on average reward, each with a score and p-value for KA-to-BA and for MD-to-BA transfer; numeric entries not shown]
- Type I: jump start; jump start (smoothed)
- Type II: transfer ratio; transfer ratio (truncated); average relative reduction (narrow); average relative reduction (wide); ratio of area under the curves; transfer difference; transfer difference (scaled)
- Type III: asymptotic advantage; asymptotic advantage (smoothed)
Boldface indicates a significant difference was found.

Conclusion
- Machine learning can use much more than I/O pairs
- Give advice to computers; they can automatically refine it based on feedback from the user or the environment
- Advice is an appealing mechanism for transferring learned knowledge from computer to computer

Some Papers (on-line; use Google :-)
- Creating Advice-Taking Reinforcement Learners, Maclin & Shavlik, Machine Learning, 1996
- Knowledge-Based Support Vector Machine Classifiers, Fung, Mangasarian, & Shavlik, NIPS 2002
- Knowledge-Based Nonlinear Kernel Classifiers, Fung, Mangasarian, & Shavlik, COLT 2003
- Knowledge-Based Kernel Approximation, Mangasarian, Shavlik, & Wild, JMLR 2004
- Giving Advice about Preferred Actions to Reinforcement Learners Via Knowledge-Based Kernel Regression, Maclin, Shavlik, Torrey, Walker, & Wild, AAAI 2005
- Skill Acquisition via Transfer Learning and Advice Taking, Torrey, Shavlik, Walker, & Maclin, ECML 2006

Backups

Breakdown of Results

What if User Advice is Bad?

Related Work on Transfer
- Q-function transfer in RoboCup: Taylor & Stone (AAMAS 2005, AAAI 2005)
- Transfer via policy reuse: Fernandez & Veloso (AAMAS 2006, ICML workshop 2006); Madden & Howley (AI Review 2004); Torrey et al. (ECML 2005)
- Transfer via relational RL: Driessens et al. (ICML workshop 2006)