Machine Learning Version Spaces Learning

2 Introduction: very many approaches to machine learning:
- Neural Net approaches
- Symbolic approaches:
  - version spaces
  - decision trees
  - knowledge discovery
  - data mining
  - speed-up learning
  - inductive learning
  - …

Version Spaces: a concept learning technique based on refining models of the world.

4 Concept Learning

Example: a student has the following observations about having an allergic reaction after meals:

Restaurant   Meal        Day        Cost        Reaction
Alma 3       breakfast   Friday     cheap       Yes
De Moete     lunch       Friday     expensive   No
Alma 3       lunch       Saturday   cheap       Yes
Sedes        breakfast   Sunday     cheap       No
Alma 3       breakfast   Sunday     expensive   No

Concept to learn: under which circumstances do I get an allergic reaction after meals?

5 In general

- There is a set of all possible events.
  Example: Restaurant X Meal X Day X Cost, with 3 X 3 X 7 X 2 = 126 possible events.
- There is a boolean function (implicitly) defined on this set.
  Example: Reaction: Restaurant X Meal X Day X Cost --> Bool
- We have the value of this function for SOME examples only.
- Find an inductive ‘guess’ of the concept that covers all the examples!

6 Pictured: the set of all possible events, with the given examples marked + and -.

Given some examples: find a concept that covers all the positive examples and none of the negative ones!

7 Non-determinism

Many different ways to solve this! (Pictured: several different concepts, each covering all the positive examples and none of the negative ones.)

How to choose?

8 An obvious, but bad choice:

The concept IS:
- Alma 3 and breakfast and Friday and cheap, OR
- Alma 3 and lunch and Saturday and cheap.

(Reaction table as on slide 4.)

This does NOT generalize the examples at all!!

9 Pictured: with this choice, only the positive examples themselves are classified as positive.

10 Equally bad is:

The concept is anything EXCEPT:
- De Moete and lunch and Friday and expensive, AND
- Sedes and breakfast and Sunday and cheap, AND
- Alma 3 and breakfast and Sunday and expensive.

(Reaction table as on slide 4.)

11 Pictured: everything except the negative examples is classified as positive.

12 Solution: fix a language of hypotheses

- We introduce a fixed language of concept descriptions: the hypothesis space.
- The concept can only be identified as being one of the hypotheses in this language. This:
  - avoids the problem of having ‘useless’ conclusions;
  - forces some generalization/induction, to cover more than just the given examples.

13 Example (Reaction table as on slide 4):

Every hypothesis is a 4-tuple over [Restaurant, Meal, Day, Cost]:
- most general hypothesis: [ ?, ?, ?, ? ]
- maximally specific, e.g.: [ Sedes, lunch, Monday, cheap ]
- combinations of ? and values are allowed, e.g.: [ De Moete, ?, ?, expensive ] or [ ?, lunch, ?, ? ]
- one more hypothesis: ⊥ (bottom, which denotes no example at all)

14 Hypotheses relate to sets of possible events

(Pictured: two events x1 and x2 in the event space, and the sets of events covered by three hypotheses.)

h1 = [ ?, lunch, Monday, ? ]
h2 = [ ?, lunch, ?, cheap ]
h3 = [ ?, lunch, ?, ? ]

h3 is more general than both h1 and h2: it covers a larger set of events. The hypotheses are ordered from Specific to General.

15 Expressive power of this hypothesis language:

- Conjunctions of explicit, individual properties.
- Examples:
  - [ ?, lunch, Monday, ? ] : Meal = lunch ∧ Day = Monday
  - [ ?, lunch, ?, cheap ]  : Meal = lunch ∧ Cost = cheap
  - [ ?, lunch, ?, ? ]      : Meal = lunch
- In addition, there are the 2 special hypotheses:
  - [ ?, ?, ?, ? ] : anything
  - ⊥              : nothing
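
To make the coverage relation concrete, here is a minimal Python sketch (not from the original slides; the names covers and BOTTOM are illustrative). A hypothesis is a tuple of attribute values in which '?' means "any value", and BOTTOM stands for ⊥:

BOTTOM = None   # stands for the bottom hypothesis, which covers no event

def covers(hypothesis, event):
    # A conjunctive hypothesis covers an event if every constrained
    # attribute matches the event's value for that attribute.
    if hypothesis is BOTTOM:
        return False
    return all(h == '?' or h == e for h, e in zip(hypothesis, event))

h3 = ('?', 'lunch', '?', '?')
print(covers(h3, ('Alma 3', 'lunch', 'Saturday', 'cheap')))   # True
print(covers(h3, ('Sedes', 'breakfast', 'Sunday', 'cheap')))  # False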

16 Other languages of hypotheses are allowed

Example: identify the color of a given sequence of colored objects.

The examples: red : +   purple : -   purple : -   blue : +   blue : +

A useful language of hypotheses (a generalization hierarchy):
   any_color
      mono_color: red, blue, green, orange, purple
      poly_color
   ⊥ (bottom)

17 Important about hypothesis languages:

They should have a specific-to-general ordering, corresponding to the set-inclusion of the events they cover.

18 Defining Concept Learning:

Given:
- A set X of possible events.
  Ex.: the Eat-events Restaurant X Meal X Day X Cost.
- An (unknown) target function c: X -> {-, +}.
  Ex.: Reaction: Eat-events -> {-, +}.
- A language of hypotheses H.
  Ex.: conjunctions such as [ ?, lunch, Monday, ? ].
- A set of training examples D, with their value under c.
  Ex.: ((Alma 3, breakfast, Friday, cheap), +), …

Find:
A hypothesis h in H such that for all x in D:
   x is covered by h  <=>  c(x) = +
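
In code, the event space X and the training set D of the running example can be written down directly (a sketch; the attribute lists and variable names are mine, read off the Reaction table):

from itertools import product

RESTAURANTS = ['Alma 3', 'De Moete', 'Sedes']
MEALS       = ['breakfast', 'lunch', 'dinner']
DAYS        = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
COSTS       = ['cheap', 'expensive']
DOMAINS     = (RESTAURANTS, MEALS, DAYS, COSTS)

X = list(product(*DOMAINS))        # 3 * 3 * 7 * 2 = 126 possible events

D = [(('Alma 3', 'breakfast', 'Friday', 'cheap'),     '+'),
     (('De Moete', 'lunch', 'Friday', 'expensive'),   '-'),
     (('Alma 3', 'lunch', 'Saturday', 'cheap'),       '+'),
     (('Sedes', 'breakfast', 'Sunday', 'cheap'),      '-'),
     (('Alma 3', 'breakfast', 'Sunday', 'expensive'), '-')]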

19 The inductive learning hypothesis: If a hypothesis approximates the target function well over a sufficiently large number of examples, then the hypothesis will also approximate the target function well on other unobserved examples.

20 Find-S: a naïve algorithm

Initialize: h := ⊥
For each positive training example x in D:
   If h does not cover x:
      Replace h by a minimal generalization of h that covers x
Return h
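
A runnable sketch of Find-S for this attribute-vector language, reusing the hypothetical covers()/BOTTOM helpers and the data D introduced above. In this flat language the minimal generalization is unique (mismatching attributes become '?'); in richer languages, such as the color hierarchy of slide 16, there can be several, which is where Find-S becomes non-deterministic:

def minimal_generalization(h, x):
    # Smallest change to h that makes it cover the positive example x.
    if h is BOTTOM:
        return tuple(x)            # generalize bottom to the example itself
    return tuple(hi if hi == xi else '?' for hi, xi in zip(h, x))

def find_s(examples):
    h = BOTTOM
    for x, label in examples:
        if label == '+' and not covers(h, x):
            h = minimal_generalization(h, x)
    return h

print(find_s(D))   # ('Alma 3', '?', '?', 'cheap')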

21 Reaction example (table as on slide 4):

Initially: h = ⊥
Example 1 (first positive example: Alma 3, breakfast, Friday, cheap):
   h = [Alma 3, breakfast, Friday, cheap]
   (the minimal generalizations of ⊥ are the individual events)
Example 2 (second positive example: Alma 3, lunch, Saturday, cheap):
   h = [Alma 3, ?, ?, cheap]
   (generalization = replace mismatching values by ?)
No more positive examples: return h

22 Properties of Find-S

Non-deterministic: depending on H, there may be several minimal generalizations.

(Pictured: a hypothesis space H with hypotheses Hacker, Scientist and Footballplayer and examples Beth, Jo and Alex: Beth can be minimally generalized in 2 ways to include a new example Jo.)

23 Properties of Find-S (2)

May pick an incorrect hypothesis (w.r.t. the negative examples):

D:  Beth +,  Alex -,  Jo +

(Pictured: in the same space, a generalization that covers Beth and Jo but also the negative example Alex is a wrong solution.)

24 Properties of Find-S (3)

Cannot detect inconsistency of the training data:
   D:  Beth +,  Jo +,  Beth -

Nor the inability of the language H to learn the concept:
   D:  Beth +,  Alex -,  Jo +
   (Pictured: a hypothesis space H in which no hypothesis separates these examples.)

25 Nice about Find-S:

It doesn’t have to remember previous examples!

If the previous h already covered all previous examples (say, the first 20), then a minimal generalization h’ will cover them as well, since h’ covers everything h covers.

26 Dual Find-S:

Initialize: h := [ ?, ?, …, ? ]
For each negative training example x in D:
   If h does cover x:
      Replace h by a minimal specialization of h that does not cover x
Return h
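
A matching sketch of Dual Find-S, again reusing the hypothetical covers(), DOMAINS and D from the earlier sketches. Unlike minimal generalization, a minimal specialization is not unique: any still-unconstrained attribute can be fixed to any value that differs from the negative example, so the code below simply takes the first candidate; the slide trace that follows makes a different, equally minimal choice:

def minimal_specializations(h, x):
    # All hypotheses obtained from h by fixing exactly one '?' attribute
    # to a value that excludes the negative example x.
    result = []
    for i, hi in enumerate(h):
        if hi == '?':
            for value in DOMAINS[i]:
                if value != x[i]:
                    result.append(h[:i] + (value,) + h[i + 1:])
    return result

def dual_find_s(examples):
    h = ('?', '?', '?', '?')
    for x, label in examples:
        if label == '-' and covers(h, x):
            h = minimal_specializations(h, x)[0]   # arbitrary minimal choice
    return h

print(dual_find_s(D))   # ('Alma 3', 'lunch', '?', '?') with this choice rule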

27 Reaction example (table as on slide 4):

Initially: h = [ ?, ?, ?, ? ]
Negative example 1 (De Moete, lunch, Friday, expensive):   h = [ ?, breakfast, ?, ? ]
Negative example 2 (Sedes, breakfast, Sunday, cheap):      h = [ Alma 3, breakfast, ?, ? ]
Negative example 3 (Alma 3, breakfast, Sunday, expensive): h = [ Alma 3, breakfast, ?, cheap ]

28 Version Spaces: the idea

Perform both Find-S and Dual Find-S:
- Find-S deals with the positive examples
- Dual Find-S deals with the negative examples

BUT: do NOT select 1 minimal generalization or specialization at each step; instead keep track of ALL minimal generalizations or specializations.

29 Version spaces: initialization

The boundary sets G and S are initialized to the most general and the most specific hypothesis, respectively:

G = { [ ?, ?, …, ? ] }
S = { ⊥ }

30 Negative examples:

Replace the top hypothesis by ALL minimal specializations that DO NOT cover the negative example:

G = { [ ?, ?, …, ? ] }   becomes   G = { h1, h2, …, hn }
S = { ⊥ }

Invariant: only the hypotheses more specific than the ones of G are still possible: they don’t cover the negative example.

31 Positive examples:

Replace the bottom hypothesis by ALL minimal generalizations that DO cover the positive example:

G = { h1, h2, …, hn }
S = { ⊥ }   becomes   S = { h1’, h2’, …, hm’ }

Invariant: only the hypotheses more general than the ones of S are still possible: they do cover the positive example.

32 Later: negative examples

Replace all the hypotheses in G that cover the next negative example by ALL minimal specializations that DO NOT cover it:

G = { h1, h2, …, hn }   becomes   G = { h1, h21, h22, …, hn }
S = { h1’, h2’, …, hm’ }

Invariant: only the hypotheses more specific than the ones of G are still possible: they don’t cover the negative example.

33 Later: positive examples

Replace all the hypotheses in S that do not cover the next positive example by ALL minimal generalizations that DO cover it:

G = { h1, h21, h22, …, hn }
S = { h1’, h2’, …, hm’ }   becomes   S = { h11’, h12’, h13’, h2’, …, hm’ }

Invariant: only the hypotheses more general than the ones of S are still possible: they do cover the positive example.

34 Optimization: negative examples

Only consider specializations of elements in G that are still more general than some specific hypothesis in S. (This uses the invariant on S.)

35 Optimization: positive examples

Only consider generalizations of elements in S that are still more specific than some general hypothesis in G. (This uses the invariant on G.)

36 Pruning: negative examples

The new negative example can also be used to prune all the S-hypotheses that cover it: the invariant on S only holds for the previous examples, not for the last one, so any S-hypothesis that covers the last negative example must be removed.

37 Pruning: positive examples

The new positive example can also be used to prune all the G-hypotheses that do not cover it.

38 Eliminate redundant hypotheses

If a hypothesis from G is more specific than another hypothesis from G: eliminate it!

Reason: the invariant acts as a wave front: anything above G is not allowed, so the most general elements of G define the real boundary.

Obviously the same holds (dually) for S.

39 Convergence:

Eventually G and S may come to share a common element: then Version Spaces has converged to a solution. Remaining examples still need to be verified against the solution.

40 Reaction example

Initialization:
   Most general:  [ ?, ?, ?, ? ]
   Most specific: ⊥

41 Alma 3, breakfast, Friday, cheap: +

Positive example: minimal generalization of ⊥:
   G = { [ ?, ?, ?, ? ] }
   S:  ⊥  →  [ Alma 3, breakfast, Friday, cheap ]

42 De Moete, lunch, Friday, expensive: -

Negative example: minimal specialization of [ ?, ?, ?, ? ]: 15 possible specializations!!

- Eliminated because they match the negative example:
  [De Moete, ?, ?, ?]   [?, lunch, ?, ?]   [?, ?, Friday, ?]   [?, ?, ?, expensive]
- Eliminated because they do not generalize the specific model [Alma 3, breakfast, Friday, cheap]:
  [Sedes, ?, ?, ?]   [?, dinner, ?, ?]   [?, ?, Monday, ?]   [?, ?, Tuesday, ?]   [?, ?, Wednesday, ?]
  [?, ?, Thursday, ?]   [?, ?, Saturday, ?]   [?, ?, Sunday, ?]
- Remain:
  [Alma 3, ?, ?, ?]   [?, breakfast, ?, ?]   [?, ?, ?, cheap]
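
The same filtering can be reproduced with the helpers sketched earlier (covers(), DOMAINS; all of these names are mine, not the slides'). Because the specific model is fully instantiated, "g is more general than the specific model" is the same as "g covers it when viewed as an event":

def all_single_attribute_specializations(h):
    # Every hypothesis obtained from h by fixing exactly one '?' attribute.
    return [h[:i] + (v,) + h[i + 1:]
            for i, hi in enumerate(h) if hi == '?'
            for v in DOMAINS[i]]

top      = ('?', '?', '?', '?')
negative = ('De Moete', 'lunch', 'Friday', 'expensive')
s_model  = ('Alma 3', 'breakfast', 'Friday', 'cheap')    # current specific model

candidates = all_single_attribute_specializations(top)   # the 15 candidates
kept = [g for g in candidates
        if not covers(g, negative)    # must exclude the negative example
        and covers(g, s_model)]       # must stay more general than the specific model
print(kept)   # [('Alma 3','?','?','?'), ('?','breakfast','?','?'), ('?','?','?','cheap')]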

43 Result after example 2:

G = { [Alma 3, ?, ?, ?],  [?, breakfast, ?, ?],  [?, ?, ?, cheap] }
S = { [Alma 3, breakfast, Friday, cheap] }

(The top hypothesis [ ?, ?, ?, ? ] has been replaced by its three surviving specializations.)

44 Alma 3, lunch, Saturday, cheap: +

Positive example: minimal generalization of [Alma 3, breakfast, Friday, cheap]:

S:  [Alma 3, breakfast, Friday, cheap]  →  [Alma 3, ?, ?, cheap]
G:  [?, breakfast, ?, ?] does not match the new example and is pruned;
    G = { [Alma 3, ?, ?, ?],  [?, ?, ?, cheap] }

45 Sedes, breakfast, Sunday, cheap: -

Negative example: minimal specialization of the general models [Alma 3, ?, ?, ?] and [?, ?, ?, cheap]:

[Alma 3, ?, ?, ?] does not cover the example and stays.
[?, ?, ?, cheap] covers it; its only admissible specialization, [Alma 3, ?, ?, cheap], is pruned because it is more specific than another general hypothesis ([Alma 3, ?, ?, ?]).

G = { [Alma 3, ?, ?, ?] },  S = { [Alma 3, ?, ?, cheap] }

46 Alma 3, breakfast, Sunday, expensive: -

Negative example: minimal specialization of [Alma 3, ?, ?, ?]:

[Alma 3, ?, ?, ?]  →  [Alma 3, ?, ?, cheap]

This is the same hypothesis as the one in S!!! Converged: cheap food at Alma 3 produces the allergy!

47 Version Space Algorithm:

Initially:
   G := { the hypothesis that covers everything }
   S := { ⊥ }

For each new positive example:
   Generalize all hypotheses in S that do not cover the example yet,
   but ensure the following:
      - Only introduce minimal changes on the hypotheses.
      - Each new specific hypothesis is a specialization of some general hypothesis.
      - No new specific hypothesis is a generalization of some other specific hypothesis.
   Prune away all hypotheses in G that do not cover the example.

48 Version Space Algorithm (2):

…
For each new negative example:
   Specialize all hypotheses in G that cover the example,
   but ensure the following:
      - Only introduce minimal changes on the hypotheses.
      - Each new general hypothesis is a generalization of some specific hypothesis.
      - No new general hypothesis is a specialization of some other general hypothesis.
   Prune away all hypotheses in S that cover the example.

Until:
   there are no more examples: report S and G,
   OR S or G becomes empty: report failure.
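
Put together, the two slides above give the candidate-elimination loop. The sketch below is a simplified Python rendering for the attribute-vector language, reusing the hypothetical helpers from the earlier sketches (BOTTOM, covers, minimal_generalization, minimal_specializations, D); it is an illustration of the algorithm as described, not the original course code:

def more_general(g, h):
    # True if g covers at least the events covered by h (h may be BOTTOM).
    if h is BOTTOM:
        return True
    return all(gi == '?' or gi == hi for gi, hi in zip(g, h))

def candidate_elimination(examples):
    G = [('?', '?', '?', '?')]
    S = [BOTTOM]
    for x, label in examples:
        if label == '+':
            G = [g for g in G if covers(g, x)]                 # prune G
            new_S = []
            for s in S:
                if covers(s, x):
                    new_S.append(s)
                else:
                    s2 = minimal_generalization(s, x)
                    if any(more_general(g, s2) for g in G):    # keep S below G
                        new_S.append(s2)
            new_S = list(dict.fromkeys(new_S))                 # drop duplicates
            # keep only the most specific elements of S
            S = [s for s in new_S
                 if not any(t != s and more_general(s, t) for t in new_S)]
        else:
            S = [s for s in S if not covers(s, x)]             # prune S
            new_G = []
            for g in G:
                if not covers(g, x):
                    new_G.append(g)
                else:
                    for g2 in minimal_specializations(g, x):
                        if any(more_general(g2, s) for s in S):   # keep G above S
                            new_G.append(g2)
            new_G = list(dict.fromkeys(new_G))
            # keep only the most general elements of G
            G = [g for g in new_G
                 if not any(h != g and more_general(h, g) for h in new_G)]
        if not S or not G:
            raise ValueError('inconsistent data or concept not expressible in H')
    return S, G

S, G = candidate_elimination(D)
print(S, G)   # both reduce to [('Alma 3', '?', '?', 'cheap')]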

49 Properties of VS:

- Symmetry: positive and negative examples are dealt with in a completely dual way.
- Does not need to remember previous examples.
- Noise: VS cannot deal with noise!
  If a positive example is mistakenly given as negative, then VS eliminates the desired hypothesis from G!

50 Termination:

If it terminates because of “no more examples”:

Example (the version space on termination):
   G = { [Alma 3, ?, ?, ?],  [?, ?, Monday, ?] }
   intermediate: [Alma 3, ?, ?, cheap],  [?, ?, Monday, cheap],  [Alma 3, ?, Monday, ?]
   S = { [Alma 3, ?, Monday, cheap] }

Then all these hypotheses, and all intermediate hypotheses, are still correct descriptions covering the given examples.

VS makes NO unnecessary choices!

51 Termination (2):

If it terminates because S or G becomes empty, then either:
- the data is inconsistent (noise?), or
- the target concept cannot be represented in the hypothesis language H.

Example: if the target concept is
   [Alma 3, breakfast, ?, cheap]  ∨  [Alma 3, lunch, ?, cheap]
then, given suitable examples, this cannot be learned in our conjunctive language H.

52 Which example next?

VS can decide itself which example would be most useful next: it can ‘query’ a user for the most relevant additional classification!

Example: with the six remaining hypotheses
   [Alma 3, ?, ?, ?], [?, ?, Monday, ?], [Alma 3, ?, ?, cheap], [?, ?, Monday, cheap], [Alma 3, ?, Monday, ?], [Alma 3, ?, Monday, cheap]
an event that is classified positive by 3 hypotheses and negative by the other 3 is the most informative new example.
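
One simple way to code this preference (a sketch using the hypothetical covers() helper; the scoring rule is the obvious one implied by the slide, not something the slides spell out): score each candidate event by how evenly the remaining hypotheses split on it, and query the event with the most even split.

def informativeness(event, hypotheses):
    # 0 is best: the hypotheses split evenly, so either answer halves the space.
    pos = sum(covers(h, event) for h in hypotheses)
    return abs(2 * pos - len(hypotheses))

def best_query(candidate_events, hypotheses):
    return min(candidate_events, key=lambda e: informativeness(e, hypotheses))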

53 Use of partially learned concepts:

Example (with the six remaining hypotheses of slide 50): an event that is covered by all remaining hypotheses can be classified as positive.

It is enough to check that it is covered by the hypotheses in S! (all others generalize these)

54 Use of partially learned concepts (2):

Example: an event that is not covered by any remaining hypothesis can be classified as negative.

It is enough to check that it is not covered by any hypothesis in G! (all others specialize these)

55 Use of partially learned concepts (3):

Example: an event that is covered by 3 and not covered by the other 3 remaining hypotheses cannot be classified: no conclusion.

56 Use of partially learned concepts (4):

Example: an event that is covered by 1 and not covered by 5 of the remaining hypotheses can only be classified with a certain degree of precision: it probably does not belong to the concept (ratio: 1/6).
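
The four cases above collapse into a small classification routine (again a sketch with my own names, using covers()): full coverage of S means positive, no coverage in G means negative, and anything in between is at best a ratio over the remaining hypotheses.

def classify(event, S, G, all_hypotheses=()):
    if all(covers(s, event) for s in S):
        return '+'          # covered by all of S, hence by every remaining hypothesis
    if not any(covers(g, event) for g in G):
        return '-'          # covered by nothing in G, hence by no remaining hypothesis
    if all_hypotheses:      # optional: fraction of the remaining version space voting +
        pos = sum(covers(h, event) for h in all_hypotheses)
        return f'+ with ratio {pos}/{len(all_hypotheses)}'
    return '?'              # cannot be classified with certainty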

57 The relevance of inductive BIAS: choosing H

- Our hypothesis language H fails to learn some concepts.
  See the example of slide 51: [Alma 3, breakfast, ?, cheap] ∨ [Alma 3, lunch, ?, cheap]
- What about choosing a more expressive language H’?
- Assume H’:
  - allows conjunctions (as before)
  - allows disjunction and negation too!
  Example: Restaurant = Alma 3  ∧  ~(Day = Monday)

58 Inductive BIAS (2):

This language H’ allows us to represent ANY subset of the complete set of all events X.

But X has 3 X 3 X 7 X 2 = 126 elements, so we can express 2^126 different hypotheses now!

59 Inductive BIAS (3)

Version Spaces using H’ on the Reaction data (table as on slide 4):

After example 1 (+):  S = { [Alma 3, breakfast, Friday, cheap] }
                      G = { ? }   (the hypothesis covering everything)
After example 2 (-):  G = { ~[De Moete, lunch, Friday, expensive] }
After example 3 (+):  S = { [Alma 3, breakfast, Friday, cheap] ∨ [Alma 3, lunch, Saturday, cheap] }
After example 4 (-):  G = { ~[De Moete, lunch, Friday, expensive] ∧ ~[Sedes, breakfast, Sunday, cheap] }
After example 5 (-):  G = { ~[De Moete, lunch, Friday, expensive] ∧ ~[Sedes, breakfast, Sunday, cheap] ∧ ~[Alma 3, breakfast, Sunday, expensive] }

60 Inductive BIAS (4)

Resulting Version Spaces:
   G = { ~[De Moete, lunch, Friday, expensive] ∧ ~[Sedes, breakfast, Sunday, cheap] ∧ ~[Alma 3, breakfast, Sunday, expensive] }
   S = { [Alma 3, breakfast, Friday, cheap] ∨ [Alma 3, lunch, Saturday, cheap] }

We haven’t learned anything: we merely restated our positive and negative examples!

In general: in order to be able to learn, we need an inductive BIAS (= assumption).
Example: “The desired concept CAN be described as a conjunction of features.”

61 Shift of Bias

Practical approach to the bias problem:
- Start VS with a very weak hypothesis language.
- If the concept is learned: o.k.
- Else: refine your language and restart VS.

This avoids the choice of language up front, and gives the most general concept that can be learned.