International Graduate School of Dynamic Intelligent Systems Machine Learning RG Knowledge Based Systems Hans Kleine Büning 15 July 2015.


1 International Graduate School of Dynamic Intelligent Systems Machine Learning RG Knowledge Based Systems Hans Kleine Büning 15 July 2015

2 Hans Kleine Büning, 9 January 2009, RG Knowledge Based Systems, University of Paderborn, International Graduate School of Dynamic Intelligent Systems
Outline
- Learning by Example
  - Motivation
  - Decision Trees
  - ID3
  - Overfitting
  - Pruning
  - Exercise
- Reinforcement Learning
  - Motivation
  - Markov Decision Processes
  - Q-Learning
  - Exercise

3 Outline (same as slide 2)

4 Motivation
- Partly inspired by human learning
- Objectives:
  - Classify entities according to some given examples
  - Find structures in big databases
  - Gain new knowledge from the samples
- Input:
  - Learning examples with
    - assigned attributes
    - assigned classes
- Output:
  - General classifier for the given task

5 Classifying Training Examples
- Training example for EnjoySport
- General training examples

6 Attributes & Classes
- Attribute: A_i
  - Number of different values for A_i: |A_i|
- Class: C_i
  - Number of different classes: |C|
- Premises:
  - n > 2
  - Consistent examples (no two objects with the same attributes and different classes)

7 Possible Solutions
- Decision Trees
  - ID3
  - C4.5
  - CART
- Rule Based Systems
- Clustering
- Neural Networks
  - Backpropagation
  - Neuroevolution

8 Decision Trees
- Idea: classify entities using if-then rules
- Example: classifying mushrooms
  - Attributes: Colour, Size, Points
  - Classes: edible, poisonous
- Resulting rules:
  - if (Colour = red) and (Size = small) then poisonous
  - if (Colour = green) then edible
  - …
- Example table (flattened in the transcript, row alignment not recoverable): columns Colour, Size, Points, Class; extracted values: Colour red, brown, green, red; Size small, big, small, big; Points yes, no, yes, no; Class poisonous, edible

9 Decision Trees
- Different decision trees exist for the same task.
- On average, the left tree reaches a decision earlier.

10 How to measure tree quality?
- Number of leaves?
  - Number of generated rules
- Tree height?
  - Maximum rule length
- External path length?
  - Sum of the lengths of all paths from root to leaf
  - Amount of memory needed for all rules
- Weighted external path length
  - Like external path length
  - Paths are weighted by the number of objects they represent

11 Back to the Example
Criterion | Left Tree | Right Tree
number of leaves | 4 | 5
height | 2 | 2
external path length | 6 | 5
weighted external path length | 7 | 8

12 Weighted External Path Length
- Idea from information theory:
  - Given:
    - Text which should be compressed
    - Probabilities for character occurrence
  - Result:
    - Coding tree
- Example: eeab
  - p(e) = 0.5
  - p(a) = 0.25
  - p(b) = 0.25
  - Encoding: 110001
- Build the tree according to the information content.
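The coding example can be checked with a few lines of Python. This is only a sketch: the code table below (e → 1, a → 00, b → 01) is an assumption, chosen because it is a prefix code that reproduces the slide's encoding 110001. The slide itself does not list the code words.

```python
# Hypothetical prefix code that reproduces the slide's encoding of "eeab".
code = {"e": "1", "a": "00", "b": "01"}
# Character probabilities as given on the slide.
prob = {"e": 0.5, "a": 0.25, "b": 0.25}


def encode(text, code):
    """Concatenate the code word of every character in text."""
    return "".join(code[ch] for ch in text)


def mean_code_length(code, prob):
    """Expected number of bits per character (the weighted path length)."""
    return sum(prob[ch] * len(code[ch]) for ch in code)


encoded = encode("eeab", code)
avg_bits = mean_code_length(code, prob)
print(encoded, avg_bits)
```

Frequent characters sit close to the root of the coding tree, which is the same idea as weighting each path by the number of objects it represents.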

13 Entropy
- Entropy = measure of the mean information content
- In general: (formula not preserved in the transcript)
- Mean number of bits needed to encode each element with an optimal encoding (= mean height of the theoretically optimal encoding tree)
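The slide's own formula did not survive extraction. A minimal sketch, assuming the standard Shannon definition H = -Σ p_i · log2(p_i), could look as follows; the function name `entropy` is chosen here for illustration.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits; terms with probability 0 contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Character probabilities from the "eeab" example: p(e), p(a), p(b).
h = entropy([0.5, 0.25, 0.25])
print(h)
```

For the probabilities of the coding example, the entropy equals the mean code length of an optimal prefix code, which matches the slide's interpretation as the mean height of the optimal encoding tree.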

14 Information Gain
- Information gain = expected reduction of entropy due to sorting
- Conditional entropy: (formula not preserved in the transcript)
- Information gain: (formula not preserved in the transcript)
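Since the two formulas on this slide are missing from the transcript, the block below states the standard textbook definitions as an assumption about what was shown. The symbols a_i (values of attribute A) and C_j (classes) follow the notation used elsewhere in the deck.

```latex
\begin{align*}
H(C \mid A) &= \sum_{i} P(a_i)\, H(C \mid A = a_i)
             = -\sum_{i} P(a_i) \sum_{j} P(C_j \mid a_i) \log_2 P(C_j \mid a_i) \\
\mathrm{Gain}(C, A) &= H(C) - H(C \mid A)
\end{align*}
```

Because H(C) does not depend on the attribute, maximising the gain and minimising the conditional entropy select the same attribute.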

15 Entropy & Decision Trees
- Use conditional entropy and information gain for selecting split attributes.
- Chosen split attribute: A_k
- Possible values for A_k: (list not preserved in the transcript)
- x_i: number of objects with value a_i for A_k
- x_{i,j}: number of objects with value a_i for A_k and class C_j
- Formula annotations:
  - Probability that one of the objects has attribute a_i
  - Probability that an object with attribute a_i has class C_j
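The counts x_i and x_{i,j} are enough to compute the conditional entropy and the information gain. The sketch below assumes the usual relative-frequency estimates (x_i divided by the total number of objects, and x_{i,j} / x_i). The count table is hypothetical and is not the slide's mushroom data.

```python
import math


def conditional_entropy(counts):
    """H(C | A_k), where counts[i][j] = x_{i,j}."""
    total = sum(sum(row) for row in counts)
    h = 0.0
    for row in counts:
        x_i = sum(row)
        if x_i == 0:
            continue  # attribute value a_i never occurs
        p_value = x_i / total  # estimate of P(a_i)
        h_value = -sum((x_ij / x_i) * math.log2(x_ij / x_i)
                       for x_ij in row if x_ij > 0)
        h += p_value * h_value
    return h


def class_entropy(counts):
    """H(C), computed from the column sums of the count table."""
    total = sum(sum(row) for row in counts)
    class_totals = [sum(column) for column in zip(*counts)]
    return -sum((c / total) * math.log2(c / total)
                for c in class_totals if c > 0)


def information_gain(counts):
    return class_entropy(counts) - conditional_entropy(counts)


# Hypothetical counts: 3 attribute values (rows) x 2 classes (columns).
counts = [[1, 1], [0, 2], [1, 0]]
print(conditional_entropy(counts), information_gain(counts))
```

Only the mixed first row contributes to the conditional entropy here; the two pure rows have entropy 0.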

16 Decision Tree Construction
- Choose the split attribute A_k that gives the highest information gain or, equivalently, the smallest conditional entropy H(C|A_k)
- Example: colour (calculation not preserved in the transcript)
- Example table (flattened in the transcript, row alignment not recoverable): columns Colour, Size, Points, Class; extracted values: Colour red, brown, green, red; Size small, big, small, big; Points yes, no, yes, no; Class poisonous, edible

17 Decision Tree Construction (2)
- Analogously:
  - H(C|A_colour) = 0.4
  - H(C|A_size) ≈ 0.4562
  - H(C|A_points) = 0.4
- Choose colour or points as the first split criterion
- Recursively repeat this procedure

18 Decision Tree Construction (3)
- Right side is trivial
- Left side: both attributes have the same information gain

19 Generalisation
- The classifier should also be able to handle unknown data.
- The classifying model is often called a hypothesis.
- Testing generality:
  - Divide the samples into
    - a training set
    - a validation or test set
  - Learn according to the training set
  - Test generality according to the validation set
- Error computation:
  - Test set X
  - Hypothesis h
  - error(X, h): function which is monotonically increasing in the number of examples in X that h classifies wrongly
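The procedure above can be sketched in Python. The misclassification rate is used here as one possible choice of error(X, h); the slide only requires monotonicity in the number of wrong classifications. The split fraction, the sample data and the hypothesis `h` are assumptions made for illustration.

```python
import random


def split_samples(samples, validation_fraction=0.3, seed=0):
    """Shuffle the samples and split them into (training, validation)."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


def error(X, h):
    """Fraction of examples (attributes, label) in X misclassified by h."""
    if not X:
        return 0.0
    wrong = sum(1 for attributes, label in X if h(attributes) != label)
    return wrong / len(X)


# Hypothetical samples: (attributes, class).
samples = [
    ({"colour": "red"}, "poisonous"),
    ({"colour": "green"}, "edible"),
    ({"colour": "brown"}, "edible"),
    ({"colour": "red"}, "edible"),
    ({"colour": "green"}, "edible"),
] * 2


def h(attributes):
    """Hypothetical hypothesis: red mushrooms are poisonous."""
    return "poisonous" if attributes["colour"] == "red" else "edible"


training, validation = split_samples(samples)
print(len(training), len(validation), error(samples, h))
```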

20 Overfitting
- The learnt hypothesis performs well on the training set but poorly on the validation set
- Formally: h is overfitted if there exists a hypothesis h' with error(D, h) < error(D, h') and error(X, h) > error(X, h')
  - X: validation set
  - D: training set

21 Avoiding Overfitting
- Stopping
  - Don't split further if some criterion is met
  - Examples:
    - Size of node n: don't split if n contains fewer than a given threshold number of examples.
    - Purity of node n: don't split if the purity gain is not big enough.
- Pruning
  - Reduce the decision tree after training.
  - Examples:
    - Reduced Error Pruning
    - Minimal Cost-Complexity Pruning
    - Rule-Post Pruning

22 Pruning
- Pruning syntax:
  - If T' was produced by (repeated) pruning of T, we write: (notation not preserved in the transcript)

23 Maximum Tree Creation
- Before pruning we need a maximum tree T_max
- What is a maximum tree?
  - All leaf nodes are smaller than some threshold, or
  - All leaf nodes represent only one class, or
  - All leaf nodes have only objects with the same attribute values
- T_max is then pruned starting from the leaves.

24 Reduced Error Pruning
1. Consider a branch T_n of T
2. Replace T_n by a leaf with the class that is most frequently associated with T_n
3. If error(X, h(T)) < error(X, h(T/T_n)), undo the replacement
4. Go back to 1 until all non-leaf nodes have been considered
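The four steps can be sketched as follows. The `Node` class, the bottom-up visiting order and the small example tree are assumptions for illustration; the slide does not prescribe a data structure, and `majority` stands for the class most frequently associated with a subtree in the training data.

```python
class Node:
    """Decision tree node; a node without a split attribute is a leaf."""

    def __init__(self, attribute=None, children=None, label=None, majority=None):
        self.attribute = attribute      # split attribute, None for a leaf
        self.children = children or {}  # attribute value -> subtree
        self.label = label              # class of a leaf
        self.majority = majority if majority is not None else label


def classify(node, example):
    while node.attribute is not None:
        child = node.children.get(example.get(node.attribute))
        if child is None:               # unseen value: fall back to majority
            return node.majority
        node = child
    return node.label


def error(X, root):
    """Misclassification rate of the tree on the validation set X."""
    return sum(classify(root, ex) != label for ex, label in X) / len(X)


def reduced_error_prune(root, X):
    def visit(node):
        if node.attribute is None:
            return
        for child in node.children.values():
            visit(child)                # start from the leaves
        before = error(X, root)
        saved = (node.attribute, node.children, node.label)
        # Step 2: replace the branch by a leaf with its majority class.
        node.attribute, node.children, node.label = None, {}, node.majority
        # Step 3: undo if the unpruned tree was strictly better.
        if before < error(X, root):
            node.attribute, node.children, node.label = saved

    visit(root)
    return root


# Hypothetical tree and validation set.
tree = Node("colour", {
    "red": Node("size", {"small": Node(label="poisonous"),
                         "big": Node(label="edible")}, majority="poisonous"),
    "green": Node(label="edible"),
}, majority="edible")
validation = [
    ({"colour": "red", "size": "small"}, "poisonous"),
    ({"colour": "red", "size": "big"}, "poisonous"),
    ({"colour": "green", "size": "small"}, "edible"),
]
pruned = reduced_error_prune(tree, validation)
```

In this example the split on size is replaced by a leaf because doing so lowers the validation error, while the split on colour is kept.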

25 Exercise
Fred wants to buy a VW Beetle and classifies all offers into the classes interesting and uninteresting. Help Fred by creating a decision tree using the ID3 algorithm.

Colour | Year of Construction | Mileage | Class
red | 1975 | > 200 000 km | interesting
blue | 1980 | > 200 000 km | uninteresting
green | 1975 | < 200 000 km | interesting
red | 1975 | > 200 000 km | interesting
green | 1970 | < 200 000 km | uninteresting
blue | 1975 | > 200 000 km | uninteresting
yellow | 1970 | < 200 000 km | interesting

26 Outline (same as slide 2)

28 Reinforcement Learning: The Idea
- A way of programming agents by reward and punishment without specifying how the task is to be achieved

29 Learning to Balance on a Bicycle
- States:
  - Angle of handle bars
  - Angular velocity of handle bars
  - Angle of bicycle to vertical
  - Angular velocity of bicycle to vertical
  - Acceleration of angle of bicycle to vertical

30 Learning to Balance on a Bicycle
- Actions:
  - Torque to be applied to the handle bars
  - Displacement of the center of mass from the bicycle's plane (in cm)

31 Reward (flow chart)
- Is the angle of the bicycle to vertical greater than 12°?
  - no: Reward = 0
  - yes: Reward = -1
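As a minimal sketch, the reward rule of the flow chart can be written as a function. Taking the absolute value of the angle (so that tilting to either side counts) and the parameter names are assumptions that the slide does not state.

```python
def reward(angle_to_vertical_deg, limit_deg=12.0):
    """Return -1 once the bicycle tilts more than limit_deg, otherwise 0."""
    # Assumption: a tilt to either side is treated the same way.
    return -1 if abs(angle_to_vertical_deg) > limit_deg else 0


print(reward(5.0), reward(13.0))
```

The agent is never told how to balance; it only receives this punishment signal when the bicycle tilts too far.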

32 Reinforcement Learning: Applications
- Board Games
  - The TD-Gammon program, based on reinforcement learning, has become a world-class backgammon player
- Control a Mobile Robot
- Learning to Drive a Bicycle
- Navigation
- Pole-balancing
- Acrobot
- Robot Soccer
- Learning to Control Sequential Processes
  - Elevator Dispatching

33 Deterministic Markov Decision Process

34 Value of Policy and Agent's Task
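The content of this slide is not preserved in the transcript. Assuming it uses the common discounted cumulative reward V = r_t + γ·r_{t+1} + γ²·r_{t+2} + …, the value of following a policy from some state can be computed from the reward sequence it produces. The reward sequence and the discount factor below are hypothetical.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of gamma**i * rewards[i], accumulated from the last reward backwards."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical episode: two steps without reward, then a reward of 100.
value = discounted_return([0, 0, 100], gamma=0.9)
print(value)
```

Under this definition the agent's task is to find a policy that maximises this value for every state.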

35 Nondeterministic Markov Decision Process
(Figure: transitions labelled with probabilities P = 0.8 and P = 0.1)

36 Methods
Model | Discrete states | Continuous states
Model (reward function and transition probabilities) is known | Dynamic Programming | Value Function Approximation + Dynamic Programming
Model (reward function or transition probabilities) is unknown | Reinforcement Learning | Value Function Approximation + Reinforcement Learning

37 Q-learning Algorithm

38 Q-learning Algorithm
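The algorithm text on these two slides is not preserved. The sketch below shows standard tabular Q-learning with the update Q(s,a) ← Q(s,a) + α·[r + γ·max Q(s',a') − Q(s,a)]. With α = 1 this reduces to the rule commonly used for deterministic environments. The corridor environment, the ε-greedy exploration, the cyclic choice of start states and all parameter values are assumptions for illustration.

```python
import random


def q_learning(n_states, actions, step, is_terminal,
               episodes=200, gamma=0.9, alpha=1.0, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    for episode in range(episodes):
        s = episode % n_states          # cycle through all start states
        while not is_terminal(s):
            if rng.random() < epsilon:  # explore
                a = rng.choice(actions)
            else:                       # exploit, breaking ties randomly
                a = max(actions, key=lambda act: (q[(s, act)], rng.random()))
            s_next, r = step(s, a)
            best_next = max(q[(s_next, b)] for b in actions)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s_next
    return q


# Hypothetical deterministic corridor: states 0..3, state 3 is the goal.
GOAL = 3
ACTIONS = ["left", "right"]


def step(s, a):
    s_next = min(GOAL, s + 1) if a == "right" else max(0, s - 1)
    return s_next, (100 if s_next == GOAL else 0)


q = q_learning(GOAL + 1, ACTIONS, step, lambda s: s == GOAL)
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
print(policy)
```

The learned table needs no model of the environment: transitions and rewards are only observed through `step`.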

39 Example

40 Example: Q-table Initialization

41 Example: Episode 1

42 Example: Episode 1

43 Example: Episode 1

44 Example: Episode 1

45 Example: Episode 1

46 Example: Q-table

47 Example: Episode 1

48 Episode 1

49 Example: Q-table

50 Example: Episode 2

51 Example: Episode 2

52 Example: Episode 2

53 Example: Q-table after Convergence

54 Example: Value Function after Convergence

55 Example: Optimal Policy

56 Example: Optimal Policy

57 Q-learning

58 Convergence of Q-learning

59 Blackjack
- Standard rules of blackjack hold
- State space:
  - element[0]: current value of player's hand (4-21)
  - element[1]: value of dealer's face-up card (2-11)
  - element[2]: player does not have usable ace (0/1)
- Starting states:
  - player has any 2 cards (uniformly distributed), dealer has any 1 card (uniformly distributed)
- Actions:
  - HIT
  - STICK
- Rewards:
  - -1 for a loss
  - 0 for a draw
  - 1 for a win

60 Blackjack: Optimal Policy

61 Exercise:

62 Exercise:

63 Problems
- Multiagent Systems
  - Cooperative Agents
  - Competitive Agents
- Continuous Domains
- Partially observable MDP (POMDP)

