
KU NLP Machine Learning
Slide 1. Ch 9. Machine Learning: Symbol-based
- 9.0 Introduction
- 9.1 A Framework for Symbol-Based Learning
- 9.2 Version Space Search
  - The Candidate Elimination Algorithm
- 9.3 ID3 Decision Tree Induction Algorithm
- 9.5 Knowledge and Learning
  - Explanation-Based Learning
- 9.6 Unsupervised Learning
  - Conceptual clustering

Slide 2. 9.0 Introduction
- Learning
  - intelligent agents change through the course of their interactions with the world
  - and through the experience of their own internal states and processes
  - is important for practical applications of AI
- Knowledge engineering bottleneck
  - major obstacle to the widespread use of intelligent systems
  - the cost and difficulty of building expert systems using traditional knowledge acquisition techniques
  - one solution
    - for the program to begin with a minimal amount of knowledge
    - and learn from examples, high-level advice, and its own explorations of the domain

Slide 3. 9.0 Introduction
- Definition of learning
  - Any change in a system that allows it to perform better the second time on repetition of the same task or on another task drawn from the same population (Simon, 1983)
- Views of learning
  - Generalization from experience
    - Induction: must generalize correctly to unseen instances of the domain
    - Inductive biases: selection criteria (the learner must select the most effective aspects of its experience)
  - Changes in the learner
    - Acquisition of explicitly represented domain knowledge: based on its experience, the learner constructs or modifies expressions in a formal language (e.g. logic)

Slide 4. 9.0 Introduction
- Learning algorithms vary in
  - goals, available training data, learning strategies, and knowledge representation languages
- All algorithms learn by searching through a space of possible concepts to find an acceptable generalization (concept space, Fig. 9.5)
- Inductive learning
  - learning a generalization from a set of examples
  - concept learning is a typical inductive learning task
    - infer a definition from given examples of some concept (e.g. cat, soybean disease)
    - the definition allows the learner to correctly recognize future instances of that concept
    - two algorithms: version space search and ID3

Slide 5. 9.0 Introduction
- Similarity-based vs. explanation-based
  - Similarity-based (data-driven)
    - uses no prior knowledge of the domain
    - relies on large numbers of examples
    - generalizes on the basis of patterns in the training data
  - Explanation-based learning (prior-knowledge-driven)
    - uses prior knowledge of the domain to guide generalization
    - learning by analogy and other techniques that use prior knowledge to learn from a limited amount of training data

Slide 6. 9.0 Introduction
- Supervised vs. unsupervised
  - supervised learning
    - learning from training instances of known classification
  - unsupervised learning
    - learning from unclassified training data
    - conceptual clustering or category formation

Slide 7. 9.1 Framework for Symbol-based Learning
- Learning algorithms are characterized by a general model (Fig. 9.1, p 354, sp 8)
  - Data and goals of the learning task
  - Representation language
  - A set of operations
  - Concept space
  - Heuristic search
  - Acquired knowledge

Slide 8. A general model of the learning process (Fig. 9.1)

Slide 9. 9.1 Framework for Symbol-based Learning
- Data and goals
  - Type of data
    - positive or negative examples
    - a single positive example and domain-specific knowledge
    - high-level advice (e.g. condition of loop termination)
    - analogies (e.g. electricity vs. water)
  - Goal of learning algorithms: acquisition of
    - a concept, a general description of a class of objects
    - plans
    - problem-solving heuristics
    - other forms of procedural knowledge
  - Properties and quality of data
    - come from the outside environment (e.g. a teacher) or are generated by the program itself
    - reliable or containing noise
    - well-structured or unorganized
    - positive and negative, or only positive

Slide 10. 9.1 Framework for Symbol-based Learning

Slide 11. 9.1 Framework for Symbol-based Learning
- Representation of learned knowledge
  - concept expressions in predicate calculus
    - a simple formulation of the concept learning problem uses conjunctive sentences containing variables
  - structured representations such as frames
  - descriptions of plans as a sequence of operations or a triangle table
  - representation of heuristics as problem-solving rules
- Example: two instances and their generalization
    size(obj1, small) ^ color(obj1, red) ^ shape(obj1, round)
    size(obj2, large) ^ color(obj2, red) ^ shape(obj2, round)
    => size(X, Y) ^ color(X, red) ^ shape(X, round)
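The generalization in the example above can be sketched in Python. This is only an illustration, assuming each instance is encoded as a dict from property name to value and that `None` stands for a variable such as Y. The function name `generalize_pair` is hypothetical and does not come from the slides.

```python
def generalize_pair(a, b):
    """Keep the properties on which both instances agree.

    Where the two instances differ, the constant is replaced by a
    variable, encoded here as None (an assumption of this sketch).
    """
    return {prop: (a[prop] if a[prop] == b[prop] else None) for prop in a}


obj1 = {"size": "small", "color": "red", "shape": "round"}
obj2 = {"size": "large", "color": "red", "shape": "round"}

# size differs, so it becomes a variable; color and shape are kept.
concept = generalize_pair(obj1, obj2)
print(concept)
```

The sketch assumes both instances describe the same set of properties.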

Slide 12. 9.1 Framework for Symbol-based Learning
- A set of operations
  - Given a set of training instances, the learner must construct a generalization, heuristic rule, or plan that satisfies its goal
  - Requires the ability to manipulate representations
  - Typical operations include
    - generalizing or specializing symbolic expressions
    - adjusting the weights in a neural network
    - modifying the program’s representations
- Concept space
  - defines a space of potential concept definitions
  - the complexity of the potential concept space is a measure of the difficulty of the learning problem

Slide 13. 9.1 Framework for Symbol-based Learning
- Heuristic search
  - Uses available training data and heuristics to search efficiently
  - Patrick Winston’s work on learning concepts from positive and negative examples, along with near misses (Fig. 9.2)
  - The program learns by refining a candidate description of the target concept through generalization and specialization
    - Generalization changes the candidate description to let it accommodate new positive examples (Fig. 9.3)
    - Specialization changes the candidate description to exclude near misses (Fig. 9.4)
    - The performance of the learning algorithm is highly sensitive to the quality and order of the training examples

Slide 14. Examples and Near Misses for the concept “Arch” (Fig. 9.2)

Slide 15. Generalization of descriptions (Fig. 9.3)

Slide 16. Generalization of descriptions (Fig. 9.3, continued)

Slide 17. Specialization of description (Fig. 9.4)

Slide 18. 9.2 Version Space Search
- Implements inductive learning as search through a concept space
- Generalization operations impose an ordering on the concepts in a space, and version space search uses this ordering to guide the search
- 9.2.1 Generalization Operators and Concept Space
- 9.2.2 Candidate Elimination Algorithm

Slide 19. 9.2.1 Generalization Operators and the Concept Space
- Primary generalization operations used in ML
  - Replacing constants with variables
    - color(ball, red) -> color(X, red)
  - Dropping conditions from a conjunctive expression
    - shape(X, round) ^ size(X, small) ^ color(X, red) -> shape(X, round) ^ color(X, red)
  - Adding a disjunct to an expression
    - shape(X, round) ^ size(X, small) ^ color(X, red) -> shape(X, round) ^ size(X, small) ^ (color(X, red) ∨ color(X, blue))
  - Replacing a property with its parent in a class hierarchy
    - color(X, red) -> color(X, primary_color), if primary_color is a superclass of red
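The four operators above can be sketched as small Python functions. This is an illustrative encoding rather than the textbook's: a literal such as color(X, red) is assumed to be a `(predicate, subject, value)` tuple, a conjunction is a list of literals, and a disjunction of values is a `frozenset`. All function names and the `PARENT` table are hypothetical. The slide gives only the red/primary_color link, so the blue entry is an added assumption.

```python
# Assumed class hierarchy; only the "red" entry comes from the slide.
PARENT = {"red": "primary_color", "blue": "primary_color"}


def replace_constant(literal, constant, variable):
    """color(ball, red) -> color(X, red): swap a constant argument for a variable."""
    predicate, *args = literal
    return (predicate, *[variable if a == constant else a for a in args])


def drop_condition(conjunction, predicate):
    """Remove every literal with the given predicate from a conjunction."""
    return [lit for lit in conjunction if lit[0] != predicate]


def add_disjunct(literal, alternative):
    """color(X, red) -> color(X, red) v color(X, blue), stored as a set of values."""
    predicate, subject, value = literal
    values = value if isinstance(value, frozenset) else frozenset([value])
    return (predicate, subject, values | {alternative})


def climb_hierarchy(literal, parent=PARENT):
    """Replace a value by its parent class, if the hierarchy lists one."""
    predicate, subject, value = literal
    return (predicate, subject, parent.get(value, value))


expr = [("shape", "X", "round"), ("size", "X", "small"), ("color", "X", "red")]
print(replace_constant(("color", "ball", "red"), "ball", "X"))
print(drop_condition(expr, "size"))
print(add_disjunct(("color", "X", "red"), "blue"))
print(climb_hierarchy(("color", "X", "red")))
```

Each function returns a new expression that covers everything the input covered, which is what makes it a generalization operator.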

Slide 20. 9.2.1 Generalization Operators and the Concept Space
- Notion of covering
  - If concept P is more general than concept Q, we say that “P covers Q”
  - color(X, Y) covers color(ball, Y), which in turn covers color(ball, red)
- Concept space
  - Defines a space of potential concept definitions
  - The example concept space representing the predicate obj(Sizes, Colors, Shapes), with properties and values
    - Sizes = {large, small}
    - Colors = {red, white, blue}
    - Shapes = {ball, brick, cube}
    is presented in Figure 9.5 (p 362, sp 21)
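Covering and the example concept space can be sketched as follows. The encoding is an assumption of this sketch: a concept obj(Size, Color, Shape) is a 3-tuple, and the string `"?"` stands for an unbound variable. The names `covers`, `VAR`, and `concept_space` are hypothetical.

```python
from itertools import product

SIZES = ["large", "small"]
COLORS = ["red", "white", "blue"]
SHAPES = ["ball", "brick", "cube"]
VAR = "?"  # stands for a variable such as X, Y, or Z


def covers(p, q):
    """True if concept p is at least as general as concept q."""
    return all(a == VAR or a == b for a, b in zip(p, q))


# Every slot holds either one of its constants or a variable.
concept_space = list(product(SIZES + [VAR], COLORS + [VAR], SHAPES + [VAR]))

print(len(concept_space))
print(covers((VAR, "red", "ball"), ("small", "red", "ball")))
print(covers(("small", "red", "ball"), (VAR, "red", "ball")))
```

Under this encoding `covers` is a partial order on the concept space, with `("?", "?", "?")` as its most general element.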

Slide 21. A Concept Space (Fig. 9.5)

Slide 22. 9.2.2 The candidate elimination algorithm
- Version space: the set of all concept descriptions consistent with the training examples
- The algorithm works toward reducing the size of the version space as more examples become available (Fig. 9.10)
  - specific-to-general search from positive examples
  - general-to-specific search from negative examples
  - the candidate elimination algorithm combines these into a bidirectional search
- Generalizes based on regularities found in the training data
- Supervised learning
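The definition of a version space can be checked by brute force on the small obj(Size, Color, Shape) space. The sketch below is not the candidate elimination algorithm itself. It enumerates every concept and keeps the consistent ones, to show how the version space shrinks as examples arrive. The tuple encoding with `"?"` as a variable and the three training instances are assumptions made for illustration.

```python
from itertools import product

VAR = "?"
DOMAINS = [["large", "small"], ["red", "white", "blue"], ["ball", "brick", "cube"]]


def covers(concept, instance):
    """True if the concept matches the instance in every slot."""
    return all(c == VAR or c == v for c, v in zip(concept, instance))


def consistent(concept, examples):
    """Consistent means: covers every positive and no negative example."""
    return all(covers(concept, x) == label for x, label in examples)


def version_space(examples):
    """All concepts in the space that are consistent with the examples."""
    space = product(*[values + [VAR] for values in DOMAINS])
    return [c for c in space if consistent(c, examples)]


examples = [
    (("small", "red", "ball"), True),    # positive
    (("small", "blue", "ball"), False),  # negative
    (("large", "red", "ball"), True),    # positive
]

# Size of the version space after 1, 2, and 3 examples.
sizes = [len(version_space(examples[:k])) for k in range(1, len(examples) + 1)]
print(sizes)
print(version_space(examples))
```

Exhaustive enumeration is only feasible because this space is tiny. Candidate elimination avoids it by tracking just the boundary sets S and G.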

Slide 23. 9.2.2 The candidate elimination algorithm
- The learned concept must be general enough to cover all positive examples and specific enough to exclude all negative examples
  - maximally specific generalization
  - maximally general specialization
- A concept c is maximally specific if it covers all positive examples, none of the negative examples, and for any other concept c’ that covers the positive examples, c ≤ c’
- A concept c is maximally general if it covers none of the negative training instances, and for any other concept c’ that covers no negative training instance, c ≥ c’
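One way to make these definitions concrete is to compute the most specific and most general consistent concepts by exhaustive search. The sketch below rests on several assumptions: concepts are tuples with `"?"` as a variable, both boundaries are taken among concepts consistent with all examples, and "maximally" is read as minimal or maximal under the covering order. The name `boundaries` and the two training instances are hypothetical.

```python
from itertools import product

VAR = "?"
DOMAINS = [["large", "small"], ["red", "white", "blue"], ["ball", "brick", "cube"]]


def covers(p, q):
    """p covers q, i.e. q <= p in the generality ordering."""
    return all(a == VAR or a == b for a, b in zip(p, q))


def boundaries(positives, negatives):
    """Return (S, G): the most specific and most general consistent concepts."""
    space = product(*[values + [VAR] for values in DOMAINS])
    candidates = [c for c in space
                  if all(covers(c, p) for p in positives)
                  and not any(covers(c, n) for n in negatives)]
    # Maximally specific: covers no other candidate.
    S = [c for c in candidates
         if not any(d != c and covers(c, d) for d in candidates)]
    # Maximally general: covered by no other candidate.
    G = [c for c in candidates
         if not any(d != c and covers(d, c) for d in candidates)]
    return S, G


S, G = boundaries(positives=[("small", "red", "ball")],
                  negatives=[("small", "blue", "ball")])
print(S)
print(G)
```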

Slide 24. Specific to General Search

Slide 25. Specific to General Search (Fig. 9.7)

Slide 26. General to Specific Search

Slide 27. General to Specific Search (Fig. 9.8)

Slide 28. 9.2.2 The candidate elimination algorithm

Slide 29. 9.2.2 The candidate elimination algorithm

Begin
  Initialize G to the most general concept in the space;
  Initialize S to the first positive training instance;
  For each new positive instance p
    Begin
      Delete all members of G that fail to match p;
      For every s in S, if s does not match p, replace s with its most specific generalizations that match p and are more specific than some member of G;
      Delete from S any hypothesis more general than some other hypothesis in S;
    End;
  For each new negative instance n
    Begin
      Delete all members of S that match n;
      For each g in G that matches n, replace g with its most general specializations that do not match n and are more general than some member of S;
      Delete from G any hypothesis more specific than some other hypothesis in G;
    End;
End
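The pseudocode above can be sketched in Python for the simple conjunctive obj(Size, Color, Shape) language. This is a sketch under stated assumptions, not a general implementation. Concepts are tuples with `"?"` as a variable, so each s has exactly one most specific generalization. The helper names `min_generalization` and `min_specializations` are hypothetical, and the four training instances are chosen for illustration.

```python
VAR = "?"


def covers(p, q):
    """p matches or covers q (q may be an instance or a concept)."""
    return all(a == VAR or a == b for a, b in zip(p, q))


def min_generalization(s, p):
    """Most specific generalization of s that matches instance p."""
    return tuple(a if a == b else VAR for a, b in zip(s, p))


def min_specializations(g, n, domains):
    """Most general specializations of g that do not match instance n."""
    result = []
    for i, a in enumerate(g):
        if a == VAR:
            # Bind one variable to any constant that differs from n's value.
            result += [g[:i] + (v,) + g[i + 1:] for v in domains[i] if v != n[i]]
    return result


def candidate_elimination(examples, domains):
    G = [tuple(VAR for _ in domains)]  # most general concept
    S = None                           # set on the first positive instance
    for instance, positive in examples:
        if positive:
            G = [g for g in G if covers(g, instance)]
            if S is None:
                S = [instance]
            else:
                S = [s if covers(s, instance) else min_generalization(s, instance)
                     for s in S]
                S = [s for s in S if any(covers(g, s) for g in G)]
            S = list(dict.fromkeys(S))  # drop duplicates
            S = [s for s in S if not any(t != s and covers(s, t) for t in S)]
        else:
            if S is not None:
                S = [s for s in S if not covers(s, instance)]
            new_G = []
            for g in G:
                if not covers(g, instance):
                    new_G.append(g)
                else:
                    new_G += [h for h in min_specializations(g, instance, domains)
                              if S is None or any(covers(h, s) for s in S)]
            new_G = list(dict.fromkeys(new_G))
            G = [g for g in new_G if not any(h != g and covers(h, g) for h in new_G)]
    return S, G


domains = [["large", "small"], ["red", "white", "blue"], ["ball", "brick", "cube"]]
examples = [
    (("small", "red", "ball"), True),
    (("small", "blue", "ball"), False),
    (("large", "red", "ball"), True),
    (("large", "red", "cube"), False),
]
S, G = candidate_elimination(examples, domains)
print(S, G)
```

On these four instances both boundary sets end up holding the same single concept, which is the convergence case. The sketch does not handle richer languages in which a concept can have several minimal generalizations.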

Slide 30. 9.2.2 The candidate elimination algorithm (Fig. 9.9)

Slide 31. 9.2.2 The candidate elimination algorithm
- Combining the two directions of search into a single algorithm has several benefits
  - The G and S sets summarize the information in the negative and positive training instances
- Fig. 9.10 gives an abstract description of the candidate elimination algorithm
  - “+” signs represent positive instances
  - “-” signs indicate negative instances
  - The search “shrinks” the outermost concept to exclude negative instances
  - The search “expands” the innermost concept to include new positive instances

Slide 32. 9.2.2 The candidate elimination algorithm

Slide 33. 9.2.2 The candidate elimination algorithm
- Incremental nature of the learning algorithm
  - Accepts training instances one at a time, forming a usable, although possibly incomplete, generalization after each example (unlike batch algorithms such as ID3)
- Even before the algorithm converges on a single concept, the G and S sets provide usable constraints on that concept
  - If c is the goal concept, then for all g ∈ G and s ∈ S, s ≤ c ≤ g
  - Any concept that is more general than some concept in G will cover negative instances; any concept that is more specific than some concept in S will fail to cover some positive instances
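One way the constraint s ≤ c ≤ g can be put to use before convergence is to classify new instances with partially learned boundaries. The sketch below is an illustration, not part of the slides. It assumes non-empty S and G sets, concepts encoded as tuples with `"?"` as a variable, and a hypothetical function `classify`. It deliberately uses the cautious test: every s must cover the instance, or no g may cover it.

```python
VAR = "?"


def covers(p, q):
    """True if concept p matches instance q in every slot."""
    return all(a == VAR or a == b for a, b in zip(p, q))


def classify(instance, S, G):
    """Label an instance using only the boundary sets (assumed non-empty)."""
    if all(covers(s, instance) for s in S):
        return "positive"  # covered by every s, and s <= c
    if not any(covers(g, instance) for g in G):
        return "negative"  # covered by no g, and c <= g
    return "unknown"       # falls between the two boundaries


# Boundaries after one positive (small red ball) and one negative (small blue ball).
S = [("small", "red", "ball")]
G = [("?", "red", "?")]

print(classify(("small", "red", "ball"), S, G))
print(classify(("large", "blue", "cube"), S, G))
print(classify(("large", "red", "cube"), S, G))
```

Instances labeled "unknown" are the ones that further training examples would need to resolve.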

Slide 34. 9.2.4 Evaluating Candidate Elimination
- Problems
  - Combinatorics of the problem space: excessive growth of the search space
    - Useful to develop heuristics for pruning states from G and S (beam search)
    - Uses an inductive bias to reduce the size of the concept space
    - Trade-off between expressiveness and efficiency
  - The algorithm may fail to converge because of noise or inconsistency in the training data
    - One solution to this problem is to maintain multiple G and S sets
- Contribution
  - Explication of the relationship between knowledge representation, generalization, and search in inductive learning
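The effect of inconsistent training data can be seen on the small obj(Size, Color, Shape) space. In the sketch below the same instance is labeled both positive and negative, which is an assumed form of noise chosen for illustration. No concept is then consistent with the data, so the version space is empty. The tuple encoding and all names are hypothetical.

```python
from itertools import product

VAR = "?"
DOMAINS = [["large", "small"], ["red", "white", "blue"], ["ball", "brick", "cube"]]


def covers(concept, instance):
    """True if the concept matches the instance in every slot."""
    return all(c == VAR or c == v for c, v in zip(concept, instance))


def version_space(examples):
    """All concepts that cover every positive and no negative example."""
    space = product(*[values + [VAR] for values in DOMAINS])
    return [c for c in space
            if all(covers(c, x) == label for x, label in examples)]


clean = [(("small", "red", "ball"), True), (("small", "blue", "ball"), False)]
# Noise: the first instance is also reported as a negative example.
noisy = clean + [(("small", "red", "ball"), False)]

print(len(version_space(clean)))
print(len(version_space(noisy)))
```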

