Learning.


1 Learning

2 What is learning?

3 Supervised Learning
Training data that has already been classified (labeled examples)
Examples of supervised methods:
- Concept learning
- Decision trees
- Markov models
- Nearest neighbor
- Neural nets (in coming weeks)
Inductive bias: the limits imposed by our assumptions, especially which factors we choose as inputs

4 Rote Learning
Store the training data verbatim
Limitation: does not extend beyond what has been seen (see the sketch below)
Contrast: concept learning (next), which generalizes
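
To make the limitation concrete, here is a minimal Python sketch of a rote learner; the class and method names are my own, not from the slides:

```python
class RoteLearner:
    """Memorizes training pairs verbatim; it cannot generalize."""

    def __init__(self):
        self.memory = {}

    def train(self, examples):
        # examples: iterable of (input, label) pairs
        for x, label in examples:
            self.memory[x] = label

    def predict(self, x):
        # Returns the stored label, or None for anything unseen --
        # the "does not extend beyond what has been seen" limitation.
        return self.memory.get(x)

learner = RoteLearner()
learner.train([(("red", "round"), True), (("green", "long"), False)])
print(learner.predict(("red", "round")))   # True
print(learner.predict(("red", "long")))    # None: never stored
```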

5 Concept Learning
Inductive learning with generalization
Given training data: attribute tuples <a1, a2, a3, ...>, each labeled with a Boolean value
In a hypothesis, each ai can be a specific value
'?' is a don't-care that always matches (positive)
'null' is a don't-care that never matches (negative)

6 A hypothesis is a tuple; it labels an example true when every position matches
hg = <?, ?, ...> - the most general hypothesis: always true
hs = <null, null, ...> - the most specific hypothesis: always false
hg >= hs: the generality ordering defines a partially ordered lattice (sketch below)
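
A small Python sketch of this representation, assuming tuples with '?' for the always-matching don't-care and Python's None standing in for null:

```python
def matches(hypothesis, example):
    """True when the hypothesis labels the example positive."""
    for h, x in zip(hypothesis, example):
        if h is None or (h != '?' and h != x):
            return False          # null never matches; values must agree
    return True

def more_general_or_equal(h1, h2):
    """The lattice order: h1 >= h2 if h1 accepts everything h2 accepts."""
    for a, b in zip(h1, h2):
        if a == '?':
            continue              # '?' is above any constraint
        if b is None:
            continue              # null is below any constraint
        if a is None or a != b:
            return False
    return True

print(matches(('red', '?'), ('red', 'round')))            # True
print(more_general_or_equal(('?', '?'), ('red', '?')))    # True
print(more_general_or_equal(('red', '?'), ('?', '?')))    # False
```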

7 Training Method
Use the lattice to generate the most general hypothesis consistent with the training data (sketch below)
Weaknesses:
- Inconsistent data
- Data errors (noise)
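
One concrete way to train over this lattice is to generalize upward from the most specific hypothesis on each positive example, Find-S style; this is a hedged sketch of that idea, not necessarily the exact procedure the slides intend:

```python
def generalize(hypothesis, example):
    """Minimally generalize a hypothesis so it covers a positive example."""
    new = []
    for h, x in zip(hypothesis, example):
        if h is None:
            new.append(x)         # first positive example fills in values
        elif h == x or h == '?':
            new.append(h)         # already covered
        else:
            new.append('?')       # conflicting values -> don't care
    return tuple(new)

def train(positive_examples, n_attributes):
    h = (None,) * n_attributes    # start at hs, the most specific hypothesis
    for x in positive_examples:
        h = generalize(h, x)      # climb the lattice
    return h

print(train([('red', 'round'), ('red', 'long')], 2))   # ('red', '?')
# Weakness: a single mislabeled positive over-generalizes h,
# and this procedure has no way to recover from that error.
```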


9 Decision Trees
ID3 algorithm
Entropy: a measure of information
-p(I) log2 p(I) is the entropy contribution of one class I
Entropy of the whole set: Entropy(S) = -Σ p(I) log2 p(I)
p(I) = (instances of class I) / (total instances)
Computed over the output labels of the tree (a sketch follows)
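
A minimal Python sketch of this entropy computation over a list of output labels; the helper name is mine:

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Entropy(S) = -sum over classes I of p(I) * log2 p(I)."""
    counts = Counter(labels)
    if len(counts) <= 1:
        return 0.0                # a pure set carries no information
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())

print(entropy(['yes', 'yes', 'no', 'no']))    # 1.0 bit: evenly split
print(entropy(['yes', 'yes', 'yes', 'yes']))  # 0.0: pure set
```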

10 Gain(S, A) = Entropy(S) - Σ over v in Values(A) of (|Sv| / |S|) * Entropy(Sv)
Gain is a measure of the effectiveness of an attribute
Sv is the subset of S for which attribute A has value v
|S| is the number of examples in S
(a sketch follows)
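
A sketch of the gain computation in Python, reusing the entropy() helper from the previous sketch; representing examples as tuples indexed by attribute position is my choice, not from the slides:

```python
def information_gain(examples, labels, attribute_index):
    """Gain(S, A) = Entropy(S) - sum_v (|Sv| / |S|) * Entropy(Sv).
    Assumes the entropy() helper defined in the previous sketch."""
    total = len(labels)
    partitions = {}               # value v -> labels of the subset Sv
    for x, label in zip(examples, labels):
        partitions.setdefault(x[attribute_index], []).append(label)
    remainder = sum((len(sub) / total) * entropy(sub)
                    for sub in partitions.values())
    return entropy(labels) - remainder

examples = [('sunny', 'hot'), ('sunny', 'cool'),
            ('rainy', 'hot'), ('rainy', 'cool')]
labels = ['no', 'yes', 'yes', 'yes']
print(information_gain(examples, labels, 0))   # ~0.31
```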

11 ID3
Greedy algorithm: select the attribute with the largest gain
Split on that attribute and iterate until done (sketch below)
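
A compact ID3 sketch building on entropy() and information_gain() above; the dict-based tree representation is my own choice for illustration:

```python
from collections import Counter

def id3(examples, labels, attributes):
    """Greedy ID3: pick the highest-gain attribute, split, recurse."""
    if len(set(labels)) == 1:
        return labels[0]                             # pure node: stop
    if not attributes:
        return Counter(labels).most_common(1)[0][0]  # majority label
    best = max(attributes,
               key=lambda a: information_gain(examples, labels, a))
    groups = {}                                      # value -> subset of S
    for x, label in zip(examples, labels):
        groups.setdefault(x[best], []).append((x, label))
    remaining = [a for a in attributes if a != best]
    tree = {'attribute': best, 'branches': {}}
    for value, pairs in groups.items():
        xs, ys = zip(*pairs)
        tree['branches'][value] = id3(list(xs), list(ys), remaining)
    return tree

print(id3(examples, labels, [0, 1]))   # reuses the data defined above
```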

12 Markov Models
A Markov chain is a set of states
State transitions are probabilistic: state xi goes to state xj with probability P(xj | xi)
This can be extended so the probability depends on a set of past states (memory), i.e. a higher-order chain
(a sampling sketch follows)
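
A minimal sketch of one transition step in Python; the weather states and probabilities are hypothetical, chosen only to illustrate sampling from P(xj | xi):

```python
import random

# Hypothetical transition table: transitions[xi][xj] = P(xj | xi).
transitions = {
    'sunny': {'sunny': 0.8, 'rainy': 0.2},
    'rainy': {'sunny': 0.4, 'rainy': 0.6},
}

def step(state):
    """Sample the next state with probability P(xj | xi)."""
    row = transitions[state]
    return random.choices(list(row), weights=list(row.values()))[0]

state = 'sunny'
for _ in range(5):
    state = step(state)
    print(state)                  # a 5-step random walk through the chain
```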

13 Example from the Text
Given a set of words, build a Markov chain that generates similar words
For each letter position in the words, compute the transition probability
Use a matrix of counts: Count[from][to]
Normalize each row by the total count in that row (sketch below)
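
Below is one reading of this example in Python, using a single count matrix over adjacent letters; the '^' start and '$' end markers are my addition, not from the slides:

```python
import random
from collections import defaultdict

def train_counts(words):
    """Build Count[from][to] over adjacent letters."""
    counts = defaultdict(lambda: defaultdict(int))
    for word in words:
        letters = ['^'] + list(word) + ['$']   # mark word start and end
        for a, b in zip(letters, letters[1:]):
            counts[a][b] += 1
    return counts

def generate(counts, max_len=12):
    """Walk the chain, sampling each next letter from its normalized row."""
    letter, out = '^', []
    while len(out) < max_len:
        row = counts[letter]
        total = sum(row.values())              # row normalizer
        probs = [c / total for c in row.values()]
        letter = random.choices(list(row), weights=probs)[0]
        if letter == '$':
            break
        out.append(letter)
    return ''.join(out)

counts = train_counts(['cat', 'car', 'cart', 'bat'])
print(generate(counts))   # e.g. 'cat' or 'bart': similar but possibly novel
```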

14 Nearest Neighbor
1NN: represent entities as vectors
Use a distance measure between vectors to locate the closest known entity
Can be affected by noisy data (sketch below)
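
A minimal 1NN sketch with Euclidean distance; the toy vectors and labels are invented for illustration:

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def nearest_neighbor(query, data):
    """1NN: return the label of the single closest stored vector.
    One mislabeled (noisy) neighbor is enough to change the answer."""
    return min(data, key=lambda item: euclidean(query, item[0]))[1]

data = [((1.0, 1.0), 'a'), ((5.0, 5.0), 'b'), ((1.2, 0.9), 'a')]
print(nearest_neighbor((1.1, 1.0), data))   # 'a'
```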

15 kNN - better
Use the k closest neighbors and take a majority vote (sketch below)
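
The same setup extended to kNN, reusing euclidean() and data from the 1NN sketch; a simple majority vote smooths over isolated noisy points:

```python
from collections import Counter

def knn(query, data, k=3):
    """kNN: let the k closest neighbors vote; the majority label wins."""
    ranked = sorted(data, key=lambda item: euclidean(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

print(knn((1.1, 1.0), data, k=3))   # 'a'
```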

16 Other Techniques - yet to cover!
- Evolutionary algorithms
- Neural nets

