Computational Learning: An intuitive approach
Human Learning Objects in world –Learning by exploration and who knows? Language –informal training, inputs may be incorrect Programming –A couple of examples of loops or recursions Medicine –See one, do one, teach one People: few complex examples, informal, complex behavioral output
Computational Learning Representation provided Simple inputs: vectors of values Simple outputs: e.g. yes or no, a number, a disease Many examples (thousands to millions) Quantifiable + Useful, e.g. automatic generation of expert systems
Concerns Generalization accuracy –Performance on unseen data –Evaluation Noise and overfitting Biases of representation –You only find what you look for.
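The gap between performance on known data and performance on unseen data can be sketched with a deliberately bad learner. The following is a minimal illustration, not anything from the slides: the data set, the every-fourth-example split, and the "memorizer" are all invented for the example.

```python
def accuracy(predict, examples):
    """Fraction of (input, label) pairs that predict() gets right."""
    return sum(predict(x) == y for x, y in examples) / len(examples)

# Hypothetical data: the label is 1 exactly when x > 50.
data = [(x, int(x > 50)) for x in range(100)]

# Simplified holdout: every 4th example is kept unseen for evaluation.
test = data[::4]
train = [pair for i, pair in enumerate(data) if i % 4 != 0]

# A "memorizer" stores the training pairs and guesses 0 for anything new.
table = dict(train)

def memorizer(x):
    return table.get(x, 0)

train_acc = accuracy(memorizer, train)  # measured on data it has seen
test_acc = accuracy(memorizer, test)    # measured on unseen data

print(f"training accuracy: {train_acc:.2f}")
print(f"unseen-data accuracy: {test_acc:.2f}")
```

The memorizer fits the training examples perfectly yet has learned nothing that carries over, which is why evaluation is done on examples held out from training.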
Three Learning Problems Classification: from known examples create decision procedure to guess class –Patient data -> guess disease Regression: from known examples create decision procedure to guess real numbers –Stock data -> guess price Clustering: putting data into “meaningful” groups –Patient Data -> new diseases
Simple data Attribute-value representation: one set of attribute values = 1 example. Sex, age, smoker, etc. are the attributes. Values are male, 50, true, etc. Only data of this form allowed.
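As a rough sketch of what one attribute-value example looks like in code: the attribute names and values come from the slide, while the numeric encoding and the helper name `to_vector` are assumptions made for illustration.

```python
# One example = one value per attribute.
patient = {"sex": "male", "age": 50, "smoker": True}

def to_vector(example):
    """Encode an attribute-value example as a vector of numbers (assumed encoding)."""
    return [
        1.0 if example["sex"] == "male" else 0.0,  # nominal attribute -> 0/1
        float(example["age"]),                     # numeric attribute kept as-is
        1.0 if example["smoker"] else 0.0,         # boolean attribute -> 0/1
    ]

vector = to_vector(patient)
print(vector)
```

Distance-based methods such as nearest neighbor need a numeric form like this; other learners can work on the nominal values directly.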
1-Nearest Neighbor classification If x is an example, find its nearest neighbor NN in the data using Euclidean distance. Guess that the class of x is the class of NN. K-nearest neighbor: let the k nearest neighbors vote. Called IBk in Weka.
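The nearest-neighbor rule is short enough to write out. This is a minimal sketch in plain Python, not Weka's implementation; the function names and the four training points are made up for the example.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_classify(x, data, k=1):
    """Return the majority class among the k training examples nearest to x.

    data is a list of (vector, class_label) pairs; k=1 gives 1-nearest neighbor.
    """
    neighbors = sorted(data, key=lambda pair: euclidean(x, pair[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical training data.
train = [((3, 1), "+"), ((2, 4), "-"), ((5, 2), "+"), ((1, 6), "-")]

print(knn_classify((4, 1), train, k=1))
print(knn_classify((1, 5), train, k=3))
```

Note that there is no training step: all the work happens at query time, when the stored examples are searched.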
Neural Net A single perceptron can’t learn some simple concepts, like XOR. A multilayered network of perceptrons can represent any boolean function. Learning is not biological but follows from multivariable calculus.
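The XOR limitation can be checked directly. The sketch below trains a single perceptron with the classic error-correction rule on AND and on XOR, then shows a two-layer network whose weights are set by hand (not learned) that computes XOR. The learning rate, epoch count, and hand-picked weights are assumptions chosen for the illustration.

```python
def step(z):
    return 1 if z > 0 else 0

def perceptron(weights, bias, inputs):
    """Threshold unit: fires when the weighted sum plus bias is positive."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train_perceptron(data, epochs=100, lr=1.0):
    """Perceptron learning rule: nudge weights by the error on each example."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in data:
            err = target - perceptron(w, b, x)
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def accuracy(w, b, data):
    return sum(perceptron(w, b, x) == t for x, t in data) / len(data)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

w_and, b_and = train_perceptron(AND)
w_xor, b_xor = train_perceptron(XOR)

def xor_net(x):
    """Two-layer network with hand-set weights: (x1 OR x2) AND NOT (x1 AND x2)."""
    h_or = perceptron([1, 1], -0.5, x)
    h_and = perceptron([1, 1], -1.5, x)
    return perceptron([1, -1], -0.5, [h_or, h_and])

print("AND accuracy:", accuracy(w_and, b_and, AND))
print("XOR accuracy (single perceptron):", accuracy(w_xor, b_xor, XOR))
print("XOR via two layers:", [xor_net(x) for x, _ in XOR])
```

No single line separates the XOR positives from the negatives, so the single unit can never classify all four cases correctly, however long it trains.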
Gedanken experiments Try ML algorithms on imagined data. Ex. Concept: x > y, i.e. data looks like 3,1,+ and 2,4,- etc. Which algorithms do best? And how well? Consider the boundaries. My guesses: SMO > Perceptron > NearestN > DT.
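One way to try this thought experiment without Weka is to generate imagined data for the concept x > y and compare two simple learners. Everything here is an assumption made for the sketch: the value range, the seed, the 150/49 split, and the use of a single-threshold "stump" as a crude depth-1 stand-in for a decision tree. It is not J48 and will not reproduce Weka's numbers.

```python
import random

rng = random.Random(0)  # fixed seed so the imagined data is reproducible

def make_example():
    x, y = rng.uniform(0, 10), rng.uniform(0, 10)
    return (x, y), "+" if x > y else "-"

data = [make_example() for _ in range(199)]
train, test = data[:150], data[150:]

def nearest_neighbor(point):
    """1-NN: return the class of the closest training example."""
    def sq_dist(pair):
        (x, y), _ = pair
        return (x - point[0]) ** 2 + (y - point[1]) ** 2
    return min(train, key=sq_dist)[1]

def fit_stump(examples):
    """Pick the single test 'attribute > threshold' with the best training accuracy."""
    best = None
    for attr in (0, 1):
        for values, _ in examples:
            t = values[attr]
            for pos, neg in (("+", "-"), ("-", "+")):
                correct = sum(
                    (pos if v[attr] > t else neg) == label for v, label in examples
                )
                if best is None or correct > best[0]:
                    best = (correct, attr, t, pos, neg)
    _, attr, t, pos, neg = best
    return lambda point: pos if point[attr] > t else neg

def accuracy(predict, examples):
    return sum(predict(p) == label for p, label in examples) / len(examples)

stump = fit_stump(train)
nn_acc = accuracy(nearest_neighbor, test)
stump_acc = accuracy(stump, test)
print(f"1-NN: {nn_acc:.3f}  stump: {stump_acc:.3f}")
```

The true boundary is the diagonal x = y. A test on one attribute at a time can only cut parallel to an axis, which is one reason to expect tree learners to need many splits here, while a linear separator matches the boundary exactly.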
Check Guesses with Weka 199 examples. DT = 92.9 (called J48 in Weka). NN = 97.5 (nearest neighbor, called IB1 in Weka). SVM = 99.0 (called SMO in Weka).