ITEC323 Lecture 1.


1 ITEC323 Lecture 1

2 Grading Scheme
Final 30%
Midterm 30%
Presentation + report 15%
HW 20%
Class participation 5%

3 What is Machine Learning?
The real question is: what is learning? Using past experiences to improve future performance. For a machine, experiences come in the form of data.
What does it mean to improve performance? Learning is guided by an objective, associated with a particular notion of loss to be minimized (or, equivalently, gain to be maximized).
Why machine learning? We need computers to make informed decisions on new, unseen data. Often it is too difficult to design a set of rules "by hand". Machine learning is about automatically extracting relevant information from data and applying it to analyze new data.

4 What is the Learning Problem?
Learning = improving with experience at some task: improve over task T, with respect to performance measure P, based on experience E.

5 What is Machine Learning?
It is very hard to write programs that solve problems like recognizing a face. We don’t know what program to write because we don’t know how our brain does it. Even if we had a good idea about how to do it, the program might be horrendously complicated. Instead of writing a program by hand, we collect lots of examples that specify the correct output for a given input. A machine learning algorithm then takes these examples and produces a program that does the job. The program produced by the learning algorithm may look very different from a typical hand-written program. It may contain millions of numbers. If we do it right, the program works for new cases as well as the ones we trained it on.

6 Why "Learn"?
Machine learning is programming computers to optimize a performance criterion using example data or past experience. There is no need to "learn" to calculate payroll. Learning is used when:
Human expertise does not exist (navigating on Mars)
Humans are unable to explain their expertise (speech recognition)
The solution changes over time (routing on a computer network)
The solution needs to be adapted to particular cases (user biometrics)

7 What We Talk About When We Talk About "Learning"
Learning general models from data of particular examples.
Data is cheap and abundant (data warehouses, data marts); knowledge is expensive and scarce.
Example in retail, from customer transactions to consumer behavior: people who bought "Da Vinci Code" also bought "The Five People You Meet in Heaven".
Build a model that is a good and useful approximation to the data.

8 A classic example of a task that requires machine learning: It is very hard to say what makes a 2

9 What is Machine Learning?
Summary: machine learning is the study of algorithms that improve their performance at some task with experience, optimizing a performance criterion using example data or past experience.
Role of statistics: inference from a sample.
Role of computer science: efficient algorithms to solve the optimization problem, and to represent and evaluate the model for inference.

10 Growth of Machine Learning
Machine learning is the preferred approach to:
Speech recognition, natural language processing
Computer vision
Medical outcomes analysis
Robot control
Computational biology
This trend is accelerating:
Improved machine learning algorithms
Improved data capture, networking, faster computers
Software too complex to write by hand
New sensors / IO devices
Demand for self-customization to user and environment
It turns out to be difficult to extract knowledge from human experts (hence the failure of expert systems in the 1980s).
(Alpaydın & Eick: ML Topic 1)

11 Some more examples of tasks that are best solved by using a learning algorithm
Recognizing patterns: facial identities or facial expressions; handwritten or spoken words; medical images.
Recognizing anomalies: unusual sequences of credit card transactions; unusual patterns of sensor readings in a nuclear power plant, or an unusual sound in your car engine.
Prediction: future stock prices or currency exchange rates.

12 Some web-based examples of machine learning
The web contains a lot of data. Tasks with very big datasets often use machine learning, especially if the data is noisy or non-stationary.
Spam filtering, fraud detection: the enemy adapts, so we must adapt too.
Recommendation systems: lots of noisy data. Million-dollar prize!
Information retrieval: find documents or images with similar content.
Data visualization: display a huge database in a revealing way.

13 Machine Learning vs Symbolic AI
Knowledge representation works with facts/assertions and develops rules of logical inference. The rules can handle quantifiers. Learning and uncertainty are usually ignored.
Expert systems used logical rules or conditional probabilities provided by "experts" for specific domains.
Graphical models treat uncertainty properly and allow learning (but they often ignore quantifiers and use a fixed set of variables):
Set of logical assertions → values of a subset of the variables, plus local models of the probabilistic interactions between variables.
Logical inference → probability distributions over subsets of the unobserved variables (or individual ones).
Learning = refining the local models of the interactions.

14 Machine Learning vs Statistics
A lot of machine learning is just a rediscovery of things that statisticians already knew. This is often disguised by differences in terminology:
Ridge regression = weight decay
Fitting = learning
Held-out data = test data
But the emphasis is very different.
A good piece of statistics: a clever proof that a relatively simple estimation procedure is asymptotically unbiased.
A good piece of machine learning: a demonstration that a complicated algorithm produces impressive results on a specific task.
Data mining: using very simple machine learning techniques on very large databases, because computers are too slow to do anything more interesting with ten billion examples.

15 Types of learning task
Supervised learning: learn to predict an output when given an input vector. Who provides the correct answer?
Reinforcement learning: learn actions to maximize payoff. There is not much information in a payoff signal, and payoff is often delayed.
Unsupervised learning: create an internal representation of the input, e.g. form clusters or extract features. How do we know if a representation is good? This is the new frontier of machine learning, because most big datasets do not come with labels.

16 Typical ML tasks: Classification
Domain: instances with labels
Emails: spam / not spam
Patients: ill / healthy
Credit card transactions: legitimate / fraud
Input: training data, i.e. examples from the domain, selected randomly.
Output: a predictor that classifies new instances.
Supervised learning. Labels may be binary, discrete, or continuous, with multiple labels or a unique label per instance.
Predictors: linear classifiers, decision trees, neural networks, nearest neighbor.
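To make the classification setup concrete, here is a minimal sketch of one of the predictors listed above, a linear classifier, trained with the perceptron rule. The feature vectors, labels, and learning rate are invented for illustration; they are not from the lecture.

```python
# Minimal perceptron sketch: learn a linear classifier from labelled
# training examples, then classify new instances.

def train_perceptron(data, epochs=20, lr=0.1):
    """data: list of (features, label) pairs with label in {-1, +1}."""
    n = len(data[0][0])
    w = [0.0] * n      # weights
    b = 0.0            # bias
    for _ in range(epochs):
        for x, y in data:
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:      # misclassified: nudge the boundary
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Toy, linearly separable training set (e.g. "not spam" = -1, "spam" = +1).
train = [((0.0, 0.0), -1), ((0.2, 0.3), -1), ((1.0, 1.0), 1), ((0.9, 0.8), 1)]
w, b = train_perceptron(train)
print(predict(w, b, (0.1, 0.1)), predict(w, b, (1.2, 0.9)))
```

The output of the learning algorithm is not a hand-written rule but the numbers w and b, exactly the "program may contain millions of numbers" point from the earlier slide, just in miniature.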

17 Typical ML tasks: Clustering and Control
Clustering (unsupervised learning):
No observed label; often the label is "hidden".
Examples: user modelling, document classification.
Input: training data. Output: a clustering that partitions the space (a somewhat fuzzy goal).
Control (reinforcement learning):
The learner affects the environment.
Examples: robots, driving cars.
Interactive environment: what you see depends on what you do; exploration vs. exploitation.
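The clustering task above can be sketched with a tiny k-means loop: group unlabelled points around k centroids. The one-dimensional data (say, an activity score per user, as in user modelling) and the choice k = 2 are assumptions for illustration only.

```python
# Sketch of k-means clustering on unlabelled 1-D data.
import random

def kmeans(points, k, iters=50, seed=0):
    random.seed(seed)
    centroids = random.sample(points, k)   # pick k initial centroids
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[i].append(p)
        # Move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

data = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]   # two obvious groups
print(kmeans(data, 2))
```

Note the "somewhat fuzzy goal" from the slide: nothing in the data says k = 2 is right; the algorithm only optimizes distances to whatever centroids we ask for.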

18 Instance-based Methods
Model-based methods: estimate a fixed set of model parameters from data, then compute predictions in closed form using those parameters.
Instance-based methods: look up similar "nearby" instances and predict that the new instance will be like those seen before. Example: will I like this movie?
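The "will I like this movie?" idea can be sketched as a 1-nearest-neighbour lookup: keep the movies already seen and predict from the most similar one. The feature vectors (action score, comedy score) and ratings here are invented for illustration.

```python
# Instance-based prediction: no parameters are fitted; we just remember
# past instances and look up the closest one at prediction time.

def nearest(query, seen):
    """Return the remembered (features, rating) closest to query
    by squared Euclidean distance."""
    return min(seen, key=lambda m: sum((a - b) ** 2
                                       for a, b in zip(m[0], query)))

seen = [((0.9, 0.1), "liked"),     # heavy action film I liked
        ((0.1, 0.9), "disliked")]  # pure comedy I disliked

features, rating = nearest((0.8, 0.2), seen)   # a new, mostly-action film
print(rating)
```

Contrast with a model-based method: here there is no training step at all; all the work is deferred to lookup time.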

19 Nonparametric Methods
Another name for instance-based or memory-based learning. A misnomer: they do have parameters, but the number of parameters is not fixed. It often grows with the number of examples: more examples → higher resolution.

20 Instance-based Learning
K-Nearest Neighbor Algorithm Weighted Regression Case-based reasoning

21 K-Nearest Neighbor Features
All instances correspond to points in an n-dimensional Euclidean space.
Classification is delayed until a new instance arrives.
Classification is done by comparing the feature vectors of the different points.
The target function may be discrete or real-valued.

22 1-Nearest Neighbor

23 3-Nearest Neighbor

24 k-nearest neighbor rule
Choose k odd to help avoid ties (k is a parameter!).
Given a query point xq, find the sphere around xq enclosing k points (at least k points in the sphere).
Classify xq according to the majority of the k neighbors.
[Figure legend: green circle = test case; solid circle: k = 3; dashed circle: k = 5.]
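The rule above can be sketched directly: sort the training points by distance to the query, take the k closest, and return the majority label. The toy 2-D points and labels are assumptions for illustration.

```python
# k-nearest-neighbour classification by majority vote (k odd to avoid ties).
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+

def knn_classify(xq, train, k=3):
    """train: list of (point, label) pairs; k should be odd."""
    neighbours = sorted(train, key=lambda t: dist(t[0], xq))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

train = [((0, 0), "blue"), ((1, 0), "blue"), ((0, 1), "blue"),
         ((5, 5), "red"), ((6, 5), "red"), ((5, 6), "red")]
print(knn_classify((1, 1), train, k=3))    # the 3 nearest points are blue
print(knn_classify((5.5, 5.5), train, k=5))
```

With an even k, votes can split 50/50 and the tie must be broken arbitrarily, which is exactly why the slide suggests choosing k odd.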

