CIVL 7012/8012 Introduction to Machine Learning.


1 CIVL 7012/8012 Introduction to Machine Learning

2 What is Machine Learning?
Field of study that enables computers to learn by themselves without being explicitly programmed. Learning means improving on a certain task T, with respect to a performance metric P, based on training experience E.
Some examples
Identifying spam emails
T: Classify emails as legitimate or spam
P: Percentage of accurate classifications
E: Email database
Detecting damage in structures
T: Detect the location and extent of damage in a structure
P: Percentage of accurate detections
E: Visual and sensory data

3 Some more examples
Autonomous vehicles
T: Vehicle driving on the road on its own using cameras and sensors
P: Distance travelled on its own until occurrence of a human-judged error
E: Visual and sensor data collected by observing a human driver
Classifying soil using photographs
T: Classifying soil class and type
P: Percentage of accurate classifications
E: Visual data such as photographs
Identifying spoken words (how personal assistants such as Apple Siri, Amazon Alexa, etc. work)
T: Classify and identify words spoken by someone
P: Percentage of accurate identifications
E: Database of words spoken by people with different accents

4 Some more examples of tasks best solved by a learning algorithm
Classification and regression - Determining credit scores; determining if someone is credit-worthy
Pattern recognition - Such as shopping recommendations for items bought together (for example: mobile phones and protective cases)
Recognizing anomalies and outliers - Such as unusual credit card transactions
Forecasting and predicting - Stock prices in the future

5 How is ML different from traditional modeling?
Machine Learning
ML does not need a human to program the rules: the data is enough for the computer to recognize patterns and make predictions.

6 ML components
Every ML algorithm has three components:
Representation
Evaluation
Optimization

7 Representation
The knowledge or the data should be represented in a formal language understood by the computer. Examples include decision trees, graphical models, neural networks, support vector machines, etc. Representation depends on "features", or variables, that describe the real-world entity.

8 Optimization
Learn a model from the features by minimizing a certain criterion (e.g. the error). The method used largely depends on the ML algorithm.
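A minimal sketch of the idea above: fitting a one-parameter model y = w*x by gradient descent on the squared error. The data points and learning rate are made up for illustration.

```python
# Made-up training data, roughly following y = 2x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

w = 0.0     # model parameter, initialized arbitrarily
lr = 0.01   # learning rate (assumed)

for _ in range(1000):
    # gradient of sum((w*x - y)^2) with respect to w is sum(2*x*(w*x - y))
    grad = sum(2 * x * (w * x - y) for x, y in zip(xs, ys))
    w -= lr * grad  # step opposite the gradient to reduce the error

print(round(w, 2))  # close to 2, the slope of the underlying data
```

Here "minimizing certain criteria" is made concrete: each update moves w in the direction that reduces the squared error on the training data.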

9 Evaluation
Evaluate the program for different learners (ML algorithms) and distinguish good learners using some metric. This is similar to fitting several models to data and choosing the best one in traditional modeling. Not all ML algorithms perform the same: some work very well under certain evaluation metrics while performing poorly on others.

10 Types of learning Supervised learning
Training data includes desired outputs (remember regression?)
Unsupervised learning
Training data does not include desired outputs; used in finding hidden patterns in data (e.g. weather patterns)
Semi-supervised learning
Training data includes a few desired outputs
Reinforcement learning - "rewards" from "actions"

11 Supervised learning: Regression
Given (x1,y1), (x2,y2), …, (xn,yn), learn a function f(x) that can predict y given x. Example: the price of a house as a function of its area.

12 Supervised learning: Regression
Steps
Find the function f(x) that best represents the data using the training data; constructing and selecting the input variables the model uses is called "feature engineering".
Evaluate its performance using testing data.
Look at other algorithms that can potentially work better.
It is important to understand how the features interact and whether the performance of the ML algorithm is affected by feature type.
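The steps above can be sketched as follows: fit a least-squares line (price vs. area) on training data, then evaluate it on held-out testing data. The house figures below are made up for illustration.

```python
# Made-up dataset: house area (sq ft) vs. price ($1000s)
areas  = [1000, 1200, 1500, 1800, 2000, 2400]
prices = [200,  235,  290,  350,  390,  470]

# Step 1: split into training and testing data
train_x, train_y = areas[:4], prices[:4]
test_x,  test_y  = areas[4:], prices[4:]

# Step 2: fit y = a + b*x by closed-form least squares on the training data
n = len(train_x)
mx = sum(train_x) / n
my = sum(train_y) / n
b = sum((x - mx) * (y - my) for x, y in zip(train_x, train_y)) / \
    sum((x - mx) ** 2 for x in train_x)
a = my - b * mx

# Step 3: evaluate on the testing data (mean absolute error)
mae = sum(abs(a + b * x - y) for x, y in zip(test_x, test_y)) / len(test_x)
print(round(b, 3), round(mae, 1))
```

The testing error, not the training error, is what tells you whether the fitted line generalizes to houses the model has not seen.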

13 Supervised learning: Applications
Some applications
Disease diagnosis based on the symptoms of the patient
Face recognition based on a picture of a person
Voice recognition based on the tone and pitch of an individual's voice
Credit risk assessment based on credit history

14 Supervised learning: Classification
Given (x1,y1), (x2,y2), …, (xn,yn), learn a function f(x) that can predict y given x, where y is a categorical variable. Example: given past weather patterns, what will the weather be like tomorrow? Will it be sunny, or will it rain? Another example is classifying Class A and Class B, which have similar features.
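A minimal classification sketch (not from the slides): a 1-nearest-neighbour classifier separating two made-up classes A and B using two features.

```python
# Made-up labelled training data: ((feature1, feature2), class)
train = [((1.0, 1.2), "A"), ((1.5, 0.9), "A"),
         ((4.0, 4.2), "B"), ((4.5, 3.8), "B")]

def classify(point):
    """Predict the label of the closest training example."""
    def dist2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return min(train, key=lambda ex: dist2(ex[0], point))[1]

print(classify((1.2, 1.0)), classify((4.2, 4.0)))  # A B
```

The learned "function" here is implicit in the stored examples: an unseen point gets the category of its nearest labelled neighbour.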

15 Unsupervised learning: Clustering
Given x1, x2, …, xn, learn the hidden structure behind the x's. Clusters need not be spherical.
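A toy clustering sketch: plain k-means with k=2 on made-up one-dimensional points. No labels are involved; the algorithm discovers the two groups on its own.

```python
# Made-up unlabelled data: two obvious groups around 1 and 8
points = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]
c1, c2 = 0.0, 10.0  # initial centroid guesses (assumed)

for _ in range(10):
    # assign each point to its nearest centroid
    g1 = [p for p in points if abs(p - c1) <= abs(p - c2)]
    g2 = [p for p in points if abs(p - c1) > abs(p - c2)]
    # move each centroid to the mean of its assigned points
    c1 = sum(g1) / len(g1)
    c2 = sum(g2) / len(g2)

print(round(c1, 2), round(c2, 2))  # 1.0 8.07
```

The centroids converge to the centres of the two hidden groups, which is the "hidden structure" the slide refers to.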

16 Reinforcement learning
The goal is to learn actions that maximize rewards. Learning continues throughout operation, without a separate training and testing phase. Used in robotics and industrial manufacturing.

17 Reinforcement learning
Example
Consider a self-driving car stopped at a red light. The light turns green and the car starts moving. Here the agent is the car, and it is moving from one state to another. This action produces some reward. If, on the contrary, the car does not start and remains at the intersection despite the green signal, it receives a traffic ticket. This is a negative reward.

18 Reinforcement learning
Model the world as a finite set of states
Each state has a finite number of actions
A specific action taken from a state yields a reward when executed, and then we move to a different state
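A toy sketch of the setup above (the states, actions, and rewards are made up, loosely following the traffic-light example): Q-learning on a two-state world, where state 0 is a red light and state 1 a green light, and the actions are "wait" and "go".

```python
import random

random.seed(0)
states, actions = [0, 1], ["wait", "go"]

# Made-up reward table: going on green is rewarded,
# going on red or waiting on green is penalized
reward = {0: {"wait": 0.0, "go": -1.0},
          1: {"wait": -1.0, "go": 1.0}}

Q = {s: {a: 0.0 for a in actions} for s in states}
alpha, gamma = 0.5, 0.9  # learning rate and discount (assumed)

s = 0
for _ in range(500):
    a = random.choice(actions)      # explore by acting randomly
    r = reward[s][a]
    s_next = random.choice(states)  # assumed random state transitions
    # standard Q-learning update toward reward + discounted future value
    Q[s][a] += alpha * (r + gamma * max(Q[s_next].values()) - Q[s][a])
    s = s_next

best = {s: max(Q[s], key=Q[s].get) for s in states}
print(best)  # the learned policy: wait on red, go on green
```

After enough interactions the Q-table encodes the policy the example describes: wait at the red light, go at the green one.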

19 Evaluation metrics for ML classifiers
Accuracy alone is not a good measure of the quality of an algorithm. Consider an algorithm detecting spam. Out of 10,000 emails, we know 100 are spam.
Case 1: Every email the algorithm sees is classified as not-spam. Accuracy on spam emails = 0/100, yet overall accuracy = 9,900/10,000 = 99%.
Case 2: Every email the algorithm sees is classified as spam. Accuracy on spam emails = 100/100, yet overall accuracy = 100/10,000 = 1%.
Is the second algorithm better?
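A quick check of the point above, using the counts from the slide (10,000 emails, 100 of them spam):

```python
total, spam = 10_000, 100

# Case 1: everything classified as not-spam
acc1_overall = (total - spam) / total  # the 9,900 legitimate emails are correct
acc1_spam    = 0 / spam                # no spam caught

# Case 2: everything classified as spam
acc2_overall = spam / total            # only the 100 spams are correct
acc2_spam    = spam / spam             # all spam caught

print(acc1_overall, acc1_spam, acc2_overall, acc2_spam)
# 0.99 0.0 0.01 1.0
```

The do-nothing classifier scores 99% overall accuracy while catching zero spam, which is why accuracy alone can be misleading on imbalanced data.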

20 Precision and recall
Since accuracy alone does not give a good idea of an algorithm's quality in many cases, we use two standard measures to evaluate the quality of predictions.
Precision: accuracy in terms of classifying only relevant instances
Recall: accuracy in classifying all relevant instances correctly
If nt is the number of true positives, nf the number of false positives, and ne the number of false negatives:
Precision = nt/(nt+nf)
Recall = nt/(nt+ne)

21 F1 score Used to evaluate the performance of classifiers
Combines precision and recall; calculated as the harmonic mean of precision and recall:
F1 = 2*Precision*Recall/(Precision+Recall)
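A small helper implementing the formulas above, using the slides' notation (nt = true positives, nf = false positives, ne = false negatives); the example counts are made up.

```python
def f1_score(nt, nf, ne):
    """Harmonic mean of precision and recall."""
    precision = nt / (nt + nf)  # nt/(nt+nf)
    recall = nt / (nt + ne)     # nt/(nt+ne)
    return 2 * precision * recall / (precision + recall)

# e.g. 90 spams caught, 10 legitimate emails flagged, 10 spams missed
print(round(f1_score(90, 10, 10), 2))  # 0.9
```

Because the harmonic mean is dominated by the smaller of the two values, a classifier must do well on both precision and recall to get a high F1.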

22 Example
Consider 100 known spam emails in a dataset of 10,000 emails. Find the precision and recall assuming every email the algorithm sees is classified as not-spam.
nt=0, nf=0, ne=100
Precision = 0/0 = undefined
Recall = 0/100 = 0

23 Errors There are two types of errors in ML algorithms:
Training error: error made on the training data
Testing error: error made on the testing data
Unlike traditional modeling, ML algorithms need to be developed with generalization in mind in order to fit data unused in training.

24 Estimating errors
For traditional models we can estimate the errors made by the model in training within a confidence interval. ML algorithms, however, often fit the training data extremely well, sometimes with zero error. Therefore traditional statistical methods may not be applicable in evaluating ML algorithms.

25 Cross validation
Cross validation evaluates the performance of supervised ML algorithms. Given a dataset with a number of features or variables, how do you decide which algorithm is the best? We can
Train the algorithm to be more accurate (loses generalization)
Choose the most accurate algorithm
What about data the algorithm has not yet seen? How will it perform on such data?

26 Example
In this example, a piecewise linear regression is a perfect fit for the dataset, but this may be too perfect: what about future data points? The first two models are likely to fit future points better due to their better generalization. The best algorithms are those which are accurate but do not lose generalization, and so can better predict future, unseen data.

27 Methods of Cross Validation
Train-test: Steps
Split the dataset into a training and a testing dataset. Usually the split is 75% training and 25% testing.
Develop a model using the training dataset.
Test model performance using the testing dataset.
Leave One Out Cross Validation (LOOCV)
Remove one observation from the training data
Train the ML algorithm on the remaining data
Test the algorithm on the removed observation
Repeat for all observations in the training data
LOOCV is cumbersome, as every observation in the original dataset needs to be tested: if there are 10 million observations, then 10 million training runs are needed.
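A hedged sketch of the LOOCV loop above, using a deliberately trivial "model" (predict the mean of the training labels) so the leave-one-out structure is the focus; the data is made up.

```python
# Made-up observations
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

errors = []
for i in range(len(ys)):
    train = ys[:i] + ys[i + 1:]        # remove observation i
    pred = sum(train) / len(train)     # "train" the stand-in mean model
    errors.append(abs(pred - ys[i]))   # test on the removed observation

loocv_error = sum(errors) / len(errors)
print(round(loocv_error, 2))  # 3.0
```

The loop runs one full training pass per observation, which is exactly why LOOCV does not scale to millions of observations.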

28 Cross Validation: K-fold cross validation
LOOCV is not suitable for huge datasets; K-fold validation is a more efficient approach.
Steps
Split the data into K partitions
Train on K-1 partitions and test on the remaining partition
Repeat this for each partition
The value of K depends on the data; 10-fold cross validation is the most commonly used.
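A minimal sketch of the k-fold steps above (k=5 here to suit the small made-up dataset), again with a mean-predictor as the stand-in model:

```python
# Made-up observations
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
k = 5
fold_size = len(data) // k

fold_errors = []
for f in range(k):
    # partition f is the test fold; the rest is training data
    test = data[f * fold_size:(f + 1) * fold_size]
    train = data[:f * fold_size] + data[(f + 1) * fold_size:]
    pred = sum(train) / len(train)  # "train" on the k-1 folds
    err = sum(abs(pred - y) for y in test) / len(test)
    fold_errors.append(err)         # test on the held-out fold

cv_error = sum(fold_errors) / k     # average error across folds
print(round(cv_error, 2))
```

Only k training runs are needed instead of one per observation, which is the efficiency gain over LOOCV.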

29 Methods of Cross Validation
Which method of validation to use depends on the user. Train-test works best if the training data is not biased. LOOCV, although it validates all the data points, is not scalable to large datasets. With LOOCV, excluding different observations can produce different fitted models, since the training data changes at each iteration.

