CSE 4705 Artificial Intelligence

Jinbo Bi, Department of Computer Science & Engineering

Machine learning (1) Supervised learning algorithms

Topics in machine learning
Supervised learning, such as classification and regression; unsupervised learning, such as cluster analysis and outlier/novelty detection; dimension reduction; semi-supervised learning; active learning; online learning.

Common techniques: supervised learning
Regularized least squares; least absolute shrinkage and selection operator (LASSO); neural networks; logistic regression; decision trees; Fisher’s discriminant analysis; support vector machines; graphical models.

Common techniques: unsupervised learning
K-means; Gaussian mixture models; hierarchical clustering; graph-based clustering (e.g., spectral clustering).

Common techniques: dimension reduction
Principal component analysis; independent component analysis; canonical correlation analysis; feature selection; sparse modeling.

Machine learning / Data mining
Data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information. Representative venue: the ACM SIGKDD conference.
The ultimate goal of machine learning is the creation and understanding of machine intelligence. It is heavily related to statistical learning theory. Representative venue: the ICML conference.
Artificial intelligence is the intelligence exhibited by machines or software. The field studies how to create computers and computer software that are capable of intelligent behavior. Representative venue: the AAAI conference.

Supervised learning: definition
Given a collection of examples (the training set), each example contains a set of attributes (independent variables), and one of the attributes is the target (dependent variable). The task is to find a model that predicts the target as a function of the values of the other attributes. Goal: previously unseen examples should be predicted as accurately as possible. A test set is used to determine the accuracy of the model. Usually, the given data set is divided into training and test sets, with the training set used to build the model and the test set used to validate it.
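The train/test workflow described above can be sketched in a few lines. This is a minimal illustration, not the course's method: the toy data, the helper names (`train_test_split`, `predict_1nn`, `accuracy`), and the choice of a one-nearest-neighbor model are all assumptions made for the example.

```python
import random

def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle the examples and split them into a training set and a test set."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def predict_1nn(train, x):
    """Predict the target of x as the target of the closest training example."""
    nearest = min(train, key=lambda ex: abs(ex[0] - x))
    return nearest[1]

def accuracy(train, test):
    """Fraction of test examples whose target is predicted correctly."""
    correct = sum(1 for x, y in test if predict_1nn(train, x) == y)
    return correct / len(test)

# Hypothetical toy data: one attribute x; the target is "low" if x < 5, else "high".
examples = [(float(x), "low" if x < 5 else "high") for x in range(10)]
train, test = train_test_split(examples)
test_accuracy = accuracy(train, test)  # measured only on examples the model never saw
```

The model is built from `train` alone; `test` is touched only when measuring accuracy, which is what makes the estimate meaningful for unseen examples.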

Supervised learning: classification
When the dependent variable is categorical, the task is a classification problem.

Classification: example
Face recognition
Goal: predict the identity of a face image.
Approach: align all images to derive the features, then model the class (identity) based on these features.

Supervised learning: regression
When the dependent variable is continuous, the task is a regression problem.

Regression: example
Risk prediction for patients
Goal: predict the likelihood that a patient will suffer a major complication after a surgical procedure.
Approach: use patients' vital signs before and after the surgical operation (heart rate, respiratory rate, etc.). Expert medical professionals monitor the patients and rate the likelihood of each patient having a complication. Learn a model that maps patient vital signs to the risk ratings. Use this model to detect potential high-risk patients for a particular surgical procedure.

Unsupervised learning: clustering
Given a set of data points, each having a set of attributes, and a similarity measure among them, find clusters such that data points in the same cluster are more similar to one another and data points in separate clusters are less similar to one another. Similarity measures: Euclidean distance if the attributes are continuous; other problem-specific measures.
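As an illustration of clustering with Euclidean distance, here is a minimal sketch of K-means, one of the unsupervised techniques listed earlier. The data points, the initial centers, and the function names are assumptions chosen for the example; practical implementations also handle initialization and convergence checks more carefully.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points with continuous attributes."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def k_means(points, centers, n_iter=10):
    """Alternate between assigning points to the nearest center and recomputing centers."""
    for _ in range(n_iter):
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: euclidean(p, centers[i]))
            clusters[idx].append(p)
        # An empty cluster keeps its previous center.
        centers = [mean_point(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, clusters = k_means(points, centers=[points[0], points[3]])
```

With these inputs the first three points end up in one cluster and the last three in the other, since points within each group are much closer to one another than to the other group.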

Clustering: example
High-risk patient detection
Goal: predict whether a patient will suffer a major complication after a surgical procedure.
Approach: use patients' vital signs before and after the surgical operation (heart rate, respiratory rate, etc.). Find patients whose symptoms are dissimilar from those of most other patients.
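One simple way to find patients who are dissimilar from most others is to rank them by distance from the average patient. The sketch below assumes this particular dissimilarity criterion; the patient labels and vital-sign values are invented for illustration, and real vital signs would normally be rescaled first so that no single attribute dominates the distance.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two attribute vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def most_dissimilar(patients, k=1):
    """Return the labels of the k patients farthest from the centroid."""
    center = centroid(list(patients.values()))
    ranked = sorted(patients, key=lambda name: euclidean(patients[name], center), reverse=True)
    return ranked[:k]

# Hypothetical (heart rate, respiratory rate) readings per patient.
patients = {
    "A": (72.0, 16.0),
    "B": (75.0, 15.0),
    "C": (70.0, 17.0),
    "D": (130.0, 30.0),
}
flagged = most_dissimilar(patients, k=1)
```

No labels are used anywhere, which is what distinguishes this from the supervised risk-prediction example.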

Practice
Judge what kind of problem each of the following scenarios is.
A student collected a couple of online documents about movies and tries to identify which movie each document discusses.
In a cognitive test, a person is asked whether he can recognize the color red on a screen. The person presses a button if he thinks he sees red, and otherwise does not. An EEG recording is made during the test. A researcher wants to use the EEG recordings to predict whether the red color is recognized.
A researcher observed and recorded weather conditions (temperature, wind speed, snow, etc.) over the past month, and wants to use the data to predict the temperature on the next day.

Review of probability and linear algebra

Basics of probability
An experiment (random variable) is a well-defined process with observable outcomes. The set or collection of all outcomes of an experiment is called the sample space, S. An event E is any subset of outcomes from S. When all outcomes are equally likely, the probability of an event is P(E) = (number of outcomes in E) / (number of outcomes in S).
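The counting definition can be checked on a small example. The sketch below assumes a fair six-sided die, so every outcome in the sample space is equally likely; the die and the function name `probability` are illustrative choices.

```python
from fractions import Fraction

def probability(event, sample_space):
    """P(E) = |E| / |S|, valid when all outcomes in S are equally likely."""
    if not event <= sample_space:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))

# Sample space of one roll of a fair die, and the event "the roll is even".
sample_space = {1, 2, 3, 4, 5, 6}
event_even = {s for s in sample_space if s % 2 == 0}

p_even = probability(event_even, sample_space)
```

Using `Fraction` keeps the result exact, so `p_even` is the rational number 1/2 rather than a rounded float.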

Probability theory
Joint probability $p(X, Y)$; marginal probability $p(X)$; conditional probability $p(Y \mid X)$.

Probability theory: sum rule and product rule
Sum rule: the marginal probability of X equals the sum of the joint probability of X and Y over all values of Y, $p(X) = \sum_Y p(X, Y)$.
Product rule: the joint probability of X and Y equals the product of the conditional probability of Y given X and the probability of X, $p(X, Y) = p(Y \mid X)\,p(X)$.
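Both rules can be verified on a small joint distribution. The table below is made up for illustration (two values of X, two values of Y, probabilities summing to 1), and the function names are assumptions.

```python
from fractions import Fraction

# Hypothetical joint distribution p(X, Y) over X in {x1, x2} and Y in {y1, y2}.
joint = {
    ("x1", "y1"): Fraction(1, 8), ("x1", "y2"): Fraction(3, 8),
    ("x2", "y1"): Fraction(2, 8), ("x2", "y2"): Fraction(2, 8),
}

def marginal_x(joint, x):
    """Sum rule: p(X=x) is the sum of p(X=x, Y=y) over all y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def conditional_y_given_x(joint, y, x):
    """p(Y=y | X=x) = p(X=x, Y=y) / p(X=x)."""
    return joint[(x, y)] / marginal_x(joint, x)

p_x1 = marginal_x(joint, "x1")
p_y2_given_x1 = conditional_y_given_x(joint, "y2", "x1")

# Product rule: p(X, Y) = p(Y | X) p(X) recovers the joint entry.
reconstructed = p_y2_given_x1 * p_x1
```

Here `reconstructed` equals `joint[("x1", "y2")]`, which is the product rule read from right to left.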

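Illustration
[Figure: a joint distribution $p(X, Y)$ with $Y = 1$ and $Y = 2$, shown together with the marginals $p(X)$ and $p(Y)$ and the conditional $p(X \mid Y = 1)$.]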

The rules of probability
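Sum rule: $p(X) = \sum_Y p(X, Y)$
Product rule: $p(X, Y) = p(Y \mid X)\,p(X)$
Bayes’ rule: $p(Y \mid X) = \dfrac{p(X \mid Y)\,p(Y)}{p(X)}$, where $p(X) = \sum_Y p(X \mid Y)\,p(Y)$
posterior $\propto$ likelihood $\times$ prior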
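Bayes' rule amounts to multiplying likelihood by prior and then normalizing. A minimal sketch follows; the prior and likelihood values are invented for illustration, and `posterior` is a hypothetical helper name.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Bayes' rule: p(Y | X=x) is proportional to p(X=x | Y) * p(Y)."""
    unnormalized = {y: likelihood[y] * prior[y] for y in prior}
    evidence = sum(unnormalized.values())  # p(X=x), obtained by the sum rule
    return {y: value / evidence for y, value in unnormalized.items()}

# Hypothetical numbers: prior p(Y) and likelihood p(X=x1 | Y) for each value of Y.
prior = {"y1": Fraction(3, 8), "y2": Fraction(5, 8)}
likelihood = {"y1": Fraction(1, 3), "y2": Fraction(3, 5)}

post = posterior(prior, likelihood)
```

The normalizing constant is the marginal $p(X)$, which is why the posterior is only proportional to likelihood times prior before dividing.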

Mean and variance
The mean of a random variable X is the average value X takes. The variance of X is a measure of how dispersed the values that X takes are. The standard deviation is simply the square root of the variance.

Simple example
X takes values in {1, 2} with P(X=1) = 0.8 and P(X=2) = 0.2.
Mean: $0.8 \times 1 + 0.2 \times 2 = 1.2$
Variance: $0.8 \times (1 - 1.2)^2 + 0.2 \times (2 - 1.2)^2 = 0.16$
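The same calculation can be written as a short sketch for any discrete distribution. The function names are assumptions; the distribution is the one from the example, with P(X=1) = 0.8 and P(X=2) = 0.2.

```python
import math

def mean(dist):
    """Mean of a discrete random variable given as {value: probability}."""
    return sum(p * x for x, p in dist.items())

def variance(dist):
    """Probability-weighted average of squared deviations from the mean."""
    mu = mean(dist)
    return sum(p * (x - mu) ** 2 for x, p in dist.items())

dist = {1: 0.8, 2: 0.2}
mu = mean(dist)        # approximately 1.2
var = variance(dist)   # approximately 0.16
std = math.sqrt(var)   # approximately 0.4
```

The values are computed in floating point, so they match 1.2, 0.16, and 0.4 only up to rounding error.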

Gaussian distribution
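The formula on this slide did not survive in the transcript. As a reference point, the standard univariate Gaussian density with mean $\mu$ and variance $\sigma^2$ is $\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$. The sketch below evaluates it; the function name and default parameters are assumptions.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma2=1.0):
    """Density of a univariate Gaussian with mean mu and variance sigma2 (sigma2 > 0)."""
    if sigma2 <= 0:
        raise ValueError("variance must be positive")
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# The density peaks at the mean and is symmetric around it.
peak = gaussian_pdf(0.0)
left, right = gaussian_pdf(-1.5), gaussian_pdf(1.5)
```

Note that the returned value is a density, not a probability; it can exceed 1 when the variance is small.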

Multivariate Gaussian
[Figure with axes x and y]

Basics of linear algebra

Matrix multiplication
The product of two matrices: $C = AB$. Special cases: vector-vector product, matrix-vector product.
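The definition of the product, where entry (i, j) of the result is the inner product of row i of A with column j of B, can be sketched directly. Matrices are represented here as lists of rows, which is an assumption of this example rather than anything from the slides.

```python
def matmul(A, B):
    """Matrix product of A (n x k) and B (k x m), both given as lists of rows."""
    n, k, m = len(A), len(B), len(B[0])
    if any(len(row) != k for row in A):
        raise ValueError("inner dimensions must agree")
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = matmul(A, B)

# Special cases: a column vector is a k x 1 matrix, a row vector is a 1 x k matrix.
matrix_vector = matmul(A, [[1], [1]])        # 2 x 1 result
vector_vector = matmul([[1, 2]], [[3], [4]])  # 1 x 1 result (inner product)
```

Treating vectors as single-column or single-row matrices makes the two special cases fall out of the same function.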

Matrix multiplication

Rules of matrix multiplication

Vector norms
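The definitions from this slide are missing in the transcript. The sketch below shows three commonly used vector norms (the 1-norm, the Euclidean 2-norm, and the infinity norm); whether the slide covered exactly these is an assumption.

```python
import math

def norm_1(v):
    """1-norm: sum of absolute values of the entries."""
    return sum(abs(x) for x in v)

def norm_2(v):
    """2-norm (Euclidean length): square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for x in v))

def norm_inf(v):
    """Infinity norm: largest absolute value among the entries."""
    return max(abs(x) for x in v)

v = [3.0, -4.0]
norms = (norm_1(v), norm_2(v), norm_inf(v))
```

For any vector these satisfy `norm_inf(v) <= norm_2(v) <= norm_1(v)`.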

Matrix norms and trace

A bit more on matrix

Orthogonal matrix
A square matrix $Q$ is orthogonal if $Q^\top Q = Q Q^\top = I$, that is, $Q^{-1} = Q^\top$.

Square matrix – eigenvalue, eigenvector
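$A x = \lambda x$, where $x \neq 0$ is an eigenvector of $A$ and $\lambda$ is the corresponding eigenvalue.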
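The defining relation of an eigenpair, $A x = \lambda x$ with $x \neq 0$, is easy to check numerically. The matrix and candidate vectors below are chosen for illustration, and `is_eigenpair` is a hypothetical helper; it verifies a given pair rather than computing eigenvalues.

```python
def matvec(A, x):
    """Matrix-vector product, with A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def is_eigenpair(A, x, lam, tol=1e-9):
    """Check whether A x = lam * x holds (within tol) for a nonzero vector x."""
    if all(abs(xi) <= tol for xi in x):
        return False  # the zero vector is never an eigenvector
    Ax = matvec(A, x)
    return all(abs(axi - lam * xi) <= tol for axi, xi in zip(Ax, x))

# A symmetric 2 x 2 example with eigenvalues 3 and 1.
A = [[2.0, 1.0], [1.0, 2.0]]
check_3 = is_eigenpair(A, [1.0, 1.0], 3.0)
check_1 = is_eigenpair(A, [1.0, -1.0], 1.0)
```

The two eigenvectors here, (1, 1) and (1, -1), are orthogonal to each other, as expected for a symmetric matrix with distinct eigenvalues.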

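Symmetric matrix
Eigen-decomposition of $A$: a real symmetric matrix can be written as $A = U \Lambda U^\top$, where $U$ is orthogonal (its columns are eigenvectors of $A$) and $\Lambda$ is diagonal (its entries are the eigenvalues of $A$).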

Singular value decomposition
$A = U \Sigma V^\top$, where $U$ and $V$ are orthogonal and $\Sigma$ is diagonal.

Supervised learning – practical issues
Underfitting and overfitting. Before introducing these important concepts, let us study a simple regression algorithm: linear regression.
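As a preview, here is a minimal sketch of linear regression with a single attribute, fitted by ordinary least squares. The closed-form slope and intercept are the standard ones; the data (generated from y = 2x + 1 without noise) and the function names are assumptions for illustration. Comparing the error on training data with the error on held-out test data is the usual way to spot underfitting and overfitting.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def mean_squared_error(xs, ys, slope, intercept):
    """Average squared difference between predictions and targets."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical noise-free data from y = 2x + 1.
train_x, train_y = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
test_x, test_y = [4.0, 5.0], [9.0, 11.0]

slope, intercept = fit_line(train_x, train_y)
train_mse = mean_squared_error(train_x, train_y, slope, intercept)
test_mse = mean_squared_error(test_x, test_y, slope, intercept)
```

Because the data are exactly linear, both errors are essentially zero here. A model that is too simple would show high error on both sets (underfitting), while a model that is too flexible would show low training error but high test error (overfitting).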

Questions?
