Machine Learning ICS 178. Instructor: Max Welling. Visualization & k-nearest neighbors.


Types of Learning
- Supervised learning: labels are provided, so there is a strong learning signal; e.g. classification, regression.
- Semi-supervised learning: only some of the data are labeled; e.g. a child growing up.
- Reinforcement learning: the learning signal is a (scalar) reward and may arrive with a delay; e.g. learning to play chess, a mouse in a maze.
- Unsupervised learning: there is no direct learning signal; we simply try to find structure in the data; e.g. clustering, dimensionality reduction.

Ingredients
- Data: what kind of data do we have?
- Prior assumptions: what do we know a priori about the problem?
- Representation: how do we represent the data?
- Model / hypothesis space: what hypotheses are we willing to entertain to explain the data?
- Feedback / learning signal: what kind of learning signal do we have (delayed, labels)?
- Learning algorithm: how do we update the model (or set of hypotheses) from feedback?
- Evaluation: how well did we do; should we change the model?

Data Preprocessing
Before you start modeling the data, you want to look at it to get a "feel" for it.
What are the "modalities" of the data?
- Netflix: users and movies
- Text: word tokens and documents
- Video: pixels, frames, color index (R,G,B)
What is the domain?
- Netflix: rating values [1,2,3,4,5,?]
- Text: number of times a word appears: [0,1,2,3,...]
- Video: brightness value: [0,...,255] or real-valued
Are there missing data entries? Are there outliers in the data (perhaps a typo)? A minimal first-look sketch is given below.
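As a concrete illustration (added here, not part of the original slides), here is a minimal Python sketch of such a first look. The ratings table, its column names, and the planted typo are all hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical ratings table with columns: user, movie, rating.
df = pd.DataFrame({
    "user":   [1, 1, 2, 2, 3, 3],
    "movie":  ["A", "B", "A", "C", "B", "C"],
    "rating": [4, 5, 3, np.nan, 55, 2],   # one missing entry, one typo
})

# Domain check: which rating values actually occur?
print(df["rating"].value_counts(dropna=False).sort_index())

# Missing entries per column.
print(df.isna().sum())

# Crude outlier check: ratings outside the expected 1..5 domain.
print(df[(df["rating"] < 1) | (df["rating"] > 5)])
```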

Data Preprocessing
Often it is a good idea to compute the mean and variance of the data. The mean gives you a sense of location; the variance (or standard deviation) gives a sense of scale. For data points x_1, ..., x_N:
- mean: μ = (1/N) Σ_n x_n
- variance: σ² = (1/N) Σ_n (x_n − μ)²
- standard deviation: σ = √(σ²)
Better still is to histogram the data. Tricky issue: how do you choose the bin size? Too small and you see only noise; too big and everything becomes one clump.
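To make the bin-size issue concrete, here is a small Python sketch (an illustration on synthetic data, not from the original slides) that histograms the same sample with three different bin counts:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=500)  # synthetic data

print("mean:", x.mean(), "variance:", x.var(), "std:", x.std())

# Same data, three bin sizes: too fine (noise), reasonable, too coarse (one clump).
fig, axes = plt.subplots(1, 3, figsize=(12, 3))
for ax, bins in zip(axes, [200, 25, 3]):
    ax.hist(x, bins=bins)
    ax.set_title(f"{bins} bins")
plt.show()
```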

Preprocessing
For Netflix you can histogram along both modalities:
- the rating distribution over users for one movie;
- the rating distribution over movies for one user;
- the rating distribution over users for all movies jointly;
- the rating distribution over movies for all users jointly.
You can also compute properties and plot them against each other. For example: compute the user-specific mean and variance over movies and make a scatter plot of user-mean against user-variance, where every dot is a different user (a sketch follows below).
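Here is a minimal sketch of that scatter plot (not from the original slides); the user-by-movie rating matrix is simulated, with NaN marking an unobserved rating:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical user-by-movie rating matrix; NaN marks a missing rating.
rng = np.random.default_rng(1)
R = rng.integers(1, 6, size=(300, 50)).astype(float)
R[rng.random(R.shape) < 0.7] = np.nan  # most entries unobserved

user_mean = np.nanmean(R, axis=1)  # mean rating over movies, per user
user_var = np.nanvar(R, axis=1)    # rating variance over movies, per user

plt.scatter(user_mean, user_var, s=10)  # every dot is a different user
plt.xlabel("user-mean")
plt.ylabel("user-variance")
plt.show()
```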

Scatter-Plots
This shows all the 2-D projections of the "Iris data"; color indicates the class of iris. How many attributes do we have for Iris? (A sketch for producing such a plot follows below.)
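A minimal Python sketch of this pairwise scatter matrix, using the Iris data bundled with scikit-learn (the slide's own figure is not reproduced here):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target   # 150 samples, 4 attributes
n = X.shape[1]

# All pairwise 2-D projections, colored by iris class.
fig, axes = plt.subplots(n, n, figsize=(10, 10))
for i in range(n):
    for j in range(n):
        axes[i, j].scatter(X[:, j], X[:, i], c=y, s=5)
        if i == n - 1:
            axes[i, j].set_xlabel(iris.feature_names[j], fontsize=7)
        if j == 0:
            axes[i, j].set_ylabel(iris.feature_names[i], fontsize=7)
plt.show()
```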

3-D Visualization
A function of two variables can be shown either as a contour plot or as a mesh-grid (surface) plot; a sketch that produces both follows below.
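A small Python sketch (the Gaussian bump is an arbitrary example function, not from the slides):

```python
import numpy as np
import matplotlib.pyplot as plt

# Evaluate a 2-D function on a grid.
x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
Z = np.exp(-(X**2 + Y**2) / 2)   # e.g. a Gaussian bump

fig = plt.figure(figsize=(10, 4))
ax1 = fig.add_subplot(1, 2, 1)
ax1.contour(X, Y, Z)             # contour plot
ax1.set_title("contour plot")
ax2 = fig.add_subplot(1, 2, 2, projection="3d")
ax2.plot_surface(X, Y, Z)        # mesh-grid / surface plot
ax2.set_title("meshgrid plot")
plt.show()
```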

Embeddings
Every red dot represents an image; each image is a high-dimensional vector of pixel values. Each image is projected to a 2-D space, such that similar images are projected to nearby locations in the 2-D embedding. This gives us an idea of how the data are organized. These plots were produced by "locally linear embedding" (LLE).
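A minimal LLE sketch using scikit-learn; since the slides' image data is not available, the bundled 8x8 digit images serve as a stand-in:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.manifold import LocallyLinearEmbedding

# Stand-in image data: 8x8 digit images, each a 64-dimensional pixel vector.
X, y = load_digits(return_X_y=True)

# Project every image to 2-D such that similar images land close together.
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2)
Z = lle.fit_transform(X)

plt.scatter(Z[:, 0], Z[:, 1], c=y, s=5)  # one dot per image
plt.show()
```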


Visualization by Clustering
By clustering the data and looking at the cluster prototypes (e.g. the cluster means) you can get an idea of the types of data present. A minimal sketch follows below.
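For illustration (not from the original slides), here is a k-means sketch on the digit images, displaying each cluster mean as a prototype image:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.cluster import KMeans

X, _ = load_digits(return_X_y=True)   # 8x8 digit images as 64-d vectors

# Cluster the images and inspect the 10 cluster prototypes (the means).
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(X)

fig, axes = plt.subplots(1, 10, figsize=(12, 2))
for ax, center in zip(axes, kmeans.cluster_centers_):
    ax.imshow(center.reshape(8, 8), cmap="gray")  # prototype as an image
    ax.axis("off")
plt.show()
```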

Preprocessing
Often it is useful to "standardize" (or "whiten") the data before you start modeling. The idea is to subtract the mean and rescale to unit variance, so that your algorithm can focus on more sophisticated (higher-order) structure. A sketch follows below.
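A minimal standardization sketch in Python (synthetic data for illustration):

```python
import numpy as np

def standardize(X):
    """Z-score each column: subtract the mean, divide by the std."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant features
    return (X - mu) / sigma

rng = np.random.default_rng(2)
X = rng.normal(loc=[10.0, -3.0], scale=[5.0, 0.1], size=(1000, 2))
Xs = standardize(X)
print(Xs.mean(axis=0))  # approximately [0, 0]
print(Xs.std(axis=0))   # approximately [1, 1]
```

Note that full "whitening" additionally decorrelates the features (e.g. via PCA); the sketch above performs only the per-feature standardization, which is what scikit-learn's StandardScaler also does.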

Be Creative! WEKA DEMO