Transductive Regression Piloted by Inter-Manifold Relations

Regression Algorithms. Reviews
Exploit the manifold structure to guide the regression:
- Belkin et al., "Regularization and Semi-supervised Learning on Large Graphs": transduces function values from the labeled data to the unlabeled data using local neighborhood relations; a global optimization yields a robust prediction.
- Cortes et al., "On Transductive Regression": Tikhonov regularization in a Reproducing Kernel Hilbert Space (RKHS); the classification problem can be regarded as a special case of regression.
- Fei Wang et al., "Label Propagation Through Linear Neighborhoods": an iterative procedure propagates class labels within local neighborhoods and is proved to converge; the regression values are constrained to 0 and 1 (binary), i.e., 1 for samples belonging to the corresponding class and 0 otherwise; the convergence point can be derived from the regularization framework.
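Below is a minimal sketch (not the authors' code) of the graph-Laplacian-regularized transductive regression these works build on: labeled function values are propagated to unlabeled samples through a k-NN graph. The k-NN construction, connectivity weights, and squared loss are assumptions made here for illustration.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def laplacian_regression(X, y, labeled_mask, n_neighbors=5, lam=1.0):
    """X: (n, d) samples; y: (n,) targets (values at unlabeled points are ignored);
    labeled_mask: boolean (n,) marking the labeled samples."""
    # symmetric k-NN adjacency and unnormalized graph Laplacian
    W = kneighbors_graph(X, n_neighbors, mode='connectivity', include_self=False)
    W = 0.5 * (W + W.T).toarray()
    L = np.diag(W.sum(axis=1)) - W
    J = np.diag(labeled_mask.astype(float))   # selects the fitness (loss) term
    # minimize ||J(f - y)||^2 + lam * f^T L f  =>  (J + lam * L) f = J y
    return np.linalg.solve(J + lam * L, J @ y)
```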

The Problem We Are Facing
- Age estimation w.r.t. different genders (FG-NET Aging Database).
- Pose estimation w.r.t. different genders, illuminations, expressions, and persons (CMU-PIE dataset).

The Problem We Are Facing
Traditional algorithms:
- All samples are treated as belonging to the same class.
- Samples that are close in the data space X are assumed to have similar function values (smoothness along the manifold).
Regression on multi-class samples:
- The class information is easy to obtain for the training data, so it can be utilized in the training process to boost performance.
- For the incoming (test) sample, no class information is given.

The Problem. Difference from Multi-View Algorithms
Multi-view regression:
- One object can have multiple views, or multiple learners can be employed for the same object.
- There exists a clear correspondence among the multiple learners, and the disagreement of the different learners is penalized.
Multi-class regression (this work):
- No explicit correspondence: the data of different classes may be obtained from different instances in our configuration, so the problem is much more challenging.
- The class information is utilized in two ways: intra-class regularization and inter-class regularization.

The Algorithm TRIM. Assumptions & Notation
- Samples from different classes lie on different sub-manifolds.
- Samples from different classes share a similar distribution along their respective sub-manifolds.
- Labels: the function values for regression.
- Intra-manifold is used interchangeably with intra-class, and inter-manifold with inter-class.

TRIM. Intra-Manifold Regularization
- A separate intrinsic graph is built for each sample class.
- Correspondingly, the intra-manifold regularization term of each class is computed separately on its own intrinsic graph (the slide gives the regularizer for p = 1 and p = 2; a hedged reconstruction follows below).
- It may not be proper to enforce smoothness between samples from different classes.
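The formula images are not reproduced in the transcript; as a hedged reconstruction, the per-class regularizer is presumably the standard graph-smoothness penalty on the class-c intrinsic graph (the exact weights and normalization on the slide may differ):

\[
S_c(f) \;=\; \sum_{i,j \in \text{class } c} W^{(c)}_{ij}\,\bigl|f(x_i) - f(x_j)\bigr|^{p},
\qquad
S_c(f)\Big|_{p=2} \;=\; 2\, \mathbf{f}_c^{\top} L_c\, \mathbf{f}_c,
\quad L_c = D_c - W^{(c)},
\]

where \(W^{(c)}\) is the adjacency of the class-c intrinsic graph and \(D_c\) its degree matrix.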

The Algorithm TRIM. Inter-Manifold Regularization
Assumption: samples with similar labels generally lie in similar relative positions on the corresponding sub-manifolds.
Motivation:
1. Align the sub-manifolds of the different sample classes according to the labeled points and the graph structures.
2. Derive the correspondences in the aligned space using a nearest-neighbor technique.

The Algorithm TRIM. Reinforced Landmark Correspondence
- Initialize the inter-manifold graph using an ε-ball distance criterion on the sample labels.
- Reinforce the inter-manifold connections by iteratively applying the update rule given on the slide.
- Only the sample pairs with the top 20% largest similarity scores are selected as landmark correspondences.
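A minimal sketch of the initialization and selection steps only (the iterative reinforcement update is not reproduced in the transcript and is omitted here; the ε value, the similarity function, and the helper name are assumptions for illustration):

```python
import numpy as np

def init_landmark_correspondences(labels_a, labels_b, eps=1.0, keep_ratio=0.2):
    """labels_a, labels_b: regression labels of the labeled samples of two classes.
    Returns index pairs (i, j) used as landmark correspondences."""
    # epsilon-ball criterion on the labels: connect pairs with close label values
    diff = np.abs(labels_a[:, None] - labels_b[None, :])
    sim = np.where(diff <= eps, np.exp(-diff ** 2), 0.0)
    cand = np.argwhere(sim > 0)
    if len(cand) == 0:
        return cand
    # keep only the pairs with the top 20% largest similarity scores
    scores = sim[cand[:, 0], cand[:, 1]]
    k = max(1, int(keep_ratio * len(cand)))
    return cand[np.argsort(scores)[::-1][:k]]
```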

The Algorithm TRIM. Manifold Alignment
- Minimize the correspondence error on the landmark points.
- Preserve (hold) the intra-manifold structures.
- A global compactness regularization term is also included; its Laplacian matrix is built from a weight matrix whose entries are 1 if the two samples are of different classes and 0 otherwise.
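The formula images are missing from the transcript; as a hedged reconstruction in the usual semi-supervised manifold-alignment form (the embedding g, the landmark set \(\mathcal{C}\), and the weights \(\gamma, \eta\) are notation introduced here, not taken from the slide):

\[
\min_{g}\;
\sum_{(i,j)\in\mathcal{C}} \bigl(g(x_i) - g(x_j)\bigr)^2
\;+\; \gamma \sum_{c} \mathbf{g}_c^{\top} L_c\, \mathbf{g}_c
\;+\; \eta\, \mathbf{g}^{\top} L^{\mathrm{cpt}}\, \mathbf{g},
\qquad
W^{\mathrm{cpt}}_{ij} =
\begin{cases}
1, & x_i,\, x_j \text{ of different classes},\\
0, & \text{otherwise},
\end{cases}
\]

with the first term the correspondence error on the landmark points, the second holding the intra-manifold structures via the intrinsic-graph Laplacians \(L_c\), and \(L^{\mathrm{cpt}}\) the Laplacian of \(W^{\mathrm{cpt}}\) (the global compactness regularization).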

TRIM. Inter-Manifold Regularization
Concatenate the derived inter-manifold graphs into a single graph and use its Laplacian as the inter-manifold regularization term.
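In standard graph-regularization notation (a hedged restatement rather than the slide's exact formula), this term penalizes differences of the regression values across the discovered correspondences:

\[
\mathbf{f}^{\top} L^{\mathrm{inter}}\, \mathbf{f}
\;=\; \tfrac{1}{2}\sum_{i,j} W^{\mathrm{inter}}_{ij}\,\bigl(f(x_i) - f(x_j)\bigr)^2,
\qquad
L^{\mathrm{inter}} = D^{\mathrm{inter}} - W^{\mathrm{inter}},
\]

where \(W^{\mathrm{inter}}\) collects the inter-manifold graphs of all class pairs.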

Objective Deduction. TRIM Objective
The objective is the sum of four terms:
- Fitness term (loss on the labeled samples)
- RKHS norm
- Intra-manifold regularization
- Inter-manifold regularization
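A hedged reconstruction of the objective from the four listed terms (the weights \(\gamma_A, \gamma_I, \gamma_C\) and the normalizations are notation assumed here; the slide's exact formula is not reproduced in the transcript):

\[
f^{*} \;=\; \arg\min_{f \in \mathcal{H}_K}\;
\sum_{i=1}^{l}\bigl(f(x_i) - y_i\bigr)^2
\;+\; \gamma_A \lVert f\rVert_K^2
\;+\; \gamma_I \sum_{c} \mathbf{f}_c^{\top} L_c\, \mathbf{f}_c
\;+\; \gamma_C\, \mathbf{f}^{\top} L^{\mathrm{inter}}\, \mathbf{f}.
\]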

Solution. TRIM Solution
- By the generalized representer theorem, the minimizer of the objective admits an expansion over the labeled and unlabeled samples.
- Thus the minimization over the Hilbert space boils down to minimizing over the expansion coefficient vector.
- The minimizer is given in closed form (formula on the original slide), where K is the N × N Gram matrix of the labeled and unlabeled points over all the sample classes.
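A hedged sketch of that closed form, assuming the objective above and following the Laplacian regularized least-squares pattern (the scaling constants on the original slide may differ):

\[
f^{*}(x) = \sum_{i=1}^{N} \alpha_i^{*}\, k(x_i, x),
\qquad
\boldsymbol{\alpha}^{*} =
\Bigl(J K + \gamma_A I + \bigl(\gamma_I L^{\mathrm{intra}} + \gamma_C L^{\mathrm{inter}}\bigr) K\Bigr)^{-1} \mathbf{y},
\]

where \(J = \mathrm{diag}(1,\dots,1,0,\dots,0)\) selects the \(l\) labeled samples, \(\mathbf{y} = (y_1,\dots,y_l,0,\dots,0)^{\top}\), \(L^{\mathrm{intra}}\) is the block-diagonal matrix of the per-class intrinsic-graph Laplacians, and \(K\) is the Gram matrix.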

Solution. TRIM Generalization
- For out-of-sample data, the label can be estimated directly from the learned kernel expansion.
- Note that in this framework the class information of the incoming sample is not required in the prediction stage.
- The original (non-kernelized) version uses a separate out-of-sample estimate, given on the slide.
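A minimal sketch of the kernelized out-of-sample prediction (the RBF kernel and helper names are assumptions; alpha denotes the coefficients obtained in the training stage). No class label of the new sample is needed:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # pairwise squared distances, then Gaussian kernel
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def predict(X_new, X_train, alpha, gamma=1.0):
    # f(x) = sum_i alpha_i * k(x_i, x)
    return rbf_kernel(X_new, X_train, gamma) @ alpha
```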

Experiments. Nonlinear Two Moons
Figure: (a) original function value distribution; (b) traditional graph-Laplacian-regularized regression (separate regressors for the different classes); (c) two-class TRIM; (d) two-class TRIM on the RKHS. Note the difference in the area indicated by the rectangle. The relation between the function values and the angles in polar coordinates is quartic.

Experiments. Cyclone Dataset
Figure: regression on the Cyclone dataset: (a) original function values; (b) traditional graph-Laplacian-regularized regression (separate regressors for the different classes); (c) three-class TRIM; (d) three-class TRIM on the RKHS. Also shown: the class distribution of the Cyclone dataset.
Regression on one class fails for the traditional algorithm because of the lack of labeled samples. The cross-manifold guidance that can be utilized grows rapidly as the number of classes increases.

Experiments. Age Dataset (YAMAHA)
Figure: TRIM vs. traditional graph-Laplacian-regularized regression on the YAMAHA aging database. (left) Regression on the training set (closed-set evaluation). (right) Open-set evaluation of the kernelized regression on out-of-sample data.

Summary
- Multi-class transductive regression is a problem that often arises in applications but has received little attention.
- Class information is utilized in the training stage to boost performance, while the system does not require class information in the testing stage.
- Intra-class and inter-class graphs are constructed, and the corresponding regularization terms are introduced.
- The sub-manifolds of the different sample classes are aligned, and labels are propagated among samples from different classes.

Thank You!