# Label Distribution Learning and Its Applications

Xin Geng (耿新)
Pattern Learning and Mining (PALM) Lab (模式学习与挖掘实验室), School of Computer Science and Engineering, Southeast University (东南大学), Nanjing, China

Learning with Ambiguity
Label ambiguity, from less to more: single-label learning → multi-label learning → ?

Label Ambiguity
Multi-label learning asks: “What describes the instance?”
Example labels: cloud, sky, water, building.

More Ambiguity?
“How to describe the instance?”
Example: some cloud, mostly sky, much water, a bit of building.

How to learn?
- MLL: threshold the descriptions to obtain positive labels. Not a good choice!
- Label Distribution Learning (LDL): assign a real number to each label (importance, confidence level, ……). Keep more, learn more.

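LDL – Problem Formulation
Description degree: a real number $d_x^y$ is assigned to the label $y$ for the instance $x$.
WLOG, $d_x^y \in [0, 1]$ and $\sum_y d_x^y = 1$, where the sum runs over the complete label set.
Label distribution: the description degrees of all labels in the complete label set for one instance.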
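
To make the formulation concrete, here is a minimal Python sketch of a label distribution for the scene example (cloud, sky, water, building). The raw scores, the function names, and the threshold are hypothetical choices for illustration. The sketch shows that normalized description degrees sum to 1, and that thresholding them (the MLL view) throws away the relative importance of the labels.

```python
def to_label_distribution(scores):
    """Normalize non-negative raw scores into description degrees that sum to 1."""
    if any(s < 0 for s in scores.values()):
        raise ValueError("scores must be non-negative")
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one score must be positive")
    return {label: s / total for label, s in scores.items()}


def threshold_to_multilabel(distribution, threshold):
    """Collapse a label distribution into a set of positive labels (the MLL view)."""
    return {label for label, d in distribution.items() if d >= threshold}


# Hypothetical raw scores: mostly sky, much water, some cloud, a bit of building.
raw = {"sky": 5.0, "water": 3.0, "cloud": 1.5, "building": 0.5}
dist = to_label_distribution(raw)

# Thresholding keeps only membership; the degrees themselves are lost.
positives = threshold_to_multilabel(dist, threshold=0.1)
```

After thresholding, "sky" and "cloud" are both simply positive, although their description degrees differ by more than a factor of three.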

LDL – Algorithms
Two categories:
- Conditional probability mass function (classification): model the mapping from the instance x to the label distribution d via a conditional PMF.
- Multivariate support vector regression (regression): model the mapping from the instance x to the label distribution d via a multivariate support vector machine.

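Conditional Probability Mass Function
Learning from label distributions
Training set: $S = \{(x_1, D_1), \dots, (x_n, D_n)\}$, where $D_i$ is the label distribution of the instance $x_i$.
Goal: learn a conditional probability mass function $p(y \mid x; \theta)$ that can generate label distributions similar to $D_i$ given the instance $x_i$.
Similarity is measured by the K-L divergence, where $d_{x_i}^{y_j}$ is the description degree of the label $y_j$ for $x_i$:
$$\theta^* = \arg\min_\theta \sum_i \sum_j d_{x_i}^{y_j} \ln \frac{d_{x_i}^{y_j}}{p(y_j \mid x_i; \theta)}$$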

Conditional Probability Mass Function
Directly minimize the K-L divergence between the predicted and real label distributions (LDs), using a maximum entropy (MaxEnt) model as the conditional PMF.
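
The idea of fitting a MaxEnt model by minimizing the K-L divergence can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the IIS-LLD or BFGS-LLD algorithm: the optimizer is plain gradient descent, swapped in for simplicity, and the toy data, function names, learning rate, and step count are all hypothetical. The MaxEnt model is taken here to be a softmax over linear scores, for which the gradient of the K-L loss with respect to each weight is (predicted degree minus real degree) times the feature value.

```python
import math


def maxent_predict(theta, x):
    """Assumed MaxEnt form: p(y_k | x) is proportional to exp(theta_k . x)."""
    logits = [sum(t * v for t, v in zip(row, x)) for row in theta]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - top) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]


def kl_divergence(real, predicted):
    """K-L divergence from the real to the predicted label distribution."""
    return sum(d * math.log(d / p) for d, p in zip(real, predicted) if d > 0)


def total_kl(theta, X, D):
    """Sum of K-L divergences over the training set."""
    return sum(kl_divergence(d, maxent_predict(theta, x)) for x, d in zip(X, D))


def fit_maxent(X, D, lr=1.0, steps=2000):
    """Plain gradient descent on the mean K-L loss (not IIS or BFGS)."""
    n_labels, n_features = len(D[0]), len(X[0])
    theta = [[0.0] * n_features for _ in range(n_labels)]
    for _ in range(steps):
        grad = [[0.0] * n_features for _ in range(n_labels)]
        for x, d in zip(X, D):
            p = maxent_predict(theta, x)
            for k in range(n_labels):
                for r in range(n_features):
                    grad[k][r] += (p[k] - d[k]) * x[r]
        for k in range(n_labels):
            for r in range(n_features):
                theta[k][r] -= lr * grad[k][r] / len(X)
    return theta


# Hypothetical toy data: one real feature plus a constant bias, three labels.
X = [[0.0, 1.0], [1.0, 1.0]]
D = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]

initial_loss = total_kl([[0.0, 0.0] for _ in range(3)], X, D)
theta = fit_maxent(X, D)
final_loss = total_kl(theta, X, D)
```

On this toy problem the model has enough parameters to reproduce both training distributions, so the loss drops from its initial value toward zero.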

Conditional Probability Mass Function
IIS-LLD [Geng, Yin, and Zhou, TPAMI’13] [Geng, Smith-Miles, and Zhou, AAAI’10]

Conditional Probability Mass Function
BFGS-LLD [Geng and Ji, ICDMW’13]

Conditional Probability Mass Function
CPNN [Geng, Yin, and Zhou, TPAMI’13]

Multivariate Support Vector Regression
Two issues:
- How to output a distribution composed of multiple components? Multivariate Support Vector Regression (M-SVR) [Fernandez et al., TSP’04].
- How to constrain each component of the distribution within the range of a probability, i.e., [0, 1]? Model the regression by a sigmoid function.

Solve the two problems simultaneously: LDSVR [Geng and Hou, submitted to IJCAI’15] fits a sigmoid function to each component of the label distribution simultaneously by a support vector machine.

Multivariate Support Vector Regression
Sigmoid model, target function of SVR, and loss function.

Multivariate Support Vector Regression
The loss function: dimension-by-dimension insensitive zone.
Problem: examples falling into the area ρ1 are penalized once, while those falling into the area ρ2 are penalized twice.

Multivariate Support Vector Regression
The loss function: multivariate insensitive zone.
Problem: difficult to optimize and to apply the kernel trick.
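
The difference between the two insensitive zones can be illustrated with a short Python sketch. The function names, the value of ε, and the error vectors are hypothetical, and the quadratic penalty outside the multivariate zone is an assumption of this sketch rather than a form confirmed by the slides. The per-dimension loss sums independent one-dimensional ε-insensitive terms, so an error that exceeds ε in two dimensions is penalized twice. The multivariate loss looks only at the Euclidean norm of the whole error vector.

```python
import math


def per_dimension_loss(error, eps):
    """Sum of independent 1-D epsilon-insensitive losses (hyper-cubic zone)."""
    return sum(max(0.0, abs(e) - eps) for e in error)


def penalized_dimensions(error, eps):
    """Number of dimensions in which the error leaves the insensitive zone."""
    return sum(1 for e in error if abs(e) > eps)


def multivariate_loss(error, eps):
    """Epsilon-insensitive loss on the Euclidean norm (hyper-spherical zone).

    The squared penalty outside the zone is an assumption of this sketch.
    """
    u = math.sqrt(sum(e * e for e in error))
    return max(0.0, u - eps) ** 2


eps = 0.1
once = penalized_dimensions((0.3, 0.0), eps)   # outside the zone in one dimension
twice = penalized_dimensions((0.3, 0.3), eps)  # outside the zone in both dimensions

# A corner case: inside the hyper-cube but outside the hyper-sphere.
corner = (0.09, 0.09)
cube_loss = per_dimension_loss(corner, eps)
sphere_loss = multivariate_loss(corner, eps)
```

The corner case shows that the two zones also disagree about which examples count as errors at all, not only about how heavily they are penalized.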

Multivariate Support Vector Regression
The loss function: measure the loss by the distance from $z_i$ to another point $z'_i \in \mathbb{R}^c$ that gives the same output as the ground truth.

Multivariate Support Vector Regression
The loss function: replacing $u_i$ with $u'_i/4$; insensitive zone.
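
One plausible reading of the factor 1/4 is that it is the maximum slope of the sigmoid, so a distance measured before the sigmoid, divided by 4, bounds the distance measured after it. The Python sketch below checks this numerically. The definitions used here are assumptions, since the slide equations are not preserved: u is taken as the Euclidean distance between the ground-truth distribution and the sigmoid output, and u′ as the distance between the pre-sigmoid point z and the point z′ whose sigmoid equals the ground truth. The numeric values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(d):
    """Inverse of the sigmoid; requires 0 < d < 1."""
    return math.log(d / (1.0 - d))


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


# Hypothetical pre-sigmoid regression outputs and ground-truth degrees in (0, 1).
z = [0.2, -1.0, -2.0]
d = [0.6, 0.3, 0.1]

output = [sigmoid(v) for v in z]
z_target = [logit(v) for v in d]   # the point z' whose sigmoid equals d

u = euclidean(d, output)           # assumed u: distance in the output space
u_prime = euclidean(z_target, z)   # assumed u': distance before the sigmoid

# The sigmoid's slope never exceeds 1/4, so u <= u'/4 holds component-wise
# and therefore for the Euclidean norms as well.
bound_holds = u <= u_prime / 4.0
```

Under this reading, u′/4 is an upper bound on u that is expressed purely in terms of the pre-sigmoid points.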

Age Estimation
[Geng, Yin, and Zhou, TPAMI’13] [Geng, Smith-Miles, and Zhou, AAAI’10]
Aging is a slow and gradual process.
Faces at close ages look quite similar.
Can we use the neighboring ages to relieve the ‘lack of training samples’ problem?
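
One way to let a face image contribute to its neighboring ages is to replace the single age label with a label distribution centered on the chronological age. The Python sketch below assumes a discretized Gaussian for that distribution. The Gaussian shape, the value of sigma, the age range, and the function name are assumptions made for illustration.

```python
import math


def age_label_distribution(true_age, ages, sigma=2.0):
    """Discretized Gaussian over candidate ages, normalized to sum to 1."""
    weights = [math.exp(-((a - true_age) ** 2) / (2.0 * sigma ** 2)) for a in ages]
    total = sum(weights)
    return [w / total for w in weights]


ages = list(range(0, 70))          # hypothetical age range
dist = age_label_distribution(25, ages)

# The chronological age gets the highest description degree, and the
# neighboring ages receive smaller, symmetric degrees.
peak_age = ages[dist.index(max(dist))]
```

With this encoding, a face labeled 25 also acts as a weaker training example for ages 24 and 26, which is how neighboring ages can help when samples per age are scarce.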

Age Estimation – Experiment
[Geng, Yin, and Zhou, TPAMI’13] [Geng, Smith-Miles, and Zhou, AAAI’10]

Head Pose Estimation – Bivariate Label Distribution
[Geng and Xia, CVPR’14]

Head Pose Estimation – Experiment
[Geng and Xia, CVPR’14]

Multilabel Ranking for Natural Scene Images
[Geng and Luo, CVPR’14]
Multilabel ranking:
- A bipartition of the relevant (positive) and irrelevant (negative) labels
- A proper ranking over relevant labels

Multiple rankers: subjective, inconsistent “ground truth”.

Multilabel Ranking for Natural Scene Images
[Geng and Luo, CVPR’14]
Multilabel ranking by preference distribution: virtual labels act as split points between relevant and irrelevant labels.
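
The role of a virtual label as a split point can be sketched as a decoding step in Python. The preference values, the label names, and the `<virtual>` marker are hypothetical, and how the preference distribution is learned is outside this sketch. Labels are ranked by their preference degree; those ranked above the virtual label are treated as relevant, and those below it as irrelevant.

```python
def split_by_virtual_label(distribution, virtual="<virtual>"):
    """Rank labels by preference degree and split them at the virtual label."""
    ranked = sorted(distribution, key=distribution.get, reverse=True)
    split = ranked.index(virtual)
    relevant = ranked[:split]        # ranked list of relevant labels
    irrelevant = ranked[split + 1:]  # labels below the split point
    return relevant, irrelevant


# Hypothetical predicted preference distribution for one scene image.
preference = {
    "sky": 0.40,
    "water": 0.25,
    "cloud": 0.15,
    "<virtual>": 0.10,
    "building": 0.06,
    "desert": 0.04,
}
relevant, irrelevant = split_by_virtual_label(preference)
```

A single distribution therefore yields both parts of a multilabel ranking: the bipartition and the ranking over the relevant labels.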

Multilabel Ranking for Natural Scene Images – Experiment
[Geng and Luo, CVPR’14]

Crowd Counting [Wang, Zhang and Geng, Neurocomputing’15]


Pre-release Prediction of Crowd Opinion on Movies
[Geng and Hou, submitted to IJCAI’15]
Pre-release metadata → crowd rating distribution

Pre-release Prediction of Crowd Opinion on Movies – Experiment
[Geng and Hou, submitted to IJCAI’15]

Conclusion
Label distribution learning:
- More general framework than single-label and multi-label learning
- Deals with different importance of labels
- Matches certain problems better
- Needs special design

It is useful when:
- There is a natural measure of description degree
- There are multiple labeling sources for one instance
- The labels are correlated to each other
- ……