Improving Human Action Recognition using Score Distribution and Ranking
Minh Hoai Nguyen
Joint work with Andrew Zisserman


Inherent Ambiguity: When does an action begin and end?

Precise Starting Moment?
- Hands are being extended?
- Hands are in contact?

When Does the Action End?
- Action extends over multiple shots
- Camera shows a third person in the middle

Action Location as Latent Information
[Diagram: video clip with latent location of action -> consider subsequences -> HandShake classifier -> HandShake scores -> Max]
- In testing: the maximum is the recognition score
- In training: the maximum is used to update the classifier
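The latent-location idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the base classifier is taken to be linear over mean-pooled frame features, and the names `enumerate_subsequences`, `score_subsequence`, and `max_recognition_score` are hypothetical, not taken from the slides.

```python
import numpy as np

def enumerate_subsequences(n_frames, min_len):
    """All (start, end) index pairs of contiguous subsequences, end exclusive."""
    return [(s, e) for s in range(n_frames)
            for e in range(s + min_len, n_frames + 1)]

def score_subsequence(feats, start, end, w, b):
    """Assumed base classifier: linear score on the mean-pooled frame features."""
    return float(feats[start:end].mean(axis=0) @ w + b)

def max_recognition_score(feats, w, b, min_len=2):
    """Treat the action location as latent: keep the best-scoring subsequence."""
    spans = enumerate_subsequences(len(feats), min_len)
    scores = [score_subsequence(feats, s, e, w, b) for s, e in spans]
    best = int(np.argmax(scores))
    return scores[best], spans[best]

# Toy clip: five frames with one feature each; the "action" occupies frames 2-3.
feats = np.array([[0.0], [0.0], [5.0], [5.0], [0.0]])
score, span = max_recognition_score(feats, w=np.array([1.0]), b=0.0)
```

On this toy clip the maximum is attained on the span covering frames 2 and 3, which is the subsequence that would serve as the inferred action location.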

Poor Performance of Max
Mean Average Precision (higher is better):

Dataset      Whole   Max
Hollywood2   66.7    64.8
TVHID        66.6    65.0

Possible reasons:
- The learned action classifier is far from perfect
- The output scores are noisy
- The maximum score is not robust
Action recognition is … a hard problem

Can We Use Mean Instead?
[Diagram: video clip with latent location of action; considered subsequences -> HandShake classifier -> HandShake scores -> Mean]
On Hollywood2, Mean is generally better than Max, but not always:

                       Whole   Max    Mean
Hollywood2-Handshake   48.0    57.1   50.3

Another HandShake Example
- The proportion of HandShake is small
- For Whole and Mean, the Signal-to-Noise ratio is small
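A small numeric illustration of the dilution effect described above, using made-up per-subsequence scores (the values and the helper name `pooled_scores` are assumptions for illustration only): when the action covers a small fraction of the clip, the mean is dominated by non-action scores, while the max is unaffected by how long the action lasts.

```python
import numpy as np

def pooled_scores(scores):
    """Return the mean- and max-pooled versions of a list of classifier scores."""
    scores = np.asarray(scores, dtype=float)
    return {"mean": float(scores.mean()), "max": float(scores.max())}

# Hypothetical scores: -1.0 for non-action subsequences, +2.0 for action ones.
short_action = [-1.0] * 18 + [2.0] * 2    # action in 10% of the subsequences
long_action = [-1.0] * 10 + [2.0] * 10    # action in 50% of the subsequences

short_pooled = pooled_scores(short_action)
long_pooled = pooled_scores(long_action)
```

With these noise-free values the mean falls below zero for the short action although the max stays at 2.0. The example does not model noisy scores, which is where the max becomes unreliable.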

Proposed Method: Use the Distribution
[Diagram: video clip with latent location of action -> sampled subsequences -> base HandShake classifier -> HandShake scores -> sort -> distribution-based classification -> improved HandShake score]
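A minimal sketch of the pipeline on this slide, assuming subsequences are sampled uniformly at random and scored by some base classifier passed in as a function. The sampling scheme, the function name `sorted_score_distribution`, and its parameters are assumptions; the slides only state that subsequences are sampled and their scores sorted.

```python
import numpy as np

def sorted_score_distribution(feats, base_score_fn, n_samples=8, min_len=2, seed=0):
    """Score randomly sampled subsequences and return the scores in descending order."""
    rng = np.random.default_rng(seed)
    n = len(feats)
    scores = []
    for _ in range(n_samples):
        # Generator.integers uses an exclusive upper bound by default.
        length = int(rng.integers(min_len, n + 1))
        start = int(rng.integers(0, n - length + 1))
        scores.append(base_score_fn(feats[start:start + length]))
    return np.sort(np.asarray(scores, dtype=float))[::-1]

# Toy usage: frame features 0..9 and a stand-in "classifier" that averages them.
feats = np.arange(10.0).reshape(-1, 1)
dist = sorted_score_distribution(feats, lambda seg: float(seg.mean()))
```

The sorted vector has a fixed length regardless of the clip length, so it can be fed to a second-stage classifier as an ordinary feature vector.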

Learning Formulation
[Equation terms labelled on the slide: subsequence-score distribution, video label, weights, bias, hinge loss]
Weights for Distribution
- Emphasize the relative importance of classifier scores
- Special cases:
  - Case 1: equivalent to using Mean
  - Case 2: equivalent to using Max
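The two special cases can be checked numerically. The sketch below assumes the distribution-based score is a weighted sum of the sorted subsequence scores plus a bias, trained with a standard hinge loss. The slide's equation did not survive extraction, so this form, the weight settings, and the names `distribution_score` and `hinge_loss` are assumptions.

```python
import numpy as np

def distribution_score(sorted_scores, weights, bias=0.0):
    """Weighted sum of the sorted subsequence scores plus a bias."""
    return float(np.dot(weights, sorted_scores) + bias)

def hinge_loss(label, score):
    """Standard hinge loss for a label in {-1, +1}."""
    return max(0.0, 1.0 - label * score)

# Hypothetical scores, already sorted in descending order.
sorted_scores = np.array([2.0, 0.5, -0.5, -1.0])

# Uniform weights reproduce Mean; all weight on the top score reproduces Max.
mean_weights = np.full(4, 0.25)
max_weights = np.array([1.0, 0.0, 0.0, 0.0])

as_mean = distribution_score(sorted_scores, mean_weights)
as_max = distribution_score(sorted_scores, max_weights)
```

Learned weights can fall anywhere between these two extremes, which is how the weights express the relative importance of the classifier scores.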

Controlled Experiments
[Figure: synthetic video with a random action location]
Two controlled parameters:
- The action percentage
- The separation between non-action and action features
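One way such a synthetic video could be generated is sketched below. The slides do not give the feature model, so the one-dimensional unit-variance Gaussian frame features, the function name `make_synthetic_video`, and its parameter names are all assumptions made for illustration.

```python
import numpy as np

def make_synthetic_video(n_frames, action_fraction, separation, rng):
    """Synthetic clip: one contiguous action segment at a random location.

    Non-action frames are drawn from N(0, 1) and action frames from
    N(separation, 1); both choices are assumptions of this sketch.
    """
    n_action = max(1, int(round(action_fraction * n_frames)))
    start = int(rng.integers(0, n_frames - n_action + 1))
    feats = rng.normal(0.0, 1.0, size=n_frames)
    is_action = np.zeros(n_frames, dtype=bool)
    is_action[start:start + n_action] = True
    feats[is_action] += separation  # shift the action frames away from the background
    return feats, is_action

rng = np.random.default_rng(0)
feats, is_action = make_synthetic_video(100, action_fraction=0.2, separation=3.0, rng=rng)
```

Sweeping `action_fraction` and `separation` over a grid would give the kind of two-parameter study the slide describes.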

Controlled Experiments (continued)

Hollywood2 – Progress over Time
[Chart: best published results, Mean Average Precision (higher is better); annotations: 8.6%, 9.3%]

Hollywood2 – State-of-the-art Methods
[Chart: Mean Average Precision (higher is better); chart annotation: "same"]
Methods compared:
- Dataset Introduction (STIP + scene context)
- Deep Learning features
- Mined compound features
- Dense Trajectory Descriptor (DTD)
- Improved DTD (better motion est.)
- DTD + saliency

Results on TVHI Dataset
[Chart: Mean Average Precision (higher is better); annotation: 14.8%]

Weights for SSD classifiers

AnswerPhone Example 1

AnswerPhone Example 2

The End
