Action Recognition in Temporally Untrimmed Videos

Action Recognition in Temporally Untrimmed Videos Fatemeh Yazdiananari

Temporally Clipped vs. Unclipped
- Temporally clipped: videos contain only the action.
- Temporally unclipped: videos contain both the action and non-action footage.
- Temporally unclipped video is the real-world representation of videos.
- Action recognition needs to be adapted for it.

Unclipped Videos
- Contain more than the action.
- Determine both the temporal location of the action and the action itself.
- Make temporally clipped recognition methods suitable for unclipped data.
- We are considering 4 different methods.

The 4 Methods
1. Dividing a video into clips
2. Overlapping sliding windows in time
3. Spatiotemporal segmentation
4. Graphical model: capturing the relationship between clips
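Methods 1 and 2 both come down to choosing temporal windows over the frames of an untrimmed video. The sketch below is only an illustration: the function name `sliding_windows` and its parameters `window` and `stride` are assumptions, not names from the slides. Setting `stride` equal to `window` gives non-overlapping clips (method 1); a smaller `stride` gives overlapping windows (method 2).

```python
from typing import List, Tuple


def sliding_windows(num_frames: int, window: int, stride: int) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, that cover a video.

    Hypothetical helper: the slides do not specify window lengths or strides.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if num_frames <= 0:
        return []
    if num_frames <= window:
        # The whole video is shorter than one window: use it as a single clip.
        return [(0, num_frames)]
    starts = list(range(0, num_frames - window + 1, stride))
    # Add one last window flush with the end so trailing frames are not dropped.
    if starts[-1] + window < num_frames:
        starts.append(num_frames - window)
    return [(s, s + window) for s in starts]


if __name__ == "__main__":
    print(sliding_windows(8, 4, 4))   # non-overlapping clips
    print(sliding_windows(10, 4, 2))  # overlapping windows
```

Each window could then be described and classified with a method built for clipped videos, and the per-window scores used to locate the action in time.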

Baseline Action Recognition
- Uses DTF features: HOG, HOF, MBH, Trajectory
- Bag of Words model
- Feature vector: each video is represented by a histogram of visual words
- SVM is used as the classifier
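The step "each video is represented by a histogram of visual words" can be sketched as follows. This is a minimal illustration, assuming a codebook of visual words is already available (the slides do not say how it is learned) and assuming L1 normalization of the histogram; `nearest_word` and `bow_histogram` are hypothetical names.

```python
from typing import List, Sequence


def nearest_word(descriptor: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Index of the codebook entry closest to the descriptor (squared Euclidean)."""
    best_index, best_dist = 0, float("inf")
    for i, word in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def bow_histogram(descriptors: Sequence[Sequence[float]],
                  codebook: Sequence[Sequence[float]]) -> List[float]:
    """Count how often each visual word occurs, then L1-normalize (an assumption)."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        # A video with no descriptors gets an all-zero histogram.
        return [0.0] * len(codebook)
    return [c / total for c in counts]


if __name__ == "__main__":
    toy_codebook = [[0.0, 0.0], [10.0, 10.0]]
    toy_descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0], [0.0, 0.0]]
    print(bow_histogram(toy_descriptors, toy_codebook))
```

The resulting fixed-length histogram is what would be passed to the SVM, regardless of how many local descriptors the video produced.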

Preliminary Steps
- Download UCF101, DTF, and the three split files.
- Run and understand the SVM demos.
- Work on the UCF101 baseline.
- Write code to load the features, labels, and names of each video.

SVM demos
Ground truth of all data, both test and training data.

SVM demos
Small unfilled circles are the training data; filled circles are the test data. Only were classified as positive.

Code
- Feature matrix (DTF): (13320, 16000)
- Label vector: (13320, 1)
- Name vector: (13320, 1)
- Next step: reorganize these into one structure per video holding its feature, label, name, and index.
- This reorganization will help me run a comparison with the train/test splits and implement a multi-class SVM.
- Next week I will be able to run the baseline and get the accuracy percentage on UCF101.
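One possible shape for the per-video structure described above is sketched here. It is an assumption, not the author's code: `VideoRecord`, `build_records`, and `split_records` are hypothetical names, and the train/test split is assumed to be available as a set of video names read from a split file.

```python
from dataclasses import dataclass
from typing import List, Sequence, Set, Tuple


@dataclass
class VideoRecord:
    index: int            # row of the video in the original feature matrix
    name: str             # video name, used to match against a split file
    label: int            # action class label
    feature: List[float]  # bag-of-words histogram for the video


def build_records(features: Sequence[Sequence[float]],
                  labels: Sequence[int],
                  names: Sequence[str]) -> List[VideoRecord]:
    """Zip the parallel feature, label, and name arrays into one record per video."""
    if not (len(features) == len(labels) == len(names)):
        raise ValueError("features, labels, and names must have the same length")
    return [VideoRecord(i, name, label, list(feature))
            for i, (feature, label, name) in enumerate(zip(features, labels, names))]


def split_records(records: Sequence[VideoRecord],
                  train_names: Set[str]) -> Tuple[List[VideoRecord], List[VideoRecord]]:
    """Partition records into (train, test) by membership in train_names."""
    train = [r for r in records if r.name in train_names]
    test = [r for r in records if r.name not in train_names]
    return train, test


if __name__ == "__main__":
    # Toy data with made-up names; the real arrays are far larger.
    records = build_records([[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]],
                            [0, 1, 1],
                            ["video_a", "video_b", "video_c"])
    train, test = split_records(records, {"video_a", "video_c"})
    print(len(train), len(test))
```

Keeping the index alongside the name makes it easy to go back from a record to its row in the original matrix when comparing results across the three splits.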