Learning to classify the visual dynamics of a scene Nicoletta Noceti Università degli Studi di Genova Corso di Dottorato.


Learning to classify the visual dynamics of a scene Nicoletta Noceti Università degli Studi di Genova Corso di Dottorato “Sistemi e Servizi Cognitivi per l’Intelligenza di Ambiente e le Telecomunicazioni”

2 Outline of the presentation
- From past…
  - our 3D object recognition system
  - a demo
- …to future
  - Research proposal
  - Scenario and aims
  - Problem statement

3 3D Object Recognition
- We observe an object from slightly different viewpoints and exploit local features that are distinctive in space and stable in time to perform recognition
  - Obtain a 3D object recognition method based on a compact description of image sequences
  - Exploit temporal continuity and spatial information in both training and testing

4 Recognizing objects with ST models
- Video sequences
- Keypoint detection and description
- Keypoint tracking
- Cleaning procedure
- Building the spatio-temporal model (one for training, one for test)
- 2-stage matching procedure
- Object recognition

5 From sequence to spatio-temporal model

7 Time-invariant features
- We obtain a set of time-invariant features, each consisting of:
  - a spatial appearance descriptor, which is the average of all SIFT vectors along its trajectory
  - a temporal descriptor, which records when the feature first appeared in the sequence and when it was last observed
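
The two descriptors on this slide can be illustrated with a minimal sketch. It assumes a trajectory is available as a list of (frame index, descriptor vector) pairs; the names `TimeInvariantFeature` and `summarize_trajectory` and this input layout are hypothetical and are not taken from the original system.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class TimeInvariantFeature:
    appearance: List[float]  # mean descriptor over the whole trajectory
    first_frame: int         # frame where the keypoint first appeared
    last_frame: int          # frame where the keypoint was last observed


def summarize_trajectory(
    observations: Sequence[Tuple[int, Sequence[float]]],
) -> TimeInvariantFeature:
    """Collapse one keypoint trajectory into a single feature.

    `observations` is an assumed layout: (frame_index, descriptor) pairs,
    where the descriptor would be a SIFT vector in the setting of the slide.
    """
    if not observations:
        raise ValueError("a trajectory needs at least one observation")
    dim = len(observations[0][1])
    sums = [0.0] * dim
    for _, descriptor in observations:
        if len(descriptor) != dim:
            raise ValueError("all descriptors must have the same length")
        for i, value in enumerate(descriptor):
            sums[i] += value
    count = len(observations)
    frames = [frame for frame, _ in observations]
    return TimeInvariantFeature(
        appearance=[s / count for s in sums],
        first_frame=min(frames),
        last_frame=max(frames),
    )
```

Averaging gives one compact vector per trajectory, while the first and last frame indices keep the temporal extent that a matching step could later compare.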

8 Recognizing objects with ST models
- Video sequences
- Keypoint detection and description
- Keypoint tracking
- Cleaning procedure
- Building the spatio-temporal model (one for training, one for test)
- 2-stage matching procedure
- Object recognition

9 Matching of sequence models

10 Experiments and results
- Matching assessment
  - Illumination, scale and background changes
  - Changes in motion
  - Increasing the number of objects
- Object recognition on a 20-object dataset
- Recognition on a video stream

E. Delponte, N. Noceti, F. Odone and A. Verri, "Spatio temporal constraints for matching view-based descriptions of 3D objects," in WIAMIS 2007.

11 3D objects

12 Recognizing 20 objects
[Figure: recognition results for the objects Book, Bambi, Dewey, Sully, Box, Donald and Scrooge; the marker legend is not recoverable]
Number of experiments: 840
TP = 51, FN = 13, FP = 11, TN = 765
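
The four counts on this slide are enough to derive the usual summary rates. The sketch below is only arithmetic on those counts. The slide itself does not report precision, recall or accuracy, and the function name `detection_metrics` is hypothetical.

```python
def detection_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Derive summary rates from the four confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "total": total,
        "precision": tp / (tp + fp),    # fraction of detections that were correct
        "recall": tp / (tp + fn),       # fraction of true objects that were found
        "accuracy": (tp + tn) / total,  # fraction of all experiments decided correctly
    }


# Counts taken from the slide.
metrics = detection_metrics(tp=51, fn=13, fp=11, tn=765)
```

With these counts the total is 840, matching the stated number of experiments. Precision is 51/62 (about 0.82), recall is 51/64 (about 0.80) and accuracy is 816/840 (about 0.97). The high accuracy is driven largely by the 765 true negatives.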

13 Recognition on a video stream

14 But my research proposal is… “Learning to classify the visual dynamics of a scene”
- Idea: to combine classical computer vision techniques and learning approaches to understand and classify dynamic events
  - Modeling of common behaviours
  - Anomaly detection

15 State of the art
- In video surveillance there is a growing need for adaptive systems that can learn behaviour models from long-term observation
- In recent decades it has become accepted that many computer vision applications are better handled with a learning-from-examples approach
- For video description there are some promising works, but many research issues remain open

16 The cognitive cycle
Sensing → Analysis and Representation → Decision → Action
- Description of video content
- Event classification, to decide what counts as an anomalous event and how it is described
  - Today the decision is made by humans → automation

17 Analysis and representation
- The first part of our work will focus on video processing, studying robust spatio-temporal features to obtain a reliable description of video content
  - Low-level blob description exploring more features (shape, color, texture) than the ones usually used (position, area, perimeter)
  - Looking for a balance between computational complexity (real time needed…) and efficiency

18 Analysis and representation
- Why do we need a robust blob description?
  - Blobs will be tracked, but there are some problems to deal with:
    - Illumination changes
    - Velocity variations
    - Occlusion
    - Trajectory intersections
    - Local nature of the features
- A reliable blob description makes it possible to obtain a robust tracker
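
As an illustration of what a low-level blob description might contain, the sketch below computes the usual geometric features (position, area, perimeter) plus one appearance feature (mean color). Everything here is an assumption made for the example: the blob is given as a binary mask with a matching RGB image, both as nested lists, the perimeter is approximated by counting boundary pixels under 4-connectivity, and `describe_blob` is a hypothetical name. The slides do not specify the actual descriptor.

```python
from typing import Dict, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B)


def describe_blob(
    mask: Sequence[Sequence[int]],
    image: Sequence[Sequence[Pixel]],
) -> Dict[str, object]:
    """Describe one blob given a binary mask and an RGB image of equal size."""
    rows, cols = len(mask), len(mask[0])
    pixels = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]
    if not pixels:
        raise ValueError("mask contains no foreground pixels")
    area = len(pixels)

    def is_background(r: int, c: int) -> bool:
        # Anything outside the image counts as background.
        if not (0 <= r < rows and 0 <= c < cols):
            return True
        return not mask[r][c]

    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))
    # Boundary pixels: foreground pixels with at least one background 4-neighbour.
    perimeter = sum(
        1
        for r, c in pixels
        if any(is_background(r + dr, c + dc) for dr, dc in neighbours)
    )
    return {
        "area": area,
        "centroid": (
            sum(r for r, _ in pixels) / area,
            sum(c for _, c in pixels) / area,
        ),
        "bbox": (
            min(r for r, _ in pixels),
            min(c for _, c in pixels),
            max(r for r, _ in pixels),
            max(c for _, c in pixels),
        ),
        "perimeter": perimeter,
        "mean_color": tuple(
            sum(image[r][c][k] for r, c in pixels) / area for k in range(3)
        ),
    }
```

A color or texture term of this kind is one way a tracker could tell two blobs apart when their trajectories intersect, since position and area alone become ambiguous at that point.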

19 From representation to decision
- Blob trajectories built by tracking are the starting point of the classification step
- Idea: to integrate motion analysis with statistical learning techniques to exploit the knowledge coming from previously seen scenarios
  - Unsupervised learning
  - Manifold learning

20 Learning techniques
- Unsupervised learning
  - A machine learning method in which a model is fit to observations. It is distinguished from supervised learning by the fact that there is no a priori output
- Manifold learning
  - High-dimensional data can be difficult to interpret. One approach to simplification is to assume that the data of interest lie on a non-linear manifold embedded within the higher-dimensional space

21 From representation to decision
- …but our representation is not suitable for a learning framework…
  - At this point of the computation, an event is related to one (or more) blob trajectories
- Two possible solutions:
  - Appropriate handling of the description
  - Design of appropriate similarity functions
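
One reason trajectories are awkward for a learning framework is that they have different lengths, so they cannot be compared directly as fixed-size vectors. As an example of the second option, the sketch below uses dynamic time warping (DTW) to compare two 2D trajectories. DTW is a swapped-in, well-known technique chosen here for illustration. The slides do not say which similarity function was intended, and `dtw_distance` is a hypothetical name.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def dtw_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Dynamic time warping distance between two 2D trajectories.

    Points are aligned monotonically in time, so trajectories that follow
    the same path at different speeds stay close to each other.
    """
    if not a or not b:
        raise ValueError("both trajectories must be non-empty")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b waits
                cost[i][j - 1],      # b advances, a waits
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Because the alignment may repeat points, a blob that pauses or slows down is not penalised, which relates to the velocity variations mentioned earlier. A distance of this kind could then feed a kernel or a clustering step, though that choice is left open here.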

22 Case studies
- Today: medium-distance video of indoor scenes
- Long-term objective: wide-area monitoring
  - Analysis of complex crowded scenes (train stations, airports)
  - From blob tracking to the study of the whole scene motion (optical flow based)

23 Collaborations
- Imavis: IMAge and VISion
  - Development and software consulting company with headquarters in Bologna and a research and development office in Genova
- SINTESIS project: Sistema INTegrato per la Sicurezza ad Intelligenza diStribuita
  - DIBE, DIST, DISI, XXX (other?)

Thanks for your attention!