Liyan Zhang, Ronen Vaisenberg, Sharad Mehrotra, Dmitri V. Kalashnikov
Department of Computer Science, University of California, Irvine
This material is based upon work supported by NSF grants.

Sensor-Driven Applications
Numerous physical-world domains where sensors are used:
- intelligent transportation systems
- reconnaissance
- surveillance systems
- smart buildings
- smart grid
- ...

Smart Video Surveillance
We focus on smart video surveillance: video cameras are installed within buildings to monitor human activities.
Video collection: CS building at UC Irvine.
[Figure: Surveillance Video Database -> Semantic Extraction -> Event Database -> Query/Analysis]

Event Model
[Figure: Surveillance Video Database -> Semantic Extraction -> Event Database -> Query/Analysis]
Event model: an event has the attributes who, what, when, where, and other properties.
Semantic extraction components: face recognition, activity recognition, localization, temporal placement extraction.
Query examples:
- When did Sharad leave his office last Friday?
- Who was the last visitor to Sharad's office yesterday?
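The event model above can be sketched as a small record type with one of the example queries written against it. This is only an illustration: the field types, the activity labels ("enter_office", "leave_office"), the `owner` parameter, and the sample timestamps are assumptions, not part of the original slides.

```python
from dataclasses import dataclass, field
from datetime import date, datetime
from typing import Optional


@dataclass
class Event:
    who: Optional[str]   # person name, or None while the subject is unidentified
    what: str            # activity label (hypothetical vocabulary)
    when: datetime
    where: str
    other: dict = field(default_factory=dict)  # any other properties


def last_visitor(events, office, day, owner=None):
    """Return who made the latest 'enter_office' event at `office` on `day`."""
    visits = [
        e for e in events
        if e.what == "enter_office"
        and e.where == office
        and e.when.date() == day
        and e.who is not None
        and e.who != owner
    ]
    return max(visits, key=lambda e: e.when).who if visits else None


# Hypothetical sample data, for illustration only.
events = [
    Event("Alice", "enter_office", datetime(2012, 3, 1, 10, 0), "Sharad's office"),
    Event("Bob", "enter_office", datetime(2012, 3, 1, 15, 30), "Sharad's office"),
    Event("Sharad", "leave_office", datetime(2012, 3, 1, 17, 0), "Sharad's office"),
]
visitor = last_visitor(events, "Sharad's office", date(2012, 3, 1), owner="Sharad")
```

The query only works once the `who` attribute is filled in, which is the person identification problem the rest of the talk addresses.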

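Person Identification Challenge
Person identification supplies the "who" attribute of the event model.
[Figure: event model (who, what, when, where, other properties) with its extraction components (face recognition, activity recognition, localization, temporal placement extraction); three unidentified subjects, each to be labeled Bob, Alice, or other.]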

Traditional Approach
Face Detection -> Face Recognition
- Detects 70 faces per 1000 images
- 2~3 images per person
- Poor performance

Rationale for Poor Performance
Poor quality of data:
- No faces
- Small faces
- Low resolution
- Low temporal resolution
Effect of resolution:
- original: original performance
- 1/2 original: drops to 70%
- 1/3 original: drops to 30%
Effect of sampling rate:
- 1 frame/sec: original performance
- 1/2 frame/sec: drops to 53%
- 1/3 frame/sec: drops to 35%

Exploiting Contextual Information
[Figure: face recognition labels one subject as Bob and fails on another; the two subjects are linked by similar color, time continuity, and similar activity.]
Advantages:
- Additional evidence for people identification
- Contextual features may be robust to image quality
- Color, activity, location, time, ...

Contributions
- A robust approach to person identification (PI) in surveillance video by exploiting contextual features.
  - Significant improvements over a face-recognition-based technique
  - Tolerates degradation in video quality: lower resolution, frame rates, etc.
- Key observation: the PI problem in video can be mapped to the entity resolution (ER) problem extensively explored in the literature.
  - PI problem: subject in video -> real-world person
  - ER problem: object in database -> real-world name
- Exploits Relationship-based Data Cleaning (RelDC), developed for entity resolution [ACM TODS 2006]
[Figure: face detection and face recognition combined with contextual information]

RelDC: Entity Relationship Graphs
To solve the entity resolution problem, construct an entity relationship (ER) graph.
Entity resolution example:
⟨P1, 'Databases...', 'John Black', 'Don White'⟩
⟨P2, 'Multimedia...', 'Sue Grey', 'D. White'⟩
⟨P3, 'Title3...', 'Dave White'⟩
⟨P4, 'Title5...', 'Don White', 'Joe Brown'⟩
⟨P5, 'Title6...', 'Joe Brown', 'Liz Pink'⟩
⟨P6, 'Title7...', 'Liz Pink', 'D. White'⟩
Candidate entities: 'Don White', 'Dave White'
ER graph:
- Node: entities
- Edge: relationships
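A minimal sketch of how the publication records above could be turned into a graph with entities as nodes and authorship relationships as edges. The adjacency-set representation and the function name `build_er_graph` are assumptions for illustration; the ambiguous reference 'D. White' is kept as its own node until it is resolved.

```python
from collections import defaultdict

# Publication records from the slide: paper id -> author references.
papers = {
    "P1": ["John Black", "Don White"],
    "P2": ["Sue Grey", "D. White"],
    "P3": ["Dave White"],
    "P4": ["Don White", "Joe Brown"],
    "P5": ["Joe Brown", "Liz Pink"],
    "P6": ["Liz Pink", "D. White"],
}


def build_er_graph(papers):
    """Return an undirected graph as a dict: node -> set of neighbor nodes."""
    graph = defaultdict(set)
    for paper, authors in papers.items():
        for author in authors:
            # One "wrote" relationship edge between each paper and author.
            graph[paper].add(author)
            graph[author].add(paper)
    return dict(graph)


er_graph = build_er_graph(papers)
```

In this graph 'D. White' reaches 'Don White' through co-authors ('D. White' - P6 - 'Liz Pink' - P5 - 'Joe Brown' - P4 - 'Don White'), while 'Dave White' is connected only to P3.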

RelDC Framework for Entity Resolution
For each choice node r:
- Assign values to the weights w_r1, w_r2, ..., w_rN
- The value of w_ri is the degree of belief that y_ri is the correct option for r
- Pick the option with the max w_ri as the answer for reference r
- Compute w_r1, w_r2, ..., w_rN by analyzing the connection strength between nodes in the graph
- Connection strength can be based on a variety of factors:
  - feature-based similarity
  - correlations
  - association
  - relationship analysis
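The selection step above (pick the option with the max w_ri) can be sketched as follows. The weight values are made up for illustration, and `resolve_references` is a hypothetical helper name.

```python
def resolve_references(weights_by_reference):
    """For each reference r, return the option y_ri with the largest weight w_ri."""
    return {
        ref: max(options, key=options.get)
        for ref, options in weights_by_reference.items()
    }


# Hypothetical weights: reference -> {option: degree of belief}.
weights = {
    "D. White in P2": {"Don White": 0.3, "Dave White": 0.7},
    "D. White in P6": {"Don White": 0.8, "Dave White": 0.2},
}
answers = resolve_references(weights)
```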

Connection between PI and Entity Resolution
- Person identification: subject in video -> real-world person name
- Entity resolution: object in database -> real-world object name
Entity resolution example:
⟨P1, 'Databases...', 'John Black', 'Don White'⟩
⟨P2, 'Multimedia...', 'Sue Grey', 'D. White'⟩
⟨P3, 'Title3...', 'Dave White'⟩
⟨P4, 'Title5...', 'Don White', 'Joe Brown'⟩
⟨P5, 'Title6...', 'Joe Brown', 'Liz Pink'⟩
⟨P6, 'Title7...', 'Liz Pink', 'D. White'⟩
Candidate entities: 'Don White', 'Dave White'
Person identification example: Shot 1, Shot 2, Shot 3; candidate persons: Bob, Alice

Constructing the ER Graph for PI
[Flowchart components:
- Input: Surveillance Videos
- Low-level feature extraction: Video Segmentation, Face Recognition, Foreground Color, Bounding Box
- Intermediate results: Shots, FR Result, Color Histogram
- Event Detection: Activity
- Output: PI relationship graph]

Low-Level Feature Extraction
- Temporal segmentation: videos are split into shots using time continuity and color continuity; each shot has a start, an end, and a key frame
- Foreground color extraction: 64-bin color histogram
- Face detection and recognition: FR(image, person) = 1
- Bounding box and centroid extraction
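One common way to obtain a 64-bin color histogram is to quantize each RGB channel to 4 levels (4 x 4 x 4 = 64 bins). The slides do not say how the bins are defined, so the quantization below is an assumption, as is the function name.

```python
def color_histogram_64(pixels):
    """Return a normalized 64-bin histogram for an iterable of (r, g, b) pixels.

    Assumes 8-bit channels (0..255), each quantized to 4 levels.
    """
    hist = [0.0] * 64
    count = 0
    for r, g, b in pixels:
        index = (r // 64) * 16 + (g // 64) * 4 + (b // 64)
        hist[index] += 1
        count += 1
    if count:
        hist = [h / count for h in hist]  # normalize so the bins sum to 1
    return hist


# Hypothetical foreground pixels: two red, one blue, one near-black.
sample_hist = color_histogram_64([(255, 0, 0), (255, 0, 0), (0, 0, 255), (10, 10, 10)])
```

Normalizing by the pixel count makes histograms comparable between subjects whose bounding boxes differ in size.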

Activity Detection
- Walking direction: from changes of bounding boxes and centroids
- Activity detection: from appear and disappear locations
  - Example labels: "Downside of Corridor", "Walking to Office in Corner"
- Activity is a strong signal in person identification
  - Observation: a subject enters/exits Bob's office frequently.
  - High probability: this subject is Bob.
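A rough sketch of how a walking direction could be derived from the change in centroid position over a shot. The direction labels, the `min_shift` threshold, and the use of only the first and last centroid are assumptions made for illustration, not the method from the slides.

```python
def walking_direction(centroids, min_shift=5.0):
    """Classify movement from a list of (x, y) centroids, one per frame.

    Image coordinates are assumed, so y grows downward.
    """
    if len(centroids) < 2:
        return "unknown"
    dx = centroids[-1][0] - centroids[0][0]
    dy = centroids[-1][1] - centroids[0][1]
    if max(abs(dx), abs(dy)) < min_shift:
        return "stationary"
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"


direction = walking_direction([(10, 100), (40, 102), (90, 105)])
```

Combined with where a subject appears and disappears, such a direction label could be mapped to an activity such as walking to a particular office.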

PI Graph
[Figure: PI graph. Nodes: subjects x11, x12, x2, x3; shots s1, s2, s3; persons Alice and Bob; activities (act1, act3, ...); times t11, t12, t2, t3; one color histogram H per subject. Choice edges between subjects and persons carry the weights w12, w21, w22, w31, w32, ...]
Edge annotations:
- FR result: Subject 2 is "Bob"
- Color similarity: Euclidean distance
- Activity edges (e.g. 0.5): probability of the activity determining the entity
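The color-similarity edges are based on Euclidean distance between color histograms. The sketch below shows one way to turn that distance into a similarity in [0, 1]; the conversion (dividing by the maximum possible distance between two normalized histograms) is an assumption, since the slides only name the distance.

```python
import math


def euclidean_distance(h1, h2):
    """Euclidean distance between two equal-length histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


def color_similarity(h1, h2):
    """Map distance to a similarity in [0, 1].

    Assumes both histograms are normalized (non-negative, summing to 1),
    so their distance is at most sqrt(2).
    """
    return 1.0 - euclidean_distance(h1, h2) / math.sqrt(2)


# Hypothetical 3-bin histograms, for illustration only.
same = color_similarity([0.5, 0.5, 0.0], [0.5, 0.5, 0.0])
disjoint = color_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```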

How to Compute Weights?
Context Attraction Principle (CAP): if a subject is more strongly connected to one candidate than to another candidate, then the weight of the first option should be larger than the weight of the second.
[Figure: PI graph with subjects x11, x12, x2, x3; shots s1, s2, s3; persons Alice and Bob; activities; color histograms; choice weights w31 and w32.]
Example: who is Subject 3, Alice or Bob?
- Delete edges with similarity below 0.3
- Bob: 3 paths
- Alice: 1 path
- So: w_31 < w_32
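The example above can be reproduced on a toy graph: drop edges whose similarity is below 0.3, then count the simple paths from Subject 3 to each candidate. The edge list, the node names beyond those on the slide, and the path-length limit are made up so that the counts match the slide (Bob: 3 paths, Alice: 1 path); this is not the graph from the figure.

```python
from collections import defaultdict


def prune(edges, threshold=0.3):
    """Build an undirected graph, keeping only edges with sim >= threshold."""
    graph = defaultdict(set)
    for u, v, sim in edges:
        if sim >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return dict(graph)


def count_simple_paths(graph, source, target, max_len):
    """Count simple paths from source to target with at most max_len edges."""
    def walk(node, visited, length):
        if node == target:
            return 1
        if length == max_len:
            return 0
        return sum(
            walk(nxt, visited | {nxt}, length + 1)
            for nxt in graph.get(node, ())
            if nxt not in visited
        )
    return walk(source, {source}, 0)


# Hypothetical edges: (node, node, similarity or probability).
edges = [
    ("x3", "x2", 0.8), ("x2", "Bob", 1.0),
    ("x3", "act3", 1.0), ("act3", "Bob", 0.5),
    ("x3", "x12", 0.6), ("x12", "Bob", 1.0),
    ("x3", "x11", 0.4), ("x11", "Alice", 1.0),
    ("x2", "x11", 0.1),  # below the threshold, so it is deleted
]
graph = prune(edges)
paths_to_bob = count_simple_paths(graph, "x3", "Bob", max_len=3)
paths_to_alice = count_simple_paths(graph, "x3", "Alice", max_len=3)
```

Counting paths is only a proxy here; the connection-strength model also accounts for how strong each path is.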

Computing Connection Strength
Phase 1: Discover connections
- Find all L-short simple u-v paths
- Bottleneck
- Graph-theoretic techniques to optimize
Phase 2: Measure the strength
- In the discovered connections
- Many c(u,v) models are possible
- Random-walks-in-graphs models
Overall generic formula: [formula image]
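A sketch of the two phases under simplifying assumptions. Phase 1 enumerates all simple u-v paths with at most L edges. Phase 2 uses a basic random-walk model in which the strength of a path is the probability that a walk starting at u follows it when each neighbor is chosen uniformly, and c(u,v) is the sum of these path strengths. This particular c(u,v) model and the toy graph are assumptions; the slides state only that many models are possible.

```python
def l_short_simple_paths(graph, u, v, max_len):
    """Phase 1: yield every simple path from u to v with at most max_len edges."""
    stack = [[u]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == v:
            yield path
            continue
        if len(path) - 1 == max_len:
            continue
        for nxt in graph.get(node, ()):
            if nxt not in path:
                stack.append(path + [nxt])


def connection_strength(graph, u, v, max_len):
    """Phase 2: sum, over discovered paths, of the probability of walking the path."""
    total = 0.0
    for path in l_short_simple_paths(graph, u, v, max_len):
        prob = 1.0
        for node in path[:-1]:
            prob /= len(graph[node])  # uniform choice among neighbors
        total += prob
    return total


# Hypothetical toy graph: node -> list of neighbors.
toy = {
    "x3": ["x2", "x11"],
    "x2": ["x3", "Bob"],
    "x11": ["x3", "Alice", "Bob"],
    "Bob": ["x2", "x11"],
    "Alice": ["x11"],
}
c_bob = connection_strength(toy, "x3", "Bob", max_len=3)
c_alice = connection_strength(toy, "x3", "Alice", max_len=3)
```

Here Subject x3 reaches Bob over two short paths and Alice over one, so c_bob exceeds c_alice. The exhaustive enumeration also shows why Phase 1 is the bottleneck: the number of simple paths grows quickly with L.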

Using Connection Strength to Determine Weights
Determine weights
- According to the CAP
- Proportional to c(x_r, y_rj)
Optimization problem
- Slack variables
- Solver
- Iterative solution
- Interpret weights
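The "proportional to c(x_r, y_rj)" rule can be illustrated with a simple normalization, assuming the weights of one reference should sum to 1. This sketch deliberately leaves out the optimization problem, slack variables, and iterative solution mentioned above; the strength values are made up.

```python
def weights_from_strengths(strengths):
    """Set each w_rj proportional to c(x_r, y_rj), normalized to sum to 1."""
    total = sum(strengths.values())
    if total == 0:
        # No evidence at all: fall back to equal weights (an assumption).
        return {option: 1.0 / len(strengths) for option in strengths}
    return {option: c / total for option, c in strengths.items()}


# Hypothetical connection strengths for Subject 3.
w3 = weights_from_strengths({"Alice": 1.0, "Bob": 3.0})
```

With these numbers the weight for Bob is three times the weight for Alice, which is consistent with the CAP.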

Dealing with "Others"
- Usually, after computing the weights, we choose the option with the max value.
- However, in our dataset, the weight for "others" is always large for each subject in the video, because there is a higher probability that the subject is not a person we are interested in.
- How do we solve this? Learn a classifier on the output of RelDC to decide between "others" and the other choices.
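The slides do not say which classifier is learned, so the sketch below uses a deliberately simple stand-in: a threshold on the largest weight among the named people, chosen to maximize accuracy on labeled training subjects. All names and numbers are hypothetical.

```python
def learn_threshold(samples):
    """samples: list of (best_named_weight, is_named_person) pairs.

    Returns the candidate threshold with the highest training accuracy.
    """
    best_t, best_correct = 0.0, -1
    for t in sorted({w for w, _ in samples}):
        correct = sum((w >= t) == label for w, label in samples)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


def classify(weights, threshold, other_label="others"):
    """Return the best named person if its weight clears the threshold."""
    named = {o: w for o, w in weights.items() if o != other_label}
    best = max(named, key=named.get)
    return best if named[best] >= threshold else other_label


# Hypothetical training data: (largest named weight, subject is a named person).
training = [(0.10, False), (0.15, False), (0.30, True),
            (0.20, False), (0.35, True), (0.40, True)]
threshold = learn_threshold(training)
label_a = classify({"Alice": 0.05, "Bob": 0.32, "others": 0.63}, threshold)
label_b = classify({"Alice": 0.10, "Bob": 0.20, "others": 0.70}, threshold)
```

In both test subjects the weight for "others" is the largest, yet the first one is still labeled Bob because Bob's weight is high relative to what the training data shows for named people.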

Experiments
Dataset:
- 2 weeks of surveillance video from 2 cameras in the CS building of UC Irvine
- Sampling rate: 1 frame/sec
- Frame resolution: 704 x 480
- 1 week of data for training, 1 week for testing
- About 50 individuals in total
- 4 people manually labeled
Measurement:
- For each person, select the top K subjects and compute precision, recall, and F-measure
Comparison with the KNN method:
- Precision and recall with K increasing from 1 to 20
- F-measure when K = 20: our approach 0.76, KNN 0.24
[Plot: precision and recall of our approach and of KNN as K increases]
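The measurement can be written out as follows: for one person, take the top K ranked subjects and compare them with the subjects that truly are that person. The subject ids below are made up; only the metric definitions are standard.

```python
def precision_recall_f(ranked_subjects, relevant, k):
    """Precision, recall, and F-measure of the top-k ranked subjects."""
    top_k = ranked_subjects[:k]
    hits = sum(1 for s in top_k if s in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical ranking for one person and the true subject set.
ranked = ["s1", "s2", "s3", "s4", "s5"]
truth = {"s1", "s3", "s6", "s7"}
p, r, f = precision_recall_f(ranked, truth, k=4)
```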

Experiments
To test the robustness of our approach, we degrade the resolution and sampling rate of the frames.
Performance of activity detection:
- drops when the sampling rate is reduced from 1 frame/sec to 1/2 and 1/3 frame/sec
- many important frames are lost as the sampling rate decreases
- a decrease in resolution does not affect the performance of activity detection
Person identification result (F-measure when K = 20):
- drops with the reduction of resolution and sampling rate
- however, the PI result even with the lowest resolution and sampling rate is much better than the baseline result (naive approach)
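The two kinds of degradation can be sketched on plain Python lists. Keeping every n-th frame and every n-th pixel is an assumption about how the degradation might be done; the slides do not describe the procedure.

```python
def reduce_sampling_rate(frames, keep_every):
    """Keep every keep_every-th frame, e.g. 3 turns 1 frame/sec into 1/3 frame/sec."""
    return frames[::keep_every]


def reduce_resolution(frame, factor):
    """Keep every factor-th pixel in both directions of a row-major frame."""
    return [row[::factor] for row in frame[::factor]]


# Hypothetical data: six frame ids and one tiny 4 x 6 frame.
frames = list(range(6))
tiny_frame = [[10 * row + col for col in range(6)] for row in range(4)]
fewer_frames = reduce_sampling_rate(frames, 3)
smaller_frame = reduce_resolution(tiny_frame, 2)
```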

Conclusion and Future Work
Conclusion
- Task: person identification in the context of smart video surveillance
- Convert an indoor person identification problem into an entity resolution problem
- Apply RelDC to solve the PI problem
- Experiments demonstrate the effectiveness and robustness of the approach
Future work
- Mine frequent activity patterns to identify a person
- Construct a multi-sensor model
- Identify people in real time
