Published by Jasmin Doyle. Modified over 9 years ago.
1
CAMEO: Year 1 Progress and Year 2 Goals Manuela Veloso, Takeo Kanade, Fernando de la Torre, Paul Rybski, Brett Browning, Raju Patil, Carlos Vallespi, Betsy Ricker
2
CAMEO Internals
Raw video is converted to a continuous mosaic.
Faces are detected and motions of people are tracked.
Action data is logged for analysis and replay.
Meeting simulator used for CAMEO learning validation. Live person actions are used to seed the meeting simulation environment.
Detected person motions are used to classify actions.
Actions of the group are used to classify global meeting state.
Classified actions are reported to other agents.
3
CAMEO’s Connection to other CALO Agents CAMEO is an example of a physical event capture system. Systems such as these transmit state information about people to the CALO timeline server. Individualized CALO agents can access this information to obtain updates about their individual users.
4
Inferring Meeting State with CAMEO: Overview
1) CAMEO observes activities of people in the meeting
2) Raw visual motion is segmented into discrete actions
3) High-level meeting state is inferred from the aggregate actions of the group
5
Training CAMEO to Recognize Human Actions
A series of raw person actions is tracked and recorded by CAMEO. Action data is manually labeled.
[Figure: tracked person horizontal displacement and tracked person vertical displacement]
Significant statistics of the raw actions are extracted: relative displacement means and variances for each action class. Action data is now represented as a learned generalization.
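The training step above can be sketched as follows. This is a minimal illustration, not CAMEO's actual code: the function name `fit_action_stats`, the sample values, and the choice of population variance are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pvariance

def fit_action_stats(samples):
    """Summarize manually labeled displacements per action class.

    samples: iterable of (label, dx, dy) tuples, where dx and dy are the
    relative horizontal and vertical displacements of a tracked person.
    Returns {label: {"mean": (mx, my), "var": (vx, vy)}}.
    """
    by_label = defaultdict(list)
    for label, dx, dy in samples:
        by_label[label].append((dx, dy))

    stats = {}
    for label, displacements in by_label.items():
        xs = [d[0] for d in displacements]
        ys = [d[1] for d in displacements]
        stats[label] = {
            "mean": (mean(xs), mean(ys)),
            "var": (pvariance(xs), pvariance(ys)),  # population variance (assumption)
        }
    return stats

# Made-up labeled data; positive dy is assumed to mean upward motion.
labeled = [
    ("stand", 0.0, 4.0), ("stand", 1.0, 6.0),
    ("sit", 0.0, -5.0), ("sit", -1.0, -7.0),
]
stats = fit_action_stats(labeled)
```

Each action class is reduced to a handful of numbers, which is the sense in which the labeled data becomes a learned generalization.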
6
Action Recognition
Person action sequences are represented as a simple finite state machine. State transitions are encoded in a dynamic Bayesian network, which infers the current person state as a function of observed human activity and the previous state.
[Figure labels: Dynamic Bayesian Network; Person Action State Machine]
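One way to picture this inference is a recursive Bayesian filter over the person states. The sketch below is an assumption-laden illustration, not the network used in CAMEO: the four state names, the transition table `TRANS`, the Gaussian emission parameters `EMIT`, and the use of vertical displacement alone are all hypothetical.

```python
import math

STATES = ["sitting", "stand", "standing", "sit"]

# Hypothetical transition probabilities P(next | current); each row sums to 1.
TRANS = {
    "sitting":  {"sitting": 0.9, "stand": 0.1, "standing": 0.0, "sit": 0.0},
    "stand":    {"sitting": 0.0, "stand": 0.5, "standing": 0.5, "sit": 0.0},
    "standing": {"sitting": 0.0, "stand": 0.0, "standing": 0.9, "sit": 0.1},
    "sit":      {"sitting": 0.5, "stand": 0.0, "standing": 0.0, "sit": 0.5},
}

# Hypothetical (mean, variance) of vertical displacement for each state.
EMIT = {
    "sitting": (0.0, 1.0), "stand": (5.0, 4.0),
    "standing": (0.0, 1.0), "sit": (-5.0, 4.0),
}

def gaussian(x, mu, var):
    """Gaussian density of x under mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def filter_step(belief, dy):
    """Update the belief from the previous state and the new observation dy."""
    predicted = {s: sum(belief[p] * TRANS[p][s] for p in STATES) for s in STATES}
    unnorm = {s: predicted[s] * gaussian(dy, *EMIT[s]) for s in STATES}
    total = sum(unnorm.values())
    return {s: v / total for s, v in unnorm.items()}

def classify(observations, start="sitting"):
    """Return the most probable state after each observation."""
    belief = {s: 1.0 if s == start else 0.0 for s in STATES}
    path = []
    for dy in observations:
        belief = filter_step(belief, dy)
        path.append(max(belief, key=belief.get))
    return path

path = classify([0, 0, 5, 5, 0, 0, -5, 0])
```

Because "sitting" and "standing" share the same emission model here, only the previous state tells them apart, which is why the state machine matters.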
7
Classification of Person State in a Meeting
[Figure: classified person state (Standing, Stand, Sitting, Sit) versus time in seconds]
Example of person state classification: here, the states of a person are correctly classified by the Bayesian network. The parameters of the activity data are learned from previously recorded meeting data.
8
Classification of the Meeting State Global meeting state is defined by the aggregate activities of every person attending the meeting. An example of a global meeting finite state machine. CAMEO can be set up to recognize different meeting types. Classification of meeting state using the defined state machine.
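As a rough illustration of how aggregate person states could drive a meeting-level state machine, the sketch below uses a deliberately simple rule set. The rules, the function names, and the "No Meeting" label are hypothetical; the slides do not specify the actual state machine.

```python
def classify_meeting(person_states):
    """Map {person: state} to a meeting state using illustrative rules only."""
    if not person_states:
        return "No Meeting"
    standing = sum(1 for s in person_states.values() if s == "standing")
    # Assumed rule: exactly one person standing in a group means a presentation.
    if standing == 1 and len(person_states) > 1:
        return "Presentation"
    return "General Discussion"

def segment(frames):
    """frames: list of (timestamp, {person: state}).

    Returns a log of (timestamp, meeting_state), one entry per state change.
    """
    log, previous = [], None
    for timestamp, states in frames:
        current = classify_meeting(states)
        if current != previous:
            log.append((timestamp, current))
            previous = current
    return log

frames = [
    ("13:12:12", {"Jim": "sitting", "Wendy": "sitting"}),
    ("13:15:00", {"Jim": "sitting", "Wendy": "sitting"}),
    ("13:19:45", {"Jim": "standing", "Wendy": "sitting"}),
    ("13:40:23", {"Jim": "sitting", "Wendy": "sitting"}),
]
log = segment(frames)
```

Swapping in a different `classify_meeting` is one way to model the idea that CAMEO can be set up for different meeting types.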
9
Generating Meeting Summary
Meeting event log becomes summary
Low and high-level events can be organized into a hierarchy
Meeting can be viewed at any requested level of detail from summary to captured video (and eventually audio)
2004-02-03 Project Status Report
13:04:05 Meeting Start
13:12:12 General Discussion
13:19:45 Presentation
13:24:23 General Discussion
13:29:29 Meeting End
10
Generating Meeting Summary
Meeting event log becomes summary
Low and high-level events can be organized into a hierarchy
Meeting can be viewed at any requested level of detail from summary to captured video (and eventually audio)
2004-02-03 Project Status Report
13:04:05 Meeting Start
13:12:12 General Discussion
13:19:45 Presentation
  13:19:45 Jim stands
  13:19:50 Jim walks to podium
  13:20:00 Jim speaks
  13:22:04 *Unknown* speaks
  13:22:45 Jim speaks
  13:30:23 Wendy stands
  13:30:37 Wendy walks to podium
  13:30:42 Wendy speaks
  13:33:04 Wendy sits down
  13:33:04 Jim speaks
  13:38:50 Jim sits down
13:40:23 General Discussion
13:50:29 Meeting End
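The two views of the same log can be thought of as one event list rendered at different levels of detail. The sketch below assumes each event carries a numeric level (0 for meeting states, 1 for person actions); that encoding and the `render` function are illustrative choices, not something the slides specify.

```python
# (timestamp, level, description); a subset of the log above.
EVENTS = [
    ("13:04:05", 0, "Meeting Start"),
    ("13:12:12", 0, "General Discussion"),
    ("13:19:45", 0, "Presentation"),
    ("13:19:45", 1, "Jim stands"),
    ("13:19:50", 1, "Jim walks to podium"),
    ("13:20:00", 1, "Jim speaks"),
    ("13:40:23", 0, "General Discussion"),
    ("13:50:29", 0, "Meeting End"),
]

def render(events, max_level):
    """Return log lines for all events at or above the requested detail level."""
    lines = []
    for timestamp, level, text in events:
        if level <= max_level:
            lines.append("  " * level + f"{timestamp} {text}")
    return lines

summary = render(EVENTS, max_level=0)   # high-level states only
detailed = render(EVENTS, max_level=1)  # states plus person actions
```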
11
Protecting Individual’s Privacy Issues
Recognition is voluntary. CAMEO only recognizes people it has registered. We can digitally represent video logs so faces are distorted or represented only as shapes.
[Figures: raw video with tracking information; stored video log after privacy filtering]
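One simple way to distort a face region is block averaging (pixelation). The sketch below works on a grayscale image stored as nested lists; the function name `pixelate`, the box convention, and the block size are assumptions for illustration, and the slides do not say which filter CAMEO applies.

```python
def pixelate(image, box, block=2):
    """Return a copy of image with the box region replaced by block averages.

    image: list of rows of integer gray levels.
    box: (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = box
    out = [row[:] for row in image]  # leave the raw frame untouched
    for by in range(top, bottom, block):
        for bx in range(left, right, block):
            ys = range(by, min(by + block, bottom))
            xs = range(bx, min(bx + block, right))
            avg = sum(image[y][x] for y in ys for x in xs) // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = avg
    return out

frame = [
    [0, 10, 1, 2],
    [20, 30, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
# Hypothetical face bounding box covering the top-left 2x2 pixels.
filtered = pixelate(frame, box=(0, 0, 2, 2))
```

Only the filtered copy would go into the stored log; pixels outside the tracked face box are unchanged.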
12
Some ways CALO Agents could use CAMEO Data
What meetings happened when?
Who was at the meeting?
Who was sitting, standing, or speaking?
Where were people looking?
Who was talking?
What were people doing?
Who was pointing at what?
What happened during the formal presentation?
What happened during the general discussion?
What is a general/detailed summary of the meeting?
What did person 'x' contribute to the meeting?
How to replay a meeting from a specific point in time?
How to replay specific parts of the meeting?
13
Some ways CALO Agents could use CAMEO Data What meetings happened when? –When a meeting starts, CAMEO can post an event to the timeline server indicating the start time of the meeting. By querying the timeline server for events with the appropriate tag, CALO agents could determine when the various meetings started and obtain other information about them, such as what each meeting was about.
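The post-and-query pattern described here might look like the sketch below. The real timeline server interface is not shown in the slides, so `TimelineServer`, its `post` and `query` methods, and the tag names are all hypothetical stand-ins.

```python
class TimelineServer:
    """Hypothetical in-memory stand-in for the CALO timeline server."""

    def __init__(self):
        self._events = []

    def post(self, tag, timestamp, **details):
        """Record an event with a tag, an ISO 8601 timestamp, and extra fields."""
        self._events.append({"tag": tag, "timestamp": timestamp, **details})

    def query(self, tag):
        """Return all events with the given tag, oldest first."""
        matches = (e for e in self._events if e["tag"] == tag)
        return sorted(matches, key=lambda e: e["timestamp"])

server = TimelineServer()
# CAMEO side: post events as they are detected.
server.post("meeting_start", "2004-02-03T13:04:05", topic="Project Status Report")
server.post("person_action", "2004-02-03T13:19:45", person="Jim", action="stands")

# CALO agent side: find out which meetings started and when.
starts = server.query("meeting_start")
```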
14
Some ways CALO Agents could use CAMEO Data Who was at the meeting? –Face recognition is required. This can be done by applying various kinds of image matching algorithms (SVD, template matching, etc.) to see how close a given face is to the entries in a database of saved faces. Such a database must be available to work from.
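A minimal sketch of the template-matching variant, assuming faces are already cropped and flattened to equal-length lists of gray levels. The similarity measure (normalized correlation), the threshold of 0.8, and the tiny four-pixel "faces" are illustrative assumptions only.

```python
import math

def normalized_correlation(a, b):
    """Correlation of two equal-length pixel vectors, in [-1, 1]."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0

def identify(face, database, threshold=0.8):
    """Return the best-matching registered name, or None if nothing is close."""
    best_name, best_score = None, threshold
    for name, template in database.items():
        score = normalized_correlation(face, template)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

# Toy database of registered faces (made-up pixel values).
database = {"Jim": [10, 200, 10, 200], "Wendy": [200, 10, 200, 10]}
who = identify([20, 180, 30, 190], database)
stranger = identify([100, 100, 100, 101], database)
```

Returning `None` below the threshold matches the idea that only registered people are recognized.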
15
Some ways CALO Agents could use CAMEO Data Who is sitting, standing, or speaking? –By tracking the positions of people as they move around, we should be able to tell who is sitting and who is standing. Depending on how animated the faces are in that state, we should also be able to tell who is speaking by how much they're bobbing around.
16
Some ways CALO Agents could use CAMEO Data Where are people looking? –In order to determine where people are looking, a profile face detector is needed. With one, we should be able to tell which direction they're looking and correlate this with the other faces in the image to figure out where in the image people are likely to be looking.
17
Some ways CALO Agents could use CAMEO Data Who was talking? –Besides tracking the face movements, audio data can be recorded, possibly by instrumenting CAMEO or the meeting attendees with microphones (i.e., Alex Rudnicky). With multiple microphones in the room, sound localization techniques would be required.
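One common sound localization approach with two microphones is to estimate the time difference of arrival by cross-correlation and convert it to a bearing. The sketch below assumes a far-field source, a known microphone spacing, and clean signals; the function names and all numeric values are illustrative, and the slides do not say which technique would be used.

```python
import math

def best_lag(sig_a, sig_b, max_lag):
    """Return the lag in samples by which sig_b trails sig_a.

    Picks the lag with the largest cross-correlation within +/- max_lag.
    """
    def correlation(lag):
        return sum(
            sig_a[i] * sig_b[i + lag]
            for i in range(len(sig_a))
            if 0 <= i + lag < len(sig_b)
        )
    return max(range(-max_lag, max_lag + 1), key=correlation)

def bearing_degrees(lag, sample_rate, mic_spacing, speed_of_sound=343.0):
    """Far-field bearing from the array broadside, in degrees."""
    ratio = speed_of_sound * (lag / sample_rate) / mic_spacing
    ratio = max(-1.0, min(1.0, ratio))  # guard against noise pushing |ratio| > 1
    return math.degrees(math.asin(ratio))

# Toy signals: microphone B hears the same pulse two samples later.
mic_a = [0, 0, 1, 2, 1, 0, 0, 0]
mic_b = [0, 0, 0, 0, 1, 2, 1, 0]
lag = best_lag(mic_a, mic_b, max_lag=3)
angle = bearing_degrees(lag, sample_rate=16000, mic_spacing=0.2)
```

The bearing could then be matched against tracked face positions to decide which person is talking.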
18
Some ways CALO Agents could use CAMEO Data What were people doing? –Besides the relative positions of people’s bodies in the room, more detailed information could be obtained with a full-body tracker. Including information about the room itself, such as what else is in it (tables, whiteboards, or chairs), would let CAMEO report more detailed information.
19
Some ways CALO Agents could use CAMEO Data Who was pointing at what? –We need to have even more detailed full-body tracking. By tracking arms and arm positions with a stereo camera (i.e., Trevor Darrell), we should be able to figure out where the person is pointing. By putting a stereo head on a panning mount, a lot of information about the environment could be obtained very easily. Even by extending the 2D tracker so that it identifies arms as being attached to bodies, we might be able to get this information. However, this works only as long as the person is pointing in a direction perpendicular to CAMEO. Having two CAMEOs would be a good way to solve this problem.
20
Some ways CALO Agents could use CAMEO Data What happened during the formal presentation? –Information has to be collated and merged in such a way that the speaker is identified, and information regarding the speech and PowerPoint presentation is processed (CALO-MMD group).
21
Some ways CALO Agents could use CAMEO Data What happened during the general discussion? –Information has to be collated and merged in such a way that the speakers are identified, and information regarding the speech is processed (CALO-MMD group).
22
Some ways CALO Agents could use CAMEO Data What is a general/detailed summary of the meeting? –Given a state machine that describes the most common things in a meeting, we could cluster the individual events into larger states that indicate the various sections of the meeting, based on a generic agenda (intro, formal presentation, questions, open discussion, wrap-up) or even a specific agenda that is provided to CAMEO ahead of time. People print out agendas and often bring them to formal meetings so that everyone can follow along.
23
Some ways CALO Agents could use CAMEO Data What did person 'x' contribute to the meeting? –Tracking an individual person's speech and gestures allows the events posted to the timeline server to be gathered/clustered into a personalized kind of state machine that can be viewed at a very minute level of detail (individual gestures and actions) or as a high-level description such as "person x didn't talk very much".
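A very small sketch of the gathering step: group timeline events by person, count them per action, and derive a coarse description. The event tuples reuse entries from the example log; the function names and the `quiet_below` threshold are hypothetical.

```python
from collections import Counter, defaultdict

def contributions(events):
    """Group (timestamp, person, action) events into per-person action counts."""
    per_person = defaultdict(Counter)
    for _, person, action in events:
        per_person[person][action] += 1
    return per_person

def describe(person, per_person, quiet_below=2):
    """Produce a high-level description from the detailed counts."""
    spoke = per_person[person]["speaks"]
    if spoke < quiet_below:  # illustrative threshold, not from the slides
        return f"{person} didn't talk very much"
    return f"{person} spoke {spoke} times"

events = [
    ("13:19:45", "Jim", "stands"),
    ("13:20:00", "Jim", "speaks"),
    ("13:22:04", "*Unknown*", "speaks"),
    ("13:22:45", "Jim", "speaks"),
    ("13:30:42", "Wendy", "speaks"),
    ("13:33:04", "Jim", "speaks"),
]
per_person = contributions(events)
jim = describe("Jim", per_person)
wendy = describe("Wendy", per_person)
```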
24
Some ways CALO Agents could use CAMEO Data How to replay a meeting from a specific point in time? –The raw movie files are available. Once the individual person events are classified, the timestamps can be extracted from the timeline server and the video can be replayed from that location.
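Turning an event timestamp into a replay position is mostly arithmetic. The sketch below assumes the recording start time is known and that both times fall on the same day; `seek_offset` is a hypothetical helper, and handing the offset to a video player is left out.

```python
from datetime import datetime

def seek_offset(video_start, event_time, fmt="%H:%M:%S"):
    """Return how many seconds into the recording the event occurred."""
    delta = datetime.strptime(event_time, fmt) - datetime.strptime(video_start, fmt)
    seconds = delta.total_seconds()
    if seconds < 0:
        raise ValueError("event precedes the start of the recording")
    return seconds

# Replay from the start of the presentation in the example log.
offset = seek_offset("13:04:05", "13:19:45")
```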
25
Some ways CALO Agents could use CAMEO Data How to replay specific parts of the meeting, i.e., introductions, discussion after the presentation, wrap up? –We need to create a probabilistic meeting ontology that we can use to parse and tag the meeting, identifying parts of the meeting with different probabilities. We can learn the model of different types of meetings by learning the probabilistic parameters of an ontology, or the Bayesian dependencies from types, people, and meeting purpose to the format of the meeting.
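As one very reduced reading of "learning the probabilistic parameters", the sketch below estimates transition probabilities between meeting sections from labeled example meetings by counting. The section names, the training sequences, and the first-order (previous section only) model are assumptions; a full ontology would also condition on meeting type, people, and purpose.

```python
from collections import Counter, defaultdict

def learn_transitions(meetings):
    """Estimate P(next section | current section) from labeled meetings.

    meetings: list of section-label sequences, one per recorded meeting.
    Returns {current: {next: probability}} using relative frequencies.
    """
    counts = defaultdict(Counter)
    for sections in meetings:
        for current, following in zip(sections, sections[1:]):
            counts[current][following] += 1
    return {
        current: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for current, c in counts.items()
    }

# Made-up training data for illustration.
meetings = [
    ["intro", "presentation", "questions", "wrap-up"],
    ["intro", "discussion", "wrap-up"],
]
model = learn_transitions(meetings)
```

With such a model, each tagged part of a new meeting can carry a probability, and replay requests can target the most likely span for a given section.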