Presentation on theme: "Improving Automatic Meeting Understanding by Leveraging Meeting Participant Behavior. Satanjeev Banerjee, Thesis Proposal, April 21, 2008"— Presentation transcript:

1 Improving Automatic Meeting Understanding by Leveraging Meeting Participant Behavior
Satanjeev Banerjee, Thesis Proposal, April 21, 2008

2 Using Human Knowledge
- Knowledge of human experts is used to build systems that function under uncertainty
  - Often captured through in-lab data labeling
- Another source of knowledge: users of the system
  - Can provide subjective knowledge
  - System can adapt to the users and their information needs
  - Reduces the data needed in the lab
- Technical goal: improve system performance by automatically extracting knowledge from users

3 Domain: Meetings
- Problem: large parts of meetings contain unimportant information; some small parts contain important information
- How to retrieve the important information?
- Impact goal: help humans get information from meetings (Romano and Nunamaker, 2001)
- What information do people need from meetings?

4 Understanding Information Needs
- Survey of 12 CMU faculty members (Banerjee, Rose & Rudnicky, 2005)
- How often do you need information from past meetings? On average, 1 missed-meeting and 1.5 attended-meeting information needs a month
- What information do you need?
  - Missed meeting: what was discussed about topic X?
  - Attended meeting: detail questions, e.g. "What was the accuracy?"
- How do you get the information?
  - From notes if available (high satisfaction)
  - If the meeting was missed: ask face-to-face
- Task 1: detect the agenda item being discussed
- Task 2: identify utterances to include in notes

5 Existing Approaches to Accessing Meeting Information
- Meeting recording and browsing (Cutler, et al, 02), (Ionescu, et al, 02), (Ehlen, et al, 07), (Waibel, et al, 98)
- Automatic meeting understanding
  - Meeting transcription (Stolcke, et al, 2004), (Huggins-Daines, et al, 2007)
  - Meeting topic segmentation (Galley, et al, 2003), (Purver, et al, 2006)
  - Activity recognition through vision (Rybski & Veloso, 2004)
  - Action item detection (Ehlen, et al, 07)
- These approaches rely on classic supervised learning, on unsupervised learning, or on meeting participants only after the meeting
- Our goal: extract high quality supervision...
  - ...from meeting participants (the best judges of noteworthy information)
  - ...during the meeting (when participants are most available)

6 Challenges for Supervision Extraction During the Meeting
- Giving feedback costs the user time and effort
- It creates a distraction from the user's main task: participating in the meeting
- Our high-level approach:
  - Develop supervision extraction mechanisms that help meeting participants do their task
  - Interpret participants' responses as labeled data

7 Thesis Statement
Develop approaches to extract high quality supervision from system users by designing extraction mechanisms that help them do their own task, and by interpreting their actions as labeled data.

8 Roadmap for the Rest of this Talk
- Review of past strategies for supervision extraction
- Approach:
  - Passive supervision extraction for agenda item labeling
  - Active supervision extraction to identify noteworthy utterances
- Success criteria, contribution and timeline

9 Past Strategies for Extracting Supervision from Humans
- Two types of strategies: passive and active
- Passive: the system does not choose which data points the user will label
  - E.g.: improving ASR from user corrections (Burke, et al, 06)
- Active: the system chooses which data points the user will label
  - E.g.: have the user label traffic images as risky or not (Saunier, et al, 04)

10 Research Issue 1: How to Ask Users for Labels?
- Categorical labels
  - Associate desktop documents with a task label (Shen, et al, 07)
  - Label images of safe roads for robot navigation (Fails & Olsen, 03)
- Item scores/ranks
  - Rank report items for inclusion in a summary (Garera, et al, 07)
  - Pick the best schedule from system-provided choices (Weber, et al, 07)
- Feedback on features
  - Tag movies with new text features (Garden, et al, 05)
  - Identify terms that signify document similarity (Godbole, et al, 04)

11 Research Issue 2: How to Interpret User Actions as Feedback?
- Depends on the similarity between user and system behavior
- Interpretation is simple when the behaviors are similar
  - E.g.: email classification (Cohen 96)
- Interpretation may be difficult when user behavior and target system behavior are starkly different
  - E.g.: user corrections of ASR output (Burke, et al, 06)

12 Research Issue 3: How to Select Data Points for Label Query (Active Strategy)?
- Typical active learning approach:
  - Goal: minimize the number of labels sought to reach a target error
  - Approach: choose the data points most likely to improve the learner
  - E.g.: pick the data points closest to the decision boundary (Monteleoni, et al, 07), as in the sketch below
- Typical assumption: the human's task is labeling
- But a system user's task is usually not the same as labeling data
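
A minimal sketch of that closest-to-the-boundary heuristic, assuming a binary classifier with a scikit-learn style decision_function; the setup is illustrative, not taken from the cited paper:

```python
import numpy as np

def pick_query(classifier, unlabeled_X):
    """Return the index of the unlabeled point nearest the decision
    boundary, i.e. the one the classifier is least confident about."""
    margins = np.abs(classifier.decision_function(unlabeled_X))
    return int(np.argmin(margins))
```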

13 Our Overall Approach to Extracting Data from System Users
- Goal: extract high quality subjective labeled data from system users
- Passive approach: design the interface to ease interpretation of user actions as feedback
  - Task: label meeting segments with the agenda item
- Active approach: develop label query mechanisms that:
  - Query for labels while helping the user do his task
  - Extract labeled data from user actions
  - Task: identify noteworthy utterances in meetings

14 Talk Roadmap
- Review of past strategies for supervision extraction
- Approach:
  - Passive supervision extraction for agenda item labeling
  - Active supervision extraction to identify noteworthy utterances
- Success criteria, contribution and timeline

15 Passive Supervision: General Approach
- Goal: design the interface to enable interpretation of user actions as feedback
- Recipe:
  - Identify the kind of labeled data needed
  - Target a user task
  - Find the relationship between the user task and the data needed
  - Build an interface for the user task that captures the relationship

16 Supervision for Agenda Item Detection
- Labeled data needed: meeting segments labeled with the agenda item, to automatically detect the agenda item being discussed
- User task: note taking during meetings
- Relationship: 1. most notes refer to discussions in the preceding segment; 2. a note and its related segment belong to the same agenda item
- Note taking interface: 1. time stamp speech and notes; 2. enable participants to label notes with the agenda item

17 [Screenshot of the SmartNotes note taking interface: an agenda pane with items such as "Speech recognition research status", "Topic detection research status" and "FSGs", an "Insert Agenda" control, a shared note taking area, and a personal notes area that is not shared]

18 Getting Segmentation from Notes
A segmentation can be read off the time-stamped notes and their agenda item boxes; segment boundaries (here at 300 and 700) fall between notes with different agenda items (see the sketch below).

Notes time stamp | Notes agenda item box
100 | Speech recognition research status
200 | Speech recognition research status
400 | Topic detection research status
600 | Topic detection research status
800 | Speech recognition research status
950 | Speech recognition research status
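
A minimal sketch of the extraction step, under the assumption (suggested by the 300 and 700 boundaries above) that each boundary is placed midway between consecutive notes with different agenda item labels; all names are illustrative:

```python
def boundaries_from_notes(notes):
    """notes: list of (timestamp_seconds, agenda_item) sorted by time.
    Returns boundary times placed midway between consecutive notes
    whose agenda item labels differ."""
    cuts = []
    for (t1, item1), (t2, item2) in zip(notes, notes[1:]):
        if item1 != item2:
            cuts.append((t1 + t2) / 2)  # midpoint between the two notes
    return cuts

notes = [
    (100, "Speech recognition research status"),
    (200, "Speech recognition research status"),
    (400, "Topic detection research status"),
    (600, "Topic detection research status"),
    (800, "Speech recognition research status"),
    (950, "Speech recognition research status"),
]
print(boundaries_from_notes(notes))  # [300.0, 700.0]
```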

19 Evaluate the Segmentation
- How accurate is the extracted segmentation? Compare to a human annotator, and also to standard topic segmentation algorithms
- Evaluation metric: P_k (see the sketch below). For every pair of time points k seconds apart, ask:
  - Are the two points in the same segment or not, in the reference?
  - Are the two points in the same segment or not, in the hypothesis?
- P_k = (# of time point pairs where hypothesis and reference disagree) / (total # of time point pairs in the meeting)
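
A minimal sketch of the P_k computation, assuming segmentations are represented as lists of boundary times in seconds; function names are illustrative:

```python
def segment_index(boundaries, t):
    """Index of the segment containing time t."""
    return sum(1 for b in boundaries if b <= t)

def p_k(reference, hypothesis, meeting_length, k):
    """Fraction of time point pairs (t, t+k) on which the reference and
    hypothesis segmentations disagree about being in the same segment."""
    disagreements, pairs = 0, 0
    for t in range(0, int(meeting_length - k)):
        same_ref = segment_index(reference, t) == segment_index(reference, t + k)
        same_hyp = segment_index(hypothesis, t) == segment_index(hypothesis, t + k)
        if same_ref != same_hyp:
            disagreements += 1
        pairs += 1
    return disagreements / pairs

# e.g. p_k([300, 700], [320, 680], meeting_length=1860, k=60)
```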

20 SmartNotes Deployment in Real Meetings
- Has been used in 75 real meetings
- 16 unique participants overall
- 4 sequences of meetings (sequence = 3 or more longitudinal meetings)

Sequence | Num meetings so far
1 (ongoing) | 30
2 | 27
3 (ongoing) | 8
4 (ongoing) | 4
Remaining | 6

21 Data for Evaluation
- Data: 10 consecutive related meetings
- Reference segmentation: meetings segmented into agenda items by two different annotators; inter-annotator agreement: P_k = 0.062

Avg meeting length | 31 minutes
Avg # agenda items per meeting | 4.1
Avg # participants per meeting | 3.75 (2 to 5)
Avg # notes per agenda item | 5.9
Avg # notes per meeting | 25

22 Results
[Chart of P_k scores comparing the notes-based segmentation against the TextTiling baseline (Hearst 97) and the state of the art (Purver, et al, 2006); some differences are marked significant, others not significant]

23 Does Agenda Item Labeling Help Retrieve Information Faster?
- Two 10-minute meetings, manually labeled with agenda items
- 5 questions prepared for each meeting; questions prepared without access to the agenda items
- 16 subjects, none of them participants of the test meetings
- Within-subjects user study
- Experimental manipulation: access to the segmentation versus no segmentation

24 Minutes to Complete the Task
[Chart of minutes taken to complete the retrieval task with versus without the segmentation; the difference is marked significant]

25 Shown So Far
- A method of extracting meeting segments labeled with the agenda item from note taking
- The resulting data produces a high quality segmentation
- It is likely to help participants retrieve information faster
- Next: learn to label meetings that don't have notes

26 Proposed Task: Learn to Label Related Meetings that Don't Have Notes
- Plan: implement language model based detection similar to (Spitters & Kraaij, 2001), as sketched below
  - Train agenda item specific language models on the automatically extracted labeled meeting segments
  - Perform segmentation similar to (Purver, et al, 06)
  - Label each new meeting segment with the agenda item whose LM has the lowest perplexity
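
A minimal sketch of the proposed labeling step, assuming add-one-smoothed unigram language models for simplicity (the proposal leaves the LM details open); names are illustrative:

```python
import math
from collections import Counter

def train_unigram(segment_texts):
    """Train an add-one-smoothed unigram LM on a list of segment texts."""
    counts = Counter(w for text in segment_texts for w in text.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves mass for unseen words
    return lambda w: (counts[w] + 1) / (total + vocab)

def perplexity(model, text):
    words = text.split()
    log_prob = sum(math.log(model(w)) for w in words)
    return math.exp(-log_prob / len(words))

def label_segment(segment_text, models):
    """models: dict mapping agenda item -> trained unigram LM.
    Pick the agenda item whose LM gives the lowest perplexity."""
    return min(models, key=lambda item: perplexity(models[item], segment_text))
```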

27 Proposed Evaluation
- Evaluate agenda item labeling of meetings with no notes
  - 3 real meeting sequences with 10 meetings each
  - For each meeting i in each sequence: train the agenda item labeler on automatically extracted labeled data from the previous meetings in the same sequence, and compute labeling accuracy against manual labels
  - Show improvement in accuracy from meeting to meeting
  - Baseline: unsupervised segmentation + text matching between the speech and the agenda item label text
- Evaluate the effect on retrieving information
  - Ask users to answer questions from each meeting, with the agenda item labeling output by the improved labeler versus by the baseline labeler

28 Talk Roadmap
- Review of past strategies for supervision extraction
- Approach:
  - Passive supervision extraction for agenda item labeling
  - Active supervision extraction to identify noteworthy utterances
- Success criteria, contribution and timeline

29 Active Supervision
- System goal: select data points and query the user for labels
- In active learning, the human's task is to provide the labels
- But a system user's task may be very different from labeling data
- General approach:
  1. Design query mechanisms such that each label query also helps the user do his own task, and the user's response to the query can be interpreted as a label
  2. Choose data points to query by balancing the estimated benefit of the query to the user against the estimated benefit of the label to the learner

30 Task: Noteworthy Utterance Detection
- Goal: identify noteworthy utterances, i.e. utterances that participants would include in notes
- Labeled data needed: utterances labeled as either noteworthy or not noteworthy

31 Extracting Labeled Data (the label query mechanism is completed; the detector and the suggestion-choosing method are proposed)
- Noteworthy utterance detector
- Label query mechanism: notes assistance, i.e. suggest utterances for inclusion in the notes during the meeting
  - Helps participants take notes
  - Interpret participants' acceptances / rejections as noteworthy / not noteworthy labels
- Method of choosing utterances for suggestion, balancing:
  - Benefit to the user's note taking
  - Benefit to the learner (the detector) from the user's acceptance/rejection

32 Proposed: Noteworthy Utterance Detector
- Binary classification of utterances as noteworthy or not
- Support Vector Machine classifier (see the sketch below)
- Features:
  - Lexical: keywords, tf-idf, named entities, numbers
  - Prosodic: speaking rate, f0 max/min
  - Agenda item being discussed
  - Structural: speaker identity, utterances since the last accepted suggestion
- Similar to the meeting summarization work of (Zhu & Penn, 2006)
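
A minimal sketch of such a detector, assuming the features above have already been extracted into numeric vectors and using scikit-learn's SVC as one possible SVM implementation (the proposal does not commit to a toolkit); the feature values here are made up for illustration:

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Each row: [max tf-idf, speaking rate (words/sec), f0 max, f0 min,
#            utterances since last accepted suggestion]
X_train = [
    [0.8, 3.1, 210.0, 95.0, 2],
    [0.7, 2.9, 220.0, 90.0, 1],
    [0.9, 3.3, 200.0, 99.0, 4],
    [0.1, 4.5, 180.0, 90.0, 14],
    [0.2, 4.1, 175.0, 85.0, 9],
    [0.1, 4.8, 190.0, 88.0, 11],
]
y_train = [1, 1, 1, 0, 0, 0]  # 1 = noteworthy, 0 = not noteworthy

# Feature scaling followed by a kernel SVM; probability=True yields the
# confidence estimates needed later when choosing utterances to suggest.
detector = make_pipeline(StandardScaler(), SVC(probability=True))
detector.fit(X_train, y_train)
p_noteworthy = detector.predict_proba([[0.6, 3.4, 205.0, 96.0, 3]])[0][1]
```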

33 Extracting Labeled Data (the label query mechanism is completed; the detector and the suggestion-choosing method are proposed)
- Noteworthy utterance detector
- Label query mechanism: notes assistance, i.e. suggest utterances for inclusion in the notes during the meeting
  - Helps participants take notes
  - Interpret participants' acceptances / rejections as noteworthy / not noteworthy labels
- Method of choosing utterances for suggestion, balancing:
  - Benefit to the user's note taking
  - Benefit to the learner (the detector) from the user's acceptance/rejection

34 Mechanism 1: Direct Suggestion
[Screenshot: the system offers a single suggested note, e.g. "Fix the problem with emailing", for the participant to accept or reject]

35 Mechanism 2: Sushi Boat
[Screenshot: suggested notes stream past continuously, e.g. "pilot testing has been successful", "most participants took twenty minutes", "ron took much longer to finish tasks", "there was no crash"; participants pick out the ones they want to keep]

36 Differences between the Mechanisms
- Direct suggestion
  - The user can provide both accept and reject labels
  - Higher cost for the user if the suggestion is not noteworthy
- Sushi boat suggestion
  - The user only provides accept labels
  - Lower cost for the user

37 Will Participants Accept Suggestions?
- Wizard of Oz study: the wizard listened to the audio and suggested text
- 6 meetings: 2 with the direct mechanism, 4 with the sushi boat mechanism

Mechanism | Num offered | Num offered per min | Num accepted | Num accepted per min | % accepted
Direct suggestion | 50 | 0.6 | 17 | 0.2 | 34.0
Sushi boat | 273 | 1.8 | 85 | 0.6 | 31.0

38 Percentage of Notes from Sushi Boat

Meeting | Num lines of notes | Num lines from sushi boat | % lines from sushi boat
1 | 7 | 6 | 86%
2 | 24 | 20 | 83%
3 | 32 | 29 | 91%
4 | 32 | 30 | 94%
Total/Avg | 95 | 85 | 89%

39 Extracting Labeled Data (the label query mechanism is completed; the detector and the suggestion-choosing method are proposed)
- Noteworthy utterance detector
- Label query mechanism: notes assistance, i.e. suggest utterances for inclusion in the notes during the meeting
  - Helps participants take notes
  - Interpret participants' acceptances / rejections as noteworthy / not noteworthy labels
- Method of choosing utterances for suggestion, balancing:
  - Benefit to the user's note taking
  - Benefit to the learner (the detector) from the user's acceptance/rejection

40 Method of Choosing Utterances for Suggestion
- One idea: pick utterances that have either a high benefit for the detector or a high benefit for the user
  - Most beneficial for the detector: the least confident utterances
  - Most beneficial for the user: noteworthy utterances with high confidence
  - But this does not take into account the user's past acceptance pattern
- Our approach: estimate and track the user's likelihood of acceptance, and pick utterances that either have a high detector benefit or are very likely to be accepted

41 Estimating Likelihood of Acceptance
- Features:
  - Estimated user benefit of the suggested utterance: Benefit(utt) = T(utt) - R(utt) if utt is noteworthy according to the detector, and -R(utt) if it is not, where T(utt) = time to type the utterance and R(utt) = time to read it
  - # suggestions, acceptances, rejections in this and previous meetings
  - Amount of speech in the preceding window of time
  - Time since the last suggestion
- Combine the features using logistic regression (see the sketch below)
- Learn per participant from past acceptances/rejections
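
A minimal sketch of the acceptance model, assuming the piecewise benefit above and scikit-learn's LogisticRegression as the feature combiner; the per-word timing constants and the feature encodings are assumptions, not from the proposal:

```python
from sklearn.linear_model import LogisticRegression

def benefit(n_words, is_noteworthy, type_secs_per_word=1.0, read_secs_per_word=0.2):
    """Benefit(utt) = T(utt) - R(utt) if noteworthy, else -R(utt),
    approximating T and R as linear in utterance length."""
    t, r = n_words * type_secs_per_word, n_words * read_secs_per_word
    return t - r if is_noteworthy else -r

def acceptance_features(utt, history):
    """utt/history are illustrative dicts; the proposal names these
    features but not their exact encoding."""
    return [
        benefit(utt["n_words"], utt["detector_says_noteworthy"]),
        history["n_suggested"], history["n_accepted"], history["n_rejected"],
        utt["speech_secs_in_window"], utt["secs_since_last_suggestion"],
    ]

# One logistic regression per participant, retrained as that participant's
# acceptances (1) and rejections (0) accumulate across meetings:
model = LogisticRegression()
# model.fit(rows_of_features, accept_labels)
# p_accept = model.predict_proba([acceptance_features(utt, history)])[0][1]
```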

42 Overall Algorithm for Choosing Utterances for Direct Suggestion
- Given: an utterance and a participant. Decision to make: suggest the utterance to the participant?
- Estimate the benefit of the utterance's label to the detector
- Estimate the likelihood of acceptance
- Combine the two estimates (sketched below); if the combination is above a threshold, suggest the utterance to the participant, otherwise don't
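
A minimal sketch of the combination step, assuming a weighted linear combination of the two estimates; the weights and threshold are exactly the parameters tuned on WoZ data in the next slide, and the default values here are placeholders:

```python
def should_suggest(detector_benefit, p_accept,
                   w_detector=1.0, w_accept=1.0, threshold=0.5):
    """Suggest the utterance when the combined score clears the threshold."""
    return w_detector * detector_benefit + w_accept * p_accept > threshold
```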

43 Learning Threshold and Combination Weights
- Train on the WoZ data: split the meetings into a development set and a test set
- For each parameter setting:
  - Select utterances for suggestion to the user in the development set
  - Compute the acceptance rate by comparing against those actually accepted by the user in the meeting
  - Of those shown, use the acceptances and rejections to retrain the utterance detector
  - Evaluate the utterance detector on the test set
- Pick the parameter setting with an acceptable tradeoff between utterance detector error rate and acceptance rate

44 Proposed Evaluation
- Evaluate the improvement in noteworthy utterance detection
  - 3 real meeting sequences with 15 meetings each
  - Initial noteworthy detector trained on prior data
  - Retrain over the first 10 meetings by suggesting notes; test over the next 5
  - Evaluate: after each test meeting, ask participants to grade the automatically identified noteworthy utterances
  - Baseline: grade the utterances identified by the prior-trained detector
- Evaluate the effect on retrieving information
  - Ask users to answer questions from the test meetings, with utterances identified by the detector trained on 10 meetings versus by the prior-trained detector

45 Talk Roadmap
- Review of past strategies for supervision extraction
- Approach:
  - Passive supervision extraction for agenda item labeling
  - Active supervision extraction to identify noteworthy utterances
- Success criteria, contribution and timeline

46 Thesis Success Criteria
- Show that agenda item labeling improves with labeled data automatically extracted from notes
  - Show that participants can retrieve information faster
- Show that noteworthy utterance detection improves with actively extracted labeled data
  - Show that participants retrieve information faster

47 Expected Technical Contribution
- A framework to actively acquire data labels from end users
- Learning to identify noteworthy utterances by suggesting notes to meeting participants
- Improving topic labeling of meetings by acquiring labeled data from note taking

48 Summary: Tasks Completed/Proposed

Agenda item detection through passive supervision:
- Design interface to acquire labeled data | Completed
- Evaluate interface and labeled data obtained | Completed
- Implement agenda item detection algorithm | Proposed
- Evaluate agenda item detection algorithm | Proposed

Important utterance detection through active learning:
- Implement notes suggestion interface | Completed
- Implement SVM classifier | Proposed
- Evaluate the summarization | Proposed

49 Proposal Timeline

Time frame | Scheduled task
Apr – Jun 08 | Iteratively: continue running Wizard of Oz studies in real meetings to fine-tune the label query mechanisms; analyze the WoZ data to identify features for the automatic summarizer; in parallel, implement the baseline meeting summarizer
Jul – Aug 08 | Deploy the online summarization and notes suggestion system in real meetings, and iterate on its development based on feedback
Sep – Oct 08 | Upon stabilization, perform the summarization user study on test meeting groups
Nov 08 | Implement the agenda detection algorithm
Dec 08 | Perform the agenda detection based user study
Jan – Mar 09 | Write dissertation
Apr 09 | Defend thesis

50 Thank you!

51 Acceptances Per Participant

Participant | Num sushi boat lines accepted | % of acceptances | % of notes in 5 prior meetings
1 | 87 | 82.9% | 90%
2 | 4 | 3.8% | 10%
3 | 6 | 5.7% | 0%
4 | 8 | 7.6% | Did not attend
Totals | 105 | |

