
1 Dialogue, Speech and Images: The Companions Project Data Set Yorick Wilks, David Benyon, Christopher Brewster, Pavel Ircing, and Oli Mival


1 Dialogue, Speech and Images: The Companions Project Data Set
Yorick Wilks, David Benyon, Christopher Brewster, Pavel Ircing, and Oli Mival
http://www.companions-project.org

2 Companions Project
4-year, FP6, EU project
14 partner sites (academic + commercial research)
Research in multimodal interfaces
– Machine learning applied to dialogue systems
– Emotions and ECAs
– Dialogue and planning for mobile devices
– Two prototypes/demonstrators
    Senior Companion
    Health and Fitness Companion

3 A Multiplicity of Companions
Two major prototypes:
– A Health and Fitness Companion (H&FC)
    Task-driven, focussed, domain-specific
– A Senior Companion (SC)
    Open-domain, mixed-initiative, building a life narrative via photos
Other Companions:
– A mobile version of the H&FC
– A home/cookery-focused version of the H&FC
– An SC for the Czech language

4 The need for Dialogue Corpora
General paucity of dialogue corpora
The SC is open-domain (because photos can be about anything) and aimed at the elderly, so we cannot assume that dialogue structures transfer
Key idea: use the initial prototype to generate more data
Initial data collection is more limited and based on the Wizard-of-Oz (WoZ) methodology; this is what this talk is about

5 Specifications
Modified WoZ
– Emphasis on naturally occurring dialogues relevant to the domain
– People asked to reminisce about photos
    Initially: random public-domain photos
    Proper scenario: photos of personal importance
– Photos primarily of people and events: friends and relatives, weddings, holidays, etc.
– We assumed the wizard knew how many people were in each photo (because we assumed image processing technology could tell the system)

6 Specifications (2)
WoZ instructed to use a standard set of questions such as:
– What is the name of the person in the picture?
– Where is this picture taken?
– What is the relationship between the people?
But the interviewer is not limited to these
The user is encouraged to express feelings, memories, and associations

7 Data Collection Set-up (English)
Data collected with two set-ups:
– WoZ with avatar + TTS as the system
– WoZ without TTS, i.e. with a human interviewer
Use of TTS (although theoretically more realistic) slowed down the experiments too much
Photos shown one at a time; participation tails off after about 20 min

8

9 Senior Companion Data Collection at Napier
September 2007:
– 45 sessions / 30 hours
    Gender: 27 male / 13 female
    Age: 19-73
    7 sessions in homes, 38 at Napier
    With avatar: 16 sessions; without avatar: 29 sessions
– Early sessions were not transcribed to ASR standards; later sessions used the ‘Transcriber’ tool (Barras et al., 2001)
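As a rough illustration of how the later, Transcriber-based sessions might be read programmatically, the sketch below pulls speaker turns out of a .trs file. It assumes the files follow the usual Transcriber XML layout (Trans, Speakers, Episode, Section, Turn, Sync); the miniature document, the speaker names, and the helper `read_turns` are hypothetical and not taken from the project data.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature .trs document, modelled on the Transcriber XML
# layout (Trans > Episode > Section > Turn, with Sync marks inside a Turn).
SAMPLE_TRS = """<Trans version="1" audio_filename="session01">
  <Speakers>
    <Speaker id="spk1" name="Wizard"/>
    <Speaker id="spk2" name="Subject"/>
  </Speakers>
  <Episode>
    <Section type="report" startTime="0" endTime="9.5">
      <Turn speaker="spk1" startTime="0" endTime="4.2">
        <Sync time="0"/>
        Please tell me about your first photo.
      </Turn>
      <Turn speaker="spk2" startTime="4.2" endTime="9.5">
        <Sync time="4.2"/>
        That is at a friend's wedding.
      </Turn>
    </Section>
  </Episode>
</Trans>
"""


def read_turns(trs_text):
    """Return one dict per <Turn>: speaker name, start, end, and text."""
    root = ET.fromstring(trs_text)
    # Map speaker ids (e.g. "spk1") to their display names.
    names = {s.get("id"): s.get("name") for s in root.iter("Speaker")}
    turns = []
    for turn in root.iter("Turn"):
        # itertext() also yields the text that follows each <Sync/> marker;
        # split/join collapses the layout whitespace.
        text = " ".join("".join(turn.itertext()).split())
        turns.append({
            "speaker": names.get(turn.get("speaker"), turn.get("speaker")),
            "start": float(turn.get("startTime")),
            "end": float(turn.get("endTime")),
            "text": text,
        })
    return turns


turns = read_turns(SAMPLE_TRS)
for t in turns:
    print(t["speaker"], t["start"], t["end"], t["text"])
```

Real .trs files may carry extra elements (events, comments, overlapping speakers) that this sketch does not handle.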

10 Current status (English data)
TOTAL SESSIONS = 101 (approx. 70 hours)
– In .trs format = 42 (approx. 30 hours)
    27 sessions with full video and .trs files (waiting on transcription for 16)
    15 sessions with .trs files and no video
– In simple text format = 59 (approx. 40 hours)
    55 sessions pre-Transcriber (.trs files)
    4 sessions pre-Transcriber with full video
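The session counts on this slide can be cross-checked with a few lines of arithmetic. The dictionary keys below are descriptive labels chosen for this sketch; only the numbers come from the slide.

```python
# Session counts as stated on the slide; the key names are labels
# invented for this sketch.
trs_sessions = {"full video + trs": 27, "trs only, no video": 15}
text_sessions = {"pre-Transcriber": 55, "pre-Transcriber with full video": 4}

trs_total = sum(trs_sessions.values())
text_total = sum(text_sessions.values())
grand_total = trs_total + text_total

print(trs_total, text_total, grand_total)  # prints: 42 59 101
```

The sub-totals (42 and 59) and the overall total (101) agree with the figures quoted on the slide.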

11 Moira Ross, 68, Aberdeen, Scotland

12 Data Collection Sample
M1: Okay, I think we’re ready to start looking at your pictures now. Please tell me about your first photo.
F1: Okay, that’s at a friend’s wedding and that’s Martin and my son, Stefan, that’s a few years old now, wearing their kilts.
M1: How old is Stefan?
F1: I think in that picture he must have been about five?
M1: Is that Stefan on the right?
F1: It is, yes.
M1: Great. Is there anything else you would like to say about them?
F1: Yeah, well I remember that day, about… it was a friend, Chris’s wedding, so… and I think it was a… yeah… Stefan had his kilt outfit on that day.
M1: That’s very interesting, how does this photo make you feel?
F1: It just… it reminds me that it was winter, it was right after Christmas, that wedding, it was very cold that day.
M1: Okay, let’s move on to the next.
F1: That’s me and Martin in Gibraltar. That was very, very many years ago. We were visiting a friend in Gibraltar.
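A transcript in this simple text form can be split into speaker turns with a small amount of code. The sketch below assumes that every turn starts with a label such as M1: or F1: (a letter M or F plus an index), which is only what the sample above suggests; the abridged sample text and the helper `split_turns` are illustrative, not part of the released data.

```python
import re

# Abridged from the sample dialogue; straight apostrophes used for brevity.
SAMPLE = (
    "M1:Okay, I think we're ready to start looking at your pictures now. "
    "Please tell me about your first photo. "
    "F1:Okay, that's at a friend's wedding. "
    "M1:How old is Stefan? "
    "F1:I think in that picture he must have been about five?"
)

# Assumed label shape: M or F, one or more digits, then a colon.
TURN_LABEL = re.compile(r"\b([MF]\d+):")


def split_turns(text):
    """Split a flat transcript into (speaker_label, utterance) pairs."""
    # With one capturing group, re.split returns
    # [text_before_first_label, label, utterance, label, utterance, ...].
    parts = TURN_LABEL.split(text)
    labels = parts[1::2]
    utterances = [p.strip() for p in parts[2::2]]
    return list(zip(labels, utterances))


for speaker, utterance in split_turns(SAMPLE):
    print(f"{speaker}: {utterance}")
```

Because the pattern tolerates optional whitespace around each utterance, it works whether the turns run together on one line or sit on separate lines.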

13 Czech SC data recording
The set-up with the avatar wizard was chosen
A dedicated room has been established for the recording
The subject sees on the screen:
– the photo currently being discussed
– the avatar (“talking head”)
Audio is captured by high-quality wireless microphones
The subject is also simultaneously recorded by 3 miniDV cameras; the video is intended for future use in emotion detection, gesture recognition, etc.

14 Recording room setup

15 Current state of the Czech Data
50 subjects recorded (mostly seniors)
Average interview length is 55 minutes; on average 8.5 photos are discussed per session
It turns out that older people really DO enjoy discussing their photos with an artificial “companion” and reminiscing about them
On the other hand, people of “productive age” often tend to provide just a technical description of the photo under discussion

16 Availability and Format
We plan to make all this data publicly available
The most appropriate format is still an open issue
At least four data streams:
– Audio
– Transcription
– Video
– Images discussed
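Since the release format is still open, the following is only one possible way to tie the four streams together per session. The class `SessionRecord`, its field names, and the file names are all hypothetical; video is optional here because some sessions listed earlier have no video.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional


@dataclass
class SessionRecord:
    """Hypothetical per-session index of the four data streams."""
    session_id: str
    audio: Optional[Path] = None
    transcription: Optional[Path] = None
    video: Optional[Path] = None
    images: List[Path] = field(default_factory=list)

    def missing_streams(self):
        """Names of the streams that have no file attached yet."""
        missing = []
        if self.audio is None:
            missing.append("audio")
        if self.transcription is None:
            missing.append("transcription")
        if self.video is None:
            missing.append("video")
        if not self.images:
            missing.append("images")
        return missing


# Illustrative file names only.
record = SessionRecord(
    session_id="session-001",
    audio=Path("session-001.wav"),
    transcription=Path("session-001.trs"),
)
print(record.missing_streams())  # prints: ['video', 'images']
```

A flat index like this keeps the streams as separate files, which leaves the choice of audio, video, and transcription formats independent of one another.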

17 Thank You
Comments and advice welcome!

