Search and Annotation Tool for Oral History INTER-VIEWS Henk van den Heuvel, Centre for Language and Speech Technology (CLST) Radboud University Nijmegen,


1 Search and Annotation Tool for Oral History INTER-VIEWS
Henk van den Heuvel, Centre for Language and Speech Technology (CLST), Radboud University Nijmegen, the Netherlands

2 The Data
IPNV Interview corpus: over 1,100 audio-recorded interviews with veterans
- Various missions of the Dutch Army
- Collected by Stef Scagliola for the Veterans Institute, Doorn, Netherlands
Selection of 250 interviews to be included for the tool:
- 120: World War II
- 100: Netherlands East Indies
- 30: New Guinea
- Interviews of about 2 to 2.5 hours
- Stored as 16 kHz, 16-bit wave files
- Stored at DANS using Persistent Identifiers
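As a rough illustration of what the audio format above implies for storage, the following sketch estimates the size of one uncompressed interview. The sample rate, bit depth and interview length come from the slide; the single (mono) channel and the function name `wav_payload_bytes` are assumptions made for this example.

```python
# Rough storage estimate for one interview stored as uncompressed PCM wave.
# Sample rate and bit depth come from the slide; mono is an assumption.

SAMPLE_RATE_HZ = 16_000   # from the slide
BITS_PER_SAMPLE = 16      # from the slide
CHANNELS = 1              # assumption: not stated on the slide


def wav_payload_bytes(duration_hours: float) -> int:
    """Size of the raw audio payload in bytes, ignoring the small WAV header."""
    seconds = duration_hours * 3600
    bytes_per_sample = BITS_PER_SAMPLE // 8
    return int(seconds * SAMPLE_RATE_HZ * bytes_per_sample * CHANNELS)


if __name__ == "__main__":
    for hours in (2.0, 2.5):
        size_mb = wav_payload_bytes(hours) / 1_000_000
        print(f"{hours} h of audio is about {size_mb:.0f} MB")
```

Under these assumptions a single interview takes a few hundred megabytes, so the 250 selected interviews would be on the order of tens of gigabytes.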

3 Background of the tool
A result of a range of projects in the years 2009-2012, with the CLST, the Veterans Institute & DANS
Veteran Tapes: Enhanced publication
Living Oral History Workbench:
- index 246 interviews with relevant search terms
- using Automatic Speech Recognition
- annotate retrieved fragments in a Wiki-like environment

4 Background of the tool
INTER-VIEWS:
- All 246 interviews were persistently stored at DANS
- Persistent Identifiers are used to refer to the individual interviews
- The metadata components of the interviews were registered in CLARIN's metadata component registry and linked to ISOcat categories (CMDI files)
- The same was done for the metadata of the Oral History Annotation Tool (e.g. tool name, owner, makers, description, input, output, availability)
Further data curation in CLARIN:
- Completing the CLARIN metadata for 950 IPNV interviews in total (including the 246 interviews curated before)

5 Objectives of the tool
1. Find relevant fragments in a large collection of audio data
2. Add annotations / comments to selected fragments
3. Make annotations public to other researchers (or not)
4. Verify research results and claims in publications

6 Challenges: Disclosing the audio
Automatic Speech Recognition
- Speaker adaptation on 2.5 minutes per speaker
- Lexicon with keywords
- Language model
- Keywords from:
  - Thesaurus
  - Summaries
No exact transcripts, but effective keyword spotting!
[Diagram: ASR pipeline with the blocks Feature Extraction, Decoding / Search, Acoustic Models, Lexicon, LM and Result]
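The idea of keyword spotting without exact transcripts can be sketched as a lookup over time-stamped recogniser output. This is a minimal illustration, not the tool's actual implementation: the `RecognisedWord` structure, the `spot_keywords` function and the five-second context window are all assumptions introduced here, and the example words are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognisedWord:
    """One word hypothesis from a speech recogniser (hypothetical structure)."""
    word: str
    start_s: float  # start time in seconds from the beginning of the interview
    end_s: float    # end time in seconds


def spot_keywords(hypotheses, keywords, context_s=5.0):
    """Return (keyword, fragment_start, fragment_end) for every keyword hit.

    Each fragment is padded by context_s seconds on both sides so that a
    listener hears the keyword in context; the start is clamped at 0.
    Matching is case-insensitive and exact, which is a simplification.
    """
    wanted = {k.lower() for k in keywords}
    hits = []
    for h in hypotheses:
        word = h.word.lower()
        if word in wanted:
            hits.append((word, max(0.0, h.start_s - context_s), h.end_s + context_s))
    return hits


if __name__ == "__main__":
    # Invented recogniser output, for illustration only.
    output = [
        RecognisedWord("we", 0.0, 0.5),
        RecognisedWord("Batavia", 12.5, 13.0),
        RecognisedWord("landed", 13.0, 13.5),
    ]
    for keyword, start, end in spot_keywords(output, ["batavia"]):
        print(f"{keyword}: play from {start:.1f} s to {end:.1f} s")
```

Because only the keywords need to be recognised correctly, a search like this can stay useful even when the rest of the recogniser output is too noisy to read as a transcript.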

7 Features of the tool
- Retrieval of interviews and fragments of interviews based on Automatic Speech Recognition output
- Audio playback for retrieved fragments
- Metadata of all interviews
- Transcription of audio segments
- Annotations to fragments, to be added by registered users
- A user administration to restrict the transcription & annotation facilities to registered users
- Adjustment of a fragment's start and end point
- Advanced search options
- The tool is compliant with CLARIN-NL standards
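Two of the features above, adjustable fragment boundaries and annotations that can be kept private or published, can be sketched with a small data model. The class names, fields and visibility rule below are assumptions made for illustration; the slides do not describe how the tool stores fragments or annotations.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    author: str
    text: str
    public: bool = False  # assumption: private until the author publishes it


@dataclass
class Fragment:
    interview_id: str  # hypothetical identifier for the interview
    start_s: float
    end_s: float
    annotations: list = field(default_factory=list)

    def adjust(self, start_s: float, end_s: float, interview_length_s: float) -> None:
        """Move the fragment boundaries, keeping them inside the interview."""
        start_s = max(0.0, start_s)
        end_s = min(interview_length_s, end_s)
        if start_s >= end_s:
            raise ValueError("fragment must have a positive duration")
        self.start_s, self.end_s = start_s, end_s

    def visible_annotations(self, user: str) -> list:
        """Annotations a user may read: public ones plus the user's own."""
        return [a for a in self.annotations if a.public or a.author == user]


if __name__ == "__main__":
    fragment = Fragment("interview-001", start_s=60.0, end_s=90.0)
    fragment.adjust(55.0, 100.0, interview_length_s=7200.0)
    fragment.annotations.append(Annotation("alice", "Mentions the crossing."))
    fragment.annotations.append(Annotation("bob", "Long pause here.", public=True))
    print([a.text for a in fragment.visible_annotations("carol")])
```

Keeping the visibility check in one method means that search results and annotation listings could apply the same rule, whatever that rule turns out to be in the real tool.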

8 Publications
Heuvel, H. van den, Sanders, E., Rutten, R., Scagliola, S., Witkamp, P. (2012): An Oral History Annotation Tool for INTER-VIEWs. In: Proceedings LREC 2012, Istanbul, pp. 215-218.
Heuvel, H. van den, Oostdijk, N. (2016): Falling silent, lost for words... Tracing personal involvement in interviews with Dutch war veterans. In: Proceedings LREC 2016, 23-28 May 2016, Portorož, Slovenia.

9 The tool
http://www.watveteranenvertellen.nl/annotationtool/
http://wwwlands2.let.ru.nl/spex/annotationtool/
http://wwwlands2.let.ru.nl/spex/annotationtooldemo

10 Desirable extensions of the tool
- The option to navigate through the audio of the full interview
- Extend the search facility to metadata, annotations and summary texts
- Integrate the tool with the fragment fitter so as to make it suitable for Enhanced Publications
- Visualisation by a timeline to show the chronological order of inserted annotations
- Introduce a shopping cart in which a user can collect relevant fragments for his/her own use
- Employ the tool for other audio collections
NB: A newer tool for document retrieval on full interviews (600 interviews in total) is available at: http://interview.veteraneninstituut.nl/search/

11 Login screen
Oral History Search and Annotation Tool, CLST, Nijmegen, http://www.ru.nl/CLST

12 Search by interview & time code

13 Search by word(s)

14 Hits: Fragment list
[Screenshot labels: Information for selected fragment; Change time interval; Sound]

15 Add annotation, Add transcription

16 Publish annotation, Add attachments, My annotations

