Kirrkirr: a Bidirectional Warlpiri-English Dictionary
Kristen Parton


Kirrkirr: Objectives
- Kirrkirr aims to present the contents of a dictionary in a way that is flexible, interactive, customizable, and (especially) fun.
- Kirrkirr has diverse target users with varying levels of literacy, for example professional linguists, elementary school children, teachers, and native speakers.
- Currently, Kirrkirr is used with the Australian Aboriginal language Warlpiri, spoken by about 3,000 people in northern Australia.
- Kirrkirr uses a Warlpiri-English dictionary developed by linguists in Australia, with detailed information about each word, including glosses, definitions, dialects, grammatical comments, and cross-references between words for synonyms, antonyms, “see also” and other relationships.
- Unlike paper dictionaries, electronic dictionaries can provide an interactive educational tool customizable to various audiences.

Dictionary Usability
- The interface has a colorful, clickable panel that links words related in different ways, rather than relying only on the alphabetical list of words; this also makes the dictionary more interactive.
- Many words are linked to pictures and sounds, which reinforce the meaning of the words through non-textual means.
- The dictionary uses “fuzzy spelling” to catch spelling errors made by the user when searching for a word.
- User modes tailor the appearance of the formatted entries to each target audience:
  - English meaning only, for novice users with English backgrounds
  - In Warlpiri, for native speakers of Warlpiri
  - Basic details, for intermediate users such as students
  - Full details, for advanced users such as teachers or linguists

Lexicon Structure
- The dictionary is maintained by linguists in Australia in an ad-hoc text format, which is converted to a structured XML dictionary by a Perl script.
- Rather than loading the large (10 MB) XML file into memory, each headword’s XML entry is loaded individually as needed.
- The rich structure of the XML allows an XSLT stylesheet to manipulate the dictionary entries and produce output formatted differently for different users.
- The XSLT stylesheet outputs HTML pages, which make use of the cross-references in the dictionary by creating hyperlinks between different words.
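One way to load a single headword's entry without holding the whole file in memory is a byte-offset index. The sketch below is only an illustration of that idea, not Kirrkirr's actual code: the element names (DICTIONARY, ENTRY, HW, GLOSS) and the functions build_offset_index and load_entry are assumptions, and the two toy entries reuse glosses quoted elsewhere in these slides.

```python
import io
import xml.etree.ElementTree as ET

# Toy dictionary file. Element names are hypothetical, not Kirrkirr's schema.
XML_BYTES = (
    b"<DICTIONARY>\n"
    b"<ENTRY><HW>yawarrangi</HW><GLOSS>large male kangaroo</GLOSS></ENTRY>\n"
    b"<ENTRY><HW>kirany-kiranypa</HW><GLOSS>spinifex lizard</GLOSS></ENTRY>\n"
    b"</DICTIONARY>\n"
)


def build_offset_index(stream):
    """Scan the file once and map each headword to (start, length) in bytes.

    For brevity this reads the whole toy file; a real index would be built
    offline (or by streaming) and saved alongside the dictionary.
    """
    data = stream.read()
    index = {}
    pos = 0
    while True:
        start = data.find(b"<ENTRY>", pos)
        if start == -1:
            break
        end = data.find(b"</ENTRY>", start) + len(b"</ENTRY>")
        entry = ET.fromstring(data[start:end])
        index[entry.findtext("HW")] = (start, end - start)
        pos = end
    return index


def load_entry(stream, index, headword):
    """Seek to one entry and parse only its bytes."""
    start, length = index[headword]
    stream.seek(start)
    return ET.fromstring(stream.read(length))


stream = io.BytesIO(XML_BYTES)
index = build_offset_index(stream)
entry = load_entry(stream, index, "kirany-kiranypa")
print(entry.findtext("GLOSS"))
```

With such an index, a lookup costs one seek and one small parse, regardless of the size of the full dictionary file.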

Customizing Format with XSLT
- At run-time, the XML entries are processed by an XSLT stylesheet, which selects which elements of the entry to show, determines the order in which to show them, and formats each field differently depending on the user mode.
  - For example, “Meaning only” outputs the English glosses of a word in a large font, whereas “Full details” outputs all of the information in the dictionary in a normal-sized font in a specific order.
- Since the XML is parsed at run-time, more information can be added to the XML to allow “parameter passing” from the program to the XSLT.
  - For example, the location of the images folder can only be determined at run-time, but by adding a field to the XML at run-time, the XSLT can create an <img> tag to display an image in the HTML output.
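The mode-dependent selection and the run-time “parameter passing” can be illustrated with a small stand-in. This sketch deliberately uses plain Python with xml.etree.ElementTree instead of XSLT, so it shows the logic only, not the stylesheet itself. The element names (ENTRY, HW, GLOSS, IMAGE, IMAGEDIR), the mode names, the image file name, and the function format_entry are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical entry; the image file name is invented for the example.
ENTRY_XML = (
    "<ENTRY><HW>yawarrangi</HW>"
    "<GLOSS>large male kangaroo</GLOSS>"
    "<IMAGE>yawarrangi.jpg</IMAGE></ENTRY>"
)

# Which fields each user mode shows, and in what order (assumed modes).
MODE_FIELDS = {
    "meaning_only": ["GLOSS"],
    "full_details": ["HW", "GLOSS", "IMAGE"],
}


def format_entry(entry_xml, mode, image_dir):
    """Render one entry as HTML for the given user mode."""
    entry = ET.fromstring(entry_xml)
    # "Parameter passing": inject a value known only at run-time into the XML
    # so that the formatting step can read it like any other field.
    ET.SubElement(entry, "IMAGEDIR").text = image_dir
    parts = []
    for tag in MODE_FIELDS[mode]:
        for field in entry.findall(tag):
            text = field.text or ""
            if tag == "IMAGE":
                src = escape(entry.findtext("IMAGEDIR") + "/" + text)
                parts.append(f'<img src="{src}">')
            elif tag == "GLOSS" and mode == "meaning_only":
                parts.append(f'<p style="font-size: x-large">{escape(text)}</p>')
            else:
                parts.append(f"<p>{escape(text)}</p>")
    return "\n".join(parts)


print(format_entry(ENTRY_XML, "meaning_only", "/tmp/images"))
print(format_entry(ENTRY_XML, "full_details", "/tmp/images"))
```

In an XSLT version, the table MODE_FIELDS would correspond to which templates are applied and in what order, and the injected IMAGEDIR element would be read with an ordinary XPath expression.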

English-Warlpiri Dictionary
- The original dictionary is one-way, Warlpiri to English, but a bidirectional bilingual dictionary is more useful for most users.
- An English index was built from the glosses in the dictionary, such that each gloss links to the equivalent Warlpiri entries.
- Rather than being two separate monolingual dictionaries, the two directions share the same data, which eliminates conflicting entries and maintains consistency.
- The XML entries of all the Warlpiri equivalents of an English word are merged and passed to an XSLT stylesheet, which creates an HTML page for the English word.
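Building an English index from the glosses amounts to inverting the headword-to-gloss mapping. The following is a minimal sketch under simplifying assumptions: the data structure and the function build_english_index are hypothetical, each gloss is indexed as a whole phrase, and the three entries are the examples quoted in these slides.

```python
from collections import defaultdict

# Warlpiri headword -> English glosses (examples taken from the slides).
WARLPIRI_GLOSSES = {
    "yawarrangi": ["large male kangaroo"],
    "kirany-kiranypa": ["spinifex lizard"],
    "jawirdiki": ["stay put"],
}


def build_english_index(entries):
    """Invert headword -> glosses into gloss -> list of headwords.

    The index stores only headwords, not copies of the entries, so both
    directions of the dictionary keep reading the same underlying data.
    """
    index = defaultdict(list)
    for headword, glosses in entries.items():
        for gloss in glosses:
            index[gloss].append(headword)
    return dict(index)


english_index = build_english_index(WARLPIRI_GLOSSES)
print(english_index["stay put"])
```

Because the English side holds references rather than duplicated text, a correction made to a Warlpiri entry appears in the English view without a second edit.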

English-Warlpiri Dictionary
- To make the English dictionary symmetric to the Warlpiri one, Kirrkirr now has an English word list, English formatted entries, a much faster English search, and the capability to do “fuzzy spelling” in English.
- Problems arise because most Warlpiri words have several English equivalents, and also because phrases in English might be indexed under several different terms.
  - For example, “yawarrangi”, meaning “large male kangaroo”, should be indexed under “kangaroo” rather than “large” or “male”.
  - However, “jawirdiki” and other words that mean “stay put” should be indexed under “stay” and not “put”.
  - Words like “kirany-kiranypa”, meaning “spinifex lizard”, should be indexed under both “spinifex” (the type) and “lizard”.

Warlpiri Morphology
- Warlpiri is an agglutinating language, meaning that grammatical suffixes are added on to words:

    nyangulparnangku
    nya-  ngu-   lpa-   rna-       ngku
    see-  PAST-  IPFV-  1SG.SUBJ-  2SG.OBJ
    “I was looking at you.”
    Root word: “nya-nyi”, meaning “to see”

- For lookup in the dictionary, users have to know the root word.
- This is difficult for learners of Warlpiri, given that morphemes are not always separated by hyphens and verbs are indexed with non-past tense inflections.
- To make Kirrkirr more usable, a morphological analyzer was implemented to accept well-formed Warlpiri words and find the possible root words to look up.

Morphological Analysis
- Suffixes from the dictionary are stored in a trie for quick lookup.
- Each time an affix is stripped, the remaining string is checked to see whether it is in the dictionary.
- Each possible morpheme is added to a lattice structure, which holds all possible morphological decompositions of the word.
- Grammar rules are applied to eliminate many impossible parses.
- Some properties of Warlpiri make parsing more difficult and show the need for a different indexing system:
  - Verbs are stored with non-past inflections but are seen with different inflections. For example, “nya-nyi” may show up as “nya-ngu.” But indexing “nya-nyi” under “nya” creates more ambiguity, since “nya” is another word.
  - Some words have optional suffixes, such as “l(pa)”, which may be seen as “l” or “lpa.” These words must be indexed under both entries.
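The suffix-stripping steps above can be sketched as follows. This is a simplified illustration, not Kirrkirr's analyzer: it enumerates decompositions by recursion instead of building a lattice, it applies no grammar rules, and the toy suffix list, the stem table that files “nya-nyi” under the stem “nya”, and all function names are assumptions.

```python
# Toy data: suffixes from the example word, plus both forms of "l(pa)".
SUFFIXES = ["ngu", "lpa", "l", "rna", "ngku"]
# Hypothetical stem index: bare stem -> dictionary headword.
STEMS = {"nya": "nya-nyi"}


def build_suffix_trie(suffixes):
    """Build a trie over reversed suffixes, so matching walks back from the word end."""
    root = {}
    for suffix in suffixes:
        node = root
        for ch in reversed(suffix):
            node = node.setdefault(ch, {})
        node["$"] = suffix  # "$" marks the end of a complete suffix
    return root


def suffixes_at_end(trie, word):
    """Yield every known suffix that the word ends with."""
    node = trie
    for ch in reversed(word):
        node = node.get(ch)
        if node is None:
            return
        if "$" in node:
            yield node["$"]


def analyze(word, trie, stems):
    """Return all (headword, [suffixes in surface order]) decompositions."""
    parses = []

    def strip(rest, stripped):
        # After each stripping step, check the remainder against the dictionary.
        if rest in stems:
            parses.append((stems[rest], list(reversed(stripped))))
        for suffix in suffixes_at_end(trie, rest):
            strip(rest[: -len(suffix)], stripped + [suffix])

    strip(word, [])
    return parses


trie = build_suffix_trie(SUFFIXES)
print(analyze("nyangulparnangku", trie, STEMS))
```

Storing the suffixes reversed means a single backward walk over the word finds every suffix it could end with, which is what makes the trie lookup quick. A fuller version would record each stripped morpheme in a shared lattice and prune it with grammar rules, as the slide describes.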

Conclusions
- Making Kirrkirr a bidirectional English-Warlpiri and Warlpiri-English dictionary increases its usability and practicality by making it easier for users who are more comfortable in English to browse and search in English.
- Allowing lookup of Warlpiri words from actual speech using the morphological analysis also increases usability, especially for users who are learning Warlpiri, since they do not have to figure out the root word.
- Future work:
  - Improving the morphological analysis to provide roughly ranked possible parses of all morphemes of an entire word, using more grammatical information and frequency information
  - Extending Kirrkirr to other languages