The LEGO Project Brent Miller, The LINGUIST List.




1 The LEGO Project Brent Miller, The LINGUIST List

2 Overview: Introduction; Doing LEGO; Current Status; Future of LEGO

3 Introduction: LEGO and the Need for Interoperability

4 A Variety of Data Standards: LIFT, LMF, TEI. File Formats: PDF, Excel/Access, MDF (Toolbox), .doc/.odt (Word/OpenOffice)

5 Why Interoperate? Greater access to language data; more intelligent searches; ease of comparison between lexicons

6 What is LEGO? Three-year project sponsored by the NSF. Participants: LINGUIST List, University at Buffalo. Goal: create a datanet of interoperable lexicons; map grammatical information to GOLD; map structure to a common schema (LL-LIFT); output in XML where the lexicon contributor allows; preserve the source’s integrity

7 LEGO’s Purpose: not intended to develop a lexicon creation or display tool; will support multi-lexicon searches and comparisons; will demonstrate the value of digital standards in linguistic research

8 Doing LEGO: Team Structure and Workflow

9 Team Structure: three principal investigators (Jeff Good, University at Buffalo; Helen Aristar-Dry and Anthony Aristar, Eastern Michigan University); three graduate students (Brent Miller, Justin Petro, Erica Wicks); one undergraduate (Lili Xia); one programmer (Lily Zheng)

10 Workflow: Receive the Data → ‘Descriptive’ XML → XSL Stylesheet → Upload to Database → GOLD Mapping → ‘Publish’ to LEGO Site
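
The conversion and GOLD-mapping steps of the workflow can be pictured with a small sketch. This is not the project's own pipeline, which the slide says uses an XSL stylesheet; it swaps in plain Python for illustration. The mapping table, the `gold-concept` trait name, the function name, and the sample entry are all assumptions, and the element layout is only a simplified LIFT-like structure, not the actual LL-LIFT schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: the author's own grammatical tokens -> GOLD concept names.
GOLD_MAP = {"n": "Noun", "v": "Verb", "adj": "Adjective"}

def to_lift_entry(form, lang, pos, gloss):
    """Build a simplified LIFT-like <entry> element (illustrative, not LL-LIFT)."""
    entry = ET.Element("entry")
    unit = ET.SubElement(entry, "lexical-unit")
    form_el = ET.SubElement(unit, "form", {"lang": lang})
    ET.SubElement(form_el, "text").text = form

    sense = ET.SubElement(entry, "sense")
    # Keep the author's original token so the source's integrity is preserved.
    info = ET.SubElement(sense, "grammatical-info", {"value": pos})
    concept = GOLD_MAP.get(pos.lower())
    if concept is not None:
        # Assumed convention: record the GOLD concept alongside the original token.
        ET.SubElement(info, "trait", {"name": "gold-concept", "value": concept})

    gloss_el = ET.SubElement(sense, "gloss", {"lang": "en"})
    ET.SubElement(gloss_el, "text").text = gloss
    return entry

# Illustrative sample entry, not taken from LEGO data.
entry = to_lift_entry("kitabu", "swh", "n", "book")
xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```

Storing the GOLD concept next to the author's token, rather than overwriting it, is one way to keep the original analysis recoverable; tokens with no mapping are simply left untagged.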

11 Current Status: Our Data, Website, and Faceted Search

12 Lexical Data: completed 11 wordlists (10 Qiang dialects, Sáliba) and 7 lexicons (Western Sisaala, Potawatomi, Udi, Ibibio, Wichita, Tuva, Shoshone); 10 nearing completion (Fulfulde, Archi, Udi, Mocoví, Jarawara, Nhirrpi, Titan, Maa, Mbodomo, Western Pantar, Mocho’)

13 The LEGO Site: Homepage (in development), http://lego.linguistlist.org; Browse lexicons (each lexicon has a homepage); Browse entries (each entry has its own page); Faceted search (allows for fine-grained, GOLD-aware searches of morphological information across lexicons)

19 Faceted Search: Choose lexicons. Text search: searches across forms, variants, glosses, definitions, etymology, examples, notes; displays keyword in context. Filters: easily added/removed; narrow search in real time
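
A text search that displays the keyword in context can be sketched as follows. This is only an illustration of the idea, not the LEGO site's implementation; the field names mirror the list on the slide, while the function names, the dictionary-based entry layout, and the sample entries are assumptions.

```python
import re

# Fields named on the slide; the dictionary keys themselves are assumed.
SEARCH_FIELDS = ("form", "variants", "gloss", "definition",
                 "etymology", "examples", "notes")

def keyword_in_context(text, keyword, width=15):
    """Return each case-insensitive match with up to `width` characters either side."""
    if not keyword:
        return []
    snippets = []
    for match in re.finditer(re.escape(keyword), text, re.IGNORECASE):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        snippets.append(f"{left}[{match.group(0)}]{right}")
    return snippets

def search_entries(entries, keyword, fields=SEARCH_FIELDS):
    """Search every listed field of every entry; report (form, field, snippet)."""
    results = []
    for entry in entries:
        for field in fields:
            for snippet in keyword_in_context(entry.get(field, ""), keyword):
                results.append((entry["form"], field, snippet))
    return results

# Illustrative sample entries, not taken from LEGO data.
SAMPLE = [
    {"form": "kitabu", "gloss": "book", "definition": "a written or printed work"},
    {"form": "soma", "gloss": "read", "definition": "to read a book or letter"},
]

for hit in search_entries(SAMPLE, "book"):
    print(hit)
```

Reporting the field name with each snippet lets a reader see whether a hit came from a gloss, a definition, or a note.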

25 Filters: GOLD concepts; author grammatical information tokens; language codes; note types; entry relation types
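
Filters that can be added and removed to narrow a result set can be modelled as a set of (facet, value) pairs. The sketch below is a hedged illustration rather than the site's code: the facet keys `gold` and `language`, the function name, and the sample entries are all assumptions.

```python
def apply_filters(entries, filters):
    """Keep entries whose facet values include every selected (facet, value) pair."""
    return [
        entry for entry in entries
        if all(value in entry.get(facet, ()) for facet, value in filters)
    ]

# Illustrative sample entries, not taken from LEGO data.
ENTRIES = [
    {"form": "kitabu", "gold": {"Noun"}, "language": {"swh"}},
    {"form": "soma", "gold": {"Verb"}, "language": {"swh"}},
    {"form": "book", "gold": {"Noun", "Verb"}, "language": {"eng"}},
]

filters = set()

filters.add(("gold", "Noun"))            # add a GOLD-concept filter
nouns = [e["form"] for e in apply_filters(ENTRIES, filters)]

filters.add(("language", "swh"))         # narrow further by language code
swh_nouns = [e["form"] for e in apply_filters(ENTRIES, filters)]

filters.discard(("gold", "Noun"))        # remove a filter to widen again
swh_all = [e["form"] for e in apply_filters(ENTRIES, filters)]

print(nouns, swh_nouns, swh_all)
```

Because the active filters are just a set, adding or removing one and re-running `apply_filters` is enough to update the results, which is the behaviour the slide describes as narrowing the search in real time.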

30 Future of LEGO: Immediate and Long-Term Plans

31 2011-2012: Create a lexicon creator log-in; Allow users to edit and add to their data; User-tagging of GOLD concepts; Upload of user’s original lexicon documents; Enhance publicly available datanet of lexicons; Facilitate open participation of linguists; Solicit a large number of new lexicons; Refine the import/export facility; Publicize the site

32 2012 and Beyond: Continue to solicit new data and refine the interface. The more data that’s present on the site, the more useful it will become to semanticists, typologists, lexicographers, translators, and other researchers.

