
1 Overview Ideas Other Stuff
Working Group #6
Participants: Helen Aristar-Dry, Terry Langendoen, Lori Levin, William Lewis, Steve Miller, Thorsten Trippel, Eno-Abasi Urua, Robert Vann
June 20-22, 2006 EMELD 2006 Tools and Standards

2 WG#6 – Resource and Data Discovery
Questions:
  What tools exist for Resource Discovery? Data Discovery?
  What do linguists (want to) look for?
  Do the tools and the needs match?

3 What resources do linguists want?
What linguists want:
  Much more MORE?
  Plenty o PRELDs?
  A QBE 4 IGT?
  A pet horta!

4 What resources do linguists want?
What linguists want:
  Corpora
  Archives
  Linguistic “data”, specific kinds of linguistic data types (e.g., constructions)
  Speakers of languages
  Other linguists who work on specific languages
  Find data that exist only in legacy formats (e.g., not online)
    The “shoebox registry”
  Resources for languages
    If none available, for genetically related languages

5 More things linguists want …
More resources that linguists and communities want (for particular languages):
  Unicode fonts
  Keyboard layouts
  Font converters
  POS taggers, OCR software, etc.

6 Tools for Discovery: Resources
Two classes of tools:
  Tools that do resource search
    OLAC
    Google
    IMDI

7 Tools for Discovery: Data
Two classes of tools:
  Tools that do data (deep) search
    Usually within a resource
    Often idiosyncratic
    Exceptions: Tgrep
  Need more QBE (Query by Example) tools
  Pre-packaged queries
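To make the idea of a QBE tool for interlinear glossed text (IGT) concrete, here is a minimal Python sketch. It is not a tool discussed by the working group: the `IGTRecord` structure, the `matches` and `query_by_example` helpers, and the sample records are all hypothetical placeholders invented for illustration. The user supplies a partial example (a language code and some gloss tags), and the search returns every record that contains everything the example specifies.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class IGTRecord:
    """One interlinear glossed text example (hypothetical structure)."""
    language: str
    words: list[str]
    glosses: list[str]  # one gloss per word, morphemes joined by "-"
    translation: str


def matches(example: dict, record: IGTRecord) -> bool:
    """Return True if the record satisfies every field given in the example."""
    if "language" in example and example["language"] != record.language:
        return False
    # Each requested tag must occur as a morpheme gloss somewhere in the record.
    for tag in example.get("gloss_tags", []):
        if not any(tag in gloss.split("-") for gloss in record.glosses):
            return False
    return True


def query_by_example(example: dict, records: list[IGTRecord]) -> list[IGTRecord]:
    """Return the records that match the partial example, in input order."""
    return [r for r in records if matches(example, r)]


# Invented placeholder data, not real language data.
records = [
    IGTRecord("xx", ["w1", "w2"], ["dog-PL", "run-PST"], "The dogs ran."),
    IGTRecord("xx", ["w3"], ["sleep-PRS"], "It sleeps."),
    IGTRecord("yy", ["w4"], ["eat-PST"], "It ate."),
]

# "Find past-tense examples in language xx."
hits = query_by_example({"language": "xx", "gloss_tags": ["PST"]}, records)
```

A pre-packaged query, in this sketch, would simply be a stored example dictionary such as `{"gloss_tags": ["PST"]}` that a user can run without writing it.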

8 Tools Room additions
Tools room
  Need services listed
  Services to list include
    Aggregators
    Search tools
The Archive/Corpus Mall
  One-stop shop for all your archive and corpus needs
  Ultimately should be OLAC
  (May harvest from Natural Language Software Registry and OLAC)
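As a rough illustration of what an aggregator's harvesting step could look like: OLAC repositories are exposed through the OAI-PMH protocol, so a harvester requests records with the `ListRecords` verb. The Python sketch below only builds the request URL. The helper name `list_records_url` and the base URL `https://archive.example.org/oai` are hypothetical, and the sketch assumes the repository serves OLAC metadata under the `olac` metadata prefix.

```python
from __future__ import annotations

from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "olac",
                     resumption_token: str | None = None) -> str:
    """Build an OAI-PMH ListRecords request URL for one repository."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument: a follow-up
        # request carries only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository address, used only for illustration.
first_request = list_records_url("https://archive.example.org/oai")
next_request = list_records_url("https://archive.example.org/oai",
                                resumption_token="page2")
```

An aggregator would fetch `first_request`, parse the returned XML, and keep issuing follow-up requests for as long as the response includes a resumption token.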

9 OLAC
Mechanisms for requesting changes to OLAC metadata are not publicized well
  There are additions that linguists might want
  Additional metadata for sub-disciplines
Need an OLAC tutorial
  For potential resource providers
  For users looking for resources
Need a service that will register your archive for you (they ask the questions and do all the work)

10 How to Get More (New) Resources and Data?
Resources and Data
  Dissertations
  Publications
  Language data
Incentives to get more linguists to provide resources and data
  $$
  Recognition
  Grades
Archives that warehouse documents and data, even on a small scale (and provide metadata?)
  Ended up dominating our discussions
Simons et al., A model for interoperability. EMELD 2004, Linguistic Databases

11 Language Data
Even if not available to the public, all resources should be archived!!!

