Jan 9, 2004 Symposium on Best Practice LSA, Boston, MA 1 Metadata Helen Aristar Dry Eastern Michigan University LINGUIST List.

1 Metadata
Helen Aristar Dry, Eastern Michigan University, LINGUIST List

2 Outline
What is metadata?
Why use OLAC metadata?
How can you write OLAC metadata for your resources?
- Metadata in XML
- Using ORE

3 Preliminaries
Language documentation is valuable only if it is findable.
On the Internet, this means “findable by computational means.”
Efficient search and retrieval of language resources requires the use of metadata.

4 Metadata is:
Structured data about data
Similar to catalogue information
Usually a set of elements, each of which describes a property of the resource
The elements of a metadata set can be encoded in different “languages,” e.g., HTML, XML, RDF/XML

5 An example
Title: Biao Min Data
Creator (depositor): David Solnit
Subject (linguistic field): Language Description
Subject (language): Biao Min
Date created: April 5, 1982
Description: The Biao Min data on the E-MELD site includes over 3,000 lexical items...

6 Example in HTML
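The Biao Min record could be embedded in an HTML page roughly as follows. This is a minimal sketch, assuming the common Dublin Core convention of `<meta name="DC.element" content="...">` tags; the helper name `dc_meta_tags` and the choice of four elements are illustrative assumptions, not the markup shown in the talk.

```python
from html import escape

# Values come from the Biao Min example record. The "DC.*" names follow the
# usual Dublin Core convention for HTML <meta> tags (assumed here, not taken
# from the original slide).
record = [
    ("DC.title", "Biao Min Data"),
    ("DC.creator", "David Solnit"),
    ("DC.subject", "Biao Min"),
    ("DC.date", "1982-04-05"),  # "April 5, 1982" written as an ISO 8601 date
]

def dc_meta_tags(pairs):
    """Render (name, content) pairs as HTML <meta> tags, one per line."""
    return "\n".join(
        f'<meta name="{escape(name)}" content="{escape(value)}">'
        for name, value in pairs
    )

print(dc_meta_tags(record))
```

The tags would go inside the `<head>` of the page that presents the resource.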

7 Example in XML
Biao Min Data
David Solnit
Biao Min
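Only the element values of this example are preserved here. One way the three values might be wrapped in Dublin Core elements is sketched below. The wrapper element `record`, the function name `build_record`, and the mapping of the third value to `dc:subject` are assumptions made for illustration; the namespace URI is the standard Dublin Core elements namespace.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace
ET.register_namespace("dc", DC_NS)

def build_record(title, creator, subject):
    """Build a small XML record; <record> is a hypothetical wrapper element."""
    root = ET.Element("record")
    for tag, text in (("title", title), ("creator", creator), ("subject", subject)):
        child = ET.SubElement(root, f"{{{DC_NS}}}{tag}")
        child.text = text
    return root

xml_text = ET.tostring(
    build_record("Biao Min Data", "David Solnit", "Biao Min"),
    encoding="unicode",
)
print(xml_text)
```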

8 Metadata
Different metadata specifications: MARC, METS, Dublin Core, IMDI, OLAC
IMDI and OLAC were designed specifically for language documentation

9 OLAC Metadata
Product of the Open Language Archives Community: http://www.language-archives.org/
Strengths:
- Ease of creation
- Search & retrieval via the protocols of the Open Archives Initiative

10 Open Archives Initiative
Cross-disciplinary initiative for search and retrieval of metadata from multiple archives
Establishes protocols for “harvesting” metadata records of participating archives and making them available via “Service Providers”
Supports formation of discipline-specific sub-communities such as OLAC (Open Language Archives Community)
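Harvesting under the OAI protocol (OAI-PMH) is done with plain HTTP requests. The sketch below only builds the request URL for the `ListRecords` verb with the `oai_dc` metadata format; it does not contact any server. The base URL `http://archive.example.org/oai` and the function name `list_records_url` are hypothetical.

```python
from urllib.parse import urlencode

def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository base URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # optional: restrict the harvest to one set
    return f"{base_url}?{urlencode(params)}"

# Hypothetical archive endpoint, used only to show the shape of the request.
url = list_records_url("http://archive.example.org/oai")
print(url)
```

A service provider would issue such a request to each participating archive and index the records returned.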

11 LINGUIST List = OLAC Gateway
LINGUIST List is the main service provider for OLAC
Harvests metadata from 27 major archives
Collects metadata from individual linguists about their language documentation
Offers a search interface for over 30,000 records of language-related data
See: http://linguistlist.org/olac/

12 OLAC Metadata
OAI uses the Dublin Core (DC) metadata standard
- 15 elements (each optional & repeatable)
- Core vocabulary for refining elements (dcterms)
Sub-communities may qualify DC metadata to suit their specific needs
OLAC has qualified DC metadata to better describe language resources

13 OLAC Qualifies 5 of the 15 DC Elements
Qualified: Language, Subject, Type, Contributor, Creator
Unqualified: Publisher, Relation, Rights, Source, Title, Coverage, Date, Description, Format, Identifier
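Because every Dublin Core element is optional and repeatable, a record can be modeled as a mapping from element name to a list of values. A minimal sketch of checking a record against the fifteen element names follows; the function name `unknown_elements` and the dictionary layout are assumptions for illustration.

```python
# The fifteen Dublin Core element names, in lowercase.
DC_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})

def unknown_elements(record):
    """Return, sorted, the element names in `record` that are not Dublin Core."""
    return sorted(name for name in record if name not in DC_ELEMENTS)

# Each element maps to a list because elements are repeatable; elements that
# are absent are simply left out because all of them are optional.
record = {
    "title": ["Biao Min Data"],
    "creator": ["David Solnit"],
    "subject": ["Biao Min", "Language Description"],  # a repeated element
}

print(unknown_elements(record))
```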

14 OLAC recommends 5 extensions:
- Language: OLAC Language
- Subject: OLAC Language, Linguistic Field
- Type: Linguistic Data Type, Discourse Type
- Contributor: Role
- Creator: Role

15 Participant Role
Provides a controlled vocabulary for identifying the role of a Creator or Contributor more precisely.
The vocabulary identifies approximately twenty roles that are common in the development of language resources.
Examples: depositor, signer, transcriber, respondent, editor, consultant, researcher
Documentation: http://www.language-archives.org/REC/role.html
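A role-qualified Creator might be serialized along the following lines. This is a sketch under stated assumptions: the `xsi:type="olac:role"` plus `olac:code` pattern and the OLAC 1.0 namespace URI are my reading of the OLAC metadata format and should be checked against the role documentation; `EXAMPLE_ROLES` contains only the seven examples named on this slide, not the full vocabulary; `creator_with_role` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
OLAC_NS = "http://www.language-archives.org/OLAC/1.0/"  # assumed OLAC 1.0 namespace
for prefix, uri in (("dc", DC_NS), ("xsi", XSI_NS), ("olac", OLAC_NS)):
    ET.register_namespace(prefix, uri)

# Only the example roles from the slide; the real vocabulary is larger.
EXAMPLE_ROLES = {
    "depositor", "signer", "transcriber", "respondent",
    "editor", "consultant", "researcher",
}

def creator_with_role(name, role):
    """Build a dc:creator element qualified with an OLAC role code."""
    if role not in EXAMPLE_ROLES:
        raise ValueError(f"role not in the example vocabulary: {role!r}")
    elem = ET.Element(
        f"{{{DC_NS}}}creator",
        {f"{{{XSI_NS}}}type": "olac:role", f"{{{OLAC_NS}}}code": role},
    )
    elem.text = name
    return elem

xml_text = ET.tostring(creator_with_role("David Solnit", "depositor"), encoding="unicode")
print(xml_text)
```

Rejecting codes outside the vocabulary is the point of a controlled vocabulary: a search service can then match roles exactly rather than guessing at free-text labels.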

16 Language Identification
Provides codes for identifying all known languages, both living and extinct.
Applies to: Language, Subject

17 Linguistic Field
Provides codes for identifying the content of a resource as relevant to a particular subfield of linguistic science
Applies to: Subject
Examples: anthropological_linguistics, applied_linguistics, cognitive_science, computational_linguistics, lexicography, discourse_analysis

18 Linguistic Data Type
Describes the resource as representing a recognized structural type of linguistic information
Applies to: Type
Examples:
- Lexicon
- Primary text
- Language description
- Dataset (already in DCterms)

19 Discourse Type
Provides a controlled vocabulary for identifying approximately ten discourse types.
It is used with Type to identify the genre of a language resource (particularly a primary text).
Types: Interactive Discourse, Report, Singing, Oratory, Narrative, Formulaic Discourse, Procedural Discourse, Language Play, Unintelligible Speech
Documentation: http://www.language-archives.org/REC/discourse.html

20 Writing metadata
See “metadata” in the E-MELD School of Best Practices: http://emeld.org/school/classroom/metadata
Or use the OLAC Repository Editor: http://linguistlist.org/ore/

