Federal Controlled Vocabularies Data Architecture Sub-Committee (DAS) April 8, 2010 Brand K. Niemann.


1 Federal Controlled Vocabularies Data Architecture Sub-Committee (DAS) April 8, 2010 Brand K. Niemann

2 Federal Controlled Vocabularies What Are They Examples Discussion

3 Why a Controlled Vocabulary?
– Improve the effectiveness of information storage and retrieval systems.
– Knowledge workers spend 25-35% of their time searching for information, with 50% success. [1]
– The need for vocabulary control arises from two basic features of natural language:
   – Two or more words or terms can be used to represent a single concept. Examples: salinity/saltiness; VHF/Very High Frequency.
   – Two or more words that have the same spelling can represent different concepts. Examples: Mercury (planet), Mercury (metal), Mercury (automobile), Mercury (mythical being).
Tutorial: http://www.slis.kent.edu/~mzeng/Z3919/1need.htm
[1] Working Council of CIOs, Business Wire, Feb 27 2001
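The two language features named on this slide can be sketched in a few lines of Python. This is only an illustration built from the slide's own examples; the names `SYNONYM_RINGS`, `HOMOGRAPHS`, `expand`, and `qualify` are hypothetical and do not come from the presentation or from any standard.

```python
# Hypothetical mini-vocabulary built from the slide's examples.
# Synonyms: several terms, one concept.
SYNONYM_RINGS = [
    {"salinity", "saltiness"},
    {"vhf", "very high frequency"},
]

# Homographs: one spelling, several concepts, separated by a qualifier.
HOMOGRAPHS = {
    "mercury": ["planet", "metal", "automobile", "mythical being"],
}


def expand(term):
    """Return every term in the same synonym ring as `term`, sorted."""
    key = term.strip().lower()
    for ring in SYNONYM_RINGS:
        if key in ring:
            return sorted(ring)
    return [key]  # unknown terms stand alone


def qualify(term):
    """Return qualified forms such as 'mercury (planet)' for a homograph."""
    key = term.strip().lower()
    qualifiers = HOMOGRAPHS.get(key)
    if not qualifiers:
        return [key]  # not a known homograph
    return [f"{key} ({q})" for q in qualifiers]
```

A search front end could call `expand` to widen a query across synonyms and `qualify` to ask the user which sense of an ambiguous term is meant.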

4 Controlled Vocabulary
Types, in order of increasing structural and semantic complexity:
– List: set of terms arranged in a logical way
– Synonym Ring: + words with the same meaning in a given context
– Authority File: + Preferred Terms (USE)
– Taxonomy: + Broader (BT) and Narrower Terms (NT) {BT, NT, USE}
– Thesaurus: + Related Terms (RT)
Dimension and Context
Why and when to use: http://www.slis.kent.edu/~mzeng/Z3919/6pro&con.htm
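One way to read the "+" signs on this slide is that each vocabulary type keeps the relationships of the simpler types and adds its own. The sketch below encodes that reading; the `LADDER` structure and the `relationships_at` function are hypothetical names chosen for illustration, and the cumulative interpretation is an assumption rather than something the slide states outright.

```python
# Each rung of the slide's progression and the relationship it adds.
# Labels come from the slide; the cumulative reading is an assumption.
LADDER = [
    ("List", []),
    ("Synonym Ring", ["synonym"]),
    ("Authority File", ["USE"]),
    ("Taxonomy", ["BT", "NT"]),
    ("Thesaurus", ["RT"]),
]


def relationships_at(level):
    """Return all relationship types available at `level`, cumulatively."""
    available = []
    for name, added in LADDER:
        available.extend(added)
        if name == level:
            return available
    raise ValueError(f"unknown vocabulary type: {level}")
```

Under this reading, `relationships_at("Taxonomy")` yields the synonym, USE, BT, and NT relationships, while a plain list has none.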

5 Controlled Vocabulary: Dimension and Context
Types, in order of increasing structural and semantic complexity:
– List: set of terms arranged in a logical way
– Synonym Ring: + words with the same meaning in a given context
– Authority File: + Preferred Terms (USE)
– Taxonomy: + Broader (BT) and Narrower Terms (NT) {BT, NT, USE}
– Thesaurus: + Related Terms (RT)
Dimension and Context (not a definitive list):
– Organization: human resources, marketing, accounting, etc.
– Function/Type: employment, staffing, training, etc.
– Subject: water pollution, soil pollution, air pollution, etc.
– Identify a document or database for a data catalog (data.gov, data.gov.uk, etc.): consistent vocabulary for describing a database or document; dcat and related, Dublin Core, SKOS, FOAF [1]
– Identify a data item: Vehicle Identification Number (VIN), Uniform Resource Identifier (URI)
– Identify a data element: Patient Person First Name; ISO/IEC 11179-5/UDEF
– Relate a resource
– Relate a vocabulary
[1] http://richard.cyganiak.de/2010/03/dcat-for-egov-ig.pdf

6 Controlled Vocabulary Examples
Agency, context, and dimension:
– DOD, Center for Army Lessons Learned. Intended purpose: organization of equipment supporting the business. Dimension: Function (also by Type)
– NASA, NASA Thesaurus. Intended purpose: organization of equipment supporting the business. Dimension: Type
– EPA, Data Classes and Areas. Intended purpose: organization of subject areas supporting the business. Dimension: Subject
– IRS, IRS Tax Map. Intended purpose: organization of topics for answering questions. Dimension: Subject
Synonyms and Word Equivalents:
– Radio - Radio Detection and Ranging
– Telescope - scope
– Manned Lunar Space Vehicle - Apollo 11 Mission
– Waste - Run-off
– Amended Tax Return - 1040X
Authority File:
– Radio Detection Finding (USE) Radio
– Scope (USE) Telescope
– Run-off (USE) Waste
– Employment Income (USE) Wages and Salary
Taxonomy: + Broader (BT) and Narrower Terms (NT) {BT, NT, USE, UF}
– (BT) Radar (by function): (NT) aircraft radars; (NT) airport radar systems; (NT) Ground Based Radar; (NT) imaging radar; (NT) meteorological radar; (NT) missile site radar; (NT) search radar; (NT) terrain analysis radar
– (BT) Instruments: (NT) Accelerometers; (NT) Acoustic Sensors; (NT) etc.; (NT) Telescopes; (NT) Optical telescopes; (NT) Radio telescopes
– (BT) Substances: (NT) Chemicals; (NT) Biological; (NT) Contaminants; (NT) Wastes; (NT) Radiation; (NT) Commercial Products
– (BT) Tax Topics: (NT) IRS Help; (NT) IRS Procedures; (NT) Collection; (NT) Alternative Filing Methods; (NT) General Information; (NT) Which Forms to Use
Thesaurus: + Related Terms (RT)
– (BT) Radar: (RT) AN/MPQ-65; (RT) AN/MPQ-65 Radar set; (RT) navigation; (RT) instruments; (RT) noise (radar); (RT) radar scattering
– (BT) Radio Telescopes: (RT) Microwaves (High Energy Radio Telescope)
– (BT) Wastes: (RT) Garbage; (RT) Refuse; (RT) Biosolids; (RT) Pollution Control Facilities
– (BT) Itemized Deductions: (NT) Should I Itemize?; (ET) Publication 501; (RT) Tax Topic 551
– Publication 501: (ET) Exemptions, Standard Deduction, and Filing Information
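The USE, BT/NT, and RT relationships in these examples can be held in a small in-memory structure. The following is a minimal sketch, not the way any of the agency vocabularies is actually stored: the `Thesaurus` class and its methods are hypothetical, the preferred term is normalized to the plural "Telescopes" (the slide's authority-file entry uses the singular), and placing the optical and radio telescopes under "Telescopes" is an assumed nesting.

```python
from collections import defaultdict


class Thesaurus:
    """Tiny in-memory thesaurus with USE, BT/NT and RT relationships."""

    def __init__(self):
        self.use = {}                     # non-preferred term -> preferred term
        self.narrower = defaultdict(set)  # broader term -> its narrower terms
        self.broader = defaultdict(set)   # narrower term -> its broader terms
        self.related = defaultdict(set)   # symmetric related-term links

    def add_use(self, entry, preferred):
        self.use[entry.lower()] = preferred.lower()

    def add_bt_nt(self, broader, narrower):
        self.narrower[broader.lower()].add(narrower.lower())
        self.broader[narrower.lower()].add(broader.lower())

    def add_rt(self, a, b):
        self.related[a.lower()].add(b.lower())
        self.related[b.lower()].add(a.lower())

    def preferred(self, term):
        """Follow a USE reference if one exists, else return the term."""
        return self.use.get(term.lower(), term.lower())

    def expand(self, term):
        """Preferred term plus all transitive narrower terms, sorted."""
        seen, stack = set(), [self.preferred(term)]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(self.narrower.get(current, ()))
        return sorted(seen)


# Terms taken from the slide; the nesting under "Telescopes" is assumed.
t = Thesaurus()
t.add_use("Scope", "Telescopes")
t.add_bt_nt("Instruments", "Telescopes")
t.add_bt_nt("Telescopes", "Optical telescopes")
t.add_bt_nt("Telescopes", "Radio telescopes")
t.add_rt("Radio telescopes", "Microwaves")

print(t.expand("Scope"))
```

Expanding a query through USE and NT links in this way is one common use of a thesaurus in retrieval; RT links are kept separate here because they suggest neighbouring topics rather than narrower ones.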

7 Discussion Topics and General Considerations
1. Sources for Federal Controlled Vocabularies: considerations
2. Relating vocabularies across domains: considerations
   – Move from levels of concreteness to abstractness
   – Understand similarities and differences between domains
   – Require consistency
3. Your input
Reference: Language Universals and Linguistic Typology, Comrie, 1989 (survey of world languages for comparison and classification)

8 Resources
– Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies (slide 9)
– Related Efforts (slide 10)
– Federal CV Efforts (slide 11)
– Display Types (slide 12)
– Automated Example (slide 13)
– Ontology Spectrum (slide 14)
– Sample Tools (slide 15)

9 Guidelines
ANSI/NISO Z39.19 - Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies
– http://www.niso.org/kst/reports/standards?step=2&gid=&project_key=7cc9b583cb5a62e8c15d3099e0bb46bbae9cf38a

10 Related Efforts
– Universal Data Element Framework (UDEF): controlled vocabulary for naming data elements, based on ISO/IEC 11179-5. http://www.opengroup.org/udef/
– Digital Enterprise Research Institute (DERI) data catalog (dcat) vocabulary: RDF vocabulary for the exchange of data catalogs, such as data.gov and data.gov.uk (early draft). http://vocab.deri.ie/dcat
– Universal Core (UCORE): agreed-upon representations for the most commonly shared and understood elements. https://ucore.gov/ucore/
– NIEM IEPD: agreed-upon exchange for an area of shared interest. http://www.niem.gov/
– etc.

11 Federal CV Efforts
– USAF Vocabulary OneSource: https://gcic.af.mil/OneSource/welcome.aspx
– CENDI September 11, 2008 Workshop, New Dimensions in Knowledge Organization Systems: http://cendiwiki.wik.is/2008_September_11
– SKOS for the DoD Metadata Registry: https://metadata.dod.mil/mdr/documents/DoDMWG/2010/01/2010-01-13_SKOS.ppt
– Taxonomy Tuesday: http://semanticommunity.wik.is/Taxonomy_Tuesday
– VoCampDCMay 2009: http://vocamp.org/wiki/VoCampDCMay2009
– etc.

12 Display Types More Types: http://www.slis.kent.edu/~mzeng/Z3919/53display.htm

13 Automated Example

14 Controlled Vocabulary Courtesy of Leo Obrst, Mitre Corporation

15 Sample Tools

