Controlled Vocabulary Working Group - 2013 PRESENTED BY JOHN PORTER.



Goal
- Make it easy for researchers to find the data they need from LTER repositories by:
  - Enhancing searches through the use of a thesaurus that provides synonyms, narrower terms and related terms
  - Creating a browseable structure for locating datasets
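The thesaurus-enhanced search described in this goal can be sketched as simple query expansion. This is a minimal illustration, not the working group's implementation: the `Term` class, the function names and the sample synonym and related term are all hypothetical, and only one level of narrower terms is followed.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Term:
    """One preferred term with its thesaurus relationships (hypothetical structure)."""
    preferred: str
    synonyms: set[str] = field(default_factory=set)   # non-preferred "use for" (UF) terms
    narrower: set[str] = field(default_factory=set)   # more specific preferred terms
    related: set[str] = field(default_factory=set)    # associated terms


def build_index(terms: list[Term]) -> dict[str, Term]:
    """Map every preferred term and synonym (lower-cased) to its Term record."""
    index: dict[str, Term] = {}
    for term in terms:
        index[term.preferred.lower()] = term
        for synonym in term.synonyms:
            index[synonym.lower()] = term
    return index


def expand_query(query: str, index: dict[str, Term], include_related: bool = False) -> list[str]:
    """Return the search strings to run for a user query.

    Unknown queries are passed through unchanged; known ones are expanded
    with the preferred term, its synonyms and its direct narrower terms.
    """
    cleaned = query.strip()
    term = index.get(cleaned.lower())
    if term is None:
        return [cleaned]
    expanded = {term.preferred} | term.synonyms | term.narrower
    if include_related:
        expanded |= term.related
    return sorted(expanded)


# Illustrative data only: the synonym and related term are assumptions, not vocabulary entries.
sample_terms = [
    Term(
        "biomass",
        synonyms={"standing crop"},
        narrower={"aboveground biomass"},
        related={"primary production"},
    )
]
sample_index = build_index(sample_terms)
```

A search for a synonym such as "standing crop" would then also retrieve datasets tagged with the preferred term or with a narrower term.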

2013 Goals
- Enhance term list to incorporate:
  - New terms suggested by sites
  - Frequently searched terms
  - Frequently used terms
  - Terms related to human activities (social science)
  - More synonyms for existing terms that are found in LTER Metadata
- Needed: Establish clear criteria for evaluating candidate terms
  - Best Practices

Goals
- Add definitions for terms in the Controlled Vocabulary
- Create plans for dealing with taxonomic names and places, which are not currently part of the Controlled Vocabulary

Workshop – May 2013
- Pre-Workshop
  - Queried LTER Sites for new candidate terms – Melendez, Henshaw, Vanderbilt
  - Queried existing documents for words not currently in the Controlled Vocabulary – Gastil-Buhl
  - Queried logs for search terms used by Metacat users – Costa
  - Updated Tematres software to the latest version – Porter
  - Identified online sources for definitions – O’Brien, Vanderbilt
  - Investigated taxonomic web services and gazetteers – Gries
    - Note: the group favors using Taxonomic and Geographic Coverage elements rather than keywords for these elements

Workshop Participants 2013
- LTER Information Managers
  - Margaret O’Brien, Kristen Vanderbilt, Donald Henshaw and John Porter
- Professional Librarians from UVA
  - Sherry Lake and Ivey Glendon
  - Added a lot to our discussions
    - “about” vs. “contains” taxonomies
    - our focus is describing what datasets contain
    - “about” is much harder to define for data

Workshop Results 2013
- New Terms
  - ~230 terms were suggested by 4 sites
  - ~75 terms were accepted and added to the LTER Vocabulary
  - A reason for rejection was given for each term not added
  - ~25 additional terms were added based on use at 3 or more LTER Sites, or at 2 or more sites with > 10 datasets
  - Several suggested terms were added as non-preferred (UF) terms
- Definitions
  - 309 new definitions added
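The usage rule behind the ~25 additional terms (3 or more sites, or 2 or more sites with > 10 datasets) can be written as a small check. This is a sketch under one stated assumption: the slide does not say whether "> 10 datasets" is counted per site or in total, so the code below assumes the total across all sites. The function name and site labels are hypothetical.

```python
def meets_usage_threshold(datasets_per_site: dict[str, int]) -> bool:
    """Return True if a keyword is used widely enough to be added on usage alone.

    datasets_per_site maps a site label to the number of that site's datasets
    tagged with the keyword.

    ASSUMPTION: "> 10 datasets" is read as the total across all sites.
    """
    sites_using = [site for site, count in datasets_per_site.items() if count > 0]
    total_datasets = sum(count for count in datasets_per_site.values() if count > 0)
    used_at_three_sites = len(sites_using) >= 3
    used_at_two_sites_heavily = len(sites_using) >= 2 and total_datasets > 10
    return used_at_three_sites or used_at_two_sites_heavily


# Illustrative usage with placeholder site labels.
candidate_usage = {"SITE_A": 6, "SITE_B": 5}
print(meets_usage_threshold(candidate_usage))
```

If the per-site reading is the intended one, the second condition would instead count the sites whose own total exceeds 10.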

Controlled Vocabulary Status
- 710 total preferred terms
- 200 synonyms (“use for” terms)
- 363 total definitions

Important Workshop Activities
- Developed improved Best Practices for identifying additional terms for inclusion
  - Including a table that lays out grounds for rejecting particular words

| What | Rationale | Do’s | Problem Abbreviation |
|---|---|---|---|
| Keywords should be applied to a number of datasets across the LTER Network. | Data discovery is the goal, so keywords that find data are most useful. | Propose keywords that are used at several other sites and in numerous datasets. | NR - not repeated in multiple datasets |
| Keywords should be used at more than one site. | A goal is to enable cross-site searching. | Propose keywords that are used at several other sites. | A - absent from other sites |
| Avoid proposing stand-alone adjectives. | Stand-alone adjectives imply an “of what” question. For example, “aboveground” raises the question “aboveground what?” | Propose nouns or possibly verbs, but not stand-alone adjectives. Preferred terms can include an adjective with an object (e.g., aboveground biomass). | ADJ - stand-alone adjective |
| Be specific. | Vague or ill-defined terms are hard to assign consistently. | Use specific, unambiguous and well-defined terms. | V - vague |
| Avoid duplicating concepts already in the Controlled Vocabulary. | Duplicative keywords lead to inconsistent keyword assignments. | Avoid duplication of nearly-equivalent terms. | AWE - adequate alternative word exists |
| Keywords should be well-defined. | Without definition and context, some technical terms may be difficult to assess or place. | Provide good definitions. | NC - needs clarification or better definition |
| Proposed synonyms should have exact correspondence to the preferred term. | Synonyms should not refer to different concepts than the associated preferred term. | Select synonyms that are exact matches for the concept described by the preferred term. | NS - not a synonym |
| Keywords should be terms that users frequently search on. | Keywords that are not searched for by users are not particularly useful. | Propose keywords that are frequently used in searches. | NU - not used for search |

Vision
- Refining the “Vision” for how the controlled vocabulary can be used to make PASTA and other NIS elements more effective
  - And link to other efforts such as DataONE, LODE and EnvThes
- Optional workshop yesterday – tasks identified:
  - Identify systems and software tools that effectively exploit controlled vocabularies for searching/browsing and ranking
  - Metrics tools: help identify specific datasets that could benefit from additional keywords
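One way a metrics tool of the kind listed above might flag datasets is to count how many of each dataset's keywords match the Controlled Vocabulary. The sketch below is only an illustration of that idea: the function name, the `min_matches` cutoff and the dataset identifiers are assumptions, and it does not read EML or call any LTER service.

```python
def datasets_needing_keywords(
    dataset_keywords: dict[str, list[str]],
    vocabulary: list[str],
    min_matches: int = 1,
) -> list[tuple[str, int]]:
    """Return (dataset_id, match_count) pairs for datasets below the cutoff.

    A keyword counts as a match when it equals a vocabulary term after
    trimming whitespace and lower-casing. Results are sorted with the
    fewest matches first, then by dataset id.
    """
    vocab = {term.strip().lower() for term in vocabulary}
    flagged = []
    for dataset_id, keywords in dataset_keywords.items():
        normalized = {keyword.strip().lower() for keyword in keywords}
        match_count = len(normalized & vocab)
        if match_count < min_matches:
            flagged.append((dataset_id, match_count))
    return sorted(flagged, key=lambda item: (item[1], item[0]))


# Illustrative data only; identifiers and keywords are placeholders.
sample_keywords = {
    "dataset-1": ["biomass", "Aboveground Biomass"],
    "dataset-2": ["my plot 7"],
    "dataset-3": [],
}
sample_vocabulary = ["biomass", "aboveground biomass"]
```

Matching against synonyms as well as preferred terms would be a natural extension, since a dataset tagged only with a non-preferred term is still discoverable through the thesaurus.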

Help us out!
- During discussions today and tomorrow, think about how the Controlled Vocabulary can be leveraged
- Incorporate terms from the Controlled Vocabulary into your site EML documents
- ASK us if you need help! We have tools