Controlled Vocabulary & Thesaurus Design Planning & Maintenance.

Developed by the Association for Library Collections & Technical Services and the Library of Congress’s Cataloger’s Learning Workshop

Controlled Vocabulary Review
• What?
  ◦ What controlled vocabulary is right for you?
• When?
  ◦ When should the CV be developed and implemented?
• Why?
  ◦ Why is this CV a necessary development project?
• How?
  ◦ How is the CV going to be developed?

Thesaurus Design Questions
• Is a controlled vocabulary really necessary?
• What is the lowest level of vocabulary that will get the job done?
• Will natural language searching be sufficient?
• Will an interface design improvement alleviate the need for a controlled vocabulary?
• Will there be more than one indexer?
• Is someone available with the time and the skills to develop a thesaurus?
• Will someone be available in the future?

Project Justification
• Cost of finding (time, frustration)
• Cost of not finding (bad decisions)
• Cost of training (staff turnover)
• Value of discovery (related information, browsing)
• Language is ambiguous: synonyms, abbreviations, acronyms, misspellings, homonyms, antonyms, etc.

Project Justification
• What are the specific objectives of the project?
• Are essential objects hidden in a lot of chaff?
• Are a few good objects sufficient? Or is it necessary to find the best, the one that makes a difference, or everything on a topic?
• Use easily understood terms like “common vocabularies” rather than technical terms like “taxonomies”
• Stories tell it best.

Project Justification
• “Users of … intranets frequently express frustration with how much time it takes to find items—both when searching for known items and when browsing to see if items on a particular topic exist in the system... Browsing and search functions are much enhanced if the indexing and topic hierarchy, or taxonomy, make sense to the user and are customized to reflect the content of the source documents.”
  Jan Sykes, Information Management Services, February 2001

Project Justification
• “Power users find great value in using a known, granular indexing language that can surface the most relevant items and filter out items of peripheral or no interest.”
  Jan Sykes, Information Management Services, February 2001
• “Keyword search captures only 33% of relevant information.”
  Chris Wilkie, BBC Information and Archives, Sept. 2002

Project Justification
• “Most of the complaints we get are due to the way users search – they use the wrong keywords.”
  Must Search Stink?, Forrester, 2000
• “40% of search failures come from customers and information providers using different terms.”
  The Business Benefits of Taxonomy, Judi Vernau, SchemaLogic, Oct. 2005

Project Justification
• “Knowledge workers spend 35% of their productive time searching for information online, while 40% of the corporate users report that they cannot find the information they need to do their jobs.”
  Working Council of CIOs, Business Week, Feb. 27, 2001

Process of CV Design
• Understand user and organizational needs
• Define the subject scope
• Identify sources of ‘raw’ vocabulary
• Harvest terms (wordstock) that are likely to be search terms in the field
• Group the terms into broad categories, subcategories and sub-subcategories
• Establish relationships
• Collect feedback and revise until stable

Involving Users
• More user involvement = better suited to use
• Take every opportunity to involve users
• Start from user search logs to find commonly used terms
• User experience focus groups
• Prototyping
• Solicit community feedback
• Online discussion groups
• Surveys
• Observation
• Term submissions
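
Starting from search logs can be sketched in a few lines of Python. This is only an illustration: the log format (one query per entry), the stop-word list, and the function name `candidate_terms` are assumptions, not part of the workshop material.

```python
from collections import Counter

# Hypothetical stop-word list; a real project would tune this to its domain.
STOP_WORDS = {"the", "of", "and", "a", "in", "for", "to"}

def candidate_terms(queries, min_count=2):
    """Count whole queries and their individual words.

    Returns (term, count) pairs seen at least min_count times,
    most frequent first.
    """
    counts = Counter()
    for query in queries:
        # Lowercase and collapse repeated whitespace so variants line up.
        normalized = " ".join(query.lower().split())
        if not normalized:
            continue
        counts[normalized] += 1
        words = normalized.split()
        if len(words) > 1:
            for word in words:
                if word not in STOP_WORDS:
                    counts[word] += 1
    return [(term, n) for term, n in counts.most_common() if n >= min_count]

# Invented sample log for illustration only.
sample_log = [
    "Economic cooperation",
    "economic co-operation",
    "economic  cooperation",
    "industrial cooperation",
    "trade policy",
]

for term, n in candidate_terms(sample_log):
    print(f"{n:3d}  {term}")
```

Spelling variants such as "co-operation" are deliberately left uncounted together here; spotting them is part of the synonym work that follows term harvesting.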

Interoperability
• Searchers want to search multiple databases at once
• Indexers want to use a vocabulary they are familiar with to index objects in a different domain
• Content producers want to merge multiple databases indexed using different vocabularies
• User communities want a single thesaurus that spans multiple domains
• International organizations want a single vocabulary that supports searching in multiple languages

Thesauri can differ in:
• Specificity
• Treatment of synonyms
• Pre- vs. post-coordination
• Relationships
• Warrant
• Scope

Methods of Integration
• Mapping
• Switching language
• Integration
  ◦ Unified Medical Language System’s (UMLS) 3 main components:
    Metathesaurus → concepts
    Semantic Network → categories
    SPECIALIST Lexicon → indices
• Super-language
• Merging
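
The mapping and switching-language approaches can be illustrated with a small Python sketch. Everything in it is hypothetical: the two vocabularies, their terms, and the concept identifiers are invented for the example and do not come from UMLS or any real thesaurus.

```python
# Hypothetical switching language: each vocabulary maps its terms to neutral
# concept identifiers, and translation goes term -> concept -> term.
VOCAB_A_TO_CONCEPT = {
    "Heart attack": "C001",
    "Cancer": "C002",
}
VOCAB_B_TO_CONCEPT = {
    "Myocardial infarction": "C001",
    "Neoplasms": "C002",
    "Hypertension": "C003",
}

def invert(term_to_concept):
    """Build a concept -> terms index for one vocabulary."""
    index = {}
    for term, concept in term_to_concept.items():
        index.setdefault(concept, []).append(term)
    return index

def translate(term, source, target_index):
    """Return the target vocabulary's terms for a source term, or [] if unmapped."""
    concept = source.get(term)
    if concept is None:
        return []
    return target_index.get(concept, [])

b_index = invert(VOCAB_B_TO_CONCEPT)
print(translate("Heart attack", VOCAB_A_TO_CONCEPT, b_index))
```

The design trade-off: with a switching language, each vocabulary needs only one mapping to the hub, whereas direct mapping needs a separate mapping for every pair of vocabularies.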

Simple Knowledge Organization System

Term: Economic cooperation
UF:   Economic co-operation
BT:   Economic policy
NT:   Economic integration
      European economic cooperation
      European industrial cooperation
      Industrial cooperation
RT:   Interdependence
SN:   Includes cooperative measures in banking, trade, industry etc., between and among countries.
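
As a rough sketch of how a thesaurus entry like this one might be expressed in SKOS, the Python below emits Turtle by hand. The correspondence used (UF to skos:altLabel, BT to skos:broader, NT to skos:narrower, RT to skos:related, SN to skos:scopeNote) follows the usual reading of the SKOS vocabulary; the `ex:` namespace, the `Concept` class, and the `slug` naming rule are assumptions made for illustration.

```python
from dataclasses import dataclass, field

PREFIXES = (
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
    "@prefix ex: <http://example.org/thesaurus/> .\n"  # placeholder namespace
)

@dataclass
class Concept:
    pref_label: str
    alt_labels: list = field(default_factory=list)  # UF (used for)
    broader: list = field(default_factory=list)     # BT (broader term)
    narrower: list = field(default_factory=list)    # NT (narrower term)
    related: list = field(default_factory=list)     # RT (related term)
    scope_note: str = ""                            # SN (scope note)

def slug(label):
    """Turn a label into a hypothetical local name such as ex:EconomicPolicy."""
    words = label.replace("-", " ").split()
    return "ex:" + "".join(word.capitalize() for word in words)

def to_turtle(concept):
    """Serialize one concept as a Turtle statement.

    Labels are assumed to contain no double quotes or backslashes;
    no escaping is attempted in this sketch.
    """
    statements = ["a skos:Concept", f'skos:prefLabel "{concept.pref_label}"@en']
    statements += [f'skos:altLabel "{label}"@en' for label in concept.alt_labels]
    statements += [f"skos:broader {slug(term)}" for term in concept.broader]
    statements += [f"skos:narrower {slug(term)}" for term in concept.narrower]
    statements += [f"skos:related {slug(term)}" for term in concept.related]
    if concept.scope_note:
        statements.append(f'skos:scopeNote "{concept.scope_note}"@en')
    return slug(concept.pref_label) + "\n    " + " ;\n    ".join(statements) + " ."

entry = Concept(
    pref_label="Economic cooperation",
    alt_labels=["Economic co-operation"],
    broader=["Economic policy"],
    narrower=[
        "Economic integration",
        "European economic cooperation",
        "European industrial cooperation",
        "Industrial cooperation",
    ],
    related=["Interdependence"],
    scope_note=(
        "Includes cooperative measures in banking, trade, industry etc., "
        "between and among countries."
    ),
)

print(PREFIXES)
print(to_turtle(entry))
```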

SKOS

Testing & Evaluation: Methods
• Heuristic Evaluation
  ◦ Evaluation by an expert or a panel of experts
• Affinity Modeling
  ◦ Task a sample of users with organizing your terms
  ◦ Compare to your own organization of the terms
• Usability Testing
  ◦ Holistic evaluation of the information system, including the content, interface, etc.
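
The comparison step in affinity modeling can be quantified in several ways. One simple option, sketched below in Python, is pairwise agreement (the Rand index): for every pair of terms, check whether the designer and the participant agree on putting the two terms together or apart. The choice of this measure, the function names, and the sample groupings are all assumptions for illustration; the workshop material does not prescribe a scoring method.

```python
from itertools import combinations

def group_of(grouping):
    """Map each term to the name of the group it was placed in."""
    return {term: name for name, terms in grouping.items() for term in terms}

def pairwise_agreement(designer, participant):
    """Share of term pairs on which both groupings agree (together vs. apart).

    Only terms present in both groupings are compared.
    Returns 0.0 when fewer than two terms are shared.
    """
    a, b = group_of(designer), group_of(participant)
    shared = sorted(set(a) & set(b))
    pairs = list(combinations(shared, 2))
    if not pairs:
        return 0.0
    agree = sum((a[x] == a[y]) == (b[x] == b[y]) for x, y in pairs)
    return agree / len(pairs)

# Invented groupings for illustration only.
designer = {
    "Economic policy": ["Economic cooperation", "Economic integration"],
    "Industry": ["Industrial cooperation", "Manufacturing"],
}
participant = {
    "Cooperation": ["Economic cooperation", "Industrial cooperation"],
    "Other": ["Economic integration", "Manufacturing"],
}

print(f"agreement: {pairwise_agreement(designer, participant):.2f}")
```

A low score across many participants suggests the hierarchy reflects the designer's mental model more than the users'. Group names are ignored on purpose, since participants rarely label groups the way the designer does.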

Testing & Evaluation: Discussion
• Why test a controlled vocabulary?
• What are some useful criteria for evaluating a controlled vocabulary?

Upkeep & Maintenance
• Controlled vocabularies are living entities that need:
  ◦ New material added
  ◦ Outdated material removed
  ◦ Changes made
• Requires a long-term maintenance plan
  ◦ Institutional support and resources
  ◦ Someone who is a maintainer
• Look to your users for input!
  ◦ Term submissions
  ◦ Search logs
• Anticipate change!
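
As one possible maintenance aid, search logs can be checked against the vocabulary to surface frequent queries that match no existing term. The Python sketch below assumes a vocabulary stored as a mapping from each preferred term to its variant (entry) terms; that structure, the threshold, and the sample data are hypothetical.

```python
from collections import Counter

def normalize(text):
    """Lowercase and collapse whitespace so trivially different queries match."""
    return " ".join(text.lower().split())

def unmatched_queries(queries, vocabulary, min_count=2):
    """Return frequent queries that match neither a preferred term nor a variant.

    vocabulary maps each preferred term to a list of its variant terms.
    """
    known = set()
    for preferred, variants in vocabulary.items():
        known.add(normalize(preferred))
        known.update(normalize(variant) for variant in variants)
    counts = Counter(normalize(query) for query in queries)
    return [
        (query, n)
        for query, n in counts.most_common()
        if query and n >= min_count and query not in known
    ]

# Invented vocabulary and log for illustration only.
vocabulary = {
    "Economic cooperation": ["Economic co-operation"],
    "Economic policy": [],
}
log = [
    "economic co-operation",
    "trade blocs",
    "Trade blocs",
    "economic policy",
    "customs union",
]

for query, n in unmatched_queries(log, vocabulary):
    print(f"{n:3d}  {query}")
```

Queries surfaced this way are only candidates; a maintainer still decides whether each becomes a new term, a variant of an existing term, or nothing at all.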