Implementing a Taxonomy in a Content Management Portal Content Week 2005 Miami, Florida Monday, January 31, 2005 Workshop H 2:45pm – 4:45 pm Marjorie M.K.

Presentation transcript:

Implementing a Taxonomy in a Content Management Portal Content Week 2005 Miami, Florida Monday, January 31, 2005 Workshop H 2:45pm – 4:45 pm Marjorie M.K. Hlava Access Innovations, Inc

Introductions
Name
Project
Expectations for these two short hours
Please fill in the sign up sheet
Would you like
  – 1. Copy of this presentation?
  – 2. Sample software?
  – 3. Other information?

Copyright © 2005 Access Innovations, Inc.
What will we talk about this afternoon?
1. Definitions
2. Where taxonomy fits in the Information Circle
3. Where to use a taxonomy
4. Taxonomies for Communities of Practice
5. Surrounding theories and applications
6. How to build and maintain
7. How it is used in enterprise information

[Diagram: Implementing a Taxonomy in a Content Management Portal. Labels: Thesaurus Master; Data Feed; MAI to add Metadata; Database Management System; Add Metadata using MAI; Search; Inverted File]

1. Definitions

What is a taxonomy?
A hierarchical thesaurus with authority terms applied at the final node
A browse-able web interface
A Linnaean System
A browse-able list with the term instance at the final leaf

Types of Taxonomies
Naming and organizing things into groups that share similar characteristics
1. Flat – just a list
2. Hierarchical – Taxonomic view
3. Faceted
  – Sorted by a single characteristic
  – Metadata - Dublin Core
  – COSATI - GILS
4. Thesaurus
  – Term records
  – Database backend
  – Easier to modify and maintain

Taxonomy in meta data
Definition
  – Taxonomy is a thesaurus in its hierarchical view with the authority files applied at the final nodes
  – It provides the browse-able front end to a portal
  – It provides keyword and name access to the content in the portal

Taxonomy definition
A taxonomy is a thesaurus in hierarchical view with authority file terms added at the final nodes
Thesaurus
Authority file
Hierarchical form
Final nodes

Thesaurus
Concepts
Methods
Procedures
Cognitive approach
The knowledge capture piece
The topics or subjects

Authority file
People
Places
Things
The tangible approach
Concrete Entities

Hierarchical view
Gives the Portal view
The view of all the preferred terms in categorized order
An outline of the thesaurus

Final Nodes
The last position on the hierarchical tree
  – Taxonomy concept
  – narrower terms
    » final node - people, place or thing term
    » document instance
    » Letter to George Wiesman Dec 12, 2003
    » Technical report number TR-1039
    » Museum artifact 1706 wooden wagon wheel

Term Records – the Database Part
Associative terms
  – Related terms
Equivalence terms
  – Preferred and non preferred
  – Use and used for
  – Synonyms
Hierarchical terms
  – Broader narrower terms
  – Parent Child

Other term record fields
Scope notes
Cross references
History
Term Status
Category
User defined
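The term record described on the two slides above can be pictured as a small data structure. This is only a minimal sketch, not the layout used by Thesaurus Master or any other product: the class name `TermRecord`, its field names, the `status` default, and the helper `add_narrower` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TermRecord:
    """One thesaurus term with the relationship types named on the slides."""
    term: str
    broader: list[str] = field(default_factory=list)    # hierarchical: parent terms
    narrower: list[str] = field(default_factory=list)   # hierarchical: child terms
    related: list[str] = field(default_factory=list)    # associative terms
    used_for: list[str] = field(default_factory=list)   # equivalence: non-preferred synonyms
    scope_note: str = ""
    status: str = "approved"                             # hypothetical default value


def add_narrower(records, parent, child):
    """Link parent and child in both directions so the two records stay in sync."""
    p = records.setdefault(parent, TermRecord(parent))
    c = records.setdefault(child, TermRecord(child))
    if child not in p.narrower:
        p.narrower.append(child)
    if parent not in c.broader:
        c.broader.append(parent)


records = {}
add_narrower(records, "Vehicles", "Automobiles")
records["Automobiles"].used_for.append("Cars")
```

Storing the relationship on both records is one possible design choice; it keeps the hierarchical view and the term record view consistent at the cost of updating two records per link.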

2. Where does a taxonomy fit in the information circle?

Information Circle - Overview
[Diagram: a circle linking Taxonomy, User, Content, and Output]

Content
Web Pages
White Papers
Research Reports
Licensed Data Feeds
Intranet
Internal Reports
Lotus Notes files
Databases
Public Relations Documents/Press Releases
Market Research Reports
Customer Relationship Management (CRM)
HR Files
Accounting/Financial Records
Legal Documents
Patents
Museum artifacts

Content – cont’d
Content Creation:
HTML – Meta name / Keywords
DB – Field / Meta tag / Element
XML – Entity table for valid values
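For the HTML case, taxonomy terms typically end up in a `<meta name="keywords">` element. The sketch below shows one way to render that element; the function name `keywords_meta_tag` and the sample terms are assumptions, not part of the original workshop.

```python
from html import escape


def keywords_meta_tag(terms):
    """Render taxonomy terms as an HTML <meta name="keywords"> element."""
    content = ", ".join(terms)
    # Escape the attribute value so characters such as & and " stay valid HTML.
    return f'<meta name="keywords" content="{escape(content, quote=True)}">'


tag = keywords_meta_tag(["Automobiles", "Research & development"])
print(tag)
```

The same list of terms could equally be written to a database field or checked against an XML list of valid values; only the output format changes.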

Taxonomy
Taxonomy is applied to new and existing content:
Meta Tags
Thesaurus Terms
Authority Terms
Date
Author
Description
etc.
[Diagram labels: Rule Base, Taxonomy]

Taxonomy – cont’d
Index data
  - Manually
  - Automatically
Suggest new candidate terms
Review
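Automatic indexing against a rule base can be illustrated with a very small phrase matcher. This is a toy sketch under stated assumptions and does not describe how MAI or any commercial indexer works: the `RULES` table, its trigger phrases, and the function `suggest_terms` are invented for the example.

```python
import re

# Hypothetical rule base: preferred term -> phrases that trigger it.
RULES = {
    "Automobiles": ["automobile", "automobiles", "car", "cars"],
    "Microcomputer": ["microcomputer", "personal computer", "pc"],
}


def suggest_terms(text):
    """Return preferred terms whose trigger phrases occur as whole words in text."""
    lowered = text.lower()
    found = []
    for term, triggers in RULES.items():
        # \b word boundaries keep "car" from matching inside "scarce" or "carpets".
        if any(re.search(rf"\b{re.escape(t)}\b", lowered) for t in triggers):
            found.append(term)
    return found
```

In practice the suggested terms would still go through the review step listed on the slide before being written to the record.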

Output
Searchable Data
  - Internal Data
  - External Data

User
Web Browsing/Searching
Database Browsing/Searching
Query Resolution

User – cont’d
User Input
  - Suggested Candidate Terms
  - New Documents
Reports Based on User Search
  - Search Logs
  - Null Hits
(These will also suggest new candidate terms)
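Mining search logs for null hits can be sketched as a simple count of queries that returned nothing. The log format (a list of query and hit-count pairs), the `min_count` threshold, and the function name `candidate_terms` are assumptions for illustration only.

```python
from collections import Counter


def candidate_terms(search_log, min_count=2):
    """Return null-hit queries seen at least min_count times, most frequent first."""
    nulls = Counter(
        query.strip().lower()
        for query, hit_count in search_log
        if hit_count == 0
    )
    return [q for q, n in nulls.most_common() if n >= min_count]


# Hypothetical log entries: (query text, number of hits returned).
log = [("hybrid cars", 0), ("automobiles", 42), ("Hybrid Cars", 0), ("segway", 0)]
suggestions = candidate_terms(log)
```

Queries that keep failing are good candidates either for new terms or for new non-preferred synonyms pointing at existing terms.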

New Content
The cycle begins again
[Diagram: the same circle, with New Content in place of Content]

Information Circle - Overview
[Diagram: a circle linking Taxonomy, User, Content, and Output]

3. Where to use a taxonomy
Link the Taxonomy and Indexing
Always in sync with the industry
Keep up to date with terminology
Automatically index the old data
Filter newsfeeds
Search using the Taxonomy
File using the taxonomy
Spell check using the taxonomy
Link to translation system
Catalog using the taxonomy
Index a book

Thesaurus Master

Database Management System - Add Metadata using MAI
Search
Inverted File: Aardvark, Alligator, Apple, Advantage, …., Zebra
Record locator: Accessinn.com/12345/demofile/recid15
Database records, each with many elements
Portal Searching
Many data bases can be reached
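The inverted file on this slide is an alphabetical word list in which each word points to record locators. A minimal sketch of building one follows; the locators and record texts are made up, and real systems add tokenization rules, stop words, and field-level indexing that are omitted here.

```python
from collections import defaultdict


def build_inverted_file(records):
    """Map each lowercase word to the sorted record locators that contain it."""
    index = defaultdict(set)
    for locator, text in records.items():
        for word in text.lower().split():
            index[word].add(locator)
    # Sort the words so the file reads alphabetically, as on the slide.
    return {word: sorted(locs) for word, locs in sorted(index.items())}


# Hypothetical records keyed by record locator.
records = {
    "demofile/recid15": "Zebra and aardvark habitats",
    "demofile/recid16": "Apple orchards",
    "demofile/recid17": "Aardvark diet",
}
inverted = build_inverted_file(records)
```

A search then becomes a lookup in `inverted` followed by fetching the database records behind each locator, which is why one portal can reach many databases.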

4. Taxonomies for Communities of Practice

Taxonomies in a Community of Practice
Nature of Communities of Practice (CoP)
Taxonomies in context
Value of taxonomies
Creating a taxonomy
Applying the taxonomy

Nature of CoPs
Free flowing, loosely structured
Simple, ad hoc categorization
Active CoPs need organization
Search tends to be hit-or-miss
Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA

Taxonomies in Context
A taxonomy aspires to be:
  – a correlation of the different functional, regional and (possibly) national languages used by a community of practice
  – a support mechanism for navigation
  – a support tool for search engines and knowledge maps
  – an authority for tagging documents and other information objects
  – a knowledge base in its own right
Reference: “Taxonomies: the vital tool of information architecture”,

Value of Taxonomies
Improves organization & structure
Facilitates navigation
Facilitates knowledge discovery
Reduces effort
Saves time
“Taxonomies are better created by professional indexers or librarians than by domain experts.”
Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA

Naval Postgraduate School’s Homeland Security Taxonomy (1)

Naval Postgraduate School’s Homeland Security Taxonomy (2)

IBM Insight graphical view

Applying a Taxonomy (1)
Manually
Add terms into meta data fields
Design navigation & site indexes with taxonomy hierarchy
Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA

Incorporating Hierarchical Classification from a Taxonomy Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA

Applying a Taxonomy (2)
System integration
Search & retrieval systems
Auto-assignment of metadata
Categorization systems
Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA

Applying the Taxonomy to a Digital Library
[Diagram labels: Web portal; Locally held documents; Public repositories; Commercial data sources; Agency data sources; INTERNET (public); spiders; Meta-Search Tool; Filtered content; Search engine; Automated categorization; Library catalogs; Search engine]
Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA

5. Surrounding theories and applications

Other Vocabulary types
Uncontrolled lists
Classification System
Subject headings
Controlled vocabulary – usually synonyms and spelling
Authority files
Thesaurus
Taxonomy

Uncontrolled list - defined
Add terms as they occur
No cross reference
Simple flat structure

Controlled term lists - defined
State the preferred terms
Provide allowed term entry
Heavily cross referenced
Not generally hierarchical
Popular
Easy to create

Controlled term list - format
Cars – use Automobiles
Personal Computer – use Microcomputer
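The "use" references in this format amount to a lookup from entry terms to preferred terms. A minimal sketch, using the two examples from the slide; the names `USE_REFERENCES`, `PREFERRED`, and `preferred_term` are assumptions for illustration.

```python
# Entry (non-preferred) terms mapped to their preferred terms, as on the slide.
USE_REFERENCES = {
    "cars": "Automobiles",
    "personal computer": "Microcomputer",
}
PREFERRED = {"Automobiles", "Microcomputer"}


def preferred_term(entry):
    """Resolve an entry term to its preferred form, or None if it is not in the list."""
    key = " ".join(entry.lower().split())   # normalize case and spacing
    if key in USE_REFERENCES:
        return USE_REFERENCES[key]
    for term in PREFERRED:
        if term.lower() == key:
            return term
    return None
```

Returning `None` for an unknown entry gives the caller a natural place to log a candidate term instead of silently accepting an uncontrolled one.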

Classification vs Subject Headings
Classification
  – single spot or placement
  – browse physical list
  – often a numbering system
  – clear hierarchy
  – no or few cross references

Classification vs Subject Headings
Subject headings
  – generic search
  – hidden classification system
  – related terms and cross references in heavy use
  – Usually the inverted form, e.g. cells, electric
  – Alphabetic access

Authority systems - defined
Lists of terms in the preferred format for use
Frequently have cross references
Widely available
Frequently coded lists
Brand names

Authority lists - examples
ISO Country Name and Code
  – International Organization for Standardization
ISO Language list
NAICS (SIC)
  – Standard Industrial Classification Code (SIC)
  – Replaced by
  – North American Industry Classification System (NAICS)

What is a thesaurus?
“For writers, it is a tool like Roget’s, one with words grouped and classified to help select the best word to convey a specific nuance of meaning. For indexers and searchers, it is an information storage and retrieval tool: a listing of words and phrases authorized for use in an indexing system, together with relationships, variants and synonyms, and aids to navigation through the thesaurus”
Jessica L. Milstead. All Rights Reserved

Thesaurus - defined
For information retrieval, 1960s
  – indexing either intellectual or automatic
  – in searching
  – searching but not indexing
  – indexing but not searching
  – hierarchical view for searching

Thesaurus - defined
Monolingual - standard
  – British English - ISO 2788
  – American English - ANSI/NISO Z39.19
Multilingual - standard ISO 5964
  – concept mapping
  – Eurovoc
Discipline or Mission based - ad hoc

Thesaurus - standard format
Main Entries
Top Terms - TT
Broader Terms - BT
Narrower Terms - NT
Related Terms - RT
Scope Notes - SN
History - HI
Date term added/changed - DA
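The tagged display implied by these abbreviations can be sketched as a small formatter. The tag order follows the slide; the layout (two-space indent, one value per line), the function name `format_entry`, and the sample entry are assumptions rather than anything prescribed by the standards.

```python
# Field tags in the order listed on the slide.
FIELD_ORDER = ["TT", "BT", "NT", "RT", "SN", "HI", "DA"]


def format_entry(term, fields):
    """Render one main entry with its tagged fields, one value per line."""
    lines = [term]
    for tag in FIELD_ORDER:
        values = fields.get(tag, [])
        if isinstance(values, str):     # allow a single string, e.g. a scope note
            values = [values]
        for value in values:
            lines.append(f"  {tag} {value}")
    return "\n".join(lines)


# Hypothetical sample entry.
entry = format_entry("Automobiles", {
    "BT": ["Vehicles"],
    "NT": ["Electric automobiles", "Sports cars"],
    "SN": "Passenger road vehicles; sample scope note",
})
print(entry)
```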

Standards
Monolingual
  – NISO / ANSI – Z39.19
  – ISO 2788
Multilingual
  – ISO 5964

ISO Standards
Set up already - easy to adopt
Multiple broader terms
The standards outline procedures
  – ISO - better for implementation
  – NISO - much better reading

Why do we index?
Improve precision – define scope of terms
Improve recall – different terms for same concept
Guide to a field of expertise
Learning tool
Richer expression

Uses?
Indexing*
  – …process by which subject terms or classification symbols are assigned to concepts in documents
  – A thesaurus is also known as an indexing language
  – * not the building of the inverted file in the computer sense of indexing

What are we controlling?
Synonyms – different terms, same concept
Polysemes or Homonyms – same word, different meanings
  – Lead
  – Reading

How?
Meaning – delineation of scope of a term
Term equivalence – linking of synonyms
Disambiguation of homonyms
  – lead (metal)
  – lead (element)
  – lead (management)

Precision options
Language specificity
Coordination
Compound terms - level of precoordination
Homographs and scope notes
Word distance indication

Precision options
Structural relationships
Links and roles
Treatment and aspect codes
Weighting

Disambiguation
[Diagram: Bill / Invoice; Bill / Legislative; Bill / Sport; Bill / Person]

Disambiguation
[Diagram: Bills / Invoices; Bills / Legislation; Bill / Animal; Bill / Person, connected by relationship labels PT, NT, BT, RT]

6. How to build and maintain a taxonomy

How to build a taxonomy
Collect the terms
Pull out authority terms
Organize into arrays
Choose top terms
Organize hierarchically
Flesh out term records
Test, review, and edit

Or said another way …
Define scope
Collect terms and relationships
Identify existing taxonomies
Identify resources
Create & refine taxonomy
Apply taxonomy
Review and update

Maintain
Steady stream of terms
  – Web logs
  – Null sets
  – New announcements
  – Indexing team
  – Library
  – Records managers
  – Etc.
Candidate terms
An out-of-date taxonomy is nearly useless

Best Results Measures
Accuracy
Productivity
Hits, Misses and Noise
Precision (Recall)
Relevance
Ease of set up
Time to production
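Hits, misses, and noise map directly onto the usual precision and recall formulas: precision = hits / (hits + noise) and recall = hits / (hits + misses). The sketch below assumes "hits" are relevant items retrieved, "misses" are relevant items not retrieved, and "noise" is irrelevant items retrieved; the sample counts are invented.

```python
def precision_recall(hits, misses, noise):
    """Compute precision and recall from hit, miss, and noise counts.

    hits:   relevant items that were retrieved (or terms correctly assigned)
    misses: relevant items that were not retrieved
    noise:  retrieved items that were not relevant
    """
    retrieved = hits + noise
    relevant = hits + misses
    precision = hits / retrieved if retrieved else 0.0
    recall = hits / relevant if relevant else 0.0
    return precision, recall


# Hypothetical counts from reviewing one batch of indexed records.
p, r = precision_recall(hits=8, misses=2, noise=2)
```

The same counts can be gathered for indexing quality (terms assigned versus terms an editor would have chosen) as well as for search results.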

Integration
Thesaurus
  – full featured
  – multiple views
  – multiple versions
  – multiple languages
Automatic indexing
  – filtering
  – assisted
Data Harmony MAI and Thesaurus Master

Visual Taxonomy
Ways to look
  – Hierarchical
  – Alphabetic – by term
  – Ring diagrams
  – Topic maps
  – Related terms

Content Management System

API to Many Systems for CMS

Apply to the meta data
Automatic application?
Spider setting internally
External web crawls – use all aliases
Filter data
Enhance search experience

Meta data
The fields
The elements
  – Class codes
  – Title
  – Author
  – Plaintiff
  – Product
  – subject / topic
Meta Name Keywords in HTML

7. How Taxonomies are used in Enterprise Information

Brand is repeated in several spots and tied to search as well

Another way of listing brands

Category list from taxonomy is tied to brand list and product list

Category code from the taxonomy is tied to the brand list and the product list

Enterprise Taxonomy Management
Consistent application across entire site
Synonyms are used interchangeably
User doesn’t need to know the taxonomy
Pop up view is helpful
Site map for construction and browsing
Allows hidden sections for internal use

Taxonomies
Form the basis for knowledge sharing
Add value to discussion
Allow deeper retrieval
Are straightforward to create
Require on-going maintenance

Your Taxonomy
There is too much information to pile it on the floor.
It fits in many places in the information flow

[Diagram: Implementing a Taxonomy in a Content Management Portal. Labels: Data Feed; Thesaurus Master; MAI to add Metadata; Database Management System; Add Metadata using MAI; Search; Inverted File]

Thank you for your time!
Questions?
Marjorie M.K. Hlava
Access Innovations, Inc