Taxonomies, Lexicons and Organizing Knowledge Wendi Pohs, IBM Software Group.

Presentation transcript:

Taxonomies, Lexicons and Organizing Knowledge Wendi Pohs, IBM Software Group

Agenda: Benefits (business and technical); A few definitions; Planning; Issues; Measuring value; Futures; Q&A

The Mantra: Knowledge is in the eye of the beholder, but reflecting end user needs is as critical as representing texts... and it takes work!

Business Benefits: Mergers and acquisitions; Research and development; Industries (Consulting, Pharmaceuticals, Financial services, Legal). "If only I could find information to help me do my job better..."

Technical Benefits: Site creation; Navigation/search; Personalization; Defining areas of expertise

Definitions: Taxonomy. “The science, laws or principles of classification” (from the Greek: rules of arrangement). Biology (Linnaeus); Education (Bloom). A hierarchical collection of categories and documents. Structure and content.
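The "hierarchical collection of categories and documents" on this slide can be pictured as a tree. What follows is a minimal sketch, not anything from the presentation: the `Category` class, its method names, and the sample file names are all hypothetical, and only the Linnaean category names echo the slide's biology example.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Category:
    """One node of a taxonomy: a named category holding documents and subcategories."""
    name: str
    children: list[Category] = field(default_factory=list)
    documents: list[str] = field(default_factory=list)

    def add_child(self, name: str) -> Category:
        """Create a subcategory under this one and return it."""
        child = Category(name)
        self.children.append(child)
        return child

    def all_documents(self) -> list[str]:
        """Documents filed here plus everything filed in any subcategory."""
        docs = list(self.documents)
        for child in self.children:
            docs.extend(child.all_documents())
        return docs

    def path_to(self, target: str, prefix: tuple = ()) -> tuple | None:
        """Return the chain of category names from this node down to target."""
        here = prefix + (self.name,)
        if self.name == target:
            return here
        for child in self.children:
            found = child.path_to(target, here)
            if found:
                return found
        return None


# Hypothetical sample data, loosely following the Linnaean hierarchy.
root = Category("Animalia")
chordata = root.add_child("Chordata")
mammalia = chordata.add_child("Mammalia")
chordata.documents.append("vertebrate-overview.txt")
mammalia.documents.append("field-notes-on-whales.txt")
```

Here `root.path_to("Mammalia")` walks the structure, while `root.all_documents()` gathers the content: the two halves of the slide's "structure and content".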

Definitions: Directory. More general than taxonomy; Natural structure; Wide vs deep; Category structure less controlled. Examples: File system; Yahoo; Yellow Pages; Corporate Web sites

Definitions: Thesaurus. Controlled vocabulary; Subject headings, labels; Synonyms (U, UF); Relation types (TT, BT, NT, SN, HN, RT, SA)
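The relation codes on this slide are not expanded in the transcript. The sketch below assumes the common thesaurus readings: U (Use) and UF (Used For) for synonyms, BT and NT for broader and narrower terms, RT for related terms. The entries, the `THESAURUS` dictionary and both function names are hypothetical illustrations.

```python
# Hypothetical entries: each preferred term lists its relations by code.
THESAURUS = {
    "vehicles": {"NT": ["automobiles", "bicycles"]},
    "automobiles": {
        "UF": ["cars", "autos"],   # non-preferred synonyms
        "BT": ["vehicles"],        # broader term
        "NT": ["sedans"],          # narrower term
        "RT": ["highways"],        # related term
    },
    "bicycles": {"BT": ["vehicles"]},
    "sedans": {"BT": ["automobiles"]},
}

# Invert UF into a USE lookup: non-preferred term -> preferred term.
USE = {alt: term for term, rel in THESAURUS.items() for alt in rel.get("UF", [])}


def preferred(term):
    """Map a synonym to its preferred term; unknown terms pass through unchanged."""
    term = term.lower()
    return USE.get(term, term)


def expand(term):
    """Return the preferred term followed by all of its narrower terms, depth-first."""
    seen, stack = [], [preferred(term)]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.append(current)
        narrower = THESAURUS.get(current, {}).get("NT", [])
        stack.extend(reversed(narrower))
    return seen
```

A search for "cars" would then be run as `expand("cars")`, so that documents tagged with the preferred term or any narrower term are also found.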

Definitions: Meta-data and tagging. Meta-data: Properties, attributes: information describing types of data [Crandall]; The ‘energy’ required to keep things organized [Earley]. Tagging, Document Properties

Definitions: Classification. Analyzing documents and assigning them to predefined categories; Rule-based vs natural; Classification schemes: Dewey, Library of Congress, Industry-specific
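As an illustration of the rule-based side of "assigning documents to predefined categories", here is a minimal keyword-rule sketch. The category names reuse industries from the Business Benefits slide; the keyword lists, the `RULES` table, the `min_hits` parameter and the function names are assumptions made for this example, not the presenter's rules.

```python
import re

# Hypothetical rules: a document belongs to a category if it mentions
# at least `min_hits` of that category's keywords.
RULES = {
    "Legal": {"contract", "patent", "litigation"},
    "Financial services": {"loan", "equity", "audit"},
    "Pharmaceuticals": {"clinical", "trial", "dosage"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def classify(text, rules=RULES, min_hits=1):
    """Return the sorted list of categories whose rules the text satisfies."""
    tokens = set(tokenize(text))
    return sorted(
        category
        for category, keywords in rules.items()
        if len(tokens & keywords) >= min_hits
    )
```

A document can land in several categories or in none. Raising `min_hits` trades recall for precision, which is one of the choices an editor makes when tuning such rules.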

Definitions: Clustering. Automatically generating groups of similar documents based on distance or proximity measures; "Bags of words"; Vector analysis determines boundaries; Adaptive, but not abstract
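To make "bags of words" and proximity measures concrete, the sketch below turns each document into a word-count vector and groups documents by cosine similarity. The greedy single-pass grouping and the `threshold` value are assumptions chosen for brevity; the slide does not say which clustering algorithm any product uses.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Represent a document as word counts, ignoring word order."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, threshold=0.3):
    """Greedy one-pass grouping: join the most similar cluster, else start a new one."""
    clusters = []  # each entry: (summed word counts, list of document indices)
    for index, text in enumerate(docs):
        vector = bag_of_words(text)
        best, best_sim = None, threshold
        for candidate in clusters:
            sim = cosine(vector, candidate[0])
            if sim >= best_sim:
                best, best_sim = candidate, sim
        if best is None:
            clusters.append((Counter(vector), [index]))
        else:
            best[0].update(vector)  # fold the document into the cluster's counts
            best[1].append(index)
    return [members for _, members in clusters]


# Hypothetical sample documents.
DOCS = [
    "merger acquisition deal merger",
    "acquisition deal closes",
    "drug trial results",
    "clinical drug trial",
]
```

The groups adapt to whatever documents arrive, but they come back as bare lists of indices with no category name attached, which is one way to read the slide's "adaptive, but not abstract".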

Develop a Plan: Determine user information needs; Information audit, Content audit; Select appropriate sources; Create initial taxonomy; Edit categories; Categorize new documents; Test the UI; Train the taxonomy

Plan: Information audit. What is the objective of the system? Who owns the project? What do users need? What do content creators need? What do system managers need?

Plan: Content audit. Is there an existing taxonomy? How clean is the meta-data? Is the content suited to automatic classification techniques? Good example: Notes discussion databases. Not-so-good example: Web site with little text, lots of links. Is a subset of a source better than the whole?

Plan: Select sources. Which sources? Who owns them? Which sources do users access most often? How do users access these sources? What is the lifecycle of the content? Who identifies the most current content?

Plan: Maintenance. Resources: Centralized or department-level. Who decides when new content is added? Term approval process. How do new concepts get into the taxonomy?

Identify issues: Getting user involvement and buy-in; Maintenance resources; Directory versus taxonomy; Meta-data; Globalization and regionalization; Hidden vs published taxonomies

Understand the BIG issues: Organizational “perfection complex” [Chait]; Multiple taxonomies; Automated versus manual categorization

Multiple taxonomies: Many editors; Term approval process, synonyms; Standard tools across the enterprise; Federated taxonomies; Taxonomy links, “cross-connections,” facets, views; Taxonomy mapping

Measuring value: NCR Corporation - Support Organization. Needed to convince organization of the value of captured content; Managers resisted diverting resources to maintaining content; Current measure: Time per incident. How could the value of a knowledge classification system be demonstrated?

Measuring value: NCR developed a new parameter: Knowledge helpful (the answer was in the support database and was used to solve the problem); Knowledge not effective (the answer sent them in the wrong direction, did not help to address the issue); Knowledge not available (nothing available to assist in solving the problem); Knowledge not required (problem solved without the use of the knowledge base)
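One way such a parameter could be reported is as the share of support incidents falling under each of the four outcomes. The sketch below assumes each closed incident is logged with exactly one outcome label. The label strings, the function name and the sample counts are hypothetical, not NCR data.

```python
from collections import Counter

# The four outcomes named on the slide, as hypothetical log labels.
OUTCOMES = ("helpful", "not effective", "not available", "not required")


def outcome_shares(incidents):
    """Return the fraction of incidents recorded under each knowledge outcome."""
    counts = Counter(incidents)
    unknown = set(counts) - set(OUTCOMES)
    if unknown:
        raise ValueError(f"unrecognized outcome labels: {sorted(unknown)}")
    total = sum(counts.values())
    return {
        outcome: (counts[outcome] / total if total else 0.0)
        for outcome in OUTCOMES
    }


# Invented sample log of ten incidents, for illustration only.
SAMPLE_LOG = (
    ["helpful"] * 6
    + ["not effective"]
    + ["not available"] * 2
    + ["not required"]
)
SHARES = outcome_shares(SAMPLE_LOG)
```

Tracked over time alongside time per incident, a rising "helpful" share and a falling "not available" share would be the kind of evidence the previous slide asks for.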

Futures: Methods (Feature extraction, statistical analysis, rules-based, label generation); Starter taxonomies, imports; Taxonomy mapping; Interfaces (Visualization, better training tools)

Q&A