Implementing a Taxonomy in a Content Management Portal Content Week 2005 Miami, Florida Monday, January 31, 2005 Workshop H 2:45pm – 4:45 pm Marjorie M.K.

1 Implementing a Taxonomy in a Content Management Portal Content Week 2005 Miami, Florida Monday, January 31, 2005 Workshop H 2:45pm – 4:45 pm Marjorie M.K. Hlava Access Innovations, Inc. 505-998-0800 mhlava@accessinn.com www.accessinn.com

2 Introductions Name Project Expectations for these two short hours Please fill in the sign up sheet Would you like – 1. Copy of this presentation? – 2. Sample software? – 3. Other information?

3 Copyright © 2005 Access Innovations, Inc. What will we talk about this afternoon? 1. Definitions 2. Where taxonomy fits in the Information Circle 3. Where to use a taxonomy 4. Taxonomies for Communities of Practice 5. Surrounding theories and applications 6. How to build and maintain a taxonomy 7. How taxonomies are used in enterprise information

4 Thesaurus Master Data Feed MAI to add Metadata Database Management System Add Metadata using MAI Search Inverted File Implementing a Taxonomy in a Content Management Portal

5 Copyright © 2005 Access Innovations, Inc. 1. Definitions

6 Copyright © 2005 Access Innovations, Inc. What is a taxonomy? A hierarchical thesaurus with authority terms applied at the final node A browse-able web interface A Linnaean System A browse-able list with the term instance at the final leaf

7 Copyright © 2005 Access Innovations, Inc. Types of Taxonomies Naming and organizing things into groups that share similar characteristics 1. Flat – just a list 2. Hierarchical – Taxonomic view 3. Faceted – Sorted by a single characteristic – Metadata - Dublin Core – COSATI - GILS 4. Thesaurus – Term records – Database backend – Easier to modify and maintain

8 Copyright © 2005 Access Innovations, Inc. Taxonomy in meta data Definition – Taxonomy is a thesaurus in its hierarchical view with the authority files applied at the final nodes – It allows the browse-able front end to a portal – It provides keyword and name access to the content in the portal

9 Copyright © 2005 Access Innovations, Inc. Taxonomy definition A taxonomy is a thesaurus in hierarchical view with authority file terms added at the final nodes Thesaurus Authority file Hierarchical form Final nodes

10 Copyright © 2005 Access Innovations, Inc. Thesaurus Concepts Methods Procedures Cognitive approach The knowledge capture piece The topics or subjects

11 Copyright © 2005 Access Innovations, Inc. Authority file People Places Things The tangible approach Concrete Entities

12 Copyright © 2005 Access Innovations, Inc. Hierarchical view Gives the Portal view The view of all the preferred terms in categorized order An outline of the thesaurus

13 Copyright © 2005 Access Innovations, Inc. Final Nodes The last position on the hierarchical tree – Taxonomy concept – narrower terms » final node - people, place or thing term » document instance » Letter to George Wiesman Dec 12, 2003 » Technical report number TR-1039 » Museum artifact 1706 wooden wagon wheel
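
As a rough illustration of the definition above (a hierarchy of concepts with authority terms and document instances at the final nodes), the structure can be sketched as a small tree. The `Node` class, the `kind` labels, and the concept names "Records", "Correspondence", and "Technical reports" are assumptions made for this sketch; only the two document instances come from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    kind: str = "concept"  # hypothetical labels: "concept", "authority", "instance"
    children: List["Node"] = field(default_factory=list)


def outline(node, depth=0):
    """Return the hierarchical view as a list of indented lines."""
    lines = ["  " * depth + f"{node.label} [{node.kind}]"]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


def final_nodes(node):
    """Collect the nodes at the last position of each branch."""
    if not node.children:
        return [node]
    leaves = []
    for child in node.children:
        leaves.extend(final_nodes(child))
    return leaves


# Concept names are invented; the instances are the slide's examples.
taxonomy = Node("Records", children=[
    Node("Correspondence", children=[
        Node("Wiesman, George", "authority", [
            Node("Letter to George Wiesman Dec 12, 2003", "instance"),
        ]),
    ]),
    Node("Technical reports", children=[
        Node("Technical report number TR-1039", "instance"),
    ]),
])

print("\n".join(outline(taxonomy)))
```

Printing the outline gives the browse-able "portal view"; `final_nodes` returns only the leaves where authority terms or document instances sit.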

14 Copyright © 2005 Access Innovations, Inc. Term Records – the Database Part Associative terms – Related terms Equivalence terms – Preferred and non preferred – Use and used for – Synonyms Hierarchical terms – Broader narrower terms – Parent Child

15 Copyright © 2005 Access Innovations, Inc. Other term record fields Scope notes Cross references History Term Status Category User defined
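
The term record fields listed on the last two slides could be held in a structure like the one below. This is a minimal sketch, not the layout of any particular thesaurus product: the class names, field names, and the example terms are assumptions. The one design point it tries to show is that hierarchical and associative links are stored reciprocally, so a BT on one record always has a matching NT on the other.

```python
from dataclasses import dataclass, field


@dataclass
class TermRecord:
    term: str
    broader: set = field(default_factory=set)    # BT (parent)
    narrower: set = field(default_factory=set)   # NT (child)
    related: set = field(default_factory=set)    # RT (associative)
    used_for: set = field(default_factory=set)   # UF (non-preferred synonyms)
    scope_note: str = ""
    history: str = ""
    status: str = "candidate"                    # hypothetical status value


class Thesaurus:
    def __init__(self):
        self.records = {}

    def get(self, term):
        """Return the record for a term, creating it on first use."""
        if term not in self.records:
            self.records[term] = TermRecord(term)
        return self.records[term]

    def add_hierarchy(self, broader, narrower):
        """Store the BT/NT pair on both records."""
        self.get(broader).narrower.add(narrower)
        self.get(narrower).broader.add(broader)

    def add_related(self, a, b):
        """Related-term links are symmetric."""
        self.get(a).related.add(b)
        self.get(b).related.add(a)

    def add_synonym(self, preferred, non_preferred):
        """Record a Use / Used For pair on the preferred term."""
        self.get(preferred).used_for.add(non_preferred)


th = Thesaurus()
th.add_hierarchy("Vehicles", "Automobiles")
th.add_related("Automobiles", "Highways")
th.add_synonym("Automobiles", "Cars")
print(th.get("Automobiles"))
```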

16 Copyright © 2005 Access Innovations, Inc. 2. Where does a taxonomy fit in the information circle?

17 Copyright © 2005 Access Innovations, Inc. Information Circle - Overview Taxonomy User Content Output

18 Copyright © 2005 Access Innovations, Inc. Content Taxonomy User Content Output Web Pages White Papers Research Reports Licensed Data Feeds Intranet Internal Reports Lotus Notes files Databases Public Relations Documents/Press Releases Market Research Reports Customer Relationship Management (CRM) HR Files Accounting/Financial Records Legal Documents Patents Museum artifacts

19 Copyright © 2005 Access Innovations, Inc. Taxonomy User Content Output Content – cont’d HTML – Meta name / Keywords DB – Field / Meta tag / Element XML – Entity table for valid values Content Creation:

20 Copyright © 2005 Access Innovations, Inc. Taxonomy User Content Output Taxonomy is applied to new and existing content: Meta Tags Thesaurus Terms Authority Terms Date Author Description etc. Rule Base Taxonomy

21 Copyright © 2005 Access Innovations, Inc. Taxonomy – cont’d Taxonomy User Content Output Index data - Manually - Automatically Suggest new candidate terms Review
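
To make "index data automatically" concrete, here is a deliberately simple sketch that scans text for vocabulary entries and returns the preferred terms it finds. It is plain string matching under assumed data, not a description of how MAI or any commercial indexer works; the function name and the vocabulary are hypothetical.

```python
import re


def suggest_terms(text, vocabulary):
    """Return preferred terms whose entry terms occur in the text.

    `vocabulary` maps each entry term (preferred or non-preferred)
    to the preferred term that should be assigned.
    """
    found = []
    for entry, preferred in vocabulary.items():
        pattern = r"\b" + re.escape(entry) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            if preferred not in found:
                found.append(preferred)
    return found


# Hypothetical vocabulary for illustration only.
vocabulary = {
    "automobiles": "Automobiles",
    "cars": "Automobiles",
    "highways": "Highways",
}

terms = suggest_terms("Used cars and new automobiles on sale", vocabulary)
print(terms)
```

A human reviewer would still check the suggestions, which matches the "Review" step on the slide.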

22 Copyright © 2005 Access Innovations, Inc. Output Taxonomy User Content Output Searchable Data - Internal Data - External Data

23 Copyright © 2005 Access Innovations, Inc. User Taxonomy User Content Output Web Browsing/Searching Database Browsing/Searching Query Resolution

24 Copyright © 2005 Access Innovations, Inc. User – cont’d Taxonomy Output User Content User Input - Suggested Candidate Terms - New Documents Reports Based on User Search - Search Logs - Null Hits (These will also suggest new candidate terms)
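
One way to turn search logs into candidate terms is to count the queries that returned nothing. The sketch below assumes the log is available as `(query, hit_count)` pairs, which is an invented format; real search logs will need their own parsing.

```python
from collections import Counter


def null_hit_candidates(search_log, min_count=2):
    """Return (query, count) pairs for queries that repeatedly found nothing."""
    counts = Counter(
        query.strip().lower()
        for query, hits in search_log
        if hits == 0
    )
    return [(q, n) for q, n in counts.most_common() if n >= min_count]


# Hypothetical log entries: (query text, number of hits).
log = [
    ("folksonomy", 0),
    ("Folksonomy ", 0),
    ("taxonomy", 12),
    ("ontology", 0),
]

candidates = null_hit_candidates(log)
print(candidates)
```

The `min_count` threshold is an arbitrary choice to filter out one-off typos before an editor reviews the list.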

25 Copyright © 2005 Access Innovations, Inc. New Content Taxonomy User New Content Output The cycle begins again

26 Copyright © 2005 Access Innovations, Inc. Information Circle - Overview Taxonomy User Content Output

27 Copyright © 2005 Access Innovations, Inc. 3. Where to use a taxonomy Link the Taxonomy and Indexing Always in sync with the industry Keep up to date with terminology Automatically index the old data Filter newsfeeds Search using the Taxonomy File using the taxonomy Spell check using the taxonomy Link to translation system Catalog using the taxonomy Index a book

31 Thesaurus Master

33 Database Management System - Add Metadata using MAI Search Inverted File Aardvark Alligator Apple Advantage …. Zebra Record locator Accessinn.com/12345/demofile/recid15 Database records Each with many elements Portal Searching

34 Copyright © 2005 Access Innovations, Inc. Search Inverted File Aardvark Alligator Apple Advantage …. Zebra Record locator Accessinn.com/12345/demofile/recid15 Database records Each with many elements Portal Searching Many databases can be reached
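
The inverted file on these two slides maps each term to the record locators where it occurs. A minimal sketch follows; the locator strings and index terms are invented stand-ins, and a production system would keep the postings on disk rather than in a dictionary.

```python
from collections import defaultdict


def build_inverted_file(records):
    """Map each term to the sorted list of record locators that carry it."""
    postings = defaultdict(set)
    for locator, terms in records.items():
        for term in terms:
            postings[term.lower()].add(locator)
    # Alphabetical term order mirrors the Aardvark ... Zebra listing.
    return {term: sorted(locs) for term, locs in sorted(postings.items())}


def search(inverted, *terms):
    """Return locators of records that carry every one of the given terms."""
    sets = [set(inverted.get(t.lower(), [])) for t in terms]
    return sorted(set.intersection(*sets)) if sets else []


# Hypothetical records: locator -> index terms.
records = {
    "recid15": ["Aardvark", "Zebra"],
    "recid16": ["Alligator", "Zebra"],
    "recid17": ["Apple"],
}

inverted = build_inverted_file(records)
print(search(inverted, "zebra"))
print(search(inverted, "zebra", "alligator"))
```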

35 Copyright © 2005 Access Innovations, Inc. 4. Taxonomies for Communities of Practice

36 Copyright © 2005 Access Innovations, Inc. Taxonomies in a Community of Practice Nature of Communities of Practice (CoP) Taxonomies in context Value of taxonomies Creating a taxonomy Applying the taxonomy

37 Copyright © 2005 Access Innovations, Inc. Nature of CoPs Free flowing, loosely structured Simple, ad hoc categorization Active CoPs need organization Search tends to be hit-or-miss Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA

38 Copyright © 2005 Access Innovations, Inc. Taxonomies in Context A taxonomy aspires to be: a correlation of the different functional, regional and (possibly) national languages used by a community of practice a support mechanism for navigation a support tool for search engines and knowledge maps an authority for tagging documents and other information objects a knowledge base in its own right Reference: “Taxonomies: the vital tool of information architecture”, www.tfpl.com

39 Copyright © 2005 Access Innovations, Inc. Value of Taxonomies Improves organization & structure Facilitates navigation Facilitates knowledge discovery Reduces effort Saves time “Taxonomies are better created by professional indexers or librarians than by domain experts.” Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA

40 Copyright © 2005 Access Innovations, Inc. Naval Postgraduate School’s Homeland Security Taxonomy (1)

41 Copyright © 2005 Access Innovations, Inc. Naval Postgraduate School’s Homeland Security Taxonomy (2)

42 Copyright © 2005 Access Innovations, Inc. IBM Insight graphical view

44 Copyright © 2005 Access Innovations, Inc. Applying a Taxonomy (1) Manually Add terms into meta data fields Design navigation & site indexes with taxonomy hierarchy Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA

45 Incorporating Hierarchical Classification from a Taxonomy Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA

46 Applying a Taxonomy (2) System integration Search & retrieval systems Auto-assignment of metadata Categorization systems Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA

47 Applying the Taxonomy to a Digital Library Web portal Locally held documents Public repositories Commercial data sources Agency data sources INTERNET (public) spiders Meta-Search Tool Filtered content Search engine Automated categorization Library catalogs Search engine Courtesy of Lillian Gassie, Naval Postgraduate School, Monterey, CA

48 Copyright © 2005 Access Innovations, Inc. 5. Surrounding theories and applications

49 Copyright © 2005 Access Innovations, Inc. Other Vocabulary types Uncontrolled lists Classification System Subject headings Controlled vocabulary – usually synonyms and spelling Authority files Thesaurus Taxonomy

50 Copyright © 2005 Access Innovations, Inc. Uncontrolled list - defined Add terms as they occur No cross reference Simple flat structure

51 Copyright © 2005 Access Innovations, Inc. Controlled term lists - defined State the preferred terms Provide allowed term entry Heavily cross referenced Not generally hierarchical Popular Easy to create

52 Copyright © 2005 Access Innovations, Inc. Controlled term list - format Cars – use Automobiles Personal Computer – use Microcomputer
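
The "use" format above amounts to a lookup from non-preferred entries to preferred terms. A small sketch using the two slide examples (the function name and the case-insensitive matching are assumptions):

```python
# Non-preferred entry -> preferred term, taken from the slide.
USE = {
    "cars": "Automobiles",
    "personal computer": "Microcomputer",
}
PREFERRED = {p.lower(): p for p in USE.values()}


def normalize(term):
    """Return the preferred form of a term, or None if it is not in the list."""
    key = term.strip().lower()
    if key in USE:
        return USE[key]
    return PREFERRED.get(key)


print(normalize("Cars"))
print(normalize("microcomputer"))
print(normalize("Trucks"))
```

Returning `None` for unknown terms keeps the list controlled: anything not listed has to go through review before it can be assigned.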

53 Copyright © 2005 Access Innovations, Inc. Classification vs Subject Headings Classification – single spot or placement – browse physical list – often a numbering system – clear hierarchy – no or few cross references

54 Copyright © 2005 Access Innovations, Inc. Classification vs Subject Headings Subject headings – generic search – hidden classification system – related terms and cross references in heavy use – Usually in the inverted form (e.g., cells, electric) – Alphabetic access

55 Copyright © 2005 Access Innovations, Inc. Authority systems - defined Lists of terms in the preferred format for use Frequently have cross references Widely available Frequently coded lists Brand names

56 Copyright © 2005 Access Innovations, Inc. Authority lists - examples ISO Country Name and Code – International Organization for Standardization ISO Language list NAICS (SIC) – Standard Industrial Classification Code (SIC) – Replaced by – North American Industry Classification System (NAICS)

57 Copyright © 2005 Access Innovations, Inc. What is a thesaurus? Jessica L. Milstead. All Rights Reserved “For writers, it is a tool like Roget’s - one with words grouped and classified to help select the best word to convey a specific nuance of meaning. For indexers and searchers, it is an information storage and retrieval tool: a listing of words and phrases authorized for use in an indexing system, together with relationships, variants and synonyms, and aids to navigation through the thesaurus” www.jelem.com

58 Copyright © 2005 Access Innovations, Inc. Thesaurus - defined For information retrieval 1960’s – indexing either intellectual or automatic – in searching – searching but not indexing – indexing but not searching – hierarchical view for searching

59 Copyright © 2005 Access Innovations, Inc. Thesaurus - defined Monolingual - standard – British – English - ISO 2788 – American – English – ANSI/NISO Z39.19 Multilingual – standard ISO 5964 – concept mapping – Eurovoc Discipline or Mission based - ad hoc

60 Copyright © 2005 Access Innovations, Inc. Thesaurus - standard format Main Entries Top Terms - TT Broader Terms - BT Narrower Terms - NT Related Terms - RT Scope Notes - SN History - HI Date term added/changed - DA
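
A term entry in this display format can be generated from the relationship data. The sketch below prints the TT, BT, NT, RT, and SN lines for one term; the dictionaries, the example terms, and the two-space indent are assumptions, and the top term is found by climbing BT links (which assumes the hierarchy has no cycles).

```python
def top_terms(term, broader):
    """Climb BT links until reaching terms that have no broader term."""
    parents = broader.get(term, [])
    if not parents:
        return [term]
    tops = []
    for parent in parents:
        for top in top_terms(parent, broader):
            if top not in tops:
                tops.append(top)
    return tops


def display(term, broader, narrower, related, scope_notes):
    """Render one main entry with its labelled relationship lines."""
    lines = [term]
    tops = [t for t in top_terms(term, broader) if t != term]
    sections = (
        ("TT", tops),
        ("BT", broader.get(term, [])),
        ("NT", narrower.get(term, [])),
        ("RT", related.get(term, [])),
    )
    for label, values in sections:
        lines.extend(f"  {label} {value}" for value in values)
    if term in scope_notes:
        lines.append(f"  SN {scope_notes[term]}")
    return "\n".join(lines)


# Hypothetical relationship data.
broader = {"Automobiles": ["Motor vehicles"], "Motor vehicles": ["Vehicles"]}
narrower = {"Automobiles": ["Sedans"]}
related = {"Automobiles": ["Highways"]}
scope_notes = {"Automobiles": "Passenger vehicles only"}

entry = display("Automobiles", broader, narrower, related, scope_notes)
print(entry)
```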

61 Copyright © 2005 Access Innovations, Inc. Standards Monolingual – NISO / ANSI – Z39.19 – ISO 2788 Multilingual – ISO 5964

62 Copyright © 2005 Access Innovations, Inc. ISO Standards Set up already - easy to adopt Multiple broader terms The standards outline procedures – ISO - better for implementation – NISO - much better reading

63 Copyright © 2005 Access Innovations, Inc. Why do we index ? Improve precision – define scope of terms Improve recall – different terms for same concept Guide to a field of expertise Learning tool Richer expression

64 Copyright © 2005 Access Innovations, Inc. Uses ? Indexing* – …process by which subject terms or classification symbols are assigned to concepts in documents – A thesaurus is also known as an indexing language – * not the building of the inverted file in computer sense of indexing

65 Copyright © 2005 Access Innovations, Inc. What are we controlling ? Synonyms – different terms same concept Polysemes or Homonyms – same word different meanings – Lead – Reading

66 Copyright © 2005 Access Innovations, Inc. How ? Meaning – delineation of scope of a term Term equivalence – linking of synonyms Disambiguation of homonyms – lead (metal) – lead (element) – lead (management)

67 Copyright © 2005 Access Innovations, Inc. Precision options Language specificity Coordination Compound terms - level of precoordination Homographs and scope notes Word distance indication

68 Copyright © 2005 Access Innovations, Inc. Precision options Structural relationships Links and roles Treatment and aspect codes Weighting

69 Copyright © 2005 Access Innovations, Inc. Disambiguation Bill – Invoice; Bill – Legislative; Bill – Sport; Bill – Person

70 Copyright © 2005 Access Innovations, Inc. Disambiguation Bills – Invoices; Bills – Legislation; Bill – Animal; Bill – Person (diagram relationship labels: PT, NT, BT, RT, RT, BT, NT)
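
Disambiguation of a homograph such as "bill" can be approximated with context cues: each sense lists words that tend to appear near it, and the sense with the most cues present wins. This is only a toy rule base with invented cue words and sense labels, not the rule syntax of any indexing product.

```python
import re

# Hypothetical rule base: homograph -> list of (sense, cue words).
RULES = {
    "bill": [
        ("Bills (invoices)", {"invoice", "payment", "due", "amount"}),
        ("Bills (legislation)", {"senate", "congress", "vote", "law"}),
        ("Bill (animal)", {"duck", "bird", "beak"}),
    ],
}


def disambiguate(word, text, rules=RULES):
    """Pick the sense whose cue words overlap most with the text, else None."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    best, best_score = None, 0
    for sense, cues in rules.get(word.lower(), []):
        score = len(cues & tokens)
        if score > best_score:
            best, best_score = sense, score
    return best


print(disambiguate("bill", "The Senate will vote on the bill"))
print(disambiguate("bill", "Please send payment for the bill by the due date"))
```

When no cue is present the function returns `None`, leaving the decision to a human indexer rather than guessing.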

71 Copyright © 2005 Access Innovations, Inc. 6. How to build and maintain a taxonomy

72 Copyright © 2005 Access Innovations, Inc. How to build a taxonomy Collect the terms Pull out authority terms Organize into arrays Choose top terms Organize hierarchically Flesh out term records Test, review, and edit
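
The "choose top terms" and "organize hierarchically" steps can be sketched from a list of narrower/broader pairs: any term that never appears as a narrower term is a top term, and the rest nest beneath their parents. The pairs and function names below are assumptions for illustration, and the sketch assumes the pairs contain no cycles.

```python
from collections import defaultdict


def organize(pairs):
    """Return (top_terms, children) from (narrower, broader) pairs."""
    children = defaultdict(list)
    has_parent = set()
    terms = []
    for narrower, broader in pairs:
        children[broader].append(narrower)
        has_parent.add(narrower)
        for term in (broader, narrower):
            if term not in terms:
                terms.append(term)
    tops = [t for t in terms if t not in has_parent]
    return tops, dict(children)


def render(terms, children, depth=0):
    """Return the hierarchy as indented lines, siblings in alphabetical order."""
    lines = []
    for term in sorted(terms):
        lines.append("  " * depth + term)
        lines.extend(render(children.get(term, []), children, depth + 1))
    return lines


# Hypothetical collected terms with their broader terms.
pairs = [
    ("Sedans", "Automobiles"),
    ("Automobiles", "Vehicles"),
    ("Trucks", "Vehicles"),
]

tops, children = organize(pairs)
print("\n".join(render(tops, children)))
```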

73 Copyright © 2005 Access Innovations, Inc. Or said another way … Define scope Collect terms and relationships Identify existing taxonomies Identify resources Create & refine taxonomy Apply taxonomy Review and update

74 Copyright © 2005 Access Innovations, Inc. Maintain Steady stream of terms – Web logs – Null sets – New announcements – Indexing team – Library – Records managers – Etc. Candidate terms Out of date is nearly useless

75 Copyright © 2005 Access Innovations, Inc. Best Results Measures Accuracy Productivity Hits, Misses and Noise Precision (Recall) Relevance Ease of set up Time to production
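
Hits, misses, and noise translate directly into precision and recall when automatically assigned terms are compared with the terms a human indexer would have chosen. In this sketch a hit is a term in both sets, noise is a term assigned but not expected, and a miss is a term expected but not assigned; those readings of the slide's labels, and the sample terms, are assumptions.

```python
def score_indexing(assigned, expected):
    """Compare assigned terms against the expected (human-chosen) terms."""
    assigned, expected = set(assigned), set(expected)
    hits = assigned & expected
    noise = assigned - expected
    misses = expected - assigned
    precision = len(hits) / len(assigned) if assigned else 0.0
    recall = len(hits) / len(expected) if expected else 0.0
    return {
        "hits": len(hits),
        "misses": len(misses),
        "noise": len(noise),
        "precision": precision,
        "recall": recall,
    }


# Hypothetical indexing result for one document.
result = score_indexing(
    assigned={"Automobiles", "Highways", "Taxes"},
    expected={"Automobiles", "Highways", "Traffic safety", "Tolls"},
)
print(result)
```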

76 Copyright © 2005 Access Innovations, Inc. Integration Thesaurus – full featured – multiple views – multiple versions – multiple languages Automatic indexing – filtering – assisted Data Harmony MAI and Thesaurus Master

77 Copyright © 2005 Access Innovations, Inc. Visual Taxonomy Ways to look – Hierarchical – Alphabetic – by term – Ring diagrams – Topic maps – Related terms Visual Taxonomy

82 Content Management System

83 Copyright © 2005 Access Innovations, Inc. API to Many Systems for CMS

84 Copyright © 2005 Access Innovations, Inc. Apply to the meta data Automatic application? Spider setting internally External web crawls – use all aliases Filter data Enhance search experience

85 Copyright © 2005 Access Innovations, Inc. Meta data The fields The elements – Class codes – Title – Author – Plaintiff – Product – subject / topic Meta Name Keywords in HTML
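
For the "Meta Name Keywords in HTML" case, assigned terms can be written into `<meta>` elements. The helper below is a sketch with hypothetical field names and values; it escapes attribute values with the standard library's `html.escape` so that quotes or ampersands in a term cannot break the markup.

```python
from html import escape


def meta_tags(metadata):
    """Render metadata fields as HTML <meta name=... content=...> lines."""
    tags = []
    for name, value in metadata.items():
        if isinstance(value, (list, tuple)):
            value = ", ".join(value)
        tags.append(
            f'<meta name="{escape(name, quote=True)}" '
            f'content="{escape(value, quote=True)}">'
        )
    return "\n".join(tags)


# Hypothetical metadata for one page.
html_head = meta_tags({
    "keywords": ["Taxonomies", "Thesauri"],
    "author": "Jane Doe",
})
print(html_head)
```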

87 7. How Taxonomies are used in Enterprise Information

88 Copyright © 2005 Access Innovations, Inc. Brand is repeated in several spots and tied to search as well

91 Another way of listing brands

92 Category list from taxonomy is tied to brand list and product list

93 Category code from the taxonomy is tied to the brand list and the product list

94 Copyright © 2005 Access Innovations, Inc. Enterprise Taxonomy Management Consistent application across entire site Synonyms are used interchangeably User doesn’t need to know the taxonomy Pop up view is helpful Site map for construction and browsing Allows hidden sections for internal use

95 Copyright © 2005 Access Innovations, Inc. Taxonomies Form the basis for knowledge sharing Add value to discussion Allow deeper retrieval Are straightforward to create Require on-going maintenance

96 Copyright © 2005 Access Innovations, Inc. Your Taxonomy There is too much information to pile it on the floor. It fits in many places in the information flow

98 Data Feed Thesaurus Master MAI to add Metadata Database Management System Add Metadata using MAI Search Inverted File Implementing a Taxonomy in a Content Management Portal

99 Copyright © 2005 Access Innovations, Inc. Thank you for your time! Questions? Marjorie M.K. Hlava Access Innovations, Inc. 505-998-0800 mhlava@accessinn.com www.accessinn.com
