Taxonomy Strategies LLC. 6-15 June 2007. Copyright 2007 Taxonomy Strategies LLC. All rights reserved. Taxonomy & metadata strategies for effective content.


1 Taxonomy & metadata strategies for effective content management
Melbourne, Sydney, Canberra Masterclass
Taxonomy Strategies LLC, 6-15 June 2007
Copyright 2007 Taxonomy Strategies LLC. All rights reserved.

2 Taxonomy Strategies LLC: The business of organized information
Today's agenda
9:00-9:10 (10 min) Introduction
9:10-9:15 (5 min) Warm-up exercise
9:15-9:45 (30 min) Taxonomy fundamentals: Building taxonomies
9:45-10:00 (15 min) Taxonomy exercise
10:00-10:30 (30 min) Taxonomy fundamentals: Taxonomy business case
10:30-11:00 (30 min) Tea Break
11:00-12:00 (60 min) Taxonomy governance
12:00-12:30 (30 min) Capabilities self-assessment
12:30-13:30 (60 min) Lunch
13:30-14:30 (60 min) Taxonomy benchmarking
14:30-14:45 (15 min) Benchmarking exercise
14:45-15:15 (30 min) Tea Break
15:15-16:15 (60 min) Content tagging
16:15-16:30 (15 min) Tagging exercise
16:30-17:00 (30 min) Q&A

3 Who I am: Joseph Busch
- Over 25 years in the business of organized information.
  - Founder, Taxonomy Strategies LLC
  - Director, Solutions Architecture, Interwoven
  - VP, Infoware, Metacode Technologies (acquired by Interwoven, November 2000)
  - Program Manager, Getty Foundation
  - Manager, Pricewaterhouse
- Metadata and taxonomies community leadership.
  - President, American Society for Information Science & Technology
  - Director, Dublin Core Metadata Initiative
  - Adviser, National Research Council Computer Science and Telecommunications Board
  - Reviewer, National Science Foundation Division of Information and Intelligent Systems
  - Founder, Networked Knowledge Organization Systems/Services

4 What we do: Organize Stuff

5 For us, taxonomy work includes:
- A metadata specification defines the properties needed to describe content so that it can be found & used.
- Vocabularies are collections of terms that are used to specify some of the metadata properties. Some vocabularies are big and hierarchical, some are small and flat.
- An application profile specifies what metadata & vocabularies are required, and then represents them formally.
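To make the three terms concrete, here is a minimal sketch of an application profile as a data structure. It is an illustration only: the class names (`Property`, `ApplicationProfile`), the property names and the vocabulary values are hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Property:
    """One metadata property in the specification (hypothetical model)."""
    name: str
    required: bool = False
    # Controlled vocabulary for this property; None means free text.
    vocabulary: Optional[frozenset] = None


@dataclass
class ApplicationProfile:
    """States which properties and vocabularies a content record must use."""
    properties: list = field(default_factory=list)

    def validate(self, record: dict) -> list:
        """Return a list of problems found in the record (empty if valid)."""
        problems = []
        for prop in self.properties:
            value = record.get(prop.name)
            if value is None:
                if prop.required:
                    problems.append(f"missing required property: {prop.name}")
                continue
            if prop.vocabulary is not None and value not in prop.vocabulary:
                problems.append(f"{prop.name}: '{value}' is not in the vocabulary")
        return problems


# Illustrative profile: one free-text property and two small, flat vocabularies.
profile = ApplicationProfile([
    Property("title", required=True),
    Property("content_type", required=True,
             vocabulary=frozenset({"Report", "Policy", "News"})),
    Property("audience", vocabulary=frozenset({"General Public", "Industry"})),
])

print(profile.validate({"title": "Annual review", "content_type": "Report"}))
```

Representing the profile formally, as here, is what lets a tagging tool reject a record before it is stored rather than after it fails to be found.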

6 Recent & current projects: http://www.taxonomystrategies.com/html/clients.htm
Government, Commercial, Not-for-Profit

7 Who are you? What sectors do you work in?
Your Role
- Administrator
- Records Manager
- Content Manager
- Communications
- Editor
- Information Architect
- Usability Expert
- Librarian
- Knowledge Engineer
- Ontologist
- Chief Information Officer
Industrial Sector
- Agriculture & Processing: Food, Lumber, Pulp & Paper
- Financial Services: Banking & Insurance
- Government: Public administration, Public safety
- High Tech: Computers, Software & Telecommunications
- Heavy Manufacturing: Steel, Automobiles & Aircraft
- Manufacturing: Consumer Products
- Medical & Health Care
- Mining & Refining: Petrochemicals, Oil & Gas
- Pharmaceuticals

8 Why are you here?
- What are the key questions that you want answered in today's workshop?
- Please rank the questions from the most important (5) to the least important (1).
- Please provide your job title, organization and department; your name is optional.
Form fields: Priority (1-5); Questions; Your title or role; Your org or industry; Your dept; Your name (optional)

10 The Taxonomy problem: How to pick from > 5,000 faucets?
By:
- Category
- Price
- Brand
- Color/Finish
- # Handles
- Series Name
- Water Filter?
- Faucet Spray
- Handle Shape
- Soap Dispenser?

11 The main issue: What goes here?
- When do the things in the list change?
- How do we maintain the list?
- What rules do we follow?

12 Seven phases of taxonomy development
(Gantt chart across weeks 1-12)
1. Identify Objectives: Conduct interviews
2. Inventory Resources: Identify, gather & review resources
3. Specify Metadata: Define fields & purpose
4. Model Content: Define content chunks & XML DTDs
5. Specify Vocabularies: Compile controlled vocabularies
6. Specify Procedures: Develop workflow, rules & procedures
7. Test & Train: Manually tag small sample

13 Taxonomy design phases need to be iterated
Phases: 1 Identify Objectives; 2 Inventory Resources; 3 Specify Metadata; 4 Model Content; 5 Specify Vocabularies; 6 Specify Procedures; 7 Test & Train
Plan & Prototype
- Interview core team and stakeholders
- Identify, gather & review resources
- Define fields & purpose
- Define content chunks & XML DTDs
- Compile controlled vocabularies
- Develop workflow rules & procedures
- Manually tag small sample
Alpha Dev & Test
- Gather additional resources, if any
- Revise if needed, bake into alpha CMS
- Revise, use in alpha CMS
- Alpha workflows in CMS
- Review tagged samples, default procedures
- Use alpha CMS to tag larger sample
Beta D&T
- Modify CMS for beta
- Revise, use in beta CMS
- Modify & extend workflows
- Gather additional sources, if any
- Interview alpha users
- Use beta CMS to tag larger sample
Final D&T
- Finalize training materials & train staff
- Modify for 1.0
- Revise using team procedure
- Finalize procedure materials
- Interview beta users

14 Licensing an existing taxonomy
See Factiva's taxonomy. www.taxonomywarehouse.com
- There are usually license fees, but these will be less than the effort to develop an equivalent taxonomy.
- But pre-existing taxonomies rarely fit an organization's needs and may require extensive customization.
Recommendation
- Adopt a faceted approach.
- Reuse existing (especially internal) vocabularies for as many of the facets as possible.
- Plan on doing full-custom Content Type and Topic taxonomies.

15 Free sources for 8 common taxonomies
Taxonomy | Definition | Potential Sources
Organization | Organizational structure. | SP 800-87, U.S. Government Manual, Your organizational structure, etc.
Content Type | Structured list of the various types of content being managed or used. | Dublin Core Type Vocabulary, AGLS Document Type, Your records management policy, etc.
Industry | Broad market categories such as lines of business, life events, or industry codes. | SIC, NAICS, Your market segments, etc.
Location | Place of operations or constituencies. | FIPS 5-2, FIPS 55-3, ISO 3166, UN Statistics Div, US Postal Service, Your sales regions, etc.
Business Activity | Business activities or functions performed to accomplish mission and goals. | Federal Enterprise Architecture Business Reference Model, Enterprise ontology, Your business functions, etc.
Topic | Business topics relevant to your mission & goals. | Federal Register Thesaurus, NAL Agricultural Thesaurus, Your research areas, etc.
Audience | Subset of constituents to whom a piece of content is directed or is intended to be used by. | GEM, ERIC Thesaurus, IEEE LOM, Your psycho-graphics or personas, etc.
Products & Services | Names of products/programs and services. | ERP system, Your products and services, etc.

16 Typical product catalog: A-Z, then idiosyncratic categories

17 How to analyze existing product catalog categories: Principles and priorities
Preparing a product catalog for facet browsing (aka Guided Navigation) requires a category hierarchy and additional attributes.
Principles
1. Categories and subcategories that could be swapped are candidates for conversion to attributes.
2. Repeated lists of subcategories signal a possible need for an attribute.
3. The number of attributes should not exceed six or seven, so not all attribute candidates should be used. Avoid selecting strongly correlated attributes, such as Weight and Shipping Weight.
Priorities
1. Choose Categories that apply to many products over those with few products.
2. Choose Attributes that apply to many Categories over those that apply only to very few categories.
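Principle 2 (repeated lists of subcategories) can be checked mechanically. The sketch below is an assumption about how one might do it, not a method described in the presentation: the function name `attribute_candidates`, the `min_parents` threshold and the sample catalog are all hypothetical.

```python
from collections import defaultdict


def attribute_candidates(catalog, min_parents=2):
    """Find subcategory names that repeat under several parent categories.

    `catalog` maps each category to its list of subcategories. A name that
    recurs under at least `min_parents` categories is a candidate attribute.
    """
    parents_by_sub = defaultdict(set)
    for category, subcategories in catalog.items():
        for sub in subcategories:
            parents_by_sub[sub].add(category)
    return {
        sub: sorted(parents)
        for sub, parents in parents_by_sub.items()
        if len(parents) >= min_parents
    }


# Hypothetical catalog: finishes repeat under every category.
catalog = {
    "Kitchen Faucets": ["Chrome", "Brushed Nickel", "Pull-Out Spray"],
    "Bathroom Faucets": ["Chrome", "Brushed Nickel", "Widespread"],
    "Shower Heads": ["Chrome", "Handheld"],
}

for name, parents in attribute_candidates(catalog).items():
    print(name, "appears under", parents)
```

Here "Chrome" and "Brushed Nickel" surface as candidates for a Finish attribute. Whether to adopt them is still an editorial decision, subject to the limit of six or seven attributes in principle 3.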

18 Product categories example: Wireless carrier
Products
- Accessories: Batteries, Cases, Chargers, Data, Hands-Free, Headsets, Miscellaneous
- Content: Purchased, Subscription
- Phones: Versatile Phones, Smart Devices, Basic Phones, Prepaid Phones, International Only Phones, Mobile Broadband Cards
- Services: Conferencing, Internet / Data, Landline Phone, Network & Roaming, Relay Services, Solutions, Wireless Data

19 Product attributes example: Digital cameras in an electronics catalog
Types of attributes
- Generic attributes
  - Brand/Product Family/Model
  - Price Range
  - Usually Ships
- Merchandising attributes
  - Usage (E-mail, Internet Browsing, Programming, …)
  - Segment (Home, Business, Education, Government …)
  - Region & Country
  - Most Popular
  - New
  - Related Products
- Specialized attributes
  - Capacity (Battery; Memory; MB; GB; BPS, …)
  - Resolution (DPI; Megapixels; XGA, XGA, UXGA, …)
  - Size (Display; Screen;...)
  - Standard (a, b, g, n, …; scsi, ata, sata, eide, …; dimm, simm, …)
  - Type (Camera; Battery; Display; Printer; Server; Storage; Switch; …)
Example facet counts
- Resolution: 3 Megapixels (4), 4 Megapixels (5), 5 Megapixels (27), 6-8 Megapixels (21)
- Brand: Canon (15), Fuji (10), Kodak (17), Nikon (8), Olympus (9)
- Type: Point & Shoot (25), Digital SLR (10), Packages (5)
- Price Range: $100-250 (5), $250-500 (16), $500-1000 (19), More than $1000 (3)
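The counts in parentheses are what a guided-navigation interface computes for the current result set. A minimal sketch of that computation follows; the helper names (`refine`, `facet_counts`) and the four sample records are hypothetical and are not the catalog data behind the slide.

```python
from collections import Counter


def refine(products, **selected):
    """Keep only products whose attributes match every selected facet value."""
    return [p for p in products
            if all(p.get(facet) == value for facet, value in selected.items())]


def facet_counts(products, facet):
    """Count how many products carry each value of one facet."""
    return Counter(p[facet] for p in products if facet in p)


# Illustrative records only.
cameras = [
    {"brand": "Canon", "type": "Point & Shoot", "megapixels": 5},
    {"brand": "Canon", "type": "Digital SLR", "megapixels": 8},
    {"brand": "Nikon", "type": "Digital SLR", "megapixels": 6},
    {"brand": "Kodak", "type": "Point & Shoot", "megapixels": 4},
]

print(facet_counts(cameras, "brand"))
slrs = refine(cameras, type="Digital SLR")
print(facet_counts(slrs, "brand"))
```

After each selection the counts are recomputed on the narrowed set, so the user never sees a facet value that would lead to zero results.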

20 Faceted taxonomy theory & practice
- How many terms are needed to provide sufficient granularity? Not as many as you think!
- Post-coordinate indexing allows several simple controlled vocabularies to be combined, rather than using a single large pre-coordinated vocabulary.

21 The power of faceted taxonomy
4 independent categories of 10 nodes each have the same discriminatory power as one hierarchy of 10,000 nodes (10^4).
- Easier to maintain
- Easier to tag by content authors
- Can be easier to navigate
It's more effective to increase the number of facets than to increase the number of terms per facet.
Example facets (10 terms each)
- Audience: Advocacy, Contractors & Grantees, Environmental Professionals, Federal Facilities, General Public, Industry, Kids, Researchers & Scientists, Small Business, Students
- Health: Advisory, Exposure, Food Safety, Health Assessment, Health Effect, Health Risk, Occupational Health, Pesticide Effects, Sun Protection, Toxicity
- Substance: Allergen, Biological Contaminant, Carcinogen, Chemical, Explosive, Liquid Waste, Microorganism, Ozone, Pesticide, Radioactive Waste
- Industry: Agriculture & Cattle, Automobile Repair, Chemical, Dry Cleaning, Electronics & Computer, Energy, Extractive Industries, Food Processing, Leather Tanning & Finishing, Metal Finishing
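The arithmetic behind this slide is easy to check. In the sketch below the function names are hypothetical; the only inputs are facet sizes. Discriminatory power is taken to be the number of distinct combinations (one term per facet), and maintenance effort is approximated by the total number of terms.

```python
from math import prod


def combinations(facet_sizes):
    """Distinct tag combinations when one term is chosen from each facet."""
    return prod(facet_sizes)


def terms_to_maintain(facet_sizes):
    """Total number of terms across all facets."""
    return sum(facet_sizes)


four_by_ten = [10, 10, 10, 10]
print(combinations(four_by_ten), terms_to_maintain(four_by_ten))

# Adding a fifth facet of 10 terms versus adding 2 terms to each of 4 facets.
more_facets = [10] * 5
more_terms = [12] * 4
print(combinations(more_facets), terms_to_maintain(more_facets))
print(combinations(more_terms), terms_to_maintain(more_terms))
```

Four facets of ten terms give 10,000 combinations from only 40 terms. A fifth facet (50 terms in total) gives 100,000 combinations, while growing each of the four facets to 12 terms (48 in total) gives 20,736, which is the sense in which adding facets is more effective than adding terms.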

22 Automatically created taxonomies
- Documents can be clustered based on similarities and differences.
- Problems:
  - Typically only a single hierarchy
  - No overall plan
  - Results hard for people to navigate
What does "North" mean on this map?

23 Automatic taxonomy construction software
- Software can scan large quantities of content and extract statistically significant words and phrases.
- Example: Archive of 10 publications analyzed for topics related to copyright.
- Software does a poor job of:
  - De-duplication.
  - Turning significant words and phrases into a larger structure.
  - Discriminating between gold and garbage.
- Software is good for:
  - Getting an understanding of the key noun phrases in a large collection.
  - Providing test cases for evaluating a taxonomy.
Source: Sample data courtesy of nStein.

24 Most popular flickr tags on 20 Feb 2007
http://www.flickr.com/photos/tags/
Sort flickr categories into 5 or fewer groups. Then label each group.

25 Taxonomy exercise: Facet grouping
Universal taxonomy facets
- By location (spatially)
- By time (chronologically)
- By type (genre)
- By physical properties (size, color, shape, etc.)
- By subject (topic)
Source: Richard Saul Wurman. Information Architects (1996)

26 Taxonomy exercise: Facet grouping
Sort flickr categories into 5 or fewer groups. Then label each group.

28 Business case and motivations for taxonomies
- How are we going to use content, metadata, and taxonomies in applications to obtain business benefits?

29 What technology analysts have said: Add metadata to search on!
- Adding metadata to unstructured content allows it to be managed like structured content. Applications that use structured content work better.
- Enriching content with structured metadata is critical for supporting search and personalized content delivery.
- Content that has been adequately tagged with metadata can be leveraged in usage tracking, personalization and improved searching.
- Better structure equals better access: Taxonomy serves as a framework for organizing the ever-growing and changing information within a company. The many dimensions of taxonomy can greatly facilitate Web site design, content management, and search engineering. If well done, taxonomy will allow for structured Web content, leading to improved information access.

30 Fundamentals of taxonomy ROI
- Tagging content using a taxonomy is a cost, not a benefit.
- There is no benefit without exposing the tagged content to users in some way that cuts costs or improves revenues.
- Putting taxonomy into operation requires UI changes and/or backend system changes, as well as data changes.
- You need to determine those changes, and their costs, as part of the ROI.

31 Product utilization: Taxonomy compared to search
- Conversion rate increases.
  - HomeDepot.com: Double digit increase.
  - 1-800-Flowers.com: More than a 10% increase.
  - Otto Group (Kaleidoscope, Freemans, Grattan, and lookagain catalogs): 130% increase.
- Lift in average order size.

32 Product catalog: Taxonomy compared to search
Benefit: Increased conversion rate & revenue lift
Web sales net income | | $80,000,000
Increased conversion rate | 30% | $24,000,000
Order size lift | 10% | $8,000,000
Potential revenue increase per year | | $32,000,000
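The figures on this slide can be reproduced with a few lines of integer arithmetic. The function and parameter names below are hypothetical; the inputs are the slide's own numbers.

```python
def revenue_lift(base_sales, conversion_increase_pct, order_size_lift_pct):
    """Return (conversion gain, order-size gain, total) in whole dollars.

    Follows the slide's model: each effect is applied to the base
    figure separately and the two gains are added.
    """
    conversion_gain = base_sales * conversion_increase_pct // 100
    order_size_gain = base_sales * order_size_lift_pct // 100
    return conversion_gain, order_size_gain, conversion_gain + order_size_gain


conversion_gain, order_size_gain, total = revenue_lift(80_000_000, 30, 10)
print(f"${conversion_gain:,} + ${order_size_gain:,} = ${total:,}")
```

The slide adds the two effects rather than compounding them, and the sketch does the same. Substituting your own base figure and percentages is the point of the exercise; the 30% and 10% here are the slide's illustrative assumptions.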

33 Usability research: Taxonomy compared to search
- "We found that users preferred a browsing oriented interface for a browsing task, and a direct search interface when they knew precisely what they wanted." Source: Marti Hearst (and others)
- "The category interface is superior to the list interface in both subjective and objective measures." Source: Hao Chen & Susan Dumais

34 Usability research: Taxonomy compared to search
(Chart: Median Search Time in Seconds)
- In top 20 results: Category is 36% faster
- Not in top 20 results: Category is 48% faster
Source: Chen & Dumais

35 Time saved: Taxonomy compared to search
1 hour per day searching x 36% faster = 22 minutes each day
22 minutes x 250 working days per year = 5500 minutes or 92 hours per year
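A short sketch of the same calculation, with hypothetical function and parameter names, makes the rounding explicit:

```python
def time_saved(search_minutes_per_day, speedup_pct, working_days):
    """Return (minutes saved per day, minutes per year, hours per year).

    Rounds the daily saving to whole minutes before annualizing,
    as the slide does.
    """
    minutes_per_day = round(search_minutes_per_day * speedup_pct / 100)
    minutes_per_year = minutes_per_day * working_days
    hours_per_year = round(minutes_per_year / 60)
    return minutes_per_day, minutes_per_year, hours_per_year


per_day, per_year, hours = time_saved(60, 36, 250)
print(per_day, per_year, hours)
```

The daily saving is 21.6 minutes before rounding. Without the intermediate rounding the annual figure would be 5,400 minutes, or 90 hours, so the slide's 92 hours is slightly generous.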

36 Time saved: Taxonomy compared to search
Benefit: Increase service efficiency
Number of call center calls per month | 50,000
Average cost per call | $20
Call response costs per month | $1,000,000
Total call response costs per year | $12,000,000
Percentage of self-serviced calls due to improved information browsing | 30%
Service costs savings per year | $3,600,000

37 Trusted advisers: Taxonomy avoids costs
- "The amount of time wasted in futile searching for vital information is enormous, leading to staggering costs …" Source: Sue Feldman
- "Sun's usability experts calculated that 21,000 employees were wasting an average of six minutes per day due to inconsistent intranet navigation structures. When lost time was multiplied by staff salaries, the estimated productivity loss exceeded $10M per year, about $500 per employee per year." Source: Jakob Nielsen, useit.com

38 Knowledge workers spend up to 2.5 hours each day looking for information … but find what they are looking for only 40% of the time.
Source: Kit Sims Taylor

39 Knowledge workers spend more time re-creating existing content (25%) than creating new content (8%)
Source: Kit Sims Taylor (cited by Sue Feldman in her original article)

40 Cost saved by not recreating content
Benefit: Increase in productivity
Number of employees | 100
Average employee salary | $80,000
Employee costs per year | $8,000,000
Increase in productivity from not re-creating content | 25%
Employee cost savings per year | $2,000,000
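Both cost-saving tables in this section (call center self-service and avoided re-creation of content) have the same shape: an annual cost multiplied by an assumed percentage saved. A minimal sketch, with a hypothetical function name and the slides' own inputs:

```python
def annual_savings(annual_cost, savings_pct):
    """Savings in whole dollars for a given annual cost and percentage saved."""
    return annual_cost * savings_pct // 100


# Call center: 50,000 calls per month at $20 per call, 30% self-serviced.
call_center_cost = 50_000 * 20 * 12
call_center_savings = annual_savings(call_center_cost, 30)

# Productivity: 100 employees at $80,000, 25% of time spent re-creating content.
staff_cost = 100 * 80_000
productivity_savings = annual_savings(staff_cost, 25)

print(f"${call_center_savings:,}")
print(f"${productivity_savings:,}")
```

The percentages are the sensitive inputs. Because the result scales linearly with them, halving the assumed self-service rate or productivity gain halves the projected saving.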

41 Business case summary
1. Classifications and classification-like schemes are being used to facilitate information seeking in the workplace and on the web.
2. Users take advantage of (and prefer) this type of scheme (faceted navigation) when it is made available in the user interface.
3. Hierarchical or facet navigation can be guided by the User Interface.
4. Facet navigation is best combined with keyword searching, e.g., keyword search followed by faceted navigation of results.

43 Taxonomy requires a business process
- Taxonomies must change, gradually, over time if they are to remain relevant.
- Maintenance processes need to be specified so that the changes are based on rational cost/benefit decisions.

44 Taxonomy governance can be viewed as a standards process
- Taxonomy must evolve, but in a predictable way.
- Team structure, with an appeals process.
  - Taxonomy stewardship is a part-time role at most organizations.
  - Team needs to make decisions based on costs and benefits.
- Documentation and educational materials.
- Comment-handling responsibilities (part of error-correction process).
- Issue Logs.
- Release Schedule.

45 Taxonomy governance: Change process overview
(Diagram: CV sources feed the Taxonomy Governance Environment, which publishes to CV consumers. CV = Controlled Vocabulary.)
1. External controlled vocabularies (CVs) change on their own schedule.
2. Taxonomy Team decides when to update CV snapshots.
3. Team adds value via definitions, synonyms, classification rules, training materials, etc.
4. Updated versions of CVs are published to consumers.
CV sources: Internally Created (Subject Codes, Expertise, Other Internal); External Standard.
Taxonomy Governance Environment: Taxonomy Facets; working copies of CVs, maintained in the Taxonomy Tool.
CV consumers: Site Search Tool, Portal, Working Papers, Web CMS, DAM, Tagging Tool, Search UI.
NASA version of the same diagram: the NASA Taxonomy Team decides when to update snapshots of external CVs within the NASA Taxonomy Governance Environment. Sources are Internally Created CVs, Codes, NASA Competencies, CVs from other NASA Sources, and External Standard Vocabularies. Consumers are Site Search Tool, Portal, Project Archives, DMS, Metatagging Tool, and Search UI.

46 Who should build the taxonomy?
- The taxonomy (and metadata specification) should be produced by a cross-functional team which includes business, technical, information management, and content creation stakeholders.
- The team should plan on maintaining the taxonomy as well as building it.
  - Maintenance will not (usually) be anyone's full-time job.
  - Exact mix of people on team will change.
- It should be built in an iterative fashion, with more content and broader review for each iteration.

47 Taxonomy governance: Generic team charter
- Taxonomy Team is responsible for maintaining:
  - The Taxonomy, a multi-faceted classification scheme.
  - Associated taxonomy materials, such as Editorial Style Guides, Taxonomy Training Materials, and the Metadata Standard.
  - Team rules and procedures for change management.
- Taxonomy Team will consider costs and benefits of suggested changes.
- Taxonomy Team will:
  - Manage relationship between providers of source vocabularies and consumers of the Taxonomy.
  - Identify new opportunities for use of the Taxonomy across the enterprise to improve information management practices.
  - Promote awareness and use of the Taxonomy.

48 Taxonomy governance team: Generic roles
- Keeps committee on track with larger business objectives.
- Balances cost/benefit issues to decide appropriate levels of effort.
- Obtains needed resources if those on committee can't accomplish a particular task.
- Estimates costs of proposed changes in terms of amount of data to be retagged, additional storage and processing burden, software changes, etc.
- Helps obtain data from various systems.
- Committee's liaison to content creators.
- Estimates costs of proposed changes in terms of editorial process changes, additional or reduced workload, etc.
- Suggests potential taxonomy changes based on analysis of query logs, indexer feedback.
- Makes edits to taxonomy, installs into system with aid of IT specialist.
- Reality check on process change suggestions.

49 Where taxonomy changes come from
Diagram elements: End User, Firewall, Application UI, Application Logic, Tagging UI, Tagging Logic, Tagging Staff, Content, Taxonomy, Taxonomy Editor, Taxonomy Team.
Sources of change
- End user experience
- Staff notes missing concepts
- Query log analysis
- Requests from other parts of the organization (in the NASA version of the diagram, other parts of NASA)
Recommendations by Editor
1. Small taxonomy changes (labels, synonyms)
2. Large taxonomy changes (retagging, application changes)
3. New "best bets" content.
Team Considerations
1. Business goals.
2. Changes in user experience.
3. Retagging cost.
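Query log analysis is the source of change that is easiest to automate. The sketch below is one simple way to surface candidate missing concepts, under stated assumptions: the function name `missing_concepts`, the exact-match comparison and the sample data are hypothetical, and a real log would need synonym handling and spelling normalization.

```python
from collections import Counter


def missing_concepts(queries, taxonomy_terms, min_count=2):
    """List frequent queries that match no taxonomy term.

    Queries are compared case-insensitively after trimming whitespace.
    Returns (query, count) pairs, most frequent first.
    """
    known = {term.lower() for term in taxonomy_terms}
    counts = Counter(q.strip().lower() for q in queries)
    return [(q, n) for q, n in counts.most_common()
            if n >= min_count and q not in known]


# Illustrative log and term list only.
query_log = ["Sun Protection", "sunscreen", "sunscreen ", "Toxicity",
             "sunscreen", "mold", "mold"]
terms = ["Sun Protection", "Toxicity"]

print(missing_concepts(query_log, terms))
```

In this example "sunscreen" would most likely become a synonym of an existing term (a small change), while "mold" might justify a new term (a larger change with retagging cost), which mirrors the editor's two kinds of recommendation.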

50 Taxonomy maintenance processes
- Different organizations will need to consider their own change processes.
  - Organization 1: A custodian is responsible for the content, but checks facts with department heads before making changes.
  - Organization 2: Analysts suggest changes, editors approve, copyeditors verify consistency.
  - Organization 3: Marketing reps ask for a change, taxonomy editor makes demo, web representative approves it.
- Change process MUST also consider the cost of implementing the change:
  - Retagging data.
  - Reconfiguring auto-classifier.
  - Retraining staff.
  - Changes in user expectations.

51 Taxonomy maintenance workflow
(Swimlane diagram. Roles: Analyst, Editor, Copywriter, Sys Admin. System: Taxonomy Tool.)
Steps: Suggest new name/category; Review new name; Problem? (Yes/No); Copy edit new name; Add to enterprise Taxonomy.

52 Sample taxonomy editor: Data Harmony
(Screenshot panels: Hierarchy Browser; Standard Term Info)

53 Taxonomy editing tools vendors
(Quadrant chart: Ability to Execute, low to high, against Completeness of Vision. Labeled quadrants: Visionaries, Niche Players.)
- Most popular taxonomy editor is MS Excel.
- An immature area: no vendors are in the upper-right quadrant!
- MultiTes is widely used, cheap with functionality.
- High functionality/high cost products ($100K+).

54 Taxonomy maturity model
- Taxonomy governance processes must fit the organization.
- As consultants, we notice different levels of maturity in the business processes around content management, taxonomy, and metadata.
- Honestly assess your organization's metadata maturity in order to design appropriate governance processes.
- We are starting to define a maturity model, similar to the Software Capability Maturity Model (CMM):
  - Initial: Ad hoc, each project begins from scratch.
  - Repeatable: Procedures defined and used, but not standardized across organization or are misapplied to projects.
  - Defined: Standard processes are tailored for project needs. Strategic training for long-range goals is in place.
  - Managed: Projects managed using quantitative quality measures. Process itself is measured and controlled.
  - Optimizing: Continual process improvement. Extremely accurate project estimation.

55 Purpose of maturity model
- Estimating the maturity of an organization's information management processes tells us:
  - How involved the taxonomy development and maintenance process should be. Overly sophisticated processes will fail.
  - What to recommend as first steps.
- Maturity is not a goal; it is a characterization of an organization's methods for achieving particular goals.
- Mature processes have expenses which must be justified by consequent cost savings or revenue gains.
- IT Maturity may not be core to your business.

56 Taxonomy maturity scorecard
Rating scale: Initial, Repeatable, Defined, Managed, Optimizing (on the slide, each item is marked * under one level)
- Organizational Structure: Executive Sponsorship; Budgeting; Hiring & Training
- Quality Assurance: Manual Processes (note 1); Automated Processes
- Project Management: Estimating & Scheduling; Cost Control; Project Methodology (note 2)
- Design and Execution: Planning; Design Excellence; Development Maturity
Note 1: X is starting to examine search query logs, which is an important first step in improving search. But this is only an isolated example.
Note 2: IT has a project methodology they are trying to use across all projects. But not all business units have project methodologies.

57 Taxonomy governance self-assessment
Background
1. Rate your organization's overall taxonomy maturity from 1 to 10. (Immature) 1 2 3 4 5 6 7 8 9 10 (Mature)
2. What type of change was most recently made to your organization's taxonomy management environment? Functionality / Standards / Tools / People / Data Quality
3. What is the area for improvement in your organization's taxonomy management environment? Functionality / Standards / Tools / People / Data Quality
Basic
1. Is there a process in place to examine search query logs? Yes / No
2. Is there an organization-wide metadata standard, such as the Dublin Core, for use by search tools? Yes / No
Intermediate
1. Is there an ongoing data cleansing procedure to look for any redundant, obsolete or trivial content (ROT)? Yes / No. If there is a process, describe it briefly.
2. Does the search engine index more than 4 repositories around the organization?
3. Are system features and metadata fields added based on cost/benefit analysis, or because they are easy to do with the current applications and tools? Cost/Benefit / Easy
4. Are applications and tools acquired after requirements have been analyzed, or are major purchases sometimes made to use up year-end money? Requirements / Year-End
5. Are there hiring and training practices for metadata and taxonomy positions? Yes / No. If there is training, describe it briefly.
Advanced
1. Are there established qualitative and quantitative measures of metadata quality? Yes / No. If there are measures, describe them briefly.
2. Can the CEO explain the return on investment (ROI) for content management, search and metadata? Yes / No

58 2005 Maturity survey: Search practices (n=87)
Practice | Not current practice | Being developed | In practice | Former practice | NA or Unknown
Search Box in standard place on all web pages. | 20% (12) | 11% (7) | 62% (38) | 2% (1) | 5% (3)
Search engine indexes multiple repositories in addition to web sites. | 25% (15) | 21% (13) | 44% (27) | 2% (1) | 8% (5)
Spell Checking. | 31% (19) | 18% (11) | 38% (23) | 0% (0) | 13% (8)
Synonym Searching. | 41% (25) | 23% (14) | 30% (18) | 0% (0) | 7% (4)
Search results grouped by date, location, or other factors in addition to simple relevance score. | 37% (22) | 20% (12) | 37% (22) | 0% (0) | 7% (4)
Queries are logged and the logs are regularly examined. | 31% (19) | 25% (15) | 31% (19) | 5% (3) | 8% (5)
Common queries identified, 'best' pages for those queries are found, and search engine configured to return them at the top. (Best Bets) | 46% (28) | 25% (15) | 21% (13) | 0% (0) | 8% (5)
Advanced computation of relevance based on data in addition to the text of the document. | 43% (26) | 16% (10) | 25% (15) | 0% (0) | 16% (10)
A faceted search tool, such as Endeca, has been implemented for the organization's external site or product catalog search. | 68% (41) | 7% (4) | 10% (6) | 0% (0) | 15% (9)
A faceted search tool, such as Endeca, has been implemented for the organization's internal website(s) or portal. | 57% (34) | 15% (9) | 17% (10) | 0% (0) | 12% (7)

59 2005 Maturity survey: Metadata practices (n=87)
Practice | Not current practice | Being developed | In practice | Former practice | NA or Unknown
Metadata standards are developed for the needs of each system with no overall attempt to unify them. | 22% (13) | 12% (7) | 37% (22) | 20% (12) | 10% (6)
An Organization-wide metadata standard exists and new systems consider it during development. | 37% (22) | | 20% (12) | 0% (0) | 7% (4)
The Organization-wide metadata standard is based on the Dublin Core. | 52% (30) | 16% (9) | 21% (12) | 0% (0) | 12% (7)
Multiple repositories comply with metadata standard. | 52% (31) | 20% (12) | 17% (10) | 0% (0) | 12% (7)
A Cataloging Policy document exists to teach people how to tag data in compliance with organizational metadata standard. | 48% (29) | 20% (12) | | 0% (0) | 12% (7)
The Cataloging Policy document is revised periodically. | 48% (29) | 15% (9) | 17% (10) | 0% (0) | 20% (12)
A centralized metadata repository exists to aggregate and unify metadata from disparate sources. | 57% (34) | 17% (10) | | 0% (0) | 10% (6)
Metadata is manually entered into web forms. | 15% (9) | 12% (7) | 61% (36) | 3% (2) | 8% (5)
Metadata is generated automatically by software. | 38% (23) | 18% (11) | 27% (16) | 2% (1) | 15% (9)
Metadata is generated automatically, then reviewed manually for correction. | 48% (29) | 18% (11) | 17% (10) | 2% (1) | 15% (9)

60 2005 Maturity survey: Taxonomy practices (n=87)
Practice | Not current practice | Being developed | In practice | Former practice | NA or Unknown
Org Chart Taxonomy - One based primarily on the structure of the organization. | 36% (21) | 10% (6) | 34% (20) | 5% (3) | 15% (9)
Products Taxonomy - One based primarily on the products and/or services offered by the organization. | 37% (22) | 10% (6) | 32% (19) | 5% (3) | 15% (9)
Content Types Taxonomy - One based primarily on the different types of documents. | 28% (16) | 21% (12) | 40% (23) | 5% (3) | 7% (4)
Topical Taxonomy - One based primarily on topics of interest to the site users. | 20% (12) | 36% (21) | 34% (20) | 3% (2) | 7% (4)
Faceted Taxonomy - One which uses several of the approaches above. | 32% (19) | 29% (17) | 34% (20) | 0% (0) | 5% (3)
The Taxonomy, or a portion of it, was licensed from an outside taxonomy vendor. | 75% (44) | 3% (2) | 14% (8) | 0% (0) | 8% (5)
The Taxonomy follows a written 'style guide' to ensure its consistency over time. | 47% (28) | 22% (13) | 20% (12) | 0% (0) | 10% (6)
The Taxonomy is maintained using a taxonomy editing tool other than MS Excel. | 35% (21) | 17% (10) | 40% (24) | 2% (1) | 7% (4)
The Taxonomy was validated on a representative sample of content during its development. | 28% (17) | 22% (13) | 33% (20) | 3% (2) | 13% (8)
A Roadmap for the future evolution of the Taxonomy has been developed. | 38% (23) | 40% (24) | 13% (8) | 0% (0) | 8% (5)

62 Taxonomy testing methods
Method | Process | Who | Requires | Validation
Walk-thru | Show & explain | Taxonomist; SME; Team | Rough taxonomy | Approach; Appropriateness to task
Walk-thru | Check conformance to editorial rules | Taxonomist | Draft taxonomy; Editorial Rules | Consistent look and feel
Usability Testing | Contextual analysis (card sorting, scenario testing, etc.) | Users | Rough taxonomy; Tasks & Answers | Tasks are completed successfully; Time to complete task is reduced
User Satisfaction | Survey | Users | Rough Taxonomy; UI Mockup; Search prototype | Reaction to taxonomy; Reaction to new interface; Reaction to search results
Tagging Samples | Tag sample content with taxonomy | Taxonomist; Team; Indexers | Sample content; Rough taxonomy (or better) | Content fit; Fills out content inventory; Training materials for people & algorithms

63 Walk-through method: Show & explain
Example: ABC Computers.com
Audience: All; Business; Employee; Education; Gaming Enthusiast; Home; Investor; Job Seeker; Media; Partner; Shopper (First Time, Experienced, Advanced); Supplier
Line of Business: All; Home & Home Office; Gaming; Government, Education & Healthcare; Medium & Large Business; Small Business
Region-Country: All; Asia-Pacific; Canada; EMEA; Japan; Latin America & Caribbean; United States
Product Family: Desktops; MP3 Players; Monitors; Networking; Notebooks; Printers; Projectors; Servers; Services; Storage; Televisions; Other Brands
Content Type: Award; Case Study; Contract & Warranty; Demo; Magazine; News & Event; Product Information; Services; Solution; Specification; Technical Note; Tool; Training; White Paper; Other Content Types
Competency: Business & Finance; Interpersonal Development; IT Professionals Technical Training; IT Professionals Training & Certification; PC Productivity; Personal Computing Proficiency
Industry: Banking & Finance; Communications; E-Business; Education; Government; Healthcare; Hospitality; Manufacturing; Petro-chemicals; Retail / Wholesale; Technology; Transportation; Other Industries
Service: Assessment, Design & Implementation; Deployment; Enterprise Support; Client Support; Managed Lifecycle; Asset Recovery & Recycling; Training

64 Walk-through method: Editorial rules consistency check
• Abbreviations
• Ampersands
• Capitalization
• General…, More…, Other…
• Languages & character sets
• Length limits
• Multiple parents
• Plural vs. singular form
• Scope notes
• Serial comma
• Sources of terms
• Spaces
• Synonyms & acronyms
• Term order (Alphabetic or …)
• Term label order (Direct vs. inverted)
…
Rule Name | Editorial Rule
Abbreviations | Abbreviations, other than colloquial terms and acronyms, shall not be used in term labels. Example: Public Information. NOT: Public Info.
Ampersands | The ampersand [&] character shall be used instead of the word "and". Example: Licensing & Compliance. NOT: Licensing and Compliance.
Capitalization | Title case capitalization shall be used. Example: Customer Service. NOT: CUSTOMER SERVICE. NOT: Customer service. NOT: customer service.
General…, More…, Other… | The term labels General…, More…, and Other… shall be used for categories which contain content items that are not further classifiable. Examples: Other Property; Other Services; General Information; General Audience.
… | …

65 Task-based testing*
• 15 representative questions were selected
  – Perspective of various organizational units
  – Most frequent website searches
  – Most frequently accessed website content
  – Correct answers to the questions were agreed in advance by team.
• 15 users were tested
  – Did not work for the organization
  – Represented target audiences
• Testers were asked "where would you look for …"
  – "… under which facet: Topic, Commodity, or Geography?"
  – Then, "… under which category?"
  – Then, "… under which sub-category?"
  – Tester choices were recorded
• Testers were asked to think aloud
  – Notes were taken on what they said
• Pre- and post questions were asked
  – Tester answers were recorded
* Based on Donna Maurer's usability work with the Australian government

66 Task-based testing: Representative questions
1. How much cotton is imported from China?
2. What are the impacts of "mad cow" disease on U.S. meat production, sales?
3. What is the average farm income level in your state?
4. How much of our diet comes from fast food?
5. How many people receive WIC benefits (Special Supplemental Nutrition Program for Women, Infants, and Children)?
6. How much acreage is planted to genetically engineered corn?
7. What is the cost of foodborne illness in the United States?
8. What part of food costs go to farmers, retailers?
9. Which States produce the most tobacco?
10. What percentage of farms in the United States are small farms?
11. What are the costs and benefits associated with providing more traceability in the U.S. food supply?
12. How many people in America don't get enough to eat?
13. What is behind the trade balance (surplus or deficit) in agricultural goods?
14. What is the extent of conservation compliance? How does that impact farmers' decisions?
15. What are the impacts of foreign trade restrictions on U.S. farmers, U.S. food prices?

67 Task-based testing: Closed card sorting
Question 3. What is the average farm income level in your state?
Facets: 1. Topics; 2. Commodities; 3. Geographic Coverage
1. Topics
  1.1 Agricultural Economy
  1.2 Agriculture-Related Policy
  1.3 Diet, Health & Safety
  1.4 Farm Financial Conditions
  1.5 Farm Practices & Management
  1.6 Food & Agricultural Industries
  1.7 Food & Nutrition Assistance
  1.8 Natural Resources & Environment
  1.9 Rural Economy
  1.10 Trade & International Markets
1.4 Farm Financial Conditions
  1.4.1 Costs of Production
  1.4.2 Commodity Outlook
  1.4.3 Farm Financial Management & Performance
  1.4.4 Farm Income
  1.4.5 Farm Household Financial Well-being
  1.4.6 Lenders & Financial Markets
  1.4.7 Taxes

68 Task-based testing: Card sort analysis
Find-it Tasks | Categories chosen (User 1 to User 5)
1. Cotton | Cotton; Asia; Cotton
2. Mad cow | Cattle; Food Safety; Cattle
3. Farm income | Farm Income; US States; Farm Income
4. Fast food | Food Consumption; Diet Quality & Nutrition; Food Expenditures; Diet Quality & Nutrition
5. WIC | WIC Program
6. GE Corn | Corn
7. Foodborne illness | Foodborne Disease; Consumer Food Safety; Foodborne Disease
8. Food costs | Food Prices; Market Structure; Market Analysis; Food Expenditures; Retailing & Wholesaling
9. Tobacco | Tobacco
10. Small Farms | Farm Structure
11. Traceability | Food System; Labeling Policy; Food Safety Innovations; Food Safety Policy; Food Prices
12. Hunger | Food Security
13. Trade balance | Commodity Trade; Trade & Intl Markets; Commodity Trade; Market Analysis; Commodity Trade
14. Conservation | Cropping Practices; Conservation Policy
15. Trade restrictions | Trade Policy; Food Safety & Trade; WTO; Market Analysis; Commodity Trade

69 Task-based testing: Card sort results
• In 80% of the trials users looked for information under the categories that we expected them to look for it.
• Breaking up topics into facets makes it easier to find information, especially information related to commodities.

70 Task-based testing: Card sort results
Test Questions | % Correct | % Agree
1. Cotton | 91% | 82%
2. Mad cow | 73% | 64%
3. Farm income | 100% | 55%
4. Fast food | 91% | 73%
5. WIC | 100% |
6. GE corn | 100% |
7. Foodborne illness | 82% |
8. Food costs | 55% | 27%
9. Tobacco | 100% |
10. Small farms | 91% |
11. Traceability | 36% | 18%
12. Hunger | 100% | 73%
13. Trade balance | 36% | 64%
14. Conservation | 91% |
15. Trade restrictions | 55% | 36%
Callouts:
• Possible change required.
• Change required.
• Possible error in categorization of this question because 64% thought the answer should be Commodity Trade.
• On these trials, only 50% looked in the right category, & only 27-36% agreed on the category.
• Policy of Traceability needs to be clarified. Use quasi-synonyms.
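The two result columns can be approximated with a short script. This is a minimal sketch, not the study's actual method: it assumes "% Correct" is the share of testers who chose the category agreed in advance, and "% Agree" is the share who chose the single most popular category. The function name `score_task` and the 11 tester choices are hypothetical; they were picked only so the output mirrors the 36% / 64% figures reported for the trade balance question.

```python
from collections import Counter


def score_task(choices, expected):
    """Return (% correct, % agree) for one find-it task.

    Assumed definitions (not stated on the slide):
      % correct = share of testers who chose the expected category
      % agree   = share of testers who chose the most popular category
    """
    counts = Counter(choices)
    total = len(choices)
    pct_correct = 100 * counts[expected] / total
    pct_agree = 100 * counts.most_common(1)[0][1] / total
    return round(pct_correct), round(pct_agree)


# Hypothetical raw choices for question 13 (trade balance).
trade_balance_choices = ["Commodity Trade"] * 7 + ["Trade & Intl Markets"] * 4
trade_balance_scores = score_task(trade_balance_choices, "Trade & Intl Markets")
print(trade_balance_scores)
```

A low "% Correct" paired with a high "% Agree", as here, suggests the expected answer rather than the testers may be what needs revisiting.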

71 Task-based testing: User satisfaction survey
• Was it easy, medium or difficult to choose the appropriate Topic?
  – Easy – Medium – Difficult
• Was it easy, medium or difficult to choose the appropriate Commodity?
  – Easy – Medium – Difficult
• Was it easy, medium or difficult to choose the appropriate Geographic Coverage?
  – Easy – Medium – Difficult

72 User satisfaction survey: Results
[Chart; scale runs from Easier to More Difficult]

73 User interface survey: Which search UI is better?
• Criteria
  – User satisfaction
  – Success completing tasks
  – Confidence in results
  – Fewer dead ends
• Methodology
  – Design tasks from specific to general
  – Time performance
  – Calculate success rates
  – Survey subjective criteria
  – Pay attention to survey hygiene: participant selection, counterbalancing, T-scores
Source: Yee, Swearingen, Li, & Hearst

74 User interface survey: Results (1)
Which interface would you rather use for these tasks?
Task | Google-like Baseline | Faceted Category
Find images of roses | 15 | 16
Find all works from a certain period | 2 | 30
Find pictures by 2 artists in the same media | 1 | 29
…
Overall assessment:
Criterion | Google-like Baseline | Faceted Category
More useful for your usual tasks | 4 | 28
Easiest to use | 8 | 23
Most flexible | 6 | 24
More likely to result in dead-ends | 28 | 3
Helped you learn more | 1 | 31
Overall preference | 2 | 29
…
Source: Yee, Swearingen, Li, & Hearst

75 User interface survey: Results (2)
[Chart comparing Faceted Category and Google-like Baseline]
Source: Yee, Swearingen, Li, & Hearst

76 Tagging samples: How many items?
Goal | Number of Items | Criteria
Illustrate metadata schema | 1-3 | Random (excluding junk)
Develop training documentation | 10-20 | Show typical & unusual cases
Qualitative test of small vocabulary (<100 categories) | 25-50 | Random (excluding junk)
Quantitative test of vocabularies* | 3-10X number of categories | Use computer-assisted methods when more than 10-20 categories. Pre-existing metadata is the most meaningful.
*Quantitative methods require large amounts of tagged content. This requires specialists, or software, to do tagging. Results may be very different than how real users would categorize content.

77 Tagging samples: Manually tagged metadata sample
Attribute | Values
Title | Jupiter's Ring System
URL | http://ringmaster.arc.nasa.gov/jupiter/
Description | Overview of the Jupiter ring system. Many images, animations and references are included for both the scientist and the public.
Content Types | Web Sites; Animations; Images; Reference Sources
Audiences | Educators; Students
Organizations | Ames Research Center
Missions & Projects | Voyager; Galileo; Cassini; Hubble Space Telescope
Locations | Jupiter
Business Functions | Scientific and Technical Information
Disciplines | Planetary and Lunar Science
Time Period | 1979-1999

78 Tagging samples: Spreadsheet for tagging 10s-100s of items
1) Clickable URLs for sample content
2) Review small sample and describe
3) Drop-down for tagging (including "Other" entry for the unexpected)
4) Flag questions

79 Rough bulk tagging: Facet demo (1)
• Collections: 4 content sources
  – NTRS, SIRTF, Webb, Lessons Learned
• Taxonomy
  – Converted MultiTes format into RDF for Seamark
• Metadata
  – Converted from existing metadata on web pages, or
  – Created using simple automatic classifier (string matching with terms & synonyms)
  – 250k items, ~12 metadata fields, 1.5 weeks effort
• OOTB Seamark user interface, plus logo
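A "simple automatic classifier (string matching with terms & synonyms)" could look roughly like the sketch below. It is an illustration under stated assumptions, not the classifier used in the demo: the vocabulary, the synonym lists, and the function names `build_patterns` and `tag` are all hypothetical, and matching is case-insensitive on whole words only.

```python
import re

# Hypothetical vocabulary: preferred term -> list of synonyms (illustrative only).
VOCAB = {
    "Jupiter": ["Jovian"],
    "Hubble Space Telescope": ["HST", "Hubble"],
    "Voyager": [],
}


def build_patterns(vocab):
    """Compile one case-insensitive, whole-word pattern per preferred term."""
    patterns = {}
    for term, synonyms in vocab.items():
        # Longest labels first so multi-word labels are tried before short ones.
        labels = sorted([term, *synonyms], key=len, reverse=True)
        alternation = "|".join(re.escape(label) for label in labels)
        patterns[term] = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return patterns


def tag(text, patterns):
    """Return the preferred terms whose label or synonym occurs in the text."""
    return sorted(term for term, pattern in patterns.items() if pattern.search(text))


PATTERNS = build_patterns(VOCAB)
print(tag("HST images of the Jovian ring system", PATTERNS))
```

String matching of this kind is fast enough for bulk tagging but blind to context, which is why the deck calls the result "rough".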

80 Rough bulk tagging: Facet demo (2)
[Screenshot: Facet demo]

81 Document distribution: How evenly does it divide the content?
• Documents do not distribute uniformly across categories
• Zipf (1/x) distribution is expected behavior
• 80/20 rule in action (actually 70/20 rule)
[Chart callouts: Leading candidate for splitting; Leading candidates for merging]

82 Document distribution: How evenly does it divide the content?
• Methodology: 115 randomly selected URLs from corporate intranet search index were manually categorized. Inaccessible files and junk were removed.
• Results: Slightly more uniform than Zipf distribution. Above the curve is better than expected.

83 Document distribution: How does taxonomy shape match that of content?
Background:
• Hierarchical taxonomies allow comparison of fit between content and taxonomy areas
Methodology:
• 25,380 resources tagged with taxonomy of 179 terms. (Avg. of 2 terms per resource)
• Counts of terms and documents summed within taxonomy hierarchy
Results:
• Roughly Zipf distributed (top 20 terms: 79%; top 30 terms: 87%)
• Mismatches between term% and document% flagged
Term Group | % Terms | % Docs
Administrators | 7.8 | 15.8
Community Groups | 2.8 | 1.8
Counselors | 3.4 | 1.4
Federal Funds Recipients and Applicants | 9.5 | 34.4
Librarians | 2.8 | 1.1
News Media | 0.6 | 3.1
Other | 7.3 | 2.0
Parents and Families | 2.8 | 6.0
Policymakers | 4.5 | 11.5
Researchers | 2.2 | 3.6
School Support Staff | 2.2 | 0.2
Student Financial Aid Providers | 1.7 | 0.7
Students | 27.4 | 7.0
Teachers | 25.1 | 11.4
Source: Courtesy Keith Stubbs, U.S. Dept. of Ed.
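Flagging mismatches between term share and document share can be sketched as a ratio test. The rows below are copied from the table on this slide; the threshold (`factor=2.0`), the function name `flag_mismatches`, and the two labels are assumptions made for illustration, since the slide does not say how mismatches were flagged.

```python
# Four rows taken from the slide's table: (term group, % terms, % docs).
ROWS = [
    ("Community Groups", 2.8, 1.8),
    ("Federal Funds Recipients and Applicants", 9.5, 34.4),
    ("Researchers", 2.2, 3.6),
    ("Students", 27.4, 7.0),
]


def flag_mismatches(rows, factor=2.0):
    """Flag groups whose document share differs from term share by `factor` or more.

    The 2x threshold is an assumed cut-off, not one given in the source.
    """
    flagged = []
    for group, pct_terms, pct_docs in rows:
        ratio = pct_docs / pct_terms
        if ratio >= factor:
            # Much more content than vocabulary: the area may need more terms.
            flagged.append((group, "more content than terms"))
        elif ratio <= 1 / factor:
            # Much more vocabulary than content: terms may be too fine-grained.
            flagged.append((group, "more terms than content"))
    return flagged


FLAGGED = flag_mismatches(ROWS)
for group, reason in FLAGGED:
    print(f"{group}: {reason}")
```

Under these assumptions, "more content than terms" corresponds to a candidate for splitting and "more terms than content" to a candidate for merging.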

84 Usability testing: How intuitive (repeatable) are the categorizations (1)?
• Methodology: Closed Card Sort
  – For alpha test of a grocery site
  – 15 Testers put each of 71 best-selling product types into one of 10 pre-defined categories
  – Categories where fewer than 14 of 15 testers put product into same category were flagged

85 Usability testing: How intuitive (repeatable) are the categorizations (2)?

86 Usability testing: How intuitive (repeatable) are the categorizations?
% of Testers | Cumulative % of Products
15/15 | 54%
14/15 | 70%
13/15 | 77%
12/15 | 83%
11/15 | 85%
<11/15 | 100%
With Poly-Hierarchy: 69%; 83%; 93%; 100%
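A cumulative agreement table of this kind can be derived from raw closed card sort data. The sketch below is one plausible reading, assuming "agreement" for a product means the number of testers who chose its most popular category; the products, categories, and the function name `agreement_table` are hypothetical and use 3 testers instead of 15 to keep the example short.

```python
from collections import Counter


def agreement_table(sorts, n_testers):
    """Cumulative % of products on which at least k testers agreed, for k = n..1.

    `sorts` maps each product to the list of categories chosen by the testers.
    """
    # For each product, count how many testers picked its most popular category.
    top_votes = {
        product: Counter(choices).most_common(1)[0][1]
        for product, choices in sorts.items()
    }
    table = []
    for k in range(n_testers, 0, -1):
        share = 100 * sum(1 for votes in top_votes.values() if votes >= k) / len(top_votes)
        table.append((k, round(share)))
    return table


# Hypothetical grocery card sort with 3 testers and 4 products.
SORTS = {
    "Milk": ["Dairy", "Dairy", "Dairy"],
    "Bread": ["Bakery", "Bakery", "Pantry"],
    "Salsa": ["Pantry", "Snacks", "Pantry"],
    "Tofu": ["Produce", "Dairy", "Pantry"],
}

TABLE = agreement_table(SORTS, n_testers=3)
for k, pct in TABLE:
    print(f"{k}/3 testers agree: {pct}% of products")
```

Products that only appear in the lower rows (here "Tofu") are the ones whose placement is least repeatable.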

87 The #1 underused source of quantitative information on how to improve your taxonomy? Query Logs & Click Trails

88 Query log & click trail examination: Who are the users & what are they looking for?
• Only 30-40% of organizations regularly examine their logs*.
• Sophisticated software is available, but don't wait.
• 80% of value comes from basic reports

89 Query log & click trail examination: Query log (UltraSeek Reporting)
• Top queries
• Queries with no results
• Queries with no click-through
• Most requested documents
• Query trend analysis
• Complete server usage summary
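The first three basic reports can be produced from almost any query log with a few lines of code. This sketch does not use UltraSeek or any vendor API; the log record layout `(query, result_count, clicked)`, the sample records, and the function name `basic_reports` are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical log records: (query text, number of results, whether a result was clicked).
LOG = [
    ("vacation policy", 12, True),
    ("Vacation Policy", 12, False),
    ("expense form", 0, False),
    ("org chart", 5, False),
    ("vacation policy", 12, True),
]


def basic_reports(log, top_n=10):
    """Build top queries, zero-result queries, and queries never clicked through."""
    # Normalize case and whitespace so variants of a query are counted together.
    records = [(query.strip().lower(), hits, clicked) for query, hits, clicked in log]
    top_queries = Counter(query for query, _, _ in records).most_common(top_n)
    no_results = sorted({query for query, hits, _ in records if hits == 0})
    clicked_queries = {query for query, _, clicked in records if clicked}
    with_results = {query for query, hits, _ in records if hits > 0}
    no_click_through = sorted(with_results - clicked_queries)
    return {
        "top_queries": top_queries,
        "no_results": no_results,
        "no_click_through": no_click_through,
    }


REPORTS = basic_reports(LOG)
print(REPORTS["top_queries"][0])
print(REPORTS["no_results"])
print(REPORTS["no_click_through"])
```

Queries with no results point to missing synonyms or content gaps; queries with results but no click-through point to poor ranking or misleading titles.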

90 Query log & click trail examination: Click trail packages
• iWebTrack
• NetTracker
• OptimalIQ
• SiteCatalyst
• Visitorville
• WebTrends

91 Summary: Start a "Measure & Improve" mindset
• Taxonomy changes do not stand alone
  – Search system improvements
  – Navigation improvements
  – Content improvements
  – Process improvements

92 Benchmarking exercise
• What are 5 representative questions that your users ask or tasks that your users do when using your application?
• Is it currently easy, medium or difficult to answer these questions or accomplish these tasks?
Questions or Tasks | Rating (Easy/Medium/Difficult)

93 Conclusion: What is a good taxonomy?
• Incremental, extensible process that identifies and enables owners, and engages stakeholders.
• Quick implementation that provides measurable results as quickly as possible.
• A means to an end, and not the end in itself.
• Not perfect, but it does the job it is supposed to do, such as improving search and navigation.
• Improved over time, and maintained.

95 Tagging: Overview
• Tagging is better than relying on the words that happen to occur in a piece of content.
• All tagging is useful
  – End user tagging
  – Tagging by librarians
  – Automated tagging by OS and algorithms
• Content should be tagged throughout its lifecycle, each time the content is handled and used, so that it accrues value or its significance is diminished.

96 MS Office: File Properties. How many people fill this in?

97 Organize. How many people click on this?

98 What is social tagging?
• End user tagging
• Easy, intuitive tagging interfaces
• Almost instantaneous feedback
  – Enables people to tag & re-tag content … in response to seeing their tags in context with other tags.
• Emergent categories
  – Resembles open card sort process in which patterns emerge … rather than validating categories using closed card sorts.

99 Social tagging innovators
• flickr founders: Caterina Fake, Stewart Butterfield
• del.icio.us founder: Joshua Schachter
• del.icio.us & flickr are now both part of Yahoo!
• As of April 2006 flickr had 130 million photos posted by 3 million registered users.

100 Four tagging rules for end users
Rule | Description
Use specific terms | Apply the most specific terms when tagging content. But do not tag every possible topic, just the ones that are most important or best characterize the content as a whole.
Use multiple terms | Use as many terms as necessary to describe overall What the content is about & Why it is important. Do not over-tag.
Use appropriate terms | Only fill in the facets & values that make sense. Not all facets apply to all content.
Consider how content will be used | Anticipate how the content will be searched for in the future, & how to make it easy to find it. Remember that search engines can only operate on explicit information.

101 Agenda
• Content Tagging
• Tagging Interface

102 Requirements for a tagging interface
• Automated form fill-in (automatically fills in known data)
• Tagging precedents (see tags already assigned by others)
• Controlled vocabularies, e.g., with pull-down list
• Multi-valued tags
• Geo-tagging
• Group tagging
• Clean-up tag tools, e.g., alpha list
• Batch editing
• Share/Don't share (Public/Private)
• Identified owner (who can be emailed)
• Almost immediate feedback, e.g., tag cloud

103 Form fill-in: Automatically filled-in known data

104 Form fill-in: Automatically filled-in known data
• Manual form fill-in w/ check boxes, pull-down lists, etc.
• Auto keyword & summarization

105 Form fill-in: Automatically filled-in known data
• Auto-categorization
• Parse & lookup (recognize names)
• Rules & pattern matching

106 Tagging precedents: See tags assigned by others

107 Multi-valued group tagging

108 Group geo-tagging

109 Group geo-tagging

110 Clean up tag tools: Alpha list

111 Batch edit

112 Share or don't share tagging

113 Bulk tagging
• ID collection of related content items by pattern or context
• Then, apply same attributes to all content items

114 Tag a folder
• Drag & drop content items into folder
• Then, content items inherit properties of folder
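Folder-based tagging amounts to merging the folder's metadata into each item dropped into it. The sketch below is a minimal illustration, assuming metadata is held as facet-to-values mappings and that an item keeps its own values alongside the inherited ones; the function name `inherit_tags` and the sample facets are hypothetical, not taken from any particular content management system.

```python
def inherit_tags(folder_tags, item_tags):
    """Return item metadata with the folder's facet values merged in.

    Folder values come first; item values are appended without duplicates.
    Neither input mapping is modified.
    """
    merged = {facet: list(values) for facet, values in folder_tags.items()}
    for facet, values in item_tags.items():
        target = merged.setdefault(facet, [])
        for value in values:
            if value not in target:
                target.append(value)
    return merged


# Hypothetical folder and item metadata.
FOLDER = {"Audience": ["Employee"], "Business activity": ["HR management"]}
ITEM = {"Audience": ["Manager"], "Content Type": ["Memo"]}

MERGED = inherit_tags(FOLDER, ITEM)
print(MERGED)
```

Whether item values should add to or override folder values is a policy choice; this sketch adds, which keeps inherited tags visible for later clean-up.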

115 Workflow
• "Approve & improve" mindset
[Diagram stages: Review & Improve; Add Metadata; Create Content; Publish]

116 Interactive rewards
• Almost instantaneous exposure of tags in simple user interfaces on the web provides positive reinforcement for user tagging that simply did not exist before.
• For example,
  – Most popular
  – Tag clouds
  – Alerts

117 Most popular
Another example is "most emailed" from, e.g., the NY Times.

118 Tag cloud

119 Alerts
• New (content selected by date)
• Subscriptions (content selected by tags)
• Interest (content selected by other people)
• Individual (content selected for you by other people)

120 Is faceted indexing the future of social tagging?

121 Tagging exercise: Blog tagging (a)
ALA Tech Source. http://www.techsource.ala.org/blog/2007/04/google-buys-oclc-announces-new-products.html

122 Tagging exercise: Blog tagging (b)
HBSP. http://discussionleader.hbsp.com/davenport/2007/04/cause_and_effect_reporting_raw.html#comments

123 Tagging exercise: Taxonomy facets, definitions
Taxonomy Facets | Descriptions
Business activity | Use for common business function or activity such as finance, marketing and sales.
Industry / Product | Use for content that is about or related to an industrial sector or product such as construction equipment.
Geography | Use for content that is about a region, country or city.
Organization | Use for named organizations, brands and business entities.
Person / Role | Use for named people and the roles people have in organizations.
Content Type | Use for content genres such as letters, memos and reports.
Audience | Use to indicate the intended audience.
Topic | Use for other business and associated topics that the content is about or related to.

124 Tagging exercise: Taxonomy facets, values
Geography: Africa; Americas; Antarctica; Asia; Europe; Oceania; Global; Historical geography; Oceans & seas; Regions
Industry / Product: Agriculture; …; Mining; Utilities; Construction; Manufacturing; Wholesale trade; Retail trade; Transportation & warehousing; Information; Finance & insurance; Real estate; Professional; Management; Administrative support; Education; Health care; Arts, entertainment & recreation; Accommodation & food; Other services; Public administration
People / Role: Business Leaders; Thought Leaders; Political Leaders; Roles
Organization / Entity: Business entities; Companies & brands; Government agencies; International NGOs; Organization types
Content Type: Basic facts & information; Blog; Brochure; Database; E-mail; Letter; Memo; Multimedia; Report; Newsletter; Podcast; Press Release; Research & Analysis; RSS Feed
Business activity: Accounting; Auditing; Finance; HR management; IT; Marketing; Operations management; Sales
Audience: Consumer; Employee; Manager; Executive
Worksheet:
Taxonomy Facets | Tags
Business activity |
Industry / Product |
Geography |
Organization |
Person / Role |
Content Type |
Audience |
Topic |

125 Summary
• There are lessons to be learned from web tagging about how to get good metadata in document and content management applications.
• Document and content management system tagging must be simple, and it must make relevant work products easier to find almost instantaneously.

126 Questions?
Joseph A. Busch
+ 415-377-7912
jbusch@taxonomystrategies.com
http://www.taxonomystrategies.com

