Presentation on theme: "Taxonomy Development An Infrastructure Model"— Presentation transcript:
1 Taxonomy Development An Infrastructure Model
  Tom Reamy, Chief Knowledge Architect
  KAPS Group
  Knowledge Architecture Professional Services
2 Agenda
  Introduction
    Types of Taxonomies
    The Enterprise Context
    Making the Business Case
  Infrastructure Model of Taxonomy Development
    Taxonomy in 4 Contexts
    Content, People, Processes, Technology
  Infrastructure Solutions – the Elements
  Applying the Model – Practical Dimension
  Starting and Resources
  Conclusion
3 KAPS Group
  Knowledge Architecture Professional Services (KAPS)
  Consulting, strategy recommendations
  Knowledge architecture audits
  Partners – Convera, Inxight, FAST, and others
  Taxonomies: Enterprise, Marketing, Insurance, etc.
  Taxonomy customization
  Intellectual infrastructure for organizations
    Knowledge organization, technology, people and processes
    Search, content management, portals, collaboration, knowledge management, e-learning, etc.
4 Two Types of Taxonomies: Browse and Formal
  Browse Taxonomy – Yahoo
6 Browse Taxonomies: Strengths and Weaknesses
  Strengths:
    Browse is better than search
    Context and discovery
    Browse by task, type, etc.
  Weaknesses:
    Mix of organization
      Catalogs, alphabetical listings, inventories
      Subject matter, functional, publisher, document type
    Vocabulary and nomenclature issues
    Problems with maintenance, new material
    Poor granularity and little relationship between parts
    Web site unit of organization
    No foundation for standards
7 Formal Taxonomies: Strengths and Weaknesses
  Strengths:
    Fixed Resource – little or no maintenance
    Communication Platform – share ideas, standards
    Infrastructure Resource
    Controlled vocabulary and keywords
    More depth, finer granularity
  Weaknesses:
    Difficult to develop and customize
    Don’t reflect users’ perspectives
    Users have to adapt to language
8 Facets and Dynamic Classification
  Facets are not categories
    Entities or concepts belong to a category
    Entities have facets
  Facets are metadata – properties or attributes
    Entities or concepts fit into one category
    All entities have all facets – defined by set of values
    Facets are orthogonal – mutually exclusive – dimensions
    An event is not a person is not a document is not a place.
  Facets – variety – of units, of structure
    Date or price – numerical range
    Location – big to small (partonomy)
    Winery – alphabetical
    Hierarchical – taxonomic
9 Faceted Navigation: Strengths and Weaknesses
  Strengths:
    More intuitive – easy to guess what is behind each door
    20 questions – we know and use
    Dynamic selection of categories
    Allow multiple perspectives
    Trick users into “using” Advanced Search
      wine where color = red, price = x-y, etc.
  Weaknesses:
    Difficulty of expressing complex relationships
    Simplicity of internal organization
    Loss of browse context
    Difficult to grasp scope and relationships
    Limited domain applicability – type and size
    Entities not concepts, documents, web sites
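The "wine where color = red, price = x-y" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Wine` record, the sample catalog, and the helper names `facet_filter` and `facet_counts` are hypothetical and do not come from the presentation or from any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wine:
    """Hypothetical entity; each attribute is one facet."""
    name: str
    color: str
    price: float
    winery: str


def facet_filter(items, color=None, price_range=None):
    """Keep the items that match every facet value the user selected."""
    result = []
    for item in items:
        if color is not None and item.color != color:
            continue
        if price_range is not None:
            low, high = price_range
            if not (low <= item.price <= high):
                continue
        result.append(item)
    return result


def facet_counts(items, facet):
    """Count items per facet value, as shown beside each navigation link."""
    counts = {}
    for item in items:
        value = getattr(item, facet)
        counts[value] = counts.get(value, 0) + 1
    return counts


# Made-up sample data for illustration only.
CATALOG = [
    Wine("Example Cabernet", "red", 18.0, "Alpha Cellars"),
    Wine("Example Merlot", "red", 42.0, "Beta Vineyards"),
    Wine("Example Chardonnay", "white", 15.0, "Alpha Cellars"),
]

# Equivalent to: wine where color = red, price = 10-30
selected = facet_filter(CATALOG, color="red", price_range=(10, 30))
color_counts = facet_counts(CATALOG, "color")
```

Each click on a facet value adds one more condition, so the user builds an advanced search query without seeing a query form.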
10 Dynamic Classification / Faceted Navigation
  Search and browse better than either alone
    Categorized search – context
    Browse as an advanced search
  Dynamic search and browse is best
    Can’t predict all the ways people think
      Advanced cognitive differences
      Panda, Monkey, Banana
    Can’t predict all the questions and activities
      Intersections of what users are looking for and what documents are often about
      China and Biotech
      Economics and Regulatory
11 Business Case for Taxonomies: The Right Context
  Traditional Metrics
    Time Savings – 22 minutes per user per day = $1Mil a year
    Apply to your organization – customer service, content creation, knowledge industry
    Cost of not-finding = re-creating content
  Research
    Advantages of Browsing – Marti Hearst, Chen and Dumais
    Nielsen – “Poor classification costs a 10,000 user organization $10M each year – about $1,000 per employee.”
  Stories
    Pain points, success and failure – in your corporate language
12 Business Case for Taxonomies: IDC White Paper
  Information Tasks – 14.5 hours a week
  Create documents – 13.3 hours a week
  Search – 9.5 hours a week
  Gather information for documents – 8.3 hours a week
  Find and organize documents – 6.8 hours a week
  Gartner: “Business spend an estimated $750 Billion annually seeking information necessary to do their job % of a knowledge worker’s time is spent managing documents.”
13 Business Case for Taxonomies: IDC White Paper
  Time Wasted
    Reformat information – $5.7 million per 1,000 per year (400M)
    Not finding information – $5.3 million per 1,000 (370M)
    Recreating content – $4.5 million per 1,000 (315M)
  Small Percent Gain = large savings
    1% – $10 million
    5% – $50 million
    10% – $100 million
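The arithmetic behind these figures can be checked with a short sketch. The per-1,000-worker costs are the ones quoted on the slide; the workforce size of 70,000 is an assumption inferred from the parenthetical totals (5.7 million x 70 is about 400M) and is not stated in the presentation. The function names are hypothetical.

```python
# Annual cost of wasted time per 1,000 knowledge workers, as quoted on the slide.
COST_PER_1000_WORKERS = {
    "reformat information": 5.7e6,
    "not finding information": 5.3e6,
    "recreating content": 4.5e6,
}

# Assumption: inferred from the slide's parenthetical totals, not stated in it.
ASSUMED_WORKERS = 70_000


def annual_waste(workers):
    """Scale each per-1,000-worker cost to the given workforce size."""
    scale = workers / 1000
    return {task: cost * scale for task, cost in COST_PER_1000_WORKERS.items()}


def savings(workers, percent_gain):
    """Dollars saved per year if the total waste shrinks by percent_gain percent."""
    return sum(annual_waste(workers).values()) * percent_gain / 100


waste = annual_waste(ASSUMED_WORKERS)
one_percent = savings(ASSUMED_WORKERS, 1)
```

Under that assumption the three categories total a little over $1 billion a year, which is why a 1% improvement lands near the slide's $10 million figure.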
14 Business Case for Taxonomies: The Right Context
  Justification
    Search Engine – $500K-$2Mil
    Content Management – $500K-$2Mil
    Portal – $500K-$2Mil
    Plus maintenance and employee costs
  Taxonomy
    Small comparative cost
    Needed to get full value from all the above
  ROI – asking the wrong question
    What is ROI for having an HR department?
    What is ROI for organizing your company?
15 Infrastructure Model of Taxonomy Development
  Taxonomy in 4 Basic Contexts
  Ideas – Content Structure
    Language and Mind of your organization
    Applications – exchange meaning, not data
  People – Company Structure
    Communities, Users, Central Team
  Activities – Business processes and procedures
    Central team – establish standards, facilitate
  Technology / Things
    CMS, Search, portals, taxonomy tools
    Applications – BI, CI, Text Mining
16 Taxonomy in Context: Structuring Content
  All kinds of content and Content Structures
    Structured and unstructured, Internet and desktop
  Metadata standards – Dublin Core+
    Keywords – poor performance
    Need controlled vocabulary, taxonomies, semantic network
  Other Metadata
    Document Type
      Form, policy, how-to, etc.
    Audience
      Role, function, expertise, information behaviors
    Best bets metadata
  Facets – entities and ideas
    Wine.com
17 Taxonomy in Context: Structuring People
  Individual People
    Tacit knowledge, information behaviors
    Advanced personalization – category priority
      Sales – Forms – New Account Form
      Accountant – New Accounts – Forms
  Communities
    Variety of types – map of formal and informal
    Variety of subject matter – vaccines, research, scuba
    Variety of communication channels and information behaviors
    Community-specific vocabularies, need for inter-community communication (Cortical organization model)
18 Taxonomy in Context: Structuring Processes and Technology
  Technology: infrastructure and applications
    Enterprise platforms: from creation to retrieval to application
    Taxonomy as the computer network
      Applications – integrated meaning, not just data
  Creation – content management, innovation, communities of practice (CoPs)
    When, who, how, and how much structure to add
    Workflow with meaning, distributed subject matter experts (SMEs) and centralized teams
  Retrieval – standalone and embedded in applications and business processes
    Portals, collaboration, text mining, business intelligence, CRM
19 Taxonomy in Context: The Integrating Infrastructure
  Starting point: knowledge architecture audit, K-Map
    Social network analysis, information behaviors
  People – knowledge architecture team
    Infrastructure activities – taxonomies, analytics, best bets
    Facilitation – knowledge transfer, partner with SMEs
  “Taxonomies” of content, people, and activities
    Dynamic Dimension – complexity not chaos
    Analytics based on concepts, information behaviors
  Taxonomy as part of a foundation, not a project
    In an Infrastructure Context
20 Taxonomy in Context: The Integrating Infrastructure
  Integrated Enterprise requires both an infrastructure team and distributed expertise.
    Software and SMEs are not the answer – keywords
  Taxonomies do not stand alone
    Metadata, controlled vocabularies, synonyms, etc.
    Variety of taxonomies, plus categorization, classification, etc.
      Important to know the differences, when to use which
  Multiple Applications
    Search, browse, content management, portals, BI & CI, etc.
  Infrastructure as Operating System
    Word vs. WordPerfect
    Instead of sharing clipboard, share information and knowledge.
21 Infrastructure Solutions: The start and foundation
  Knowledge Architecture Audit
  Knowledge Map – Understand what you have, what you are, what you want
    The foundation of the foundation
  Contextual interviews, content analysis, surveys, focus groups, ethnographic studies
  Category modeling – “Intertwingledness” – learning new categories influenced by other, related categories
    Natural level categories mapped to communities, activities
    Novices prefer higher levels
    Balance of informativeness and distinctiveness
  Living, breathing, evolving foundation is the goal
22 Infrastructure Solutions: Resources
  People and Processes: Roles and Functions
  Knowledge Architect and learning object designers
  Knowledge engineers and cognitive anthropologists
  Knowledge facilitators and trainers and librarians
  Part Time
    Librarians and information architects
    Corporate communication editors and writers
  Partners
    IT, web developers, applications programmers
    Business analysts and project managers
23 Infrastructure Solutions: Resources
  People and Processes: Central Team
  Central Team supported by software and offering services
    Creating, acquiring, evaluating taxonomies, metadata standards, vocabularies
    Input into technology decisions and design – content management, portals, search
    Socializing the benefits of metadata, creating a content culture
    Evaluating metadata quality, facilitating author metadata
    Analyzing the results of using metadata, how communities are using it
    Research metadata theory, user centric metadata
    Design content value structure – more nuanced than good / poor content.
24 Infrastructure Solutions: Resources
  People and Processes: Facilitating Knowledge Transfer
  Need for Facilitators
    Amazon hiring humans to refine recommendations
    Google – humans answering queries
  Facilitate projects, KM project teams
  Facilitate knowledge capture in meetings, best practices
  Answering online questions, facilitating online discussions, networking within a community
  Design and run KM forums, education and innovation fairs
  Work with content experts to develop training, incorporate intelligence into applications
  Support innovation, knowledge creation in communities
25 Infrastructure Solutions: Resources
  People and Processes: Location of Team
  KM/KA Dept. – Cross Organizational, Interdisciplinary
  Balance of dedicated and virtual, partners
    Library, Training, IT, HR, Corporate Communication
  Balance of central and distributed
  Industry variation
    Pharmaceutical – dedicated department, major place in the organization
    Insurance – Small central group with partners
    Beans – a librarian and part time functions
  Which design – knowledge architecture audit
26 Infrastructure Solutions: Resources
  Technology
  Taxonomy Management
    Text and Visualization
  Entity and Fact Extraction
  Text Mining
  Search for professionals
    Different needs, different interfaces
  Integration Platform technology
    Enterprise Content Management
27 Taxonomy Development: Tips and Techniques
  Stage One – How to Begin
  Step One: Strategic Questions – why, what value from the taxonomy, how are you going to use it
    Variety of taxonomies – important to know the differences, when to use what.
  Step Two: Get a good taxonomist! (or learn)
    Library Science + Cognitive Science + Cognitive Anthropology
  Step Three: Software Shopping
    Automatic Software – Fun Diversion for a rainy day
      Uneven hierarchy, strange node names, weird clusters
    Taxonomy Management, Entity Extraction, Visualization
  Step Four: Get a good taxonomy!
    Glossary, Index, Pull from multiple sources
    Get a good document collection
28 Infrastructure Solutions: Taxonomy Development
  Stage Two: Taxonomy Model
  Enterprise Taxonomy
    No single subject matter taxonomy
    Need an ontology of facets or domains
  Standards and Customization
    Balance of corporate communication and departmental specifics
    At what level are differences represented?
    Customize pre-defined taxonomy – additional structure, add synonyms and acronyms and vocabulary
  Enterprise Facet Model:
    Actors, Events, Functions, Locations, Objects, Information Resources
    Combine and map to subject domains
29 Taxonomy Development: Tips and Techniques
  Stage Three: Development and/or Customization
  Combination of top down and bottom up (and Essences)
    Top: Design an ontology, facet selection
    Bottom: Vocabulary extraction – documents, search logs, interview authors and users
  Develop essential examples (Prototypes)
    Most Intuitive Level – genus (oak, maple, rabbit)
    Quintessential Chair – all the essential characteristics, no more
    Work toward the prototype and out and up and down
    Repeat until dizzy or done
  Map the taxonomy to communities and activities
    Category differences
    Vocabulary differences
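The bottom-up step, vocabulary extraction from documents and search logs, can be sketched as simple term counting. This is a minimal illustration and not the method used in the presentation: the stopword list, the sample queries, and the function `candidate_terms` are all hypothetical, and commercial extraction tools do considerably more (stemming, entity recognition, clustering).

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real project would use a fuller one.
STOPWORDS = frozenset({"a", "an", "and", "for", "how", "in", "of", "the", "to"})


def candidate_terms(texts, min_count=2):
    """Return (term, count) pairs for words and two-word phrases seen often enough."""
    counts = Counter()
    for text in texts:
        words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
        counts.update(words)
        # Adjacent word pairs; pairs may span a removed stopword.
        counts.update(" ".join(pair) for pair in zip(words, words[1:]))
    return [(term, n) for term, n in counts.most_common() if n >= min_count]


# Made-up search log for illustration only.
SEARCH_LOG = [
    "new account form",
    "how to open a new account",
    "expense form",
    "new account policy",
]

terms = candidate_terms(SEARCH_LOG)
```

The output is only a list of candidates. A taxonomist still has to decide which terms become node labels, which become synonyms, and where they meet the top-down facet design.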
30 Taxonomy Development: Tips and Techniques
  Stage Four: Evaluate and Refine
  Formal Evaluation
    Quality of corpus – size, homogeneity, representativeness
    Breadth of coverage – main ideas, outlier ideas (see next)
    Structure – balance of depth and width
    Kill the verbs
    Evaluate speciation steps – understandable and systematic
      Person – Unwelcome person – Unpleasant person – Selfish person
    Avoid binary levels, duplication of contrasts
      Primary and secondary education, public and private
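Some of the structural checks above (balance of depth and width, binary levels, duplicated contrasts) can be automated. The sketch below assumes the taxonomy is held as nested dictionaries, which is a representation chosen for illustration; the function names and the sample tree are hypothetical.

```python
from collections import Counter


def depth(tree):
    """Number of levels in a taxonomy stored as nested dicts."""
    return 0 if not tree else 1 + max(depth(children) for children in tree.values())


def widths(tree):
    """Number of nodes at each level, from the top down."""
    level, result = [tree], []
    while True:
        count = sum(len(node) for node in level)
        if count == 0:
            return result
        result.append(count)
        level = [children for node in level for children in node.values()]


def binary_nodes(tree, path=()):
    """Paths of nodes that split into exactly two children."""
    found = []
    for label, children in tree.items():
        here = path + (label,)
        if len(children) == 2:
            found.append(" > ".join(here))
        found.extend(binary_nodes(children, here))
    return found


def repeated_contrasts(tree):
    """Sets of child labels that appear under more than one parent."""
    seen = Counter()

    def walk(node):
        for children in node.values():
            if children:
                seen[tuple(sorted(children))] += 1
                walk(children)

    walk(tree)
    return [labels for labels, n in seen.items() if n > 1]


# Sample tree built from the slide's own example of what to avoid.
TAXONOMY = {
    "Education": {
        "Primary": {"Public": {}, "Private": {}},
        "Secondary": {"Public": {}, "Private": {}},
    },
}
```

For this sample, every split is binary and the Public/Private contrast is repeated under both parents, which are the two patterns the slide warns against. Flags like these are prompts for review, not automatic errors.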
31 Taxonomy Development: Tips and Techniques
  Stage Four: Evaluate and Refine
  Practical Evaluation
    Test in real life application
    Select representative users and documents
    Test node labels with Subject Matter Experts
      Balance of making sense and jargon
    Test with representative key concepts
    Test for un-representative strange little concepts that only mean something to a few people but the people and ideas are key and are normally impossible to find
32 Sources
  Books
    Women, Fire, and Dangerous Things: What Categories Reveal about the Mind – George Lakoff
    The Geography of Thought – Richard E. Nisbett
  Software
    Convera RetrievalWare
    Inxight Smart Discovery – entity and fact extraction
  Courses
    Convera Taxonomy Certification
33 Conclusion
  Taxonomy development is not just a project
    It has no beginning and no end
  Taxonomy development is not an end in itself
    It enables the accomplishment of many ends
  Taxonomy development is not just about search or browse
    It is about language, cognition, and applied intelligence
  Strategic Vision (articulated by K Map) is important
    Even for your under the radar vocabulary project
  Paying attention to theory is practical
    So is adapting your language to business speak
34 Conclusion
  Taxonomies are part of your intellectual infrastructure
    Roads, transportation systems not cars or types of cars
  Taxonomies are part of creating smart organizations
    Self aware, capable of learning and evolving
  Think Big, Start Small, Scale Fast
  If we really are in a knowledge economy
    We need to pay attention to – Knowledge!
35 Questions?
  Tom Reamy
  firstname.lastname@example.org
  KAPS Group
  Knowledge Architecture Professional Services