
2 Robert Patt-Corner Senior Principal Scientist Knowledge Management Practice Mitretek Systems 703-610-1730 (C) 2001 Mitretek Systems, Inc. ALL RIGHTS RESERVED Knowledge Management Technical Architecture Framework 5/2/2002

3 2ControlNumber Implementing Knowledge Management: An Overall Architecture and Framework This half day workshop lays out a comprehensive layered architecture for tacit and explicit KM systems, orienting each layer to both technical and business functions. Existing commercial and in-house developed offerings are mapped to the various layers so that a clear picture of cross-vendor integration possibilities is available. Issues in technical implementation, cultural barriers and opportunities and case studies are presented to illustrate the overall framework. Robert Patt-Corner, Senior Principal Scientist Knowledge Management, Mitretek Systems

4 3ControlNumber Knowledge Management Capabilities Technical (geek) scale for this talk … Ordinary Business English Compressed Technospeak (TLA-land) Some Specific Knowledge of Core Technologies

5 4ControlNumber Knowledge Management Capabilities An Enabling Technology to help organizations “know what they know” A System that networks people, organizations and documents to solve business problems Collaboration Content Structuring Expertise Location and Skills Mining A Set of Components that can be installed into existing systems

6 Realizing Capabilities thru Technical Architecture
Identify logical layers and interfaces
Map the layers and interfaces
  – To existing pools of content
  – To existing and proposed business processes
  – To existing technologies with high investment
Perform gap analysis within and between layers
Evaluate and prototype products
Determine implementation and leadership
  – Needs central coordination
  – Needs business sponsorship
  – Needs time to beat the Metcalfe curve

7 6ControlNumber KM Metcalfe Curve Designed to model network utilization Maps well to KM usage Get past the knee by various strategies, esp. pre-seeding Usage takes off from here!
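
As a rough illustration of the knee, assuming the slide refers to Metcalfe's law: the value of a network is taken as proportional to the number of possible pairwise links, n(n-1)/2, while the cost of supporting users grows roughly linearly. The function names and both constants below are hypothetical, chosen only to show where value overtakes cost.

```python
def network_value(users: int, value_per_link: float = 1.0) -> float:
    """Metcalfe-style value: proportional to the possible pairwise links."""
    return value_per_link * users * (users - 1) / 2


def network_cost(users: int, cost_per_user: float = 50.0) -> float:
    """Assumed linear cost of supporting each user (hypothetical constant)."""
    return cost_per_user * users


def knee(cost_per_user: float = 50.0, value_per_link: float = 1.0) -> int:
    """Smallest user count at which value strictly exceeds cost."""
    users = 2
    while network_value(users, value_per_link) <= network_cost(users, cost_per_user):
        users += 1
    return users


break_even = knee()
```

Under these assumptions, pre-seeding content acts like starting with more effective participants, so a new system spends less time on the flat part of the curve.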

8 Without Technical Architecture Negative Outcomes
Too much talk and touting of favorite technology solutions
Vendors marketing clients directly
  – Leading to fragmentation or local optimization
Too many meetings at too low a level resulting in:
  – No action at all or
  – Too much uncoordinated action leading to
      Fragile systems
      Fragmented systems
      Expensive and unmaintainable systems

9 8ControlNumber Knowledge Management Conceptual Architecture Categorization Retrieval Presentation Content Stores Mining Abstraction and Association Publishing Access (Re)Use (Re) Organize Capture Create (Re) Organize Capture Federation

10 9ControlNumber Architecture and Security External Partners Internal Domain External Domain Categorization Retrieval Presentation Content Stores Mining Abstraction and Association Publishing Federation

11 10ControlNumber Content Stores The Gold Mine of Knowledge Explicit Resources, including... –Document Stores –Mail Stores –Directories –Web pages –Transactional Reports –Systems and Databases Pointers to tacit Knowledge Resources … –People who have tacit knowledge in an area –Relationships relevant to an area –Interest profiles and customer/user interactions –Clickstream data and affinities Content Stores

12 Content Stores Managing the Mine
Why manage content?
  – For security, survivability and due diligence
But it’s getting harder …
  – Archival discipline challenged by the dynamic web
Direct content management using either
  – Hands-on management
  – Ubiquitous archival agent
  – Commercial content management solutions and federation/syndication
Indirect content management by catalog
  – Catalog knows what is in the store
  – Catalog can track when it disappears
  – Synchronization of inserts, deletes and updates an issue.
      Direct via common tool
      Indirect via a taxonomic or other crawl
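
The synchronization issue for indirect management can be sketched as a comparison between what the catalog believes is in the store and what a fresh crawl finds. This is a minimal sketch, not any product's method: the function name, the resource ids and the use of a content fingerprint to detect updates are all assumptions for illustration.

```python
def diff_catalog(catalog: dict[str, str], crawl: dict[str, str]) -> dict[str, list[str]]:
    """Compare catalog entries (resource id -> content fingerprint) with a crawl."""
    inserts = sorted(rid for rid in crawl if rid not in catalog)
    deletes = sorted(rid for rid in catalog if rid not in crawl)
    updates = sorted(
        rid for rid in crawl if rid in catalog and crawl[rid] != catalog[rid]
    )
    return {"insert": inserts, "delete": deletes, "update": updates}


# Hypothetical state: the catalog's view versus the latest crawl of the store.
catalog = {"doc-1": "a1", "doc-2": "b2", "doc-3": "c3"}
crawl = {"doc-1": "a1", "doc-2": "b9", "doc-4": "d4"}
changes = diff_catalog(catalog, crawl)
```

Here `doc-4` would be added to the catalog, `doc-3` flagged as having disappeared, and `doc-2` re-indexed.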

13 12ControlNumber Federation Core Functions Normalized Resources from the content stores, often in XML format, sometimes abstracts Reference Data and Relationships to describe the normalized resources, taxonomies, etc. A Programming Interface –Access the normalized resources –Retrieve the original resource if abstracted –Apply business rules –Increasingly in the form of EJB or other access object Federation

14 13ControlNumber Federation Extended Functions Classification information on normalized resources –Many attributes per entry -- theme, author, URL, etc. –Many attribute groupings per entry Associations between entries Person to Document Report and Document to Project Document to Document Abstract to document Security interfaces Federation

15 Automation Methods Information Warehouse Federation Components
Catalog Entries
  TITLE = Helping Kids Learn
  UNID = 233865
  Country = Fiji
  Sector = Education
  Topic = Human Development
  Topic = Econ. of Education
  Author = Robert Patt-Corner
  Creator = Jeanne Doe
  Date Created = 12/1/96
  Last Modified 3/1/99
  ...
Reference Data
  Projects By Country
Taxonomy Vectors
  Country Vector: Pacific Rim/Islands/Fiji
  Theme Vector: Human Development/Education/Economics of Education
  Theme Vector: Economics/Education
  Author Vector: Staff/Robert Patt-Corner
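
One possible in-memory shape for such a catalog entry is sketched below. The field values come from the slide; the class name, its field layout and the `under` helper are hypothetical, and a real federation layer would more likely hold this as XML or database rows.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    unid: int
    title: str
    # Flat attributes: one attribute name may carry several values.
    attributes: dict[str, list[str]] = field(default_factory=dict)
    # Taxonomy vectors: slash-separated paths, grouped by axis.
    vectors: dict[str, list[str]] = field(default_factory=dict)

    def under(self, axis: str, node: str) -> bool:
        """True if any vector on this axis is the node itself or lies beneath it."""
        return any(
            v == node or v.startswith(node + "/") for v in self.vectors.get(axis, [])
        )


entry = CatalogEntry(
    unid=233865,
    title="Helping Kids Learn",
    attributes={
        "Country": ["Fiji"],
        "Sector": ["Education"],
        "Topic": ["Human Development", "Econ. of Education"],
        "Author": ["Robert Patt-Corner"],
    },
    vectors={
        "Country": ["Pacific Rim/Islands/Fiji"],
        "Theme": [
            "Human Development/Education/Economics of Education",
            "Economics/Education",
        ],
        "Author": ["Staff/Robert Patt-Corner"],
    },
)
```

Because each vector records the full path, the same entry can be found from "Pacific Rim" or from "Economics" without the author having tagged it twice.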

16 15ControlNumber Categorization… Needles in a Haystack Start with… Apply Content Structuring to… Aiding user to… … a large unstructured collection Official Multiagency FilesThe Web Categorize and present … identify key results Categorization

17 16ControlNumber Classification... Multiple Axes and Bins Taxonomic classifications for structured “knowledge bins”, ideally: –Orthogonal – Can classify in multiple unrelated ways –Dynamic – Can create new classification axes on the fly –Constrained – Can manage choices to keep values meaningful “Bins” serve different purposes –Retrieval –Management –Security … “Bins” must be well-mapped to business process “Bins” can be derived from data by visualization and analytical tools Categorization

18 17ControlNumber Categorization Tagging and Binning Apply tags to the federated resources Use the tags to navigate to similar / desired items “Tagging” or “Binning” is process of: –Determining the correct tag –Registering / applying the correct tag Originally manual – Library Science –Currently done incrementally in batch –Recently done in realtime –Still Library Science Categorization

19 Categorization Ontologies and Taxonomies
Ontologies: From “ontology”, study of being or existence
  – Nested set of entities that pass the “is-a” test
Taxonomies … sections of Ontologies
[Diagram: two example “is-a” hierarchies, with labels Continent, Asia, Europe, Laos, Korea, North Korea and Animals, Mammal, Reptile, Felis, Canis Familiaris, Snake]

20 Categorization Categories and Views
Categories – Intersection or Union of Taxonomy bins
  – Dogs AND Japan
  – CounterTerror AND Tularemia
Views – a meaningful arrangement of categories
  – Need not pass the “is-a” test
      Dogs of Japan
      Akita, Dog of the Emperors
      …
      Bacteria native to Afghanistan
[Diagram labels: Animals, Mammal, Reptile, Felis, Canis Familiaris, Snake; Continent, Asia, Europe, Laos, Japan, Korea, North Korea]
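
Treating each taxonomy bin as a set of resource ids makes the definition of a category concrete. This is a minimal sketch: the bin paths and document ids are invented for illustration, and the two helper names are hypothetical.

```python
# Hypothetical bins: taxonomy leaf -> ids of the resources tagged with it.
bins = {
    "Animals/Mammal/Canis Familiaris": {"doc-1", "doc-2", "doc-5"},
    "Continent/Asia/Japan": {"doc-2", "doc-3", "doc-5"},
    "Continent/Asia/Korea": {"doc-4"},
}


def category_and(bins: dict[str, set[str]], *names: str) -> set[str]:
    """Category as the intersection of the named bins."""
    return set.intersection(*(bins[name] for name in names))


def category_or(bins: dict[str, set[str]], *names: str) -> set[str]:
    """Category as the union of the named bins."""
    return set.union(*(bins[name] for name in names))


dogs_and_japan = category_and(
    bins, "Animals/Mammal/Canis Familiaris", "Continent/Asia/Japan"
)
japan_or_korea = category_or(bins, "Continent/Asia/Japan", "Continent/Asia/Korea")
```

A view such as "Dogs of Japan" can then simply present `dogs_and_japan` under a friendly label, without that label having to be a node in either taxonomy.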

21 Categorization Taxonomies and Views
CounterTerror Suspects
  – Expired student visa [Leaf of Bad Visa]
  – Pilot School [Leaf of International Student::Pilot School]
Defunded State Clinics
  – Fund Cuts [Leaf of Budget Legislation]
  – Health Clinics [Leaf of State Agencies]
Taxonomy is sometimes loosely used to describe an Ontology, sometimes a View
Ontologies are used to build the Categories you select for your Business View or Personal View
Intersections reduce ambiguity in Taxonomy leaves
  – Brazil NUT -- Brazil COUNTRY

22 Categorization... Taxonomy Building Techniques
Starter Taxonomies
  – Industry specific taxonomies (Semio Cartridges)
  – Prebuilt latching terms and clusters
Manual Knowledge Engineering
  – Analysis of phrases and documents
  – Assignment to taxonomic leaves
Manipulation of Clusters
  – Taxonomy engine makes a “best guess” pass at training set
  – Experts rename and restructure the surfaced clusters
Category-specific training sets (Inxight)
On-the-fly categorization (Vivisimo, Inxight)
Relies heavily upon linguistic analysis
  – Identifying meaning by part of speech, clustering
  – Position in content indicating importance
  – By word structure (SEMIO, IBM, Inxight/ORACLE) or bit pattern (Autonomy)

23 22ControlNumber Categorization... Clustering Results in Real Time Categorization

24 23ControlNumber Resource Categorization Security Management Ownership Theme Association Valuation Resource Categorization

25 Categorization... Multiple Axes/Bins for Multiple Access
Business Area = Marketing
Expertise = Customer Relation Mgt.
Ownership = Dep’t 4523
Valuation = Best Practice/Internal
Associations:
  DocID=223 (CRM White Paper)
  DocID=227 (CRM Surveys)
  DocID=2245 (CRM V1 Lessons Learned)
  EmpID=3334 (Jane Doe, Author)
  EmpID=4325 (John Navarian, Interest)
Read = Internal Use Only

26 25ControlNumber Many Categories are Multidimensional Take Valuation... Initially often binary -- Information and Knowledge Actually a multidimensional step function Utility Valuation Policy Final Form Currency Categorization

27 26ControlNumber Categorization Conflicts and Synergies Unproductive associations between: –Valuation and Security -- knowledge islands –Theme and Ownership -- turf wars I know the most about structure of environment Therefore all environment publishing must go through me! Productive Synergies between: –Association and Theme -- identifying bridges between themes –Security and Ownership -- business integrity Categorization

28 Retrieval Full text Indexing and Retrieval
Indexing content stores, federation layer or both
Most text types accommodated, streaming media coming
Unaided query, or assisted via
  – Natural language processing and query formulation
  – Search across multiple engines
  – Fuzzy search for OCR’d text and other typos
  – Term expansion via thesaurus, e.g. RUSSIA=USSR
Summarization via
  – Hand-input abstracts
  – Relevance-ranked extracted sentences
Ranking via
  – Location and frequency of keywords (AltaVista)
  – Site popularity or references (Google)
  – Reviewer schemas (Yahoo)
  – On-the-fly categorization (Vivisimo, Inxight, SEMIO)
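
Thesaurus term expansion and frequency-based ranking can be sketched in a few lines. The RUSSIA=USSR pair is the slide's example; the function names, the sample documents and the plain word-count scoring are assumptions for illustration and do not describe how any of the engines named above rank results.

```python
# Hypothetical thesaurus: each term maps to terms treated as equivalent.
THESAURUS = {"russia": {"ussr"}, "ussr": {"russia"}}


def expand_terms(query: str, thesaurus: dict[str, set[str]] = THESAURUS) -> set[str]:
    """Lower-case the query terms and add their thesaurus equivalents."""
    terms = set(query.lower().split())
    for term in list(terms):
        terms |= thesaurus.get(term, set())
    return terms


def rank_by_frequency(documents: dict[str, str], query: str) -> list[tuple[str, int]]:
    """Score each document by how often any expanded term occurs in it."""
    terms = expand_terms(query)
    scores = {}
    for doc_id, text in documents.items():
        words = text.lower().split()
        score = sum(words.count(term) for term in terms)
        if score:
            scores[doc_id] = score
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


documents = {
    "a": "USSR grain exports and USSR trade",
    "b": "Russia trade",
    "c": "Fiji education",
}
hits = rank_by_frequency(documents, "Russia")
```

A query for "Russia" now also retrieves the document that only mentions the USSR, which an unexpanded keyword match would miss.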

29 Retrieval Browsing and Retrieval
Classification assigned unambiguous tags
Retrieval must be tuned to make business sense
  – Education : Gender and Workplace : Gender
Does Education include Gender (and all subs)?
  – If so, will find non-Educational gender content
  – If not, how do you aggregate all Gender content?
  – Good system will handle both cases, vectors help
Tags can be used to cluster results of a full search
  – Northern Light
  – Vivisimo
  – Inxight
Tags themselves can be used as a query return
  – Find “Sex Education”, get link to Educ/Gender
Browsing as a disguised query
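
The Education/Gender question comes down to two different queries over the same tag vectors: everything beneath a node, or everything whose leaf has a given name regardless of parent. The sketch below shows both; the tagged documents and function names are hypothetical.

```python
# Hypothetical tagging: document id -> taxonomy vectors (slash-separated paths).
tagged = {
    "doc-1": ["Education/Gender"],
    "doc-2": ["Workplace/Gender"],
    "doc-3": ["Education/Literacy"],
}


def in_subtree(tagged: dict[str, list[str]], root: str) -> list[str]:
    """Documents tagged at the root node or anywhere beneath it."""
    prefix = root + "/"
    return sorted(
        doc
        for doc, vectors in tagged.items()
        if any(v == root or v.startswith(prefix) for v in vectors)
    )


def with_leaf(tagged: dict[str, list[str]], leaf: str) -> list[str]:
    """Documents whose vector ends in the named leaf, whatever its parent."""
    return sorted(
        doc
        for doc, vectors in tagged.items()
        if any(v.split("/")[-1] == leaf for v in vectors)
    )


education_docs = in_subtree(tagged, "Education")
gender_docs = with_leaf(tagged, "Gender")
```

Keeping full path vectors on each resource is what lets one system answer both questions instead of forcing a single interpretation.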

30 Design – Vera and Lewis
Retrieval Knowledge Centric Search / Actionable Searches
Knowledge Portal Search
Primary Interface to
  – Build Communities
  – Launch Applications and Sessions
  – Send notifications
Allows a variety of actions on each search hit
Allows various ranking schemas
Provides insight into expertise and confidence

31 30ControlNumber Presenting Resources Often extremely difficult to obtain consensus Clear stakeholders ease the process Separate from other layers to ease evolution Current capabilities allow –Drill down narrowing categories –Pivoting results Topics in a country Countries represented in a topic –Increasingly graphical displays Portal views with varying degrees of federation Predictive presentation based on profiles and/or behaviour Presentation
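
Pivoting is the same result set grouped along a different axis. A minimal sketch, assuming each search hit carries its classification attributes as a flat record (the records and the `pivot` helper are hypothetical):

```python
from collections import defaultdict


def pivot(results: list[dict[str, str]], row_axis: str, column_axis: str) -> dict[str, list[str]]:
    """Group the values of one axis under each value of another axis."""
    table = defaultdict(set)
    for record in results:
        table[record[row_axis]].add(record[column_axis])
    return {key: sorted(values) for key, values in table.items()}


# Hypothetical search hits with two classification axes.
results = [
    {"country": "Fiji", "topic": "Education"},
    {"country": "Fiji", "topic": "Health"},
    {"country": "Laos", "topic": "Education"},
]
topics_in_country = pivot(results, "country", "topic")
countries_in_topic = pivot(results, "topic", "country")
```

Because the presentation layer only regroups records it already has, the pivot needs no second query against the retrieval engine.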

32 Presenting Resources “Garden Variety” Presentation
[Screenshot labels: Categorized Browsing Browse and View Mostly Search Post-Search Taxonomy Advanced Search MultiSearch]

33 32ControlNumber Presentation Tools A Knowledge Mapping Starfield SEMIO Presentation

34 33ControlNumber Presentation Tools A Knowledge Mapping Hyperbolic Tree Inxight Presentation

35 34ControlNumber Presentation Tools A Knowledge Mapping Topography Presentation

36 35ControlNumber Presentation Tools A Topic Map Derived from a Taxonomy Presentation

37 36ControlNumber Presentation Tools TextArc – Mapping a Text Presentation

38 37ControlNumber Presentation Tools A Production Portal Presentation

39 38ControlNumber Presentation Tools A Production Portal – Common Categorization Presentation

40 39ControlNumber Presentation Tools Portal with Federation Presentation

41 Publishing Resources Explicit Publishing – Putting Knowledge in the “Bins”
Manual classification rapidly becoming outdated
  – Should mirror the retrieval process in structure
  – Ubiquitous tool or process
When classification is used …
  – Authors will avoid it
  – Retrievers will be glad when it is done!
  – So …
      Limit author-required fields … rule of thumb=3
      Intelligent defaults, for Name, Department, Date…

42 41ControlNumber Mining Resources Implicit Publication Predefined metadata mapping –ERP Transactions to Documents by Project Dynamic classification Less obvious sources –High frequency of searches –Discovering hidden expertise and interests –Associating like resources if permitted … Email threads Clickstream data on documents or taxonomies –Refining interest profiles Assigned, Personal and Implied interests Mined from documents and from profiles Group profiles for “Knowledge Grids” Mining

43 Associating Resources
Connecting resources to the taxonomy or to each other
  – Documents to Documents -- Clustering
  – People to People – Expertise Location and Collaboration
Provided by layered products such as:
  – Autonomy KnowledgeServer … document to document
  – Inxight Categorizer … document to document
  – Lotus K-Station … interest mining
  – Lotus “SameTime” … place based awareness
  – Autonomy KnowledgeServer … show similar agents
  – ThirdVoice place-based awareness showing comment overlays (deceased)
  – Semio Map (Cluster Analysis)

44 43ControlNumber Associating Resources Document to Document Association

45 Associating Resources Person to Document, Person to Person
Derives categories from public documents
Discovers people from document metadata
Discovers taxonomic activity from document use
Discovers affinities from people’s categories related to their documents
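
A minimal sketch of the affinity step, assuming documents already carry author metadata and derived categories: count how often each person is associated with each category and keep the categories that recur. The threshold, the sample records and the function name are hypothetical, and this does not describe how any particular product computes affinities.

```python
from collections import Counter, defaultdict


def derive_affinities(documents: list[dict], min_count: int = 2) -> dict[str, list[str]]:
    """Map each person to the categories that recur across their documents."""
    counts = defaultdict(Counter)
    for doc in documents:
        for person in doc["authors"]:
            counts[person].update(doc["categories"])
    return {
        person: sorted(cat for cat, n in counter.items() if n >= min_count)
        for person, counter in counts.items()
    }


# Hypothetical document metadata with derived categories.
documents = [
    {"authors": ["Jane Doe"], "categories": ["CRM", "Marketing"]},
    {"authors": ["Jane Doe", "John Navarian"], "categories": ["CRM"]},
    {"authors": ["John Navarian"], "categories": ["Surveys"]},
]
affinities = derive_affinities(documents)
```

The threshold keeps a single stray document from declaring someone an expert; a derived affinity would still be only a candidate until the person approves it.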

46 Associating Resources Person to Document, Person to Person
Source: Public Lotus Website, Lotusphere 2000, not composite
Directories are mapped and mined for people information
Documents are indexed to locate derived affinities
Interest profiles declare or assign affinities
Privacy control approves or declines affinities

47 Associating Resources Person to Document, Person to Person
Taxonomies link
  – Documents
  – People
  – Workspaces
(composite shot to illustrate taxonomy and entries together from public Lotus Website)

48 47ControlNumber Associating Resources People to People Place Based Awareness Association ThirdVoice Software Comments are visible as overlay icons Comments made visible on request

49 48ControlNumber Associating Resources People to People Place Based Awareness Association SameTime Place-Based Awareness Realtime participants made visible on per page basis Available for realtime audio or text collaboration

50 KM Technology Availability
Content Storage: File Systems, Domino, Web, Database BLOBS, Doc. Mgt., Content Management Repositories
Federation: IBM Portal, Domino Domain Catalog, Broadvision, Documentum, Semio Boardwalk, Stratify Repository
Classification: Semio, Verity, IBM/Lotus Discovery Server, Inxight Categorizer, Autonomy, Vivisimo, Stratify
Retrieval: Google, Verity, Domino, Inktomi, AltaVista, RetrievalWare …
Presentation: Domino, K-Station, Plumtree, Semio, Verity, Hummingbird, Inxight, the Brain, TextArc
Mining: IBM/Lotus Discovery Server, Vivisimo, Inxight, Semio, Stratify, Autonomy

51 50ControlNumber Case Study Large Development Multilateral Over 10,000 internal clients and a worldwide mission Dedicated to knowledge sharing and collaboration Distributed development and governance with multiple incompatible resource repositories –Individual Domino databases –Legacy file-based web sites –Large official repositories of images Central mandate to pull information and knowledge together Ongoing ERP project to bring together transactional systems

52 Early Manual Implementation
Ubiquitous publishing tool based on ActiveX
Direct global publishing of resource by author
  – From within Domino for Domino documents
  – From within Domino to point to file URLs
  – Can be added to web, office, etc.
Information Warehouse Catalog providing
  – Consistent reference data and catalog
  – Connection to ERP data warehouse
Retrieval engine can be replaced or removed
Presentation flexible, not tied to infrastructure

53 52ControlNumber Content Storage Publication Content Stores

54 53ControlNumber Publication Process Publish, with Defaulting Publication

55 54ControlNumber Classification Publication Mirrors Presentation Classification

56 55ControlNumber Retrieval Basic Concept Search Advanced Concept Fielded Boolean Search Retrieval

57 56ControlNumber Presentation Rotation of Axes on Demand Presentation

58 Case Study Corporate Intranet
1000 clients in web-only environment
Ease of implementation and configuration a key
Integration with expertise and document mining, place based awareness
Reconfigurable for multiple clients
[Diagram labels: Domino Databases Semio Java Semio Taxonomy IBM Full Text, Domino IBM WebSphere / J2EE Agents and Interest Profiles Semio Domino]

59 Case Study Corporate Intranet Federation and Classification
Originally a Domino KM Implementation
Manual categorization, web only interface
Later…
  – Integrated Categorization
  – Notes Client interfaces
  – Basic document mgt
  – Mail-in Categorization
[Diagram labels: Notes API Java / IIOP SEMIO Tagger Taxonomy XML Attributes NSF; Notes API SEMIO Tagger Taxonomy XML Attributes NSF XSLT / XPATH DXL XML Updates New Process]

60 59ControlNumber Case Study Multiple Presentation Layers Community Interface Collaboration focus Manual Classification Project orientation Document Management features

61 60ControlNumber Case Study Multiple Presentation Layers Taxonomy Interface Retrieval Focus Automated classification and summarization Corporate Orientation

62 61ControlNumber Case Study Multiple Presentation Layers Document Management Interface Business Management Focus Manual Summaries Departmental Orientation

63 62ControlNumber Case Study Knowledge-Centric Search People Documents

64 Case Study Knowledge-Centric Search Portals
1. Select and Search for People and documents
2. Seed and Create Community Area

65 Issues in Knowledge Management Architecture

66 65ControlNumber Current KM Capabilities Beyond Hierarchical Filing and Free-text Search Human managed content structuring –Structures can be … Directly related to the business Managed with unique values Orthogonal, crossed and combined Managed, dynamic, just-in-time –Search engines can retrieve based on structures –Expand queries by fuzzy search, concept and thesaurus Infrastructure-independent presentation –Catalogs can be separate from content storage –Search engines can be separate from Catalogs –Presentation can be separate from engines –Security can be separate from all the above

67 66ControlNumber Current KM Capabilities Combining Techniques Need capabilities to merge and integrate multiple approaches Content Structuring Example: –Search external sites with a boolean engine –Apply pattern matching algorithms to known site pages to extract metadata –Apply full text content structuring that respects the metadata –Mix and match the results for personalized categories

68 67ControlNumber Developing and Future Capabilities of KM Web Services for mining and categorization Categorization schemas Mining of media streams and facial recognition Fine-tuned summarization Case-based reasoning and inference Improved presentation paradigms and standardization of presentation interface Improved standards for exchange (XML schemas) Interest profiling and merge of explicit/implicit

69 68ControlNumber Current Limitations in Knowledge Management Technologies Absence of standards and tools in the areas of: –Metadata management, Attribute Mapping XML has won, but schemas evolving Topic Maps a key area to watch –Thesaurus standards and tools – ISO but hard to get! Human indexing or taxonomy definition required for meaningful classification –2-3 attribute UI barrier for creators –Taxonomy creation requires at least 1 technical expert and “n” subject matter experts –Tag differences lead to proliferation of incompatible portals Limited presentation capabilities of HTML, “Netscaping” of Java
