Improving Search for Discovery Tom Reamy Chief Knowledge Architect KAPS Group Program Chair – Text Analytics World Knowledge Architecture Professional.

Presentation transcript:

Improving Search for Discovery Tom Reamy Chief Knowledge Architect KAPS Group Program Chair – Text Analytics World Knowledge Architecture Professional Services

Improving Search for Discovery and Everything Else Tom Reamy Chief Knowledge Architect KAPS Group Program Chair – Text Analytics World Knowledge Architecture Professional Services

3 Agenda
- Introduction
- What is Wrong With Search?
- What Works?
  - Metadata & taxonomies
  - Infrastructure / Information Life Cycle
- Yes, But
  - Missing Link: Text Analytics
  - Search and Beyond
- Conclusion

4 Introduction: KAPS Group
- Knowledge Architecture Professional Services – Network of Consultants
- Applied Theory – Faceted taxonomies, complexity theory, natural categories, emotion taxonomies
- Services:
  - Strategy – IM & KM, Text Analytics, Social Media, Integration
  - Taxonomy/Text Analytics development, consulting, customization
  - Text Analytics Quick Start – Audit, Evaluation, Pilot
  - Social Media: Text-based applications – design & development
- Partners: Smart Logic, Expert Systems, SAS, SAP, IBM, FAST, Concept Searching, Attensity, Clarabridge, Lexalytics
- Clients: Genentech, Novartis, Northwestern Mutual Life, Financial Times, Hyatt, Home Depot, Harvard Business Library, British Parliament, Battelle, Amdocs, FDA, GAO, World Bank, etc.
- Presentations, Articles, White Papers –
- Program Chair – Text Analytics World

5 Improving Search for Discovery
- They Won’t Work!

6 Improving Search for Discovery: Why Won’t It Work?
- Search Engines are Stupid!
  - (and people have better things to do)
- Documents deal in language BUT it’s all chicken scratches to Search
- Relevance – requires meaning
  - Imagine trying to understand what a document is about in a language you don’t know
- Mzndin agenpfre napae ponaoen afpenafpenae timtnoe.
  - Dictionary of chicken scratches (variants, related)
  - Count the number of chicken scratches = relevance – Not
- Google = popularity of web sites and Best Bets
  - For documents in an enterprise – Counting and Weighting
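The "count the chicken scratches" idea on this slide can be made concrete with a small sketch. This is not any particular search engine's ranking; it is a deliberately naive counter, and the function names and the two sample documents are invented for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def count_relevance(query, document):
    """Naive relevance: how many times the query terms occur in the document."""
    counts = Counter(tokenize(document))
    return sum(counts[term] for term in set(tokenize(query)))


# Two hypothetical documents that both contain the string "bank".
docs = {
    "finance": "The bank raised interest rates. The bank expects more changes.",
    "picnic": "We met at the river bank, walked along the bank, and sat on the bank.",
}

# The counter only sees matching strings, not meaning.
scores = {name: count_relevance("bank", text) for name, text in docs.items()}
```

A user looking for financial news gets the picnic document ranked first, because the score reflects string counts and nothing about what either document means.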

7 Improving Search for Discovery: Why Won’t It Work?
- Option – Add metadata – good for archiving & indexing
- Keywords – don’t scale
  - Work only in pilots or with a small doc set and many authors
  - Folksonomies don’t really work
- Tagging – Governance – Thou Shalt Tag!
  - No, they won’t, or they will do it really badly
- Add taxonomies – beautiful to behold, but there is a gap between taxonomy and documents – and they are too complex for authors
- Power Search – statistical signature of a document – apply all kinds of math = Find Similar!
- Not trashing search, but just want to say:
  - Survey Says – Users Unhappy with Search
  - Text Analytics is (part of) the answer

8 Semantic Infrastructure: Text Analytics Features
- Text Mining – NLP, machine learning, complex statistics
- Noun Phrase Extraction – Feed facets
  - People, Organizations, Dates, Geographic, Methods, etc.
  - Catalogs with variants, rule-based, dynamic
- Sentiment Analysis – Positive and Negative Phrases
  - Dictionaries & rules – “I hate your product”
- Summarization – replace snippets
- Ontologies – fact extraction + reasoning about relationships
- Auto-categorization – built on a taxonomy
  - Training sets, Terms, Semantic Networks
  - Rules: AND, OR, NOT, DIST, PARAGRAPH, SENTENCE
  - Foundation – subjects, disambiguation, add intelligence to all

9 Case Study – Categorization & Sentiment

10 Improving Search: Adding Meaning and Structure
- Text Analytics and Taxonomy Together
  - Text Analytics provides the power to apply the taxonomy
  - And metadata of all kinds
  - Consistent in every dimension, powerful and economic
- Hybrid Model
  - Publish Document -> Text Analytics analysis -> suggestions for categorization, entities, metadata -> present to author
  - Cognitive task is simple -> react to a suggestion instead of recalling a tag from memory or selecting from a complex taxonomy
  - Feedback – if author overrides -> suggestion for new category
  - Facets – Require a lot of metadata – Entity Extraction feeds facets
- Hybrid – Automatic is really a spectrum – depends on context
  - Automatic – adding structure at search results
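The hybrid publishing loop described on this slide (analyze, suggest, let the author react, feed overrides back) might be sketched as follows. Everything here is an illustrative assumption: the `TaggingSession` class, its method names, the toy taxonomy, and the keyword-overlap "analysis" standing in for a real text analytics engine.

```python
from dataclasses import dataclass, field


@dataclass
class TaggingSession:
    # Hypothetical taxonomy: category name -> set of trigger terms.
    taxonomy: dict
    # Author overrides that name a category the taxonomy lacks.
    proposed_categories: list = field(default_factory=list)

    def suggest(self, text):
        """Stand-in for text analytics: suggest categories whose terms appear."""
        words = set(text.lower().split())
        return sorted(c for c, terms in self.taxonomy.items() if words & terms)

    def review(self, suggested, author_choice=None):
        """The author either accepts the suggestions or overrides them."""
        if author_choice is None:
            return suggested
        if author_choice not in self.taxonomy:
            # Feedback loop: an override becomes a candidate new category.
            self.proposed_categories.append(author_choice)
        return [author_choice]


session = TaggingSession({
    "Finance": {"budget", "invoice"},
    "HR": {"hiring", "benefits"},
})
suggested = session.suggest("Quarterly budget and invoice review")
accepted = session.review(suggested)
overridden = session.review(suggested, author_choice="Procurement")
```

The author's task in `review` is to react to a short list, and the only extra bookkeeping is recording overrides so that taxonomy owners can review them later.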

11 Improving Search: Adding Meaning and Structure
- Documents are not unstructured – they have a variety of structures
- Categorization by page, sections (text markers) or even sentence or phrase
- Use generic components – like the level of generality of terms or concepts (general and context specific)
- Additional metadata – document types-purpose, authors
- Relevance – complex rules – based on structure (intelligent use of titles, headlines, sections) + complex categorization

12 Improving Search: Document Type Rules
- (START_2000, (AND, (OR, _/article:"[Abstract]", _/article:"[Methods]"), (OR, _/article:"clinical trial*", _/article:"humans",
- (NOT, (DIST_5, (OR, _/article:"approved", _/article:"safe", _/article:"use", _/article:"animals"),
- If the article has sections like Abstract or Methods
- AND has phrases around “clinical trials / Humans” and not words like “animals” within 5 words of “clinical trial” words – count it and add up a relevancy score
- Primary issue – major mentions, not every mention
  - Combination of noun phrase extraction and categorization
  - Results – virtually 100%
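The logic of the document type rule above can be approximated in a few lines of Python. This is a sketch of the plain-language reading on the slide, not the vendor's rule engine: the rule as shown is incomplete, so treating START_2000 as "only look at the first 2000 words" and DIST_5 as "within 5 word positions" are assumptions, as are all function names. It returns a yes/no match and does not attempt the relevancy scoring the slide mentions.

```python
import re


def tokenize(text):
    """Split into lowercase words, keeping markers like [Abstract] whole."""
    return re.findall(r"\[\w+\]|\w+", text.lower())


def positions(words, phrase):
    """Start indexes of a phrase; a trailing * on a word means prefix match."""
    parts = phrase.lower().split()
    hits = []
    for i in range(len(words) - len(parts) + 1):
        window = words[i:i + len(parts)]
        if all(w.startswith(p[:-1]) if p.endswith("*") else w == p
               for w, p in zip(window, parts)):
            hits.append(i)
    return hits


def within(words, phrases_a, phrases_b, dist):
    """True if any phrase from A starts within `dist` words of one from B."""
    a = [i for p in phrases_a for i in positions(words, p)]
    b = [i for p in phrases_b for i in positions(words, p)]
    return any(abs(i - j) <= dist for i in a for j in b)


def is_human_clinical_trial(text, start=2000, dist=5):
    words = tokenize(text)[:start]  # assumed meaning of START_2000
    has_section = any(positions(words, s) for s in ("[abstract]", "[methods]"))
    has_topic = any(positions(words, p) for p in ("clinical trial*", "humans"))
    excluded = within(words, ["clinical trial*"],
                      ["approved", "safe", "use", "animals"], dist)
    return has_section and has_topic and not excluded


human_doc = "[Abstract] We report a randomized clinical trial in 120 adult humans."
animal_doc = "[Abstract] This clinical trial in animals tested the compound."
```

Here `is_human_clinical_trial(human_doc)` matches, while `animal_doc` is rejected because "animals" falls within five words of "clinical trial".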

13 Need One More Piece: Smart Semantic Infrastructure
- Integrate entire information life cycle & environment
- Semantic Layer = Content, Taxonomies, Metadata, Vocabularies + Text Analytics
  - Integrated / Federated Search – all content
- Technology Layer – Search, Content Management, SharePoint, Intranets
- People – communities (formal and dynamic), business processes (embedded information needs and behaviors)
- Publishing process – Hybrid human-automatic structure (tagging)
- Feedback is essential – from direct user comments to deep analytics

14 Search Can Work!
- Simple Subject Taxonomy structure
  - Easy to develop and maintain
- Combined with categorization capabilities
  - Added power and intelligence
- Combined with Faceted Metadata
  - Dynamic selection of simple categories
  - Allow multiple user perspectives
    - Can’t predict all the ways people think
    - Monkey, Banana, Panda
- Combined with ontologies and semantic data
  - Multiple applications – Text mining to Search
- Combined with feedback before and after Search
- ROI is enormous – $7M per 1,000 employees a year

15 Enterprise Text Analytics: Building on the Foundation – Applications
- Focus on business value, cost cutting
- Enhancing information access is a means, not an end
  - Governance, Records Management, Doc duplication, Compliance
  - Business Intelligence, CI, Behavior Prediction
  - eDiscovery, litigation support, Risk Management
  - Productivity / Portals – KM communities & knowledge bases
- Sentiment Analysis, Social Media Analysis
  - Adding Search-based intelligence – context
  - New taxonomies – emotion, Appraisal

16 Beyond Search: Info Apps – Search-based Applications Plus
- Legal Review
  - Significant trend – computer-assisted review
  - Text Analytics – categorize and filter to a smaller, more relevant set
  - Payoff is big – One firm with 1.6M docs saved $2M
- Expertise Location
  - Data (HR) plus text – authored documents – subject & level
- Financial Services
  - Combine structured data (what) and unstructured text (why)
  - Anti-Money Laundering

17 Beyond Search: Info Apps – Behavior Prediction – Telecom Customer Service
- Problem – distinguish customers likely to cancel from mere threats
- Basic Rule
  - (START_20, (AND, (DIST_7,"[cancel]", "[cancel-what-cust]"), (NOT,(DIST_10, "[cancel]", (OR, "[one-line]", "[restore]", "[if]")))))
- Examples:
  - customer called to say he will cancell his account if the does not stop receiving a call from the ad agency.
  - cci and is upset that he has the asl charge and wants it off or her is going to cancel his act
- More sophisticated analysis of text and context in text
- Combine text analytics with Predictive Analytics and traditional behavior monitoring for new applications
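A rough Python reading of the cancellation rule above follows. The bracketed names in the rule presumably refer to term lists maintained in the text analytics tool; the lists below are invented stand-ins. Treating START_20 as "look only at the first 20 words" and DIST_n as "within n word positions" are also assumptions.

```python
import re

# Hypothetical stand-ins for the rule's bracketed term lists.
CANCEL = {"cancel", "cancell", "cancels", "canceling", "cancellation"}
CANCEL_WHAT = {"account", "acct", "act", "service", "contract"}
ONE_LINE = {"line"}
RESTORE = {"restore", "reinstate"}
CONDITIONAL = {"if", "unless"}


def find(words, vocab):
    """Indexes of the words that belong to a term list."""
    return [i for i, w in enumerate(words) if w in vocab]


def near(words, vocab_a, vocab_b, dist):
    """True if a term from A occurs within `dist` words of a term from B."""
    return any(abs(i - j) <= dist
               for i in find(words, vocab_a)
               for j in find(words, vocab_b))


def likely_to_cancel(note, start=20):
    words = re.findall(r"[a-z']+", note.lower())[:start]  # assumed START_20
    wants_out = near(words, CANCEL, CANCEL_WHAT, 7)
    # A nearby conditional, single-line, or restore term marks a mere threat
    # or a partial change rather than a real cancellation.
    hedged = near(words, CANCEL, ONE_LINE | RESTORE | CONDITIONAL, 10)
    return wants_out and not hedged


threat = ("customer called to say he will cancell his account if the does "
          "not stop receiving a call from the ad agency.")
firm = "customer wants to cancel his account today, very unhappy with billing"
```

Under these assumptions the first slide example is classed as a conditional threat (`likely_to_cancel(threat)` is false) because "if" sits next to "cancell", while the invented `firm` note matches.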

18 Beyond Search: Info Apps – Pronoun Analysis: Fraud Detection – Enron Emails
- Patterns of “Function” words reveal a wide range of insights
- Function words = pronouns, articles, prepositions, conjunctions, etc.
  - Used at a high rate, short and hard to detect, very social, processed in the brain differently than content words
- Areas: sex, age, power-status, personality – individuals and groups
- Lying / Fraud detection: Documents with lies have
  - Fewer and shorter words, fewer conjunctions, more positive emotion words
  - More use of “if, any, those, he, she, they, you”, less “I”
  - More social and causal words, more discrepancy words
- Current research – 76% accuracy in some contexts
- Text Analytics can improve accuracy and utilize new sources
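Some of the surface signals listed on this slide are easy to measure. The sketch below computes a few of them per document; it is a feature extractor only, with no classifier and no thresholds, and the function name and output keys are invented for illustration. The word list is the one quoted on the slide.

```python
import re

# Words the slide says appear more often in deceptive documents.
OTHER_FOCUSED = {"if", "any", "those", "he", "she", "they", "you"}


def function_word_profile(text):
    """Return word count, mean word length, and selected function-word rates."""
    # Note: contractions such as "I'm" stay whole and are not counted as "I".
    words = re.findall(r"[a-z']+", text.lower())
    n = len(words)
    if n == 0:
        return {"words": 0, "mean_length": 0.0, "i_rate": 0.0, "other_rate": 0.0}
    return {
        "words": n,
        "mean_length": sum(len(w) for w in words) / n,
        "i_rate": sum(w == "i" for w in words) / n,
        "other_rate": sum(w in OTHER_FOCUSED for w in words) / n,
    }


profile = function_word_profile("I think they did it if you ask")
```

Profiles like this would be compared across many documents or fed to a statistical model; a single document's numbers say little on their own.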

19 Conclusions
- Traditional Search improvements – nice, but
- Relevance needs meaning; keyword and human tagging don’t work
- Search + Text Analytics + Semantic Infrastructure work
- Text Analytics is THE essential component of a multi-modal solution
- Semantic Infrastructure – Content, People, Technology, Processes
  - Integration of text analytics, search, content management
  - Hybrid Model of tagging – best of human & machine
- Smart Search as foundation for new universe of Apps
- = Success beyond your wildest dreams!

20 Conclusions
- Now You Believe!
- So, what next – how can you get started?
- Quick Start – software evaluation, Knowledge Map, POC or Pilot = Good choice and Learn by doing
- Fall – Attend ESS, TBC, KMWorld – latest ideas
- Or develop a time machine and go back to yesterday and take my workshop
- Fall 2014 – early 2015: New Book:
  - Text Analytics: Everything You Need to Know to Conquer Information Overload, Mine Social Media for Real Value, and Turn Big Text Into Big Data
  - Title might be shorter, but it will cover all you need to know

Questions? Tom Reamy KAPS Group Knowledge Architecture Professional Services