Text Analytics And Text Mining Best of Text and Data


1 Text Analytics And Text Mining Best of Text and Data
Tom Reamy, Chief Knowledge Architect, KAPS Group (Knowledge Architecture Professional Services)

2 Agenda
- Text Analytics Capabilities
- Text Analytics Applications
- Text Mining and Text Analytics
- Data and Unstructured Content
- Case Study – Text Mining for Taxonomy Development
- Conclusion

3 KAPS Group: General
- Knowledge Architecture Professional Services
- Virtual Company: Network of consultants – 8-10
- Partners – SAS, Smart Logic, Microsoft-FAST, Concept Searching, etc.
- Consulting, Strategy, Knowledge architecture audit
- Services:
  - Text Analytics evaluation, development, consulting, customization
  - Knowledge Representation – taxonomy, ontology, Prototype
  - Metadata standards and implementation
  - Knowledge Management: Collaboration, Expertise, e-learning
- Applied Theory – Faceted taxonomies, complexity theory, natural categories

4 Introduction to Text Analytics: Text Analytics Features
- Noun Phrase Extraction
  - Catalogs with variants, rule based dynamic
  - Multiple types, custom classes – entities, concepts, events
  - Feeds facets
- Summarization
  - Customizable rules, map to different content
- Fact Extraction
  - Relationships of entities – people-organizations-activities
  - Ontologies – triples, RDF, etc.
- Sentiment Analysis
  - Statistical, rules – full categorization set of operators
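
A minimal sketch of the "catalogs with variants" idea from this slide, written in Python. The catalog entries, the function names, and the choice of regular expressions are illustrative assumptions, not a description of any vendor's product. Each variant is mapped back to one canonical entity, and the canonical names are then counted, which is one way extraction can "feed facets".

```python
import re
from collections import Counter

# Hypothetical catalog: canonical entity name -> (entity type, known variants).
CATALOG = {
    "International Business Machines": ("organization", ["IBM", "I.B.M."]),
    "SAS Institute": ("organization", ["SAS"]),
}

def build_patterns(catalog):
    """Compile one case-sensitive regex per canonical entity, longest variant first."""
    patterns = {}
    for canonical, (etype, variants) in catalog.items():
        names = sorted([canonical, *variants], key=len, reverse=True)
        alternation = "|".join(re.escape(n) for n in names)
        # Lookarounds keep "SAS" from matching inside a longer word.
        patterns[canonical] = (etype, re.compile(rf"(?<!\w)(?:{alternation})(?!\w)"))
    return patterns

def extract_entities(text, patterns):
    """Return (canonical name, type, matched text, offset) tuples in text order."""
    hits = []
    for canonical, (etype, pattern) in patterns.items():
        for m in pattern.finditer(text):
            hits.append((canonical, etype, m.group(0), m.start()))
    return sorted(hits, key=lambda h: h[3])

PATTERNS = build_patterns(CATALOG)
text = "I.B.M. and SAS both sell text analytics; IBM also sells search."
entities = extract_entities(text, PATTERNS)

# Facet counts: how often each canonical entity appears, whatever the variant.
facets = Counter(name for name, _, _, _ in entities)
```

A purely catalog-driven extractor like this only finds names it already knows; the "rule based dynamic" extraction mentioned on the slide would need additional pattern rules on top.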

5 Introduction to Text Analytics: Text Analytics Features
- Auto-categorization
  - Training sets – Bayesian, Vector space
  - Terms – literal strings, stemming, dictionary of related terms
  - Rules – simple – position in text (Title, body, url)
  - Semantic Network – Predefined relationships, sets of rules
  - Boolean – Full search syntax – AND, OR, NOT
  - Advanced – NEAR (#), PARAGRAPH, SENTENCE
- This is the most difficult to develop
- Build on a Taxonomy
- Combine with Extraction, Sentiment
- Foundation for best text analytics & combination
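
To make the Boolean and NEAR operators concrete, here is a small Python sketch of one categorization rule. The rule itself, the category it implies, and the helper names are hypothetical; commercial tools express such rules in their own syntax, which this does not reproduce.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def near(tokens, a, b, distance):
    """True if terms a and b occur within `distance` tokens of each other."""
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)

def matches_energy_pipeline(text):
    """Hypothetical rule: (pipeline NEAR/5 (oil OR gas)) AND NOT software."""
    tokens = tokenize(text)
    has_context = (near(tokens, "pipeline", "oil", 5)
                   or near(tokens, "pipeline", "gas", 5))
    return has_context and "software" not in tokens
```

For example, `matches_energy_pipeline("The new gas pipeline opens in May.")` is true, while a sentence about a research pipeline or a software pipeline is rejected. PARAGRAPH and SENTENCE operators could be approximated the same way by splitting the text into those units before testing.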

12 Varieties of Taxonomy / Text Analytics Software
- Taxonomy Management
  - Synaptica, SchemaLogic
- Full Platform
  - SAS-Teragram, SAP-Inxight, Smart Logic, Data Harmony, Concept Searching, Expert System, IBM, GATE
- Content Management – embedded
- Embedded – Search
  - FAST, Autonomy, Endeca, Exalead, etc.
- Specialty
  - Sentiment Analysis, VOC – Lexalytics, Attensity / Reports
  - Ontology – extraction, plus ontology

13 Text Analytics Applications: Platform for Multiple Applications
- Content Aggregation, Duplicate Documents – save millions!
- Business intelligence, Customer Intelligence
- Social Media - sentiment analysis, Voice of the Customer
- Social – Hybrid folksonomy / taxonomy / auto-metadata
- Social – expertise, categorize tweets and blogs, reputation
- Ontology – travel assistant, semantic web, etc.
- eDiscovery, Reputation management, Customer Experience
- Expertise Location, Crowd sourcing Technical support

14 Text Analytics Applications: Enterprise Search - Elements
- Text Analytics can “solve” enterprise search
- Multiple Knowledge Structures
  - Facet – orthogonal dimension of metadata
  - Taxonomy - Subject matter / aboutness
- Software - Search, ECM, auto-categorization, entity extraction, Text Analytics and Text Mining
- People – tagging, evaluating tags, fine tune rules and taxonomy
- Rich Search Results – context and conversation
- Platform for search based applications

17 Text Analytics and Text Mining: Data and Unstructured Content
- 80% of content is unstructured – adding to semantic web is major
- Text Analytics – content into data
  - Big Data meets Big Content
- Real integration of text and ontology
  - Beyond “hasDescription”
- Improve accuracy of extracted entities, facts – disambiguation
  - Pipeline – oil & gas OR research / Ford
- Add Concepts, not just “Things” – 68% want this
- Semantic Web + Text Analytics = real world value
- Linked Data + Text Analytics – best of both worlds
- Build superior foundation elements – taxonomies, categorization
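
The "Pipeline – oil & gas OR research" example can be illustrated with a simple context-window disambiguator. This Python sketch is an assumption about how such disambiguation might work at its simplest: the cue-word lists, the window size, and the function name are all made up for illustration.

```python
import re

# Hypothetical sense inventory: context words that suggest each reading.
SENSES = {
    "pipeline": {
        "oil_and_gas": {"oil", "gas", "crude", "drilling", "barrel"},
        "research": {"drug", "clinical", "trial", "candidate", "research"},
    },
}

def disambiguate(term, text, window=8):
    """Pick the sense whose cue words appear most often near `term`.

    Returns None when no cue word is found within the window.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    scores = {sense: 0 for sense in SENSES[term]}
    for i, tok in enumerate(tokens):
        if tok != term:
            continue
        context = tokens[max(0, i - window): i + window + 1]
        for sense, cues in SENSES[term].items():
            scores[sense] += sum(1 for w in context if w in cues)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

With these cue lists, a sentence about a crude oil pipeline resolves to `oil_and_gas` and one about a clinical pipeline resolves to `research`. A production system would more likely draw the cue words from a taxonomy or ontology than from a hand-written dictionary.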

18 Text Analytics and Text Mining and Data Mining: Vaccine Adverse Reaction
- Combine with Data Mining
- New sources of information
  - News stories, medical records
  - Blogs, social
- Find new connections, sources of knowledge
- Vaccine Adverse Effects – disease, symptoms, variables
  - Unstructured text into a data source
  - Some preliminary analysis, content structure
  - Find unknown adverse effects and prevalence
- Drug Discovery + search / research – 5 year story

19 Text Analytics Applications Example – Vaccine Adverse Effects

20 Text Analytics Applications Example – Vaccine Adverse Effects

21 Text Analytics Applications Example – Vaccine Adverse Effects

22 Text Analytics and Text Mining: Case Study – Taxonomy Development
- Problem – 200,000 new uncategorized documents
- Old taxonomy – need one that reflects change in corpus
- Text mining, entity extraction, categorization
- Bottom Up – terms in documents – frequency, date
- Clustering – suggested categories
- Clustering – chunking for editors
- Time savings – only feasible way to scan documents
- Quality – important terms, co-occurring terms
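
The bottom-up step (term frequency plus co-occurring terms) can be sketched in a few lines of Python. The sample documents, the stopword list, and the function names are illustrative assumptions; the case study's actual tooling is not described in the slides.

```python
import re
from collections import Counter
from itertools import combinations

# Small illustrative stopword list; a real run would use a fuller one.
STOPWORDS = {"the", "a", "of", "and", "in", "for", "to", "is", "on"}

def terms(doc):
    """Lowercased alphabetic tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z]+", doc.lower()) if t not in STOPWORDS]

def term_stats(docs):
    """Return (document frequency per term, co-occurrence count per term pair)."""
    doc_freq = Counter()
    pair_freq = Counter()
    for doc in docs:
        unique = sorted(set(terms(doc)))
        doc_freq.update(unique)
        # Each unordered pair is counted once per document it appears in.
        pair_freq.update(combinations(unique, 2))
    return doc_freq, pair_freq

docs = [
    "Taxonomy development for enterprise search",
    "Text mining for taxonomy development",
    "Enterprise search and text analytics",
]
doc_freq, pair_freq = term_stats(docs)
```

Pairs that co-occur often, such as `("development", "taxonomy")` here, are candidates for a shared taxonomy node or a suggested cluster for editors to review. Counting every pair grows quadratically with vocabulary per document, so a corpus of 200,000 documents would need pruning to frequent terms first.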

23 Text Analytics and Text Mining: Case Study – Taxonomy Development
- Text into Data:
  - Article, Abstract, Title, Subtitle – fields & source of terms
  - Add Data: PubDate, journalTitle, Taxonomy Node
- Terms – Map to frequency, date, date ranges, Taxonomy Node
  - New Terms, Trends
- Relevance – frequency, Abstract, Title, human judgment
- Entity Extraction – Authors, Organizations, Products
- Categorization – build on clusters & taxonomy
- Combination – reports, visualizations, interactive explorations
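
One way to picture "text into data" is to treat each article as a record with the fields named on the slide and then map a term to frequency by date. In this Python sketch the record layout, the sample articles, and the title weighting are assumptions made for illustration; the slide only says that relevance draws on frequency, Abstract, Title, and human judgment.

```python
import re
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Article:
    title: str
    abstract: str
    pub_year: int        # stands in for the slide's PubDate field
    journal_title: str   # stands in for the slide's journalTitle field

def term_counts_by_year(articles, term, title_weight=2):
    """Weighted mentions of `term` per year; a title hit counts `title_weight` times."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    counts = defaultdict(int)
    for a in articles:
        hits = (title_weight * len(pattern.findall(a.title))
                + len(pattern.findall(a.abstract)))
        if hits:
            counts[a.pub_year] += hits
    return dict(sorted(counts.items()))

articles = [
    Article("Ontology design", "An ontology for travel data.", 2009, "Journal A"),
    Article("Search tuning", "Using an ontology to improve recall.", 2010, "Journal B"),
    Article("Faceted search", "Facets and metadata.", 2010, "Journal A"),
]
trend = term_counts_by_year(articles, "ontology")
```

The resulting per-year counts are the kind of table that can feed the reports and trend visualizations listed on the slide; grouping by `journal_title` or by taxonomy node instead of year would follow the same pattern.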

24 Case Study – Taxonomy Development

27 Case Study – Taxonomy Development

28 Case Study – Taxonomy Development

29 Conclusion
- Text Analytics impact is huge – solve information overload
- Enterprise Search and Search Based Applications: Save millions and enhance productivity
- Combination of Text Analytics & Text Mining – unlimited range of applications
  - Mutual Enrichment – more data, add structure to unstructured
- Add Ontology = Richer Text Analytics – smarter, more useful
- Text Analytics + Text Mining + Semantic Web
  - Move from theory to new practical applications
- The best is yet to come!

30 Questions?
Tom Reamy, tomr@kapsgroup.com
KAPS Group (Knowledge Architecture Professional Services)

