September 24, 2008 Zuzana Gedeon – Research Labs


Knowledge Base Tuning

Overview
Key Ideas
KB: what do we/you mean by KB
Searching: background on search engine technology
KB Configuration & Tuning

KB in general terms
Knowledge base: information available to the user
Data mining (making the KB available): applications helping the user to access this information

Who uses the Knowledge Base?
Users: end users, managers, CSRs, marketers, subject experts
Sources: internal database, external documents, Answer database, Community Forum
Coming up: how do users search? What do we mean by “search”?

Types of search - architecture
Filter based: direct database query, built into the views engine. Product/category filtering; date, customer email address, … Most runtime-selectable filters in Reports.
Text/Index based: “Google style” search. Documents -> index. Boosting and weight calculation.
KB Browse: navigational, exploratory search.
No Search!! Get what you need without the need for search.

Mashup: Report filters with index-based search
Incident search, Answer search pages
Filter > search_thread (search_xxx)
Sort by match_wt!!!
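A minimal sketch of the mashup idea, assuming the text index has already produced a match weight per record. The row layout, the `filtered_search` helper, and the weight values are hypothetical and only illustrate "filter first, then sort by match weight"; they are not the product's report engine.

```python
def filtered_search(rows, match_weights, **filters):
    """Keep rows that pass every filter and matched the index; best match first."""
    hits = [
        row for row in rows
        if row["id"] in match_weights
        and all(row.get(field) == value for field, value in filters.items())
    ]
    return sorted(hits, key=lambda row: match_weights[row["id"]], reverse=True)


incidents = [
    {"id": 1, "product": "router", "subject": "Cannot connect"},
    {"id": 2, "product": "router", "subject": "Connection drops"},
    {"id": 3, "product": "phone", "subject": "Cannot connect"},
]
match_wt = {1: 12, 2: 40, 3: 55}  # hypothetical weights from the text index

print([row["id"] for row in filtered_search(incidents, match_wt, product="router")])
# prints [2, 1]
```

Incident 3 has the highest weight but is dropped by the product filter, which is the point of combining the two kinds of search.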

EU knowledge sources and delivery: Syndication widget, Voice, KB Search, KB Browse, Answer database, External documents, Community Forum, Pro Services integration

No Search!!
Fact: a large percentage of user sessions do NOT include a search.
Users find what they are looking for without any search when they are shown the right content as soon as they access the page.

How do we do that?
Good content: well-chosen category and product organization; good descriptive titles; concise information (generic vs. specific); consistency.
Administrator: Topic/Add words; user-specifiable content tags to start/stop indexing for searching; Answer as a file attachment or URL versus just a Q&A pair; SmartGuide to create branching (script-like) Answers; publish-on and review-on dates; place on top (“fix on top” really sparingly); Answer access level conditional sections.
Users: users ranking helpfulness (explicit); ants leaving a pheromone trail (implicit ranking).

Find information where they search
Sitemap: exporting the KB to search engines.
What are Sitemaps? Sitemaps are an easy way for webmasters to inform search engines such as Google and Yahoo about pages on their sites that are available for crawling.
Sitemap feature description: facilitates Google’s (and other search engines’) spidering of your public RightNow knowledgebase content.
Benefits: allows you to control how search engine spiders visit and consume your knowledgebase content. If you desire, this can help your content go to the front of the line for the Google/Yahoo web spiders.

Information placement Knowledge Syndication Widget with Product filter

How do we do that? Good content: well-chosen category and product organization; good descriptive titles; concise information (generic vs. specific); consistency.

How do we do that? Administrator: Topic/Add words; user-specifiable content tags to start/stop indexing for searching; Answer as a file attachment or URL versus just a Q&A pair; publish-on and review-on dates; Answer access level conditional sections; place on top (“fix on top” really sparingly).

Topic Words for Search
Allows the KB administrator to associate either a WWW document or a KB Answer with a specific single search term.
The given document appears first in the list of search results.
The document can be set to always be shown.
Useful for directed information presentation, advertising, notices, announcements, etc.

Separate words by comma or newline
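The topic-word behaviour can be sketched as follows. This is an illustration under stated assumptions, not the product's code: `parse_topic_words` and `apply_topic_words` are hypothetical helpers, and the document titles are made up. The only facts taken from the slides are that words are separated by comma or newline and that the associated document appears first in the results.

```python
import re


def parse_topic_words(raw):
    """Split a comma- or newline-separated field into clean, lowercased terms."""
    return [w.strip().lower() for w in re.split(r"[,\n]", raw) if w.strip()]


def apply_topic_words(query, results, topics):
    """Put any document tied to a query word ahead of the normal results."""
    query_words = set(query.lower().split())
    pinned = [doc for doc, words in topics.items() if query_words & set(words)]
    return pinned + [r for r in results if r not in pinned]


# Hypothetical topic-word entry for one document.
topics = {"Service status page": parse_topic_words("outage, downtime\nstatus")}

print(apply_topic_words("email outage", ["Email setup", "Service status page"], topics))
# prints ['Service status page', 'Email setup']
```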

Stop/start index
This text is being indexed <!--stopindex--> this text is not being indexed <!--startindex--> And this text is again indexed
The tags are case independent.
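The effect of the tags can be reproduced with a small filter that runs before text is handed to an indexer. This is a sketch only: `indexable_text` is a hypothetical helper, and treating an unclosed `<!--stopindex-->` as "skip to the end of the text" is an assumption the slide does not confirm.

```python
import re

# Match from a stop tag up to the next start tag (or the end of the text,
# which is an assumption). IGNORECASE reflects "case independent".
_SKIPPED = re.compile(
    r"<!--stopindex-->.*?(?:<!--startindex-->|\Z)",
    re.IGNORECASE | re.DOTALL,
)


def indexable_text(answer_html):
    """Return only the parts of the answer that should be indexed."""
    kept = _SKIPPED.sub(" ", answer_html)
    return " ".join(kept.split())  # collapse leftover whitespace


sample = (
    "This text is being indexed <!--stopindex--> this text is not "
    "being indexed <!--STARTINDEX--> And this text is again indexed"
)
print(indexable_text(sample))
# prints: This text is being indexed And this text is again indexed
```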

How do we do that?
Users: users ranking helpfulness (explicit); ants leaving a pheromone trail (implicit ranking).
AI: aging of the information (agedatabase).
Administrator: promoting new answers.

No Search!! Users find what they are looking for without any search when they are shown the right content as soon as they access the page.

Users + AI Common-> knowledge base -> Answer search: SA_SOLVED_WEIGH_PREF – long term or short term preference

Smart Assistant

Relationships Between Answers
The Sibling Answers section must be enabled from the workspace property.
Answers can also be related together manually.

Use Smart Assistant
Help in populating the KB: respond to customer inquiries, propose new answers.
Set up Smart Assistant Rules: try to answer the question without admin interaction.

Smart Assistant tuning
Limit by matching Browse topics: RNT UI → Support → SA_NL_MATCH_THRESHOLD
Enables the ability to restrict SmartAssistant suggested answers to answers that have the same or closely matching locations in the browse tree. The accepted values are: 0 - do not restrict, 1 - use answers from any closely matching clusters, and 2 - use only best matching clusters. If SA_DM_FREQ is set to 0, the value of SA_NL_MATCH_THRESHOLD will be forced to 0 regardless of the value set here. Default is 1.
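The interaction between the two settings is easy to misread, so here is the rule written out. The function name is hypothetical; the accepted values, the default of 1, and the forcing rule come from the slide text above.

```python
def effective_match_threshold(sa_dm_freq, sa_nl_match_threshold=1):
    """Return the threshold that actually applies after the SA_DM_FREQ override."""
    if sa_nl_match_threshold not in (0, 1, 2):
        raise ValueError("accepted values are 0, 1 and 2")
    if sa_dm_freq == 0:
        # The configured value is ignored and restriction is switched off.
        return 0
    return sa_nl_match_threshold


print(effective_match_threshold(sa_dm_freq=0, sa_nl_match_threshold=2))  # prints 0
print(effective_match_threshold(sa_dm_freq=7, sa_nl_match_threshold=2))  # prints 2
```

The value 7 for `sa_dm_freq` is an arbitrary non-zero example, not a recommended setting.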

Suggested Searches
EU_SUGGESTED_SEARCHES_ENABLE: using the history of end-user searches, we use a data-mining technique to establish relationships between similar search phrases.
Each search phrase suggested to an end user must pass these tests: each word is spelled correctly; positive SmartSense value; no words in the blacklist; complementary to the current search.
SEARCH_SUGGESTIONS_DISPLAY: 0 no recommendations; 1 turn on recommended products; 2 turn on recommended categories; 4 turn on recommended Browse topics.
MAX_SEARCH_SUGGESTIONS
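The values 0, 1, 2 and 4 for SEARCH_SUGGESTIONS_DISPLAY look like bit flags. Assuming they can be added together (the slide does not say so explicitly), a setting could be decoded as in this sketch; the constant and function names are hypothetical.

```python
RECOMMEND_PRODUCTS = 1
RECOMMEND_CATEGORIES = 2
RECOMMEND_BROWSE_TOPICS = 4


def enabled_recommendations(display_value):
    """List which recommendation types a combined setting would turn on."""
    flags = [
        (RECOMMEND_PRODUCTS, "products"),
        (RECOMMEND_CATEGORIES, "categories"),
        (RECOMMEND_BROWSE_TOPICS, "browse topics"),
    ]
    return [name for bit, name in flags if display_value & bit]


print(enabled_recommendations(0))  # prints []
print(enabled_recommendations(5))  # prints ['products', 'browse topics']
```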

Web Like Search
Traditional keyword searching on the internet or within an operating system. The user’s mental model (Google, Yahoo, MSN).
Attributes of Search: indexes the ‘entire’ corpus of information. Almost never results in zero matches.
User testing in Jan 08 showed that Google-like behavior is expected whenever the term ‘Search’ is placed next to a text box on the web.

Answer Search

External documents search Web pages Answers extra

What’s an Index?
The index is where all the information about what is searchable is stored.
Indexes are used to speed up finding search results by not requiring each document to be scanned during the search process.
Most search engines (including ours) use an ‘inverted index’, which means that they map words to documents, or words to locations within documents. This is similar to the index in the back of a book, as opposed to “find a word with your finger”.
Indexes are pre-computed when documents are created/edited.

Example of an Index
Documents:
“Four score and seven years ago our fathers brought forth on this continent, conceived in Liberty, and”
“Score: A group of 20 items. Hence, four score is 4x20, or 80.”
“Liberty: The condition of being free from restriction or control.”
“The North American Continent consists of the countries: the United States of America, Canada, Mexico,”
Index (word -> location): years, united, states, seven, score, restriction, north, mexico, liberty
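A minimal sketch of an inverted index built from two of the example texts above. It maps each word to the documents and word positions where it occurs. The document ids and the `build_inverted_index` helper are hypothetical, and the tokenizer is deliberately naive.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased word to {doc_id: [word positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, raw in enumerate(text.split()):
            word = raw.strip(".,:;!?\"'()").lower()
            if word:
                index[word].setdefault(doc_id, []).append(position)
    return index


docs = {
    "gettysburg": "Four score and seven years ago our fathers brought forth "
                  "on this continent, conceived in Liberty",
    "score": "Score: A group of 20 items. Hence, four score is 4x20, or 80.",
}
index = build_inverted_index(docs)

print(index["score"])    # prints {'gettysburg': [1], 'score': [0, 8]}
print(index["liberty"])  # prints {'gettysburg': [15]}
```

A lookup is now a dictionary access rather than a scan of every document, which is the speed-up the previous slide describes.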

Stopwords and Word Stemming
Stopwords are human-language connector words that are not generally useful in information retrieval: a, an, the, or, on, for, … (“To be or not to be”)
RightNow feature: multiple editable stop word lists (Incidents, Answers).
Word stemming: a standard natural language processing technique, with a unique stemmer for each language.
CONNECT, CONNECTED, CONNECTING, CONNECTION, CONNECTIONS => CONNECT
Generalizes searches (exact matches not considered).
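To make the two steps concrete, here is a toy normalizer. It is not the stemmer the product uses: `toy_stem` is a hypothetical suffix stripper that only handles the CONNECT example, and the stopword set is just the handful of words listed on the slide.

```python
STOPWORDS = {"a", "an", "the", "or", "on", "for"}  # from the slide; real lists are longer

# Longest suffixes first, so "ions" is tried before "s".
SUFFIXES = ("ions", "ion", "ing", "ed", "s")


def toy_stem(word):
    """Strip one known suffix, keeping at least three characters of stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(query):
    """Drop stopwords, then stem what is left."""
    return [toy_stem(w) for w in query.lower().split() if w not in STOPWORDS]


forms = ["CONNECT", "CONNECTED", "CONNECTING", "CONNECTION", "CONNECTIONS"]
print({toy_stem(w) for w in forms})                # prints {'connect'}
print(normalize("the connections for a printer"))  # prints ['connect', 'printer']
```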

Query Processing and Result Ranking
How does a search query work?
The query is processed via word stemming and removal of stopwords.
Aliases are added to the search terms (non-stopwords, original form).
Search terms are looked up in the index.
The total hits are gathered and sorted by document via weighting formula(s).
The documents’ attributes (title, link, etc.) are fetched and returned to the browser.
A postprocessing algorithm may be used before display.
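The steps above can be sketched end to end in a few lines. Everything here is illustrative: the index contents, weights, titles, alias table and the one-rule `stem` placeholder are hypothetical, and summing weights per document is an assumed stand-in for the unspecified weighting formula(s).

```python
from collections import Counter

STOPWORDS = {"a", "an", "the", "or", "on", "for", "to", "of"}
ALIASES = {"fbi": ["federal", "bureau", "investigation"]}  # hypothetical alias table

# Hypothetical pre-built inverted index: term -> {doc_id: weight}
INDEX = {
    "reset": {1: 45, 2: 4},
    "password": {1: 45, 3: 30},
    "federal": {4: 4},
}
TITLES = {1: "Resetting your password", 2: "Factory reset",
          3: "Password rules", 4: "Federal holidays"}


def stem(word):
    """Placeholder stemmer: strips a plural "s" only."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def search(query):
    words = [w for w in query.lower().split() if w not in STOPWORDS]
    terms = [stem(w) for w in words]
    for w in words:  # aliases are looked up on the original, unstemmed form
        terms.extend(ALIASES.get(w, []))
    scores = Counter()
    for term in terms:
        for doc_id, weight in INDEX.get(term, {}).items():
            scores[doc_id] += weight
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(TITLES[doc_id], score) for doc_id, score in ranked]


print(search("reset the passwords"))
# prints [('Resetting your password', 90), ('Password rules', 30), ('Factory reset', 4)]
```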

Answer Search

Word Bias Configuration
Some words are relatively more important than others based upon location. Words in the Subject & Keywords fields are more important than words in the body of a document or the attachments.
RightNow configuration options:
SRCH_KEY_WEIGHT      50   Keywords
SRCH_PROD_WEIGHT     50   Product Words
SRCH_CAT_WEIGHT      50   Category Words
SRCH_SUBJ_WEIGHT     45   Subject/Title Words
SRCH_DESC_WEIGHT     30   Question Words
SRCH_BODY_WEIGHT      4   Answer Words
SRCH_ATTACH_WEIGHT    4   File-Attach. Words
Set these to be the same across interfaces! Make sure to point out to go back to best practices: global changes.
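A small sketch of how location-based weights change ranking. The weights are the values from the table above; the field names, the `score` helper, and the rule "add a field's weight once if the term appears in it" are assumptions, since the slides do not give the actual formula.

```python
FIELD_WEIGHTS = {
    "keywords": 50,    # SRCH_KEY_WEIGHT
    "product": 50,     # SRCH_PROD_WEIGHT
    "category": 50,    # SRCH_CAT_WEIGHT
    "subject": 45,     # SRCH_SUBJ_WEIGHT
    "question": 30,    # SRCH_DESC_WEIGHT
    "answer": 4,       # SRCH_BODY_WEIGHT
    "attachment": 4,   # SRCH_ATTACH_WEIGHT
}


def score(document, term):
    """Add a field's weight once for each field that contains the term."""
    term = term.lower()
    return sum(
        weight
        for field, weight in FIELD_WEIGHTS.items()
        if term in document.get(field, "").lower().split()
    )


doc_a = {"subject": "Printer setup", "answer": "connect the cable"}
doc_b = {"answer": "the printer needs a cable and a printer driver"}

print(score(doc_a, "printer"))  # prints 45
print(score(doc_b, "printer"))  # prints 4
```

Under these assumptions a single title match outranks any number of body mentions, which is why the subject and keyword fields deserve careful wording.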

AND vs. OR Query Processing
Do the search results contain ALL words in the search text or just SOME words?
All major Internet search engines use AND.
We use OR by default with a heavy multi-word weight bias (“AND-like ordering”).
Why do we use OR? AND does not work well for small document sets (under 10,000 answers).
Why does AND perform badly on small document sets? It’s too easy for a user to construct a query with no search results.
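The difference shows up quickly on a tiny corpus. In this sketch the documents, the `multi_word_bonus` parameter and its value are hypothetical; the bonus is just one simple way to get "AND-like ordering" out of an OR match, not the product's formula.

```python
def match_and(docs, terms):
    """Return ids of documents containing every term."""
    return [doc_id for doc_id, words in docs.items()
            if all(t in words for t in terms)]


def match_or_ranked(docs, terms, multi_word_bonus=10):
    """Return ids of documents containing any term, most terms matched first."""
    scored = []
    for doc_id, words in docs.items():
        hits = sum(1 for t in terms if t in words)
        if hits:
            # Hypothetical bias: each additional matched term is rewarded heavily.
            scored.append((hits + multi_word_bonus * (hits - 1), doc_id))
    return [doc_id for _, doc_id in sorted(scored, key=lambda s: (-s[0], s[1]))]


docs = {
    "a1": {"reset", "password", "email"},
    "a2": {"reset", "router"},
    "a3": {"billing", "invoice"},
}
terms = ["reset", "password", "expired"]

print(match_and(docs, terms))        # prints []
print(match_or_ranked(docs, terms))  # prints ['a1', 'a2']
```

One unmatched word ("expired") empties the AND result, while OR still returns the closest answers with the best multi-word match on top.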

Result Focusing and Truncation
Dynamic Truncation Bias (Answers): truncate search results to those scoring best.
RNT UI: SEARCH_RESULT_LIMITING (natural breaks)
RNT UI: ANS_SRCH_THRESHOLD (break by weight)
RNT UI: ANS_SRCH_SUB_THRESHOLD (avoid 0 results)
Concept-biased Search Focus: focus search results based upon matching of the query to existing KB learned topics.
RNT UI: SEARCH_RELEVANCE_FOCUS (Answers)
RNT UI: SA_NL_MATCH_THRESHOLD (SmartAssistant)
Make sure to point out to go back to best practices: global changes.

External documents search Web pages Answers

External documents and tuning
Not much content control: the spider uses only externally available content.
Not much tuning control: title and body weight.
SRCH_KEY_WEIGHT: Meta + products, categories
SRCH_SUBJ_WEIGHT: Title
SRCH_DESC_WEIGHT: Text
HtDig with Clucene
File attachment size: FATTACH_MAX_SIZE
Core Engine
Search pulldowns (kill them): ANS_SEARCH_BY_ENABLED, ANS_SORT_BY_ENABLED

Important Files in the File Manager
exclude_answers.txt     End-user Stopwords
exclude_incidents.txt   Incident Stopwords
aliases.txt             Always-On Search Thesaurus
thesaurus.txt           Thesaurus for similar search
smartsense.txt          Emotional Word Ratings
blacklist.txt           No-Show words for Sugg. Searches
userdic.tlx             Custom Dictionary for Spellchecker

Wizard exclude_answers.txt

Aliases
Establishes a link between two words to treat them as synonyms for every search type.
FBI = Federal Bureau of Investigation
Whiskey = Scotch
Go to demo
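A sketch of alias expansion using the two pairs above. Treating the link as symmetric is an assumption drawn from the word "synonyms"; the helper names are hypothetical, and the real aliases.txt file format is not shown in the slides, so the pairs are written inline.

```python
ALIAS_PAIRS = [
    ("fbi", "federal bureau of investigation"),
    ("whiskey", "scotch"),
]


def build_alias_map(pairs):
    """Link both sides of every pair so a lookup works in either direction."""
    alias_map = {}
    for left, right in pairs:
        alias_map.setdefault(left, set()).add(right)
        alias_map.setdefault(right, set()).add(left)
    return alias_map


def expand(query, alias_map):
    """Return the query followed by any aliases registered for it."""
    q = query.lower().strip()
    return [q] + sorted(alias_map.get(q, ()))


alias_map = build_alias_map(ALIAS_PAIRS)
print(expand("Whiskey", alias_map))  # prints ['whiskey', 'scotch']
print(expand("scotch", alias_map))   # prints ['scotch', 'whiskey']
```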

Analytics
Keyword Searches report: frequent searches (important content); searches with no answers (missing content); searches with too many answers (configuration and tuning needed).
Gap report

Keyword Searches Report

Information Gap Report Use the Gap Report to identify ‘holes’ in the end-user KB. Compares recent incidents to existing Answers. Gap Report Config Options: GAP_FREQUENCY & GAP_TIME_PERIOD – default 7 days for both.

Information Gap Report Screenshot

Other Customization
EU_BROWSER_SEARCH_PLUGIN: Enables the Answer and External Document search pages to provide an interface for web browsers to query them directly from their built-in search bars, such as those provided by Google or Yahoo!. Default is disabled (No).
EU_SYNDICATION_ENABLE: widgets.
ANS_SORT_BY_ENABLED: Enables the Sort By drop-down menu on the Find Answers page. This setting overrides any view settings. Default is disabled (No). This is the reason to have a limited result set!
SEARCH_WITH_OPERATORS: Enables processing of +, - and ~ operators while searching for answers. Default is enabled (Yes).

Thank You Questions?