Next Generation Search Engines Ehsun Daroodi 1 Feb, 2003.


2 Next Generation Search Engines Ehsun Daroodi 1 Feb, 2003

3 Agenda: Answerer, Brilliant, Conclusion

4 Answerer Answerer Co. http://www.answerer.co.kr/2002/

5 Answerer. 1st generation: Yahoo! (keyword matching technology). 2nd generation: AskJeeves. Uses multi-keyword search technology; shows similar questions; shows sites that might contain the sought information.

6 Answerer. 3rd generation: Answerer. Uses a more complex natural language analyzer together with AI data mining and inferencing technology; gives exact answers to the questions asked, along with sites that might contain more information. Progression: Yahoo => AskJeeves => Answerer.

7 Example. Input question: Who is the chairman of Microsoft Corporation? Output result: Chairman of Microsoft Corporation is Bill Gates, plus related sites.

8 Brilliant Microsoft Research China

9 1st Generation: keyword-based. Problem: a simple keyword may not convey the complex search semantics a user wishes to express, so the engine returns many irrelevant documents and, eventually, disappoints users. Examples: Yahoo!, MSN, ...

10 2nd Generation: FAQ-based. Extracts FAQs and manually indexes these questions and their answers. Users are asked to confirm one or more rephrased questions in order to find their answers. Returns a few very precise results as the answer.

11 2nd Generation (cont.). Limited-domain applications such as web-based technical support. Prime example: AskJeeves.

12 3rd Generation (Brilliant). Deals with concepts. Accepts natural language queries. Extracts syntactic as well as semantic information. Robustness: partial parsing whenever possible. Interacts with the user to confirm the concept when facing ambiguity.

13 Hypothesis. Concept-space coverage hypothesis: a small subset of concepts can cover most user queries. Track this small subset and use semi-automated methods to index the concepts precisely. This results in a search engine that satisfies most users most of the time.

14 Hypothesis (cont.). To support their hypothesis, they took a one-day MSN.com query log and manually mapped queries to pre-defined concept categories. 3000 distinct queries, representing 418,248 queries on Sep 4, 1999, were taken and classified.

15 Hypothesis (cont.). Examples of concepts: “Finding computer and Internet related products and services”, “Finding movies and toys on the Internet”, and so on.

17 Hypothesis (cont.). Both the keyword and the concept distributions follow the pattern that the first few popular categories cover most of the queries. The concept distribution converges much faster than the keyword distribution. This shows that their hypothesis holds at least for the MSN.com query log data.
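The coverage comparison on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two toy logs below are invented (they are not the MSN.com data), and the function names `coverage_curve` and `categories_needed` are hypothetical. The sketch ranks categories by popularity and reports how many of them are needed to cover a target share of the queries, which is one way to read "converges much faster".

```python
from collections import Counter

def coverage_curve(labels):
    """Cumulative fraction of queries covered by the top-k categories, k = 1..n."""
    counts = Counter(labels)
    total = sum(counts.values())
    covered = 0
    curve = []
    for _, n in counts.most_common():  # most popular category first
        covered += n
        curve.append(covered / total)
    return curve

def categories_needed(labels, target=0.8):
    """Smallest k such that the k most popular categories cover >= target of the queries."""
    for k, fraction in enumerate(coverage_curve(labels), start=1):
        if fraction >= target:
            return k
    return 0  # only reached for an empty log

# Invented toy log: each query labeled once by keyword and once by concept.
keyword_log = ["mp3", "games", "maps", "mp3", "jobs",
               "chat", "cars", "news", "mp3", "games"]
concept_log = ["entertainment", "entertainment", "travel", "entertainment", "jobs",
               "entertainment", "shopping", "news", "entertainment", "entertainment"]

print(categories_needed(keyword_log))  # categories needed when labeling by keyword
print(categories_needed(concept_log))  # categories needed when labeling by concept
```

In this toy data the concept labeling reaches the 80% target with fewer categories than the keyword labeling, which mirrors the shape of the claim but says nothing about the real log.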

18 Architecture Overview (diagram). Components: User Interface, NLP, Meta Search, FAQ Matching, Answer Finding, Tools, Crawler, Log Writer. Data stores: FAQ Database, Template Database, Answer Database, Dictionary, Question Log. Flow labels: Question String, Keywords (and other…), Question List, Answer, Search Result, Feedback, Web Sites/Pages.

19 Parsing. Parses natural language based on grammatical knowledge obtained through analysis of query log data. Processes query logs to obtain new question templates with indexed answers. Supports relevance feedback.

20 Robust Parsing. Robust parsing handles ill-formed inputs. The robust parser attempts to overcome extra-grammaticality by ignoring the unparsable words and fragments and conducting a search with the maximal subset of the original input.
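A minimal sketch of the "ignore what cannot be parsed, search with the rest" idea follows. It assumes the simplest possible notion of parsability, membership in a known vocabulary; the vocabulary and the name `robust_tokens` are hypothetical, and the real parser presumably works on grammar fragments rather than single words.

```python
import re

# Hypothetical vocabulary standing in for "words the grammar can handle".
KNOWN_WORDS = {"how", "to", "go", "from", "beijing", "shanghai"}

def robust_tokens(question, vocabulary=KNOWN_WORDS):
    """Split a question into tokens, keeping the parsable ones in order.

    Returns (kept, skipped): the kept tokens form the maximal subset of the
    input that the vocabulary covers; skipped tokens are simply ignored.
    """
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    kept = [t for t in tokens if t in vocabulary]
    skipped = [t for t in tokens if t not in vocabulary]
    return kept, skipped

kept, skipped = robust_tokens("Umm how to go from Beijing to Shanghai plz??")
print(kept)     # tokens that would be passed on to the search
print(skipped)  # tokens dropped as unparsable
```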

21 LEAP The rule to travel from one place to another TravelPath { => @from @to @route; @from => from |... ;... } Place { Beijing | Shanghai |... ; }

22 Example “How to go from Beijing to Shanghai?” LEAP parser returns the following result: How to go from place to place place
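The slot-filling idea behind the TravelPath rule can be approximated with a regular expression. This is a sketch only: it is not LEAP's grammar syntax or its actual output format, the place list contains just the two names shown on the slide, and `parse_travel_path` is a hypothetical name.

```python
import re

# The two places named on the slide; the rest of the list is elided there.
PLACES = ["Beijing", "Shanghai"]

def parse_travel_path(question, places=PLACES):
    """Return the origin and destination if the question matches 'from X to Y'."""
    place = "|".join(re.escape(p) for p in places)
    pattern = re.compile(
        rf"\bfrom\s+(?P<origin>{place})\s+to\s+(?P<destination>{place})\b",
        re.IGNORECASE,
    )
    match = pattern.search(question)
    if match is None:
        return None  # the question does not express this concept
    return {
        "concept": "TravelPath",
        "from": match.group("origin"),
        "to": match.group("destination"),
    }

print(parse_travel_path("How to go from Beijing to Shanghai?"))
```

A question that does not mention a known origin and destination yields `None`, which is where a system of this kind would fall back to partial parsing or keyword search.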

23 NL-Processor

24 Question Matching. Mapping from the question space to the concept space (using the concept-FAQ table). Mapping from the FAQ space to the template space. Mapping from the template space to the answer space.
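The chain of mappings on this slide can be pictured as a sequence of table lookups. The sketch below is illustrative only: every table entry, identifier, and the function name `find_answers` are invented, and the question-to-concept step is reduced to a keyword lookup for brevity.

```python
# All table contents below are hypothetical placeholders.
KEYWORD_TO_CONCEPT = {"go": "TravelPath", "travel": "TravelPath"}
CONCEPT_TO_FAQS = {"TravelPath": ["faq_travel_route"]}          # concept-FAQ table
FAQ_TO_TEMPLATE = {"faq_travel_route": "tmpl_route"}            # FAQ -> template
TEMPLATE_TO_ANSWER = {"tmpl_route": "Answer entry for travel routes (placeholder)"}

def find_answers(question):
    """Follow question -> concept -> FAQ -> template -> answer."""
    words = question.lower().rstrip("?").split()
    concepts = {KEYWORD_TO_CONCEPT[w] for w in words if w in KEYWORD_TO_CONCEPT}
    answers = []
    for concept in sorted(concepts):
        for faq_id in CONCEPT_TO_FAQS.get(concept, []):
            template_id = FAQ_TO_TEMPLATE.get(faq_id)
            answer = TEMPLATE_TO_ANSWER.get(template_id)
            if answer is not None:  # skip chains with a missing link
                answers.append(answer)
    return answers

print(find_answers("How to go from Beijing to Shanghai?"))
```

Keeping each mapping in its own table means a new FAQ or template can be added without touching the other stages.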

27 Query Log Mining. The system is purely data-driven. How to find the frequently asked questions in a large amount of user questions? Statistical query co-occurrence analysis; clustering; classification.
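One simple reading of query co-occurrence analysis is counting how often two queries appear in the same user session. The slide does not say what unit of co-occurrence is used, so the session grouping, the sample sessions, and the name `query_cooccurrence` are all assumptions made for this sketch.

```python
from collections import Counter
from itertools import combinations

def query_cooccurrence(sessions):
    """Count how many sessions contain each unordered pair of distinct queries."""
    pair_counts = Counter()
    for session in sessions:
        # Normalize case and whitespace, and count each query once per session.
        unique = sorted({q.strip().lower() for q in session})
        pair_counts.update(combinations(unique, 2))
    return pair_counts

# Invented sessions for illustration.
sessions = [
    ["cheap flights", "flights to shanghai"],
    ["Cheap flights", "flights to shanghai", "hotel shanghai"],
    ["hotel shanghai", "weather"],
]

counts = query_cooccurrence(sessions)
print(counts.most_common(1))  # the most frequently co-occurring pair
```

Pairs with high counts are candidates for grouping into one frequently asked question; clustering or classification could then operate on these counts.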

28 Multimedia Search. Initially limited to image-based search. Query by either text or example. User interface issues. Similarity measures: color space measures, feature space measures.
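As one example of a color space measure, two images can be compared by the intersection of their color histograms. The slide does not name a specific measure, so histogram intersection is chosen here purely for illustration; the function names and the tiny pixel lists are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized RGB histogram; pixels are (r, g, b) tuples with values 0..255."""
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each channel value 0..255 onto a bin index 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
    total = len(pixels)
    return [h / total for h in hist] if total else hist

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1.0 for identical normalized histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Invented four-pixel "images".
red = [(255, 0, 0)] * 4
dark_red = [(200, 10, 10)] * 4
blue = [(0, 0, 255)] * 4

print(histogram_intersection(color_histogram(red), color_histogram(dark_red)))
print(histogram_intersection(color_histogram(red), color_histogram(blue)))
```

With coarse bins, the two reds fall into the same bin and score as similar, while red and blue share no bins. A query-by-example search would rank database images by this score against the example image.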

29 Content-based Image/Video Retrieval (diagram). Labels: User's Query, Query by content, Search Engine, Image Database/WWW, Retrieval.

30 Face-based Retrieval

31 Conclusion. In the future, the knowledge of mankind will be unmanageable by current approaches. Future users will want precise answers to their questions, not millions of relevant or irrelevant web pages.

32 Conclusion (cont.). I think the next generation of search engines will be a mixture of QA systems and current keyword-based SEs such as Google. This depends strictly on future developments in AI, IR, and NLP techniques. Future search engines won't be just machines: they will read a web page, understand it, and answer our questions intelligently, like humans or maybe better!

