Web Information Retrieval, Prof. Alessandro Agostini. Context in Web Search, Steve Lawrence, IEEE Data Engineering Bulletin. Speaker: Antonella Delmestri.


1 Web Information Retrieval, Prof. Alessandro Agostini
Context in Web Search, Steve Lawrence, IEEE Data Engineering Bulletin, September 2000
Speaker: Antonella Delmestri

2 Abstract
Current web search engines treat search requests in isolation: they return identical results for identical queries, even when those queries come from different users in different contexts. This "one size fits all" model limits diversity, competition and functionality on the web. To address this problem, next-generation web search engines will make increasing use of context, either by using explicit or implicit context information from users or by adding functionality within restricted contexts.

3 Introduction
The web is increasingly representative of our society, and millions of different people publish and organize information on it. These people differ in background, knowledge and expectations, which makes the web very diverse in content and structure. Although the databases used in traditional information retrieval systems are strictly structured, current web search engines operate much like those systems. Web search engines return ranked lists of relevant documents in response to user queries, but only a few of these results may be valuable to a given user.

4 Introduction (continued)
The value of a result depends on the context of the query, i.e. the education, interests and experience of the user, plus further information about the current request. The major search engines make an enormous amount of information quickly and easily available and are widely used, but they have significant limitations:
- They are often out of date
- They index only a part of the publicly indexable web
- They do not index documents that require authentication
- They do not index all sites equally
The need for better web search engines is becoming increasingly important.

5 Adding context to web search
Understand the context of search requests:
- Adding explicit context information
- Automatically inferring implicit context information in personalized search engines
- Guessing what the user wants
Restrict the context of search engines:
- Specialized search engines
- Identifying communities on the web
- Locating specialized search engines

6 Adding explicit context information
Explicit context can be provided by the user as keywords added to a query (e.g. "home" or "homepage"). The Inquirus 2 project, developed at NEC Research Institute, asks the user to choose a category explicitly along with the query (e.g. "personal home pages", "research papers"). The system uses this context information:
- To select the search engines to send queries to
- To modify queries
- To select the ordering policy
Inquirus 2 has proven to be highly effective at improving search precision within given categories.
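The three uses of the category listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Inquirus 2 implementation: the policy table, the engine names, the added terms and the ordering labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CategoryPolicy:
    engines: tuple      # engines that should receive the query
    extra_terms: tuple  # terms appended to the user's query
    ordering: str       # label of the ordering policy to apply to results


# Hypothetical policy table: every name and term here is an assumption
# made for illustration, not data taken from Inquirus 2.
POLICIES = {
    "research papers": CategoryPolicy(
        engines=("general_engine", "paper_engine"),
        extra_terms=("abstract", "references"),
        ordering="prefer_pages_with_citations",
    ),
    "personal home pages": CategoryPolicy(
        engines=("general_engine",),
        extra_terms=("homepage",),
        ordering="prefer_short_urls",
    ),
}


def plan_search(query, category):
    """Turn a (query, category) pair into a search plan."""
    policy = POLICIES[category]
    modified_query = " ".join([query, *policy.extra_terms])
    return {
        "engines": list(policy.engines),
        "query": modified_query,
        "ordering": policy.ordering,
    }


plan = plan_search("steve lawrence", "personal home pages")
print(plan)
```

The same query submitted under "research papers" would go to a different set of engines with different added terms, which is the point of asking for the category up front.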

7 Inferring implicit context information: personalized search
The Watson project is an example of a client-based personalized search engine that extracts implicit context information (e.g. words, font size). It automatically infers the user's context from the content of the document currently being edited in Microsoft Word or viewed in Internet Explorer. Watson uses the inferred information to modify the user's query and forwards the new query to web search engines. Client-based systems like Watson have limited functionality.
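A minimal sketch of this kind of query modification, assuming the simplest possible context signal: the most frequent non-stopword terms of the current document. Watson itself uses richer cues such as font size; the stopword list, the function names and the sample document below are illustrative assumptions.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not Watson's).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for", "on", "with"}


def context_terms(document, k=3):
    """Return the k most frequent non-stopword terms of the document."""
    words = re.findall(r"[a-z]+", document.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    # Sort by descending frequency, then alphabetically, so output is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]


def augment_query(query, document, k=3):
    """Append context terms that are not already part of the query."""
    query_words = set(query.lower().split())
    extra = [t for t in context_terms(document, k) if t not in query_words]
    return " ".join([query, *extra])


doc = "Jaguar engines and jaguar gearboxes: a guide to car engines for car owners."
print(augment_query("jaguar", doc))  # prints "jaguar car engines"
```

With the car-related document open, the ambiguous query "jaguar" is forwarded with terms that steer the search engine away from the animal.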

8 Personalized search
A search engine that knows a user's previous requests and interests could use that information to tailor results. A server-based search engine like Google could infer user interests and rank results accordingly. Full-scale server-based personalization is currently too expensive for the major web search engines, although it should become more feasible over time. Personalized search services must also deal with the problems of consistency and privacy.

9 Guessing what the user wants
An increasingly common search engine technique is to guess what the user wants and return additional results and links related to the keywords of the query (e.g. stock symbols -> stock quote links + company information links). This technique is limited to cases where the potential context can be identified from the keyword query, and it can be improved by personalized search engines. Clustering search results into categories (e.g. "News") allows the user to narrow the results easily.
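The stock symbol example can be sketched as a lookup on the query keywords. This is a toy illustration under stated assumptions: the symbol table is a hypothetical stand-in for a maintained list, and the returned strings are labels rather than real links.

```python
# Hypothetical symbol table (an assumption for illustration only).
KNOWN_SYMBOLS = {"IBM": "International Business Machines"}


def guess_extras(query):
    """Return extra result labels for any query keyword that is a known symbol."""
    extras = []
    for token in query.split():
        company = KNOWN_SYMBOLS.get(token)
        if company is not None:
            extras.append(f"stock quote for {token}")
            extras.append(f"company information for {company}")
    return extras


print(guess_extras("IBM earnings"))
print(guess_extras("web search"))  # prints []
```

The second call shows the limitation named above: when no keyword reveals a potential context, the technique has nothing to add.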

10 Restricting the context of search engines
Another way to add context to web search is to restrict the context of the search engine itself, i.e. to create specialized search engines for specific domains. Thousands of specialized search engines already exist. Within their domain they provide more functionality than general web search engines and can retrieve information that is not accessible through conventional search engines (www.completeplanet.com).

11 Specialized search engines
CiteSeer (ResearchIndex) is the world's largest free, public specialized search engine for scientific literature, currently indexing over 300,000 articles and 3,000,000 citations. CiteSeer incorporates many features specific to scientific literature: it automates the creation of citation indices, provides easy access to the context of citations and offers functionality for extracting information found in research articles.

12 Identifying communities on the web
Domain-specific search engines need a method for locating the subset of the web that lies within their domain. One successful method is based on identifying web communities (Flake). A web community is defined as a collection of pages in which each member has more links inside the community than outside it. Because there is no central authority governing the formation of links on the web, this result is important: it allows communities to be identified independently of the specific words used inside pages.
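The definition of a web community can be checked directly on a small link graph. The sketch below only tests whether a given set of pages satisfies the definition; it is not the method used to discover communities. It assumes that a link counts in either direction (inbound or outbound), and the page names are hypothetical.

```python
from collections import defaultdict


def neighbours(links):
    """Build undirected neighbour sets from a mapping page -> pages it links to."""
    adj = defaultdict(set)
    for src, targets in links.items():
        for dst in targets:
            if src != dst:  # ignore self-links
                adj[src].add(dst)
                adj[dst].add(src)
    return adj


def is_community(links, members):
    """True if every member has more linked pages inside `members` than outside."""
    adj = neighbours(links)
    for page in members:
        inside = len(adj[page] & members)
        outside = len(adj[page] - members)
        if inside <= outside:
            return False
    return True


# Hypothetical toy graph: a, b, c are densely interlinked; c also links to x.
links = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "x"],
    "x": ["y", "z"],
    "y": ["x"],
}

print(is_community(links, {"a", "b", "c"}))  # prints True
print(is_community(links, {"c", "x"}))       # prints False
```

Note that the check uses only the link structure, never the text of the pages, which is why the approach is independent of the words used inside them.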

13 Locating specialized search engines
Many queries that would be best served by specialized services are likely to be sent to the major web search engines, because the overhead of locating a specialized engine is too great. Much research has been done in this area, and several methods of selecting search engines based on user queries have been proposed. It would be very useful if the major search engines tried to direct users to the best specialized search engine, but they have economic reasons not to provide such a service.

14 Does one size fit all?
All users receive the same responses to a given query, which seems to serve a great benefit of the web: equalizing access to information. In practice, however, not much appears to be equal on the web. Under the influence of search engines, the distribution of traffic is disproportionate, the majority of links go to a small number of very popular sites, and the results of queries follow unpredictable ranking criteria: "winners take all". Specialized search engines may provide less biased results (e.g. Yellow Pages), but most people use the major web search engines.

15 Conclusions
New search engines should make greater use of context, both to mitigate the negative effects of biased access to information on the web and to increase competition, diversity and functionality. Web search is today one of the most challenging problems of the Internet, and as it becomes a more important function within society, the need for better web search services keeps growing.

16 Thank you for your attention.

