Web Object Retrieval Zaiqing Nie, Yunxiao Ma, Shuming Shi, Ji-Rong Wen, Wei-Ying Ma, MSRA.



Main Topic
- How to treat information on the Web as structured objects whose attributes carry different weights according to their importance
- How to improve information retrieval performance, especially precision, by working with these structured objects

Proposed Models
- Unstructured Model
  - Each record is a bag of words
  - Takes into account the record extraction accuracy when merging different records
- Structured Model
  - Each record is a structured object with multiple attributes
  - Attributes have different weights according to their importance
  - Takes into account both record extraction accuracy and attribute extraction accuracy
- Hybrid Model
  - A tradeoff between the two models above
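The three models can be illustrated with a short sketch. The slides give no formulas, so everything here is an assumption made for illustration: the relevance of a query word to an object is taken as a linear combination of maximum-likelihood term probabilities, the field names `record_accuracy` and `attribute_accuracy` are hypothetical, and the hybrid model is reduced to a simple interpolation controlled by a hypothetical `mix` parameter. It is not the authors' formulation.

```python
from collections import Counter


def term_prob(word, text):
    """Maximum-likelihood estimate of P(word | text) for a bag of words."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return Counter(tokens)[word.lower()] / len(tokens)


def unstructured_score(word, records):
    """Treat each record as one bag of words, weighted by record accuracy."""
    total = 0.0
    for rec in records:
        text = " ".join(rec["attributes"].values())
        total += rec["record_accuracy"] * term_prob(word, text)
    return total


def structured_score(word, records, attr_weights):
    """Score each attribute separately, weight it by attribute importance,
    then weight the record by both extraction accuracies (assumed form)."""
    total = 0.0
    for rec in records:
        per_attr = sum(
            attr_weights.get(name, 0.0) * term_prob(word, value)
            for name, value in rec["attributes"].items()
        )
        total += rec["record_accuracy"] * rec["attribute_accuracy"] * per_attr
    return total


def hybrid_score(word, records, attr_weights, mix=0.5):
    """Assumed tradeoff: linear interpolation between the two scores."""
    return (mix * structured_score(word, records, attr_weights)
            + (1.0 - mix) * unstructured_score(word, records))
```

With `mix=1.0` the sketch reduces to the structured score and with `mix=0.0` to the unstructured one, which is one simple way to read "a tradeoff between the two models".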

Criticism
- Advantages
  - Shows clearly that neither a purely unstructured model nor a purely structured model gives the best retrieval precision, so a tradeoff between the two has to be made
  - Addresses how to handle multiple records that correspond to the same object, and how to handle the large amount of inconsistent information on the Web
- Disadvantages and Problems
  - The idea is not really novel: a lot of work in this area already takes advantage of the existing structure in information on the Web. The paper only uses the term "object" to make itself seem new
  - The formula used to estimate the query relevance of a document (or an object or attribute) does not seem well justified, since a TF/IDF measure can be much better than this kind of linear combination

Correlations
- Information Retrieval
- Object Retrieval
  - Bag of words (Jaccard)
  - Take advantage of structured data
  - NBC-like measure
- Information Integration
- Record Merging
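The slide names a bag-of-words Jaccard measure without defining it. A minimal sketch, assuming the standard set-based Jaccard coefficient over lowercased, whitespace-separated tokens (the tokenization and the function name `jaccard` are choices made here, not taken from the slides):

```python
def jaccard(text_a, text_b):
    """Jaccard coefficient between the word sets of two texts."""
    a = set(text_a.lower().split())
    b = set(text_b.lower().split())
    if not a and not b:
        # Two empty texts share nothing to compare; return 0.0 by convention.
        return 0.0
    return len(a & b) / len(a | b)
```

For example, `jaccard("web object retrieval", "object retrieval models")` shares two of four distinct words, so the coefficient is 0.5.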
