
1 WIRED Week 3
Syllabus Update (next week)
Readings Overview
- Quick Review of Last Week’s IR Models (if time)
- Evaluating IR Systems
- Understanding Queries
Assignment Overview & Scheduling
- Leading WIRED Topic Discussions
- Web Information Retrieval System Evaluation & Presentation
Projects and/or Papers Discussion
- Initial Ideas
- Evaluation
- Revise & Present

2 Evaluating IR Systems
Recall and Precision
Alternative Measures
Reference Collections
- What
- Why
Trends

3 Why Evaluate IR Systems?
Leave it to the developers?
- No bugs
- Fully functional
Let the market (users) decide?
- Speed
- (Perceived) accuracy
Relevance is relevant
- Different types of searches, data, and users
“How precise is the answer set?” (p. 73)

4 Retrieval Performance Evaluation
Task
- Batch or interactive
- Each needs a specific interface
Setting Context
- New search
- Monitoring
Usability
- Lab tests
- Real-world (search log) analysis

5 Recall and Precision
Basic evaluation measurements for IR system performance
Recall: the fraction of the relevant documents that are retrieved
- 100% is perfect recall
- Every document that is relevant is found
Precision: the fraction of retrieved documents that are relevant
- 100% relevancy is perfect precision
- How clean the retrieved set is
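As a reference, a sketch in the standard set-based notation, where R is the set of relevant documents, A the retrieved answer set, and R_a = R ∩ A the relevant documents actually retrieved:

$$\mathrm{Recall} = \frac{|R_a|}{|R|} \qquad \mathrm{Precision} = \frac{|R_a|}{|A|}$$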

6 Recall and Precision Goals
Everything is found (recall)
The right set of documents is pulled from the found set (precision)
What about ranking?
- Is ranking an absolute measure of relevance for the query?
- Ranking is ordinal in almost all cases

7 Recall and Precision Considered
100 documents have been analyzed
10 documents in the set are relevant to the query
- 4 documents are found and all are relevant: ??% recall, ??% precision
- 8 documents are found, but only 4 are relevant: ??% recall, ??% precision
Which is more important?
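A minimal sketch of the arithmetic behind the two cases above (plain Python; the helper name is illustrative):

def recall_precision(relevant_retrieved, total_relevant, total_retrieved):
    # Recall: relevant found / all relevant; Precision: relevant found / all retrieved
    recall = relevant_retrieved / total_relevant
    precision = relevant_retrieved / total_retrieved
    return recall, precision

# Case 1: 4 found, all 4 relevant, out of 10 relevant documents
print(recall_precision(4, 10, 4))  # (0.4, 1.0) -> 40% recall, 100% precision
# Case 2: 8 found, only 4 relevant
print(recall_precision(4, 10, 8))  # (0.4, 0.5) -> 40% recall, 50% precision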

8 Are Recall and Precision Appropriate?
Disagreements over what the perfect answer set is
User errors in using results
Redundancy of results
- Result diversity
- Metadata
Dynamic data
- Is it indexable?
- Recency of information may be key
A single measure is better
- Combined measures
- User evaluation

9 Back to the User
User evaluation
Is one answer good enough?
Rankings
Satisficing
Studies of relevance are key

10 Other Evaluation Measures
Harmonic Mean
- Single, combined measure
- Between 0 (none) and 1 (all)
- Only high when both P & R are high
- Still a percentage
E measure
- User determines (via a parameter) the relative value of R & P
- Different tasks (legal, academic)
- An interactive search?
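For reference, the formulas as usually given, where b is the user-set parameter mentioned above (b = 1 reduces E to 1 − F):

$$F = \frac{2}{\frac{1}{R} + \frac{1}{P}} = \frac{2PR}{P + R} \qquad E = 1 - \frac{1 + b^2}{\frac{b^2}{R} + \frac{1}{P}}$$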

11 Coverage and Novelty
System effects
- Relative recall
- Relative effort
More natural, user-understandable measures
User knows some % of the documents are relevant
Coverage = % of those known relevant documents that the system retrieves
Novelty = % of the relevant documents retrieved that the user didn’t know of
- Content of document
- Document itself
- Author of document
- Purpose of document
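A sketch of how these are usually made precise, assuming U is the set of relevant documents the user already knew about, R_k the known relevant documents that were retrieved, and R_u the previously unknown relevant documents that were retrieved:

$$\mathrm{Coverage} = \frac{|R_k|}{|U|} \qquad \mathrm{Novelty} = \frac{|R_u|}{|R_u| + |R_k|}$$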

12 Reference Collections
Testbeds for IR evaluation
TREC (Text REtrieval Conference) collections
- Industry focus
- Topic-based or general
- Summary tables for tasks (queries)
- R & P averages
- Document analysis
- Measures for each topic
CACM (general CS)
ISI (academic, indexed, industrial)

13 Trends in IR Evaluation
Personalization
Dynamic data
Multimedia
User modeling
Machine learning (CPU/$)

14 Understanding Queries
Types of queries:
- Keyword
- Context
- Boolean
- Natural language
Pattern matching
- More like this…
- Metadata
Structural environments

15 Boolean Queries
AND, OR, NOT
- Used in combination or individually
- Parsed by the system as a decision tree
Not so easy for the user when queries get advanced
Hard to backtrack and see differences in results
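A minimal sketch of how a system can evaluate Boolean operators against an inverted index, using made-up terms and document ids (Python):

# Toy inverted index: term -> set of document ids containing that term
index = {
    "web":        {1, 2, 4},
    "retrieval":  {2, 3, 4},
    "evaluation": {3, 4},
}

print(index["web"] & index["retrieval"])         # web AND retrieval        -> {2, 4}
print(index["web"] | index["evaluation"])        # web OR evaluation        -> {1, 2, 3, 4}
print(index["retrieval"] - index["evaluation"])  # retrieval NOT evaluation -> {2}

Each operator maps onto a set operation, which is why a nested query can be parsed into a tree and evaluated bottom-up, even though that tree is not easy for users to inspect or backtrack through.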

16 Keyword Queries
Single word (most common)
- Sets
- “Phrases”
Context
- “Phrases”
- Near (a distance value in characters, words, or document links)
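A minimal sketch of a word-distance “near” check, assuming positions are word offsets within one document (the sample sentence and names are illustrative):

def near(positions_a, positions_b, k):
    # True if some occurrence of term A is within k words of some occurrence of term B
    return any(abs(a - b) <= k for a in positions_a for b in positions_b)

doc = "information retrieval systems evaluate retrieval quality".split()
positions = {}
for i, word in enumerate(doc):
    positions.setdefault(word, []).append(i)

print(near(positions["information"], positions["retrieval"], 2))  # True  (adjacent words)
print(near(positions["systems"], positions["quality"], 2))        # False (3 words apart)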

17 Natural Language Queries
Asking
Quoting
Fuzzy matches
Different evaluation methods might be needed
Dynamic data “indexing” problematic
Multimedia challenges

18 Pattern Matching
Words
Prefixes: “comput*”
Suffixes: “*ology”
Substrings: “*exas*”
Ranges: “four ?? years ago”
Regular Expressions (grep)
Error threshold
User errors
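The wildcard patterns above map directly onto regular expressions; a small Python sketch with made-up sample text:

import re

text = "computer computing recompute biology geology Texas texted four 20 years ago"
print(re.findall(r"\bcomput\w*", text))          # prefix    comput*
print(re.findall(r"\w*ology\b", text))           # suffix    *ology
print(re.findall(r"\w*exas\w*", text))           # substring *exas*
print(re.findall(r"four \d\d years ago", text))  # range     "four ?? years ago" (two digits assumed)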

19 Query Protocols
HTTP
Z39.50
- Client–server API
WAIS
- Information/database connection
ODBC
JDBC
P2P

20 Assignment Overview & Scheduling
Leading WIRED Topic Discussions
- # in class = # of weeks left?
Web Information Retrieval System Evaluation & Presentation
- 5-page written evaluation of a Web IR system
- Technology overview (how it works)
- A brief history of the development of this type of system (why it works better)
- Intended uses for the system (who, when, why)
- (Your) examples or case studies of the system in use and its overall effectiveness

21 How can (Web) IR be better?
- Better IR models
- Better user interfaces
More to find vs. easier to find
Scriptable applications
New interfaces for applications
New datasets for applications
Projects and/or Papers Overview

22 Project Idea #1 – simple HTML Graphical Google
What kind of document?
When was the document created?

23 Project Ideas
Google History: keeps track of what I’ve seen and not seen
Searching when it counts: financial and health information requires guided, quality search

