Experimental Components for the Evaluation of Interactive Information Retrieval Systems
Pia Borlund
Presented by Dawn Filan, 3/30/04, 610:551

The Goal
To evaluate IR systems in a way that is as close to the actual information-seeking process as possible, while still operating in a controlled environment.

Research Questions
Can simulated information needs be substituted for real information needs?
What makes a good simulated situation, with reference to the semantic openness and the topic types of the simulated situations?

Hybrid Evaluation Model
Increased demand for a new evaluation approach:
– Relevance revolution
– Cognitive revolution
– Interactive revolution
Combines the two main approaches:
– System-driven approach (controlled)
– Cognitive user-centered approach (realism)

The Experimental Setting
Three components:
– The involvement of potential users as test persons
– The application of dynamic and individual information needs
– The use of dynamic relevance judgements

Ideal IIR Setting
Real users state personal information needs to the system and judge the relevance of the retrieved documents.
Makes use of the “simulated work task”.
Circumstances must be controlled so that results can be compared across systems and user groups.

Simulated Work Task
Triggers and develops a simulated information need by allowing for the user’s interpretation of the situation.
Serves as the platform against which situational relevance is measured.
Two variants were applied (see the sketch below):
– Complete need applied (sim 1)
– Only the situation applied (sim 2)
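To make the two variants concrete, here is a minimal sketch of how a simulated work task could be represented, assuming sim 1 pairs the situation with an indicative request while sim 2 gives the situation alone. The class, field names, and example wording are illustrative assumptions, not Borlund’s actual test material:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SimulatedWorkTask:
    """One simulated work task situation (hypothetical representation)."""
    situation: str                     # cover story that triggers the simulated need
    indicative_request: Optional[str]  # suggested request; None in the sim 2 variant

    def present(self) -> str:
        """Render the task as it would be shown to a test person."""
        text = f"Simulated situation:\n{self.situation}\n"
        if self.indicative_request is not None:
            text += f"Indicative request:\n{self.indicative_request}\n"
        return text

# sim 1: the complete need (situation plus an indicative request)
sim1 = SimulatedWorkTask(
    situation=("You are graduating soon and want an overview of the "
               "job market in your field before applying."),
    indicative_request="Find information on current job opportunities in IT.",
)

# sim 2: the situation only; the test person formulates the request themselves
sim2 = SimulatedWorkTask(situation=sim1.situation, indicative_request=None)
print(sim1.present())
```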

Situational Relevance
A user-centered, realistic, and dynamic measure of relevance.
Judgements are not based on the request or query, but rather relate to the person’s requirements and mental state at the time of retrieval.
Assessed continuously and interactively during the session.
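Because situational relevance is assessed continuously, an evaluation system has to timestamp each judgement rather than collect a single end-of-search score. A minimal sketch, assuming an illustrative four-point scale; the scale and field names are assumptions, not Borlund’s exact instrument:

```python
import time
from dataclasses import dataclass, field
from typing import List

@dataclass
class RelevanceJudgement:
    doc_id: str       # identifier of the judged document
    score: int        # e.g. 0 = not relevant ... 3 = highly relevant (assumed scale)
    timestamp: float  # when in the session the judgement was made

@dataclass
class SessionLog:
    user_id: str
    task_id: str
    judgements: List[RelevanceJudgement] = field(default_factory=list)

    def judge(self, doc_id: str, score: int) -> None:
        # timestamp each judgement so later analysis can relate the score
        # to the evolving state of the search session
        self.judgements.append(RelevanceJudgement(doc_id, score, time.time()))

log = SessionLog(user_id="u07", task_id="sim1-2")
log.judge("FT934-5418", 3)
```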

Relevance (Schamber, Eisenberg, and Nilan)
A multidimensional cognitive concept whose meaning is dependent on users’ perceptions of information and their information needs.
A dynamic concept that depends on users’ judgements of the quality of the relationship between information and the information need.
A complex but systematic and measurable concept, if approached conceptually and operationally from the user’s perspective.

Meta-Evaluation
Should simulated work tasks be recommended as a component of the experimental setting for evaluating IIR systems?

Meta-Evaluation Questions
– The possibility of substituting real information needs with simulated information needs through the application of simulated work task situations
– Whether the variants of the simulated task make any difference to the test persons’ treatment of the information need
– What characterizes a good simulated work task, in terms of how tailored the task should be to the user

Test Setting
– Full-text online system using TREC data and a probabilistic retrieval engine
– Search activity and relevance scores were logged (a sketch of such a log follows)
– 24 users from various academic backgrounds and education levels
– Each user was asked to prepare a personal information need
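A sketch of what the activity logging might look like; the event vocabulary, file format, and example values are assumptions for illustration, not a description of the actual test system:

```python
import csv
import time

# append one row per user action; a flat event log like this is enough to
# reconstruct a search session for later analysis
def log_event(path: str, user_id: str, task_id: str, event: str, detail: str) -> None:
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([f"{time.time():.3f}", user_id, task_id, event, detail])

# example events from one (invented) session
log_event("session_log.csv", "u07", "sim1-2", "query", "job opportunities IT europe")
log_event("session_log.csv", "u07", "sim1-2", "view_doc", "FT934-5418")
log_event("session_log.csv", "u07", "sim1-2", "relevance", "FT934-5418=3")
```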

Testing Procedure
– Brief questionnaire
– Introduction
– Explanation of the test person’s role
– Demo of the system
– Execution of 6 search tasks (1 training task, 1 real task, 4 simulated tasks)
– Post-search interview
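Rotating task order across participants is a standard way to control for order and learning effects in a procedure like this. A sketch of one possible cyclic rotation scheme; this is an assumption about how the assignment could be implemented, not Borlund’s documented design:

```python
from typing import List

def rotated_orders(tasks: List[str]) -> List[List[str]]:
    """Cyclic rotations of the task list: each task appears in each
    serial position exactly once across the set of orders."""
    n = len(tasks)
    return [[tasks[(start + i) % n] for i in range(n)] for start in range(n)]

simulated = ["sim_A", "sim_B", "sim_C", "sim_D"]
orders = rotated_orders(simulated)

# assign the 24 participants to the 4 orders round-robin; every participant
# also completes the training task and the real task
for p in range(24):
    print(f"participant {p + 1:02d}: training, real, {orders[p % len(orders)]}")
```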

Conclusions
– Real information needs can be substituted with simulated information needs through the application of simulated work tasks.
– Simulated and real information needs can be mixed within the same test.
– Treatment of the information need did not differ between the group that received the work task and the request, and the group that received just the work task.
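A finding of “no difference in treatment” between the two groups is the kind of claim one could check with a non-parametric test on a behavioural measure such as query reformulation counts. A hedged sketch with invented data, purely to show the mechanics of such a comparison:

```python
from scipy.stats import mannwhitneyu

# hypothetical per-participant counts of query reformulations; the values
# are invented purely to illustrate the comparison
sim1_group = [3, 5, 2, 4, 6, 3, 4, 5, 2, 4, 3, 5]  # saw situation + request
sim2_group = [4, 3, 5, 4, 2, 6, 3, 4, 5, 3, 4, 2]  # saw situation only

stat, p = mannwhitneyu(sim1_group, sim2_group, alternative="two-sided")
# a large p-value is consistent with "no difference in treatment of the
# information need" between the two variants
print(f"U = {stat}, p = {p:.3f}")
```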