Using Cross-evaluation to evaluate interactive QA systems


1 Using Cross-evaluation to evaluate interactive QA systems
Ying Sun, Associate Professor, Department of Library and Information Studies

2 Cross Evaluation (X-Eval)
A systematic method focusing on assessing the differential contribution of systems to the user’s final results.
Interactive information systems
Two entities: system and individual
System effect on users’ end-products

3 Cross Evaluation - Process

4 Cross Evaluation - Analysis
General linear model
The measurement score y for task t, done using system s, by user u, as assessed by judge j, is given in first approximation by a linear expression (the equation itself is not preserved in this transcript).
B: self-judgment bias variable; b = 0 when u ≠ j, b = 1 when u = j
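One plausible form of such a linear expression, written only as a sketch: it assumes purely additive main effects for task, system, user and judge. The symbols \mu, \alpha, \beta, \gamma, \delta and \varepsilon are placeholders introduced here, not taken from the slide; only y, t, s, u, j, B and b come from the text above.

```latex
y_{tsuj} = \mu + \alpha_t + \beta_s + \gamma_u + \delta_j + B\, b_{uj} + \varepsilon_{tsuj},
\qquad
b_{uj} =
\begin{cases}
1 & \text{if } u = j \\
0 & \text{if } u \neq j
\end{cases}
```

Under this reading, B is the size of the self-judgment bias: the amount by which a score shifts when an analyst judges his or her own product.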

5 Experimental Design

6 Cross Evaluation Criteria
Seven characteristics:
Covers the important ground
Avoids the irrelevant materials
Avoids redundant information
Includes selective information
Is well organized
Reads clearly and easily
Overall rating
6/25/2018 Ying Sun

7 Possible Effects
4 systems: S1, S2, S3 and S0
7* analysts (as authors): 1 – 7
8 scenarios: A – H
4 observers: I – IV
7* analysts (as judges): 1 – 7
Self judgment
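The self-judgment effect listed on this slide can be illustrated with a small sketch. It assumes, purely for illustration, that each of the 7 analysts judges the product of every analyst, including their own; the actual assignment of judges to products is not given in the transcript. The function name `self_judgment` is hypothetical.

```python
from itertools import product

# 7 analysts act both as authors and as judges (numbered 1-7 on the slide).
ANALYSTS = range(1, 8)


def self_judgment(author: int, judge: int) -> int:
    """Return 1 when the judge is also the author of the product, else 0."""
    return 1 if author == judge else 0


# ASSUMPTION: a fully crossed design in which every analyst judges every author.
pairs = [(a, j, self_judgment(a, j)) for a, j in product(ANALYSTS, ANALYSTS)]
n_self = sum(flag for _, _, flag in pairs)

print(f"{n_self} of {len(pairs)} author-judge pairs are self-judgments")
```

In a fully crossed design only the diagonal pairs carry the indicator, which is why the bias can be estimated separately from the author and judge effects.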

8 Analytical Model - DVs
Leading factor of the 7 characteristics
If the instrument has a balanced set of questions that accurately reflect the decision makers’ concerns, then factor analysis is a good way to summarize them. The leading factor accounts for 79% of the variance.
The 7 characteristics individually
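How a single leading dimension can summarize several correlated ratings is sketched below. This is not the authors' procedure: it substitutes the first principal component of the correlation matrix (found by power iteration) for a fitted factor model, and the ratings are made-up numbers for three hypothetical characteristics, not data from the study.

```python
import math


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows`."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[c] for r in rows) / n for c in range(k)]
    sds = [
        math.sqrt(sum((r[c] - means[c]) ** 2 for r in rows) / (n - 1))
        for c in range(k)
    ]
    z = [[(r[c] - means[c]) / sds[c] for c in range(k)] for r in rows]
    return [
        [sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(k)]
        for a in range(k)
    ]


def leading_component(corr, iterations=500):
    """Power iteration: return (largest eigenvalue, unit-length eigenvector)."""
    k = len(corr)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector gives the eigenvalue.
    eigenvalue = sum(
        v[a] * sum(corr[a][b] * v[b] for b in range(k)) for a in range(k)
    )
    return eigenvalue, v


# MADE-UP ratings: rows are judged reports, columns are three characteristics.
ratings = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
]

corr = correlation_matrix(ratings)
eigenvalue, loadings = leading_component(corr)
# The trace of a correlation matrix equals the number of variables,
# so eigenvalue / k is the share of variance on the leading component.
explained = eigenvalue / len(corr)
print(f"share of variance on the leading component: {explained:.2f}")
```

The score of each report on this component (the loadings applied to its standardized ratings) could then serve as a single dependent variable in place of the individual characteristics.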

9 Results - System effect

10 Results - System effect
Post-hoc Scheffe analysis
        s2      s0      s3
s1     .30     .37*    .44**
s2             .06     .14
s0                     .07

11 Results – self judgment bias

12 Conclusion
The X-Eval method
can effectively reveal differences as small as those attributable to systems, in spite of the very large effects of tasks and users, with a very small number of participants.
does not rely on pre-determined relevance judgments
is a successful model for the “3-realities” paradigm: real users, real problems and real systems

