
1 MEASUREMENT AND MODELING OF A WEB-BASED QUESTION ANSWERING SYSTEM
Chunyi Peng, Zaoyang Gong, Guobin Shen
Microsoft Research Asia, HotWeb 2006

2 Outline
- A short introduction to web-based QA systems
- QA measurement: behavior patterns over time, topics, and users, and incentive effects
- QA modeling
- Discussion: how can it be better?

3 When you have a question…
- Solve it yourself! Ooh, out of our scope!
- Usually, search for it! A common and good way in many cases, but:
  - Search engines typically return pages of links, not direct answers.
  - Sometimes it is very difficult for people to describe their questions in a precise way.
  - Not all information is readily available on the web.
- So, ask! A natural and effective way:
  - Question answering (QA) utilizes grassroots intelligence and collaboration, especially as a specific means of information acquisition.

4 Difference from other QA systems
- Different from AI-style QA (dating back to the 1960s), which tries to eliminate semantic ambiguity:
  - Uses the web as a QA resource: search plus natural-language I/O
  - Limited to fact- or knowledge-based questions
- However, many questions are:
  - communication-specific
  - location-specific
  - time-specific
- Another (interactive) QA approach: enable grassroots intelligence and collaboration

5 So, our goals…
- Measurement and modeling of a real large-scale QA system
  - How does a real QA system work?
  - What are the typical user behaviors and their impacts?
- Seeking a better QA system
  - How to design a QA system?
  - How to make performance tradeoffs?

6 iAsk (http://iask.sina.com.cn)
- A topic-based web-QA system
- Question lifecycle: questioning -> waiting for replies -> confirmation (closed); a lifecycle sketch follows
- Provides optimal-reply selection and reply rewarding
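As an illustration of the lifecycle named on this slide, here is a minimal Python sketch of the question state machine. The state names and transition function are hypothetical; the slide does not describe Sina's actual implementation.

```python
from enum import Enum, auto

# Hypothetical sketch of the iAsk question lifecycle; state and
# function names are illustrative assumptions, not Sina's API.
class QuestionState(Enum):
    QUESTIONING = auto()   # question just posted
    WAITING = auto()       # open, collecting replies
    CLOSED = auto()        # asker confirmed a best reply

def confirm_best_reply(state: QuestionState) -> QuestionState:
    """Closing transition: only an open question can be confirmed."""
    if state is not QuestionState.WAITING:
        raise ValueError("only an open question can be closed")
    return QuestionState.CLOSED
```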

7 Measurement Results
Data set:
- 2 months (Nov 22, 2005 to Jan 23, 2006)
- 350K questions and 2M replies
- 220K users, 1,901 topics
Measurements:
- Question/reply patterns over time
- Question/reply patterns over topics
- Question/reply patterns across users
- Question/reply incentive mechanisms

8 Behavior Pattern over Time
On an hourly scale: a consistent usage pattern

9 Behavior Pattern over Topics
Topic characteristics:
- P: Popularity (#Q), which follows a Zipf distribution over questioning and replying activities
- Q: Question proneness (#Q/#U), the likelihood that a user will ask a question
- R: Reply proneness (#R/#U), the likelihood that a user will reply to a question
Our measurement shows that topic characteristics vary widely and that users behave quite differently across topics (see the sketch below).
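A minimal Python sketch of how the three per-topic characteristics could be computed from an activity log. The record layout (topic, user, kind) is a hypothetical assumption for illustration, not the paper's actual data format.

```python
from collections import defaultdict

# Hypothetical log layout: each record is (topic, user, kind),
# where kind is "Q" for a question or "R" for a reply.
def topic_metrics(records):
    q_count = defaultdict(int)   # questions per topic (#Q)
    r_count = defaultdict(int)   # replies per topic (#R)
    users = defaultdict(set)     # distinct users per topic (#U)
    for topic, user, kind in records:
        users[topic].add(user)
        if kind == "Q":
            q_count[topic] += 1
        else:
            r_count[topic] += 1
    metrics = {}
    for topic, u in users.items():
        n_users = len(u)
        metrics[topic] = {
            "P": q_count[topic],            # popularity: #Q
            "Q": q_count[topic] / n_users,  # question proneness: #Q/#U
            "R": r_count[topic] / n_users,  # reply proneness: #R/#U
        }
    return metrics
```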

10 Behavior Pattern across Users Active and non-active users about 9% users to 80% replies VS. about 22% users to 80% questions asymmetric questioning/replying pattern 4.7% altruists VS. 17.7% free-riders Narrow user interests #topic (Q): 1.8 #topic (R): 3.3
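One way to obtain figures like "9% of users produce 80% of replies" is to find the smallest fraction of top contributors covering a target share of activity. This Python sketch assumes a simple user-to-count mapping; it is illustrative, not the paper's measurement code.

```python
def top_contributor_share(counts, coverage=0.8):
    """Smallest fraction of users accounting for `coverage` of the
    total activity. `counts` maps each user to an activity count
    (e.g. number of replies). Illustrative sketch only."""
    total = sum(counts.values())
    covered, n_users = 0, 0
    for c in sorted(counts.values(), reverse=True):
        covered += c
        n_users += 1
        if covered >= coverage * total:
            break
    return n_users / len(counts)

# e.g. top_contributor_share(replies_per_user) ~= 0.09 would match
# the observation that 9% of users produce 80% of replies.
```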

11 Performance Metrics
- Reply-Rate: how likely a question is to be replied to
- Reply-Number: how many replies a question receives, i.e. how likely it is to contain the expected answer
- Reply-Latency: how quickly an answer arrives
A sketch of computing these metrics from logs appears below.
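A minimal Python sketch computing the three metrics from a question log. The dictionary layout (numeric timestamps, a list of reply times per question) is a hypothetical assumption, not the paper's data format.

```python
from statistics import mean

# Hypothetical layout: each question is a dict with "asked_at"
# (numeric timestamp) and "reply_times" (timestamps of its replies).
def qa_performance(questions):
    replied = [q for q in questions if q["reply_times"]]
    reply_rate = len(replied) / len(questions)
    reply_number = mean(len(q["reply_times"]) for q in questions)
    # latency to the first reply, over questions that got any reply
    reply_latency = mean(min(q["reply_times"]) - q["asked_at"]
                         for q in replied)
    return reply_rate, reply_number, reply_latency
```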

12 iAsk Performance
Long-term performance:
- Reply-Rate: 99.8%
- Reply-Number: about 5
- Reply-Latency: about 10 hr
Within 24 hrs:
- Reply-Rate: 85%
- Reply-Number: about 4
- Reply-Latency: about 6 hr
In summary, the performance is quite satisfactory, except that users sometimes need to tolerate a relatively long delay.

13 Measurement of the Incentive Mechanism

14 Modeling
- Question arrival distribution: Poisson
- Reply behavior: an approximately exponentially decaying model
- Performance formulas
- Definition of dynamic performance
A simulation sketch of this model follows the list.
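A minimal Python simulation of the two modeling assumptions named on this slide: Poisson question arrivals, and a per-question reply rate that decays exponentially with question age. All parameter values are illustrative assumptions, not the paper's fitted values.

```python
import math
import random

LAM_Q = 10.0  # question arrival rate (questions/hour), Poisson assumption
R0 = 1.0      # initial reply rate per question (replies/hour), assumed
TAU = 5.0     # decay time constant of reply interest (hours), assumed

def next_question_gap():
    """Inter-arrival time between questions (exponential, i.e. Poisson arrivals)."""
    return random.expovariate(LAM_Q)

def simulate_question(horizon=24.0, dt=0.01):
    """Reply times for one question over `horizon` hours, with
    instantaneous reply rate R0 * exp(-t / TAU) (thinning by small dt)."""
    t, replies = 0.0, []
    while t < horizon:
        if random.random() < R0 * math.exp(-t / TAU) * dt:
            replies.append(t)
        t += dt
    return replies

if __name__ == "__main__":
    runs = [simulate_question() for _ in range(1000)]
    reply_rate = sum(1 for r in runs if r) / len(runs)
    reply_number = sum(len(r) for r in runs) / len(runs)
    first = [r[0] for r in runs if r]
    print(f"reply rate {reply_rate:.2%}, avg replies {reply_number:.1f}, "
          f"avg first-reply latency {sum(first) / len(first):.1f} h")
```

With these assumed parameters the expected reply count over 24 h is roughly R0 * TAU * (1 - e^(-24/TAU)) ≈ 5, in the same range as the measured iAsk numbers on slide 12.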

15 Parameter Impact

16 Possible Improvements
- Active or push-based question delivery
- Better webpage layout, e.g. adding shortcuts
- Better incentive mechanisms
- Utilizing the power of social networks

17 Conclusions
- Web-QA that leverages grassroots intelligence and collaboration is hot and getting hotter.
- Our measurement and model revealed that a QA system's QoS depends heavily on three key factors: user scale, user reply probability, and system design artifacts such as webpage layout.
- The current simple web-QA system achieves acceptable performance, but there is still room for improvement.

18 Backup

19 Behavior Pattern over Topics
Topic characteristics: P, popularity (#Q), Zipf-distributed

20 Behavior Pattern over Topics
Topic characteristics:
- P: Popularity (#Q), Zipf-distributed
- Q: Question proneness (#Q/#U)
- R: Reply proneness (#R/#U)

21 Narrow User Interest Scope

22 Reply distribution (measured)

23 Static Performance Formulas: Reply-Rate, Reply-Number, Reply-Latency (a hedged reconstruction follows)
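The actual formulas appear as images on the original slide and are not recoverable from this transcript. As a hedged reconstruction consistent with the model stated on slide 14 (Poisson question arrivals, exponentially decaying per-question reply rate), one plausible form for the three static metrics of a single question is sketched below; all symbols are assumptions, not the paper's notation.

```latex
% Hedged reconstruction, not the paper's exact formulas.
% Assume a question's replies form a nonhomogeneous Poisson process
% with rate \lambda(t) = \lambda_0 e^{-t/\tau}.
\begin{align}
  \Lambda(t) &= \int_0^t \lambda_0 e^{-s/\tau}\,ds
              = \lambda_0 \tau \bigl(1 - e^{-t/\tau}\bigr) \\
  \text{Reply-Number}(t) &= \Lambda(t) \\
  \text{Reply-Rate}(t) &= \Pr[\text{at least one reply by } t]
              = 1 - e^{-\Lambda(t)} \\
  \text{Reply-Latency} &= \mathbb{E}[T \mid T < \infty]
              = \frac{\int_0^\infty \bigl(e^{-\Lambda(t)} - e^{-\Lambda(\infty)}\bigr)\,dt}
                     {1 - e^{-\Lambda(\infty)}}
\end{align}
```

The conditioning in the latency formula accounts for questions that never receive a reply, since \(\Lambda(\infty) = \lambda_0 \tau\) is finite under this decay model.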

24 Dynamic Performance Formula: define the dynamic performance and derive the corresponding expressions.

