Mining Bulletin Board Systems Using Community Generation Ming Li, Zhongfei (Mark) Zhang, and Zhi-Hua Zhou PAKDD’08 Reporter: Che-Wei, Liang Date: 2008.07.10.


1 Mining Bulletin Board Systems Using Community Generation
Ming Li, Zhongfei (Mark) Zhang, and Zhi-Hua Zhou
PAKDD’08
Reporter: Che-Wei, Liang
Date: 2008.07.10

2 Outline
Introduction
General Model
Interest-Sharing Group Identification
Predicting User Behavior Using Generated Community
Experiment

3 Introduction
Bulletin Board System (BBS)
– Platform for exchanging and sharing information
– Consists of a number of boards
– Users can read/post messages on different topics
Users with similar interests may take similar actions
Effective discovery of relationships between users of a BBS is essential

5 General Model
Consider the posted messages
– Use the title alone to determine the topics of a message
– Extract key words from the titles
– Map the key words to the collected topics
A BBS user tends to join discussions on topics that he or she is interested in
– Messages that a user posts may reflect the user’s interests
– Users’ interests are time-dependent
– The frequency of posted messages should also be assessed
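
The title-to-topic step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the keyword table, the whitespace tokenizer, and the function names are hypothetical, and the slides do not say how key words are actually extracted or which topics were collected.

```python
from collections import Counter

# Hypothetical keyword-to-topic table; the slides do not list the real topics.
KEYWORD_TO_TOPIC = {
    "python": "programming",
    "compiler": "programming",
    "tank": "modern weapons",
    "missile": "modern weapons",
    "rpg": "computer games",
}

def topics_of_title(title):
    """Return the set of topics whose key words occur in a message title."""
    words = title.lower().split()  # naive whitespace tokenizer (assumption)
    return {KEYWORD_TO_TOPIC[w] for w in words if w in KEYWORD_TO_TOPIC}

def topic_frequencies(titles):
    """Count how many of a user's titles touch each topic."""
    counts = Counter()
    for title in titles:
        counts.update(topics_of_title(title))
    return counts

freqs = topic_frequencies(
    ["Python compiler question", "New tank design", "python tips"]
)
```

Each title contributes at most once per topic, so a title with two key words for the same topic does not inflate that topic's frequency.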

6 General Model
Access pattern of BBS users
– View of Topics: a set of topics and user access frequencies of the messages posted to different boards by different users along the timeline
– View of Boards: a set of boards and frequencies of messages posted to the boards along the timeline

7 General Model
BBS model
– A collection of users, each represented by two timelines of actions: one on the Boards view and one on the Topics view
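
One possible in-memory shape for this model is sketched below. The class name `UserTimelines`, the integer day index, and the example board and topic names are assumptions made for illustration; the slides only state that each user has a timeline per view.

```python
from collections import defaultdict
from dataclasses import dataclass, field

def _timeline():
    # Maps a time slot to {item name: frequency of actions in that slot}.
    return defaultdict(lambda: defaultdict(int))

@dataclass
class UserTimelines:
    """A user represented by two timelines of actions (hypothetical layout)."""
    user_id: str
    boards: dict = field(default_factory=_timeline)  # Boards view
    topics: dict = field(default_factory=_timeline)  # Topics view

    def record_post(self, day, board, topic):
        """Register one posted message in both views."""
        self.boards[day][board] += 1
        self.topics[day][topic] += 1

u = UserTimelines("id_x")
u.record_post(0, "Military", "modern weapons")
u.record_post(0, "Military", "modern weapons")
u.record_post(3, "Programming", "python")
```

A BBS is then simply a collection of such objects, for example a dictionary keyed by user id.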

8 Interest-Sharing Group Identification

9 Given two timelines of actions X and Y of two users id_x and id_y
A straightforward way
– Similarity between X_i and Y_j =

10 Interest-Sharing Group Identification
Average frequency differences of actions
Local similarity between X_i and Y_j

11 Interest-Sharing Group Identification
Hybrid similarity between X_i and Y
Global similarity between X and Y

12 Predict User Behavior Using Generated Community
Given a user id_i
– Predict what action id_i may take in the near future
Actions that id_i has already taken may be closely related to id_i’s future actions
– Possible solution: compute the posterior probability

13 Predict User Behavior Using Generated Community
Resolved with interest-sharing groups
– Similar users may take similar actions at some time instants
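
The idea that similar users may take similar actions can be illustrated with a simple frequency vote over a user's neighbors in the community. This is not the BPUC algorithm, whose details are not preserved in this transcript; the function name, the data layout, and the pooling rule are all assumptions chosen to keep the sketch small.

```python
from collections import Counter

def predict_actions(user, history, neighbors, top_k=1):
    """Rank candidate actions by pooled frequency over the user and neighbors.

    history   : {user id: list of past actions}         (hypothetical layout)
    neighbors : {user id: list of similar user ids}      (hypothetical layout)
    """
    votes = Counter(history.get(user, []))
    for other in neighbors.get(user, []):
        votes.update(history.get(other, []))
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [action for action, _ in ranked[:top_k]]

history = {
    "id_1": ["games"],
    "id_2": ["weapons", "weapons"],
    "id_3": ["weapons", "programming"],
}
neighbors = {"id_1": ["id_2", "id_3"]}
prediction = predict_actions("id_1", history, neighbors, top_k=2)
```

A probabilistic variant would normalize the pooled counts into a distribution instead of returning a ranked list.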

14 BPUC algorithm

15 Experiment
Data set
– BBS of Nanjing University
– Messages collected from January 1st, 2003 to December 1st, 2005 on the 17 most popular boards
– 4512 topics on 17 boards, 1109 users
Evaluation set
– 42 volunteers: 18 users interested in modern weapons, 12 users fond of programming skills; the rest are interested in computer games

17 Experiments on Community Generation
Neighborhood accuracy
– Describes how accurately the neighbors of a user in a generated community share interests similar to those of the user
Component accuracy
– Measures how well the generated groups represent interests that are common to the individuals in the groups

18 Experiments on Community Generation
Example
– A generated community, 7 links between similar users, 10 links between dissimilar users
– Neighborhood accuracy = (7+10)/21 = 0.810
– Component accuracy = (7+0)/21 = 0.333
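
The arithmetic in this example can be checked with a short script. The slide does not say where the denominator 21 comes from; the sketch assumes it is the number of user pairs in a 7-user community (7 choose 2 = 21), and the function and parameter names are hypothetical.

```python
from math import comb

def accuracy(count_a, count_b, n_users):
    """Fraction (count_a + count_b) / C(n_users, 2), as in the slide's example."""
    total_pairs = comb(n_users, 2)  # assumed meaning of the denominator 21
    return (count_a + count_b) / total_pairs

neighborhood = accuracy(7, 10, n_users=7)  # (7 + 10) / 21
component = accuracy(7, 0, n_users=7)      # (7 + 0) / 21
```

Rounded to three decimals, the two values agree with the 0.810 and 0.333 given on the slide.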

19 Experiments on Community Generation
Comparison with CORAL

20 Experiments on Community Generation

21 Experiments on Community Generation
Running time comparison

22 Experiments on User Behavior Prediction
1056 days for training the probability model
Last 10 days for testing
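
The 1056/10 split is consistent with the collection period given for the data set: counting both endpoints, January 1st, 2003 to December 1st, 2005 spans 1066 days. A minimal sketch of such a chronological split follows; the `(day, payload)` message layout and the `split` helper are assumptions for illustration.

```python
from datetime import date, timedelta

START = date(2003, 1, 1)
END = date(2005, 12, 1)
TEST_DAYS = 10

total_days = (END - START).days + 1               # inclusive of both endpoints
train_days = total_days - TEST_DAYS
test_start = END - timedelta(days=TEST_DAYS - 1)  # first day of the test window

def split(messages):
    """Split (day, payload) pairs chronologically into training and test lists."""
    train = [m for m in messages if START <= m[0] < test_start]
    test = [m for m in messages if test_start <= m[0] <= END]
    return train, test

train, test = split([
    (date(2003, 1, 1), "first post"),
    (date(2005, 11, 21), "last training day"),
    (date(2005, 11, 22), "first test day"),
    (date(2005, 12, 1), "last post"),
])
```

Splitting by date rather than at random keeps every test action strictly later than the training data, which matches the goal of predicting near-future behavior.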

