Expertise Networks in Online Communities: Structure and Algorithms Jun Zhang Mark S. Ackerman Lada Adamic University of Michigan WWW 2007, May 8–12, 2007, Banff, Alberta, Canada.

Outline
- Automatically identify users with expertise.
- Analyze the Java Forum.
- Test network-based ranking algorithms such as HITS and PageRank.
- Use simulation rules to evaluate how other algorithms perform on the Java Forum.
- Evaluate performance in communities with different characteristics.

Introduction
- Expertise finder: a system that helps find others with the appropriate expertise to answer a question.
- Current expertise finders rely on modern information retrieval (IR) techniques.
- A person's expertise is represented as a term vector, and expertise queries are matched using standard IR techniques.
- Problem: this reflects whether a person knows about a topic but does not distinguish people's relative expertise levels.
- Solution: combine a network-based ranking algorithm with content analysis.

Expertise Network
- Online communities usually have a discussion thread structure.
- The network is not focused on social relationships: users reply because of interest in the content.
- CEN (Community Expertise Network): the distribution of expertise along with the network of responses.
- Structural prestige is closely related: receiving more positive choices is prestigious.

Empirical Study: Java Forum
- People come to ask questions.
- 87 sub-forums with a large diversity of users.
- 333,314 messages in 49,888 threads.
- 13,739 nodes and 55,761 edges.
- Human raters evaluated 135 selected users; users who posted fewer than 10 times were omitted.

Characterizing the Network
- Bow-tie structure analysis.
- Degree distribution: captures the level of interaction.
- Scale-free: highly uneven distribution of participation.
- Degree correlations:
  - Indegree: how many people a given user helps.
  - Indegree alone does not capture a user's own tendency in whom they help, e.g. replying only to newbies or talking only to people of a similar expertise level.
  - For each asker-replier pair, compare the indegree of the replier with that of the asker.
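
The indegree and degree-correlation bullets above can be illustrated with a minimal sketch. It assumes each reply is recorded as an (asker, replier) pair and that edges point from asker to replier, which is consistent with indegree meaning "how many people a user helps". The toy data and the function names are hypothetical, not taken from the slides.

```python
from collections import defaultdict

# Hypothetical toy data: one (asker, replier) pair per reply in a thread.
replies = [
    ("ann", "bob"), ("ann", "cat"), ("dan", "bob"),
    ("dan", "bob"), ("bob", "cat"), ("eve", "cat"),
]

def build_network(pairs):
    """Map each replier to the set of distinct askers they helped (edges asker -> replier)."""
    helped = defaultdict(set)
    for asker, replier in pairs:
        if asker != replier:  # ignore self-replies
            helped[replier].add(asker)
    return helped

def indegree(helped, user):
    """Number of distinct users that `user` has helped."""
    return len(helped.get(user, ()))

def degree_pairs(helped):
    """For each asker -> replier edge, return (asker indegree, replier indegree)."""
    pairs = []
    for replier, askers in helped.items():
        for asker in askers:
            pairs.append((indegree(helped, asker), indegree(helped, replier)))
    return sorted(pairs)

helped = build_network(replies)
print({u: indegree(helped, u) for u in ["ann", "bob", "cat", "dan", "eve"]})
print(degree_pairs(helped))
```

In this toy network most edges run from users with indegree 0 to users with a higher indegree, which is the kind of pattern the asker-versus-replier comparison is meant to expose.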

Expertise Ranking Algorithms
- Simple statistical measure: a user who answers a lot knows the topic well.
- Problem: spammers, who make inflammatory or disruptive posts.
- Handling the problem: users' relevance feedback.
- AnswerNum: the number of questions a user answered.
- Also count the number of users a user helped, which shows broader or greater expertise.

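Z-Score Measures
- Replying to many questions suggests high expertise.
- Asking many questions suggests a lack of expertise on the topic.
- The z-score combines both, using q (questions asked) and a (questions answered).
- It measures how different a user is from a random user who posts answers with probability p = 0.5.
  - With n = q + a posts, such a user is expected to post n*p = n/2 replies.
  - Standard deviation: sqrt(n*p*(1-p)) = sqrt(n)/2.
- A user who asks and answers about equally often has z close to 0; a user who answers more has a positive z.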
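
Putting the mean n/2 and the standard deviation sqrt(n)/2 together gives z = (a - n/2) / (sqrt(n)/2), which simplifies to (a - q) / sqrt(a + q). The sketch below computes it directly from the binomial quantities on the slide. The function name and the choice to return 0 for a user with no posts are assumptions made for illustration.

```python
import math

def z_score(answers: int, questions: int, p: float = 0.5) -> float:
    """Z-score of a user's answer count against a random user who answers with probability p."""
    n = answers + questions
    if n == 0:
        return 0.0  # assumption: a user with no posts is treated as neutral
    expected = n * p                   # n/2 when p = 0.5
    std = math.sqrt(n * p * (1 - p))   # sqrt(n)/2 when p = 0.5
    return (answers - expected) / std

print(z_score(5, 5))  # asks and answers equally often
print(z_score(9, 1))  # mostly answers
print(z_score(0, 4))  # only asks
```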

ExpertiseRank Algorithm
- Problem with counting posts: a user who answered 100 newbie questions is ranked as equally expert as one who answered 100 questions from advanced users.
- Adopt a method similar to PageRank.
- Intuition: if B answers A's question and C answers B's question, C's expertise is boosted.
- C(Ui): total number of users helping Ui.
- d: damping factor, set to 0.85.
- The rank could also be weighted by w_iA, the number of times i was helped by A.
- In this study, weighting did not improve accuracy.
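
The formula itself did not survive in this transcript. Following the PageRank form and the symbols defined above, a plausible reading is ER(A) = (1 - d) + d * (ER(U1)/C(U1) + ... + ER(Un)/C(Un)), where U1..Un are the users A has helped. The sketch below is an unweighted, fixed-iteration version under that assumption. The function name, the (asker, replier) input format and the iteration count are illustrative choices, not the authors' implementation.

```python
from collections import defaultdict

def expertise_rank(pairs, d=0.85, iterations=50):
    """PageRank-style score over (asker, replier) pairs; returns {user: score}."""
    helpers = defaultdict(set)  # asker -> users who helped them, so C(U) = len(helpers[U])
    helped = defaultdict(set)   # replier -> users they helped
    users = set()
    for asker, replier in pairs:
        if asker == replier:
            continue  # ignore self-replies
        helpers[asker].add(replier)
        helped[replier].add(asker)
        users.update((asker, replier))

    er = {u: 1.0 for u in users}
    for _ in range(iterations):
        # Synchronous update: every score is recomputed from the previous round.
        er = {
            a: (1 - d) + d * sum(er[u] / len(helpers[u]) for u in helped.get(a, ()))
            for a in users
        }
    return er

# The intuition from the slide: B answers A, and C answers B.
er = expertise_rank([("A", "B"), ("B", "C")])
print(sorted(er.items()))
```

On this chain C ends up ranked above B, and B above A, even though B and C each answered exactly one question.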

Evaluations
- 2 raters, both Java programming experts.
- Five levels of expertise rating.

Statistical Metrics
- Frequently used correlation measures:
  - Spearman's rho: does not handle weak orderings (i.e. multiple items in the ranking such that neither is preferred over the other).
  - Kendall's tau: gives equal weight to any interchange of equal distance, no matter where it occurs, e.g. between ranks 1 and 2 or between ranks 101 and 102.
  - TopK: calculates Kendall's tau only for the highest 20 ranks.
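
As a rough illustration of the Kendall's tau and TopK metrics, the sketch below counts concordant and discordant pairs by hand (the tau-a variant, in which tied pairs count toward neither). The slide does not say which ranking defines the "highest 20 ranks", so restricting to the users ranked highest by the algorithm is an assumption, as are the function names and the toy scores.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau-a between two equal-length score lists; tied pairs count as neither."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs if n_pairs else 0.0

def top_k_tau(human, algo, k=20):
    """Kendall's tau restricted to the k users ranked highest by `algo` (assumed definition)."""
    top = sorted(range(len(algo)), key=lambda i: algo[i], reverse=True)[:k]
    return kendall_tau([human[i] for i in top], [algo[i] for i in top])

# Hypothetical data: human ratings on a five-level scale vs. algorithm scores.
human = [5, 4, 4, 2, 1]
algo = [0.9, 0.7, 0.8, 0.1, 0.3]
print(kendall_tau(human, algo))
print(top_k_tau(human, algo, k=3))
```

Human ratings on a five-level scale produce many ties, which is why the treatment of weak orderings matters when choosing a metric.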

Performance of Various Algorithms in different statistical metrics.

Simulations
- Why they are needed:
  - To understand the human dynamics that shape an online community.
  - This helps select an appropriate algorithm for communities whose dynamics differ from the Java Forum's.
- 2 models: the Best Preferred network and the Just Better network.

Best Preferred Network
- Many experts answer others' questions and seldom ask questions.
- Very similar to the Java Forum.
- The probability of replying increases exponentially with the difference in expertise level between the two users.
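
A minimal sketch of the reply rule above, assuming the weight for user u answering asker a is exp(L_u - L_a), normalised over all other users. The slide only states that the probability grows exponentially with the level difference, so the exact form, the function name and the toy expertise levels are assumptions.

```python
import math

def best_preferred_probs(levels, asker):
    """levels: {user: expertise level}. Returns {user: P(user replies to asker)}."""
    weights = {
        u: math.exp(lvl - levels[asker])  # grows exponentially with the level gap
        for u, lvl in levels.items()
        if u != asker
    }
    total = sum(weights.values())
    return {u: w / total for u, w in weights.items()}

# Hypothetical expertise levels.
levels = {"newbie": 1, "mid": 3, "expert": 5}
probs = best_preferred_probs(levels, "newbie")
print(probs)
```

A replier for a simulated question could then be drawn with `random.choices(list(probs), weights=list(probs.values()))`.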

Just Better Network
- E.g. within an organisation, experts may be under time constraints and choose to answer only the questions that make the best use of their expertise.
- Users with a slightly better level of expertise answer.
- U's probability of answering a's question
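
The probability formula for this model is not preserved in the transcript. The sketch below uses a hypothetical stand-in: only users with a strictly higher level than the asker may answer, and their weight decays as exp(-decay * (L_u - L_a)), so slightly better users are favoured. The functional form, the `decay` parameter and the function name are all assumptions for illustration, not the rule used in the study.

```python
import math

def just_better_probs(levels, asker, decay=1.0):
    """Hypothetical reply rule: only better users answer, and closer levels are favoured."""
    weights = {}
    for u, lvl in levels.items():
        diff = lvl - levels[asker]
        if u != asker and diff > 0:
            weights[u] = math.exp(-decay * diff)  # assumed form, not from the slides
    total = sum(weights.values())
    return {u: w / total for u, w in weights.items()} if total else {}

# Hypothetical expertise levels.
levels = {"newbie": 1, "mid": 3, "expert": 5}
probs = just_better_probs(levels, "newbie")
print(probs)
print(just_better_probs(levels, "expert"))  # nobody is better, so nobody answers
```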

Contd…
- Users make the best use of their time.
- They are more selective in answering.
- ExpertiseRank propagates expertise scores from newbies to the intermediate users who answer their questions, and from them to experts.
- In general, ExpertiseRank outperforms the others.

Network generated from both the models.

Summary & Future Work
- Structural information can be used to evaluate an expertise network in an online setting.
- Relative expertise can be found using social network-based algorithms.
- These algorithms did nearly as well as human raters.
- In future, combine content information (to differentiate specific knowledge) with structural information.

THANK YOU !!!

Human raters Vs Algorithms
