
Expertise networks in online communities: structure and algorithms
Jun Zhang, Mark Ackerman, Lada Adamic
School of Information, University of Michigan
International Symposium on Self-Organizing Online Communities, March 31st, 2007

motivation
  lots of people are turning to question-answer forums for help
  automatically infer the expertise of participants
  expertise could be used to rank answers, or recommend posts one could reply to
methods
  empirical evaluation of ranking algorithms
  social network analysis
  simulation
    understand underlying dynamics
    predict performance of ranking algorithms in communities with yet-unobserved dynamics

related work
  Netscan (Marc Smith & co)
  Robert Kraut: commitment & online community
  Virtual communities (Barry Wellman)
  using link-based ranking algorithms to evaluate expertise in email networks (Dom et al.)
  image credit: Danyel Fisher

Can we automatically infer expertise?
  We use PageRank, HITS, ask/reply ratios, etc. to try to automatically infer the expertise of users
  Human raters read the posts made by users
  In the online JavaForum, ask/reply ratio outperforms PageRank…
  Develop simulations:
    distribution of expertise (skewed)
    who asks questions most often? (novices)
    who answers questions?
      1. the best expert, most likely
      2. someone a bit more expert

Constructing a community expertise network
  Thread 1: "Large Data, binary search or hashtable?" posted by user A; replies ("Re: Large...") from user B and user C
  Thread 2: "Binary file with ASCII data" posted by user A; reply ("Re: File with...") from user C
  [Figure: four versions of the resulting network among A, B, and C, with edge labels as shown on the slide]
    unweighted: 1, 1
    weighted by # threads: 1, 2
    weighted by shared credit: 1/2, 1+1/2
    weighted with backflow: 0.9, 0.1
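The first three weighting schemes can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `build_network`, the scheme names, and the asker-to-replier edge direction are assumptions, and the backflow variant is left out because the slide does not say how the 0.9/0.1 split is derived.

```python
from collections import defaultdict

# The two threads from the slide, as (asker, [repliers]).
threads = [
    ("A", ["B", "C"]),  # Thread 1: "Large Data, binary search or hashtable?"
    ("A", ["C"]),       # Thread 2: "Binary file with ASCII data"
]

def build_network(threads, scheme="unweighted"):
    """Return {(asker, replier): weight}. Edges point from asker to replier.

    scheme (hypothetical names):
      "unweighted" - an edge has weight 1 if the replier ever answered the asker
      "threads"    - weight = number of threads in which the replier answered
      "shared"     - each thread hands out one unit of credit, split evenly
                     among its distinct repliers
    """
    edges = defaultdict(float)
    for asker, repliers in threads:
        distinct = sorted(set(repliers) - {asker})  # ignore self-replies
        for replier in distinct:
            if scheme == "unweighted":
                edges[(asker, replier)] = 1.0
            elif scheme == "threads":
                edges[(asker, replier)] += 1.0
            elif scheme == "shared":
                edges[(asker, replier)] += 1.0 / len(distinct)
            else:
                raise ValueError(f"unknown scheme: {scheme}")
    return dict(edges)

unweighted = build_network(threads, "unweighted")
by_threads = build_network(threads, "threads")
shared = build_network(threads, "shared")
```

For the two example threads this gives the same edge labels as the slide's first three panels: 1 and 1, then 1 and 2, then 1/2 and 1+1/2.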

JavaForum
  87 sub-forums
  1,438,053 messages
  community expertise network constructed: 196,191 users, 796,270 edges
Observations
  More than 55% of users usually only ask questions, while about 25% of users answer questions.
  Many questions are answered by a few advanced users, while the majority of users answer only a few.
  Top repliers answer questions for everyone. However, less expert users tend to answer questions from others with a lower expertise level.

Uneven participation
  [Figure: distribution of the number of people one replied to]
  ‘answer people’ may reply to thousands of others
  ‘question people’ are also uneven in the number of repliers to their posts, but to a lesser extent

Not Everyone Asks/Replies
  Core: a strongly connected component, in which everyone asks and answers
  IN: mostly askers
  OUT: mostly helpers
  The Web is a bow tie
  The Java Forum network is an uneven bow tie
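A bow-tie decomposition of this kind can be sketched as follows, assuming edges point from asker to replier (so IN nodes can reach the core and OUT nodes are reachable from it). The function names and the toy edge list are hypothetical, and the quadratic component search is only meant for small examples, not for a network of JavaForum's size.

```python
from collections import defaultdict, deque

def reachable(start, adj):
    """Set of nodes reachable from start (including start) via breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def bow_tie(edges):
    """Split nodes into (core, in_part, out_part, other) for asker -> replier edges."""
    fwd, bwd = defaultdict(set), defaultdict(set)
    nodes = set()
    for asker, replier in edges:
        fwd[asker].add(replier)
        bwd[replier].add(asker)
        nodes.update((asker, replier))
    if not nodes:
        return set(), set(), set(), set()

    # Largest strongly connected component: a node's component is the set of
    # nodes it can reach that can also reach it.
    core, assigned = set(), set()
    for node in sorted(nodes):
        if node in assigned:
            continue
        component = reachable(node, fwd) & reachable(node, bwd)
        assigned |= component
        if len(component) > len(core):
            core = component

    seed = next(iter(core))
    in_part = reachable(seed, bwd) - core   # can reach the core: mostly askers
    out_part = reachable(seed, fwd) - core  # reached from the core: mostly helpers
    other = nodes - core - in_part - out_part
    return core, in_part, out_part, other

# Hypothetical toy network.
edges = [
    ("u1", "u2"), ("u2", "u3"), ("u3", "u1"),  # core members ask and answer each other
    ("a1", "u1"), ("a2", "u2"),                # askers answered by core members
    ("u3", "h1"),                              # a helper who answers a core member
    ("a1", "t"),                               # a replier outside the bow tie proper
]
core, in_part, out_part, other = bow_tie(edges)
```

Comparing the sizes of `in_part` and `out_part` is one way to see how uneven the bow tie is.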

relating network structure to Java expertise
Human-rated expertise levels
  2 raters
  135 JavaForum users with >= 10 posts
  inter-rater agreement (τ = 0.74, ρ = 0.83)
  for evaluation of algorithms, omit users where raters disagreed by more than 1 level (τ = 0.80, ρ = 0.83)

  Level  Category           Description
  5      Top Java expert    Knows the core Java theory and related advanced topics deeply.
  4      Java professional  Can answer all or most Java concept questions. Also knows one or more sub-topics very well.
  3      Java user          Knows advanced Java concepts. Can program relatively well.
  2      Java learner       Knows basic concepts and can program, but is not good at advanced topics of Java.
  1      Newbie             Just starting to learn Java.

Structural Info Based Expertise Ranking Metrics
  # replies posted (# answers)
    experts can answer many questions
  # people replied to (# indegree)
    experts can answer questions from many different people
  z-score for the 2 above: (observed - μ)/σ
    experts are above the mean in the above two metrics
  PageRank: replying to people who reply to people
    higher level experts can answer mid-level experts
  HITS: experts answer questions by people whose questions other experts have answered
    hubs point to good authorities
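The three local metrics can be computed directly from a list of replies. The sketch below assumes each reply is recorded as an (asker, replier) pair and takes μ and σ as the population mean and standard deviation over all users; the function names and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

def local_metrics(replies):
    """replies: list of (asker, replier) pairs, one entry per reply posted.

    Returns two dicts keyed by user:
      n_answers - number of replies the user posted
      indegree  - number of distinct people the user replied to
    """
    answers = defaultdict(int)
    people_helped = defaultdict(set)
    users = set()
    for asker, replier in replies:
        users.update((asker, replier))
        answers[replier] += 1
        people_helped[replier].add(asker)
    n_answers = {u: answers[u] for u in users}
    indegree = {u: len(people_helped[u]) for u in users}
    return n_answers, indegree

def z_scores(values):
    """(observed - mu) / sigma for every user; all zeros if there is no spread."""
    observed = list(values.values())
    mu, sigma = mean(observed), pstdev(observed)
    if sigma == 0:
        return {u: 0.0 for u in values}
    return {u: (v - mu) / sigma for u, v in values.items()}

# Hypothetical reply log: users e and m answer, n1..n3 only ask.
replies = [("n1", "e"), ("n1", "e"), ("n2", "e"), ("n3", "m"), ("n1", "m")]
n_answers, indegree = local_metrics(replies)
z_answers = z_scores(n_answers)
z_indegree = z_scores(indegree)
```

Users who only ask end up below the mean (negative z-score), and frequent answerers end up above it.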

automated vs. human ratings
  [Figure: automated ranking plotted against human rating, one panel per metric: # answers, z # answers, indegree, z indegree, PageRank, HITS authority]

JavaForum empirical evaluation of ranking algorithms
  simple local measures do as well as (and better than) measures incorporating the wider network topology
  [Figure: bar chart of Top K, Kendall’s τ, and Spearman’s ρ, on a scale from 0 to 0.9, for # answers, z-score # answers, indegree, z-score indegree, PageRank, and HITS authority]
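Agreement between an automated score and the human ratings can be measured with the two rank correlations named on the slide. The sketch below is a plain-Python version, assuming the tie-corrected tau-b variant of Kendall's τ and Spearman's ρ computed as the Pearson correlation of average ranks; the slide does not say which variants were used, and the sample numbers are made up.

```python
from itertools import combinations
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall's tau-b: (concordant - discordant) pairs, corrected for ties."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue            # tied in both: counted in neither tie total
        elif dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    pairs = concordant + discordant
    denom = sqrt((pairs + ties_x) * (pairs + ties_y))
    return (concordant - discordant) / denom if denom else 0.0

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y) if var_x and var_y else 0.0

# Hypothetical data: human ratings (levels 1-5) and # answers for five users.
human = [5, 4, 3, 2, 1]
n_answers = [120, 80, 95, 10, 3]
tau = kendall_tau_b(human, n_answers)
rho = spearman_rho(human, n_answers)
```

Both functions take two equal-length lists in the same user order, so any of the automated metrics can be substituted for `n_answers`.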

Modeling community structure to explain algorithm performance

simulating probability of expertise pairing
  suppose:
    expertise is uniformly distributed
    probability of posing a question is inversely proportional to expertise
    p_ij = probability a user with expertise j replies to a user with expertise i
  2 models:
    ‘best’ preferred
    ‘just better’ preferred (j > i)
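A small simulation along these lines might look like the sketch below. The uniform expertise distribution and the inverse-proportional asking probability come from the slide; the formulas for p_ij are not preserved in this transcript, so the exponential weights, the `beta` parameter, the five expertise levels, and the function name `simulate` are all assumptions made for illustration.

```python
import math
import random

LEVELS = [1, 2, 3, 4, 5]  # assumed expertise levels

def simulate(model, n_users=200, n_questions=2000, beta=1.0, seed=0):
    """Return (expertise, edges), where edges are (asker, replier) user indices."""
    rng = random.Random(seed)
    users = list(range(n_users))
    expertise = [rng.choice(LEVELS) for _ in users]  # uniformly distributed
    ask_weights = [1.0 / e for e in expertise]       # asking is inversely proportional to expertise
    edges = []
    for _ in range(n_questions):
        asker = rng.choices(users, weights=ask_weights)[0]
        i = expertise[asker]
        if model == "best":
            # Assumed form: any other user may reply, strongly favouring high expertise j.
            candidates = [u for u in users if u != asker]
            weights = [math.exp(beta * expertise[u]) for u in candidates]
        elif model == "just_better":
            # Assumed form: only j > i may reply, favouring the smallest gap j - i.
            candidates = [u for u in users if expertise[u] > i]
            weights = [math.exp(-beta * (expertise[u] - i)) for u in candidates]
        else:
            raise ValueError(f"unknown model: {model}")
        if not candidates:
            continue  # nobody is more expert than this asker
        replier = rng.choices(candidates, weights=weights)[0]
        edges.append((asker, replier))
    return expertise, edges

def mean_replier_expertise(expertise, edges):
    return sum(expertise[replier] for _, replier in edges) / len(edges)

exp_best, edges_best = simulate("best")
exp_jb, edges_jb = simulate("just_better")
```

Under these assumptions, repliers in the ‘best’ preferred run are concentrated at the top expertise level, while in the ‘just better’ run replies are spread across levels just above each asker.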

visualization
  [Figure: simulated networks for the two models, ‘best’ preferred and ‘just better’ preferred]

degree correlation profiles
  degree-degree correlations between asker and helper
  [Figure: two panels, best preferred (simulation) and just better (simulation); axis: asker indegree]

It can tell us when to use which algorithms
  [Figure: two panels, Preferred Helper: ‘just better’ and Preferred Helper: ‘best available’]

Different ranking algorithms perform differently
  In the ‘just better’ model, a node is correctly ranked by PageRank but not by HITS
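One toy graph that behaves this way is a ‘just better’ chain: three novices are answered by a mid-level user, who is in turn answered by an expert. The slide's own figure is not preserved here, so the graph, the node names, and the two simplified implementations below are a hypothetical illustration rather than the authors' example. Edges point from asker to replier.

```python
def pagerank(nodes, edges, damping=0.85, iterations=100):
    """Power-iteration PageRank; rank of dangling nodes is spread uniformly."""
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[node] for node in nodes if not out[node])
        new = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for src in nodes:
            for dst in out[src]:
                new[dst] += damping * rank[src] / len(out[src])
        rank = new
    return rank

def hits_authority(nodes, edges, iterations=100):
    """HITS authority scores, rescaled each round so the largest score is 1."""
    hub = {node: 1.0 for node in nodes}
    auth = {node: 0.0 for node in nodes}
    for _ in range(iterations):
        # Authority: sum of hub scores of the askers one has answered.
        auth = {node: 0.0 for node in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        top = max(auth.values()) or 1.0
        auth = {node: a / top for node, a in auth.items()}
        # Hub: sum of authority scores of the repliers one has been answered by.
        hub = {node: 0.0 for node in nodes}
        for src, dst in edges:
            hub[src] += auth[dst]
        top = max(hub.values()) or 1.0
        hub = {node: h / top for node, h in hub.items()}
    return auth

# Novices n1..n3 are answered by mid-level user m; m is answered by expert e.
nodes = ["n1", "n2", "n3", "m", "e"]
edges = [("n1", "m"), ("n2", "m"), ("n3", "m"), ("m", "e")]
pr = pagerank(nodes, edges)
auth = hits_authority(nodes, edges)
```

PageRank passes m's accumulated rank on to e, so e comes out on top. HITS authority concentrates on m, who answers the most askers, and e's score decays toward zero because e has answered only one question.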

simplest models do not capture all ‘local’ interactions

Summary
  Expertise networks have interesting characteristics
  A set of useful metrics
  Simulation as an analysis tool
There are rich design opportunities
  Find experts with the help of structural information (and content analysis)
  Predict good answers
  Re-order questions/answers to match expertise
    questions posed by experts wait an average of 9 hours for the first reply
    novice questions are answered in 40 minutes
  working paper: “Expertise-Level based Interface Personalization for Online Help-seeking Communities”

Future Work
  Looking at diverse sets of question-answer forums (Yahoo Answers)
  Expertise across different topics
  Using explicit ratings for evaluation of automated expertise identification & incorporation into algorithms (battling spam)
  Users’ expertise change over time
  Developing applications, e.g. recommender engines for questions
  [Figure: example topic categories: cars & transportation, maintenance & repairs, beauty & style, hair]

for more info
  ExpertiseRank algorithms and evaluations
    Zhang, J., Ackerman, M.S., Adamic, L., Expertise Networks in Online Communities: Structure and Algorithms, WWW’07
  Simulations of expertise networks
    Zhang, J., Ackerman, M.S., Adamic, L., CommunityNetSimulator: Using Simulations to Study Online Community Network Formation and Implications, C&T2007
  Jun Zhang, junzh@umich.edu, http://www-personal.si.umich.edu/~junzh
  Mark Ackerman, ackerm@eecs.umich.edu, http://www.eecs.umich.edu/~ackerm/
  Lada Adamic, ladamic@umich.edu, http://www-personal.umich.edu/~ladamic
