1
Discovering Leaders from Community Actions
Amit Goyal (1), Francesco Bonchi (2), Laks V.S. Lakshmanan (1)
Oct 27, 2008
2
Context & Motivations: Viral Marketing
3
Word of Mouth and Viral Marketing
- We are more influenced by our friends than strangers
- 68% of consumers consult friends and family before purchasing home electronics (Burke 2003)
Amit Goyal (University of British Columbia) http://cs.ubc.ca/~goyal/
4
Viral Marketing
- Also known as targeted advertising
- Initiates a chain reaction through the word-of-mouth effect
- Low investment, maximum gain
5
Viral Marketing as an Optimization Problem
- Given: a network with influence probabilities
- Problem: select the top-k leaders such that targeting them maximizes the spread of influence
- Prior work: Hao Ma et al 2008, Domingos et al 2001, Richardson et al 2002, Kempe et al 2003
- Open question: how do we calculate the true influence probabilities?
6
A Pattern Mining Approach
- We propose a completely different approach based on frequent pattern mining
- We focus on the actions performed by users:
  - Joining a community (as in Flickr or Facebook communities)
  - Rating a song or a movie (as in Y! Music, Y! Movies)
- The time at which actions are performed is important
- Assumption: users can see their friends' actions
7
Our Contributions
- Formally define the notion of leaders and its various flavors
- Efficient algorithms for extracting these leaders
- Demonstrate the utility and scalability of our algorithms via an extensive set of experiments on a real-world dataset:
  - Yahoo! Messenger (social graph)
  - Yahoo! Movies ratings (actions log)
8
Rest of the Talk
- Framework definition:
  - Influence propagation on the social network
  - Various notions of leaders
- Algorithms
- Experiments
- Related work
- Conclusion
9
Framework Definition
10
Input Data (1)
- A social network, i.e., an undirected graph G = (V, E) where nodes are users and edges represent social ties
- Users declare their friends, e.g., on Facebook or Yahoo! Messenger
11
Input Data (2)
- An actions log sorted in chronological order, i.e., a relation Actions(User, Action, Time)
- Example: Jack joined the Yoga community at time 5
- Assumption: users can see their friends' actions (feeds)
12
Action Propagation
- Jack and Jill are friends; Jack and Mary are friends
- The action is “Joining the Yoga community”
- Jack joined the Yoga community at time 5, Jill at time 8, Mary at time 1000
- The action propagated from Jack to Jill (after 3 time units)
- The action propagated from Jack to Mary (after 995 time units)
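The propagation rule on this slide can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `propagation_edges` and the input layout are assumptions, and it assumes an action propagates from u to v whenever the two are friends and u performed it strictly earlier.

```python
def propagation_edges(friends, times):
    """Return {(u, v): delay} for a single action.

    friends: iterable of undirected friend pairs (hypothetical layout).
    times:   user -> time at which the user performed the action.
    """
    edges = {}
    for a, b in friends:
        if a not in times or b not in times:
            continue  # at least one of the two never performed the action
        if times[a] < times[b]:
            edges[(a, b)] = times[b] - times[a]
        elif times[b] < times[a]:
            edges[(b, a)] = times[a] - times[b]
        # Equal timestamps give no direction, so no edge is added.
    return edges


# Values taken from the slide: Jack at 5, Jill at 8, Mary at 1000.
friends = [("Jack", "Jill"), ("Jack", "Mary")]
times = {"Jack": 5, "Jill": 8, "Mary": 1000}
edges = propagation_edges(friends, times)
```

With the slide's values, the two resulting edges carry the delays 3 and 995.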
13
Propagation Graph
- Can we say Mary got influenced by Jack? No.
[Figure: propagation graph with nodes Jack, Jill, Joey, Mary, Ben; “Joined Yoga Community” times shown, in transcript order: 5, 8, 1000, 12, 15]
14
User Influence Graph
- When an action propagates from user u to user v, we may think of v as being influenced by u
- Influence should decay over time
- Size of the influence graph << size of the propagation graph (PG)
[Figure: propagation graph next to the user influence graph for Jack]
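One way to read "influence should decay in time" is as a cutoff: only users reached within π time units of the root count as influenced. The sketch below follows that reading. The function name `influence_set` is hypothetical, and measuring the delay from the root (rather than from the previous hop) is an assumption, not something the slide states.

```python
from collections import deque


def influence_set(edges, times, root, pi):
    """Users reachable from `root` in the propagation graph within `pi` time units.

    edges: iterable of directed (u, v) propagation edges for one action.
    times: user -> time at which the user performed the action.
    Assumption: the delay is measured from `root`, not from the previous hop.
    """
    successors = {}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
    reached, frontier = set(), deque([root])
    while frontier:
        u = frontier.popleft()
        for v in successors.get(u, []):
            if v not in reached and times[v] - times[root] <= pi:
                reached.add(v)
                frontier.append(v)
    return reached


# Jack, Jill and Mary from the Action Propagation slide.
times = {"Jack": 5, "Jill": 8, "Mary": 1000}
edges = [("Jack", "Jill"), ("Jack", "Mary")]
near = influence_set(edges, times, "Jack", pi=15)
far = influence_set(edges, times, "Jack", pi=1000)
```

With π = 15 only Jill is in Jack's influence set, because Mary joined 995 time units later. Raising π to 1000 brings Mary back in.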
15
Leaders – first definition
Who should be a leader?
- For an action, a leader should influence a sufficiently large number of users (> ψ)
- For an action, a leader should influence these users in a reasonable amount of time (< π)
- A leader should act as a leader in a sufficiently large number of actions (> σ)
- If ψ = 2, π = 15, σ = 1, then both Jack and Jill are leaders
[Figure, repeated across animation steps: propagation graph over Jack, Jill, Joey, Mary, Ben; “Joined Yoga Community” times in transcript order: 5, 8, 1000, 12, 15; edge delays shown: 3, 7, 4, 7, 3, 995]
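Given per-user, per-action follower counts, the three thresholds reduce to a counting step. The sketch below is illustrative only: `find_leaders` and the follower counts are hypothetical. The slide writes the conditions as strict inequalities, but its example (σ = 1 with a single action) only works with "at least", so the code assumes ≥ for both ψ and σ.

```python
def find_leaders(im, psi, sigma):
    """Return users who influence at least `psi` users in at least `sigma` actions.

    im: (user, action) -> number of users influenced within the time threshold.
    Assumption: thresholds are inclusive (>=), see the note above.
    """
    qualifying_actions = {}
    for (user, _action), followers in im.items():
        if followers >= psi:
            qualifying_actions[user] = qualifying_actions.get(user, 0) + 1
    return {u for u, n in qualifying_actions.items() if n >= sigma}


# Hypothetical follower counts for one action, for illustration only.
im = {
    ("Jack", "yoga"): 3,
    ("Jill", "yoga"): 2,
    ("Joey", "yoga"): 0,
}
leaders = find_leaders(im, psi=2, sigma=1)
```

The time threshold π does not appear here because it is already baked into the counts.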
16
Tribe Leader
- A leader may influence different users for different actions
- What if a leader leads a fixed set of users across different actions?
- We call such leaders tribe leaders
- These can be considered small communities
[Figure: jack with actions A1, A2 and A3; A1, A2 and A3 are 3 different actions]
17
Additional Constraint: Genuineness
- A user may appear to act as a leader while in practice always being a follower of other leaders
- We want to avoid such fake leaders
- Example: gen(Jill) = 1/3
- Another constraint: confidence
[Figure: Tom, Jill and Jack with actions A1, A2 and A3; A1, A2 and A3 are 3 different actions]
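The slide gives gen(Jill) = 1/3 without spelling out the formula. One plausible reading, assumed here, is the fraction of actions led by a user in which that user is not a follower of another leader of the same action. The function `genuineness` and the data are hypothetical, chosen to reproduce the 1/3 value.

```python
from fractions import Fraction


def genuineness(user, leaders_of, followers_of):
    """Fraction of the actions led by `user` in which no other leader influenced them.

    leaders_of:   action -> set of users who are leaders for that action.
    followers_of: (leader, action) -> set of users influenced by that leader.
    Assumption: this is one reading of the slide, not a confirmed definition.
    """
    led = [a for a, leaders in leaders_of.items() if user in leaders]
    if not led:
        return None  # undefined for users who never lead
    genuine = 0
    for a in led:
        dominated = any(
            user in followers_of.get((other, a), set())
            for other in leaders_of[a]
            if other != user
        )
        if not dominated:
            genuine += 1
    return Fraction(genuine, len(led))


# Hypothetical data: Jill leads A1, A2 and A3, but follows Tom on A1 and Jack on A2.
leaders_of = {"A1": {"Tom", "Jill"}, "A2": {"Jack", "Jill"}, "A3": {"Jill"}}
followers_of = {("Tom", "A1"): {"Jill"}, ("Jack", "A2"): {"Jill"}}
gen_jill = genuineness("Jill", leaders_of, followers_of)
```

Under these assumed inputs Jill is a genuine leader only for A3, which gives 1/3.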
18
Algorithms
But how will I discover the leaders?
19
Algorithms: Overview
Assumptions:
- The social graph is huge: millions of nodes
- The actions log is huge: millions of tuples
- For an action, size of the user influence graph << size of the propagation graph, for all users
Our algorithms are able to extract the patterns (leaders and tribe leaders) in no more than one scan of the actions log table.
20
Algorithms: Overview
- Scan the actions log table with a window of size π, backward in time, i.e., starting from the most recent timestamp (the bottom of the table if we assume tuples are ordered by time)
- Efficiently compute the influence matrix, i.e., a Users x Actions matrix in which IM_π(u, a) is the number of users influenced by u w.r.t. action a within time π
- Compute leaders from IM
- Example: IM_10(Jack, “joining yoga community”) = 3
21
Computing Influence Matrix (1)
- We use a bit vector to track which users are influenced by a given user
- Updated incrementally
- Locking mechanism using another bit vector: 0 => free bit; 1 => occupied bit
- Node-to-bit-index mapping stored in a queue
- Bits must be dynamically allocated
[Figure: time window on the propagation graph, containing nodes S, R, T, W, V]
Node  InfVec
R     01010111
S     01000110
T     00010110
W     00000110
V     00000100
Queue (with Head pointer): (V,2) (W,1) (T,4) (S,6) (R,0)
Lock bit vector: 01010111
22
Computing Influence Matrix (2)
- Slide up the current window: delete node V
- Delete the entry from the queue
- Update the lock
- Update the influence vectors
[Figure: time window on the propagation graph, containing nodes S, R, T, W, V]
Before:
Node  InfVec
R     01010111
S     01000110
T     00010110
W     00000110
V     00000100
Queue (with Head pointer): (V,2) (W,1) (T,4) (S,6) (R,0)
Lock bit vector: 01010111
After (V's entry removed, bit 2 cleared everywhere):
Node  InfVec
R     01010011
S     01000010
T     00010010
W     00000010
Queue (with Head pointer): (W,1) (T,4) (S,6) (R,0)
Lock bit vector: 01010011
23
Computing Influence Matrix (3)
- New node P added
- Issue a lock, add an entry to the queue
- Compute its influence vector by propagation
- Number of followers of P = 4, so IM(P, a) = 4
[Figure: time window on the propagation graph, containing nodes S, R, T, W, P]
Before:
Node  InfVec
R     01010011
S     01000010
T     00010010
W     00000010
Queue (with Head pointer): (W,1) (T,4) (S,6) (R,0)
Lock bit vector: 01010011
After:
Node  InfVec
P     01010111
R     01010011
S     01000010
T     00010010
W     00000010
Queue (with Head pointer): (W,1) (T,4) (S,6) (R,0) (P,2)
Lock bit vector: 01010111
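The three steps above (evict, lock, propagate) can be sketched for a single action as follows. This is an illustration under stated assumptions, not the authors' implementation: the name `influence_counts`, the input layout, and the use of Python integers as bit vectors are all choices made here. As in the slide's example, where P's vector has five bits set and P has 4 followers, each vector includes the user's own bit and the follower count is the number of set bits minus one.

```python
from collections import deque


def influence_counts(friends, log, pi):
    """Followers per user for ONE action, scanning its log backward in time.

    friends: user -> set of friends (hypothetical layout).
    log:     list of (user, time), sorted chronologically, one entry per user.
    """
    lock = 0          # lock bit vector: bit i is 1 while index i is occupied
    window = deque()  # (user, bit index, time); most recent timestamp on the left
    infvec = {}       # user -> influence vector, own bit included
    when = {}         # user -> time, for users currently in the window
    counts = {}
    for user, t in reversed(log):
        # Slide the window: evict users who acted more than pi after `user`.
        while window and window[0][2] - t > pi:
            old, bit, _ = window.popleft()
            mask = ~(1 << bit)
            lock &= mask                   # release the lock
            del infvec[old], when[old]
            for w in infvec:               # clear the freed bit in every vector
                infvec[w] &= mask
        # Issue a lock on the lowest free bit.
        bit = 0
        while (lock >> bit) & 1:
            bit += 1
        lock |= 1 << bit
        # Propagate: OR in the vectors of friends who acted strictly later.
        vec = 1 << bit
        for v in friends.get(user, ()):
            if v in infvec and when[v] > t:
                vec |= infvec[v]
        infvec[user], when[user] = vec, t
        window.append((user, bit, t))
        counts[user] = bin(vec).count("1") - 1  # exclude the user's own bit
    return counts


# Jack, Jill and Mary from the Action Propagation slide.
friends = {"Jack": {"Jill", "Mary"}, "Jill": {"Jack"}, "Mary": {"Jack"}}
log = [("Jack", 5), ("Jill", 8), ("Mary", 1000)]
counts = influence_counts(friends, log, pi=15)
```

Because evicted bits are cleared from every vector still in the window, a freed index can be reused by the next user without leaking stale followers. That reuse is what keeps the vectors as short as the window rather than as long as the user base.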
24
Mining Tribe Leaders
- The influence matrix is not enough
- We use an influence cube, Users x Actions x Users: IC_π(u, a, v) = 1 when user v is influenced by user u for action a within time π
- We do not explicitly compute the whole cube, due to sparsity
- The problem is the same as discovering the existence of frequent itemsets of size larger than a given threshold
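The reduction to frequent itemsets can be made concrete by treating each action as a transaction whose items are the users a candidate leader influenced. The brute-force check below is only a sketch of that reduction, not the optimized method the talk refers to: `is_tribe_leader` is a hypothetical name, thresholds are assumed inclusive, and the search is exponential in the number of followers.

```python
from itertools import combinations


def is_tribe_leader(followers_by_action, psi, sigma):
    """True if some set of `psi` users follows the leader in at least `sigma` actions.

    followers_by_action: action -> set of users influenced by the candidate leader
    (one slice of the influence cube, for a fixed u).
    Checking sets of size exactly psi is enough, because every subset of a
    frequent itemset is itself frequent.
    """
    support = {}
    for followers in followers_by_action.values():
        for v in followers:
            support[v] = support.get(v, 0) + 1
    # Users who follow in fewer than sigma actions cannot be part of a tribe.
    candidates = sorted(v for v, s in support.items() if s >= sigma)
    for tribe in combinations(candidates, psi):
        shared = sum(1 for f in followers_by_action.values() if set(tribe) <= f)
        if shared >= sigma:
            return True
    return False


# Hypothetical slice of the cube for one leader.
jack = {"A1": {"a", "b", "c"}, "A2": {"a", "b", "d"}, "A3": {"a", "b"}}
small_tribe = is_tribe_leader(jack, psi=2, sigma=3)
large_tribe = is_tribe_leader(jack, psi=3, sigma=2)
```

Here users a and b follow in all three actions, so a tribe of size 2 exists, while no three users ever follow together in two actions.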
25
Algorithms - Final Comments
- The only truly mandatory threshold is π (the time threshold)
- Influence matrix: O(TAn^2) in bit-level operations
  - T = total number of tuples in the actions log
  - A = total number of distinct actions
  - n = maximum number of nodes visible in any position of the time window
  - n << N, where N is the total number of users
- Tribe leaders:
  - Influence cube: O(TAn^2)
  - Finding the existence of frequent itemsets: exponential in the number of followers, but very fast due to optimizations (Bonchi 2003)
26
Experiments
Enough talking, show me the results dude!!
27
Data Preparation
- Social graph: Yahoo! Instant Messenger
- Actions log: Yahoo! Movies, where an action = user u rated movie m at time t
- The two sources were joined through common user identifiers
- Started from the Yahoo! Instant Messenger subgraph of “most active” users (110M nodes) and 21M ratings from Yahoo! Movies
- Ended with 217.5K nodes, 221.4K edges and 1.8M ratings
28
Data Characteristics: Connected Components
- Giant component: 94K users (43.2% of connected users)
- Total: 46,650 connected components
29
Leaders vs. Tribe Leaders
- π: threshold on time
- σ: threshold on number of actions
- ψ: threshold on number of influenced users
30
Number of Leaders Found
- π: threshold on time
- σ: threshold on number of actions
- ψ: threshold on number of influenced users
31
Run-time
- π: threshold on time
- σ: threshold on number of actions
- ψ: threshold on number of influenced users
32
Genuineness: an almost binary concept!
33
Top-10 Tribe Leaders w.r.t. Tribe Size
- Tribe leaders exhibit high confidence
- Tribe leaders with low genuineness were found to be dominated by other tribe leaders present in the tables
- We found many users acting as leaders in many actions without being tribe leaders
34
Related Work (1)
- Identifying influential users: Domingos et al 2001, Richardson et al 2002, Kempe et al 2005
- Identifying influential bloggers: Agarwal et al 2008
- Identifying communities in social networks: Hopcroft et al 2003, Kumar et al 2006, Backstrom et al 2006, Tantipathananandh et al 2007, Huang et al 2008, Friedland et al 2007
35
Related Work (2)
- Influence and correlation in social networks: Aris Anagnostopoulos et al 2008
- Revenue maximization: Hartline et al 2008
- Near optimal sensor placement for outbreak detection: Leskovec et al 2007
- Heat diffusion model: Hao Ma et al 2008 (CIKM)
36
Conclusions
- Proposed a framework based on frequent pattern mining for discovering leaders in social networks
- Formally defined the problem of extracting leaders from a social graph and an actions log
  - Various notions of leader and tribe leader
  - Their confidence and genuineness variants
- Efficient algorithms for extracting leaders of various flavors
  - Just one pass over the actions log table
- Demonstrated the utility and scalability of our algorithms via an extensive set of experiments on a real-world dataset:
  - Yahoo! Messenger (social graph)
  - Yahoo! Movies ratings (actions log)
37
Ongoing/Future Work
- Gurumine: Pattern Mining System for Discovering Leaders and Tribes (demo paper to appear in ICDE 2009)
- Leadership cube: what kind of leaders attract what kind of followers for what kind of actions?
- Viral marketing
- Stronger notions of influence?
38
Thanks!
39
Backup
40
Number of Leaders Found
- π: threshold on time
- σ: threshold on number of actions
- ψ: threshold on number of influenced users
41
Additional Constraint: Confidence
- Similarly to association rules, we can define a confidence measure for leaders
- Leadership confidence = (# actions in which the user is a leader) / (# actions performed by the user)
- Example: say Jack performed 10 actions and acted as a leader in 7 of them (i.e., more than ψ users followed within a short time); then conf(Jack) = 7/10
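The confidence ratio is simple enough to state directly in code. The helper name `leadership_confidence` is hypothetical, and raising an error for users with no actions is a choice made here, since the slide does not say how that case is handled.

```python
from fractions import Fraction


def leadership_confidence(actions_led, actions_performed):
    """Share of a user's actions in which the user acted as a leader."""
    if actions_performed == 0:
        # Assumption: confidence is left undefined for users with no actions.
        raise ValueError("user performed no actions")
    return Fraction(actions_led, actions_performed)


# The slide's example: Jack led in 7 of his 10 actions.
conf_jack = leadership_confidence(7, 10)
```

Using `Fraction` keeps the value exact, which makes it easier to compare against a confidence threshold without floating-point surprises.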