Presentation is loading. Please wait.

Presentation is loading. Please wait.

School of Information University of Michigan 1 Information diffusion in online communities Lada Adamic ICOS Sept. 29, 2006.

Similar presentations


Presentation on theme: "School of Information University of Michigan 1 Information diffusion in online communities Lada Adamic ICOS Sept. 29, 2006."— Presentation transcript:

1 School of Information University of Michigan 1 Information diffusion in online communities Lada Adamic ICOS Sept. 29, 2006

2 2 Talk outline discussions in the political blogosphere online person-to-person product recommendations

3 3 joint work with Natalie Glance @ Nielsen/Buzzmetrics The political blogosphere and the 2004 election: Divided they blog

4 4 Political blogs gaining in importance Pew Internet & American Life Project Report, January 2005, reports: 63 million U.S. citizens use the Internet to stay informed about politics (mid-2004, Pew Internet Study) 9% of Internet users read political blogs preceding the 2004 U.S. Presidential Election 2004 Presidential Campaign Firsts Candidate blogs: e.g. Dean’s blogforamerica.com Successful grassroots campaign conducted via websites & blogs Bloggers credentialed as journalists & invited to nominating conventions

5 5 Related research on political blogs 10 most popular political blogs account for half the blogs read by surveyed journalists (Drezner and Farrell 2004) The most popular blogs also receive the majority of citation links (Shirky 2003). Citation link structure reveals topical subcommunities: Catholicism, homeschooling, A-list bloggers (Herring et. al. 2005) Comparison of network neighborhoods of Atrios and Instapundit: no overlap in linking behavior (Welsch 2005) Research question: Are we witnessing cyberbalkanization of the Internet?

6 6 Collected self-identified liberal and conservative blogs from online directories (eTalkingHead, BlogCatalog, CampaignLine, Blogorama) Crawled home page of each blog in February 2005: found 30 more well-cited political blogs (manually categorized) biases toward sidebar/blogroll links Did not include libertarian, independent or moderate blogs (fewer in number and lesser in popularity) Identified: 676 liberal and 659 conservative blogs Calling all political blogs

7 7 The larger political blogosphere Results 91% of links point to blog of same persuasion Conservative blogs show greater tendency to link 82% of conservative blogs linked to at least once; 84% link to at least one other blog 67% of liberal blogs are linked to at least once; 74% link to at least one other blog Average # of links per blog is similar: 13.6 for liberal; 15.1 for conservative Higher proportion of liberal blogs that are not linked to at all

8 8 Indegree distributions for political blogs

9 9 Different rankings produce similar A-lists

10 10 Top 20 liberal blogs

11 11 Top 20 conservative blogs

12 12 Harvested posts for top 20 lists from BlogPulse BlogPulse stores individual posts: date, permalink, and content Date range: late August 2004 –> mid-November 2004 Collected: 12,470 liberal posts; 10,414 conservative posts Identifying citation links (weblog post -> blog OR post) For each post, extract all links (hrefs) Exclude self-links Blogroll/sidebar links not included 1511 L-L citations; 2110 R-R citations; 247 L-R; 312 R-L Result: Conservatives had 16% fewer posts but cited each other 40% more often Methodology for detailed study of A-list blogs

13 13 A)all citations between A-list blogs in 2 months preceding the 2004 election B)citations between A-list blogs with at least 5 citations in both directions C)edges further limited to those exceeding 25 combined citations Citations between blogs in their posts (Aug 29 th – Nov 15 th, 2004) only 15% of the citations bridge communities

14 14 21 JawaReport 22 Vodka Pundit 23 Roger L Simon 24 Tim Blair 25 Andrew Sullivan 26 Instapundit 27 Blogs for Bush 28 LittleGreenFootballs 29 Belmont Club 30 Captain’s Quarters 31 Powerline 32 Hugh Hewitt 33 INDC journal 34 Real Clear Politics 35 Winds of Change 36 Allahpundit 37 Michelle Malkin 38 Wizbang 39 Dean’s World 40 Volokh 1 Digby’s Blog 2 James Walcott 3 Pandagon 4 blog.johnkerry.com 5 Oliver Willis 6 America Blog 7 Crooked Timber 8 Daily Kos 9 American Prospect 10 Eschaton 11 Wonkette 12 Talk Left 13 Political Wire 14 Talking Points Memo 15 Matthew Yglesias 16 Washington Monthly 17 MyDD 18 Juan Cole 19 Left Coaster 20 Bradford DeLong

15 15 Notable examples of blogs breaking a story 1. Swiftvets.com anti-Kerry video Bloggers linked to this in late July, keeping accusations alive Kerry responded in late August, bringing mainstream media coverage 2. CBS memos alleging preferential treatment of Pres. Bush during the Vietnam War Powerline broke the story on Sep. 9 th, launching flurry of discussion Dan Rather apologized later in the month 3. “Was Bush Wired?” Salon.com asked the question first on Oct. 8 th, echoed by Wonkette & PoliticalWire.com MSM follows-up the next day

16 16 Discussion of “forged documents” Liberals and conservatives differ in the topics they discuss

17 17 Political blogs as echo chambers Pairwise comparison of URLs and phrases posted by each blog v A = w U1 w U2 … w UN tf*idf weight~ (number of times blog mentions URL)* log[(total number of blogs monitored by blogpulse)/(number of those blogs citing the URL)] Similarity of two blogs is given by the cosine of their vectors cos(A,B) = v A.v B /(||v A ||*||v B ||) Similarity in URLs between blogs of the same persuasion was higher (0.08 for liberal blogs and 0.09 for conservative ones), than between mixed pairs (0.03) Same trend for phrases. We can even invert the analysis, and see what phrases are similar…

18 18 Network of phrases found on the same blogs

19 19 Political figures being discussed 59% of the mentions of Kerry are by right leaning blogs 53% of the mentions of Bush are by left leaning blogs

20 20 Mainstream media bias (links from 1,400 blog set)

21 21 Mainstream media cited about once every other post from the A-list bloggers (6,762 times from the left, 6,364 from the right)

22 22 Insights from the political blogosphere Liberal and conservative blogs are balanced in numbers and tend to link primarily to their own communities But is 10% cross-linking really too little? Or is it a sign of significant discussion? Conservative blogs are more likely to include links to other blogs on their pages, and their A-list blogs reference one another more frequently Liberal and conservative blogs tend to discuss different things, but one is not more ‘coherent’ than the other

23 23 Trying to bridge the divide Opposition to the bankruptcy bill (March 2005) conservative blog post liberal blog post uncategorized blog post news article government website link between posts/pages posts/pages belonging to same blog/site but, bill was passed nevertheless: Senate 74 - 25, House 302 - 126

24 School of Information University of Michigan 24 The dynamics of viral marketing Jure Leskovec, Carnegie Mellon University Lada Adamic, University of Michigan Bernardo Huberman, HP Labs

25 25 Using online networks for viral marketing Burger King’s subservient chicken

26 26 Outline prior work on viral marketing & information diffusion incentivised viral marketing program cascades and stars network effects product and social network characteristics The dynamics of Viral Marketing

27 27 Information diffusion Studies of innovation adoption hybrid corn (Ryan and Gross, 1943) prescription drugs (Coleman et al. 1957) Models (very many) Rogers, ‘Diffusion of Innovations’ Watts, Information cascades (2003) Kempe, Kleinberg, Tardos, Maximizing the spread of Influence, (2005)

28 28 Motivation for viral marketing viral marketing successfully utilizes social networks for adoption of some services hotmail gains 18 million users in 12 months, spending only $50,000 on traditional advertising gmail rapidly gains users although referrals are the only way to sign up customers becoming less susceptible to mass marketing mass marketing impractical for unprecedented variety of products online

29 29 The web savvy consumer and personalized recommendations > 50% of people do research online before purchasing electronics personalized recommendations based on prior purchase patterns and ratings Amazon, “people who bought x also bought y” MovieLens, “based on ratings of users like you…” Is there still room for viral marketing?

30 30 Is there still room for viral marketing next to personalized recommendations? We are more influenced by our friends than strangers  68% of consumers consult friends and family before purchasing home electronics (Burke 2003)

31 31 Incentivised viral marketing (our problem setting) Senders and followers of recommendations receive discounts on products 10% credit10% off Recommendations are made to any number of people at the time of purchase Only the recipient who buys first gets a discount

32 32 Product recommendation network purchase following a recommendation customer recommending a product customer not buying a recommended product

33 33 the data large anonymous online retailer (June 2001 to May 2003) 15,646,121 recommendations 3,943,084 distinct customers 548,523 products recommended Products belonging to 4 product groups: books DVDs music VHS

34 34 summary statistics by product group productscustomersrecommenda- tions edgesbuy + get discount buy + no discount Book103,1612,863,9775,741,6112,097,80965,34417,769 DVD19,829805,2858,180,393962,341 17,232 58,189 Music393,598794,1481,443,847585,7387,8372,739 Video26,131239,583280,270160,683909467 Full542,7193,943,08415,646,1213,153,67691,32279,164 high low people recommendations

35 35 viral marketing program not spreading virally 94% of users make first recommendation without having received one previously size of giant connected component increases from 1% to 2.5% of the network (100,420 users) – small! some sub-communities are better connected 24% out of 18,000 users for westerns on DVD 26% of 25,000 for classics on DVD 19% of 47,000 for anime (Japanese animated film) on DVD others are just as disconnected 3% of 180,000 home and gardening 2-7% for children’s and fitness DVDs

36 36 medical study guide recommendation network

37 37 measuring cascade sizes delete late recommendations count how many people are in a single cascade exclude nodes that did not buy steep drop-off very few large cascades books 10 0 1 2 0 2 4 6 = 1.8e6 x -4.98

38 38 cascades for DVDs shallow drop off – fat tail a number of large cascades DVD cascades can grow large possibly as a result of websites where people sign up to exchange recommendations 10 0 1 2 3 0 2 4 ~ x -1.56

39 39 simple model of propagating recommendations (ignoring for the moment the specific mechanics of the recommendation program of the retailer) Each individual will have p t successful recommendations. We model p t as a random variable. At time t+1, the total number of people in the cascade, N t+1 = N t * (1+p t ) Subtracting from both sides, and dividing by N t, we have

40 40 simple model of propagating recommendations (continued) Summing over long time periods The right hand side is a sum of random variables and hence normally distributed. Integrating both sides, we find that N is lognormally distributed if  large resembles power-law

41 41 participation level by individual very high variance The most active customer made 83,729 recommendations and purchased 4,416 different items!

42 42 Network effects

43 43 does receiving more recommendations increase the likelihood of buying? BOOKS DVDs

44 44 does sending more recommendations influence more purchases? BOOKS DVDs

45 45 the probability that the sender gets a credit with increasing numbers of recommendations consider whether sender has at least one successful recommendation controls for sender getting credit for purchase that resulted from others recommending the same product to the same person probability of receiving a credit levels off for DVDs

46 46 Multiple recommendations between two individuals weaken the impact of the bond on purchases BOOKS DVDs

47 47 product and social network characteristics influencing recommendation effectiveness

48 48 recommendation success by book category consider successful recommendations in terms of av. # senders of recommendations per book category av. # of recommendations accepted books overall have a 3% success rate (2% with discount, 1% without) lower than average success rate (significant at p=0.01 level) fiction romance (1.78), horror (1.81) teen (1.94), children’s books (2.06) comics (2.30), sci-fi (2.34), mystery and thrillers (2.40) nonfiction sports (2.26) home & garden (2.26) travel (2.39) higher than average success rate (statistically significant) professional & technical medicine (5.68) professional & technical (4.54) engineering (4.10), science (3.90), computers & internet (3.61) law (3.66), business & investing (3.62)

49 49 professional and organized contexts In general, professional & technical book recommendations are more often accepted (probably in part due to book cost) Some organized contexts other than professional also have higher success rate, e.g. religion overall success rate 3.13% Christian themed books Christian living and theology (4.7%) Bibles (4.8%) not-as-organized religion new age (2.5%) occult spirituality (2.2%) Well organized hobbies books on orchids recommended successfully twice as often as books on tomato growing

50 50 examples of communities use a community finding algorithm to identify groups of people sending recommendations to one another # nodes # senders topics _ 735 74 books: American literature, poetry 710 179 sci-fi books, TV series DVDs, alternative rock music 667 181 music: dance, indie 653 121 discounted DVDs 541 112 books: art & photography, web development, graphical design, sci-fi 502 104 books: sci-fi and other 388 77 books: Christianity and Catholicism 309 81 books: business and investing, computers, Harry Potter 192 30 books: parenting, women’s health, pregnancy 163 48books: comparative religion, Egypt’s history, new age, role playing games

51 51

52 52 telling the world vs. telling your friends consider # reviewers per book # recommenders per book what we observe (ratio of recommendations to reviews given in parentheses) tell the world but not your friends literature & fiction (0.57) mystery & thrillers (0.36) horror (0.44) tell the world and your friends biographies (0.90) children’s books (1.12) religion (1.73) history (1.27) nonfiction (1.89) tell just your friends about personal pursuits health, mind & body (2.39) home & garden (3.48) arts & photography (3.85) cooking, food & wine (3.49) tell your colleagues about professional interests professional & technical (3.22) computers & internet (3.10) medicine (4.19) engineering (3.85) law (4.25)

53 53 anime DVDs 47,000 customers responsible for the 2.5 out of 16 million recommendations in the system 29% success rate per recommender of an anime DVD giant component covers 19% of the nodes Overall, recommendations for DVDs are more likely to result in a purchase (7%), but the anime community stands out

54 54 regressing on product characteristics VariabletransformationCoefficient const-0.940 *** # recommendationsln(r)0.426 *** # sendersln(n s )-0.782 *** # recipientsln(n r )-1.307 *** product priceln(p)0.128 *** # reviewsln(v)-0.011 *** avg. ratingln(t)-0.027 * R2R2 0.74 significance at the 0.01 (***), 0.05 (**) and 0.1 (*) levels small tightly knit communities purchasing expensive products

55 55 Marketing in the long tail Chris Anderson, ‘The Long Tail’, Wired, Issue 12.10 - October 2004Issue 12.10

56 56 The long tail of successful recommendations 20% of the products account for 50% of the purchases 67 % of the products have only a single successful recommendation (30% of all purchases)

57 57 Conclusions Overall incentivized viral marketing contributes marginally to total sales occasionally large cascades occur Observations for future diffusion models purchase decision more complex than threshold or simple infection influence saturates as the number of contacts expands links lose effectiveness if they are overused Conditions for successful recommendations professional and organizational contexts discounts on expensive items small, tightly knit communities

58 58 Overall observations Interests and social connections co-evolve What we are exposed to depends on the active topics in our community Who we communicate with depends on our interests

59 59 Thanks! http://www-personal.umich.edu/~ladamic


Download ppt "School of Information University of Michigan 1 Information diffusion in online communities Lada Adamic ICOS Sept. 29, 2006."

Similar presentations


Ads by Google