1 Identifying the Influential Bloggers in a Community
Nitin Agarwal, Huan Liu, Lei Tang and Philip S. Yu, WSDM 2008
Advisor: Dr. Koh Jia-Ling
Speaker: Tu Yi-Lang
Date: 2008.11.27

2 Outline
- Introduction
- Influential bloggers
- Identifying the influentials
  - An initial set of intuitive properties
  - A preliminary model
- Further study and experiments
- Conclusions

3 Introduction
- The advent of participatory Web applications has created online media that turn former mass information consumers into present-day information producers.
- Before people buy or make decisions, they talk and listen to others' experiences, opinions, and suggestions. Those they listen to affect their decision making and are aptly termed the influentials.

4 Introduction
- Since bloggers can be connected in a virtual community anywhere and anytime, identifying the influential bloggers can benefit everyone in:
  - Developing innovative business opportunities,
  - Forging political agendas,
  - Discussing social and societal issues.
- It can also lead to many interesting applications.

5 Introduction
- This work addresses the novel problem of identifying influential bloggers at a blog site and investigates its issues and challenges:
  - Are they simply active bloggers?
  - What measures should be used to define influential bloggers?
  - Can we create a robust model that quantitatively tells how influential a blogger is?

6 Influential bloggers
- Blogs can be categorized into two major types: individual and community blogs.
- Individual blogs:
  - Single-authored,
  - Others can comment on a blog post, but cannot start a new line of blog posts,
  - More like diary entries or personal experiences.
- Community blogs:
  - Each blogger can not only comment on blog posts, but also start new topic lines.

7 Influential bloggers
- Propose a preliminary model:
  - Quantify the properties of influential bloggers by combining various statistics collectable from a blog site and assigning influence scores to each blogger and their blog posts.
  - Investigate how these statistics can be used in various ways to adjust the model for different purposes.

8 Influential bloggers
- An intuitive way of defining an influential blogger is to check whether the blogger has any influential blog post, i.e., a blogger can be influential if s/he has at least one influential blog post.
- Assume we have an influence score I(p_i) for a post p_i:
  - For a blogger b_k who has N blog posts {p_1, p_2, ..., p_N}, iIndex(b_k) = max(I(p_i)), where 1 ≤ i ≤ N.
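
As a minimal sketch of the definition above, the iIndex of a blogger is the maximum influence score over that blogger's posts. The function name `i_index` and the example scores are hypothetical; the slides do not say how the scores I(p_i) are stored.

```python
from typing import Sequence


def i_index(post_scores: Sequence[float]) -> float:
    """Return iIndex(b_k) = max(I(p_i)) over a blogger's N posts.

    post_scores holds the influence scores I(p_1), ..., I(p_N).
    Returning 0.0 for a blogger with no posts is an assumption of this sketch.
    """
    return max(post_scores, default=0.0)


# Hypothetical influence scores for one blogger's three posts.
scores = [0.2, 0.9, 0.4]
print(i_index(scores))
```

Because the maximum is used, a single highly influential post is enough to give a blogger a high iIndex, regardless of how many other posts they wrote.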

9 Influential bloggers
- Given a set U of M bloggers, {b_1, b_2, ..., b_M}:
  - The K most influential bloggers form an ordered subset V = {b_j1, b_j2, ..., b_jK}, with V ⊆ U and K ≤ M, ordered by iIndex, i.e., iIndex(b_j1) ≥ iIndex(b_j2) ≥ ... ≥ iIndex(b_jK).
  - Among all the blog posts {p_1, p_2, ..., p_L} by all M bloggers, the influential blog posts are those whose influence scores are greater than or equal to iIndex(b_jK), i.e., I(p_l) ≥ iIndex(b_jK) for 1 ≤ l ≤ L.
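
The selection of the top K bloggers and of the influential posts can be sketched as follows. This is only an illustration: the function names, the nested-dictionary layout, and the toy scores are assumptions, and post identifiers are assumed to be unique across bloggers.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: blogger -> {post id -> influence score I(p)}.
PostScores = Dict[str, Dict[str, float]]


def top_k_bloggers(post_scores: PostScores, k: int) -> List[Tuple[str, float]]:
    """Rank bloggers by iIndex (their maximum post score) and keep the first K."""
    i_index = {b: max(posts.values()) for b, posts in post_scores.items() if posts}
    ranked = sorted(i_index.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


def influential_posts(post_scores: PostScores, k: int) -> List[str]:
    """Return the posts with I(p_l) >= iIndex(b_jK), the K-th blogger's iIndex."""
    top = top_k_bloggers(post_scores, k)
    if not top:
        return []
    threshold = top[-1][1]  # iIndex of the K-th ranked blogger
    return sorted(
        post
        for posts in post_scores.values()
        for post, score in posts.items()
        if score >= threshold
    )


# Toy data, not taken from the paper.
data = {
    "alice": {"a1": 0.9, "a2": 0.3},
    "bob": {"b1": 0.6},
    "carol": {"c1": 0.2, "c2": 0.7},
}
print(top_k_bloggers(data, 2))
print(influential_posts(data, 2))
```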

10 Identifying the influentials
- An initial set of intuitive properties:
  - Recognition: an influential post p is referenced in many other posts, i.e., its number of inlinks (ι) is large.
  - Activity generation: a post's capability to generate activity can be indirectly measured by how many comments it receives; a large number of comments (γ) indicates that the post affects many readers enough that they care to write comments.

11 Identifying the influentials
- An initial set of intuitive properties (cont.):
  - Novelty: a large number of outlinks (θ) suggests that a post refers to many other blog posts or articles, indicating that it is less likely to be novel.
  - Eloquence: a long post often suggests some necessity of being long; therefore, the length of a post (λ) is used as a heuristic measure for checking whether a post is influential.

12 Identifying the influentials
- A preliminary model: [equation image not preserved in this transcript]
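
As a hedged reconstruction of the model, based on the published WSDM 2008 paper and on the four properties listed on the preceding slides, the influence score of a post p can be written as below. Here P_m ranges over the posts that link to p (its inlinks ι), P_n ranges over the posts that p links to (its outlinks θ), γ_p is the number of comments on p, w(λ) is a weight that depends on the length of the post, and w_in, w_out, and w_comm are adjustable weights. The notation on the original slide may differ.

```latex
\begin{align}
\mathrm{InfluenceFlow}(p) &= w_{in} \sum_{m=1}^{|\iota|} I(P_m) \;-\; w_{out} \sum_{n=1}^{|\theta|} I(P_n) \\
I(p) &= w(\lambda) \times \bigl( w_{comm}\, \gamma_p + \mathrm{InfluenceFlow}(p) \bigr)
\end{align}
```

Inlinks add influence (recognition), outlinks subtract it (lack of novelty), comments add it (activity generation), and post length scales the total (eloquence).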

13 Further study and experiments
- Data collection:
  - The Unofficial Apple Weblog (TUAW) site provides most of the needed information, such as blogger identification, date and time of posting, number of comments, and outlinks.
  - The only missing piece of information at TUAW is the inlinks, which can be obtained using the Technorati API.
  - Over 10,000 posts were collected, covering Feb. 2004 through Jan. 2007.

14 Further study and experiments
- Influential bloggers and active bloggers:
  - Many blog sites publish a list of top bloggers based on their activity on the site; in this paper, these people are called active bloggers.
  - Using a blogger's number of posts is obviously an oversimplified indicator, since it basically says that the most frequent blogger is an influential one.

15 Further study and experiments
- Influential bloggers and active bloggers (cont.):
  - Active and influential: verified by a large number of posts and a large number of comments and citations by other bloggers.
  - Inactive but influential: these bloggers submit few posts, but those posts are influential.
  - Active but non-influential: these bloggers post actively, but their posts do not generate sufficient interest to be ranked among the top 5 influentials.
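
Given a list of the most active bloggers and a list of the most influential bloggers, the three categories above reduce to set operations. The sketch below is illustrative only; the function name and the blogger names are hypothetical, and how each list is produced is outside its scope.

```python
from typing import Dict, List


def categorize(active: List[str], influential: List[str]) -> Dict[str, List[str]]:
    """Split bloggers into the three categories by comparing the two top lists."""
    active_set, influential_set = set(active), set(influential)
    return {
        "active and influential": sorted(active_set & influential_set),
        "inactive but influential": sorted(influential_set - active_set),
        "active but non-influential": sorted(active_set - influential_set),
    }


# Hypothetical top lists, not data from the paper.
top_active = ["alice", "bob", "carol"]
top_influential = ["bob", "carol", "dave"]
for category, bloggers in categorize(top_active, top_influential).items():
    print(category, bloggers)
```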

16 Further study and experiments

17 Further study and experiments
- Evaluating the model:
  - The Web 2.0 site Digg (http://www.digg.com/) is used to provide a reference point.
  - As people read articles or blog posts, they can vote for them in the form of a "digg", and these votes are recorded on Digg's servers.
  - In January 2007, a total of 535 blog posts were submitted on TUAW. Since Digg only returns the top 100 voted posts, these 100 blog posts at Digg are used as the benchmark in the evaluation.
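
One plausible way to use such a benchmark is to count how many of the model's top-ranked posts also appear in the benchmark list. The slide does not state the exact comparison, so the overlap count, the function name, and the post identifiers below are all assumptions; no real Digg API is called.

```python
from typing import Iterable, List


def benchmark_overlap(model_ranking: List[str], benchmark: Iterable[str], k: int) -> int:
    """Count how many of the model's top-k posts also appear in the benchmark."""
    benchmark_set = set(benchmark)
    return sum(1 for post in model_ranking[:k] if post in benchmark_set)


# Hypothetical post identifiers; real data would come from the blog site and Digg.
model_ranking = ["p7", "p2", "p9", "p4", "p1"]  # most influential first
digg_top = ["p2", "p4", "p5", "p7"]
print(benchmark_overlap(model_ranking, digg_top, k=3))
```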

18 Further study and experiments

19 Further study and experiments
- Influential vs. non-influential blog posts:
  - In total, there are 22 influential and 513 non-influential blog posts for January 2007.

20 Further study and experiments
- Effects and usages of weights:
  - The same ranking of influential bloggers is obtained for w_comm ≥ 0.6, w_in ≥ 0.9, and w_out ≥ 0.2.
  - Changing the values of these three weights can lead to different rankings, which allows one to adjust the weights of the model to attain different goals.
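
To illustrate how the weights can change a ranking, the sketch below uses a deliberately simplified score. It replaces the influence carried by linked posts with plain inlink and outlink counts, so it is not the paper's model, only a stand-in with the same three weights. The `Post` record, the function name, and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Post:
    blogger: str
    comments: int  # gamma: number of comments
    inlinks: int  # number of inlinks (size of iota)
    outlinks: int  # number of outlinks (size of theta)
    length_weight: float = 1.0  # w(lambda), assumed constant here


def rank_bloggers(posts: List[Post], w_comm: float, w_in: float, w_out: float) -> List[str]:
    """Rank bloggers by their best post under a simplified, count-based score."""
    best: Dict[str, float] = {}
    for post in posts:
        score = post.length_weight * (
            w_comm * post.comments + w_in * post.inlinks - w_out * post.outlinks
        )
        best[post.blogger] = max(score, best.get(post.blogger, float("-inf")))
    return sorted(best, key=best.get, reverse=True)


# Toy posts, not data from the paper.
posts = [
    Post("alice", comments=10, inlinks=1, outlinks=0),
    Post("bob", comments=2, inlinks=6, outlinks=1),
    Post("carol", comments=5, inlinks=3, outlinks=4),
]
# Emphasizing comments favors the blogger whose post drew the most discussion.
print(rank_bloggers(posts, w_comm=1.0, w_in=0.1, w_out=0.1))
# Emphasizing inlinks favors the blogger whose post is cited most.
print(rank_bloggers(posts, w_comm=0.1, w_in=1.0, w_out=0.1))
```

With the comment weight dominant the order is alice, carol, bob; with the inlink weight dominant it becomes bob, carol, alice.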

21 Further study and experiments
- Temporal patterns of the influentials:
  - Long-term influentials: they steadily maintain the status of being influential for a very long time.
  - Average-term influentials: they maintain their influence status for 4-5 months.
  - Transient influentials: they are influential for only a very short time period.
  - Burgeoning influentials: they have recently been emerging as influential bloggers.

22 Further study and experiments

23 Conclusions
- Bloggers form virtual communities of similar interests.
- Finding the influential bloggers will not only allow us to better understand the interesting activities happening in a virtual world, but also present unique opportunities for industry, sales, and advertisements.

24 Conclusions
- The challenges of identifying influential bloggers are discussed, and what constitutes an influential blogger is investigated.
- A preliminary model is presented that attempts to quantify an influential blogger.
- This paves the way for building a robust model that allows for finding various types of influentials.

