Topical Authority Detection and Sentiment Analysis on Top Influencers


Topical Authority Detection and Sentiment Analysis on Top Influencers
Machine Learning with Large Datasets Course Project (under the guidance of Prof. William W. Cohen)
Team Members: Manuel, Shubham and Soumya

Outline
  Introduction
  Related Work
  Problem Statement
  Methodology
  Results
  Evaluation plan
  Conclusion

Introduction
  Topical authority detection in social networks is an active research area.
  It is important for recommending relevant feeds to users interested in certain topics.
  Challenges: results should not be overly biased towards
    popular authors (such as celebrities)
    generic authorities (such as news channels)
  Relatively new users, who may not have existed before an event but post dedicatedly on the topic, should also be considered.

Related Work
  TwitterRank [2]: authority detection in Twitter using the idea of PageRank
    Leverages topical similarity and link structure between users
    Fails to filter out spammers, or celebrities who are not always influential
  Meeyoung Cha et al. [3] find that popular users with high in-degree are not necessarily influential in terms of spawning retweets or mentions
  Aditya Pal et al. [1] (considered as the baseline):
    Use clustering to identify influential vs. non-influential users on Twitter
    Rank users in the influential cluster, considering various important features

Problem Statement
  Aim:
    Perform authority detection on a collection of topics in Twitter for a time window
    Use sentiment analysis to determine the influence of top users tweeting on specific topics on their respective communities
  Period: June 6th 2010 to June 10th 2010
  Topics: Oil Spill, iPhone, World Cup

Methodology - User Metrics
  M = Mentions
    M1: Number of mentions of other users by the author
    M2: Number of unique users mentioned by the author
    M3: Number of mentions by others of the author
    M4: Number of unique users mentioning the author
  G = Graph Characteristics (restricted by the availability of data)
    G1: Number of topically active followers
    G2: Number of topically active friends
    G3: Number of followers tweeting on topic after the author
    G4: Number of friends tweeting on topic before the author
  OT = Original tweets
    OT1: Number of original tweets
    OT2: Number of links shared
    OT3: Self-similarity score
    OT4: Number of keyword hashtags used
  CT = Conversational tweets
    CT1: Number of conversational tweets
    CT2: Tweets where conversation is initiated by the author
  RT = Repeated tweets
    RT1: Number of retweets of others' tweets
    RT2: Number of unique tweets retweeted by other users
    RT3: Number of unique users who retweeted the author's tweets
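
As an illustration of how the mention metrics could be counted, here is a minimal single-machine sketch. It is not the project's MapReduce job: the `(author, text)` input shape, the `mention_metrics` name, and the regex-based mention extraction are assumptions made for this example. Note that it also counts the `@user` inside a retweet prefix as a mention.

```python
import re
from collections import defaultdict

# Hypothetical helper: matches @handles inside the tweet text.
MENTION_RE = re.compile(r"@(\w+)")

def mention_metrics(tweets):
    """Compute M1-M4 per user from an iterable of (author, text) pairs."""
    made = defaultdict(list)      # author -> every user they mentioned
    received = defaultdict(list)  # user -> every author who mentioned them
    for author, text in tweets:
        author = author.lower()
        for target in MENTION_RE.findall(text):
            target = target.lower()
            if target == author:
                continue  # ignore self-mentions
            made[author].append(target)
            received[target].append(author)
    users = set(made) | set(received)
    return {
        u: {
            "M1": len(made.get(u, [])),           # mentions of others by the author
            "M2": len(set(made.get(u, []))),      # unique users mentioned by the author
            "M3": len(received.get(u, [])),       # mentions of the author by others
            "M4": len(set(received.get(u, []))),  # unique users mentioning the author
        }
        for u in users
    }

sample = [
    ("alice", "@bob @carol oil spill update"),
    ("alice", "thanks @bob"),
    ("dave", "@alice nice"),
    ("bob", "@alice agreed"),
]
metrics = mention_metrics(sample)
print(metrics["alice"])
```

The same per-tweet emit and per-user aggregate structure maps naturally onto a mapper and a reducer, but the Hadoop details are not shown in the slides.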

Methodology - Features Extracted
  Topic Signal (TS)
  Signal Strength (SS)
  Non-Chat Signal (NCS)
  Retweet Impact (RI) - modified
  Mention Impact (MI)
  Information Diffusion (ID)
  Network Score (NS)
  URL Impact (UI)

Methodology - Features Formulae

Methodology - Steps
  Data in Twitter API format -> User Metrics
    MapReduce (using Hadoop on AWS)
  Src-follows-Dest edge list -> Adjacency Lists
  User Metrics and Adjacency Lists -> Features
  Features -> Clusters -> Influential Cluster
    Using Gaussian Mixture Model and Expectation Maximization
  Influential Cluster -> Top 20 Influencers
    Using Gaussian Ranking
  Sentiment Analysis and Visualization
    Using Liu Hu Lexicon and Gephi
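
Two of the steps above can be sketched in a few lines of plain Python. This is a rough illustration under stated assumptions, not the project's code: `build_adjacency` and `gaussian_rank` are hypothetical names, and the ranking assumes that "Gaussian Ranking" means scoring each user by the product of per-feature Gaussian CDF values, in the spirit of the baseline by Pal and Counts. The Gaussian Mixture Model clustering step is omitted here.

```python
import math
from collections import defaultdict

def build_adjacency(edges):
    """Group (src, dest) 'src follows dest' edges into adjacency lists."""
    friends = defaultdict(set)    # user -> accounts the user follows
    followers = defaultdict(set)  # user -> accounts that follow the user
    for src, dest in edges:
        friends[src].add(dest)
        followers[dest].add(src)
    return friends, followers

def gaussian_cdf(x, mu, sigma):
    """CDF of a normal distribution N(mu, sigma^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def gaussian_rank(features, top_k=20):
    """Rank users by the product of per-feature Gaussian CDF values.

    features: {user: [f1, f2, ...]}, all vectors of equal length.
    Assumption: higher feature values indicate more authority.
    """
    users = list(features)
    n_feat = len(features[users[0]])
    stats = []
    for j in range(n_feat):
        col = [features[u][j] for u in users]
        mu = sum(col) / len(col)
        var = sum((v - mu) ** 2 for v in col) / len(col)
        stats.append((mu, math.sqrt(var) or 1.0))  # avoid division by zero
    scores = {}
    for u in users:
        score = 1.0
        for x, (mu, sigma) in zip(features[u], stats):
            score *= gaussian_cdf(x, mu, sigma)
        scores[u] = score
    return sorted(users, key=lambda u: scores[u], reverse=True)[:top_k]

friends, followers = build_adjacency([("a", "b"), ("c", "b"), ("b", "a")])
print(sorted(followers["b"]))
print(gaussian_rank({"low": [1, 1], "mid": [2, 2], "high": [3, 3]}, top_k=2))
```

Multiplying CDF values rewards users who score well on every feature at once, so a single very large value cannot compensate for weak values elsewhere.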

Results - Authority Detection

Normalized:
  60069699: sandiebanandie
  17918561: LATenvironment
  17918827: latimesgreen
  14323791: dbiello
  58315230: mrt7384
  138775765: BPOilSpill
  3554721: NWF
  28657802: climateprogress
  47739450: ByronYork
  152315367: Oil_Spill_News
  22024951: SwampSchool
  19029137: BrentSpiner
  14717197: TPM
  139909476: USGulfOilSpill
  15458181: kate_sheppard
  48365916: Fertic
  138761645: GulfOilCleanup
  11856592: msnbcvideo
  81696616: alabamainsider
  9848: jimmybuffett

Not Normalized:
  17918561: LATenvironment
  138775765: BPOilSpill
  3554721: NWF
  14323791: dbiello
  138761645: GulfOilCleanup
  60069699: sandiebanandie
  14192680: NOLAnews
  139119046: BoycottBP
  26642006: Alyssa_Milano
  139909476: USGulfOilSpill
  20582958: guardianeco
  28657802: climateprogress
  14293310: TIME
  47739450: ByronYork
  14138785: TelegraphNews
  2467791: washingtonpost
  58315230: mrt7384
  139477825: BPOilNews
  46969537: greenforyou
  14511951: HuffingtonPost

Results - Sentiment Analysis
  dbiello: Negative Sentiment Influence
  LATenvironment: Neutral Sentiment Influence
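
To make the lexicon-based labelling concrete, here is a minimal sketch of how tweets from a user's community might be scored. The word lists are tiny toy sets invented for this example, not the actual Liu Hu lexicon, and the function names and the majority-count rule are assumptions rather than the project's implementation.

```python
import re
from collections import Counter

# Toy word lists for illustration only; a real run would load the full lexicon.
POSITIVE = {"good", "great", "clean", "safe", "success", "hope"}
NEGATIVE = {"bad", "disaster", "spill", "toxic", "fail", "damage"}

def tweet_polarity(text):
    """Label one tweet by comparing counts of positive and negative words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

def sentiment_shares(tweets):
    """Return the share of positive, negative and neutral tweets in a list."""
    counts = Counter(tweet_polarity(t) for t in tweets)
    total = len(tweets)
    labels = ("positive", "negative", "neutral")
    if total == 0:
        return {label: 0.0 for label in labels}
    return {label: counts[label] / total for label in labels}

community = ["toxic spill", "great hope", "toxic damage", "meeting today"]
print(sentiment_shares(community))
```

Applied to the tweets of an influencer's followers, the dominant share gives a coarse label such as negative or neutral sentiment influence.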

Evaluation - Clustering, Ranking and Authority
  Clustering: we randomly sample users from the "good" and "bad" clusters and ask people how relevant their tweets are to the topic.
  Ranking: people assign a rating (1 to 5) to each of the top k Twitter users in our ranking, and we compute NDCG to compare those ratings against the order our ranking produced.
  Authority: with a final survey, we plan to ask people to rate the authoritativeness of the top k users in our ranking, using both anonymized and non-anonymized tweets.
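
The NDCG comparison could be computed along the following lines. This is a sketch under assumptions: the ratings are listed in the order our ranking returned the users, the gain is the raw 1-to-5 rating (an exponential gain is another common choice), and `ndcg_at_k` is a name chosen for this example.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain with a log2 position discount."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg_at_k(relevances, k):
    """NDCG@k for ratings listed in the order the system ranked the users."""
    top = relevances[:k]
    ideal = sorted(relevances, reverse=True)[:k]  # best possible ordering
    ideal_dcg = dcg(ideal)
    return dcg(top) / ideal_dcg if ideal_dcg > 0 else 0.0

# Survey ratings (1 to 5) for the users at ranks 1, 2, 3 of a hypothetical ranking.
print(round(ndcg_at_k([3, 2, 3], k=3), 4))
```

A value of 1.0 means the survey ratings are already in non-increasing order along the ranking; lower values mean highly rated users were placed too far down.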

Evaluation

Conclusion
  While the baseline returned more authorities that seemed generic, such as news Twitter accounts, our results show more topical authorities.
  We have also analyzed the sentiment influence of the top authorities, which has further applications in formulating better marketing strategies for products and in influencing consumers.
  We plan to include evaluation results in our final report, and to improve the features related to the follower-following graph.

References
[1] Pal, Aditya, and Scott Counts. "Identifying topical authorities in microblogs." Proceedings of the fourth ACM international conference on Web search and data mining. ACM, 2011.
[2] Weng, Jianshu, et al. "TwitterRank: finding topic-sensitive influential twitterers." Proceedings of the third ACM international conference on Web search and data mining. ACM, 2010.
[3] Cha, Meeyoung, et al. "Measuring User Influence in Twitter: The Million Follower Fallacy." ICWSM, 2010.
[4] Yoshida, M., and Y. Yamaguchi. "Interactive Tagging Networks (Following/Followers and Tags on 1 million Twitter Users)" [Data set]. Zenodo, 2015. http://doi.org/10.5281/zenodo.16267
[5] Page, Lawrence, et al. "The PageRank citation ranking: bringing order to the web." 1999.
[6] Bishop, Christopher M. Pattern Recognition and Machine Learning. Springer, 2006.

Baseline Results
  NWF
  TIME
  Huffingtonpost
  NOLAnews
  Reuters
  CBSNews
  LATenvironment
  kate_sheppard
  MotherNatureNet
  mparent77772