
1 Towards Passive Political Opinion Polling using Twitter. Nicholas Thapen, Imperial College London; Moustafa Ghanem, Middlesex University. BCS SGAI Social Media Analytics Workshop, Cambridge, UK, 10th Nov 2013

2 Introduction. Hundreds of millions of tweets are sent daily. Unlike other social media, tweets are mainly on the public record. 66% of U.S. social media users engage in some political behaviour (Pew Research poll, 2013). Twitter is used extensively for political discussion. Most politicians have Twitter accounts and use them to communicate their viewpoints.

3 07/09/2014 Slide 3

4 Motivation I: Towards Passive Opinion Polling. Opinion polling is expensive, takes a while to set up, and is prone to framing effects. Can we use Twitter data to infer –voting intention and/or –public concern about particular issues? Can we use it as a useful adjunct to polls, one that potentially avoids framing effects? What are the challenges?

5 Motivation II (Side Effect): What are politicians discussing? What are the dynamics of political discussion on Twitter? –What are politicians discussing, and is it in line with what the public sees as the important issues? –What are the standpoints of different politicians on key issues? Are they in line with their party line? –What are politicians from one party saying about other parties? These are all interesting questions that we did not initially have in mind, but that came as a side effect of collecting and analysing our data.

6 Related Work. Quite a few studies on predicting elections using Twitter: –Mostly measuring the volume of tweets mentioning parties or candidates and extrapolating to vote shares –Some use sentiment and sentiment-weighted tweet volumes –Widely varying accuracies reported in the literature (1.65% to 17%)!! –Different assumptions about collecting data, with most analysis being retrospective (potentially biased?) –A lack of methodologies for how we should collect and analyse Twitter data and evaluate results. Not many studies look at –Validating the methodology on a regular basis –Conducting other forms of opinion polling with Twitter, especially inferring the top issues that would influence voting behaviour.

7 Personal Experience Prior to this Work. Near real-time analysis of the 1st ever televised Egyptian presidential debate. Collecting and analysing tweets on the day of the 1st stage of the Egyptian presidential elections in 2011.

8 Egyptian Televised Debate. The volume of tweets is not necessarily a good measure of popularity: Moussa (green) made a mistake (calling Iran an Arab country), which caused a flurry of sarcastic comments.

9 Topic Analysis

10 Debate Sentiment. We could only infer that people were becoming more opinionated about both candidates (with caveats on how sentiment is calculated).

11 Real-time Analysis of the 1st Ever Televised Egyptian Presidential Debate. The purple candidate (H. Sabahi) gained increasing mentions as the debate progressed.

12 1st Stage Egyptian Election. Looks right for most candidates, but massive errors for two of them!!

13 1st Stage Analysis from Jadaliyya (presidential-elections-and-twitter-talk)

14 Seems like Voodoo to Me! How do we collect data? How much data? What are we measuring? How are we measuring? How do we validate one set of results? How do we validate the methodology? We may disappoint you by not providing concrete answers to the above, but let's try to understand some of the challenges.

15 Approach: UK Politics. Establish ground truth from traditional polls for a specific (historical) period: –Voting intentions –Key political issues the public is concerned about. Collect and analyse sample Twitter data for the same period: –Infer voting intention using different approaches –Infer key issues discussed using different approaches. Compare inferred results with the ground truth: –Understand sources of error –Understand the challenges better.

16 Data – Polls (Ground Truth). Collected the list of voting intention polls compiled by the U.K. Political Report website (all polls from June 2012 to June 2013). Also accessed the Ipsos MORI Issues Index archive: a monthly survey conducted since 1974 that asks an unprompted question about issues of concern to the public.

17 Ipsos MORI Categories. To keep our analysis tractable, we consolidated the most frequent issues appearing in the poll data over the past year into 14 main categories, plus an 'Other' category intended to catch all remaining issues. The categories chosen are: Crime, Economy, Education, Environment, EU, Foreign Affairs, Government Services, Health, Housing, Immigration, Pensions, Politics, Poverty, Unemployment.
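A category scheme like this is, in effect, a keyword-to-topic mapping plus a catch-all. A minimal sketch of how such a mapping might be represented and applied is below; only the Health keywords come from the slides (slide 28), the other seed lists are hypothetical examples.

```python
# Seed keyword lists per category. Only the Health list is taken from the
# slides; the Economy and Immigration seeds are hypothetical illustrations.
CATEGORIES = {
    "Health": ["health", "nhs", "doctors", "nurses", "hospitals"],
    "Economy": ["economy", "deficit", "gdp"],
    "Immigration": ["immigration", "migrants"],
}

def categorise(tweet: str) -> str:
    """Return the first category whose keywords appear in the tweet,
    else the catch-all 'Other'."""
    words = set(tweet.lower().split())
    for category, keywords in CATEGORIES.items():
        if words & set(keywords):
            return category
    return "Other"
```

Tweets matching no category fall into 'Other', which is how the scheme stays exhaustive without forcing every tweet into one of the 14 topics.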

18 Challenges in Collecting Twitter Data (Real-time vs. Historical Data). Ideally we want historical data. The Twitter Streaming API allows collecting data based on queries (forward collection from when you start collecting); it puts limits on the number of tweets collected, but this is usually not a problem. It is difficult to collect historical Twitter data because of Twitter API limitations: –Not possible to query data older than 3 days –Can retrieve older data by tweet ID only –Can retrieve only up to 3,200 tweets from a known user's timeline. Otherwise you have to use a commercial historical data archive, which is not cheap!

19 Challenges in Collecting Twitter Data (Representative Sample?). How do we collect a representative Twitter sample? Ideally, we should query by UK geo-location: –It would allow considering the geographic distribution of voting intention and key issues. –However: not many people tag their location, and those who do may be a biased sample; it can generate too much data; and we cannot be sure whether they are voters (or even interested in politics) or tourists, etc.

20 Data – Twitter. Two data sets were collected using Twitter's REST API v1.1. Data Set 1 (MP data set): –All available tweets sent by 423 U.K. MPs as of June 2013 (689,637 in total); a list of MP accounts is available online. –This data set was mainly used to analyse political language on Twitter, build and evaluate topic classifiers, and evaluate sentiment analysis. Data Set 2: –All available tweets from a sample of 600 users who had mentioned the names of party leaders (filtered to exclude MPs, news accounts, …). –This sample was adopted as one representing political discussion on Twitter (1,431,348 tweets in total), collected in August 2013.

21 Other Issues/Challenges. How do we infer topic mappings for the Ipsos MORI categories? –Semi-supervised iterative keyword classifier –Bayesian topic classifier. How do we infer voting intention? –Tweet volume mentioning each party –Sentiment-weighted tweet volume (using an open-access sentiment tool, SentiStrength)

22 Evaluating Sentiment Analysis (SentiStrength). Using the MP data set, we extracted tweets where political parties are mentioned (names, nicknames, shortenings, etc.). We excluded tweets mentioning more than one party, producing a data set of 48,140 tweets (Labour: 23,070; Conservative: 18,034; Liberal Democrats: 7,036). We then applied SentiStrength (mainly based on keyword lookups).
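A keyword-lookup sentiment scorer of the kind the slide describes can be sketched in a few lines. This is a toy illustration in the spirit of SentiStrength, not its actual lexicon or algorithm: the real tool uses a large curated lexicon plus handling for boosters, negation and emoticons, and the lexicon entries here are hypothetical.

```python
# Toy keyword-lookup sentiment scorer (illustrative only; the lexicon
# entries and weights below are hypothetical, not SentiStrength's).
LEXICON = {"great": 2, "good": 1, "brilliant": 2,
           "bad": -1, "terrible": -2, "failing": -2}

def sentiment(tweet: str) -> int:
    """Sum lexicon scores over tokens: >0 positive, <0 negative, 0 neutral."""
    return sum(LEXICON.get(tok, 0) for tok in tweet.lower().split())
```

The purely additive scheme is what makes such tools fast but also what makes them brittle on sarcasm, which is exactly the failure mode seen in the Egyptian debate data.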

23 Evaluation Using Annotated Tweets. 30 tweets were selected at random from each of the nine groups and manually annotated as 'Positive' or 'Negative' based on the language used and the sense meant. The results are far from perfect.

Class     Precision  Recall  F1 Measure
Negative  0.583      0.483   0.528
Positive  0.651      0.737   0.691
Overall   0.617      0.610   0.614

24 Sentiment Analysis amongst MPs. Despite the inaccuracies (low precision and recall), when looking at how MPs from each party mention the other parties, the results still make sense and could be usable.

25 Voting Intention – Tweet Volume. The first experiment looked at tweet volumes mentioning each political party in July 2013. The proportion of tweets mentioning each party is compared to the polling average for that month.
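The volume-based measure is the naive extrapolation the related-work slide mentions: each party's "vote share" is simply its share of party mentions. A minimal sketch (party name matching here is plain substring search on lowercased text, an assumption; the slides do not give the matching rules):

```python
from collections import Counter

def volume_shares(tweets, party_names):
    """Proportion of party mentions per party, used as a proxy vote share.
    A tweet mentioning several parties counts once for each (a simplification)."""
    counts = Counter()
    for tweet in tweets:
        text = tweet.lower()
        for party in party_names:
            if party in text:
                counts[party] += 1
    total = sum(counts.values())
    return {party: counts[party] / total for party in party_names}
```

Normalising over mentions rather than over all tweets means parties that are rarely discussed simply drop toward zero, which is one reason the time series on slide 27 is so volatile.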

26 Voting Intention (Sentiment-Weighted)
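The sentiment-weighted variant combines the volume counts with SentiStrength-style scores. The slides do not state the exact weighting formula, so the sketch below uses one plausible choice, counting only positive-sentiment mentions per party; treat the scheme itself as an assumption.

```python
def sentiment_weighted_shares(scored_tweets, party_names):
    """One plausible sentiment weighting (the exact formula is not given in
    the slides): each party's score is its count of positive-sentiment
    mentions, normalised across parties.
    scored_tweets: iterable of (text, sentiment_score) pairs."""
    positive = {party: 0 for party in party_names}
    for text, score in scored_tweets:
        lowered = text.lower()
        for party in party_names:
            if party in lowered and score > 0:
                positive[party] += 1
    total = sum(positive.values()) or 1   # avoid division by zero
    return {party: positive[party] / total for party in party_names}
```

Other schemes in the literature use the ratio of positive to negative mentions instead; given the roughly 60% F1 of the sentiment tool, either weighting inherits substantial noise.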

27 Voting Intention Over Time. Applying the same methodology over time reveals huge variability. Due to the data collection methodology, we have far fewer tweets from the months prior to July 2013.

28 Topic Detection in Political Tweets – Snowballing Keyword Classifier. Initial keywords for each of the 14 topics were manually compiled by consulting Wikipedia (e.g. Health: {Health, NHS, Doctors, Nurses, Hospitals, …}). Initial keyword matching found topic labels for around 1/6 of the MP tweet data set. Snowballing: –To extend the keyword set, we looked for 1-, 2- and 3-grams that were frequent inside each topic class but rare outside it. –Fisher's test was used to calculate a score and sort the list, which was then manually scanned for appropriate keywords. –Over 5 iterations this increased the proportion of tweets categorised from 16% to 30%.
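The candidate-generation step above can be sketched as follows: count 1-, 2- and 3-grams inside and outside a topic, then rank terms that are common inside but rare outside. For brevity this sketch ranks by a naive inside/outside frequency ratio rather than the Fisher's-test score the slides use; the ranking statistic is the part that differs from the actual method.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def candidate_terms(topic_tweets, other_tweets, max_n=3):
    """Rank 1- to 3-grams by how much more frequent they are inside the
    topic than outside it. Candidates at the top of the list would then be
    manually scanned, as on the slide. (Naive ratio score, not Fisher's test.)"""
    inside, outside = Counter(), Counter()
    for tweets, counter in ((topic_tweets, inside), (other_tweets, outside)):
        for tweet in tweets:
            tokens = tweet.lower().split()
            for n in range(1, max_n + 1):
                counter.update(ngrams(tokens, n))
    # +1 in the denominator keeps unseen-outside terms finite.
    return sorted(inside, key=lambda t: inside[t] / (1 + outside[t]), reverse=True)
```

The manual scan of the sorted list is what keeps the snowballing "semi-supervised": a purely automatic threshold would quickly drift off-topic.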

29 Iterative extension of keyword set

30 Iterative Extension – Detail. Suppose we are determining whether 'NHS' is a good keyword for the 'Health' topic. We build a 2×2 contingency table of the frequency of 'NHS' vs. other words, in the tweets tagged as 'Health' vs. the rest. Fisher's exact test reveals how unlikely this contingency table is to arise by chance. The more unlikely, the more significant the difference between the keyword's frequency inside and outside the 'Health' topic.
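A one-sided Fisher's exact test on such a 2×2 table can be computed directly from the hypergeometric distribution. The sketch below takes the table as [[a, b], [c, d]], where a = 'Health' tweets containing the candidate term, b = 'Health' tweets without it, c = other tweets with it, d = other tweets without it (this cell layout is my assumption about the slide's table):

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """One-sided (enrichment) Fisher's exact p-value for the 2x2 table
    [[a, b], [c, d]]: probability of seeing `a` or more in the top-left
    cell given fixed row and column totals. Smaller p = stronger evidence
    the term is concentrated inside the topic."""
    n = a + b + c + d
    row1, col1 = a + b, a + c
    p = 0.0
    for k in range(a, min(row1, col1) + 1):
        # Hypergeometric probability of exactly k successes.
        p += comb(row1, k) * comb(n - row1, col1 - k) / comb(n, col1)
    return p
```

In practice one would use scipy.stats.fisher_exact, but the hand-rolled version makes the "how unlikely is this table" intuition concrete.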

31 Effect on individual categories

32 Evaluation against a manually annotated sample of 300 tweets

33 Bayesian Topic Detection. A bag-of-words Bayesian classifier for each topic. Training data = the output classes from the keyword-matching stage. Evaluated against a manually annotated sample of 300 tweets, achieving a 78% F1 measure.

34 Bayesian Topic Detection 2. Basic Bayesian equation (the slide's formula, reconstructed as standard naive Bayes): P(class | tweet) ∝ P(class) × ∏ᵢ P(wordᵢ | class). In our case the features are word counts in the tweets. Priors were calculated by sampling and annotating 300 tweets.
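A minimal multinomial naive Bayes over word counts, trained on keyword-labelled tweets as the slides describe, can be written with the standard library alone. This is a sketch of the technique, not the authors' code: Laplace smoothing and the empirical class priors are standard choices, assumed here rather than taken from the slides.

```python
from collections import Counter, defaultdict
from math import log

class NaiveBayesTopics:
    """Minimal multinomial naive Bayes over bag-of-words counts,
    with Laplace (add-one) smoothing. Illustrative sketch only."""

    def fit(self, tweets, labels):
        self.word_counts = defaultdict(Counter)   # per-class word counts
        class_counts = Counter(labels)
        for tweet, label in zip(tweets, labels):
            self.word_counts[label].update(tweet.lower().split())
        total = sum(class_counts.values())
        self.log_prior = {c: log(n / total) for c, n in class_counts.items()}
        self.vocab = {w for c in self.word_counts for w in self.word_counts[c]}
        return self

    def predict(self, tweet):
        def score(c):
            counts = self.word_counts[c]
            denom = sum(counts.values()) + len(self.vocab)   # smoothing denominator
            return self.log_prior[c] + sum(
                log((counts[w] + 1) / denom)
                for w in tweet.lower().split() if w in self.vocab)
        return max(self.log_prior, key=score)
```

Working in log space avoids underflow when multiplying many small word probabilities, and ignoring out-of-vocabulary words keeps the classifier usable on unseen tweets.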

35 Precision, Recall and F-Measure against 300 manually annotated tweets

Classifier                          Precision  Recall  F1 Measure
Keyword matching (5th iteration)    0.287      0.279   0.283
Bayesian on non-stemmed data        0.753      0.793   0.773
Bayesian on stemmed data            0.764      0.798   0.781

36 Evaluation of Topic Distribution

37 Comparison with Polls

38 What different parties are saying

39 Smoothing Tweet Frequency Data. Daily tweet data is very 'bursty' – it needs smoothing for effective comparison.

40 Comparison Over Time. 30-day smoothing was applied, and both time series were mean-centred and divided by their standard deviation (for easier comparison).
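The two preprocessing steps on this slide are a moving average followed by z-score standardisation. A minimal sketch (a trailing window is assumed here; the slides do not say whether the 30-day window was trailing or centred):

```python
from statistics import mean, stdev

def smooth(series, window=30):
    """Trailing moving average over `window` points, shrinking at the start.
    Tames the burstiness of daily tweet counts."""
    return [mean(series[max(0, i - window + 1):i + 1])
            for i in range(len(series))]

def standardise(series):
    """Mean-centre and divide by the standard deviation, so two series of
    very different magnitudes (tweets vs. poll percentages) can be overlaid."""
    mu, sigma = mean(series), stdev(series)
    return [(x - mu) / sigma for x in series]
```

Standardising both series puts the tweet counts and the poll figures on a common scale, so the comparison is of shape over time rather than of absolute level.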

41 Comparison over time for different topics

42 Not always a good match

43 Further Work. A naïve Bayesian sentiment analyser was developed using partisan tweets as training data for positive and negative political terms. When the initial implementation was tested against SentiStrength over 300 annotated tweets, it improved the F-measure from 61% to 68%.

44 Summary and Conclusions. Overall, promising results from comparing Twitter data with opinion polling on both voting intention and topics of concern. A lot more to do! Issues of sampling, demographic weighting and finding appropriate analysis techniques remain.

45 Summary and Conclusions. How do we collect data? How much data? What are we measuring? How are we measuring? How do we validate one set of results? How do we validate the methodology?
