Social Psychological Analysis of Public Political Comments on Facebook


1 Social Psychological Analysis of Public Political Comments on Facebook
Márton Miháltz

2 TrendMiner Overview
What kinds of social and political trends appear in Hungarian comments on political Facebook posts?
Facebook in Hungary: 4.27M registered users = 59.2% of internet users, 43% of total population
Download all public comments from the Facebook pages of Hungarian politicians and parties
Analysis of comments: basic NLP (tokenization, PoS tagging, stemming), domain-adapted; entities: political actors (people, organizations); sentiment; social psychology dimensions: agency/communion, individualism/collectivism, optimism/pessimism, primordial/conceptual thinking
In cooperation with the Narrative Psychology Research Group, Hungarian Academy of Sciences

3 Data Acquisition
Get comments via the Facebook Graph API: 1.9M comments for 141K posts ( – ) from 1344 pages
Organizations: parties, regional and associated branches
People: candidates and elected representatives (MPs), government, party officials
Official and fan pages
In 3 categories: Hungarian parliament; Hungarian parliament elections 2014 (6th April); EU parliament elections 2014 (25th May)
Sources: valasztas.hu, wikipedia.hu
Everything is stored in a MySQL database for arbitrary queries (political groups, time etc.)

4 Data model
Fb_pages: Id, URL, Page title; Type: person or organization; Affiliated party (3 campaigns)
Fb_posts, Fb_comments: Id, Created_timestamp, Message text, Author_user_id
Comments_annotations: Sentence_id, Start_token, End_token index; Annotated text, Lemmatized_annotated_text, Annotation_tag
Fb_comments_scores: 16 scores and counts (sentiment, RID, agency, communion, optimism, …)
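The kind of "arbitrary query" this data model allows can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 as a stand-in for the MySQL database, the column types, the page_id/post_id foreign keys and the single affiliated_party column are assumptions, and the inserted rows are placeholder data, not project data.

```python
import sqlite3

# Hypothetical schema: table and column names follow the slide,
# but types, keys and the page_id/post_id links are assumptions.
SCHEMA = """
CREATE TABLE fb_pages (
    id TEXT PRIMARY KEY,
    url TEXT,
    page_title TEXT,
    type TEXT CHECK (type IN ('person', 'organization')),
    affiliated_party TEXT
);
CREATE TABLE fb_posts (
    id TEXT PRIMARY KEY,
    page_id TEXT REFERENCES fb_pages(id),
    created_timestamp TEXT,
    message_text TEXT,
    author_user_id TEXT
);
CREATE TABLE fb_comments (
    id TEXT PRIMARY KEY,
    post_id TEXT REFERENCES fb_posts(id),
    created_timestamp TEXT,
    message_text TEXT,
    author_user_id TEXT
);
"""


def comments_per_party(conn):
    """Count comments grouped by the party affiliation of the hosting page."""
    rows = conn.execute(
        """
        SELECT p.affiliated_party, COUNT(c.id)
        FROM fb_comments c
        JOIN fb_posts o ON c.post_id = o.id
        JOIN fb_pages p ON o.page_id = p.id
        GROUP BY p.affiliated_party
        ORDER BY p.affiliated_party
        """
    ).fetchall()
    return dict(rows)


def demo():
    """Build an in-memory database with placeholder rows and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO fb_pages VALUES (?, ?, ?, ?, ?)", [
        ("pg1", "https://example.org/a", "Page A", "organization", "PartyA"),
        ("pg2", "https://example.org/b", "Page B", "person", "PartyB"),
    ])
    conn.executemany("INSERT INTO fb_posts VALUES (?, ?, ?, ?, ?)", [
        ("po1", "pg1", "2014-04-01T10:00:00", "post text", "u0"),
        ("po2", "pg2", "2014-04-02T10:00:00", "post text", "u0"),
    ])
    conn.executemany("INSERT INTO fb_comments VALUES (?, ?, ?, ?, ?)", [
        ("c1", "po1", "2014-04-01T11:00:00", "comment text", "u1"),
        ("c2", "po1", "2014-04-01T12:00:00", "comment text", "u2"),
        ("c3", "po2", "2014-04-02T11:00:00", "comment text", "u1"),
    ])
    return comments_per_party(conn)
```

Filtering by time would add a WHERE clause on created_timestamp to the same join.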

5 Hungarian Political Ontology
Extending the TrendMiner (TM) multilingual political ontology: 8 new classes, 3+3 new object/data properties, new instances (1 Country, 18 Party, 661 Politician, 899 Nomination)
Nominated and elected MPs (2010 Hu. Parl., 2014 Hu. Parl., 2014 EU Parl.), nominating parties
Names, abbreviated names, nicknames, Facebook page URLs etc.

6 Hungarian Political Ontology
Example: Benedek Jávor was a member of the Hungarian Parliament during (nominated by LMP) and a member of the European Parliament from 2014 (nominated by EGYÜTT-PM).

7 Processing Pipeline
Downloading (Fb Graph API, Python script)
Tokenization (huntoken tool)
PoS-tagging (hunpos tool)
Morphological analysis (hunmorph tool)
Stem+analysis disambiguation (Python script)
Content analysis (Java NooJ)
Scoring & storage in DB
Uploading in RDF to TM Integration Server

8 Domain Adaptation
Problem: existing NLP tools were developed on a different domain and fail on social media language (Facebook comments)
Survey corpus: 1.25M Facebook comments (29M tokens), containing 2.25M unknown tokens (694K types)
Frequency list of unknown tokens: items with f > 15 manually revised
Identify common problems
Lists of frequent, relevant unknown words, new words etc.
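The frequency-list step above can be sketched roughly as below. The function name, the lowercasing, and the representation of the analyzer's known vocabulary as a plain set are assumptions made for illustration; in the project, "unknown" presumably meant tokens the morphological analyzer could not analyze.

```python
from collections import Counter


def unknown_token_frequencies(tokens, known_words, threshold=15):
    """Return (token, frequency) pairs for unknown tokens with frequency > threshold.

    tokens: iterable of token strings from the survey corpus
    known_words: set of lowercase word forms the analyzer recognizes (assumed input)
    """
    counts = Counter(
        tok.lower() for tok in tokens if tok.lower() not in known_words
    )
    # most_common() sorts by descending frequency, which is the order
    # a human reviser would want to work through.
    return [(tok, n) for tok, n in counts.most_common() if n > threshold]
```

With the default threshold of 15 this keeps exactly the f > 15 items that were passed on for manual revision.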

9 Domain Adaptation: Tokenization
Huntoken tool. Frequent problems:
Missing spaces around punctuation: ... end of sentence.Beginning of another ...
Repeated punctuation: first part……. Second part
Contracted words (slang): asszem = azt hiszem (“I think”)
Consonant multiplication (interjections, onomatopoeic words etc.), e.g. pfffffffff, uffffff, ejjjjjjjj (pff(f*), uff(f*), ej(j*))
Tokenizer wrongly splits large numbers by decimal groups, splits URLs, and splits emoticons (e.g. : D)
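Some of these problems can be handled by a normalization pass before tokenization. The sketch below is not the project's code: the rule order, the choice to collapse repeated punctuation to a single mark, and the uppercase check used to spot glued sentences are all assumptions. Only the contraction and the three interjection patterns come from the slide.

```python
import re

# Contraction list: only "asszem" comes from the slide; a real list would be longer.
CONTRACTIONS = {"asszem": "azt hiszem"}

# (canonical form, repeatable final consonant), following pff(f*), uff(f*), ej(j*).
INTERJECTIONS = [("pff", "f"), ("uff", "f"), ("ej", "j")]
INTERJECTION_RES = [
    (re.compile(r"\b" + re.escape(base) + re.escape(ch) + r"*\b", re.IGNORECASE), base)
    for base, ch in INTERJECTIONS
]

# A run of the same sentence-punctuation character, e.g. "!!!" or "……".
REPEATED_PUNCT = re.compile(r"([.!?…])\1+")
# letter + punctuation + letter with no space in between.
GLUED_SENTENCES = re.compile(r"([^\W\d_])([.!?…])([^\W\d_])")


def _split_glued(match):
    before, punct, after = match.groups()
    # Only split when the next letter is uppercase, so "example.com" stays intact.
    if after.isupper():
        return before + punct + " " + after
    return match.group(0)


def normalize_comment(text):
    """Apply the normalization rules in a fixed (assumed) order."""
    text = REPEATED_PUNCT.sub(r"\1", text)
    text = GLUED_SENTENCES.sub(_split_glued, text)
    for pattern, base in INTERJECTION_RES:
        # Note: the canonical form is lowercase, so case is not preserved here.
        text = pattern.sub(base, text)
    for short, full in CONTRACTIONS.items():
        text = re.sub(r"\b" + re.escape(short) + r"\b", full, text, flags=re.IGNORECASE)
    return text
```

For instance, normalize_comment("asszem ez jó.Nem baj!!! pfffffff") gives "azt hiszem ez jó. Nem baj! pff". Protecting URLs, large numbers and emoticons from being split would need additional rules in the tokenizer itself.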

10 Domain Adaptation: PoS/stemming
Hunpos tagger + hunmorph analyzer + stemming script. Frequent problems:
Unknown words (no lemma/PoS): add to hunmorph analyzer’s lexicon using analogous words (morphological paradigm); compounds, abbreviations, acronyms, slang words etc.
Frequently misspelled word forms: replace with correct forms
Wrong capitalization, e.g. SENTENCES IN ALL CAPS
Missing accent characters: disambiguation model needed, e.g. kor (age), kór (disease), kör (circle)
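The missing-accent problem can be made concrete with a candidate generator. This sketch only enumerates accented variants of a word and keeps those found in a lexicon; it does not choose between them, which is the job of the disambiguation model the slide calls for. The function name and the lexicon-as-set interface are assumptions.

```python
from itertools import product

# Hungarian vowels and their accented counterparts.
ACCENT_VARIANTS = {
    "a": "aá",
    "e": "eé",
    "i": "ií",
    "o": "oóöő",
    "u": "uúüű",
}


def accent_candidates(word, lexicon):
    """Return all lexicon words obtainable by adding accents to `word`.

    The number of variants grows exponentially with the number of vowels,
    so this is only suitable as an illustration for short words.
    """
    options = [ACCENT_VARIANTS.get(ch, ch) for ch in word.lower()]
    variants = {"".join(chars) for chars in product(*options)}
    return sorted(variants & set(lexicon))
```

Given a lexicon containing kor, kór and kör, the unaccented input "kor" yields all three, and context must decide which was meant.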

11 NooJ, Java NooJ, NooJ-Cmd
Open source version of NooJ: define and run finite state machines for querying, annotation etc. (morphology, syntax)
NooJ-Cmd extension (open source): all NooJ GUI features => command line options
NooJ grammars (FSMs) for annotation: actors (entities), emotional valence (sentiment polarity), Regressive Imagery Dictionary, agency-communion, optimism-pessimism, individualism-collectivism

12 Development of NooJ Grammars
In collaboration with social psychology researchers: Social Psychology Department, Eötvös Loránd University, Budapest; Narrative Psychology Research Group, Hungarian Academy of Sciences
Development corpus: 176K sample Facebook comments from 570 pages (4.9M tokens); NLP annotation; frequency lists (lemmas, lemmas+PoS, lemmas+morphological info etc.)
Development: content words with f > 100 from the development corpus (3500 types); 7 independent annotators; items where >= 4 annotators agree go to manual revision
Compile into NooJ grammar with polarity shifters, items to be excluded etc.
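The agreement filter (at least 4 of 7 annotators) can be sketched as follows. The data layout is an assumption: each word maps to the list of labels given by the annotators, with None meaning the annotator assigned no category.

```python
from collections import Counter


def agreed_items(votes, min_agree=4):
    """Select words whose most frequent label was chosen by >= min_agree annotators.

    votes: dict mapping word -> list of labels (None = no category assigned)
    Returns dict mapping word -> agreed label, to be passed on to manual revision.
    """
    selected = {}
    for word, labels in votes.items():
        counts = Counter(label for label in labels if label is not None)
        if not counts:
            continue  # nobody assigned a category
        label, count = counts.most_common(1)[0]
        if count >= min_agree:
            selected[word] = label
    return selected
```

With 7 annotators a threshold of 4 is a simple majority, so at most one label per word can pass.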

13 1. Political Actors (NEs)
Maxent NE tool (huntag): low performance on domain; trained on standard language news texts; miscategorization, false positive NEs, entity boundary recognition problems
NooJ grammar/lexicon for TrendMiner:
Person names: family_name (given_name_lemmatized)? | frequent_nicknames …
Organization names: standard_form | abbreviated_forms … | nicknames …
Created automatically (names from DB) + manually (nicknames from frequency lists)
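The person-name rule above, family_name (given_name)? | nicknames, can be approximated with a regular expression. This is a simplified stand-in for the NooJ grammar: it matches surface strings rather than lemmas, so inflected name forms are missed. The name "Minta János" and the nickname "Jani" are made-up placeholders, not entries from the project's lexicon.

```python
import re


def build_person_pattern(family_name, given_name, nicknames=()):
    """Compile a regex for: family_name (given_name)? | nickname_1 | nickname_2 ..."""
    core = re.escape(family_name) + r"(?:\s+" + re.escape(given_name) + r")?"
    alternatives = [core] + [re.escape(n) for n in nicknames]
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b")


def find_mentions(text, patterns):
    """Return (start_offset, entity_id, matched_text) tuples sorted by position.

    patterns: dict mapping entity_id -> compiled pattern
    """
    mentions = []
    for entity_id, pattern in patterns.items():
        for match in pattern.finditer(text):
            mentions.append((match.start(), entity_id, match.group(0)))
    return sorted(mentions)
```

Hungarian puts the family name first, which is why the optional given name follows it in the pattern.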

14 2. Emotional Valence
Emotions with positive or negative polarity
Polarity in context: recognize negation using simple rules
Nouns, adjectives, verbs, adverbs, emoticons, multi-word expressions
500 positive, 420 negative entries
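A simple negation rule of the kind mentioned above might look like this. The slide does not spell out the actual rules, so the window size, the negator list and the tiny lexicon used in the example are assumptions for illustration only.

```python
def valence_counts(lemmas, lexicon, negators=("nem", "sem"), window=1):
    """Count positive and negative hits, flipping polarity after a negator.

    lemmas: list of lemmatized tokens of one comment
    lexicon: dict mapping lemma -> +1 (positive) or -1 (negative)
    window: how many preceding tokens to check for a negator (assumed rule)
    Returns (positive_count, negative_count).
    """
    positive = negative = 0
    for i, lemma in enumerate(lemmas):
        polarity = lexicon.get(lemma)
        if polarity is None:
            continue
        preceding = lemmas[max(0, i - window):i]
        if any(tok in negators for tok in preceding):
            polarity = -polarity
        if polarity > 0:
            positive += 1
        else:
            negative += 1
    return positive, negative
```

With a lexicon of {"jó": 1, "rossz": -1}, the sequence "ez nem jó" ("this is not good") counts as one negative hit instead of one positive.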

15 3. Regressive Imagery Dictionary
Martindale (1975, 1990): uncover psychological processes reflected in the text
2 basic categories of thinking:
Primordial (primary): associative, concrete, and takes little account of reality (fantasy, dreams)
Conceptual (secondary): abstract, logical, reality oriented, aimed at problem solving
7+29 more subcategories (social behavior, cognition, perceptions, sensations etc.)
Hungarian version by Pólya and Szász: 3000+ terms

16 4. Agency/Communion
2 fundamental dimensions of social values:
Communion: moral and emotional aspects of an individual’s relations to others (affection, expressiveness, cooperation, social benefit etc.)
Agency: efficiency of an individual’s goal-oriented behavior (motivation, competence, control)
Positive or negative for both dimensions
Context dependent (e.g. negation)
640 expressions

17 5. Optimism/Pessimism
Based on PoS and morphology annotations + time expressions
2 measures:
1. |future_tense_verbs| / (|present_tense_verbs| + |past_tense_verbs|)
2. |present_tense_verbs| / |past_tense_verbs|
Both correlate with degree of optimism
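The two ratios can be computed directly from per-verb tense labels. In this sketch the label strings "future", "present" and "past" are assumed stand-ins for whatever the morphological annotation provides, and returning None for a zero denominator is a design choice not taken from the slides.

```python
def optimism_measures(verb_tenses):
    """Compute the two optimism ratios from a list of tense labels.

    verb_tenses: list with one of "future", "present", "past" per verb
    Returns (measure_1, measure_2); a measure is None if its denominator is 0.
    """
    future = verb_tenses.count("future")
    present = verb_tenses.count("present")
    past = verb_tenses.count("past")

    # 1. |future| / (|present| + |past|)
    measure_1 = future / (present + past) if (present + past) else None
    # 2. |present| / |past|
    measure_2 = present / past if past else None
    return measure_1, measure_2
```

Short comments often contain no past-tense verb at all, so the undefined case is common and has to be handled before averaging scores over a group of comments.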

18 6. Individualism/Collectivism
Based on PoS and morphology annotations
1 measure: |personal pronouns| / (|verbs with personal inflection| + |nouns with possessive inflection|)
Higher score: higher degree of individualism
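The measure can be sketched over annotated tokens as below. The token representation (a dict with "pos" and "feats") and the tag names PRON, VERB, NOUN, "personal", "person" and "possessive" are hypothetical; the real annotation scheme comes from the hunpos/hunmorph output and will differ.

```python
def individualism_score(tokens):
    """Ratio of personal pronouns to person-inflected verbs plus possessed nouns.

    tokens: list of dicts like {"pos": "VERB", "feats": {"person"}}
            (hypothetical annotation format)
    Returns None when the denominator is 0.
    """
    pronouns = sum(
        1 for t in tokens if t["pos"] == "PRON" and "personal" in t["feats"]
    )
    inflected_verbs = sum(
        1 for t in tokens if t["pos"] == "VERB" and "person" in t["feats"]
    )
    possessed_nouns = sum(
        1 for t in tokens if t["pos"] == "NOUN" and "possessive" in t["feats"]
    )
    denominator = inflected_verbs + possessed_nouns
    return pronouns / denominator if denominator else None
```

The ratio is informative in Hungarian because person is already marked on verbs and possessed nouns, so an explicit personal pronoun is usually optional and adds emphasis.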

19 Visualisation

24 Dissemination and Exploitation
Presentations: Hungarian NLP Meetup, Sept , Budapest; conText, Nov , Budapest
Conference papers, presentations: 2 papers at 11th Conference on Hungarian Computational Linguistics (January , Szeged)
Source code
Project website ( : download political ontology; download 1.9M Facebook comments corpus (w/ annotations); project info, papers, presentation slides

25 Thank You!

