Textual sentiment in finance


1 Textual sentiment in finance
Text sources used in textual sentiment
Some sentiment methods
David Ling

2 Textual sentiment in finance - Sources
Many previous works (Kearney and Liu list about 40 papers)
Sources for stock prediction: 3 main types (Kearney and Liu 2014)
- Corporation-expressed sentiment
  - Corporate annual reports, earnings press releases, earnings conference calls
  - Li (2006), Feldman et al. (2008), Li (2010), Loughran and McDonald (2011), …
  - Twitter (Ranco et al. 2015)
- Media-expressed sentiment
  - Newspapers: The Wall Street Journal and The New York Times (Tetlock 2007)
  - Reuters NewsScope Sentiment Engine
- Internet-expressed sentiment
  - Messages posted on Yahoo! Finance (Antweiler 2004)

3 Textual sentiment in finance - Sources
Comments
- Corporate reports:
  - Directly related to the firm (firm-specific)
  - Not ideal for time-series modeling (low frequency: quarterly or annual)
  - Sources: HKEX, WRDS
- News:
  - Usually hindsight rather than foresight
  - More frequent; suitable for weekly or daily prediction
- Online comments and messages:
  - Little new information (incremental to public news)
  - Noisy and less reliable

4 Textual sentiment in finance - Sources
Hong Kong company annual reports are available on HKEX
- Hong Kong is one of the world's financial centers
- The data can be used for initial trials
- Easy to acquire and understand; the companies are more familiar (compared to WRDS)
- Annual report links are extracted for all companies (urllib + BeautifulSoup)
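
The link-extraction step can be sketched roughly as follows. This is a minimal illustration, not the presenter's script: it uses the standard library's `html.parser` as a stand-in for BeautifulSoup so that it runs without third-party packages, and the sample HTML, the `example.com` URLs, and the names `PdfLinkParser` and `extract_report_links` are all hypothetical. The real HKEX page structure is not described in the slides.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of <a href="..."> links that point to PDF files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Keep only links whose target looks like a PDF report.
        if href and href.lower().endswith(".pdf"):
            self.links.append(urljoin(self.base_url, href))


def extract_report_links(html, base_url):
    """Return all PDF links found in `html`, resolved against `base_url`."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical listing page. A real page would first be fetched,
# for example with urllib.request.urlopen(url).read().decode().
SAMPLE_HTML = """
<ul>
  <li><a href="/reports/2015/annual_0001.pdf">Annual Report 2015</a></li>
  <li><a href="/about.html">About</a></li>
  <li><a href="https://example.com/files/annual_0002.PDF">Annual Report 2014</a></li>
</ul>
"""

links = extract_report_links(SAMPLE_HTML, "https://example.com/listing")
```

Relative links are resolved with `urljoin`, so the result can be passed straight to a downloader.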

5 Textual sentiment in finance - Methods
Dictionary-based
- Detects keywords from a user-defined word list (bag-of-words)
- Common dictionaries:
  - Harvard psychosocial dictionary
  - Loughran and McDonald’s positive and negative financial word lists (Loughran and McDonald 2011)
  - DICTION (software)
Machine learning
- Neural network (Reuters NewsScope Sentiment Engine)
- SVM (Ranco et al. 2015)
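
As a rough illustration of the dictionary-based approach, the sketch below counts how many tokens of a text fall in a positive or a negative word list. The two word lists are tiny hypothetical stand-ins, not taken from any of the dictionaries named above, and the net score `(positive - negative) / tokens` is just one simple scoring choice assumed here.

```python
import re

# Tiny hypothetical word lists for illustration only; real studies would load
# a published dictionary instead.
POSITIVE = {"gain", "improve", "strong", "profit"}
NEGATIVE = {"loss", "decline", "weak", "litigation"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def dictionary_sentiment(text):
    """Count word-list hits and compute a simple net sentiment score."""
    tokens = tokenize(text)
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    total = len(tokens)
    # Assumed scoring rule: net hits as a share of all tokens.
    score = (pos - neg) / total if total else 0.0
    return {"positive": pos, "negative": neg, "tokens": total, "score": score}


result = dictionary_sentiment("Profit rose on strong sales, despite a loss in Asia.")
```

Because the method is a bag-of-words count, word order and negation ("not strong") are ignored, which is one reason the choice of dictionary matters so much.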

6 Dictionary-based: Harvard psychosocial dictionary
Available as HTML on the official website
Example categories: positive, negative, strong, weak, active, passive
Words are tagged with part of speech: noun, verb, adjective

7 Dictionary-based: Loughran and McDonald’s positive and negative financial word lists
Built from a financial point of view (more accurate for financial text)
Word lists: negative, positive, uncertainty, litigious, strong modal, and weak modal
Word lists downloaded from the official site
WRDS provides word-list counts

8 Dictionary-based: DICTION
Dictionary-based text-analysis software, using 33 dictionaries
- Certainty – resoluteness, inflexibility, and completeness and a tendency to speak ex cathedra.
- Activity – movement, change, the implementation of ideas and the avoidance of inertia.
- Optimism – endorsing some person, group, concept or event, or highlighting their positive entailments.
- Realism – describing tangible, immediate, recognizable matters that affect people’s everyday lives.
- Commonality – highlighting the agreed-upon values of a group and rejecting idiosyncratic modes of engagement.

9 ML: The Effects of Twitter Sentiment on Stock Price Returns (Gabriele Ranco et al 2015)
Supervised learning using a Support Vector Machine (SVM)
15 months of Twitter volume and sentiment data about 30 companies (e.g. McDonald’s, Visa, Coca-Cola)
Over 100,000 tweets were labeled by 10 financial experts with one of three sentiment labels: negative, neutral, or positive
Preprocessing: tokenization, lemmatization, n-gram construction
- Lemmatization: e.g. had -> have, takes -> take
Bag-of-words, using unigrams and bigrams as the feature set, with Term Frequency-Inverse Document Frequency (TF-IDF) weighting
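
The preprocessing steps can be sketched as below. This is only an illustration under stated assumptions: the `LEMMAS` lookup is a hypothetical table covering just the two examples from the slide (a real pipeline would use a full lemmatizer), and the regular-expression tokenizer is a simplification that ignores Twitter-specific tokens such as hashtags and cashtags.

```python
import re

# Hypothetical lemma lookup covering only the slide's two examples.
LEMMAS = {"had": "have", "takes": "take"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def lemmatize(tokens):
    """Map each token to its lemma when one is known, else keep it unchanged."""
    return [LEMMAS.get(t, t) for t in tokens]


def ngrams(tokens, n):
    """Return all contiguous n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = lemmatize(tokenize("Visa had gains"))
# Unigrams followed by bigrams form the bag-of-words feature set.
features = ngrams(tokens, 1) + ngrams(tokens, 2)
```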

10 ML: The Effects of Twitter Sentiment on Stock Price Returns (Gabriele Ranco et al 2015)
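Corpus:
- Document 1: “cat sat mat”
- Document 2: “cat hate cat”

Unigram and bigram feature vector, with term frequency
$\mathrm{TF} = \dfrac{\text{count}}{\text{doc length}}$
(for bigrams, the length is the number of bigrams in the document, here 2), weighted by
$\mathrm{IDF} = \log_{10}\left(\dfrac{\text{no. of documents}}{\text{no. of documents with that term}}\right)$
E.g. IDF(cat) = log(2/2) = 0 (no extra information)
E.g. IDF(hate) = log(2/1) = 0.3

Term frequency (TF):

|            | cat | sat | mat | hate | cat sat | sat mat | cat hate | hate cat |
|------------|-----|-----|-----|------|---------|---------|----------|----------|
| Document 1 | 1/3 | 1/3 | 1/3 | 0    | 1/2     | 1/2     | 0        | 0        |
| Document 2 | 2/3 | 0   | 0   | 1/3  | 0       | 0       | 1/2      | 1/2      |

TF-IDF:

|            | cat | sat | mat | hate | cat sat | sat mat | cat hate | hate cat |
|------------|-----|-----|-----|------|---------|---------|----------|----------|
| Document 1 | 0   | 0.1 | 0.1 | 0    | 0.15    | 0.15    | 0        | 0        |
| Document 2 | 0   | 0   | 0   | 0.1  | 0       | 0       | 0.15     | 0.15     |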
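
The worked example can be checked with a short script. This is a minimal sketch, assuming a base-10 logarithm (which matches IDF(hate) = 0.3) and assuming that unigram and bigram frequencies are normalized separately by the number of unigrams and bigrams in the document. The function names are illustrative.

```python
import math

DOCS = ["cat sat mat", "cat hate cat"]
# Feature order: four unigrams, then four bigrams.
VOCAB = ["cat", "sat", "mat", "hate", "cat sat", "sat mat", "cat hate", "hate cat"]


def ngrams(tokens, n):
    """Return all contiguous n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def term_frequencies(doc):
    """TF = count / number of n-grams of the same order in the document."""
    tokens = doc.split()
    tf = {}
    for n in (1, 2):
        grams = ngrams(tokens, n)
        for g in grams:
            tf[g] = tf.get(g, 0.0) + 1.0 / len(grams)
    return tf


def tfidf_vectors(docs, vocab):
    """Weight each TF by IDF = log10(N / number of documents with the term)."""
    tfs = [term_frequencies(d) for d in docs]
    n_docs = len(docs)
    idf = {}
    for term in vocab:
        df = sum(1 for tf in tfs if term in tf)
        idf[term] = math.log10(n_docs / df) if df else 0.0
    return [[tf.get(t, 0.0) * idf[t] for t in vocab] for tf in tfs]


X1, X2 = tfidf_vectors(DOCS, VOCAB)
# Round to two decimals for comparison with the slide's values.
rounded = [[round(v, 2) for v in x] for x in (X1, X2)]
```

Note that "cat" gets weight 0 in both documents: it appears in every document, so its IDF is log(2/2) = 0 and it carries no discriminating information.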

11 ML: The Effects of Twitter Sentiment on Stock Price Returns (Gabriele Ranco et al 2015)
Many other variants of TF-IDF exist
Term weighting “has an enormous impact on the effectiveness of a retrieval system” (Jurafsky and Martin 2009, p. 771)
n_t: number of documents containing the term
N: total number of documents in the corpus
(Table of weighting variants from Wikipedia)
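
For orientation, a few commonly cited IDF variants are written out below using the slide's symbols N and n_t. The slide's own table is not recoverable from the transcript, so this selection is an assumption about what such a table typically contains, not a reproduction of it.

```latex
\begin{align*}
\text{standard:}      \quad & \mathrm{idf}(t) = \log\frac{N}{n_t} \\
\text{smoothed:}      \quad & \mathrm{idf}(t) = \log\left(\frac{N}{1 + n_t}\right) + 1 \\
\text{probabilistic:} \quad & \mathrm{idf}(t) = \log\frac{N - n_t}{n_t}
\end{align*}
```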

12 ML: The Effects of Twitter Sentiment on Stock Price Returns (Gabriele Ranco et al 2015)
Feature vectors:
X1 = [0, 0.1, 0.1, 0, 0.15, 0.15, 0, 0]
X2 = [0, 0, 0, 0.1, 0, 0, 0.15, 0.15]

SVM loss function (find the weight vector W that minimizes):
$$\phi(W) = \frac{1}{n}\sum_{i=1}^{n} \max\bigl(0,\ 1 - y_i\,(W \cdot X_i - \mathrm{bias})\bigr) + \text{regularization term}$$
n: number of training examples; $y_i = \pm 1$ (belongs to the class or not)
The loss is roughly the distance of a misclassified point from the separating line.

SVM probability (one against all):
$$p(\mathrm{class} \mid X) = \frac{\exp\bigl(W_{\mathrm{class}} \cdot X + \mathrm{bias}_{\mathrm{class}}\bigr)}{\sum_{\mathrm{class}'} \exp\bigl(W_{\mathrm{class}'} \cdot X + \mathrm{bias}_{\mathrm{class}'}\bigr)}$$
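
The two formulas can be evaluated directly on the example feature vectors. The sketch below is illustrative only: the labels (X1 in the class, X2 not), the all-zero weights, and the L2 form of the regularization term are assumptions, since the slide does not specify them. The sign conventions follow the slide (minus bias in the loss, plus bias in the probability).

```python
import math

X1 = [0, 0.1, 0.1, 0, 0.15, 0.15, 0, 0]
X2 = [0, 0, 0, 0.1, 0, 0, 0.15, 0.15]


def dot(w, x):
    """Plain dot product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


def hinge_loss(w, bias, xs, ys, reg=0.0):
    """Mean hinge loss plus an L2 penalty (the penalty form is an assumption)."""
    data_term = sum(
        max(0.0, 1.0 - y * (dot(w, x) - bias)) for x, y in zip(xs, ys)
    ) / len(xs)
    return data_term + reg * dot(w, w)


def class_probabilities(weights, biases, x):
    """Softmax over the per-class scores W_class . X + bias_class."""
    scores = [dot(w, x) + b for w, b in zip(weights, biases)]
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical setup: X1 belongs to the class (+1), X2 does not (-1).
zero_w = [0.0] * 8
loss_at_zero = hinge_loss(zero_w, 0.0, [X1, X2], [1, -1])
# Three classes (negative, neutral, positive) with untrained zero weights.
probs = class_probabilities([zero_w] * 3, [0.0] * 3, X1)
```

With untrained zero weights every point sits inside the margin, so each example contributes a loss of 1 and the three class probabilities are equal. Training moves W until correctly classified points lie outside the margin and contribute nothing.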

13 ML: Reuters NewsScope Sentiment Engine
Commercial product (not free)
3-layer neural network (the features it uses have not yet been identified)
Output: positive, negative, or neutral
Each item is tagged with the related company
Uses Reuters global news and scans across 35,000 companies in real time

14 References
Colm Kearney and Sha Liu, Textual sentiment in finance: A survey of methods and models, International Review of Financial Analysis, Volume 33 (May 2014)
Tim Loughran and Bill McDonald, When Is a Liability Not a Liability?, The Journal of Finance, Vol. LXVI, No. 1 (2011)
Ranco G, Aleksovski D, Caldarelli G, Grčar M, Mozetič I, The Effects of Twitter Sentiment on Stock Price Returns, PLoS ONE, 10(9) (2015)

