SOPS: Stock Prediction using Web Sentiment. Presented by Vivek Sehgal and Charles Song, Department of Computer Science, University of Maryland. ICDMW 2007. 2009-05-29.


1 SOPS: Stock Prediction using Web Sentiment
Presented by Vivek Sehgal and Charles Song, Department of Computer Science, University of Maryland (ICDMW)
Summarized by Jaeseok Myung

2 In this talk
Copyright 2009 by CEBT, Center for E-Business Technology
- Introducing some papers about sentiment analysis in finance
  [1] Event and Sentiment Detection in Financial Markets (ISWC 08): simple architecture
  [2] SOPS: Stock Prediction using Web Sentiment (ICDMW 07): entire process
  [3] Yahoo! for Amazon: Sentiment Extraction from Small Talk on the Web (Management Science 07): an idea that can improve prediction performance
- We will focus on SOPS, but the other papers will also be introduced briefly

3 Sentiment Analysis in Financial Markets
- Sentiment analysis is one of my favorite research topics
  - I have conducted some research using product reviews
- In my opinion, finance is a more suitable domain than products
  - Product sales statistics are not publicly available, whereas stock prices are always public
  - Financial markets are closely related to investors' sentiment: "the economy is psychology" (a Korean saying), behavioral finance, lots of evidence
- Interesting and worthwhile

4 Research Problem from [1][2][3]
- How can information from various, heterogeneous sources be integrated?
  - Different formats
- How can the opinions in the documents be extracted?
  - Statistical and NLP methods
- How can the important opinions be filtered?
  - Reliable source (news, blog), trusted author, promising algorithm
- How can the users' trading decisions be supported?
  - Finding the relationships between investors' sentiment and stock values

5 An Architecture from [1]
- Monitor a huge number of relevant sources
- Extract metadata and build a single representation
- Decide whether the information has to be analyzed or not

6 SOPS: System Overview
- Collect data from a message board
- Remove HTML tags and extract features
- Identify reliable users in order to filter noise
- Use several classifiers

7 SOPS: Data Collection
- 260,000 messages for 52 popular stocks on Yahoo! Finance
  - The messages span a 6-month time period
- A message board exists for each stock traded on major stock exchanges such as NYSE and NASDAQ
  - Users must sign up before they can post messages
  - Every posted message is associated with its author

8 SOPS: Data Collection

9 SOPS: Feature Representation
- After the relevant information has been extracted, each message is converted to a vector of words and author names
- The value of each entry in the vector is then calculated using the TFIDF formula
  - Symbols: M: set of all messages; m: a message; w: a term
- Example vector: ( 3.2, 1.6, 1.09, , 0.5, …), with entries labeled "good", "stop", "asdf", date, % of change in stock price
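The TFIDF weighting on this slide can be sketched as follows. The slide's exact formula did not survive in the transcript, so this sketch assumes the textbook variant tf(w, m) × log(|M| / df(w)); the function name `tfidf_vectors` and the `author:` token prefix are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(messages):
    """Turn tokenized messages into TF-IDF vectors.

    messages: list of token lists (words plus author-name tokens).
    Assumes the textbook weighting tf * log(N / df); the slide's own
    formula is not recoverable, so this is an assumption.
    """
    vocab = sorted({w for m in messages for w in m})
    n = len(messages)
    # Document frequency: number of messages that contain each term.
    df = Counter(w for m in messages for w in set(m))
    vectors = []
    for m in messages:
        tf = Counter(m)  # raw term counts within this message
        vectors.append([tf[w] * math.log(n / df[w]) for w in vocab])
    return vocab, vectors

# Hypothetical example: author names are treated as ordinary tokens.
vocab, vectors = tfidf_vectors([
    ["good", "buy", "author:alice"],
    ["good", "stop", "author:bob"],
])
```

A term that appears in every message (here "good") gets weight 0, which is the usual reason for the IDF factor: words shared by all posts carry no information about any one post.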

10 SOPS: Sentiment Prediction
- What: a message (sentiment undisclosed) -> Classifier -> Strong Buy, Buy, Hold, Sell, or Strong Sell
- How: a message (sentiment disclosed) -> Classifier (Training) -> Strong Buy, Buy, Hold, Sell, or Strong Sell

11 SOPS: Sentiment Prediction
- The sentiment for a message m at time instant i is modeled in terms of:
  - m: a message
  - M_i: set of all messages
  - SV_i: stock value
- Classifiers
  1. Naïve Bayes
  2. Decision Trees
  3. Bagging
- Output classes: Strong Buy, Buy, Hold, Sell, Strong Sell
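As an illustration of the first classifier listed, here is a minimal multinomial Naïve Bayes sketch over message tokens. It is not the SOPS implementation: it uses only the message text, whereas the slide's model also involves M_i and SV_i in a form the transcript does not preserve. The class name `NaiveBayesSentiment` and the add-one smoothing are assumptions.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesSentiment:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, messages, labels):
        # messages: list of token lists; labels: the disclosed sentiments.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in zip(messages, labels):
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in tokens:
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Train on messages with disclosed sentiment, then label an undisclosed one.
clf = NaiveBayesSentiment().fit(
    [["great", "earnings", "buy"], ["buy", "now"],
     ["dump", "it", "sell"], ["wait", "and", "see"]],
    ["Strong Buy", "Buy", "Strong Sell", "Hold"],
)
label = clf.predict(["dump", "sell"])
```

Decision trees and bagging would plug into the same fit/predict shape; only the model inside changes.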

12 TrustValue Calculation
- Some authors are more knowledgeable than others about the stock market
  - A trusted author's posts should carry more weight => TrustValue
- TrustValue
  - Cares not only about the direction in which the stock price went, but also about the magnitude
  - Takes into account the fact that a single author cannot be an expert on all stocks => an author can be assigned different trust values for different stocks
- Terms used in the TrustValue formula
  - PredictionScore: the author's prediction performance, that is, how closely the author's predictions follow the stock market
  - NumberOfPrediction: the total number of predictions made by the author
  - ExactPrediction: the number of exact predictions
  - ClosePrediction: the number of "good enough" predictions
  - ActivityConstant: a constant used to penalize low activity, that is, few predictions, by the author
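The terms above can be combined in several ways, and the transcript does not preserve the slide's formula. The sketch below shows one plausible combination, stated as an assumption: the average prediction score multiplied by a hit rate whose denominator is inflated by ActivityConstant, so that authors with few predictions are penalized. The function name and the example numbers are hypothetical.

```python
def trust_value(prediction_score, number_of_predictions,
                exact_predictions, close_predictions, activity_constant):
    """One plausible TrustValue (an assumption, not the slide's formula).

    The result is meant to be kept per (author, stock) pair, since an
    author can have different trust values for different stocks.
    """
    if number_of_predictions == 0:
        return 0.0  # no track record, no trust
    average_score = prediction_score / number_of_predictions
    # ActivityConstant in the denominator shrinks the hit rate when the
    # author has made only a few predictions.
    hit_rate = (exact_predictions + close_predictions) / (
        activity_constant + number_of_predictions)
    return average_score * hit_rate

# Hypothetical authors with the same average score but different activity.
active = trust_value(8.0, 10, 4, 3, activity_constant=5)
occasional = trust_value(0.8, 1, 1, 0, activity_constant=5)
```

With these inputs the active author ends up with the higher trust value even though the occasional author's single prediction was exact, which is the behavior the ActivityConstant description asks for.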

13 SOPS: Stock Prediction
- Classifier -> Go up / Go down
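To make the up/down decision concrete, the sketch below replaces the trained classifier with a much simpler stand-in: a trust-weighted average of message sentiments compared against a threshold. The numeric sentiment scores, the threshold, and the function names are all assumptions for illustration; SOPS itself uses a classifier at this step.

```python
# Hypothetical numeric scale for the five sentiment classes.
SENTIMENT_SCORE = {
    "Strong Buy": 2, "Buy": 1, "Hold": 0, "Sell": -1, "Strong Sell": -2,
}

def daily_signal(posts):
    """Trust-weighted mean sentiment for one stock on one day.

    posts: list of (sentiment_label, trust_value) pairs.
    """
    total_trust = sum(trust for _, trust in posts)
    if total_trust == 0:
        return 0.0  # no trusted opinions available
    weighted = sum(SENTIMENT_SCORE[label] * trust for label, trust in posts)
    return weighted / total_trust

def predict_direction(posts, threshold=0.0):
    """Threshold stand-in for the classifier: 'Go up' or 'Go down'."""
    return "Go up" if daily_signal(posts) > threshold else "Go down"

direction = predict_direction([("Strong Buy", 0.9), ("Sell", 0.1)])
```

The point of the weighting is that a highly trusted author's "Strong Buy" outweighs a low-trust author's "Sell", which mirrors the role TrustValue plays in filtering noisy posts.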

14 SOPS: Evaluation Metrics

15 SOPS: Experiments

16 Conclusion
- SOPS can predict Web sentiment with high precision and recall
- SOPS introduced TrustValue, which takes into account the trustworthiness of an author
- In my opinion, some points are unclear
  - Presentation (about summarization)
  - Users
  - Time period

17 Furthermore
- We have the paper [3]

18 Research Problem from [1][2][3]
- How can information from various, heterogeneous sources be integrated?
  - Different formats
- How can the opinions in the documents be extracted?
  - Statistical and NLP methods
- How can the important opinions be filtered?
  - Reliable source (news, blog), trusted author, promising algorithm
- How can the users' trading decisions be supported?
  - Finding the relationships between investors' sentiment and stock values

