Stock Price Forecasting with Support Vector Machines Based on Web Financial Information Sentiment Analysis
Run Cao
School of Information, Renmin University of China

Agenda
• Background & Research Question
• Related work
• Model and Method
• Analysis of the experimental results
• Conclusion & Discussion

Background & Research Question
• The financial market has long been a research hotspot, and stock price forecasting is considered one of the most difficult challenges in time series prediction. Stock market data are:
  - Data intensive
  - Noisy
  - Dynamic
  - Unstructured
  - Highly uncertain
• With the rapid development of communication technology and the Internet, people can access stock news and reviews online anytime, anywhere.
• Because the Internet can carry what other channels carry, Internet financial information plays an important role in the financial markets.

Background & Research Question
• According to the efficient market hypothesis, financial information has an important impact on financial market volatility.
• Information has two dimensions:
  - Information volume
  - Information sentiment. Sentiment analysis has become an important topic in natural language processing and machine intelligence.
• In the financial markets, information sentiment is an important indicator of the opinions and emotions of investors and traders.
• Previous studies have shown that Internet financial information volume and stock market volatility are closely related.

Related work: Stock Price Forecasting
• Industry: fundamental analysis; technical analysis.
• Academia:
  - Statistical methods: linear regression prediction, polynomial regression prediction, ARMA modeling, GARCH modeling.
  - Machine learning methods: artificial neural networks, support vector machines, Markov models, fuzzy networks.

Related work: Text sentiment analysis
• Text sentiment analysis is also known as opinion mining.
• For example, by automatically analyzing the text of the comments on a product, we can learn consumers' attitudes and opinions about it.
• Application fields of sentiment analysis:
  - User comment analysis
  - Decision-making
  - Public opinion monitoring
  - Information prediction
• Text sentiment analysis follows two main approaches:
  - Sentiment knowledge-based methods
  - Feature classification-based methods

Model and Method
• The framework of the overall project

The calculation of the financial news sentiment value
• The sentiment of a financial text corresponds to the tendency of the attitude it expresses, i.e., bullish, bearish or neutral, and the intensity of the text reflects its influence.
• A high-intensity text has a greater influence on financial markets; a low-intensity text has a relatively weak influence.
• A sentiment value is calculated for each financial text. The sign of the value indicates whether the text is bullish, bearish or neutral, while the absolute value represents the intensity of the text.

The calculation of the financial news sentiment value
• In the experiment, we use the text sentiment algorithm proposed by Nan Li and Desheng Dash Wu in 2010 [26], which is based on the HowNet sentiment dictionary.
• Given a financial news item p, a word segmentation tool first converts it into a sequence of words {w_1, w_2, w_3, ..., w_n}, where n is the total number of words. A sentiment value v_i is calculated for each word w_i (i = 1, 2, 3, ..., n), and the sentiment value of the whole news item p is the sum of the sentiment values of all its words.
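
The summation described above can be sketched as follows. This is a minimal illustration, not the algorithm of [26]: the lexicon `TOY_LEXICON`, its weights, and the helper names are hypothetical, the input is assumed to be already segmented, and mapping a value of exactly zero to "neutral" is an assumption.

```python
# Hypothetical word-level sentiment values. The slides rely on a
# HowNet-based dictionary whose entries and weights are not given here.
TOY_LEXICON = {"surge": 1.0, "profit": 0.5, "loss": -0.5, "plunge": -1.0}


def news_sentiment(words, lexicon):
    """Sum the word-level values v_i over the segmented words w_1..w_n.

    Words missing from the lexicon contribute 0 (an assumption).
    """
    return sum(lexicon.get(w, 0.0) for w in words)


def describe(value):
    """Sign gives the tendency; absolute value gives the intensity."""
    if value > 0:
        tendency = "bullish"
    elif value < 0:
        tendency = "bearish"
    else:
        tendency = "neutral"  # assumption: zero means neutral
    return tendency, abs(value)


words = ["shares", "surge", "on", "profit"]  # already segmented
score = news_sentiment(words, TOY_LEXICON)
tendency, intensity = describe(score)
```

In a real pipeline the segmentation step and the dictionary lookup would replace the toy list and lexicon; only the final sum mirrors the slide.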

• In the experiment, we randomly selected three stocks from the Growth Enterprise Market in China: Toread (300005), Hanwei Electronics (300007) and Huayi Brothers (300027).
• Financial news data are collected by a financial news crawler, supported by the Internet Financial Intelligence Laboratory, School of Information, Renmin University of China.

Stock price information processing
• We obtained the transaction data and historical prices of the three stocks from Yahoo Finance.
• Huayi Brothers (300027) increased its shares in 2010, doubling the total share capital, so its prices must be adjusted for the share increase to obtain the actual price.

• The daily price is P = (opening price + closing price + highest price + lowest price) / 4.
• Financial news and stock prices are divided into time windows: a three-day sliding window is used to predict the price on the fourth day.
• Toread (300005): 370 groups of (P_{t-2}, P_{t-1}, P_t, P_{t+1}) data, t = 1, 2, 3, ..., 370.
• Hanwei Electronics (300007): 336 groups of (P_{t-2}, P_{t-1}, P_t, P_{t+1}) data, t = 1, 2, 3, ..., 336.
• Huayi Brothers (300027): 262 groups of (P_{t-2}, P_{t-1}, P_t, P_{t+1}) data, t = 1, 2, 3, ..., 262.
• Each prediction takes the data for days t-2, t-1 and t as input and forecasts the stock price on day t+1. The predicted values are then compared with the actual values to analyze the forecasting results.
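
A minimal sketch of the averaged price and the three-day sliding window, assuming daily bars given as (open, close, high, low) tuples. The sample bars are made up for illustration and the function names are hypothetical.

```python
def average_price(open_, close, high, low):
    """P = (opening + closing + highest + lowest price) / 4."""
    return (open_ + close + high + low) / 4


def sliding_windows(prices, width=3):
    """Pair each run of `width` consecutive prices with the next price.

    With width=3 every sample is ((P_{t-2}, P_{t-1}, P_t), P_{t+1}).
    """
    samples = []
    for i in range(len(prices) - width):
        samples.append((prices[i:i + width], prices[i + width]))
    return samples


# Made-up daily bars: (open, close, high, low)
bars = [
    (10.0, 11.0, 12.0, 9.0),
    (11.0, 12.0, 13.0, 10.0),
    (12.0, 12.0, 14.0, 10.0),
    (12.0, 13.0, 14.0, 11.0),
    (13.0, 14.0, 15.0, 12.0),
]
prices = [average_price(*bar) for bar in bars]
samples = sliding_windows(prices)
```

Five days of prices yield two samples, since each sample needs three input days plus one target day.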

Data preprocessing
• LIBSVM requires training data in the format: <label> <index1>:<value1> <index2>:<value2> ...
  - <label> is the target value of the training data; for regression it can be any real number. In our experiments, the label column is the stock price of the next time period.
  - <index> is an integer starting from 1; indices need not be consecutive.
  - <value> is the value of a predictor (explanatory variable).
• First, we convert the data to the LIBSVM format above and use the scaling tool provided by LIBSVM (svm-scale) so that each factor falls into the interval [-1, 1].
• Training samples are selected at a training set to test set ratio of 4:1, i.e., 4/5 of all the data form the training set.
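
The preprocessing steps can be sketched in plain Python as below. This is an illustration under stated assumptions, not the LIBSVM tooling itself: the helpers are hypothetical stand-ins for svm-scale and file conversion, constant columns are mapped to 0.0 by choice, and the split is taken in order because the slides do not say how the 4/5 were selected.

```python
def scale_columns(rows, lower=-1.0, upper=1.0):
    """Min-max scale every feature column into [lower, upper]."""
    columns = list(zip(*rows))
    mins = [min(c) for c in columns]
    maxs = [max(c) for c in columns]
    scaled = []
    for row in rows:
        new_row = []
        for x, lo, hi in zip(row, mins, maxs):
            if hi == lo:
                new_row.append(0.0)  # assumption: constant column -> 0
            else:
                new_row.append(lower + (upper - lower) * (x - lo) / (hi - lo))
        scaled.append(new_row)
    return scaled


def to_libsvm_line(label, features):
    """Format one sample as '<label> 1:<v1> 2:<v2> ...' (indices from 1)."""
    pairs = " ".join(f"{i}:{v:g}" for i, v in enumerate(features, start=1))
    return f"{label:g} {pairs}"


def split_train_test(samples, train_fraction=0.8):
    """Keep the first 4/5 for training (in-order split is an assumption)."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


rows = [[10.0, 0.0], [12.0, 5.0], [14.0, 10.0]]
scaled = scale_columns(rows)
line = to_libsvm_line(12.5, [-1.0, 0.0, 1.0])
train, test = split_train_test(list(range(10)))
```

In practice the scaling ranges would normally be computed on the training set and reused for the test set; the sketch scales one table for brevity.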

Group experiments
• We use SVM regression to explore the relationship between financial news sentiment and stock prices, with LIBSVM as the experimental tool.
• We designed four comparative SVM prediction experiments, each using a three-day sliding window.
• Experiment 1: as shown in Table 1, use the stock prices of the previous three days to forecast the stock price on day t+1.
• Experiment 2: based on Experiment 1, add the corresponding daily number of stock news items as an exogenous variable.
• Experiment 3: based on Experiment 1, add the corresponding daily stock news sentiment value.
• Experiment 4: based on Experiment 1, add both the daily number of stock news items and the daily stock news sentiment value.
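
The four input configurations can be sketched as feature builders. The layout below is an assumption: the slides do not state whether the news count and sentiment value enter for each of the three window days or only for one day, so this sketch adds them for all three. The function name, the `EXPERIMENTS` table and the sample series are hypothetical.

```python
def build_samples(prices, volumes, sentiments,
                  use_volume, use_sentiment, width=3):
    """Build (features, target) pairs for one experiment.

    Features always hold the prices of days t-2..t; news counts and
    sentiment values for the same days are appended when enabled.
    The target is the price on day t+1.
    """
    samples = []
    for t in range(width - 1, len(prices) - 1):
        days = range(t - width + 1, t + 1)
        features = [prices[d] for d in days]
        if use_volume:
            features += [volumes[d] for d in days]
        if use_sentiment:
            features += [sentiments[d] for d in days]
        samples.append((features, prices[t + 1]))
    return samples


# experiment number -> (use_volume, use_sentiment)
EXPERIMENTS = {1: (False, False), 2: (True, False),
               3: (False, True), 4: (True, True)}

# Made-up series for illustration only
prices = [1.0, 2.0, 3.0, 4.0, 5.0]
volumes = [3, 0, 2, 5, 1]
sentiments = [0.5, -1.0, 0.0, 2.0, 1.5]

datasets = {
    k: build_samples(prices, volumes, sentiments, vol, sent)
    for k, (vol, sent) in EXPERIMENTS.items()
}
```

Each dataset could then be scaled and written in LIBSVM format before training the regression model.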

Calculate the evaluation score
• We use the following statistical indicators to evaluate the prediction results:
  - Mean squared error (MSE)
  - Normalized mean squared error (NMSE)
  - Mean absolute error (MAE)
  - Squared correlation coefficient (SCC)
  - Directional symmetry (DS)
  - Weighted directional symmetry (WDS)
  - Correct upward trend (CP)
  - Correct downward trend (CD)
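
The slides name the indicators without giving formulas. The sketch below uses definitions that are common in the financial forecasting literature and should be read as assumptions: NMSE is taken as MSE divided by the sample variance of the actual values, and DS as the percentage of steps where the predicted and actual changes have the same sign. SCC, WDS, CP and CD are omitted.

```python
def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def nmse(actual, predicted):
    """MSE divided by the sample variance of the actual values (assumed)."""
    n = len(actual)
    mean = sum(actual) / n
    variance = sum((a - mean) ** 2 for a in actual) / (n - 1)
    return mse(actual, predicted) / variance


def directional_symmetry(actual, predicted):
    """Percentage of steps whose predicted and actual changes agree in sign."""
    hits = 0
    for i in range(1, len(actual)):
        actual_change = actual[i] - actual[i - 1]
        predicted_change = predicted[i] - predicted[i - 1]
        if actual_change * predicted_change >= 0:
            hits += 1
    return 100.0 * hits / (len(actual) - 1)


# Made-up values for illustration only
actual = [1.0, 2.0, 3.0, 2.0]
predicted = [1.0, 2.5, 2.0, 1.5]
scores = {
    "MSE": mse(actual, predicted),
    "MAE": mae(actual, predicted),
    "NMSE": nmse(actual, predicted),
    "DS": directional_symmetry(actual, predicted),
}
```

Lower MSE, NMSE and MAE indicate smaller errors, while a higher DS indicates that the direction of price movement is predicted correctly more often.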

Analysis of the experimental results
• The results show the performance on the training set and the test set in the four groups of experiments.
• As input variables are added and changed across the experiments, the evaluation indicators improve and the prediction accuracy gradually increases. This is more evident in the results on the training set.

Discussion
• The results indicate that the information sentiment value is more influential than information volume in stock price forecasting.
• Compared with the experiments that use a single information dimension, the experiment that combines information volume and information sentiment value is more accurate.
• The work improves the existing training and forecasting model. Support Vector Machines learn from the information in the data; with a time window, each prediction draws on more historical data and so carries fuller information.
• The method gives investors a more accurate way to forecast stock prices, which enables them to manage investment and risk more efficiently in the financial market.

Limitations
• Sentiment computing here is opinion analysis of online information. The information we use is still mainly news, while micro-blogging and other social media have become widely accepted platforms for opinions. Using the sentiment value of portal news to represent overall sentiment is therefore not comprehensive enough.
• To predict and analyze more effectively and accurately, future work needs to analyze the opinions of finance-related government departments, companies, social media, financial experts and commentators, and even all shareholders.
