Title

透過新聞文章預測股價漲跌趨勢-結合情緒分析、主題模型與模糊支持向量機

Parallel Title

Sentiment and Topic Analysis on Financial News for Stock Movement Prediction by Using Fuzzy Support Vector Machine

Authors

郝沛毅(Pei-Yi Hao);歐仁彬(Jen-Bing Ou);黃天受(Tien-Shou Huang);林振穎(Zhen-Ying Lin);吳建生(Jian-Sheng Wu)

Keywords

股價預測 ; 情緒分析 ; 潛在狄利克雷分配 ; 文字探勘 ; 模糊理論 ; 支持向量機 ; stock trend prediction ; sentiment analysis ; latent Dirichlet allocation ; text mining ; fuzzy theory ; support vector machine

Journal

資訊管理學報 (Journal of Information Management)

Volume/Issue and Publication Date

Vol. 25, No. 4 (2018/10/31)

Pages

363 - 395

Language

Traditional Chinese

Chinese Abstract

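Successfully predicting stock price movements has obvious benefits. According to the efficient market hypothesis, the value of a company's stock is determined by all currently available information. When analysts, investors, and institutional traders assess current stock prices, news plays an important role in the valuation process. Financial news carries information about a company's fundamentals as well as qualitative information that shapes the expectations of market participants. In the era of big data, the number of online news articles keeps growing. Faced with such a large volume of text, more and more institutions rely on the high-speed processing power of modern computers for text mining and machine learning in order to build more accurate stock trend prediction models. Using the unstructured data in articles is the most challenging research direction and is the focus of this study. In this paper, we extract latent topic models and sentiment information from news articles, and we develop a fuzzy support vector machine that fuses the rich information contained in online news articles to predict whether stock prices will rise or fall. We argue that fuzzy theory is well suited to this study because text itself is fuzzy (for example, high/low and big/small) and because an ambiguous, fuzzy boundary exists between rising and falling trends (for example, a rise of 0.01% and a rise of 1% both belong to the rise category, but clearly to different degrees). In this study, the highest prediction accuracy was 87% for food stocks, 71% for semiconductor stocks, and 69% for computer peripheral stocks. By comparison, a traditional support vector machine that predicts stock price trends from keywords achieved an accuracy of only slightly above 50% (close to random guessing), so the proposed method clearly outperforms the traditional support vector machine prediction model.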
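The graded-membership idea in the abstract (a 0.01% rise and a 1% rise are both "rise", but to different degrees) can be illustrated with a small sketch. The abstract does not state which membership function the authors use, so the linear ramp, the `saturation` threshold of 1%, and the function names below are assumptions made purely for illustration.

```python
from typing import Tuple


def class_membership(daily_return_pct: float, saturation: float = 1.0) -> float:
    """Degree (0..1) to which a daily return belongs to its own class (rise or fall).

    Hypothetical ramp: membership grows linearly with the size of the move
    and saturates once the move reaches `saturation` percent.
    """
    if saturation <= 0:
        raise ValueError("saturation must be positive")
    return min(abs(daily_return_pct) / saturation, 1.0)


def label_with_membership(daily_return_pct: float) -> Tuple[int, float]:
    """Return (crisp label, fuzzy membership) for one trading day.

    The label is +1 for a non-negative return and -1 otherwise; the
    membership says how strongly the day belongs to that label.
    """
    label = 1 if daily_return_pct >= 0 else -1
    return label, class_membership(daily_return_pct)
```

Under these assumptions a 0.01% rise and a 1% rise receive the same label but very different memberships, so a learner that weights samples by membership would treat the tiny move as weak evidence.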

English Abstract

Purpose: In the Big Data era, the number of news articles has grown tremendously. Faced with such a large volume of textual data, more and more institutions rely on the processing power of modern computers for text mining and machine learning to predict the stock market more accurately. Extracting the fundamental information contained in unstructured text is the most challenging research aspect and is therefore the goal of this work.
Design/methodology/approach: In this study, we extract hidden topic models and sentiment information from news articles. In addition, we develop a fuzzy support vector machine that merges the abundant information in online news to forecast the trend of stock prices. Fuzzy set theory is well suited to this study because text is itself fuzzy (for example, high/low and big/small) and because the boundary between the rise and fall categories is ambiguous. For example, a rise of 10% and a rise of 1% both belong to the rise category, but to different degrees.
Findings: The highest forecast accuracy was 87% for food-related stocks, 71% for semiconductor-related stocks, and 69% for computer-peripheral-related stocks. By comparison, a traditional support vector machine forecast stock price trends with an accuracy of just over 50% (close to random guessing), so the proposed method is significantly better than the traditional support vector machine forecasting model.
Research limitations/implications: This study focuses only on accurately classifying stock movement based on hidden topic and sentiment features. In future work, we plan to investigate more complex semantic features.
Practical implications: Successful prediction of stock price movements has obvious advantages. According to the Efficient Market Hypothesis, the price of a stock is determined by all information available at the moment. Financial news carries information about a firm's fundamentals as well as qualitative information that influences the expectations of market participants. This study applies sentiment and topic analysis to financial news to predict stock movement, which can help analysts, investors, and institutional traders evaluate current stock prices effectively.
Originality/value: To the best of our knowledge, this study is the first attempt to apply a fuzzy support vector machine with hidden topic and semantic features to the prediction of stock movement in Taiwan.
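As a rough illustration of how fuzzy memberships can enter a support vector machine, the sketch below trains a linear classifier whose hinge loss is weighted per sample, i.e. it minimises 0.5*||w||^2 + C * sum_i s_i * max(0, 1 - y_i (w.x_i + b)). This is a common membership-weighted formulation and is not necessarily the model the authors developed; the abstract gives no formulas. The two features (a sentiment score and a topic weight), the toy data, the membership values, and the hyperparameters are all hypothetical.

```python
from typing import List, Sequence, Tuple


def train_weighted_linear_svm(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    memberships: Sequence[float],
    C: float = 1.0,
    lr: float = 0.01,
    epochs: int = 200,
) -> Tuple[List[float], float]:
    """Batch subgradient descent on the membership-weighted hinge objective."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the regulariser 0.5 * ||w||^2
        grad_b = 0.0
        for xi, yi, si in zip(X, y, memberships):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # hinge term is active for this sample
                for j in range(n_features):
                    grad_w[j] -= C * si * yi * xi[j]
                grad_b -= C * si * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Return +1 (rise) or -1 (fall) for one feature vector."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical toy data: [sentiment score, weight of one news topic] per day.
X_toy = [[0.8, 0.1], [0.6, 0.3], [-0.7, 0.2], [-0.5, 0.4]]
y_toy = [1, 1, -1, -1]           # +1 = rise, -1 = fall
s_toy = [1.0, 0.3, 1.0, 0.2]     # assumed memberships (small moves get low weight)
```

Samples with low membership contribute less to the loss, so days with marginal price moves pull the decision boundary less than days with clear moves.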

Subject Classification

Basic and Applied Sciences > Information Science
Social Sciences > Management
Cited by
  1. 龔千芬,郝沛毅(2022)。融合深度神經網路與深層模糊孿生支持向量機於股價預測。資訊管理學報,29(4),303-333。
  2. 許婉琪,柯建全,李飛涵,王明昌(2022)。市場恐慌情緒對台股新聞事件之股價反應的影響。管理與系統,29(2),147-186。
  3. 張益誠,張育傑,余泰毅(2021)。探討環境教育論文的文件自動分類技術-以2013-2018年環境教育研討會摘要為例。環境教育研究,17(1),85-128。
  4. (2024)。以時間卷積網路結合長短期記憶模型預測股價:臺股預測實證。資訊管理學報,31(2),177-207。