Title

網路直播聊天室情緒探勘-使用模糊支持向量機

Parallel title

Sentiment Analysis and Opinion Mining in Live Streaming by Using Fuzzy Support Vector Machine

Authors

郝沛毅(Pei-Yi Hao);歐仁彬(Jen-Bing Ou);黃天受(Tien-Shou Huang);楊盛琮(Sheng-Cong Yang)

Keywords

live streaming video ; sentiment analysis ; web crawler ; internet slang ; support vector machine ; fuzzy theory

Journal

資訊管理學報

Volume and issue / publication date

Vol. 25, No. 2 (2018/04/30)

Pages

185 - 218

Language of content

Traditional Chinese

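Chinese abstract (English translation)

Live streaming video platforms are an emerging social media platform that has become impossible to ignore in recent years. Ordinary video platforms share content that is recorded in advance and then uploaded, whereas live streaming adds two characteristics: immediacy and interactivity. During a stream, the streamer can respond immediately to the audience's reactions in the chat room and so create more differentiated content, but when the audience posts heavily the streamer may miss messages. The purpose of this study is therefore to mine the chat room content with sentiment mining techniques and to present the opinions the audience wants to express in a simpler form, so that the streamer can learn the audience's reactions with less effort and use them as a reference when adjusting content. The proposed sentiment mining system consists of three main steps. First, a "chat room crawler" opens a socket connection to the Twitch-IRC server and automatically captures the chat room content in real time. The captured messages then go through an "internet slang normalization" step. Next, a fuzzy support vector machine classifies the sentiment of each message as "positive" or "negative"; fuzzy theory allows the degree of membership in the positive (or negative) sentiment to be computed more precisely. Finally, the results are presented with charts including a word cloud, a sentiment fluctuation chart, a sentiment radar chart, a sentiment histogram, and a sentiment box plot.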
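The first two steps described above (reading chat lines relayed over IRC, then normalizing slang) can be sketched as follows. This is a minimal illustration and not the authors' implementation: it parses already-received lines rather than opening a socket, it assumes the plain `PRIVMSG` line shape without IRCv3 tags, and the names `parse_privmsg`, `normalize_slang`, and `SLANG_TABLE`, as well as the English slang entries, are hypothetical.

```python
import re
from typing import Dict, NamedTuple, Optional

# Assumed shape of a chat line relayed by the IRC server (no IRCv3 tags):
#   :<nick>!<nick>@<nick>.tmi.twitch.tv PRIVMSG #<channel> :<message>
PRIVMSG_RE = re.compile(
    r"^:(?P<nick>[^!\s]+)!\S+ PRIVMSG #(?P<channel>\S+) :(?P<text>.*)$"
)


class ChatMessage(NamedTuple):
    nick: str
    channel: str
    text: str


def parse_privmsg(line: str) -> Optional[ChatMessage]:
    """Return the chat message carried by a raw IRC line, or None
    for any other kind of line (PING, JOIN, server notices, ...)."""
    match = PRIVMSG_RE.match(line.rstrip("\r\n"))
    if match is None:
        return None
    return ChatMessage(match["nick"], match["channel"], match["text"])


# Hypothetical lookup table; the study builds its own normalization
# for internet slang, which the abstract does not detail.
SLANG_TABLE: Dict[str, str] = {
    "gg": "good game",
    "thx": "thanks",
}


def normalize_slang(text: str, table: Dict[str, str] = SLANG_TABLE) -> str:
    """Lowercase the text and replace each whitespace-separated token
    that appears in the table with its standard form."""
    return " ".join(table.get(token, token) for token in text.lower().split())


if __name__ == "__main__":
    raw = ":alice!alice@alice.tmi.twitch.tv PRIVMSG #somechannel :gg thx streamer\r\n"
    message = parse_privmsg(raw)
    if message is not None:
        print(message.nick, "->", normalize_slang(message.text))
```

A dictionary lookup keeps the sketch short; token-level replacement like this would not handle slang that spans several tokens or depends on context.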

English abstract

Purpose: Live streaming video has emerged in recent years as a social media platform. Compared with traditional pre-recorded video, live streaming is more immediate and interactive: the streamer can respond at once to the audience's reactions in the chat room and create differentiated content. When too many messages arrive in the chat room, however, the streamer is likely to miss some of them. The purpose of this research is therefore to mine the content of audience messages in the chat room with sentiment mining techniques and to present the results in a simpler form.
Design/methodology/approach: The proposed sentiment analysis system consists of three major components. First, a web crawler opens a socket connection to the chat room's Twitch-IRC server and captures messages at predefined intervals. Second, the internet slang in the messages is normalized into a standard form that can be analyzed. Finally, a fuzzy support vector machine classifies each message as expressing positive or negative emotion.
Findings: On average, the proposed approach achieves an accuracy of 98.88% on internet slang normalization and 87.72% on live streaming sentiment analysis. Questionnaire results also support the effectiveness of the resulting sentiment statistics.
Research limitations/implications: This study focused only on accurately classifying audience sentiment in live streams. In future work, we plan to investigate the correlation between audience sentiment and audience ratings.
Practical implications: The results of the sentiment analysis are presented as text clouds, line graphs, radar charts, histograms, and box plots. These help the streamer follow the audience's response with less effort and adjust the content of the live stream accordingly.
Originality/value: To the best of our knowledge, this study is the first attempt to apply a fuzzy support vector machine and word embedding cluster features to live streaming sentiment analysis in Taiwan.
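For orientation, one common way to write a fuzzy support vector machine is the sample-weighted soft-margin form below. It is a sketch of the general idea only: the abstract does not state the exact formulation used in the study, and the symbols $s_i$ (fuzzy membership of training message $i$), $C$, $\xi_i$, and $\phi$ are assumptions introduced here.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} s_i\,\xi_i \\
\text{subject to}\quad & y_i\bigl(w^{\top}\phi(x_i) + b\bigr) \ge 1 - \xi_i, \\
& \xi_i \ge 0, \qquad 0 < s_i \le 1, \qquad i = 1,\dots,n .
\end{aligned}
```

Here $y_i \in \{+1, -1\}$ marks a message as positive or negative. A membership $s_i$ close to 1 makes misclassifying message $i$ costly, while a small $s_i$ lets a noisy or ambiguous message influence the decision boundary less.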

Subject classification: Basic and Applied Sciences > Information Science
Social Sciences > Management
Cited by
  1. 謝欣頻(2020)。網路直播平台與職場困境。諮商與輔導,420,31-34。
  2. 楊依純,陳思恩,胡錦玉,吳芝儀,王怡叡(2020)。利用臉書社群媒體直播平台推動心理健康預防之初探。輔導季刊,56(4),63-72。