Title

利用異常偵測技術於可疑帳號辨識之研究

Parallel Title

On Identifying Suspicious Accounts Using Anomaly Detection Technology

Authors

陳彥翔 (Yan-Siang Chen); 林祝興 (Chu-Hsing Lin); 賴俊鳴 (Chun-Ming Lai)

Keywords

suspicious account; misinformation; natural language processing; machine learning; anomaly detection; ETL; crawler

Journal Title

資訊安全通訊

Volume, Issue / Publication Date

Vol. 28, No. 4 (2022/11/01)

Pages

16 - 35

Language

Traditional Chinese

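Chinese Abstract (translated)

In recent years, threats such as "fake news" and "disinformation" have reached the level of national security concerns in information warfare and have become a research priority for many countries. The issue is not new, however: as early as 2014, when Russia intervened to influence the referendum on the status of Crimea in Ukraine, and again in the recent Russo-Ukrainian war, we can see opinion being steered on many social media platforms, whether by Russia or by other countries. This paper therefore focuses on the accounts and posts that publish suspicious information. It uses the data released on Twitter's official "Transparency" website, where Twitter defines suspicious accounts as disinformation-manipulation accounts linked to a government or state and publishes the accounts and posts confirmed as suspicious after investigation. Unlike earlier identification methods, we use "anomaly detection" techniques from machine learning to train a classifier that can distinguish anomalous messages and anomalous accounts with high accuracy. For data collection, we built a data crawling system based on the ETL framework and crawled the official accounts and tweets of celebrities. We then used normal posts published by identity-verified "blue check" accounts to examine the classifier's false positives. The experimental results show an accuracy of 96%, which is a very good result.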

English Abstract

In recent years, threats such as "fake news" and "disinformation" have reached the level of national security in information warfare and have become an important research issue. The phenomenon is not new: as early as 2014, Russia intervened to influence Ukraine's Crimea referendum, and in the recent Russo-Ukrainian war we again see public opinion being steered on many social media platforms, whether by Russia or by other countries. This article focuses on the accounts and posts that publish suspicious information and uses data from Twitter's official Transparency website. Twitter defines suspicious accounts as accounts linked to a government or state that manipulate disinformation, and publishes the accounts and posts after investigation and confirmation. Unlike previous identification methods, we use anomaly detection techniques from machine learning to train a classifier that can distinguish abnormal messages and abnormal accounts with high accuracy. For the dataset, we built a data crawling system based on the ETL framework and crawled the official accounts and tweets of celebrities. We then used normal posts from accounts with a blue tick, whose identities have been officially confirmed, to check the classifier for false positives. The experimental results show that the accuracy of our identification method reached 96%.
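
The abstract mentions an ETL-based crawling system but does not describe its internals. As a rough illustration only, the extract-transform-load pattern could be sketched in Python as below. Everything here is an assumption for illustration: the sample records, the field names (`id`, `user`, `verified`, `text`), the cleaning rules, and the SQLite table are hypothetical and are not taken from the paper.

```python
import sqlite3

# Hypothetical raw records standing in for crawler output; not real data.
RAW_TWEETS = [
    {"id": "1", "user": "alice_official", "verified": True, "text": "  Hello   world  "},
    {"id": "2", "user": "bob_official", "verified": True, "text": ""},
    {"id": "1", "user": "alice_official", "verified": True, "text": "  Hello   world  "},
    {"id": "3", "user": "carol", "verified": False, "text": "Breaking\nnews!"},
]


def extract(source):
    """Extract: yield raw records one by one (a real crawler would fetch them)."""
    yield from source


def transform(records):
    """Transform: collapse whitespace, drop empty texts and duplicate ids."""
    seen = set()
    for rec in records:
        text = " ".join(rec["text"].split())
        if not text or rec["id"] in seen:
            continue
        seen.add(rec["id"])
        yield (int(rec["id"]), rec["user"], int(rec["verified"]), text)


def load(rows, conn):
    """Load: write cleaned rows into a SQLite table and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets "
        "(id INTEGER PRIMARY KEY, user TEXT, verified INTEGER, text TEXT)"
    )
    conn.executemany("INSERT INTO tweets VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]


def run_pipeline(source):
    """Chain the three stages against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    count = load(transform(extract(source)), conn)
    return conn, count
```

Keeping the three stages as separate functions means the extraction step could later be swapped for a real crawler without touching the cleaning or storage code; whether the authors' system is organized this way is not stated in the abstract.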

Subject Classification: Basic and Applied Sciences > Information Science