Title

使用動態分析資料於卷積神經網路上進行惡意程式家族分類

Parallel title

Using dynamic analysis data for malware family classification by convolution neural network

Author

蕭舜文(Shun-Wen Hsiao)

Keywords

惡意程式 ; 動態分析 ; 卷積神經網路 ; 行為分類 ; Malware ; dynamic analysis ; convolution neural network ; behavior classification

Journal title

資訊安全通訊

Volume and issue / publication date

Vol. 24, No. 1 (2018/01/01)

Pages

41 - 60

Content language

Traditional Chinese

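Chinese abstract (English translation)

Traditionally, extracting byte signatures from malware and analyzing its malicious behavior consume a great deal of manpower and time, and the analysis usually relies on information security experts' years of experience in malware analysis. These experts typically assign newly discovered malware to a known malware family by comparing it against previously known malicious characteristics. However, the number of new malware variants now far exceeds what manual analysis can handle. To address this challenge, this paper uses a convolutional neural network to classify malware into families automatically and to generate behavior characteristics, turning formerly manual work into an automated process. Unlike previous studies, this paper first performs dynamic profiling analysis on the malware to produce records of its high-level Windows API call sequences; the convolutional neural network takes these sequences as input and outputs the malware family classification result. This paper also uses what the convolutional neural network has learned to explain the characteristic behaviors of the malware. In the experiments, we use malware collected in the real world by the National Center for High-Performance Computing and the Institute for Information Industry, profile it through dynamic analysis, and then perform supervised training and validation; the family classification accuracy exceeds 99%. Our experiments also show that a limited number of Windows API call sequences is enough for correct family classification, so our results can be further incorporated into intrusion prevention systems for early intrusion detection.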

English abstract

Conventionally, analyzing malware to extract its byte signature and malicious behavior takes a great deal of time and human resources. This analysis usually relies on cybersecurity domain experts' years of experience in malware analysis. They usually classify an unseen malware sample into a known malware family by checking it against known behavior characteristics. However, the number of new malware samples is now too large for human experts to analyze manually. To face this cybersecurity challenge, this paper provides a method that automatically classifies malware with a convolution neural network (CNN) and generates behavior characteristics with the help of the CNN. Unlike previous research, we first perform dynamic analysis on a malware sample and produce its high-level Windows API call sequences as its behavior profile. The API call sequences are then fed into the CNN as input to generate the malware family classification result. We also use what the CNN has learned to explain the behavior characteristics of the malware families. In our experiments, we use malware samples collected from the real world by the National Center for High-Performance Computing (Taiwan) to generate malware profiles and perform supervised training and validation. The family classification accuracy is over 99%. Our experiments also show that a limited number of Windows API call sequences is sufficient for malware classification; our result can therefore be used in an intrusion prevention system for early malware detection.
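The pipeline described in the abstract (API call sequence in, convolutional features out) can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the toy traces, the truncation length of 4, the one-hot encoding, and the hand-set kernel weights. A real model would learn its filters from data and stack further layers on top; this only shows how a single width-2 filter with ReLU and global max pooling can respond to one API-call pattern within a truncated trace.

```python
# Illustrative sketch only: names, sizes, and kernel weights are assumptions,
# not the architecture or data used in the paper.

def build_vocab(traces):
    """Map each distinct API name to an integer ID; ID 0 is reserved for padding."""
    vocab = {}
    for trace in traces:
        for call in trace:
            if call not in vocab:
                vocab[call] = len(vocab) + 1
    return vocab

def encode(trace, vocab, max_len):
    """Keep only the first max_len calls, then right-pad with 0.
    API names missing from the vocabulary are also mapped to 0."""
    ids = [vocab.get(call, 0) for call in trace[:max_len]]
    return ids + [0] * (max_len - len(ids))

def one_hot(ids, vocab_size):
    """One row per call; padding (ID 0) becomes an all-zero row."""
    rows = []
    for i in ids:
        row = [0.0] * vocab_size
        if i > 0:
            row[i - 1] = 1.0
        rows.append(row)
    return rows

def conv1d_max(matrix, kernel):
    """Slide a (width x vocab_size) kernel along the sequence,
    apply ReLU to each response, and return the global maximum."""
    width = len(kernel)
    best = 0.0
    for start in range(len(matrix) - width + 1):
        response = sum(kernel[k][j] * matrix[start + k][j]
                       for k in range(width)
                       for j in range(len(kernel[k])))
        best = max(best, max(response, 0.0))
    return best

# Toy traces using real Windows API names; the traces themselves are invented.
traces = [
    ["CreateFileW", "WriteFile", "CloseHandle"],
    ["RegOpenKeyExW", "RegSetValueExW"],
]
vocab = build_vocab(traces)

# Hand-set filter that responds to "CreateFileW followed by WriteFile".
kernel = [[1.0, 0.0, 0.0, 0.0, 0.0],   # position 1: CreateFileW (ID 1)
          [0.0, 1.0, 0.0, 0.0, 0.0]]   # position 2: WriteFile (ID 2)

scores = [conv1d_max(one_hot(encode(t, vocab, 4), len(vocab)), kernel)
          for t in traces]
```

With these toy inputs, scores is [2.0, 0.0]: the filter fires on the first trace and stays silent on the second. The max_len argument of encode is where a limit on the number of observed API calls would be applied.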

Subject classification: Basic and Applied Sciences > Information Science
References
  1. https://www.tensorflow.org/get_started/mnist/pros
  2. https://www.kaggle.com/c/malware-classification
  3. Bayer, U.,Comparetti, P. M.,Hlauschek, C.,Kruegel, C.,Kirda, E.(2009).Scalable, Behavior-Based Malware Clustering.Proc. Network and Distributed System Security Symposium (NDSS)
  4. Bayer, U.,Kruegel, C.,Kirda, E.(2006).TTAnalyze: A Tool for Analyzing Malware.Proc. European Institute for Computer Antivirus Research (EICAR 2006) Annual Conference
  5. Bilar, D.(2007).Opcodes as predictor for malware.International Journal of Electronic Security and Digital Forensics,1,156-168.
  6. Chen, P. M.,Noble, B. D.(2001).When virtual is better than real.Proc. of 8th Workshop on Hot Topics in Operating Systems (HotOS)
  7. Dinaburg, A.,Royal, P.,Sharif, M.,Lee, W.(2008).Ether: Malware Analysis via Hardware Virtualization Extensions.Proc. of ACM Conference on Computer and Communications Security
  8. Dunlap, G. W.,King, S. T.,Cinar, S.,Basrai, M. A.,Chen, P. M.(2002).ReVirt: Enabling Intrusion Analysis through Virtual-Machine Logging and Replay.Proc. of USENIX Symposium on Operating Systems Design and Implementation
  9. Forrest, S.,Hofmeyr, S. A.,Somayaji, A.,Longstaff, T. A.(1996).A Sense of Self for Unix Processes.Proc. of IEEE Symposium on Security and Privacy (S&P)
  10. Garfinkel, T.,Rosenblum, M.(2003).A Virtual Machine Introspection Based Architecture for Intrusion Detection.Proc. of NDSS
  11. Gibert, D.(2016).Universitat Politecnica de Catalunya.
  12. Hofmeyr, S. A.,Forrest, S.,Somayaji, A.(1998).Intrusion Detection using Sequences of System Calls.Journal of Computer Security,6,155-180.
  13. Hsiao, S.-W.,Chen, Y.-N.,Sun, Y. S.,Chen, M. C.(2013).A Cooperative Botnet Profiling and Detection in Virtualized Environment.Proc. of IEEE Conference on Communications and Network Security (IEEE CNS)
  14. Jiang, X.,Wang, X.,Xu, D.(2007).Stealthy Malware Detection through VMM-based ‘out-of-the-box’ Semantic View Reconstruction.Proc. of ACM CCS
  15. Kruegel, C.,Mutz, D.,Valeur, F.,Vigna, G.(2003).On the Detection of Anomalous System Call Arguments.Proc. of European Symposium on Research in Computer Security
  16. Lee, W.,Stolfo, S. J.(1998).Data Mining Approaches for Intrusion Detection.Proc. of USENIX Security Symposium
  17. Liu, L.,Chen, S.,Yan, G.,Zhang, Z.(2008).BotTracer: Execution-Based Bot-Like Malware Detection.Proc. of Int. Conf. on Information Security (ISC)
  18. Nataraj, L.,Karthikeyan, S.,Jacob, G.,Manjunath, B. S.(2011).Malware images: Visualization and automatic classification.Proc. of International symposium on Visualization for Cyber Security
  19. Ravi, C.,Manoharan, R.(2012).Malware detection using windows api sequence and machine learning.International Journal of Computer Applications,43
  20. Saxe, J.,Berlin, K.(2015).Deep neural network based malware detection using two dimensional binary program features.Proc. of MALWARE
  21. Song, D.,Brumley, D.,Yin, H.,Caballero, J.,Jager, I.,Kang, M. G.,Liang, Z.,Newsome, J.,Poosankam, P.,Saxena, P.(2008).BitBlaze: A New Approach to Computer Security via Binary Analysis.Proc. of International Conference on Information Systems Security
  22. Tesauro, G. J.,Kephart, J. O.,Sorkin, G. B.(1996).Neural networks for computer virus recognition.IEEE expert,11
  23. Veeramani, N.,Rai, N.(2012).Windows api based malware detection and framework analysis.International Journal of Scientific & Engineering Research,3
  24. Wagner, D.,Dean, D.(2001).Intrusion Detection via Static Analysis.Proc. of IEEE Symposium on Security and Privacy (IEEE S&P)
  25. Willems, C.,Holz, T.,Freiling, F.(2007).Toward Automated Dynamic Malware Analysis Using CWSandbox.IEEE Security & Privacy,5(2),32-39.
  26. Yin, H.,Song, D.,Egele, M.,Kruegel, C.,Kirda, E.(2007).Panorama: Capturing System-Wide Information Flow for Malware Detection and Analysis.Proc. of ACM Conference on Computer and Communications Security (ACM CCS)