Title

基於漸增式分群法之惡意程式自動分類研究

Parallel title

Automatic malware classification based on incremental clustering algorithm

DOI

10.6188/JEB.2016.18(2).03

Authors

陳嘉玫 (Chia-Mei Chen); 賴谷鑫 (Gu-Hsin Lai)

Keywords

誘捕系統 ; 惡意程式分類 ; 靜態分析 ; 漸增式分群 ; Honeypot ; Classification of Malware ; Static analysis ; Incremental clustering

Journal

電子商務學報

Volume, issue / publication date

Vol. 18, No. 2 (2016/12/01)

Pages

225 - 247

Language

Traditional Chinese

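Chinese abstract (English translation)

In recent years, cybercriminals have continually developed new malware or produced variants in order to evade inspection by security mechanisms. Most current analysis methods examine only malware that consists of a single binary file, so they are poorly suited to the malware captured by honeypots, which mixes source code and binary files. An effective and fast tool for analyzing honeypot-captured malware is still lacking. This study proposes a malware classification system that extracts features from the malware's source code and file structure and groups the samples with an incremental clustering method. The incremental method improves on the efficiency of hierarchical clustering algorithms, and the resulting clusters show whether a newly captured sample belongs to a known category or to a new type. To verify classification accuracy, the study compares its results with Virustotal, a well-known online virus detection and classification platform; the experiments show that the proposed classification outperforms Virustotal.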

English abstract

In recent years, cybercriminals have developed new malware and variants in order to evade inspection by security mechanisms. Most prior work has focused on analyzing malware that consists of a single binary file. However, most malware captured by honeypots contains several binary and source files, so existing malware analysis approaches are not suitable for it. This research proposes a novel malware classification approach that analyzes features extracted from the malware's file structure, source code, binary files, and file names. An incremental clustering algorithm is developed to replace the traditional hierarchical clustering algorithm and improve efficiency. With the proposed system, when a honeypot captures new malware, IT security staff can determine whether it belongs to an existing cluster. To evaluate its performance, the proposed approach is compared with Virustotal, a popular platform for malware detection and classification. The experimental results show that the proposed approach outperforms Virustotal.
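The abstract does not give the details of the incremental clustering algorithm, so the following is only a minimal sketch of the general idea: each newly captured sample is compared with the existing clusters and either joins the most similar one or opens a new cluster. The Jaccard similarity, the average-linkage score, the 0.5 threshold, and all names (`IncrementalClusterer`, `add`, the feature strings) are assumptions made for illustration and are not taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity of two feature sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


@dataclass
class IncrementalClusterer:
    # Assumed similarity cut-off; the paper's actual criterion is not stated here.
    threshold: float = 0.5
    # Each cluster is a list of the feature sets assigned to it so far.
    clusters: list = field(default_factory=list)

    def add(self, features) -> tuple[int, bool]:
        """Assign one sample; return (cluster index, True if a new cluster was opened)."""
        sample = frozenset(features)
        best_index, best_score = -1, 0.0
        for index, members in enumerate(self.clusters):
            # Average similarity between the sample and every member of the cluster.
            score = sum(jaccard(sample, m) for m in members) / len(members)
            if score > best_score:
                best_index, best_score = index, score
        if best_index >= 0 and best_score >= self.threshold:
            # Similar enough: the sample joins an existing (known) cluster.
            self.clusters[best_index].append(sample)
            return best_index, False
        # Otherwise the sample is treated as a new type.
        self.clusters.append([sample])
        return len(self.clusters) - 1, True


# Hypothetical feature sets (file names and source-code tokens).
clusterer = IncrementalClusterer(threshold=0.5)
first = clusterer.add({"bot.c", "Makefile", "irc_connect", "scan_hosts"})
second = clusterer.add({"bot.c", "Makefile", "irc_connect", "flood_udp"})
third = clusterer.add({"shell.php", "eval", "base64_decode"})
print(first, second, third)  # (0, True) (0, False) (1, True)
```

Unlike a full hierarchical clustering run, this kind of scheme does not recompute all pairwise distances when a sample arrives; it only compares the new sample with the clusters that already exist, which is where the efficiency gain described in the abstract would come from.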

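Subject classification

Humanities > General Humanities
Basic and Applied Sciences > Information Science
Basic and Applied Sciences > Statistics
Social Sciences > General Social Sciences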
References

  1. 陳嘉玫, 楊佳蕙, 賴谷鑫 (2013). A study of malware source code classification based on structural similarity [基於結構相似度之惡意程式原始碼分類研究]. 電子商務學報, 15(4), 519-540.
  2. Conti, G., Bratus, S., Shubinay, A. (2010). A Visual Study of Primitive Binary Fragment Types. Black Hat USA.
  3. HP (2012). Cost of Cyber Crime Study: United States. Retrieved 2012, from http://www.hp.com/hpinfo/newsroom/press/2012/121008a.html
  4. Virustotal (2015). Free Online Virus, Malware and URL Scanner. Retrieved 2015, from https://www.virustotal.com/
  5. Trend Micro (2011). Press Releases: "Soldier" Uses SpyEye to Net $3.2 Million in Six Months. Retrieved 2011, from http://blog.trendmicro.com/trendlabs-security-intelligence/soldier-spyeyes-a-jackpot/
  6. Alazab, M., Layton, R., Venkatraman, S., Watters, P. (2010). Malware Detection Based On Structural and Behavioral Features of API Calls. Proceedings of International Cyber Resilience Conference.
  7. Bailey, M., Andersen, J., Morleyman, Z., Jahanian, F. (2007). Automated Classification and Analysis of Internet Malware. Proceedings of the 10th International Conference on Recent Advances in Intrusion Detection (RAID'07).
  8. Cosma, G., Joy, M. (2012). An Approach to Source-Code Plagiarism Detection and Investigation Using Latent Semantic Analysis. IEEE Transactions on Computers, 61(3), 379-394.
  9. Day, W. H., Edelsbrunner, H. (1984). Efficient algorithms for agglomerative hierarchical clustering methods. Journal of Classification, 1(1), 7-24.
  10. Firdausi, I., Lim, C., Erwin, A., Nugroho, A. S. (2010). Analysis of Machine learning Techniques Used in Behavior-Based Malware Detection. Proceedings of 2nd International Conference on Advances in Computing, Control, and Telecommunication Technologies.
  11. Gitchell, D., Tran, N. (1999). Sim: A Utility for Detecting Similarity In Computer Programs. Proceedings of the 30th SIGCSE Technical Symposium.
  12. Inoue, D., Yoshioka, K., Eto, M., Hoshizawa, Y., Nakao, K. (2008). Malware Behavior Analysis in Isolated Miniature Network for Revealing Malware's Network Activity. Proceedings of the IEEE International Conference on Communications (ICC 2008).
  13. Kolter, J. Z., Maloof, M. A. (2006). Learning to Detect and Classify Malicious Executables in the Wild. Journal of Machine Learning Research, 6, 2721-2744.
  14. Lee, T., Mody, J. J. (2006). Behavioral Classification. Proceedings of EICAR (European Institute for Computer Antivirus Research) Conference.
  15. MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability.
  16. Nataraj, L., Karthikeyan, S., Jacob, G., Manjunath, B. S. (2011). Malware images: visualization and automatic classification. Proceedings of the 8th International Symposium on Visualization for Cyber Security.
  17. Nataraj, L., Yegneswaran, V., Porras, P., Zhang, J. (2011). A Comparative Assessment of Malware Classification Using Binary Texture Analysis and Dynamic Analysis. Proceedings of the 4th ACM Workshop on Security and Artificial Intelligence.
  18. Prechelt, L., Malpohl, G., Philippsen, M. (2002). Finding Plagiarisms Among a Set of Programs with JPlag. Journal of Universal Computer Science, 8(11), 1016-1038.
  19. Rieck, K., Holz, T., Willems, C., Duessel, P., Laskov, P. (2008). Learning and Classification of Malware Behavior. Proceedings of the 5th International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment.
  20. Rieck, K., Trinius, P., Willems, C., Holz, T. (2011). Automatic Analysis of Malware Behavior using Machine Learning. Journal of Computer Security, 19(4), 639-668.
  21. Santos, I., Brezo, F., Nieves, J., Penya, Y. K., Sanz, B., Laorden, C., Bringsa, P. G. (2010). Idea: Opcode-sequence-based Malware Detection. Proceedings of the 2nd International Conference on Engineering Secure Software and Systems.
  22. Shankarapani, M. K., Ramamoorthy, S., Movva, R. S., Mukkamala, S. (2011). Malware Detection Using Assembly and API Call Sequences. Journal in Computer Virology, 7(2), 107-119.
  23. Symantec (2011). Symantec Internet Security Threat Report (ISTR), 17.
  24. Tian, R., Islam, R., Batten, L., Versteeg, S. (2010). Differentiating Malware from Cleanware using Behavioral Analysis. Proceedings of 5th International Conference on Malicious and Unwanted Software, 23-30.
  25. Tian, R., Batten, L., Islam, R., Versteeg, S. (2009). An Automated Classification System Based on The Strings of Trojan and Virus Families. Proceedings of the 4th International Conference on Malicious and Unwanted Software (MALWARE).
  26. Wang, C., Pang, J., Zhao, R., Liu, X. (2009). Using API Sequence and Bayes Algorithm to Detect Suspicious Behavior. Proceedings of International Conference on Communication Software and Networks (ICCSN).
  27. Yu, S., Zhou, S., Liu, L., Yang, R., Luo, J. (2010). Malware Variants Identification Based on Byte Frequency. Proceedings of 2nd International Conference on Network Security Wireless Communications and Trusted Computing (NSWCTC).
  28. Zhang, J., Porras, P., Yegneswaran, V. (2009). SRI International.
  29. Zhao, H., Xu, M., Zheng, N., Yao, J., Hou, Q. (2010). Malicious Executables Classification Based on Behavioral Factor Analysis. Proceedings of International Conference on e-Education, e-Business, e-Management and e-Learning.