Title

基於結構相似度之惡意程式原始碼分類研究 (A Study of Malware Source Code Classification Based on Structural Similarity)

Parallel Title

Malware Classification Based on Structure Similarity

DOI

10.6188/JEB.2013.15(4).04

Authors

陳嘉玫 (Chia-Mei Chen); 楊佳蕙 (Chia-Hui Yang); 賴谷鑫 (Gu-Hsin Lai)

Keywords

Malware classification (惡意軟體分類); static analysis (靜態分析); structure similarity (結構相似度)

Journal

電子商務學報 (Journal of e-Business)

Volume and Issue / Publication Date

Vol. 15, No. 4 (2013/12/01)

Pages

519-539

Language

Traditional Chinese

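Chinese Abstract (translated)

In the face of increasingly sophisticated advanced persistent threats (APT), malware classification is the most important part of digital forensics. Correct malware classification yields the most complete picture of a malware sample's system behavior and simplifies forensic analysis. Traditional malware classification relies on dynamic analysis after execution, or on reverse engineering combined with static analysis, to obtain information about the malware's system behavior, but malware uses anti-virtual-machine monitoring and obfuscation techniques to reduce classification accuracy. As honeypot systems mature, the amount of malware source code they collect is also growing, and analyzing that source code yields the most accurate classification; this paper therefore proposes an automated malware classification mechanism. Using malware source code captured by honeypots, the paper applies a hierarchical clustering algorithm to the similarity of malware file structure and the similarity of source code files, so that newly captured malware is assigned to the correct class and new types of malware are identified quickly. The proposed approach substantially reduces the need for digital forensic analysts to repeat costly analysis of malware of the same type, and helps them understand attacker behavior and intent in the shortest possible time. Experiments show that the proposed system classifies malware source code correctly, and the proposed method can also be applied to other domains that need source code classification.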

English Abstract

In the face of advanced persistent threats (APT), malware classification is one of the promising solutions in the field of digital forensics. In previous literature, researchers performed dynamic analysis, or static analysis after reverse engineering. On the other hand, malware developers use anti-VM and obfuscation techniques to evade malware classifiers. Honeypots are increasingly deployed across different networks, and the malware source code they collect remains unclassified. Source code analysis provides a better classification for forensics. In this paper, a novel classification approach is proposed, based on logic similarity and directory structure similarity. A hierarchical clustering algorithm finds the best-fit class for each test sample and creates a new class if none fits well. New types of malware can thus be identified and then analyzed further. Such classification avoids re-analyzing known malware and frees resources for new malware. The experimental results demonstrate that the proposed system can classify malware effectively with a small misclassification ratio.
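
The assignment step described in the abstract (place each new sample in its best-fitting class, or open a new class when none fits well) can be illustrated with a minimal Python sketch. This is not the authors' implementation. Everything specific here is an assumption made for illustration: a sample is modeled as a mapping from relative file path to source text, directory structure similarity is approximated with Jaccard similarity over paths, logic similarity is approximated with `difflib.SequenceMatcher` over files sharing a path, and the equal weighting, the average-linkage score, and the 0.6 threshold are hypothetical choices.

```python
from difflib import SequenceMatcher


def structure_similarity(a, b):
    """Jaccard similarity over the sets of relative file paths (assumed measure)."""
    paths_a, paths_b = set(a), set(b)
    if not paths_a and not paths_b:
        return 1.0
    return len(paths_a & paths_b) / len(paths_a | paths_b)


def source_similarity(a, b):
    """Mean text similarity over files that share a path; 0.0 if none do."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    ratios = [SequenceMatcher(None, a[p], b[p]).ratio() for p in shared]
    return sum(ratios) / len(ratios)


def similarity(a, b, w_structure=0.5):
    """Weighted mix of the two measures; the weight is a hypothetical parameter."""
    return (w_structure * structure_similarity(a, b)
            + (1.0 - w_structure) * source_similarity(a, b))


def assign(sample, clusters, threshold=0.6):
    """Add sample to the cluster with the highest average similarity,
    or start a new cluster if no score reaches the threshold.
    Returns the index of the cluster that received the sample."""
    best_index, best_score = None, 0.0
    for i, members in enumerate(clusters):
        score = sum(similarity(sample, m) for m in members) / len(members)
        if score > best_score:
            best_index, best_score = i, score
    if best_index is not None and best_score >= threshold:
        clusters[best_index].append(sample)
        return best_index
    clusters.append([sample])
    return len(clusters) - 1


# Toy samples (invented for illustration): two near-identical bots and one unrelated worm.
bot_a = {"src/main.c": "int main(){connect();}", "src/irc.c": "void irc(){join();}"}
bot_b = {"src/main.c": "int main(){connect();}", "src/irc.c": "void irc(){join(); ping();}"}
worm = {"scan.py": "import socket", "spread.py": "def spread(): pass"}

clusters = []
labels = [assign(s, clusters) for s in (bot_a, bot_b, worm)]
print(labels)
```

In this toy run the two bot variants share a directory layout and most of their source, so they land in the same cluster, while the worm shares no paths with them and opens a new one. A full hierarchical clustering would also merge clusters bottom-up; the sketch covers only the incremental best-fit step.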

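Subject Classification

Humanities > Humanities (General)
Basic and Applied Sciences > Information Science
Basic and Applied Sciences > Statistics
Social Sciences > Social Sciences (General)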
References
  1. Altaher, A., Supriyanto, Almomani, A., Anbar, M., Ramadass, S. (2012). Malware detection based on evolving clustering method for classification. Scientific Research and Essays, 7(22), 2031-2036.
  2. Bergroth, L., Hakonen, H., Raita, T. (2000). A survey of longest common subsequence algorithms. Seventh International Symposium on String Processing and Information Retrieval (SPIRE 2000), A Coruña, Spain.
  3. Cesare, S., Xiang, Y. (2010). Classification of malware using structured control flow. Proceedings of the 8th Australasian Symposium on Parallel and Distributed Computing (AusPDC 2010), Brisbane, Australia.
  4. Christodorescu, M., Jha, S. (2003). Static analysis of executables to detect malicious patterns. Proceedings of the 12th USENIX Security Symposium, Washington, D.C., USA.
  5. Damerau, F. J. (1964). A technique for computer detection and correction of spelling errors. Communications of the ACM, 7(3), 171-176.
  6. Gheorghescu, M. (2005). An automated virus classification system. Virus Bulletin Conference, Dublin, Ireland.
  7. Hamming, R. W. (1950). Error detecting and error correcting codes. Bell System Technical Journal, 29(2), 147-160.
  8. Kolter, J. Z., Maloof, M. A. (2006). Learning to detect and classify malicious executables in the wild. Journal of Machine Learning Research, 7, 2721-2744.
  9. Levenshtein, V. I. (1966). Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady, 10(8), 707-710.
  10. Maletic, J. I., Valluri, N. (1999). Automatic software clustering via Latent Semantic Analysis. Proceedings of the 14th IEEE International Conference on Automated Software Engineering (ASE'99), Florida, USA.
  11. Willems, C., Holz, T., Freiling, F. (2007). Toward automated dynamic malware analysis using CWSandbox. IEEE Security and Privacy, 5(2), 32-39.
  12. Winkler, W. E. (1990). String comparator metrics and enhanced decision rules in the Fellegi-Sunter model of record linkage. Proceedings of the Section on Survey Research Methods, USA.
  13. Ye, Y., Chen, L., Wang, D., Li, T., Jiang, Q., Zhao, M. (2009). SBMDS: An interpretable string based malware detection system using SVM ensemble with bagging. Journal in Computer Virology, 5(4), 283-293.
  14. Ye, Y., Li, T., Huang, K., Jiang, Q., Chen, Y. (2010). Hierarchical associative classifier (HAC) for malware detection from the large and imbalanced gray list. Journal of Intelligent Information Systems, 35(1), 1-20.
  15. Ye, Y., Wang, D., Li, T., Ye, D., Jiang, Q. (2008). An intelligent PE-malware detection system based on association mining. Journal in Computer Virology, 4(4), 323-334.
  16. Zen, K., Iskandar, D. N. F. A., Linang, O. (2011). Using Latent Semantic Analysis for automated grading programming assignments. Proceedings of Semantic Technology and Information Retrieval (STAIR), Putrajaya, Malaysia.
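Cited By
  1. 賴谷鑫 (Gu-Hsin Lai), 陳嘉玫 (Chia-Mei Chen) (2016). 基於漸增式分群法之惡意程式自動分類研究 (A study of automatic malware classification based on incremental clustering). 電子商務學報 (Journal of e-Business), 18(2), 83-102.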