Title

NOVEL MACHINE LEARNING APPROACH FOR ANALYZING ANONYMOUS CREDIT CARD FRAUD PATTERNS

DOI

10.7903/ijecs.1732

Authors

Sylvester Manlangit;Sami Azam;Bharanidharan Shanmugam;Asif Karim

Keywords

Fraudulent Credit Card Transactions ; k-NN ; SMOTE ; PCA ; Machine Learning

Journal

International Journal of Electronic Commerce Studies

Volume and Issue / Publication Date

Vol. 10, No. 2 (2019/12/01)

Pages

175 - 201

Language

English

Abstract

Fraudulent credit card transactions are on the rise and have become a significant problem for financial institutions and individuals. Various methods have already been implemented to handle the issue, but fraudsters have consistently found innovative tactics to circumvent security measures and execute fraudulent transactions. Thus, instead of a rule-based system, an intelligent and adaptable machine-learning-based algorithm should be better suited to tackling such sophisticated digital theft. The presented framework uses k-NN for classification and Principal Component Analysis (PCA) to transform the raw data. Neighbours (anomalies in data) were created using the Synthetic Minority Oversampling Technique (SMOTE), and a distance-based feature selection method was employed. Using the misclassified instances in the test dataset from the previous work, the proposed process achieved a precision and F-score of 98.32% and 97.44% respectively for k-NN, and 100% and 98.24% respectively for the Time subset. This work also demonstrates a larger and clearer classification breakdown, which aids in achieving a higher precision rate and an improved recall rate.
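
As a rough illustration of two of the steps named in the abstract, the sketch below oversamples a minority (fraud) class with a SMOTE-style interpolation, classifies with k-NN, and reports precision, recall, and F-score. It is not the authors' implementation: the toy data, the function names, and the values of k are all assumptions made for illustration, and the PCA transformation and the distance-based feature selection are left out.

```python
import math
import random


def nearest(point, pool, k):
    """Return the k points in `pool` closest to `point` (Euclidean distance)."""
    return sorted(pool, key=lambda q: math.dist(point, q))[:k]


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points, each interpolated between a
    minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbour = rng.choice(nearest(base, others, k))
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def knn_predict(train, labels, point, k=3):
    """Label `point` by majority vote over its k nearest training points."""
    order = sorted(range(len(train)), key=lambda i: math.dist(point, train[i]))[:k]
    votes = [labels[i] for i in order]
    return max(set(votes), key=votes.count)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F-score for the positive (fraud) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical two-feature toy data: label 0 = legitimate, label 1 = fraud.
legit = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.9), (1.0, 0.4), (0.3, 0.6), (0.8, 0.8)]
fraud = [(5.0, 5.0), (5.5, 4.8), (4.7, 5.3)]

synthetic = smote(fraud, n_new=3, k=2)  # balance the classes at 6 vs 6
train = legit + fraud + synthetic
labels = [0] * len(legit) + [1] * (len(fraud) + len(synthetic))

test_points = [(0.2, 0.3), (5.1, 5.1), (0.9, 0.1), (4.9, 4.6)]
y_true = [0, 1, 0, 1]
y_pred = [knn_predict(train, labels, p) for p in test_points]
print(precision_recall_f1(y_true, y_pred))
```

Oversampling is applied to the training points only, so the test points remain untouched observations; the scores on this well-separated toy data say nothing about the figures reported in the abstract.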

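Subject Classification
Basic and Applied Sciences > Information Science
Social Sciences > Economics
Social Sciences > Finance and Accounting
Social Sciences > Management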