Title

由已訓練類神經網路擷取成本敏感之分類規則 (Extracting Cost-Sensitive Classification Rules from Trained Neural Networks)

Parallel Title

An Approach of Retrieving Cost-Sensitive Classification Rules from Trained Neural Networks

DOI

10.6188/JEB.2005.7(3).04

Authors

黃宇翔(Yeu-Shiang Huang);毛紹睿(Shiao-Rei Mao)

Keywords

資料探勘 ; 類神經網路 ; 規則擷取 ; 分類錯誤成本 ; Data Mining ; Neural Network ; Rule Extraction ; Misclassification Cost

Journal

電子商務學報 (Journal of e-Business)

Volume and Issue / Publication Date

Vol. 7, No. 3 (2005/09/01)

Pages

275 - 291

Language

Traditional Chinese

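Chinese Abstract (English translation)

Neural networks are one of the techniques used for data mining problems. Their learning results usually have high accuracy, they tolerate noisy data well, and their network architecture can express complex relationships among attributes. However, the learning result is a black box that lacks explanatory power for users, which limits the application of neural networks to some degree. This study uses a rule induction algorithm to extract explicit rules from a trained neural network in order to explain what the network has learned, and the proposed rule extraction framework is applicable to different neural network models. The influence of misclassification costs is also considered during rule extraction, so that the extracted rules reflect the different misclassification costs of the classes and better meet practical needs. The framework uses the PRISM algorithm proposed by Cendrowska as the basis for rule extraction and makes the extracted rules cost-sensitive in three ways: AdaCost, MetaCost, and a modified PRISM information function. The proposed method is compared with the REFNE rule extraction framework on data sets from the UCI-ML repository, analyzing the number of rules generated, their accuracy, and their misclassification cost.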

English Abstract

Neural networks, a popular approach in data mining, usually learn with relatively high accuracy. They tolerate noisy data well, and their network structure can represent complicated relationships among attributes. However, a trained neural network is a black box: it cannot explain its results to users in a comprehensible, manageable form, which restricts its applications. In this paper, a rule induction algorithm is employed to retrieve explicit rules that interpret the learning results of trained neural networks. Furthermore, by considering misclassification costs in the retrieval process, the retrieved rules better reflect practical needs. The proposed approach is based on the PRISM algorithm proposed by Cendrowska and uses three methods, AdaCost, MetaCost, and a modified PRISM information function, to account for misclassification costs. An empirical investigation using the UCI-ML database verifies the effectiveness of the proposed approach.
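
The idea in the abstract, inducing rules from the predictions of a trained network rather than from the original labels, can be illustrated with a minimal sketch. This is not the authors' implementation: `black_box` is a hypothetical stand-in for a trained neural network, the toy attributes and the `cost` table are invented for illustration, and the per-instance cost weighting is only one assumed way to make a PRISM-style covering loop cost-aware (the record does not specify the paper's modified information function).

```python
from itertools import product

def prism(rows, labels, weights=None):
    """PRISM-style covering: returns a list of (conditions, class) rules.

    rows    -- list of dicts, attribute name -> categorical value
    labels  -- class per row (here: predictions of the black-box model)
    weights -- optional positive per-row weights (assumed cost weighting)
    """
    if weights is None:
        weights = [1.0] * len(rows)
    attrs = list(rows[0])
    rules = []
    for cls in sorted(set(labels)):
        remaining = set(range(len(rows)))  # restart from the full set per class
        while any(labels[i] == cls for i in remaining):
            covered, conds = set(remaining), {}
            # Grow the rule until it covers only instances of cls.
            while not all(labels[i] == cls for i in covered):
                best = None  # (score, attribute, value, covered subset)
                for attr in attrs:
                    if attr in conds:
                        continue
                    for val in sorted({rows[i][attr] for i in covered}, key=str):
                        sub = {i for i in covered if rows[i][attr] == val}
                        pos = sum(weights[i] for i in sub if labels[i] == cls)
                        if pos == 0:
                            continue
                        # Weighted precision, ties broken by weighted coverage.
                        score = (pos / sum(weights[i] for i in sub), pos)
                        if best is None or score > best[0]:
                            best = (score, attr, val, sub)
                if best is None:  # attributes exhausted; accept an impure rule
                    break
                _, attr, val, covered = best
                conds[attr] = val
            rules.append((conds, cls))
            remaining -= covered  # drop everything the new rule covers
    return rules

def classify(rules, row, default=None):
    """Return the class of the first rule whose conditions all match."""
    for conds, cls in rules:
        if all(row[a] == v for a, v in conds.items()):
            return cls
    return default

def black_box(row):
    # Hypothetical stand-in for a trained neural network's prediction.
    return "bad" if row["income"] == "low" and row["debt"] == "high" else "good"

# Toy categorical data (invented); labels come from the black box.
values = {"income": ["low", "high"], "debt": ["low", "high"], "owner": ["yes", "no"]}
rows = [dict(zip(values, combo)) for combo in product(*values.values())]
labels = [black_box(r) for r in rows]

# Assumed misclassification costs per class, used as instance weights.
cost = {"bad": 5.0, "good": 1.0}
weights = [cost[y] for y in labels]

rules = prism(rows, labels, weights)
for conds, cls in rules:
    print(conds, "->", cls)
```

Because the rules are induced from the black box's own outputs, their agreement with the network (fidelity) can be checked directly by comparing `classify(rules, row)` with `black_box(row)` on each row.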

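Subject Classification
Humanities > General Humanities
Basic and Applied Sciences > Information Science
Basic and Applied Sciences > Statistics
Social Sciences > General Social Sciences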
References
  1. Boz, O.(2002).Extracting decision trees from trained neural networks.Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
  2. Cendrowska, J.(1987).PRISM: An algorithm for inducing modular rules.International Journal of Man-Machine Studies,27(4),349-370.
  3. Chan, P.,Stolfo, S.(1998).Towards scalable learning with non-uniform class and cost distributions: A case study in credit card fraud detection.Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining.
  4. Cohen, W. W.(1995).Fast effective rule induction.Proceedings of the Twelfth International Conference on Machine Learning.
  5. Craven, M. W.,Shavlik, J. W.(1996).Extracting tree-structured representations of trained networks.Advances in Neural Information Processing Systems,8,24-30.
  6. Domingos, P.(1999).Metacost: A general method for making classifiers cost-sensitive.Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining.
  7. Drummond, C.,Holte, R.(2000).Exploiting the cost (in)sensitivity of decision tree splitting criteria.Proceedings of the Seventeenth International Conference on Machine Learning.
  8. Fan, W.,Stolfo, S. J.,Zhang, J.,Chan, P. K.(1999).AdaCost: Misclassification cost-sensitive boosting.Proceedings of the Sixteenth International Conference on Machine Learning.
  9. Fu, L.(1998).A neural-network model for learning domain rules based on its activation function characteristics.IEEE Transactions on Neural Networks,9(5),787-795.
  10. Fu, X.,Wang, L.(2001).Rule extraction by genetic algorithms based on a simplified RBF neural network.Proceedings of the 2001 Congress on Evolutionary Computation.
  11. Han, J.,Kamber, M.(2001).Data Mining: Concepts and Techniques.CA:Morgan Kaufmann.
  12. Ikizler, N.(2002).Technical Report BU-CE-0208,Bilkent University.
  13. Liu, H.,Setiono, R.(1997).Feature selection via discretization of numeric attributes.IEEE Transactions on Knowledge and Data Engineering,9(4),642-645.
  14. Norton, S. W.(1989).Generating better decision trees.Proceedings of the Eleventh International Joint Conference on Artificial Intelligence.
  15. Nunez, M.(1991).The use of background knowledge in decision tree induction.Machine Learning,6,231-250.
  16. Provost, F.,Fawcett, T.,Kohavi, R.(1998).The case against accuracy estimation for comparing induction algorithms.Proceedings of the Fifteenth International Conference on Machine Learning.
  17. Setiono, R.,Leow, W. K.,Zurada, J. M.(2002).Extraction of rules from artificial neural networks for nonlinear regression.IEEE Transactions on Neural Networks,13(3),564-577.
  18. Tan, M.(1991).Cost-sensitive reinforcement learning for adaptive classification and control.Proceedings of the Eighth International Workshop on Machine Learning.
  19. Thrun, S. B.(1995).Extracting rules from artificial neural networks with distributed representations.Advances in Neural Information Processing Systems,7,505-512.
  20. Tickle, A. B.,Golea, M.,Hayward, R.,Diederich, J.(1997).The truth is in there: Current issues in extracting rules from trained feed forward artificial neural networks.Proceedings of the International Conference on Neural Networks.
  21. Ting, K. M.,Zheng, Z.(1998).Boosting trees for cost-sensitive classifications.Proceedings of the Tenth European Conference on Machine Learning.
  22. Tsukimoto, H.(2000).Extracting rules from trained neural networks.IEEE Transactions on Neural Networks,11(2),377-389.
  23. Turney, P.(1995).Cost-sensitive classification: empirical evaluation of a hybrid genetic decision tree induction algorithm.Journal of Artificial Intelligence Research,2,369-409.
  24. Turney, P.(2000).Types of cost in inductive concept learning.Proceedings of the Cost-Sensitive Learning Workshop at the Seventeenth International Conference on Machine Learning.
  25. Witten, I. H.,Frank, E.(2000).Data Mining: Practical Machine Learning Tools And Techniques With Java Implementations.CA:Morgan Kaufmann.
  26. Zhou, Z. H.,Jiang, Y.,Chen, S. F.(2003).Extracting symbolic rules from trained neural network ensembles.AI Communications,16(1),3-15.
  27. Zubek, V. B.,Dietterich, T. G.(2002).Pruning improves heuristic search for cost-sensitive learning.Proceedings of the Nineteenth International Conference on Machine Learning.
Cited By
  1. 葉建良、梁德馨(2008)。消費者信用貸款違約風險評估之研究—以CART分類與迴歸樹建模。中山管理評論,16(3),465-506。