Title

A Study on the Merging and Pruning of Decision-Tree-Based Knowledge

Parallel Title

A Decision-Tree-Based Merging-Pruning Approach

DOI

10.29767/ECS.200606.0001

Authors

馬芳資(Fang-Tz Ma);林我聰(Woo-Tsong Lin)

Keywords

Knowledge Integration; Decision Tree; Decision Tree Merging; Decision Tree Pruning

Journal

Electronic Commerce Studies

Volume/Issue (Publication Date)

Vol. 4, No. 2 (2006/06/30)

Pages

123 - 156

Language

Traditional Chinese

Chinese Abstract

With the arrival of the knowledge economy era, the creation, storage, application, and integration of knowledge have become important topics of discussion; this study addresses the topic of knowledge integration. Among knowledge representations, decision-tree-based knowledge takes a tree structure that can be presented graphically; it is structurally simple and easy to understand, so this study examines knowledge integration for decision-tree-based knowledge. We propose a decision-tree merging-pruning method, DTBMPA (Decision-Tree-Based Merging-Pruning Approach), to integrate existing/primitive decision-tree-based knowledge. The method comprises three main procedures: merging, pruning, and validation. Two primitive trees are first combined into a merged tree by the merging procedure; the pruning procedure then produces a pruned tree from it; finally, the validation procedure evaluates the pruned tree's accuracy. The merging procedure enlarges the tree's knowledge, while the pruning procedure trims the merged tree's excessive branches. In our experiments, the merged tree's accuracy exceeded that of a single primitive tree 90% of the time, and the pruned tree's accuracy was greater than or equal to the merged tree's 80% of the time. In statistical tests, the accuracies of both the merged tree and the pruned tree were significantly better than that of a single primitive tree; although the difference in accuracy between the pruned tree and the merged tree was not significant, the pruned tree had on average about 15% fewer nodes than the merged tree.

English Abstract

With the arrival of the knowledge economy era, knowledge creation, retention, application, and integration have become important topics of discussion. This research focuses on knowledge integration. Among knowledge representations, the decision tree is a common form that displays knowledge structure as a tree-shaped graphic; it is simple and easy to understand, so we focus on decision-tree-based knowledge in connection with the knowledge integration theme. Our research proposes an approach called DTBMPA (Decision-Tree-Based Merging-Pruning Approach) to integrate the knowledge of decision trees. The approach comprises three steps. In the first step, merging, two primitive decision trees are combined into a merged tree to enlarge the knowledge of the primitive trees. In the second step, pruning, the merged tree is pruned to cut off its biased branches, yielding a pruned tree. In the last step, validating, the performance of the pruned tree is evaluated. In the simulation experiments, the accuracy of the merged tree was greater than or equal to that of the primitive trees 90% of the time, and the accuracy of the pruned tree was greater than or equal to that of the merged tree 80% of the time. We also find that the pruned tree had, on average, about 15% fewer nodes than the merged tree.
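The three-step pipeline described in the abstract (merge two primitive trees, prune the merged tree against validation data, then validate) can be sketched on toy categorical decision trees. This is an illustrative assumption, not the paper's actual algorithms: the nested-dict tree encoding, the merge rule (union of branches, recursing on shared ones), and the pruning rule (collapse a subtree to its majority leaf when validation accuracy does not drop) are all hypothetical simplifications.

```python
from collections import Counter

def classify(tree, example):
    """Walk internal nodes (dicts) until a leaf (a plain class label) is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(example.get(tree["attr"]), tree["default"])
    return tree

def merge(t1, t2):
    """Merging step: union the branches of two trees to enlarge coverage.
    Simplifying assumption: matching internal nodes test the same attribute."""
    if not isinstance(t1, dict):
        return t2 if isinstance(t2, dict) else t1  # prefer the deeper side
    if not isinstance(t2, dict):
        return t1
    branches = dict(t1["branches"])
    for value, subtree in t2["branches"].items():
        branches[value] = merge(branches[value], subtree) if value in branches else subtree
    return {"attr": t1["attr"], "branches": branches, "default": t1["default"]}

def accuracy(tree, data):
    """Fraction of (example, label) pairs the tree classifies correctly."""
    return sum(classify(tree, x) == y for x, y in data) / len(data)

def majority_leaf(tree):
    """Most common leaf label under a subtree."""
    if not isinstance(tree, dict):
        return tree
    labels, stack = [], [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, dict):
            stack.extend(node["branches"].values())
            stack.append(node["default"])
        else:
            labels.append(node)
    return Counter(labels).most_common(1)[0][0]

def prune(tree, val):
    """Pruning step: bottom-up, collapse a subtree to its majority leaf when
    that does not hurt accuracy on the validation examples routed to it."""
    if not isinstance(tree, dict) or not val:
        return tree
    branches = {v: prune(sub, [(x, y) for x, y in val if x.get(tree["attr"]) == v])
                for v, sub in tree["branches"].items()}
    candidate = {"attr": tree["attr"], "branches": branches, "default": tree["default"]}
    leaf = majority_leaf(candidate)
    return leaf if accuracy(leaf, val) >= accuracy(candidate, val) else candidate

# Two toy "primitive" trees, as if learned from different data partitions.
t1 = {"attr": "income", "default": "no",
      "branches": {"high": "yes", "low": "no"}}
t2 = {"attr": "income", "default": "no",
      "branches": {"high": "yes",
                   "medium": {"attr": "age", "default": "no",
                              "branches": {"old": "yes", "young": "no"}}}}

merged = merge(t1, t2)  # step 1: merged tree now covers all three income values
validation = [({"income": "high"}, "yes"), ({"income": "low"}, "no"),
              ({"income": "medium", "age": "old"}, "yes"),
              ({"income": "medium", "age": "young"}, "no")]
pruned = prune(merged, validation)     # step 2: trim unhelpful branches
print(accuracy(pruned, validation))    # step 3: validate → 1.0
```

On this toy data the merged tree recovers cases neither primitive tree handles alone (e.g. `income = medium`), mirroring the abstract's point that merging enlarges the trees' knowledge while pruning guards against excess branches.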

Subject Classification: Basic and Applied Sciences > Information Science
Social Sciences > Economics
References
  1. Bauer, E., Kohavi, R. (1999). An empirical comparison of voting classification algorithms: Bagging, boosting, and variants. Machine Learning, 36(1/2), 105-139.
  2. Bolakova, I. (2002). Pruning decision trees to reduce tree size. Proceedings of the International Conference on Traditions and Innovations in Sustainable Development of Society, Rezekne, Latvia.
  3. Bradford, J., Kunz, C., Kohavi, R., Brunk, C., Brodley, C. E. (1998). Pruning decision trees with misclassification costs. In Proceedings of the Tenth European Conference on Machine Learning (ECML-98), Berlin.
  4. Breiman, L. (1996). Bagging predictors. Machine Learning, 24, 123-140.
  5. Breiman, L., Friedman, J. H., Olshen, R., Stone, C. (1984). Classification and Regression Trees. Belmont, California: Wadsworth.
  6. Chan, P. K., Stolfo, S. J. (1995). Learning arbiter and combiner trees from partitioned data for scaling machine learning. In Proceedings of the International Conference on Knowledge Discovery and Data Mining.
  7. Dong, M., Kothari, R. (2001). Classifiability based pruning of decision trees. Proceedings of the International Joint Conference on Neural Networks (IJCNN), 3, 1739-1743.
  8. Dunham, M. H. (2003). Data Mining: Introductory and Advanced Topics. Pearson Education, Inc.
  9. Esposito, F., Malerba, D., Semeraro, G. (1995). A further study of pruning methods in decision tree induction. Proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics.
  10. Fournier, D., Crémilleux, B. (2002). A quality index for decision tree pruning. Knowledge-Based Systems, 15, 37-43.
  11. Frank, E. (2000). Pruning Decision Trees and Lists. Doctoral dissertation, Department of Computer Science, University of Waikato, Hamilton, New Zealand.
  12. Frank, E., Hall, M., Trigg, L., Holmes, G., Witten, I. H. (2004). Bioinformatics Advance Access, published online 8 April 2004. Oxford University Press.
  13. Mingers, J. (1989). An empirical comparison of pruning methods for decision tree induction. Machine Learning, 4, 227-243.
  14. Murthy, S. K. (1997). University of Maryland.
  15. Niblett, T., Bratko, I. (1986). In M. A. Bramer (Ed.), Research and Development in Expert Systems III: Proceedings of Expert Systems '86, Brighton. San Francisco, CA: Morgan Kaufmann.
  16. Quinlan, J. R. (1998). MiniBoosting decision trees. Journal of Artificial Intelligence Research.
  17. Quinlan, J. R. (1992). C4.5: Programs for Machine Learning. San Mateo: Morgan Kaufmann.
  18. Quinlan, J. R. (1987). Simplifying decision trees. International Journal of Man-Machine Studies, 27(3), 221-234.
  19. Quinlan, J. R. (1986). The effect of noise on concept learning (book chapter). Los Altos, CA: Morgan Kaufmann.
  20. Todorovski, L., Dzeroski, S. (2000). Combining multiple models with meta decision trees. In Proceedings of the Fourth European Conference on Principles of Data Mining and Knowledge Discovery.
  21. Williams, G. (1990). Canberra, Australia: Australian National University.
  22. Windeatt, T., Ardeshir, G. (2001). Proceedings of the International Conference on Intelligent Data Analysis, September 13-15, Lisbon, Portugal. Lecture Notes in Computer Science. Springer-Verlag.
  23. Witten, I. H., Frank, E. (2000). Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann.
  24. 馬芳資 (1995). A study of a learning-from-examples early-warning system for credit card credit risk. Proceedings of the Tenth National Conference on Technological and Vocational Education, Business Track I.
  25. 馬芳資, 林我聰 (2003). An online prediction system framework for decision-tree-based knowledge. Journal of Library and Information Science, 29(2), 60-76.
  26. 陳重銘 (1995). Master's thesis, Institute of Information Management, National Sun Yat-sen University.
  27. 曾憲雄, 黃國禎 (2005). Artificial Intelligence and Expert Systems: Theory, Practice, and Applications. Taipei: Flag Publishing Co.