Title

數量關聯規則探勘之項目離散化方式評估

Parallel title

Evaluation of Item Discretization Methods for Quantitative Association Rule Mining

Authors

Jinn-Yi Yeh (葉進儀); Guan-Jie Liao (廖冠傑)

Keywords

association rules (關聯規則); quantitative association rules (數量關聯規則); discretization (離散化); multiple minimum supports (多重最小支持度)

Journal

Journal of Quality (品質學報)

Volume and issue / publication date

Vol. 18, No. 6 (2011/12/01)

Pages

489-517

Language of content

Traditional Chinese

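Chinese abstract (translated)

Association rule mining is one application of data mining. Retail organizations place great importance on finding relationships or associations among products in order to generate revenue. Many past studies of association rules concentrated on finding associations among items and ignored purchase quantities, yet mining associations among the purchase quantities of items can further improve the quality of business decisions. When purchase quantities are taken into account, the support of each individual item quantity is very low, so few rules are mined. To mine more potentially valuable rules, this study applies discretization methods to divide the purchase quantities of each item into several intervals and mines quantitative association rules using multiple minimum supports. Experimental design is used to evaluate how different discretization methods affect the subsequent mining of quantitative association rules. The results show that K-means and density estimation trees outperform the other discretization methods, and that mining with multiple minimum supports can find low-support rule patterns without generating too many meaningless rules.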

English abstract

Mining association rules from large databases is a well-known data mining application. Retail enterprises focus on finding relationships among products in order to generate profit. Many previous studies on association rule mining discover associations among items without considering purchased quantities. However, associations among items together with their purchased quantities can reveal information that improves the quality of business decisions. When purchased quantities are considered, the support of each item-quantity pair drops drastically, so few potentially interesting rules are discovered. To discover more potentially interesting rules, we apply discretization methods to partition the possible quantities of each item into intervals and mine quantitative association rules using multiple minimum supports. Using experimental design, we evaluate how different discretization methods affect the mining of quantitative association rules. The experimental results show that K-means and tree-based density estimation perform better than the other discretization methods. They also show that mining with multiple minimum supports finds rules involving rare items without producing a large number of meaningless rules.
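
The effect described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy transactions, the simple one-dimensional k-means (Lloyd's algorithm) used as the discretizer, and the per-item threshold table `MIN_SUP` are all assumptions made for illustration. It only compares the support of single (item, quantity) pairs before and after discretization and does not generate itemsets or rules.

```python
from collections import Counter

def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on scalars; returns the sorted cluster centers."""
    pts = sorted(values)
    # Deterministic start: spread the initial centers across the sorted data.
    centers = [pts[(2 * i + 1) * len(pts) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in pts:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        new = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if new == centers:
            break
        centers = new
    return sorted(centers)

def interval_of(qty, centers):
    """Index of the nearest center, used as the interval label."""
    return min(range(len(centers)), key=lambda j: abs(qty - centers[j]))

def supports(transactions, label):
    """Fraction of transactions containing each (item, label) pair."""
    counts = Counter()
    for t in transactions:
        for item, qty in t.items():
            counts[(item, label(item, qty))] += 1
    return {key: c / len(transactions) for key, c in counts.items()}

# Hypothetical toy data: each transaction maps an item to its purchased quantity.
transactions = [
    {"milk": 1, "bread": 2},
    {"milk": 2, "bread": 1},
    {"milk": 1},
    {"milk": 6, "bread": 5},
    {"milk": 7, "bread": 6},
    {"milk": 8},
    {"bread": 2},
    {"milk": 2, "bread": 7},
]

# Hypothetical per-item minimum supports (one threshold per item).
MIN_SUP = {"milk": 0.35, "bread": 0.30}

# Support of exact quantities: each (item, quantity) pair is rare.
exact = supports(transactions, lambda item, qty: qty)

# Discretize each item's quantities into two intervals, then recount.
centers = {
    item: kmeans_1d([t[item] for t in transactions if item in t], k=2)
    for item in MIN_SUP
}
binned = supports(transactions, lambda item, qty: interval_of(qty, centers[item]))

frequent_exact = {key for key, s in exact.items() if s >= MIN_SUP[key[0]]}
frequent_binned = {key for key, s in binned.items() if s >= MIN_SUP[key[0]]}

print("frequent with exact quantities:", sorted(frequent_exact))
print("frequent with intervals:", sorted(frequent_binned))
```

With this toy data, no exact item-quantity pair reaches its item's threshold, while every interval does, which is the motivation for discretizing before mining. Swapping `kmeans_1d` for another discretizer only requires a function that maps a quantity to an interval label.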

Subject classification: Social Sciences > Management