Title

應用以約定值為基礎之演算法於關聯規則探勘

Parallel Title

Applying Bond-based Algorithm for Mining Association Rules

DOI

10.6382/JIM.200810.0123

Authors

葉進儀(Jinn-Yi Yeh);林彣珊(Wen-San Lin);郭文熙(Wen-Hsi Kuo)

Keywords

資料探勘 ; 關聯規則 ; 約定值 ; 跨支持度 ; data mining ; association rules ; bond ; cross-support

Journal

資訊管理學報 (Journal of Information Management)

Volume and Issue / Publication Date

Vol. 15, No. 4 (2008/10/01)

Pages

123 - 149

Language

Traditional Chinese

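Abstract (translated from Chinese)

Most existing methods for mining association rules in large databases rely on a support-based pruning strategy to reduce the time spent searching for rules. At low support thresholds, however, this strategy cannot effectively find potentially valuable patterns, and the low threshold also drives up the demand for additional resources such as memory. At high support thresholds, it loses patterns that have low support but high confidence and high correlation. This study first proves that the bond measure has the cross-support property and then uses this property to prune itemsets of no value, which speeds up the algorithm and saves system resources. Moreover, if the bond of an itemset exceeds the minimum bond threshold, the support of that itemset is above a certain lower bound, and the confidence of the association rules derived from it is also above a certain lower bound, so the association rules mined with the bond measure are valuable. Finally, the study applies the algorithm to real transaction data. The experimental results show that the pruning strategy based on the cross-support property of the bond reduces the time needed to find large itemsets, and that the items within the large itemsets found are highly correlated.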

English Abstract

Most current methods for mining association rules in large databases use a support-based pruning strategy to reduce the search space. However, this strategy is inefficient at mining valuable patterns when the support threshold is low, because it consumes large amounts of resources. When the support threshold is high, it loses valuable itemsets that have low support but high confidence and high correlation. This paper applies a bond-based threshold to mining association rules in large databases. We first prove that the bond has a cross-support property and then use this property to prune itemsets of no value, which improves the efficiency of the algorithm and conserves system resources. If the bond of an itemset is greater than the bond threshold, the support of the itemset is greater than a certain lower bound, the confidence of the association rules produced from the itemset is also greater than a certain lower bound, and the individual items in the itemset are highly correlated. Therefore, when both bond-based and support-based pruning are used, the resulting association rules are valuable. Our experiments were performed on real data sets. The results show that this approach reduces the search space and finds valuable patterns whose individual items are highly correlated.
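
The pruning idea in the abstract can be illustrated with a small sketch. The abstract does not define the bond, so the code assumes the definition from Omiecinski (2003): the fraction of transactions containing all items of an itemset divided by the fraction containing at least one of them. Under that assumption, the bond can never exceed the ratio of the smallest to the largest single-item support, because the joint support is at most the smallest item support and the "any item" support is at least the largest. An itemset that mixes items with very different supports can therefore be discarded without counting it. All function names and the toy transactions are hypothetical, and the brute-force enumeration is for illustration only; it is not the authors' algorithm.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(items <= t for t in transactions) / len(transactions)

def disjunctive_support(itemset, transactions):
    """Fraction of transactions that contain at least one item of the itemset."""
    items = set(itemset)
    return sum(bool(items & t) for t in transactions) / len(transactions)

def bond(itemset, transactions):
    """Assumed definition (Omiecinski, 2003): joint support / any-item support."""
    any_supp = disjunctive_support(itemset, transactions)
    return support(itemset, transactions) / any_supp if any_supp else 0.0

def bond_upper_bound(itemset, transactions):
    """min item support / max item support, which is always >= the bond."""
    supports = [support([item], transactions) for item in itemset]
    return min(supports) / max(supports) if max(supports) else 0.0

def mine_by_bond(transactions, min_bond, max_size=3):
    """Brute-force illustration: skip candidates whose bound is below min_bond."""
    items = sorted(set().union(*transactions))
    kept = []
    for size in range(2, max_size + 1):
        for candidate in combinations(items, size):
            if bond_upper_bound(candidate, transactions) < min_bond:
                continue  # pruned without counting the candidate itself
            if bond(candidate, transactions) >= min_bond:
                kept.append(candidate)
    return kept

# Hypothetical toy data: a frequent pair and a rare but tightly linked pair.
transactions = [
    {"bread", "milk"}, {"bread", "milk"}, {"bread", "milk"}, {"bread"},
    {"bread", "milk"}, {"bread"}, {"caviar", "champagne"},
    {"bread", "caviar", "champagne"},
]

result = mine_by_bond(transactions, min_bond=0.5)
print(result)  # -> [('bread', 'milk'), ('caviar', 'champagne')]
```

In this toy data the pair caviar and champagne has a support of only 0.25 but a bond of 1.0, so it survives a bond threshold of 0.5 even though a support threshold above 0.25 would discard it. The pair bread and caviar is skipped by the bound alone, since its support ratio of 2/7 is already below the threshold.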

Subject Classification: Basic and Applied Sciences > Information Science
Social Sciences > Management
References
  1. Aggarwal, C. C., Procopiuc, C., Yu, P. S. (2002). Finding Localized Associations in Market Basket Data. IEEE Transactions on Knowledge and Data Engineering, 14(1), 51-62.
  2. Agrawal, R., Imielinski, T., Swami, A. (1993). Mining Association Rules between Sets of Items in Large Databases. Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, Washington, D.C.
  3. Agrawal, R., Srikant, R. (1994). Fast Algorithms for Mining Association Rules in Large Databases. Proceedings of the 20th International Conference on Very Large Data Bases, Santiago, Chile.
  4. Blischok, T. (1995). Every Transaction Tells a Story. Chain Store Age Executive with Shopping Center Age, 71(3), 50-57.
  5. Brijs, T., Swinnen, G., Vanhoof, K., Wets, G. (2004). Building an Association Rules Framework to Improve Product Assortment Decisions. Data Mining and Knowledge Discovery, 8(1), 7-23.
  6. Brijs, T., Swinnen, G., Vanhoof, K., Wets, G. (1999). Using Association Rules for Product Assortment Decisions: A Case Study. Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, California.
  7. Brin, S., Motwani, R., Ullman, J. D., Tsur, S. (1997). Dynamic Itemset Counting and Implication Rules for Market Basket Data. Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data, Tucson, Arizona.
  8. Cohen, E., Datar, M., Fujiwara, S., Gionis, A., Indyk, P., Motwani, R., Ullman, J. D., Yang, C. (2001). Finding Interesting Associations without Support Pruning. IEEE Transactions on Knowledge and Data Engineering, 13(1), 64-78.
  9. Han, J., Kamber, M. (2001). Data Mining: Concepts and Techniques. London: Morgan Kaufmann.
  10. Han, J., Pei, J., Yin, Y., Mao, R. (2004). Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach. Data Mining and Knowledge Discovery, 8(1), 53-87.
  11. Hand, D. J., Blunt, G., Kelly, M. G., Adams, N. M. (2000). Data Mining for Fun and Profit. Statistical Science, 15(2), 111-131.
  12. Lawrence, R. D., Almasi, G. S., Kotlyar, V., Viveros, M. S., Duri, S. S. (2001). Personalization of Supermarket Product Recommendations. Data Mining and Knowledge Discovery, 5(1-2), 11-32.
  13. Liu, B., Hsu, W., Ma, Y. (1999). Mining Association Rules with Multiple Minimum Supports. Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, California.
  14. Liu, J., Pan, Y., Wang, K., Han, J. (2002). Mining Frequent Item Sets by Opportunistic Projection. Proceedings of 2002 International Conference on Knowledge Discovery in Databases (KDD'02), Edmonton, Canada.
  15. Ogihara, M., Li, W. (1997). Technical Report, University of Rochester.
  16. Omiecinski, E. R. (2003). Alternative Interest Measures for Mining Associations in Databases. IEEE Transactions on Knowledge and Data Engineering, 15(1), 57-69.
  17. Park, J. S., Chen, M.-S., Yu, P. S. (1995). An Effective Hash-Based Algorithm for Mining Association Rules. Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, San Jose, California.
  18. Savasere, A., Omiecinski, E. R., Navathe, S. B. (1995). An Efficient Algorithm for Mining Association Rules in Large Databases. Proceedings of the 21st International Conference on Very Large Data Bases, Zurich, Switzerland.
  19. Song, M., Rajasekaran, S. (2006). A Transaction Mapping Algorithm for Frequent Itemsets Mining. IEEE Transactions on Knowledge and Data Engineering, 18(4), 472-481.
  20. Xiong, H., Tan, P. N., Kumar, V. (2003). Mining Strong Affinity Association Patterns in Data Sets with Skewed Support Distribution. Proceedings of the Third IEEE International Conference on Data Mining.
  21. Xiong, H., Tan, P. N., Kumar, V. (2006). Hyperclique Pattern Discovery. Data Mining and Knowledge Discovery, 13(2), 219-242.
  22. Xiong, H., Tan, P. N., Kumar, V. (2003). Technical Report, Department of Computer Science, University of Minnesota.
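  23. 丁一賢, 陳牧言 (2005). 資料探勘 [Data Mining]. Taichung: 滄海書局.
  24. 彭文正 (trans.), Michael J. A. Berry, Gordon S. Linoff (2001). 資料採礦 顧客關係管理暨電子行銷之應用 [Data Mining: Applications in Customer Relationship Management and E-Marketing]. Taipei: 數博網資訊股份有限公司.
  25. 曾憲雄, 蔡秀滿, 蘇東興, 曾秋蓉, 王慶堯 (2004). 資料探勘 [Data Mining]. Taipei: 旗標出版股份有限公司.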
Cited By
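  1. 黃冠凱, 李晏華, 吳信宏 (2022). 探討醫院異常事件通報病患發生跌倒事件之分析-以中部某區域教學醫院為例 [An analysis of patient fall events in hospital incident reports: the case of a regional teaching hospital in central Taiwan]. 品質學報, 29(2), 99-117.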