Title

Efficiently Mining Frequent Closed Itemsets by Eliminating Data Redundancies

DOI

10.29767/ECS.200503.0003

Authors

Fan-Chen Tseng;Ching-Chi Hsu;Kuo-Sheng Fu

Keywords

Data Mining; Frequent Closed Itemset; Transaction Pattern List; Data Redundancy; Frequent Pattern List (FPL)

Journal

Electronic Commerce Studies

Volume and Issue / Publication Date

Vol. 3, No. 1 (2005/03/31)

Pages

39 - 55

Language

English

Abstract (English)

Recently, data mining has been applied in business information and intelligence systems to discover interesting patterns and knowledge that support decision-making processes. One of the most basic and important tasks of data mining is the mining of frequent itemsets, which are sets of items frequently purchased together by customers. Many methods have been proposed for this problem. However, mining the complete set of frequent itemsets often leads to a huge solution space. Fortunately, this problem can be reduced to the mining of Frequent Closed Itemsets (FCIs), which yields a much smaller yet representative set of customer purchase patterns. Still, there are redundancies in the databases that can be eliminated to improve both space and time efficiency. In this paper, we propose a novel data structure, the Transaction Pattern List (TPL), for eliminating data redundancies, and design the algorithm TPLFCI-Mining for mining FCIs efficiently with the TPL. Our algorithm is evaluated under more rigorous conditions than those used for previously proposed methods. Experimental results show that our method is efficient for both sparse and dense databases, and is scalable for large databases even at low support thresholds.
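
To make the abstract's central notion concrete, the following is a minimal brute-force sketch of what a frequent closed itemset is: a frequent itemset with no proper superset of equal support. It is not the paper's TPLFCI-Mining algorithm and does not implement the Transaction Pattern List; the function name, the toy transactions, and the support threshold are all assumptions made for illustration. Collapsing identical transactions with a counter is only a generic stand-in for the idea of removing redundant data.

```python
from collections import Counter
from itertools import combinations


def frequent_closed_itemsets(transactions, min_support):
    """Brute-force enumeration for toy data; illustrative only."""
    # Collapse identical transactions into (itemset, count) pairs so each
    # distinct transaction is scanned once. This is a generic redundancy
    # reduction, NOT the paper's Transaction Pattern List.
    weighted = Counter(frozenset(t) for t in transactions)
    items = sorted(set().union(*weighted))

    # Support of every candidate itemset (exponential in the item count).
    support = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            count = sum(n for t, n in weighted.items() if cand <= t)
            if count >= min_support:
                support[cand] = count

    # An itemset is closed if no proper superset has the same support.
    closed = {
        s: c for s, c in support.items()
        if not any(s < other and c == support[other] for other in support)
    }
    return support, closed


# Hypothetical toy database: four purchase transactions.
toy_db = [("a", "b", "c"), ("a", "b", "c"), ("a", "b"), ("a",)]
frequent, closed = frequent_closed_itemsets(toy_db, min_support=2)
print(len(frequent), "frequent itemsets,", len(closed), "closed")
for itemset, count in sorted(closed.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), count)
```

On this toy database all seven non-empty subsets of {a, b, c} are frequent, but only {a}, {a, b}, and {a, b, c} are closed, and the support of every other frequent itemset can be read off from its smallest closed superset. That is the sense in which the closed itemsets form a smaller yet representative set.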

Subject Classification: Basic and Applied Sciences > Information Science
Social Sciences > Economics
References
  1. Agarwal, R. C., Aggarwal, C. C., Prasad, V. V. V. (2001). A Tree Projection Algorithm for Generation of Frequent Itemsets. Journal of Parallel and Distributed Computing, 61(3), 350-371.
  2. The Digital Nervous System
  3. Hsu, Ching-Chi, Tseng, Fan-Chen, Chen, Wen-Chi (2003). The Innovative Pilot Project of Academic Cooperation, Institute of Information Industry.
  4. Han, J., Pei, J., Yin, Y. (2000). Mining Frequent Patterns without Candidate Generation. Proc. ACM SIGMOD.
  5. Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L. (1999). Discovering Frequent Closed Itemsets for Association Rules. Proc. 7th Int. Conf. Database Theory (ICDT'99), Jan., 398-416.
  6. Pei, J., Han, J., Mao, R. (2000). CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets. Proc. 2000 ACM SIGMOD Int. Workshop Data Mining and Knowledge Discovery (DMKD'00), May, 11-20.
  7. Tseng, Fan-Chen, Hsu, Ching-Chi (2001). Generating Frequent Patterns with the Frequent Pattern List. Proc. The Fifth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2001), Hong Kong.
  8. Tseng, Fan-Chen, Hsu, Ching-Chi, Chen, Henry (2001). Mining Frequent Closed Itemsets with the Frequent Pattern List. Proc. The 2001 IEEE International Conference on Data Mining (ICDM 2001), San Jose, California, USA.
  9. Zaki, M. J., Hsiao, C. (1999). Technical Report 99-10. Computer Science, Rensselaer Polytechnic Institute.
  10. Zheng, Z., Kohavi, R., Mason, L. (2001). Real World Performance of Association Rule Algorithms. Proc. The Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
Cited By
  1. Tseng, Fan-Chen, Fu, Guo-Sheng (2006). Efficiently Mining Maximal Frequent Itemsets by Item Grouping and 3-Dimensional Indexing. Electronic Commerce Studies, 4(1), 37-55.