Title

Efficiently Mining Maximal Frequent Itemsets by Item Grouping and 3-Dimensional Indexing

DOI

10.29767/ECS.200603.0003

Authors

Fan-Chen Tseng;Guo-Sheng Fu

Keywords

Data Mining; Maximum Frequent Itemset; Transaction Pattern List (TPL); Item Grouping; 3D-Indexing

Journal

Electronic Commerce Studies

Volume/Issue / Publication Date

Vol. 4, No. 1 (2006/03/31)

Pages

37 - 55

Language

English

English Abstract

The mining of frequent itemsets has wide applications in data mining, and many methods have been proposed for this problem. However, mining the complete set of frequent itemsets often leads to a huge solution space. Fortunately, this problem can be reduced to the mining of Frequent Closed Itemsets (FCIs), which results in a much smaller solution space. Nevertheless, in some applications the number of FCIs is still too large. In such cases, the alternative is to mine the Maximal Frequent Itemsets (MFIs). In this paper, we propose a compact data structure, the Transaction Pattern List (TPL), for representing the transaction database. The TPL enables efficient pruning of the search space. In addition, we develop an item grouping technique to shorten the search paths and speed up the mining process. For the superset checking performed before generating new MFIs, we exploit basic properties of itemsets to derive a three-dimensional index that quickly locates the set of relevant MFIs to be checked. Experimental results show that our method is more efficient than previous methods.
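
To make the size relationship among frequent itemsets, FCIs, and MFIs concrete, the following minimal Python sketch enumerates all three on a toy database by brute force. It is not the TPL-based method of the paper: the function names, the toy transactions, and the support threshold are assumptions chosen for illustration, and the superset check is the naive quadratic one rather than the three-dimensional indexing described in the abstract.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration: map each frequent itemset to its support count."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            support = sum(1 for t in transactions if cset <= t)
            if support >= min_support:
                result[cset] = support
    return result


def closed_itemsets(frequent):
    """Frequent itemsets that have no proper superset with the same support."""
    return {
        s for s, sup in frequent.items()
        if not any(s < other and sup == frequent[other] for other in frequent)
    }


def maximal_itemsets(frequent):
    """Frequent itemsets that have no frequent proper superset (naive check)."""
    return {s for s in frequent if not any(s < other for other in frequent)}


# Hypothetical toy database: five transactions over the items a, b, c, d.
transactions = [frozenset(t) for t in ("abc", "abc", "ab", "ad", "d")]
frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
maximal = maximal_itemsets(frequent)

print(len(frequent), len(closed), len(maximal))  # prints "8 4 2"
```

On this toy input the solution space shrinks from 8 frequent itemsets to 4 closed ones and 2 maximal ones ({d} and {a, b, c}). The maximal set is the most compact of the three, but unlike the closed set it does not retain the support counts of the subsets it covers.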

Subject Classification

Basic and Applied Sciences > Information Science
Social Sciences > Economics
References
  1. Tseng, Fan-Chen,Hsu, Ching-Chi,Fu, Kuo-Sheng(2005).Efficiently Mining Frequent Closed Itemsets by Eliminating Data Redundancies.Electronic Commerce Studies,3(1),39-56.
  2. Agarwal, R. C.,Aggarwal, C. C.,Prasad, V. V. V.(2000).Depth First Generation of Long Patterns.Proc. ACM SIGMOD
  3. Agrawal, R.,Srikant, R.(1994).Fast Algorithms for Mining Association Rules in Large Databases.Proc. 20th VLDB
  4. Burdick, D.(2001).MAFIA: a maximal frequent itemsets algorithm for transactional databases.Proc. ICDE.
  5. Gouda, Karam,Zaki, M. J.(2001).Efficiently mining maximal frequent itemsets.Proc. The 2001 IEEE ICDM
  6. Han, J.,Pei, J.,Yin, Y.(2000).Mining frequent patterns without candidate generation.Proc. ACM SIGMOD.
  7. Zaki, M. J.,Gouda, Karam(2001).Technical Report,CS Dept. Rensselaer Polytechnic Institute.
  8. Pei, J.,Han, J.,Mao, R.(2000).CLOSET: An efficient algorithm for mining frequent closed itemsets.Proc. 2000 ACM SIGMOD Int. Workshop Data Mining and Knowledge Discovery.
  9. Bayardo, R. J.(1998).Efficiently mining long patterns from databases.Proc. ACM SIGMOD.
  10. Shenoy, P.(2000).Turbo-charging vertical mining of large databases.Proc. ACM SIGMOD.
  11. Tseng, Fan-Chen(2005).Hierarchical Partitioning for Data Mining in Large Databases.The 2nd AIS SIG-ISAP Annual International Conference on IS/IT in Asia-Pacific,Las Vegas, USA.
  12. Tseng, Fan-Chen(2004).Dynamic Data Structure Switching Techniques for Data Mining.The 10th Conference on Information Management and Implementation,Taichung, Taiwan.
  13. Tseng, Fan-Chen,Hsu, Ching-Chi(2001).Lecture Notes in Artificial Intelligence, LNAI.Springer-Verlag.
  14. Tseng, Fan-Chen,Hsu, Ching-Chi,Fu, Kuo-Sheng(2005).The Frequent Pattern List: Another Framework for Mining Frequent Patterns.International Journal of Electronic Business Management (IJEBM),3(2),104-115.
  15. Wang, J.(2003).CLOSET+: Searching for the Best Strategies for Mining Frequent Closed Itemsets.Proc. ACM SIGKDD.
  16. Zaki, M. J.,Hsiao, C.(1999).Technical Report,Computer Science, Rensselaer Polytechnic Institute.