Title

在資料串流的環境下探勘序列型樣

Parallel Title

Mining Sequential Patterns in a Data Stream

DOI

10.6188/JEB.202304_25(1).0002

Authors

顏秀珍(Show-Jane Yen);李御璽(Yue-Shi Lee)

Keywords

資料探勘 ; 序列型樣 ; 資料串流 ; 交易資料庫 ; Data mining ; sequential pattern ; data stream ; transaction database

Journal

電子商務學報

Volume/Issue and Publication Date

Vol. 25, No. 1 (2023/04/30)

Pages

37 - 62

Language

Traditional Chinese; English

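Chinese Abstract (translated)

Mining sequential patterns means discovering, from transaction data, the purchasing behaviors that most customers follow in chronological order. Based on a customer's current purchasing behavior, we can predict the products the customer is likely to buy next. However, customers keep making transactions, so transaction data grows over time and old transactions must be deleted. How to update the existing sequential patterns efficiently and in real time is an important research issue: when the data changes rapidly, patterns that are not updated in time may no longer represent customers' current purchasing behavior. This paper therefore proposes a method for efficiently updating the existing sequential patterns as transactions are continually added and removed. Our method does not need to rescan the original transaction data; it only processes the added and deleted transactions to find the up-to-date sequential patterns. Experimental results also show that our method is more efficient than other methods.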

English Abstract

Mining sequential patterns finds the sequential purchasing behaviors shared by most customers in a transaction database. Based on the products a customer currently purchases, we can predict the products the customer may purchase next. Because transactions continuously accumulate over time and old transactions must be deleted, efficiently updating the original sequential patterns in a data stream environment is an important research topic: if the data changes quickly but the patterns are not updated immediately, the discovered information may no longer represent current consumer behavior. In this paper, we therefore propose algorithms for efficiently mining sequential patterns in a data stream. When transactions are added or removed, our algorithms process only the inserted or deleted transactions, without rescanning the original database. Experimental results show that our algorithms outperform previous approaches.
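
The idea of updating pattern counts from only the inserted and deleted data can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not describe in detail. It is a simplified stand-in that assumes each stream element is one customer sequence of single items, keeps a fixed-size sliding window, and tracks only patterns up to a small maximum length. The class name `SlidingWindowPatternCounter` and its parameters are hypothetical.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowPatternCounter:
    """Illustrative only: maintains support counts of short sequential
    patterns over a sliding window without rescanning the window."""

    def __init__(self, window_size, max_len=2):
        self.window = deque()          # sequences currently in the window
        self.window_size = window_size
        self.max_len = max_len         # longest pattern tracked (assumption)
        self.support = Counter()       # pattern -> number of supporting sequences

    def _subsequences(self, sequence):
        # Distinct order-preserving subsequences of length 1..max_len.
        # A set ensures each sequence supports a pattern at most once.
        found = set()
        for k in range(1, self.max_len + 1):
            found.update(combinations(sequence, k))
        return found

    def add(self, sequence):
        # Insertion touches only the patterns of the new sequence.
        sequence = tuple(sequence)
        self.window.append(sequence)
        self.support.update(self._subsequences(sequence))
        if len(self.window) > self.window_size:
            self._remove_oldest()

    def _remove_oldest(self):
        # Deletion touches only the patterns of the expired sequence.
        old = self.window.popleft()
        patterns = self._subsequences(old)
        self.support.subtract(patterns)
        for pattern in patterns:
            if self.support[pattern] <= 0:
                del self.support[pattern]

    def frequent(self, min_support):
        # Patterns supported by at least min_support sequences in the window.
        return {p: c for p, c in self.support.items() if c >= min_support}


counter = SlidingWindowPatternCounter(window_size=3)
for seq in (["a", "b", "c"], ["a", "c"], ["b", "a", "c"], ["c", "a"]):
    counter.add(seq)
print(counter.frequent(min_support=2))
```

Enumerating every subsequence is exponential in the pattern length, so this sketch is only practical for very short patterns. It is meant to show the insert/delete bookkeeping, not an efficient mining strategy.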

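Subject Classification

Humanities > General Humanities
Basic and Applied Sciences > Information Science
Basic and Applied Sciences > Statistics
Social Sciences > General Social Sciences
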
References

  1. Agrawal, R., Srikant, R. (1994). Fast algorithms for mining association rules in large database. Proceedings of the 20th International Conference on Very Large Databases, Santiago, Chile.
  2. Agrawal, R., Srikant, R. (1995). Mining sequential patterns. Proceedings of the Eleventh International Conference on Data Engineering, Taipei, Taiwan.
  3. Cheng, H., Yan, X., Han, J. (2004). IncSpan: Incremental mining of sequential patterns in large database. Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, Washington, USA.
  4. Han, J., Pei, J., Yin, Y. (2000). Mining frequent patterns without candidate generation. ACM Sigmod Record, 29(2), 1-12.
  5. Han, J., Pei, J., Yin, Y., Mao, R. (2004). Mining frequent patterns without candidate generation: A frequent-pattern tree approach. Data Mining and Knowledge Discovery, 8(1), 53-87.
  6. Ho, C. C., Li, H. F., Kuo, F. F., Lee, S. Y. (2006). Incremental mining of sequential patterns over a stream sliding window. Proceedings of the Sixth IEEE International Conference on Data Mining-Workshops, Hong Kong, China.
  7. Hong, T. P., Lin, C. W., Wu, Y. L. (2008). Incrementally fast updated frequent pattern trees. Expert Systems with Applications, 34(4), 2424-2435.
  8. Li, H. F., Ho, C. C., Chen, H. S., Lee, S. Y. (2012). A single-scan algorithm for mining sequential patterns from data streams. International Journal of Innovative Computing, Information and Control, 8(3), 1799-1820.
  9. Lin, C. W., Hong, T. P., Lin, W. Y., Lan, G. C. (2014). Efficient updating of sequential patterns with transaction insertion. Intelligent Data Analysis, 18(6), 1013-1026.
  10. Pei, J., Han, J., Mortazavi-Asl, B., Pinto, H., Chen, Q., Dayal, U., Hsu, M. C. (2001). Prefixspan: Mining sequential patterns efficiently by prefix-projected pattern growth. Proceedings of the 17th International Conference on Data Engineering, Heidelberg, Germany.
  11. Pei, J., Han, J., Mortazavi-Asl, B., Wang, J., Pinto, H., Chen, Q., Dayal, U., Hsu, M. C. (2004). Mining sequential patterns by Pattern-Growth: The PrefixSpan approach. IEEE Transactions on Knowledge and Data Engineering, 16(11), 1424-1440.
  12. SPMF. (2022). An open-source data mining library. Retrieved November 25, 2022, from http://www.philippe-fournier-viger.com/spmf/index.php?link=datasets.php
  13. Xu, Y., Yu, J. X., Liu, G., Lu, H. (2002). From path tree to frequent patterns: A framework for mining frequent patterns. Proceedings of the International Conference on Data Mining, Maebashi City, Japan.
  14. Yen, S. J., Chen, A. L. P. (2001). A graph-based approach for discovering various types of association rules. IEEE Transactions on Knowledge and Data Engineering, 13(5), 839-845.
  15. Yen, S. J., Wang, C. K., Ouyang, L. Y. (2012). A search space reduced algorithm for mining frequent patterns. Journal of Information Science and Engineering, 28(1), 177-191.
  16. Zaki, M. J. (2001). SPADE: An efficient algorithm for mining frequent sequences. Machine Learning Journal, 42(1), 31-60.