Title

資料間隱含關係的挖掘與展望

Parallel title

An Overview on Mining Implicit Data Relation

DOI

10.6382/JIM.200202.0075

Authors

沈清正 (Ching-Cheng Shen); 陳仕昇 (Shih-Sheng Chen); 高鴻斌 (Hong-Bin Gao); 張元哲 (Yuan-Zhe Chang); 陳家仁 (Jia-Ren Chen); 黃琮盛 (Cong-Sheng Huang); 陳彥良 (Yen-Liang Chen)

Keywords

Data mining (資料挖掘); knowledge (知識); implicit data relation (資料間隱含關係)

Journal

資訊管理學報 (Journal of Information Management)

Volume and issue / Publication date

Vol. 9, Issue S (2002/02/01)

Pages

75 - 99

Language

Traditional Chinese

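Chinese abstract (English translation)

Data mining refers to extracting valuable knowledge from large amounts of data, that is, the act of converting data into knowledge. The data may be of various types, such as ordinary transaction data and multimedia data, while knowledge is the concrete expression and presentation of the implicit relations among data. Because data mining can help enterprises obtain knowledge from data and create competitive advantage, it has attracted wide attention and spurred the development of many new research methods and systems, becoming a fast-growing field. For existing data mining methods and systems, this paper proposes nine categories distinguished by the kind of "implicit data relation" involved: association, sequence, structure, periodicity, similarity, interestingness, personalization, suitability, and generalization. For each relation, we introduce its definition, applications, current state of research, and research outlook. Besides helping readers understand the current state of the data mining field, this paper offers a useful classification of data mining and presents a comparative study of data mining.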

English abstract

Data mining is the extraction of useful knowledge from a huge amount of data. The data can be of a variety of types, such as transaction data, relational data and multimedia data, whereas knowledge is an explicit expression and representation of implicit data relations. Because data mining can help businesses obtain knowledge and create competitive advantage, it is not surprising that a great deal of research has been done in this field. Because of its fast-growing development and abundant results, it is difficult to provide a complete survey that covers all the issues in a single paper. Therefore, this paper provides only a reasonably comprehensive report on the recent development of data mining technology. For present data mining methods and systems, this paper suggests 9 distinct categories according to their implicit data relations. These relations are association, sequence, structure, periodicity, similarity, interestingness, personalization, suitability and generalization. For each of them, we discuss its definition, applications, algorithms and future research directions. The contributions of this paper are that (1) a classification based on implicit data relations is proposed, (2) a comparative study of these categories is carried out, and (3) the state of the art for each category is described.

Subject classification Basic and Applied Sciences > Information Science
Social Sciences > Management
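Cited by
  1. 陳家仁, 陳禹辰, 陳彥良 (2003). Mining association rules with few items or short transaction lengths. 資訊管理學報 (Journal of Information Management), 9(2), 55-72.
  2. 謝尚文, 林啟豐 (2009). Applying data mining to building a credit risk model for small and medium-sized service enterprises. 數據分析 (Data Analysis), 4(5), 55-82.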