English Abstract
|
In data mining applications, association rules can be used to support decision making. However, association rule algorithms usually yield a large number of rules, many of which contain redundant or irrelevant information or describe trivial knowledge. In this paper we present a four-stage data mining process for finding relevant fuzzy association rules in a medical database. Fuzzy association rules are especially suitable for medical data mining because they are simple, linguistically interpretable rules that avoid the drawbacks of symbolic or crisp association rules. In the first stage, a cluster partitioning technique is used to automatically transform quantitative values into fuzzy linguistic terms. The linguistic terms are modeled by fuzzy sets defined over the appropriate attribute domains. Next, a Kohonen self-organizing map (SOM) is used to identify clusters of records that share feature attribute values. The resulting clusters are then characterized by feature attributes determined using the Apriori association rule algorithm. Because the association rule algorithm tends to generate large numbers of rules, we present interactive strategies that prune redundant association rules on the basis of a fuzzy resemblance relation, which improves the readability of the rule set, and we evaluate the truth degree of the discovered fuzzy association rules with a truth evaluation mechanism. Finally, we demonstrate our approach on a real-world medical database of disease records.
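To make the idea of a fuzzy association rule concrete, the following is a minimal sketch, not the paper's implementation. It assumes triangular membership functions, the minimum operator for combining terms, and a sigma-count (mean membership) definition of support; the abstract specifies none of these. The attribute names, fuzzy set parameters, and patient records are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# (attribute name, (left foot, peak, right foot)) of a triangular fuzzy set
Term = Tuple[str, Tuple[float, float, float]]


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Membership of x in a triangular fuzzy set with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzy_support(records: Sequence[Dict[str, float]], terms: List[Term]) -> float:
    """Mean over all records of the minimum membership across the given terms."""
    total = 0.0
    for rec in records:
        total += min(triangular(rec[attr], *params) for attr, params in terms)
    return total / len(records)


def fuzzy_confidence(records: Sequence[Dict[str, float]],
                     antecedent: List[Term], consequent: List[Term]) -> float:
    """Support of antecedent and consequent together, divided by antecedent support."""
    return (fuzzy_support(records, antecedent + consequent)
            / fuzzy_support(records, antecedent))


# Hypothetical linguistic terms and records, for illustration only.
AGE_OLD: Term = ("age", (50.0, 70.0, 90.0))
BP_HIGH: Term = ("bp", (120.0, 160.0, 200.0))

patients = [
    {"age": 70.0, "bp": 160.0},
    {"age": 60.0, "bp": 140.0},
    {"age": 50.0, "bp": 120.0},
    {"age": 80.0, "bp": 120.0},
]

support = fuzzy_support(patients, [AGE_OLD, BP_HIGH])
confidence = fuzzy_confidence(patients, [AGE_OLD], [BP_HIGH])
print(f"support={support:.3f} confidence={confidence:.3f}")
```

Under these assumptions, the rule "IF age is old THEN blood pressure is high" receives partial credit from records that only partly match each linguistic term, which a crisp threshold would either count fully or discard.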
|