题名

結合動態分群人造蜂群演算法之自動化分群系統

并列篇名

An Automatic Clustering System with Dynamic Clustering Artificial Bee Colony Algorithm

作者

邱哲夫(Che-Fu Chiu);王惠嘉(Hei Chia Wang)

关键词

自動化分群 ; 元啟發式演算法 ; 人工蜂群演算法 ; 模範策略 ; automatic clustering ; meta-heuristic clustering ; artificial bee colony algorithm ; model strategy

期刊名称

資訊管理學報

卷期/出版年月

26卷1期(2019 / 01 / 31)

页次

1 - 24

内容语文

繁體中文

中文摘要

分群是一種資料探勘技術,它是一種非監督式的學習方法,透過相似度計算,將資料分成不同的群。在分群演算法中,啟發式分群在近年來漸漸受到重視,它指的是運用啟發式演算法或啟發式的概念解決分群問題。相較於目前的一些其方分群方法(如:k-means),啟發式分群似有較好的表現。一般的分群演算法在實作時,通常需要使用者給予額外的資訊(例如:群數),這些資訊有時候使用者並不容易下決定,因此,若是能讓分群演算法自動決定群數進行分群,對於使用者來說將會更加的便利。鑑於啟發式分群在分群問題的成功,開始有學者嘗試設計可以自動化分群的啟發式分群演算法,像是衍生自基因演算法(genetic algorithm; GA)的GCUK(genetic clustering for unknown K),以及衍生自粒子群最佳化演算法(particle swarm optimization; PSO)的MEPSO(Multi-Elitist PSO)。上述這兩種方法雖然可以成功的自動化分群,但卻有著效率不佳的問題,其原因有二:(1)編碼格式設計不佳導致演算法需搜尋之解空間過大,(2)選用之啟發式演算法不一定適合分群問題或其能力不足。因此,本研究提出一個可以自動化決定群數的自動化分群系統(automatic clustering system; ACS),此系統先使用群數搜尋演算法(cluster range discovery algorithm; CRD)縮減欲搜尋的群數區間,再使用動態分群人造蜂群演算法(dynamic clustering artificial bee colony algorithm; DCABC)進行自動化分群,DCABC加入了模範策略(model strategy)以克服既有人造蜂群演算法(artificial bee colony algorithm; ABC)的缺點,並透過特別設計的編碼格式,使其可以在分群的時候同時達成決定群數和優化分群品質的功能。

英文摘要

Purpose-This study designs a system that automatically determines the number of groups of clustering. This study refers to the research of past heuristic algorithms and heuristic grouping, and improves the weakness of ABC algorithm, and proposes an exemplary strategy to improve the search performance of the algorithm. Design/methodology/approach - This study designs an Automatic Clustering System (ACS). ACS uses a Cluster Range Discovery (CRD) algorithm to reduce the search range of cluster number. After that, the ACS uses Dynamic Clustering Artificial Bee Colony algorithm (DCABC) to complete the automatic clustering. DCABC adopts the Model Strategy to overcome the drawback of original artificial bee colony algorithm (ABC). DCABC also designs a brand-new encoding format. Combining this encoding format, DCABC can cluster the data and find the number of clusters simultaneously. Findings-With the success of meta-heuristic clustering, some researchers tend to design an automatic clustering algorithm with meta-heuristic method. The experiment results show the proposed DCABC can find the suitable cluster number and can have better performance than ABC. Research limitations/implications-Although the algorithm proposed in this study can automatically determine the appropriate number of groups, at the time of initialization, the user must specify the group number interval. Practical implications-This study proposes an Automatic Clustering System by using a Cluster Range Discovery algorithm to reduce the search range of cluster number. The ACS uses Dynamic Clustering Artificial Bee Colony algorithm to complete the automatic clustering. DCABC adopts the Model Strategy to overcome the drawback of original artificial bee colony algorithm. Originality/value-Unlike the similar algorithms which needs to assign number of cluster manually. This study proposes an Automatic Clustering System (ACS), which can automatically determine the number of clusters. Combining the designed encoding format, DCABC can cluster the data and find the number of clusters simultaneously.

主题分类 基礎與應用科學 > 資訊科學
社會科學 > 管理學
参考文献
  1. Akay, B.,Karaboga, D.(2012).A modified Artificial Bee Colony algorithm for real-parameter optimization.Information Sciences,192(1),120-142.
  2. Arbelaitz, O.,Gurrutxaga, I.,Muguerza, J.,Pérez, J.,Perona, I.(2013).An extensive comparative study of cluster validity indices.Pattern Recognition,46(1),243-256.
  3. Bandyopadhyay, S.,Maulik, U.(2002).Genetic clustering for automatic evolution of clusters and application to image classification.Pattern Recognition,35(6),1197-1208.
  4. Chatterjee, S.,Carrera, C.,Lynch, L.A.(1996).Genetic algorithms and traveling salesman problems.European Journal of Operational Research,93(3),490-510.
  5. Chou, C.-H.,Su, M.-C.,Lai, E.(2004).A new cluster validity measure and its application to image compression.Pattern Analysis and Applications,7(2),205-220.
  6. Cowgill, M.C.,Harvey, R.J.,Watson, L.T.(1999).A genetic algorithm approach to cluster analysis.Computers and Mathematics with Applications,37(7),99-108.
  7. Das, S.,Abraham, A.,Konar, A.(2008).Automatic kernel clustering with a Multi-Elitist Particle Swarm Optimization Algorithm.Pattern Recognition Letters,29(5),688-699.
  8. Das, S.,Abraham, A.,Konar, A.(2009).Metaheuristic Clustering.Springer.
  9. Davies, D.L.,Bouldin, D.W.(1979).A Cluster Separation Measure.IEEE Transactions on Pattern Analysis and Machine Intelligence,PAMI–1(2),224-227.
  10. Glover, F.(1986).Future paths for integer programming and links to artificial intelligence.Computers & Operations Research,13(5),533-549.
  11. Handl, J.,Knowles, J.(2007).An Evolutionary Approach to Multiobjective Clustering.IEEE Transactions on Evolutionary Computation,11(1),56-76.
  12. Hasan, M.A.,Chaoji, V.,Salem, S.,Zaki, M.J.(2009).Robust partitional clustering by outlier and density insensitive seeding.Pattern Recognition Letters,30(11),994-1002.
  13. Holland, J.H.(1975).Adaptation in Natural and Artificial Systems.University of Michigan Press.
  14. Hubert, L.,Arabie, P.(1985).Comparing partitions.Journal of Classification,2(1),193-218.
  15. Ji, J.,Pang, W.,Zhou, C.,Han, X.,Wang, Z.(2012).A fuzzy k-prototype clustering algorithm for mixed numeric and categorical data.Knowledge-Based Systems,30,129-135.
  16. Karaboga, D.,Akay, B.(2009).A comparative study of Artificial Bee Colony algorithm.Applied Mathematics and Computation,214(1),108-132.
  17. Karaboga, D.,Ozturk, C.(2011).A novel clustering approach: Artificial Bee Colony (ABC) algorithm.Applied Soft Computing,11(1),652-657.
  18. Laszlo, M.,Mukherjee, S.(2007).A genetic algorithm that exchanges neighboring centers for k-means clustering.Pattern Recognition Letters,28(16),2359-2366.
  19. Liang, Y.,Wan, Z.,Fang, D.(2017).An improved artificial bee colony algorithm for solving constrained optimization problems.International Journal of Machine Learning and Cybernetics,8(3),739-754.
  20. Liao, S.-H.,Chu, P.-H.,Hsiao, P.-Y.(2012).Data mining techniques and applications – A decade review from 2000 to 2011.Expert Systems with Applications,39(12),11303-11311.
  21. Likas, A.,Vlassis, N.,Verbeek, J.J.(2003).The global k-means clustering algorithm.Pattern Recognition,36(2),451-461.
  22. Mahajan, M.,Nimbhorkar, P.,Varadarajan, K.(2012).The planar k-means problem is NP-hard.Theoretical Computer Science,442(13),13-21.
  23. Mahdavi, M.,Chehreghani, M.H.,Abolhassani, H.,Forsati, R.(2008).Novel meta-heuristic algorithms for clustering web documents.Applied Mathematics and Computation,201(1-2),441-451.
  24. Oftadeh, R.,Mahjoob, M.J.,Shariatpanahi, M.(2010).A novel meta-heuristic optimization algorithm inspired by group hunting of animals: Hunting search.Computers and Mathematics with Applications,60(7),2087-2098.
  25. Rousseeuw, Peter J.(1987).Silhouettes: A graphical aid to the interpretation and validation of cluster analysis.Journal of Computational and Applied Mathematics,20,53-65.
  26. Sabau, A.S.(2012).Survey of clustering based financial fraud detection research.Informatica Economica,16(1),110-122.
  27. Senthilnath, J.,Omar, S.N.,Mani, V.(2011).Clustering using firefly algorithm: Performance study.Swarm and Evolutionary Computation,1(3),164-171.
  28. Song, X.,Yan, Q.,Zhao, M.(2017).An adaptive artificial bee colony algorithm based on objective function value information.Applied Soft Computing,55,384-401.
  29. Szeto, W.Y.,Wu, Y.,Ho, S.C.(2011).An artificial bee colony algorithm for the capacitated vehicle routing problem.European Journal of Operational Research,215(1),126-135.
  30. Tan, P.N.,Steinbach, M.,Kumar, V.(2006).Introduction to Data Mining.Boston, MA, USA.:Addison Wesley.
  31. Wei, J.,Lee, M.,Chen, H.,Wu, H.(2013).Customer relationship management in the hairdressing industry: an application of data mining techniques.Expert Systems with Applications,40(18),7513-7518.
  32. Yan, Y.,Zhang, Y,Gao, F.(2012).Dynamic artificial bee colony algorithm for multi-parameters optimization of support vector machine-based soft-margin classifier.EURASIP Journal on Advances in Signal Processing,2012,160.