Airiti Library Academic Literature Database

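Title	Evidence from an IC Packaging Foundry by Using a Two-Phase Clustering Methodology
Parallel title	Applying a Two-Phase Clustering Method to an IC Packaging Foundry (應用二階段分群方法於IC封裝廠)
DOI	10.29977/JCIIE.200807.0003
Authors	Hsu-Hao Yang (楊旭豪); Tzu-Chiang Liu (劉自強); Hsu-Dong Su (蘇旭東)
Keywords	clustering; self-organizing maps; minimum spanning tree; IC packaging
Journal	Journal of the Chinese Institute of Industrial Engineers (工業工程學刊)
Volume/issue, publication date	Vol. 25, No. 4 (2008/07/01)
Pages	287-297
Language of content	English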
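Chinese abstract	Clustering groups objects so that objects within the same cluster are as homogeneous as possible while objects in different clusters are as distinct as possible. This study applies a two-phase clustering method. The first phase is self-organizing maps (SOM); the second phase comprises the k-means algorithm and a minimum spanning tree-based clustering method. The minimum spanning tree-based method is computationally efficient and is less affected by the distribution of the data. Because the values in the real-world data used in this study differ greatly in magnitude, two data transformations are considered: min-max normalization and z-score normalization. The comparison criteria are the Davies-Bouldin (DB) value and the Wilks' lambda value. Based on tests using wire bonding machine data from an IC packaging foundry in Taiwan, we find that, when the DB value and Wilks' lambda value are considered jointly, applying the k-means algorithm in the second phase to min-max-normalized data performs better. Although the minimum spanning tree-based method is not superior to the k-means algorithm, we find that it is slightly better than k-means at detecting outliers, especially after the data have been normalized.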
English abstract	Clustering groups objects so that they are as homogeneous as possible within the same cluster and as distinct as possible across different clusters. This paper uses a two-phase clustering methodology that integrates the self-organizing maps (SOM) algorithm in the first phase with the k-means algorithm and minimum spanning tree-based (MST-based) clustering in the second phase. MST-based clustering is used because tree-type problems can be solved efficiently and because it tends to be less sensitive to the geometric shape of the data. Two data transformations, min-max normalization and z-score normalization, are employed to handle real-life data whose magnitudes differ sharply. We compare clustering results in terms of the Davies-Bouldin (DB) value and the Wilks' lambda value. Using data from wire bonding machines at a Taiwanese IC packaging foundry, we find that applying the k-means algorithm in the second phase to min-max-normalized data performs better when the DB value and the Wilks' lambda value are considered jointly. Although MST-based clustering in the second phase does not outperform the k-means algorithm, it prevails over k-means in detecting outliers, especially when normalized data are used.
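The building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it omits the SOM first phase, the k-means alternative, and Wilks' lambda, and it assumes the simplest MST-based rule (build the tree with Prim's algorithm, then cut the k-1 longest edges). All function names and the sample data are hypothetical, chosen only to mimic features on very different scales.

```python
import math

def min_max(column):
    """Rescale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(x - lo) / span if span else 0.0 for x in column]

def z_score(column):
    """Centre on the mean and divide by the population standard deviation."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
    return [(x - mean) / std if std else 0.0 for x in column]

def normalize(rows, scaler):
    """Apply a column-wise scaler to a list of equal-length rows."""
    columns = [scaler(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]

def mst_clusters(points, k):
    """Assumed rule: Prim's MST, drop the k-1 longest edges, label components.
    Requires 1 <= k <= len(points)."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known edge from the tree to each point
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    edges.sort()
    kept = edges[: n - k]      # the tree has n-1 edges; keep all but the k-1 longest
    root = list(range(n))      # union-find over the kept edges

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    for _, a, b in kept:
        root[find(a)] = find(b)
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]

def davies_bouldin(points, labels):
    """Davies-Bouldin value (lower is better); needs at least two clusters
    with distinct centroids."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    cents = {lab: [sum(c) / len(g) for c in zip(*g)] for lab, g in groups.items()}
    scatter = {lab: sum(math.dist(p, cents[lab]) for p in g) / len(g)
               for lab, g in groups.items()}
    total = 0.0
    for i in groups:
        total += max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                     for j in groups if j != i)
    return total / len(groups)

# Hypothetical two-feature data whose columns differ sharply in magnitude.
points = [[1.0, 100.0], [2.0, 110.0], [1.5, 105.0],
          [9.0, 900.0], [10.0, 910.0], [9.5, 905.0]]
scaled = normalize(points, min_max)
labels = mst_clusters(scaled, k=2)
db = davies_bouldin(scaled, labels)
print(labels, round(db, 3))
```

Passing `z_score` instead of `min_max` to `normalize` gives the second transformation mentioned in the abstract, so the two can be compared on the same DB value.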
Subject classification	Engineering > General Engineering
References
Ahmad, K., B. L. Vrusias, A. Ledford (2001). Choosing feature sets for training and testing self-organizing maps: a case study. Neural Computing & Applications, 10, 56-66.
Balakrishnan, P. V., M. C. Cooper, V. S. Jacob, P. A. Lewis (1996). Comparative performance of the FSCL neural net and k-means algorithm for market segmentation. European Journal of Operational Research, 93, 346-357.
Canetta, L., N. Cheikhrouhou, R. Glardon (2005). Applying two-stage SOM-based clustering approaches to industrial data analysis. Production Planning & Control, 16, 774-784.
Davies, D. L., D. W. Bouldin (1979). A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1, 224-227.
Forina, M., C. C. Oliveros, C. Casolino, M. Casale (2004). Minimum spanning trees: ordering edges to identify clustering structure. Analytica Chimica Acta, 515, 43-53.
Grabmeier, J., A. Rudolph (2002). Techniques of cluster algorithms in data mining. Data Mining and Knowledge Discovery, 6, 303-360.
Guha, S., R. Rastogi, K. Shim (2001). CURE: an efficient clustering algorithm for large databases. Information Systems, 26, 35-58.
Guha, S., R. Rastogi, K. Shim (2000). ROCK: a robust clustering algorithm for categorical attributes. Information Systems, 25, 345-366.
Jain, A. K., M. N. Murty, P. J. Flynn (1999). Data clustering: a review. ACM Computing Surveys, 31, 264-323.
Jain, A. K., R. C. Dubes (1988). Algorithms for Clustering Data. Upper Saddle River, NJ: Prentice Hall.
Jiang, M. F., S. S. Tseng, C. M. Su (2001). Two-phase clustering process for outliers detection. Pattern Recognition Letters, 22, 691-700.
Karypis, G., E. H. Han, V. Kumar (1999). CHAMELEON: a hierarchical clustering algorithm using dynamic modeling. IEEE Computer, 32, 68-75.
Kaufman, L., P. J. Rousseeuw (1990). Finding Groups in Data: an Introduction to Cluster Analysis. New York, NY: John Wiley & Sons.
Kohonen, T. (1985). The self-organization map. Proceedings of IEEE, 73, 1551-1558.
Kohonen, T. (1995). Self-Organizing Maps. Berlin, Germany: Springer-Verlag.
Kuo, R. J., L. M. Ho, C. M. Hu (2002). Integration of self-organizing feature map and k-means algorithm for market segmentation. Computers & Operations Research, 29, 1475-1493.
Laszlo, M., S. Mukherjee (2005). Minimum spanning tree partitioning algorithm for microaggregation. IEEE Transactions on Knowledge and Data Engineering, 17, 902-911.
Luo, F., L. Khan, F. B. Bastani, I. L. Yen, J. Zhou (2004). A dynamically growing self-organizing tree (DGSOT) for hierarchical clustering gene expression profiles. Bioinformatics, 20, 2605-2617.
MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA.
Ng, R., J. Han (1994). Efficient and effective clustering method for spatial data mining. Proceedings of International Conference on Very Large Data Base, Santiago, Chile.
Pölzlbauer, G., M. Dittenbach, A. Rauber (2006). Advanced visualization of self-organizing maps with vector fields. Neural Networks, 19, 911-922.
Vesanto, J., E. Alhoniemi (2000). Clustering of the self-organizing map. IEEE Transactions on Neural Networks, 11, 586-600.
Xu, Y., V. Olman, D. Xu (2002). Clustering gene expression data using a graph-theoretic approach: an application of minimum spanning tree. Bioinformatics, 18, 536-545.
Xu, Y., V. Olman, D. Xu (2001). Minimum spanning tree for gene expression data clustering. Genome Informatics, 12, 24-33.
Zahn, C. T. (1971). Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Transactions on Computers, 20, 68-86.
Zhang, T., R. Ramakrishnan, M. Livny (1996). BIRCH: an efficient data clustering method for very large databases. Proceedings of International Conference on Management of Data, Montreal, Canada.