Title |
A Nonparametric Multi-Seed Data Clustering Technique |
Parallel title |
非參數式資料群集法 |
DOI |
10.29977/JCIIE.200801.0001 |
Authors |
Tseng-Pin Lee (李增坪); Victor B. Kreng (耿伯文) |
Keywords |
clustering (群集) ; minimal spanning tree (最小展開樹) ; genetic algorithms (基因演算法) |
Journal |
工業工程學刊 (Journal of the Chinese Institute of Industrial Engineers) |
Volume and issue / Publication date |
Vol. 25, No. 1 (2008/01/01) |
Pages |
1 - 10 |
Language |
English |
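Chinese abstract (translated) |
A single cluster center cannot handle elongated data distributions. When the data form a complex shape, they must therefore be split into several small clusters that are then merged into one, which requires the centers of several small clusters as the initial reference points of the final single cluster. This study proposes a nonparametric data clustering method that handles complex-shaped data distributions through splitting and merging procedures. In the splitting procedure, a genetic algorithm partitions the data into several small clusters and finds the most suitable cluster centers. A novel decision algorithm developed in this study, based on a minimal spanning tree and statistical methods, then judges whether any neighboring small clusters should be merged into a single cluster. Finally, the effectiveness of the clustering method is verified on several data distributions and on real data. |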
English abstract |
Clustering data around a single seed does not work well when the cluster is elongated or non-convex; a complex-shaped cluster requires several seeds. This study develops a nonparametric multi-seed data clustering approach that uses splitting and merging procedures to handle complex cluster shapes. The splitting process uses a genetic algorithm to search for appropriate cluster centers, which partition the data into a chosen number of groups. To assign several seeds to one cluster, a novel merging process based on a minimal spanning tree and statistical concepts judges whether a pair of clusters should be merged or kept separate. Experimental results illustrate the difficulties of the one-seed-per-cluster approach and the effectiveness of the proposed clustering scheme. |
Subject classification |
Engineering > General Engineering |