Title |
A Study on Applying Artificial Neural Networks to Plant Document Clustering (應用類神經網路於植物文件分群之研究) |
Parallel Title |
Chinese Plant Documents Clustering Using Artificial Neural Network |
DOI |
10.29495/CITE.200612.0406 |
Authors |
曾守正;羅永和 |
Keywords |
Self-Organizing Map ; Document Warehouse ; Text Mining ; Vector Space Model ; Artificial Neural Network ; Information Retrieval |
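Journal Title |
Proceedings of the Conference on Technology Education Curriculum Reform and Development (科技教育課程改革與發展學術研討會論文集) |
Volume/Issue / Publication Date |
2005 issue (2006/12/01) |
Pages |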
406 - 413 |
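Language |
Traditional Chinese |
Chinese Abstract (translated) |
The twenty-first century is an age of information explosion. The popularity of the Internet has accelerated the accumulation and dissemination of information, making information search increasingly difficult. The problem knowledge workers now face is no longer a scarcity of information but an information overflow. This study uses the SOM (Self-Organizing Map) neural network to cluster 892 plant-related documents and projects the clustering result onto a two-dimensional matrix. The similarity between documents can be derived from the vectors of the nodes, helping users quickly find information that meets their needs. The resulting model is also intended to serve as a basis for Chinese document management in other specialized academic fields of classification, such as insects and animals. |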
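The abstract does not state which similarity measure was used on the vectors. As a minimal sketch, assuming cosine similarity over term-count vectors (a common choice in the vector space model), the comparison of two documents might look like the following. The function name `cosine_similarity` and the toy term lists are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors (dict-like)."""
    dot = sum(count * b.get(term, 0) for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical, already-segmented documents (illustrative terms only).
doc_a = Counter(["leaf", "flower", "root", "leaf"])
doc_b = Counter(["leaf", "flower", "seed"])
doc_c = Counter(["wing", "antenna"])

sim_ab = cosine_similarity(doc_a, doc_b)  # shares "leaf" and "flower"
sim_ac = cosine_similarity(doc_a, doc_c)  # shares no terms
```

Documents that share vocabulary score closer to 1, and documents with no terms in common score 0.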
English Abstract |
The 21st century is an age of information explosion. The continuous growth in the size and use of the Internet is making it harder to search for information. The problem users now face is not a lack of information but an excess of it, so there is a need for automatic procedures that let users retrieve information from abundant sources. The category map, based on Kohonen's self-organizing map (SOM), has proven to be a promising browsing tool for information retrieval. The SOM algorithm automatically compresses and transforms a complex information space into a two-dimensional graphical representation, which gives users a friendly interface for exploring an information repository. In this study, we applied the SOM artificial neural network algorithm to organize Chinese plant documents into a two-dimensional display map, as a visual tool to help users fulfill their information needs. In the preprocessing stage, we focused on Chinese word segmentation and stopword removal to construct a plant-domain corpus, and then vectorized the documents against this corpus to produce the input values for the SOM network. The results of this study show that the Chinese plant document map achieved good recall and precision in our experiment. |
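The mapping of document vectors onto a two-dimensional grid described in the abstract can be sketched as follows. This is a generic, minimal SOM in pure Python, not the authors' implementation: the grid size, learning-rate schedule, Gaussian neighbourhood, and the names `train_som` and `best_matching_unit` are all assumptions made for illustration, and the input vectors are toy data standing in for vectorized documents.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols SOM; returns {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # One weight vector per grid node, initialised randomly in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
        order = list(vectors)
        rng.shuffle(order)
        for x in order:
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                dist2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-dist2 / (2.0 * sigma * sigma))
                # Pull each node toward x, weighted by grid distance to the BMU.
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


# Toy "document vectors" (hypothetical term weights, not data from the study).
docs = [
    [1.0, 0.9, 0.0, 0.0],
    [0.9, 1.0, 0.1, 0.0],
    [0.0, 0.0, 1.0, 0.9],
    [0.0, 0.1, 0.9, 1.0],
]
som = train_som(docs, rows=3, cols=3)
positions = [best_matching_unit(som, d) for d in docs]  # grid cell per document
```

Each document ends up assigned to the grid cell of its best-matching unit, so documents with similar vectors tend to land in the same or neighbouring cells, which is the property the document map relies on for browsing.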
Subject Classification |
Social Sciences >
Education |