Title

應用文件分群與文字探勘技術於機器學習領域趨勢分析以SSCI資料庫為例

Parallel Title

Trend Analysis in Machine Learning Research from SSCI Database by Document Clustering Manipulation and Text Mining Methodology

DOI

10.30115/JCJCU.201012.0001

Authors

尹其言(Chi-Yen Yin);楊建民(Jiann-Min Yang)

Keywords

文件分群 ; 文字探勘 ; 自組織映射網路 ; Document Clustering ; Text Mining ; Self-Organization Map

Journal Title

Journal of Chang Jung Christian University (長榮大學學報)

Volume and Issue / Publication Date

Vol. 14, No. 2 (2010/12/01)

Pages

1 - 16

Language of Content

Traditional Chinese

Chinese Abstract

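Research published in machine learning journals has long been a foundation for future applications of computer science and for the birth of new technologies. This study takes the literature on machine learning applications in the SSCI database and uses text mining techniques to extract feature terms that discriminate between articles, then performs a cluster analysis of those terms. The frequency with which each term cluster appears in each article serves as the input variables of a self-organizing map (SOM) network. Using the network's automatic neuron clustering, the study divides machine learning applications into 10 major fields. Finally, it combines these fields with the articles' publication years in a trend analysis, tracing the historical development of each research field and predicting possible future trends.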

English Abstract

This paper introduces a new approach to data mining. The research uses document clustering technology to extract homogeneous terms from each article in the SSCI database and carries them forward into a cluster analysis of the literature. The term frequency indexes generated by aggregating these terms are selected as the input parameters of a Self-Organization Map (SOM) network, whose neurons cluster the articles automatically. The aim is to strengthen the ability to discover patterns by tracking the history and gathering the results of the various research domains, and to forecast possible future research trends.
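
The pipeline described in the abstracts (term-cluster frequencies per article, fed to an SOM, then grouped by publication year) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the 2 x 5 grid (chosen only so that there are 10 neurons, matching the 10 fields), the learning-rate and neighbourhood schedules, and the function names `train_som`, `assign_clusters`, and `yearly_counts` are all assumptions, and the input matrix is expected to be built beforehand by the term extraction and term clustering steps.

```python
import numpy as np


def train_som(data, rows=2, cols=5, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small self-organizing map.

    data: array (n_docs, n_term_clusters) of term-cluster frequencies.
    Returns the neuron weights, shape (rows * cols, n_term_clusters).
    Grid size and schedules are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_units = rows * cols
    # Grid coordinates of each neuron, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise weights from randomly chosen input vectors.
    weights = data[rng.integers(0, len(data), n_units)].astype(float)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(len(data)):
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.3)  # shrinking neighbourhood radius
            x = data[i]
            # Best-matching unit: the neuron closest to the input vector.
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            # Gaussian neighbourhood around the BMU on the grid.
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights


def assign_clusters(data, weights):
    """Assign each document to the index of its nearest neuron."""
    d = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def yearly_counts(years, clusters, n_clusters):
    """Count documents per cluster for each publication year (sorted by year)."""
    table = {}
    for y, c in zip(years, clusters):
        table.setdefault(int(y), np.zeros(n_clusters, dtype=int))[int(c)] += 1
    return dict(sorted(table.items()))
```

Given a frequency matrix `data` and a matching array `years`, calling `yearly_counts(years, assign_clusters(data, train_som(data)), 10)` yields one count vector per year, which is the kind of table a trend analysis over research fields could start from.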

Subject Classification

Humanities > General Humanities
Social Sciences > General Social Sciences

References
  1. Aas, K.,Eikvil, L.(1999).,U.S.A.:Norwegian Computing Center.
  2. Baker, L. D.,McCallum, A. K.(1998).Distributional clustering of words for text classification.Proceedings of the 21st International ACM SIGIR Conference on Research and Development in Information Retrieval
  3. Bassiou, N.,Kotropoulos, C.,Pitas, I.(2001).Hierarchical word clustering for relevance judgment in information retrieval.Third International Conference on Enterprise Information Systems
  4. Bekkerman, R.,El-Yaniv, R.,Winter, Y.,Tishby, N.(2001).On feature distributional clustering for text categorization.Proceedings of the 24th International ACM SIGIR Conference on Research and Development in Information Retrieval
  5. Davis, L. D.,Mitchell, M.(1991).Handbook of Genetic Algorithms.New York:Van Nostrand Reinhold.
  6. Garcia, A. J. J.,Pikatza, J. M.,Florez, S.,Sobrado, F. J.(2005).Intrusion detection using text mining in a web-based telemedicine system.Proceedings of the 18th Australian Joint Conference on Artificial Intelligence
  7. Goldberg, D. E.(1989).Genetic Algorithms in Search, Optimization and Machine Learning.New York:Addison-Wesley.
  8. Heller, K.,Ghahramani, Z.(2005).Bayesian hierarchical clustering.ACM International Conference Proceeding Series
  9. Kao, A.,Poteet, S.(2006).Text Mining and Natural Language Processing-Introduction for the Special Issue.New York:Springer-Verlag.
  10. Lam, W.,Ruiz, M.,Srinivasan, P.(1999).Automatic text categorization and its application to text retrieval.IEEE Transactions on Knowledge and Data Engineering,11(6),865-879.
  11. Lu, W.,Chien, L.,Lee, H.(2002).Translation of Web Queries Using Anchor Text Mining.ACM Transactions on Asian Language Information Processing,1(2),159-172.
  12. Maulik, U.,Bandyopadhyay, S.(2000).Genetic algorithm-based clustering technique.Pattern Recognition,33,1455-1465.
  13. Moens, M. F.,Dumortier, J.(2000).Text Categorization: the Assignment of Subject Descriptors to Magazine Articles.Information Processing & Management,36,841-861.
  14. Moretti, S.(2006).Minimum Cost Spanning Trees Situations and Gene Expression Data Analysis.ACM International Conference Proceeding Series
  15. Salton, G.,Buckley, C.(1988).Term weighting approaches in automatic text retrieval.Information Processing & Management,24(5),513-523.
  16. Sebastiani, F.(2005).Text Categorization, Text Mining and its Applications.Southampton, U.K.:WIT Press.
  17. Sebastiani, F.(2002).Machine learning in automated text categorization.ACM Computing Surveys,34(1),1-47.
  18. Stumme, G.,Hotho, A.,Berendt, B.(2002).Usage mining for and on the semantic web.The Semantic Web-ISWC 2002, 1st International Semantic Web Conference
  19. Yang, Y.,Pedersen, J.(1997).A comparative study on feature selection in text categorization.Proceedings of the 14th International Conference on Machine Learning, ICML-97
Cited By
  1. 陳世榮(2015)。社會科學研究中的文字探勘應用:以文意為基礎的文件分類及其問題。人文及社會科學集刊,27(4),683-718。
  2. 賴進貴、吳佳融(2017)。日治時期臺灣民間地圖之發展-以「私設埤圳圖」為例。地圖,27,38-58。
  3. 張益誠,張育傑,余泰毅(2021)。探討環境教育論文的文件自動分類技術-以2013-2018年環境教育研討會摘要為例。環境教育研究,17(1),85-128。