Title

利用文字內容主題特徵與機器學習方法探討MIS相關期刊在ISI資料庫的主題分類

Parallel title

A Study of the Subject Categorization of the MIS-related Journals in the ISI Databases Using Topical Features in the Text Content and Machine Learning Methods

DOI

10.6120/JoEMLS.2015.523/0027.RS.AM

Author

林頌堅(Sung-Chien Lin)

Keywords

ISI主題類別 ; 機器學習 ; 主題模型 ; 期刊群集 ; 類別預測 ; ISI subject category ; Machine learning ; Topic modeling ; Journal clustering ; Category prediction

Journal

教育資料與圖書館學 (Journal of Educational Media & Library Sciences)

Volume and issue / Publication date

Vol. 52, No. 3 (2015/09/01)

Pages

269 - 298

Language

Traditional Chinese; English

Chinese abstract

本研究利用主題模型、期刊群集與類別預測等方法,分析與討論ISI主題類別IS&LS的MIS相關期刊中同時被賦予Management類別的情形。在期刊群集實驗裡,所有被指定到Management類別的期刊及其它同樣具有相似主題特徵的期刊都被聚集在同一個期刊群集內,「管理」是其共同且最突顯的主題。由於此群集包含的期刊和先前研究的MIS群集大多相同,因此視為本研究的MIS群集。類別預測實驗使用分類迴歸樹方法,分別以ISI的Management類別以及本研究的MIS群集做為正案例,進行期刊類別預測。兩次試驗產生的分類樹都以「管理」主題的出現機率為主要的分類規則,但後者不僅分類樹較為單純,同時預測錯誤也較少。也就是若將MIS群集內所有期刊都指定到Management類別,會使檢索的成效更為周全有效。

English abstract

This study uses topic modeling, journal clustering, and subject category prediction to examine how MIS-related journals in the ISI subject category IS&LS are also assigned to the subject category Management. In the journal clustering experiment, all journals assigned to Management, together with other journals sharing similar topical features, fell into a single cluster whose common and most salient topic was "management". Because the journals in this cluster largely coincide with those in the MIS clusters identified in previous studies, we treated it as the MIS cluster of this study. In the category prediction experiment, we used the classification and regression tree (CART) technique to predict category assignment in two tests, taking as positive examples first the journals in the ISI subject category Management and then the journals in our MIS cluster. The trees generated by both tests used the occurrence probability of the topic "management" as the main classification rule. The second test, however, produced both a simpler classification tree and fewer prediction errors. This suggests that assigning all journals in the MIS cluster to the subject category Management would make retrieval results more complete and effective.

Subject classification: Humanities > Library and Information Science
References
  1. 林頌堅(2014)。以主題模型方法為基礎的資訊計量學領域研究主題分析。教育資料與圖書館學,51(4),499-523。
  2. Emerald. (2015). Online Information Review. Retrieved from http://www.emeraldgrouppublishing.com/products/journals/journals.htm?id=oir
  3. SAGE. (2015). Information Development. Retrieved from http://idv.sagepub.com/
  4. Palgrave Macmillan. (2015). About the journal. Retrieved from http://www.palgrave-journals.com/jit/about.html
  5. Elsevier B. V. (2015). Information and Organization. Retrieved from http://www.journals.elsevier.com/information-and-organization/
  6. Abrizah, A., Noorhidawati, A., Zainab, A. N. (2015). LIS journals categorization in the Journal Citation Report: A stated preference study. Scientometrics, 102(2), 1083-1099.
  7. Blei, D. M., Ng, A. Y., Jordan, M. I. (2003). Latent Dirichlet allocation. The Journal of Machine Learning Research, 3, 993-1022.
  8. Blondel, V. D., Guillaume, J.-L., Lambiotte, R., Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10).
  9. Boyack, K. W., Klavans, R., Börner, K. (2005). Mapping the backbone of science. Scientometrics, 64(3), 351-374.
  10. Breiman, L., Friedman, J., Stone, C. J., Olshen, R. A. (1984). Classification and regression trees. Belmont, CA: CRC Press.
  11. Chen, C.-M. (2008). Classification of scientific networks using aggregated journal-journal citation relations in the Journal Citation Reports. Journal of the American Society for Information Science and Technology, 59(14), 2296-2304.
  12. de Moya-Anegón, F., Vargas-Quesada, B., Chinchilla-Rodríguez, Z., Corera-Álvarez, E., Munoz-Fernández, F. J., Herrero-Solana, V. (2007). Visualizing the marrow of science. Journal of the American Society for Information Science and Technology, 58(14), 2167-2179.
  13. Frey, B. J., Dueck, D. (2007). Clustering by passing messages between data points. Science, 315(5814), 972-976.
  14. Glänzel, W., Schubert, A. (2003). A new classification scheme of science fields and subfields designed for scientometric evaluation purposes. Scientometrics, 56(3), 357-367.
  15. Griffiths, T. L., Steyvers, M. (2004). Finding scientific topics. Proceedings of the National Academy of Sciences of the United States of America, 101(Suppl. 1), 5228-5235.
  16. Janssens, F., Zhang, L., De Moor, B., Glänzel, W. (2009). Hybrid clustering for validation and improvement of subject-classification schemes. Information Processing and Management, 45(6), 683-702.
  17. Klavans, R., Boyack, K. W. (2006). Identifying a better measure of relatedness for mapping science. Journal of the American Society for Information Science and Technology, 57(2), 251-263.
  18. Leydesdorff, L. (2004). Clusters and maps of science journals based on bi-connected graphs in Journal Citation Reports. Journal of Documentation, 60(4), 371-427.
  19. Leydesdorff, L. (2006). Can scientific journals be classified in terms of aggregated journal-journal citation relations using the Journal Citation Reports? Journal of the American Society for Information Science and Technology, 57(5), 601-613.
  20. Leydesdorff, L., Rafols, I. (2009). A global map of science based on the ISI subject categories. Journal of the American Society for Information Science and Technology, 60(2), 348-362.
  21. Ni, C., Sugimoto, C. R., Cronin, B. (2013). Visualizing and comparing four facets of scholarly communication: Producers, artifacts, concepts, and gatekeepers. Scientometrics, 94(3), 1161-1173.
  22. Porter, A. L., Rafols, I. (2009). Is science becoming more interdisciplinary? Measuring and mapping six research fields over time. Scientometrics, 81(3), 719-745.
  23. Pudovkin, A. I., Garfield, E. (2002). Algorithmic procedure for finding semantically related journals. Journal of the American Society for Information Science and Technology, 53(13), 1113-1119.
  24. Rafols, I., Leydesdorff, L. (2009). Content-based and algorithmic classifications of journals: Perspectives on the dynamics of scientific communication and indexer effects. Journal of the American Society for Information Science and Technology, 60(9), 1823-1835.
  25. Rosvall, M., Bergstrom, C. T. (2008). Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences, 105(4), 1118-1123.
  26. Rzeszutek, R., Androutsos, D., Kyan, M. (2010). Self-organizing maps for topic trend discovery. Signal Processing Letters, IEEE, 17(6), 607-610.
  27. Samoylenko, I., Chao, T.-C., Liu, W.-C., Chen, C.-M. (2006). Visualizing the scientific world and its evolution. Journal of the American Society for Information Science and Technology, 57(11), 1461-1469.
  28. Tseng, Y.-H., Tsay, M.-Y. (2013). Journal clustering of library and information science for subfield delineation using the bibliometric analysis toolkit: CATAR. Scientometrics, 95(2), 503-528.
  29. Wang, F., Wolfram, D. (2015). Assessment of journal similarity based on citing discipline analysis. Journal of the Association for Information Science and Technology, 66(6), 1189-1198.
  30. Wolfram, D., Zhao, Y. (2014). A comparison of journal similarity across six disciplines using citing discipline analysis. Journal of Informetrics, 8(4), 840-853.
  31. Zhang, L., Liu, X., Janssens, F., Liang, L., Glänzel, W. (2010). Subject clustering analysis based on ISI category classification. Journal of Informetrics, 4(2), 185-193.
  32. 林頌堅(2014)。資訊科學期刊的主題分布與多樣性研究。圖書資訊學研究,9(1),171-200。