Title

利用Google BERT提升中文寫作自動評分之準確率

Parallel Title

Applying Google BERT to Enhance the Correct Rate of Automatic Scoring in Chinese Writing

Authors

郭伯臣(Bor-Chen Kuo);李政軒(Cheng-Hsuan Li);黃淇瀅(Chi-Ying Huang)

Keywords

Google BERT model ; LSA ; Chinese writing ; automatic scoring

Journal

測驗學刊 (Psychological Testing)

Volume/Issue (Publication Date)

Vol. 68, No. 1 (2021/03/31)

Pages

53 - 74

Language

Traditional Chinese

Chinese Abstract

Chinese writing tests have long been part of large-scale assessments in Taiwan, but essay scores can vary with the educational background, experience, and judgment of the scoring teachers. Developing an automatic scoring model for Chinese writing to assist teachers with scoring is therefore extremely important. Many current automatic scoring models use Latent Semantic Analysis (LSA); because the words LSA represents in a vector space do not take the order of the surrounding context into account, its use in text analysis is limited. In view of this, this study builds an automatic scoring model for Chinese writing with BERT (Bidirectional Encoder Representations from Transformers), the deep-learning natural language processing model Google proposed in 2018. Google BERT is built on pretraining and fine-tuning: through Masked LM (MLM), Next Sentence Prediction (NSP), and Transformer encoding during pretraining, the model processes text more precisely. This study randomly selected writing-test responses scored on three levels (0, 1, and 2 points) from a Chinese language literacy assessment for undergraduates, for a total of 1,185 students' texts. Using the fine-tuned Google BERT model for automatic scoring of Chinese writing, the overall accuracy between the system and expert scores reached 92.07%, outperforming traditional LSA-based automatic scoring (whose overall accuracy against expert scores was 64.73%).
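The LSA limitation the abstract describes can be sketched concretely: documents become bag-of-words vectors, the term-document matrix is reduced by truncated SVD, and essays are compared by cosine similarity in the latent space. The toy data below is hypothetical, not the study's writing samples; it shows that an essay with the same words in a different order is indistinguishable under LSA.

```python
import numpy as np

# Toy corpus: the first two "essays" contain the same words in a
# different order; the third differs (hypothetical illustration).
docs = [
    "the student writes a clear essay",
    "a clear essay writes the student",
    "the essay lacks a clear argument",
]

# Bag-of-words term-document matrix; word order is discarded here,
# which is exactly the limitation the abstract points out.
vocab = sorted({w for d in docs for w in d.split()})
A = np.array([[d.split().count(w) for w in vocab] for d in docs], dtype=float)

# LSA: keep the top-k singular components of the matrix.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
docs_lsa = U[:, :k] * s[:k]  # essays as k-dimensional latent vectors

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The reordered essay is indistinguishable from the original.
print(round(cos(docs_lsa[0], docs_lsa[1]), 4))  # → 1.0
```

Because rows 0 and 1 have identical bag-of-words counts, their latent vectors coincide and the cosine similarity is exactly 1, regardless of word order; BERT's contextual encoding avoids this collapse.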

English Abstract

In Taiwan, the Chinese writing test has been used in large-scale assessments for many years. When grading Chinese writing, scores may differ because of the educational background, experience, and cognition of the grading teachers. Therefore, developing automatic scoring of Chinese writing to assist teachers in grading is extremely important. At present, many automatic scoring models use Latent Semantic Analysis (LSA). Since the vocabulary represented in the vector space does not take context into account, text analysis is limited. In view of this, this research uses Bidirectional Encoder Representations from Transformers (BERT), proposed by Google in 2018, to establish an automatic scoring model for Chinese writing. Google BERT is based on pretraining and fine-tuning: pretraining includes Masked LM (MLM), Next Sentence Prediction (NSP), and Transformer encoding, after which a fine-tuning step is applied on the specific essays. These two steps make the model more accurate in text processing. This study randomly selected the three-grade (0 points, 1 point, and 2 points) writing test of the "Chinese language literacy test for undergraduates" as the analysis sample, a total of 1,185 students' writing texts. Compared with the traditional LSA-based automatic scoring method (whose overall accuracy against expert scoring was 64.73%), the fine-tuned Google BERT model performed better: the overall accuracy between system and expert scoring reached 92.07%.
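The overall accuracy the abstract reports is the agreement rate between system and expert scores over the three grade levels, typically read off a confusion matrix. A minimal sketch of that computation, using hypothetical scores rather than the study's 1,185 essays:

```python
# Hypothetical expert and system scores on the 0/1/2 grade scale
# (illustrative only; not the study's data).
expert = [0, 1, 2, 2, 1, 0, 2, 1, 1, 2]
system = [0, 1, 2, 1, 1, 0, 2, 1, 2, 2]

# Overall accuracy: fraction of essays where the system's grade
# matches the expert's grade.
accuracy = sum(e == s for e, s in zip(expert, system)) / len(expert)

# 3x3 confusion matrix: rows = expert grade, columns = system grade.
labels = [0, 1, 2]
confusion = [[0] * len(labels) for _ in labels]
for e, s in zip(expert, system):
    confusion[e][s] += 1

print(f"overall accuracy: {accuracy:.2%}")  # 80.00%
for row in confusion:
    print(row)
```

The diagonal of the confusion matrix counts agreements; off-diagonal cells show which grade levels the system confuses, which is more informative than the single accuracy figure when the three grades are imbalanced.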

Subject Classification: Social Sciences > Psychology
Social Sciences > Education
References
  1. 陳林志,陳大仁,葉國暉,吳忠澄(2015)。使用語意模型分析線上部落格文件。中華民國資訊管理學報,22(3),273-316。
  2. 黃仁鵬,張貞瑩(2014)。運用詞彙權重技術於自動文件摘要之研究。中華民國資訊管理學報,21(4),391-416。
  3. 劉至咸,張嘉惠(2019)。基於訊息回應配對相似度估計的聊天記錄解構。中文計算語言學期刊,24(2),63-77。
  4. 謝名娟(2016)。自動評分之現況與未來可能性評估。國家教育研究院教育脈動電子期刊,8,173-177。
  5. 全國大學生語文素養計畫(2021)。計畫精神與目標。取自 https://chliteracy-assessment.weebly.com
  6. Adhikari, A., Ram, A., Tang, R., & Lin, J. (2019). DocBERT: BERT for document classification. arXiv preprint.
  7. Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL-HLT 2019.
  8. Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
  9. Huang, W., Cheng, X., Chen, K., Wang, T., & Chu, W. (2019). Toward fast and accurate neural Chinese word segmentation with multi-criteria learning. arXiv preprint.
  10. Landauer, T. K., Foltz, P. W., & Laham, D. (1998). Introduction to latent semantic analysis. Discourse Processes, 25, 259-284.
  11. Lee, W.-T., Wu, T.-H., Chi, P.-H., Hsieh, C.-C., & Lee, H.-Y. (2020). Further boosting BERT-based models by duplicating existing layers: Some intriguing phenomena inside BERT. arXiv preprint.
  12. Liao, C.-H., Kuo, B.-C., & Pai, K.-C. (2012). Effectiveness of automated Chinese sentence scoring with latent semantic analysis. TOJET: The Turkish Online Journal of Educational Technology, 11(2), 80-87.
  13. Liu, Y. (2019). Fine-tune BERT for extractive summarization. arXiv preprint.
  14. Salton, G., & McGill, M. J. (1983). Introduction to modern information retrieval. New York, NY: McGraw-Hill.
  15. Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural Networks, 61, 85-117.
  16. Shen, B., & Zhao, Y.-S. (2014). An experimental study of incremental SVD on latent semantic analysis. Journal of Internet Technology, 15(1), 35-41.
  17. Van Rijsbergen, C. J. (1979). Information retrieval. London, UK: Butterworths.
  18. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems 30.
  19. Vig, J., & Ramea, K. (2019). Comparison of transfer-learning approaches for response selection in multi-turn conversations. Association for the Advancement of Artificial Intelligence. Retrieved from https://reurl.cc/o9RRdv
  20. Visa, S., Ramsay, B., Ralescu, A., & van der Knaap, E. (2011). Confusion matrix-based feature selection. In Proceedings of the 22nd Midwest Artificial Intelligence and Cognitive Science Conference 2011.
  21. Yang, W., Xie, Y., Tan, L., Xiong, K., Li, M., & Lin, J. (2019). Data augmentation for BERT fine-tuning in open-domain question answering. arXiv preprint.
  22. Yang, W., Zhang, H., & Lin, J. (2019). Simple applications of BERT for ad hoc document retrieval. arXiv preprint.
  23. Young, T., Hazarika, D., Poria, S., & Cambria, E. (2018). Recent trends in deep learning based natural language processing. arXiv preprint.
  24. Zong, M., Zhu, X., & Cheng, D. (2017). Learning k for kNN classification. ACM Transactions on Intelligent Systems and Technology, 8(3), 1-19.
  25. 考選部(2019)。典試法施行細則。臺北市:作者。
  26. 吳孟淞,王新民(2009)。語意關聯主題模型於資訊檢索之研究。人工智慧應用研討會,臺中市霧峰區。
  27. 沈郁婷(2018)。Python 與機器學習:以 Abalone 資料為例。臺北醫學大學生物統計研究中心,23,1-1。
  28. 柯華葳,戴浩一,曾玉村,曾淑賢,劉子鍵,辜玉旻,周育如(2011)。行政院國家科學委員會研究計畫成果報告。桃園市:國立中央大學學習與教學研究所。
  29. 唐大任(2002)。新竹市,國立交通大學。
  30. 孫瑛澤,陳建良,劉峻杰,劉昭麟,蘇豐文(2010)。中文短句之情緒分類。第二十二屆自然語言與語音處理研討會,南投縣埔里鎮。
  31. 馮樹仁(2001)。臺北市,國立臺灣師範大學。
  32. 葉鎮源(2012)。隱含語意索引。取自 http://terms.naer.edu.tw/detail/1678985/。
  33. 熊忠陽,暴自強,李智星,張玉芳(2010)。結合 LSA 的中文譜聚類算法研究。計算機應用研究,27(3),917-918。
  34. 蔡亞韋(2013)。臺中市,國立臺中教育大學。