Title

發展主題語料庫以輔助華語教學-以2019新型冠狀病毒語料庫為例

Parallel title

Developing a Topic-Specific Web Corpus, COVID-19, for Chinese Language Teaching and Learning

Authors

白明弘(Ming-Hong BAI);陳浩然(Howard Hao-Jan CHEN);林鶯(Ying LIN)

Keywords

N連詞分析 ; 主題華語 ; 字詞頻率 ; 搭配詞 ; 網路作為語料庫 ; 關鍵字詞分析 ; Chinese for specific topics ; collocation ; keyword analysis ; N-gram analysis ; web as corpus ; word frequency

Journal

華語文教學研究

Volume/issue and publication date

Vol. 17, No. 3 (2020/09/01)

Pages

1 - 51

Language

Traditional Chinese

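Chinese abstract (English translation)

The 2019 novel coronavirus disease (COVID-19) has had an enormous impact on humanity. Teams in Europe and the United States have built English COVID-19 corpora, but the Chinese-speaking world does not yet have a comparable corpus. This paper therefore aims to fill that gap by building a Chinese COVID-19 topic corpus for researchers, teachers, and students. The study poses two research questions: (1) Can the Chinese COVID-19 corpus and the No Sketch Engine platform provide useful information? (2) What are the strengths and weaknesses of this corpus tool? The study builds the COVID-19 topic corpus with WebBootCat and produces several kinds of teaching material: (1) word frequencies, (2) keywords, (3) common n-grams, and (4) collocations. It finds that WebBootCat can effectively generate a topic corpus, with the following advantages: (1) timeliness, (2) broad coverage, (3) authentic language, and (4) rich context. However, because the corpus was built by crawling web data, it inevitably includes irrelevant noise, and many important tools on the platform have yet to be developed.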

English abstract

The coronavirus disease 2019 (COVID-19) has had a serious impact on people around the globe. Teams in Europe and the United States have built English COVID-19 corpora, yet no such corpus is available in Chinese. This paper therefore aims to fill this gap by building a Chinese COVID-19 corpus for researchers, teachers, and students. The two research questions are as follows: (1) Can the Chinese COVID-19 corpus and the No Sketch Engine platform provide useful information for Chinese teachers and students? (2) What are the advantages and disadvantages of this web platform in terms of the contents and analyses of the corpus? A Chinese COVID-19 corpus was built with WebBootCat. By analyzing the corpus, the study also generated raw data to support Chinese teaching and learning: (1) high-frequency vocabulary items, (2) keywords, (3) n-grams, and (4) collocations. The study found that WebBootCat can efficiently generate a topic-specific corpus, which has the following advantages: (1) immediacy, (2) wide coverage, (3) authentic language, and (4) rich language contexts. However, because the data were crawled from the web, the corpus may contain irrelevant noise. Moreover, more tools still need to be developed for No Sketch Engine.

Subject classification: Humanities > Linguistics
Social Sciences > Education
References
  1. 謝佳玲, Chia-ling,吳欣儒, Xin-ru(2018)。以華語電視新聞為材料的語篇研究及聽力教學應用。臺灣華語教學研究,16,91-124。
  2. 謝佳玲, Chia-ling,李家豪, Jia-hao(2011)。臺灣電視新聞標題研究與教學啟示。華語文教學研究,8(3),79-114。
  3. Anthony, Laurence(2004).AntConc: A learner and classroom friendly, multi-platform corpus analysis toolkit.Proceedings of An Interactive Workshop on Language e-Learning,Tokyo, Japan:
  4. Bahns, Jens,Eldaw, Moira(1993).Should we teach EFL students collocations?.System,21(1),101-114.
  5. Baker, Paul(2006).Using Corpora in Discourse Analysis.London:Continuum.
  6. Baroni, Marco,Bernardini, Silvia(2004).BootCaT: Bootstrapping corpora and terms from the Web.Proceedings of the Fourth International Conference on Language Resources and Evaluation,Lisbon, Portugal:
  7. Baroni, Marco,Kilgarriff, Adam,Pomikálek, Jan,Rychlý, Pavel(2006).WebBootCaT: Instant domain-specific corpora to support human translators.Proceedings of EAMT 2006 - 11th Annual Conference of the European Association for Machine Translation,Oslo, Norway:
  8. Biber, Douglas(2006).University Language: A Corpus-based Study of Spoken and Written Registers.Amsterdam:John Benjamins.
  9. Biber, Douglas,Barbieri, Federica(2007).Lexical bundles in university spoken and written registers.English for Specific Purposes,26(3),263-286.
  10. Biber, Douglas,Johansson, Stig,Leech, Geoffrey,Conrad, Susan,Finegan, Edward(1999).The Longman Grammar of Spoken and Written English.London:Longman.
  11. Boulton, Alex,Cobb, Tom(2017).Corpus use in language learning: A meta-analysis.Language Learning,67(2),348-393.
  12. Chen, Yu-hua,Baker, Paul(2010).Lexical bundles in L1 and L2 academic writing.Language Learning & Technology,14(2),30-49.
  13. COVID-19 Open Research Dataset (CORD-19). 2020. Version 2020.05.02. Accessed online, May 2, 2020. https://pages.semanticscholar.org/coronavirus-research. doi:10.5281/zenodo.3715505
  14. Davies, Mark. 2020. The Coronavirus Corpus. Accessed online, May 2, 2020. https://www.english-corpora.org/corona/
  15. Farghal, Mohammed,Obiedat, Hussein(1995).Collocations: A neglected variable in EFL.IRAL-International Review of Applied Linguistics in Language Teaching,33(4),315-332.
  16. Firth, John(1957).Papers in Linguistics, 1934-1951.Oxford:Oxford University Press.
  17. Fujii, Atsushi,Ishikawa, Tetsuya(2000).Utilizing the World Wide Web as an encyclopedia: Extracting term descriptions from semi-structured texts.Proceedings of the 38th Annual Meeting on Association for Computational Linguistics,Hong Kong:
  18. Gitsaki, Christina(1996).Brisbane,The University of Queensland.
  19. Halliday, Michael A. K.,McIntosh, Angus,Strevens, Peter(1964).The Linguistic Sciences and Language Teaching.London:Longman.
  20. Howarth, Peter(1998).Phraseology and second language proficiency.Applied Linguistics,19(1),24-44.
  21. Jones, Rosie,Ghani, Rayid(2000).Automatically building a corpus for a minority language from the web.Annual Meeting-Association for Computational Linguistics
  22. Kilgarriff, Adam,Grefenstette, Gregory(2003).Introduction to the special issue on the web as corpus.Computational Linguistics,29(3),333-347.
  23. Kilgarriff, Adam,Tugwell, David(2001).Word sketch: Extraction and display of significant collocations for lexicography.Proceedings of the Workshop on Collocation: Computational Extraction, Analysis and Exploitation,Toulouse, France:
  24. Leech, Geoffrey(1997).Teaching and language corpora: A convergence.Teaching and Language Corpora,London:
  25. Lewis, Michael(1997).Implementing the Lexical Approach: Putting Theory into Practice.United Kingdom:Heinle.
  26. Lewis, Michael(2000).Teaching Collocation: Further Developments in the Lexical Approach.England:Language Teaching Publications, LTP.
  27. Lin, Yu-hsiu(2011).Taipei,National Taiwan Normal University.
  28. Lui, Marco,Cook, Paul(2013).Classifying English documents by national dialect.Proceedings of the Australasian Language Technology Association Workshop 2013 (ALTA 2013)
  29. Nation, I. S. Paul(2001).Learning Vocabulary in Another Language.Cambridge:Cambridge University Press.
  30. Nattinger, James R.,DeCarrico, Jeanette S.(1992).Lexical Phrases and Language Teaching.Oxford:Oxford University Press.
  31. Rayson, Paul,Garside, Roger(2000).Comparing corpora using frequency profiling.Proceedings of the Workshop on Comparing Corpora,Hong Kong, China:
  32. Resnik, Philip(1999).Mining the web for bilingual text.Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics on Computational Linguistics,Maryland, U.S.A.:
  33. Rychlý, Pavel(2007).Manatee/Bonito- a modular corpus manager.Proceedings of the 1st Workshop on Recent Advances in Slavonic Natural Language Processing,Brno, Czech Republic:
  34. Salazar, Lorenzo,Joy, Danica(2011).Barcelona,Universitat de Barcelona.
  35. Scott, Mike. 2008. WordSmith Tools version 5. Liverpool: Lexical Analysis Software. http://www.lexically.net/wordsmith/version5/index.html
  36. Scott, Mike(1997).PC analysis of key words- and key key words.System,25(2),233-245.
  37. Scott, Mike,Tribble, Christopher(2006).Textual Patterns: Key Words and Corpus Analysis in Language Education.Amsterdam:John Benjamins.
  38. Simpson-Vlach, Rita,Ellis, Nick C.(2010).An academic formulas list: New methods in phraseology research.Applied Linguistics,31(4),487-512.
  39. Sinclair, John,Renouf, Antoinette(1988).A lexical syllabus for language learning.Vocabulary and Language Teaching
  40. Wicke, Philipp,Bolognesi, Marianna M.(2020).,unpublished
  41. Williams, Raymond(1976).Keywords: A Vocabulary of Culture and Society.New York:Oxford University Press.
  42. Willis, Dave(2003).Rules, Patterns and Words: Grammar and Lexis in English Language Teaching.Cambridge:Cambridge University Press.
  43. Woolard, George(2000).Collocation- encouraging learner independence.Teaching Collocation: Further Developments in the Lexical Approach,Oxford:
  44. Wray, Alison,Perkins, Michael R.(2000).The functions of formulaic language: An integrated model.Language & Communication,20(1),1-28.
  45. 王建勤, Jian-qin(1997).漢語作為第二語言的習得研究.北京=Beijing:北京語言大學出版社=Beijing Language and Culture University Press.
  46. 全香蘭, Xiang-lan(2008)。韓語漢字詞對學生習得漢語詞語的影響。基於中介語語料庫的漢語詞彙專題研究
  47. 吳鑑城, Jian-cheng,陳浩然, Howard Hao-jan,張俊盛, Jason S.(2017)。網路語料庫介紹與應用。語料庫與華語教學,臺北=Taipei:
  48. 胡明揚, Ming-yang(2006)。詞彙教學理論。對外漢語詞彙及詞彙教學研究
  49. 馬玉汴, Yu-bian(2006)。詞彙教學方法。對外漢語詞彙及詞彙教學研究
  50. 高燕, Yan(2008).對外漢語詞彙教學.上海=Shanghai:華東師範大學出版社=East China Normal University Press.
  51. 陳浩然, Howard Hao-jan,潘依婷, I-ting(2017)。語料庫與華語教學。語料庫與華語教學,臺北=Taipei:
  52. 陳燕秋, Yan-qiu(2006)。新聞選讀團體語言教學法實例。台灣華語文教學,1,36-39。
  53. 陸國強, Guo-qiang(1983).現代英語詞彙學.上海=Shanghai:上海外語教育出版社=Shanghai Foreign Language Education Press.
  54. 彭增安, Zeng-an(2007).跨文化的語言傳通-漢語二語習得與教學.上海=Shanghai:學林出版社=Academia Press.
  55. 黃琡華, Chu-hua(2014)。從新聞語體特色談華語新聞教學。華語學刊,16,80-86。
  56. 董政, Zhen,鄭艷群, Yan-qun(2008)。歐美學生漢語量詞的使用情況。基於中介語語料庫的漢語詞彙專題研究
  57. 劉亞菲, Ya-fei,鄭艷群, Yan-qun(2008)。韓國學生漢語量詞的使用情況。基於中介語語料庫的漢語詞彙專題研究
  58. 蕭頻, ping,張妍, Yan(2008)。印尼學生漢語單音節動詞語義偏誤。基於中介語語料庫的漢語詞彙專題研究
  59. 謝舒凱, Shu-kai(2017)。中文語料與詞彙知識地圖。語料庫與華語教學,臺北=Taipei:
Cited by
  1. 劉德馨,施孟賢(2022)。建置構式成語語料庫輔助華語教學-兼論同型構式和近義構式。華語文教學研究,19(1),95-121。