Title

中文版「語文探索與字詞計算」詞典之建立

Parallel title

The Development of the Chinese Linguistic Inquiry and Word Count Dictionary

DOI

10.6129/CJP.2012.5402.04

Authors

黃金蘭(Chin-Lan Huang);Cindy K. Chung;Natalie Hui;林以正(Yi-Cheng Lin);謝亦泰(Yi-Tai Seih);Ben C. P. Lam;程威銓(Wei-Chuan Chen);Michael H. Bond;James W. Pennebaker

Keywords

文本分析 ; 情緒書寫 ; 語文探索與字詞計算 ; expressive writing ; LIWC (Linguistic Inquiry and Word Count) ; text analysis

Journal

Chinese Journal of Psychology (中華心理學刊)

Volume/issue and publication date

Vol. 54, No. 2 (2012/06/01)

Pages

185 - 201

Language of content

Traditional Chinese

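Chinese abstract (translated)

Everyday language use, whether written or spoken, reflects a person's inner psychological state, thinking style, and even personality traits, and so offers psychological research a window onto the mind. Pennebaker's research team (for a review, see Pennebaker, 2011) analyzes the features of language by counting words and has developed the computer program Linguistic Inquiry and Word Count (LIWC). The core of LIWC is its dictionary; the current version, LIWC2007, can analyze 80 word categories and has fairly good reliability and validity. The main purpose of this research was to build a Chinese version of the LIWC dictionary and to test both its equivalence with the English LIWC2007 dictionary and its validity. Study 1 followed the basic procedure used to build the original English LIWC dictionary, with rigorous, repeated checks at every stage, and added several word categories specific to Chinese, yielding a Chinese LIWC dictionary with 30 linguistic categories and 42 psychological categories. Study 2 collected 100 texts with parallel Chinese and English versions and compared the two analyses. For most categories, the percentages detected by the Chinese LIWC and by LIWC2007 were very highly correlated, indicating considerable equivalence between the two versions. Study 3 analyzed the characteristics of depression-related texts and found that, relative to a control group, they used more first-person singular pronouns, fewer first-person plural pronouns, and more negative emotion words. This result is consistent with the English-language literature that uses LIWC and indicates that the Chinese LIWC also has a degree of validity. Overall, through three studies this paper established a Chinese LIWC dictionary that is valid and largely equivalent to LIWC2007. We believe the Chinese LIWC can serve as a powerful research tool for analyzing the psychological characteristics of Chinese language use.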

English abstract

The analysis of natural language opens a window to the exploration of thoughts, feelings, and personalities. To analyze word use, Pennebaker and his colleagues (Pennebaker, 2011) developed a computer program, Linguistic Inquiry and Word Count (LIWC). LIWC reports the percentage of words in a text file that fall into the grammatical, psychological, and content categories of its dictionary, which was created by judges who rated whether each entry belonged in a category. This paper describes the development of the Chinese LIWC dictionary and establishes its reliability and validity. Study 1 involved the translation of terms from the English LIWC dictionary into Chinese, and judges' ratings of whether each entry belonged in a category. Furthermore, some categories unique to the Chinese language were added to the Chinese LIWC dictionary. This resulted in a total of approximately 6,800 words across 30 linguistic categories and 42 psychological categories. In Study 2, we analyzed one hundred texts and their translations using both the English LIWC dictionary and the Chinese LIWC dictionary. Fifty texts of varied genres and authors were written in English and translated into Chinese; the others were written in Chinese and translated into English. Reliable correlations were found between the English LIWC and Chinese LIWC categories, indicating acceptable equivalence between the two dictionaries. In Study 3, we analyzed 30 bulletin board messages from a site for people suffering from depression, and 30 messages from a site on a control topic (personal experiences with part-time jobs). Similar to what past researchers have found in English, depressed people used more first-person singular pronouns, fewer first-person plural pronouns, and more negative emotion words than non-depressed people, supporting the validity of the Chinese LIWC.
Just as the English LIWC has led to discoveries in the social sciences, the Chinese LIWC now opens new windows to the psychology of Chinese speakers and authors by providing an efficient means by which to analyze open-ended texts. Future research, applications, and revisions of the Chinese LIWC dictionary are discussed.
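
The word-count approach described in the abstract can be illustrated with a minimal sketch. It assumes the text has already been split into words (Chinese text would first need word segmentation), and it uses a hypothetical three-category toy dictionary rather than any actual LIWC entries; the function name `category_percentages` is likewise an assumption made for illustration, not part of the LIWC software.

```python
from collections import Counter

# Hypothetical toy dictionary (category -> words); NOT the real LIWC entries.
TOY_DICTIONARY = {
    "i": {"i", "me", "my"},          # first-person singular pronouns
    "we": {"we", "us", "our"},       # first-person plural pronouns
    "negemo": {"sad", "hurt", "afraid"},  # negative emotion words
}


def category_percentages(tokens, dictionary):
    """Return, for each category, the percentage of tokens found in it."""
    total = len(tokens)
    if total == 0:
        # An empty text has no words, so every category scores 0%.
        return {category: 0.0 for category in dictionary}
    counts = Counter()
    for token in tokens:
        for category, words in dictionary.items():
            if token in words:
                counts[category] += 1
    return {
        category: 100.0 * counts[category] / total
        for category in dictionary
    }


# Assumed pre-tokenized input: 8 whitespace-separated words.
tokens = "i feel sad and my friends hurt me".lower().split()
result = category_percentages(tokens, TOY_DICTIONARY)
print(result)
```

A comparison such as the one in Study 3 would then contrast these percentages between two groups of texts; the sketch above covers only the counting step.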

Subject classification: Social Sciences > Psychology
References
  1. 金樹人(2010)。心理位移之結構特性及其辯證現象之分析:自我多重面向的敘寫與敘說。中華輔導與諮商學報,28,187-228。
  2. 張惠蓉、何玫樺、黃倩茹(2008)。線上社會支持類型探討:以PTT精神疾病版及整型美容版為例。新聞學研究,94,61-105。
  3. 楊中芳(2010)。中庸實踐思維體系探研的初步進展。本土心理學研究,34,3-96。
  4. 中央研究院中文詞知識庫小組(1998b):《中央研究院漢語平衡語料資料庫詞集及詞頻統計》。台北:中華民國計算語言學學會。[Chinese Knowledge Information Processing Group. (1998b). Word list with accumulated word frequency in sinica corpus 3.0. Taipei: The Association of Computational Linguistics and Chinese Language Processing.]
  5. 國語推行委員會(2007):《重編國語辭典(修訂本)中華民國九十六年十二月臺灣學術網路四版ver.2》。2010年8月26日,摘自教育部網站,http://dict.revised.moe.edu.tw/ [National Language Committee. (2007). Taiwan Ministry of Education online dictionary version 2. Retrieved August 26, 2010, from Ministry of Education Web site: http://dict.revised.moe.edu.tw/]
  6. 中央研究院中文詞知識庫小組(1998a):《線上中文斷詞系統》。2010年8月26日,摘自中文詞知識庫小組網站,http://ckipsvr.iis.sinica.edu.tw/ [Chinese Knowledge Information Processing Group. (1998a). Chinese word segmentation online system. Retrieved August 26, 2010, from Chinese Knowledge Information Processing Group Web site: http://ckipsvr.iis.sinica.edu.tw/]
  7. Beck, A. T.(1967).Depression: Clinical, experimental and theoretical aspects.New York:Hoeber Medical Division, Harper & Row.
  8. Beck, A. T.,Ward, C. H.,Mendelson, M.,Mock, J.,Erbaugh, J.(1961).An inventory for measuring depression.Archives of General Psychiatry,4,561-571.
  9. Choi, I.,Koo, M.,Choi, J. A.(2007).Individual differences in analytic versus holistic thinking.Personality and Social Psychology Bulletin,33,691-705.
  10. Chung, C.,Pennebaker, J.(2007).The psychological functions of function words.Social communication: Frontiers of social psychology,New York:
  11. Cohen, J.(1992).A power primer.Psychological Bulletin,112,155-159.
  12. Cohn, M. A.,Mehl, M. R.,Pennebaker, J. W.(2004).Linguistic markers of psychological change surrounding September 11, 2001.Psychological Science,15,687-693.
  13. Francis, M. E.,Pennebaker, J. W.(1993).,Dallas, TX:Southern Methodist University.
  14. Friedman, H. S.(Ed.).Oxford handbook of health psychology.New York:Oxford University Press.
  15. Gonzales, A. L.,Hancock, J. T.,Pennebaker, J. W.(2010).Language style matching as a predictor of social dynamics in small groups.Communication Research,37,3-19.
  16. Graybeal, A.,Sexton, J. D.,Pennebaker, J. W.(2002).The role of story-making in disclosure writing: The psychometrics of narrative.Psychology & Health,17,571-581.
  17. Gunsch, M. A.,Brownlow, S.,Haynes, S. E.,Mabe, Z.(2000).Differential linguistic content of various forms of political advertising.Journal of Broadcasting & Electronic Media,44,27-42.
  18. Hartley, J.,Pennebaker, J. W.,Fox, C.(2003).Abstracts, introductions and discussions: How far do they differ in style?.Scientometrics,57,389-398.
  19. Ireland, M. E.,Pennebaker, J. W.(2010).Language style matching in writing: Synchrony in essays, correspondence, and poetry.Journal of Personality and Social Psychology,99,549-571.
  20. Ireland, M. E.,Slatcher, R. B.,Eastwick, P. W.,Scissors, L. E.,Finkel, E. J.,Pennebaker, J. W.(2011).Language style matching predicts relationship initiation and stability.Psychological Science,22,39-44.
  21. Lee, C. H.,Park, J.,Seo, Y. S.(2006).An analysis of linguistic styles by inferred age in TV dramas.Psychological Reports,99,351-356.
  22. Ma, W. Y.,Chen, K. J.(2003).A bottom-up merging algorithm for Chinese unknown word extraction.Proceedings of the Second SIGHAN Workshop on Chinese Language Processing
  23. Ma, W. Y.,Chen, K. J.(2003).Introduction to CKIP Chinese word segmentation system for the first international Chinese Word Segmentation Bakeoff.Proceedings of the Second SIGHAN Workshop on Chinese Language Processing
  24. Mehl, M. R.,Gosling, S. D.,Pennebaker, J. W.(2006).Personality in its natural habitat: Manifestations and implicit folk theories of personality in daily life.Journal of Personality and Social Psychology,90,862-877.
  25. Newman, M. L.,Groom, C. J.,Handelman, L. D.,Pennebaker, J. W.(2008).Gender differences in language use: An analysis of 14,000 text samples.Discourse Processes,45,211-236.
  26. Newman, M. L.,Pennebaker, J. W.,Berry, D. S.,Richards, J. M.(2003).Lying words: Predicting deception from linguistic styles.Personality and Social Psychology Bulletin,29,665-675.
  27. Peng, K.,Nisbett, R. E.(1999).Culture, dialectics, and reasoning about contradiction.American Psychologist,54,741-754.
  28. Pennebaker, J. W.(2011).The secret life of pronouns: What our words say about us.New York:Bloomsbury Press.
  29. Pennebaker, J. W.(1997).Writing about emotional experiences as a therapeutic process.Psychological Science,8,162-166.
  30. Pennebaker, J. W.,Beall, S. K.(1986).Confronting a traumatic event: Toward an understanding of inhibition and disease.Journal of Abnormal Psychology,95,274-281.
  31. Pennebaker, J. W.,Booth, R. J.,Francis, M. E.(2007).Operator's manual - Linguistic inquiry and word count: LIWC2007.Austin, TX:LIWC.net.
  32. Pennebaker, J. W.,Chung, C. K.,Ireland, M.,Gonzales, A.,Booth, R. J.(2007).The development and psychometric properties of LIWC2007.Austin, TX:LIWC.net.
  33. Pennebaker, J. W.,Colder, M.,Sharp, L. K.(1990).Accelerating the coping process.Journal of Personality and Social Psychology,58,528-537.
  34. Pennebaker, J. W.,Francis, M. E.(1996).Cognitive, emotional, and language processes in disclosure.Cognition and Emotion,10,601-626.
  35. Pennebaker, J. W.,Francis, M. E.,Booth, R. J.(2001).Linguistic inquiry and word count: LIWC2001.Mahwah, NJ:Erlbaum.
  36. Pennebaker, J. W.,Kiecolt-Glaser, J.,Glaser, R.(1988).Disclosure of traumas and immune function: Health implications for psychotherapy.Journal of Consulting and Clinical Psychology,56,239-245.
  37. Pennebaker, J. W.,King, L. A.(1999).Linguistic styles: Language use as an individual difference.Journal of Personality and Social Psychology,77,1296-1312.
  38. Pennebaker, J. W.,Lay, T. C.(2002).Language use and personality during crises: Analyses of Mayor Rudolph Giuliani's press conferences.Journal of Research in Personality,36,271-282.
  39. Pennebaker, J. W.,Mayne, T. J.,Francis, M. E.(1997).Linguistic predictors of adaptive bereavement.Journal of Personality and Social Psychology,72,863-871.
  40. Pennebaker, J. W.,Mehl, M. R.,Niederhoffer, K. G.(2003).Psychological aspects of natural language use: Our words, our selves.Annual Review of Psychology,54,547-577.
  41. Pyszczynski, T.,Greenberg, J.(1987).Self-regulatory perseveration and the depressive self-focusing style: A self-awareness theory of reactive depression.Psychological Bulletin,102,122-138.
  42. Rude, S.,Gortner, E.-M.,Pennebaker, J.(2004).Language use of depressed and depression-vulnerable college students.Cognition & Emotion,18,1121-1133.
  43. Spencer-Rodgers, J.,Peng, K.,Wang, L.(2009).Dialecticism and the co-occurrence of positive and negative emotions across cultures.Journal of Cross-Cultural Psychology,41,109-115.
  44. Spencer-Rodgers, J.,Williams, M. J.,Peng, K.(2010).Cultural differences in expectations of change and tolerance for contradiction: A decade of empirical research.Personality and Social Psychology Review,14,296-312.
  45. Stirman, S. W.,Pennebaker, J. W.(2001).Word use in the poetry of suicidal and nonsuicidal poets.Psychosomatic Medicine,63,517-522.
  46. Tausczik, Y. R.,Pennebaker, J. W.(2010).The psychological meaning of words: LIWC and computerized text analysis methods.Journal of Language and Social Psychology,29,24-54.
  47. Tsai, J. L.(2007).Ideal affect: Cultural causes and behavioral consequences.Perspectives on Psychological Science,2,242-259.
  48. Tsai, J. L.,Knutson, B.,Fung, H. H.(2006).Cultural variation in affect valuation.Journal of Personality and Social Psychology,90,288-307.
  49. Tsai, J. L.,Miao, F. F.,Seppala, E.(2007).Good feelings in Christianity and Buddhism: Religious differences in ideal affect.Personality and Social Psychology Bulletin,33,409-421.
  50. Watson, D.,Clark, L. A.,Tellegen, A.(1988).Development and validation of brief measures of positive and negative affect: The PANAS scales.Journal of Personality and Social Psychology,54,1063-1070.
  51. Zimmerman, M.,Coryell, W.(1987).The inventory to diagnose depression, lifetime version.Acta Psychiatrica Scandinavica,75,495-499.
  52. 鄭昭明、陳英孜、卓淑玲、陳學志、梁庚辰(2011)。華人情緒類別的加性樹狀結構。第七屆華人心理學家學術研討會,台北市=Taipei:
Cited by
  1. Tzu-Ying Wu,Tzu-Ying Lin,Chien Huang(2023)。When Digital Deception is not the Extension of Real-life Deception: Computerized Text Analysis of Digital Deception in Day-to-day News。中華心理學刊,65(3),215-229。
  2. 黃健,黃彥霖,吳姿穎(2022)。當我很好時,你應該提防:隱現於性罪犯治療週誌之再犯軌跡。中華輔導與諮商學報,65,1-25。
  3. 黃金蘭,林瑋芳,李怡青(2021)。婚姻平權議題之支持方與反對方的心理特性差異:以字詞分析為取向。教育心理學報,53(1),109-126。
  4. 黃金蘭,林瑋芳,林以正(2014)。從LIWC到C-LIWC:電腦化中文字詞分析的潛力。臺灣諮商心理學報,2(1),97-111。
  5. 黃金蘭,林瑋芳,林以正,李嘉玲,James W. Pennebaker(2020)。語言探索與字詞計算詞典2015中文版之修訂。調查研究-方法與應用,45,73-118。
  6. 黃金蘭,林以正,仲傳仁(2022)。觀點取替對態度極化的緩解作用:中介及遷移效果分析。教育心理學報,54(2),283-306。
  7. 黃金蘭、程威銓、張仁和、林以正(2014)。我你他的轉變:以字詞分析探討大學生心理位移書寫文本之位格特性。中華輔導與諮商學報,39,35-58。
  8. 黃金蘭、林瑋芳、林以正(2015)。中庸與轉念:以字詞分析體現中庸思維之情緒調節動態歷程。本土心理學研究,44,119-150。
  9. 黃金蘭、張仁和、林以正(2013)。從情緒平和與止觀探討心理位移日記書寫方法的療癒機制。教育心理學報,44(3),589-608。
  10. 黃劭彥,陳玲娜,康照宗,王登仕(2022)。法人說明會影音內容對盈餘管理之影響。管理與系統,29(3),337-361。
  11. 金樹人、李非(2016)。心理位移日記書寫詞語結構與內涵之話語分析。教育心理學報,47(3),305-327。
  12. 林瑋芳(2023)。字詞分析工具(LIWC)的理論基礎及其在體育領域的應用。體育學報,56(1),1-16。
  13. 歐仁彬,黃天受,郝沛毅,林振穎,吳建生(2018)。透過新聞文章預測股價漲跌趨勢-結合情緒分析、主題模型與模糊支持向量機。資訊管理學報,25(4),363-395。
  14. 歐仁彬,楊盛琮,黃天受,郝沛毅(2018)。網路直播聊天室情緒探勘-使用模糊支持向量機。資訊管理學報,25(2),185-218。
  15. 蘇靖雅,譚躍(2023)。臺灣報紙中空污風險的新聞框架:跨時的演變及其內涵。新聞學研究,154,55-112。
  16. 溫錦真,高倜歐,林美珠(2014)。敘事意義性、連貫性與對話性:一次治療書寫的敘事理解與分析。臺灣諮商心理學報,2(1),31-50。
  17. 謝舒凱,曾昱翔(2019)。深度詞庫:邁向知識導向的人工智慧基礎。中華心理學刊,61(3),231-247。
  18. 楊立行,許清芳(2019)。社群媒體上分手文章的性別差異:文本分析取徑。中華心理學刊,61(3),209-230。
  19. 楊新章、黃梓瑞(2017)。基於人格特質之遊戲推薦技術。Electronic Commerce Studies,15(2),209-244。
  20. 葉寶玲(2017)。融合東西方文化思想建構個人諮商模式:以三位專家諮商心理學家為例。本土心理學研究,48,279-330。
  21. 張明偉,徐儷瑜,李采凌(2023)。再評估策略對注意力不足過動症高風險成人生氣誘發情境的情緒調控效果探討。中華心理學刊,65(3),257-276。
  22. 張仁和(2021)。平衡與和諧:自我寧靜系統之特性與機制。本土心理學研究,56,177-243。
  23. 鍾宇軒,黃劭彥,沈子崴(2019)。法人說明會影音資訊之內涵。會計評論,68,39-80。
  24. (2017)。集體記憶中的新媒體事件(2002-2014):情緒分析的視角。傳播與社會學刊,40,105-134。
  25. (2019)。大學生網路社群平臺巨量資料探勘之應用。教育與心理研究,42(3),79-109。
  26. (2023)。臺灣2016和2020年立委選舉中候選人臉書貼文策略的影響。中華傳播學刊,43,199-243。
  27. (2024)。臺灣新聞媒體對於第二波新冠肺炎之報導框架研究。新聞學研究,160,67-135。
  28. (2024)。語言探索與字詞計算詞典2015簡體中文版之建置與應用。本土心理學研究,61,115-163。