Title

數位人文「文本空間化」的實證研究:以詩歌、傳記、日記為例

Parallel Title

Spatialization of Textual Data in Digital Humanities: Examples for Poetry, Biographies, and Diaries

DOI

10.6853/DADH.202210_(10).0004

Authors

彭逸帆(Yi-Fan Peng);白璧玲(Pi-Ling Pai);薛化元(Hua-Yuan Hsueh);劉昭麟(Chao-Lin Liu)

Keywords

文本分析 ; 空間人文 ; 自然語言處理 ; 地理資訊系統 ; Textual analysis ; spatial humanities ; natural language processing ; geographic information system

Journal

數位典藏與數位人文 (Journal of Digital Archives and Digital Humanities)

Issue / Publication Date

Issue 10 (2022/10/01)

Pages

96 - 137

Language

Traditional Chinese; English

Chinese Abstract

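Earlier humanities scholars typically grounded their analysis of textual content in wide reading: after closely studying the relevant literature, they internalized that knowledge and applied it in their research. With the arrival of the information society, the explosion of information has forced researchers to consider how to adjust their methods to accommodate the changes brought by technology and the issues those changes raise. Fortunately, in recent years the diverse thinking of digital humanities has given scholars in the traditional humanities many innovative ideas. In the application of research methods in particular, it has moved beyond existing theoretical frameworks, and efficient research tools and systematic procedures have broadened and fostered a growing number of interdisciplinary collaborations. This study takes the perspective of "spatialization of textual data" and analyzes three different types of text: The Complete Collection of Classic Taiwan Poetry (《全臺詩》), the Chronological Memorabilia of Chiang Ching-Kuo (《蔣經國先生大事長編初稿》), and Lei Chen's Diary (《雷震日記》). Combining information technology with spatial analysis methods, it builds corresponding systems for text retrieval and analysis to demonstrate empirically how textual spatialization can be applied in digital humanities. For The Complete Collection of Classic Taiwan Poetry, the poets' birthplaces and the place names in their poems are extracted, and their spatiotemporal characteristics are used to examine the relationship between poets and poems. For the Chronological Memorabilia of Chiang Ching-Kuo, chronologically ordered text records are converted, and the content is reconstructed through charts and visualizations of spatialized and quantified data to analyze how the connections between different places developed. For Lei Chen's Diary, a diary text analysis system is designed around the features of the text to quickly retrieve the place names recorded in different periods and their related content, and to analyze the frequency of personal names and the diarist's social network. These implementations also show the potential of textual spatialization to broaden the horizons of digital humanities research.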

English Abstract

Textual content analysis explores and internalizes knowledge into a research methodology after reading related references. We live in the information age and should assess how to change the existing research methods. Due to the inspiration and influence of digital humanities, scholars with a traditional research philosophy could adopt many innovative ideas into their existing theoretical frameworks. It extends and enables increasing interdisciplinary collaborations through efficient research tools and systematic procedures. This study focuses on proposing a method of spatializing textual data. As case studies, we use three textual data types: The Complete Collection of Classic Taiwan Poetry, Chronological Memorabilia of Chiang Ching-Kuo, and Lei Chen's Diary. Based on the characteristics of these text types, combined with information technology and spatial methods, we construct relevant analysis systems to demonstrate textual data spatialization in digital humanities. First, for The Complete Collection of Classic Taiwan Poetry, we analyze the poets' birthplace and the place names in poems using information technology and spatial methods to explore the relationship between poets and poetry from the perspective of time and space. The context of interaction between places can be analyzed through graphical visualization methods that spatialize and quantify the data when dealing with the Chronological Memorabilia of Chiang Ching-Kuo. In addition, we establish a diary text analysis system based on diary text features to analyze Lei Chen's Diary. The place names and detailed context in the diary can be quickly retrieved. The above implementation also reveals the possibility of the "spatialization of textual data" in expanding the horizons of digital humanities research.

Subject Classification

Humanities > General Humanities
Basic and Applied Sciences > Information Science

References
  1. 王祿驊,李玉亭,范毅軍,廖泫銘,白璧玲(2011)。《裨海記遊》歷史考證與 3D GIS 整合應用。第十一屆地圖學術研討會(CCA 2011),臺北,臺灣:
  2. 李宗信,張育誠,劉庭羽(2021)。《劉福才日記》中的社會關係網絡—另一個觀看日記的視角。臺灣師大歷史學報,65,157-213。
  3. 張素玢,李宗翰,李毓嵐,李昭容,顧雅文,柯皓仁,謝順宏(2018)。從 CBDB 到 TBDB:以《新修彰化縣志.人物志》為試金石。數位典藏與數位人文,2,91-115。
  4. 羅鳳珠,白璧玲,廖泫銘,范毅軍,鄭錦全(2014)。唐代詩人行吟地圖:李白、杜甫、韓愈。圖書館學與資訊科學,40(1),4-25。
  5. Bastian, M.,Heymann, S.,Jacomy, M.(2009).Gephi: An open source software for exploring and manipulating networks.Proceedings of the International AAAI Conference on Web and Social Media
  6. Bodenhamer, D. J.,Corrigan, J.,Harris, T. M.(2010).The spatial humanities: GIS and the future of humanities scholarship.Bloomington, IN:Indiana University Press.
  7. Che, W.,Li, Z.,Liu, T.(2010).LTP: A Chinese language technology platform.Proceedings of the 23rd International Conference on Computational Linguistics: Demonstrations,Stroudsburg, PA:
  8. Chen, K.-J.,Huang, C.-R.,Chang, L.-P.,Hsu, H.-L.(1996).SINICA CORPUS: Design methodology for balanced corpora.Proceedings of the 11th Pacific Asia Conference on Language, Information and Computation,Seoul, Korea:
  9. Garbin, E.,Mani, I.(2005).Disambiguating toponyms in news.Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing,Stroudsburg, PA:
  10. Gazzoni, A.(2017).Mapping Dante: A digital platform for the study of places in the commedia.Humanist Studies & the Digital Age,5,82-95.
  11. Goodchild, M. F.(2013).Prospects for a space–time GIS.Annals of the Association of American Geographers,103,1072-1077.
  12. Google. (2021). Geocoding API. Retrieved from https://developers.google.com/maps/documentation/geocoding/intro
  13. Gregory, I. N.,Geddes, A.(2014).Toward spatial humanities: Historical GIS and spatial history.Bloomington, IN:Indiana University Press.
  14. Harvard University. (2001). China Historical Geographic Information System. Retrieved from https://sites.fas.harvard.edu/~chgis/data/chgis/v6/
  15. Harvard University, Academia Sinica, & Peking University. (2018). China Biographical Database Project. Retrieved from https://projects.iq.harvard.edu/chinesecbdb
  16. Huang, C. R.,Hsieh, S. K.,Chen, K. J.(2017).Mandarin Chinese words and parts of speech: A corpus-based study.Abingdon, UK:Routledge.
  17. Jänicke, S.,Franzini, G.,Cheema, M.,Scheuermann, G.(2016).Visual text analysis in digital humanities.Computer Graphics Forum,36(6),226-250.
  18. Manning, C. D.,Schütze, H.(1999).Foundations of statistical natural language processing.Cambridge, MA:MIT Press.
  19. Manning, C. D.,Surdeanu, M.,Bauer, J.,Finkel, J. R.,Bethard, S.,McClosky, D.(2014).The Stanford CoreNLP Natural Language Processing Toolkit.Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations,Baltimore, MD:
  20. Murrieta-Flores, P.,Baron, A.,Gregory, I.,Hardie, A.,Rayson, P.(2015).Automatically analyzing large texts in a GIS environment: The registrar general’s reports and Cholera in the 19th Century.Transactions in GIS,19,296-320.
  21. Peng, Y.-F.,Liu, C.-L.(2019).Some GIS-Based analysis of the Complete Taiwan Poems.Digital Humanities Conference 2019 (DH 2019),Utrecht, Netherlands:
  22. Peng, Y.-F.,Pai, P.-L.,Liu, C.-L.(2020).Linking time, space, and statements in one GIS system:.2020 International Conference on Digital Humanities,Ottawa, Canada:
  23. Sun, M., Chen, X., Zhang, K., Guo, Z., & Liu, Z. (2016). THULAC: An efficient lexical analyzer for Chinese. Retrieved from https://github.com/thunlp/THULAC
  24. Wang, X.,Zhang, Y.,Chen, M.,Lin, X.,Yu, H.,Liu, Y.(2010).An evidence-based approach for Toponym Disambiguation.2010 18th International Conference on Geoinformatics,New York, NY:
  25. Weissenbacher, D.,Magge, A.,O’Connor, K.,Scotch, M.,Gonzalez-Hernandez, G.(2019).SemEval-2019 Task 12: Toponym resolution in scientific papers.Proceedings of the 13th International Workshop on Semantic Evaluation,Minneapolis, MN:
  26. Wick, M. (n.d.) GeoNames. Retrieved from https://www.geonames.org
  27. World History Center, University of Pittsburgh. (2020). World Historical Gazetteer. Retrieved from http://whgazetteer.org
  28. 中央研究院(2002)。CCTS 時空對位 API。取自 https://ccts.sinica.edu.tw/api/
  29. 中央研究院臺灣史研究所(2008)。臺灣日記知識庫。取自 http://taco.ith.sinica.edu.tw/tdk/
  30. 中央研究院歷史語言研究所(1984)。漢籍電子文獻資料庫。取自 http://hanchi.ihp.sinica.edu.tw
  31. 范毅軍(2014)。構建虛擬時空框架的設想、具體實踐與應用。第五屆數位典藏與數位人文國際研討會(DADH 2014),臺北,臺灣:
  32. 國立臺灣文學館(2005)。數位《全臺詩》資料庫。取自 https://db.nmtl.gov.tw/site5/index
  33. 許雪姬(2015)。「臺灣日記研究」的回顧與展望。臺灣史研究,22(1),153-184。
  34. 彭逸帆,白璧玲,劉昭麟(2019)。人物傳記之時空架構建置:以「蔣經國先生大事長編」為例。第十屆數位典藏與數位人文國際研討會(DADH 2019),臺北,臺灣:
  35. 彭逸帆,薛化元,劉昭麟(2020)。數位化人物傳記分析之研究方法論——以雷震日記為例。第十一屆數位典藏與數位人文國際研討會(DADH 2020),臺北,臺灣:
  36. 雷震研究中心(2012)。雷震日記。取自 http://leichen.nccu.edu.tw/leichen/05files-diaries.html
  37. 羅鳳珠(2005)。詩詞語言詞彙切分與語意分類標記之系統設計與應用。第四屆數位典藏技術研討會(DADH 2005),臺北,臺灣:
  38. 羅鳳珠,范毅軍,鄭錦全(2009)。文史地理資訊網站建置模式與文學研究之應用。數位典藏地理資訊學術研討會,臺北,臺灣: