English Abstract
|
The National Archives Administration, National Development Council (hereinafter referred to as NAA) has been acquiring and digitizing national archives since its establishment, and it hopes to use the results to support remote services and research and to add value to the archives. Currently, users can search the national archives catalog and view archival images through the Archive Access service. NAA would nevertheless like to provide more content services by trying out new information technologies. This research therefore surveys the development and implementation of current optical character recognition (OCR) technologies to explore ways of uncovering more archival content.
|
References
|
-
林巧敏,陳志銘(2017)。古籍風華再現:關於古籍數位人文平台之建置。國家圖書館館刊,106(1),111-132。
-
林巧敏,蔡瀚緯(2020)。光學字元辨識古籍之全文轉置經驗:以明人文集為例。圖資與檔案學刊,12(2),76-117。
-
National Archives Catalog (n.d.b). Missing Air Crew Report number 14011. Retrieved from https://catalog.archives.gov/id/91149134 (May 11, 2021)
-
National Archives Catalog (n.d.a). "Did you bail out? No". Retrieved from https://catalog.archives.gov/search?q=%22Did%20you%20bail%20out%3F%20No%22 (May 11, 2021)
-
簡牘字典(n.d.a)。關於。檢自 https://wcd-ihp.ascdc.sinica.edu.tw/woodslip/about.html (Aug. 22, 2022)
-
簡牘字典(n.d.b)。史語所藏居延漢簡資料庫。檢自 https://wcd-ihp.ascdc.sinica.edu.tw/woodslip/item.php?id1=H00167 (Sep.17, 2021)
-
Center for Open Data in the Humanities (n.d.b). KuroNet サービス. Retrieved from http://codh.rois.ac.jp/kuronet/iiif-curation-viewer/?curation=https://mp.ex.nii.ac.jp/api/kuronet/curation/05e9e254d1a2916116370389c9fae977f9842a91&mode=annotation&lang=en (Jun. 24, 2021)
-
Center for Open Data in the Humanities (2021). 「みを(miwo)」デモ @ 慶應義塾ミュージアム・コモンズ. Retrieved from https://www.youtube.com/watch?v=s3vjEme2gyM (Jun. 24, 2021)
-
Center for Open Data in the Humanities (n.d.a). KuroNet サービス. Retrieved from http://codh.rois.ac.jp/char-shape/app/icv-kuzushiji/?manifest=http://codh.rois.ac.jp/char-shape/book/100249376/manifest.json&pos=7&lang=en (Sep. 22, 2021)
-
Cordis (2019). Opening up Europe’s Written Cultural Heritage to People All over the World. Retrieved from https://cordis.europa.eu/article/id/411587-opening-up-europe-s-written-cultural-heritage-to-people-all-over-the-world (May 21, 2021)
-
Dunley, R. (2018). Machines Reading the Archive: Handwritten Text Recognition Software - The National Archives Blog. Retrieved from https://blog.nationalarchives.gov.uk/machines-reading-the-archive-handwritten-text-recognition-software/ (Jan. 15, 2021)
-
Gemeente Amsterdam Stadsarchief (2021). Amsterdam City Archives. Retrieved from https://transkribus.eu/r/amsterdam-city-archives/#/ (Jun. 16, 2021)
-
Mackenzie, F. (2019). How to teach a computer to read - The National Archives blog. Retrieved from https://blog.nationalarchives.gov.uk/how-to-teach-a-computer-to-read/ (Jan. 15, 2021)
-
Muehlberger, G., Seaward, L., Terras, M., Ares Oliveira, S., Bosch, V., Bryan, M., Zagoris, K. (2019). Transforming scholarship in the archives through handwritten text recognition: Transkribus as a case study. Journal of Documentation, 75(5), 954-976.
-
National Archives of Finland (2021). Search Finnish Court Records. Retrieved from https://tuomiokirjat.narc.fi/en (Jun. 16, 2021)
-
National Archives of Finland (2019a). HTR Models - Update. Retrieved from https://makingamodernarchive.blogspot.com/2019/12/htr-models-update.html (Jun. 17, 2021)
-
National Archives of Finland (2019b). Keyword spotting - an effective search tool. Retrieved from https://makingamodernarchive.blogspot.com/2019/07/keyword-spotting-effective-search-tool.html (Jun. 17, 2021)
-
NTT DATA(2021)。使用 OpenCV 及 Tesseract 進行 OCR 辨識(3)-使用 Tesseract 進行 OCR 辨識。檢自 https://medium.com/ntt-data-idi-platform/-aacdcb9f47a8 (Jun. 23, 2021)
-
Read-coop. (2020). Amsterdam Notary Archives out of the Dark. Retrieved from https://readcoop.eu/success-stories/amsterdam-notary-archives-out-of-the-dark/ (Jun. 16, 2021)
-
Schaefer, M. (2020). New Search Feature: Optical Character Recognition (OCR). Retrieved from https://narations.blogs.archives.gov/2019/09/09/new-search-feature-optical-character-recognition-ocr/ (Jan. 17, 2021)
-
Tesseract (2021b). TESSERACT(1) Manual Page. Retrieved from https://github.com/tesseract-ocr/tesseract/blob/master/doc/tesseract.1.asc (May 7, 2021)
-
Tesseract (2021a). Tesseract OCR. Retrieved from https://github.com/tesseract-ocr/tesseract (Jun. 24, 2021)
-
Transcribe Bentham (2019). Project Update - gamifying the transcription of Bentham’s writings. Retrieved from https://blogs.ucl.ac.uk/transcribe-bentham/2019/02/28/project-update-game-jam/ (Mar. 31, 2021)
-
Transkribus (n.d.). Transkribus. Retrieved from https://transkribus.eu/ (Jun. 18, 2021)
-
Transkribus Lite (n.d.a). Transkribus Lite. Retrieved from https://transkribus.eu/lite (Jun. 18, 2021)
-
Transkribus Lite (n.d.b). Collections. Retrieved from https://transkribus.eu/lite/collection/105313/doc/696703/detail/1 (Jul. 30, 2021)
-
Wikipedia (2021a). Optical character recognition. Retrieved from https://en.wikipedia.org/wiki/Optical_character_recognition (May 6, 2021)
-
Wikipedia (2021b). Word error rate. Retrieved from https://en.wikipedia.org/wiki/Word_error_rate (Jul. 15, 2021)
-
Wikipedia (2021c). Tesseract (software). Retrieved from https://en.wikipedia.org/wiki/Tesseract_(software) (May 7, 2021)
-
三菱總合研究所(2010)。全文テキスト化実証実験に係る調査及び評価支援等作業実証実験報告書。檢自 https://www.ndl.go.jp/jp/preservation/digitization/zenbun.pdf (Sep. 22, 2021)
-
中央研究院數位文化中心(n.d.)。核心技術。檢自 https://ascdc.sinica.edu.tw/technology (Aug. 30, 2022)
-
吳正己(2000)。教育大辭書:光學字元辨認 Optical Character Recognition, OCR。檢自 https://terms.naer.edu.tw/detail/1304360/?index=1 (May 4, 2021)
-
李佩瑛,程婉如(2009)。期刊報紙數位化工作流程指南。臺北市:數位典藏與數位學習國家型科技計畫拓展臺灣數位典藏計畫。
-
林裕淵,曾逸鴻(2007)。中文文件影像中之特殊字體偵測。科學與工程技術期刊,3(4),29-39。
-
青池亨(2019)。新たな検索機能提供のための調査研究活動-次世代デジタルライブラリーを中心とした近年の取組紹介。檢自 https://codh.repo.nii.ac.jp/?action=pages_view_main&active_action=repository_view_main_item_detail&item_id=374&item_no=1&page_id=30&block_id=41 (Sep. 22, 2021)
-
國立國會圖書館(2021)。広島県会社會社要覧昭和 15 年度版。檢自 https://lab.ndl.go.jp/dl/book/1032993?keyword=%E6%98%AD%E5%92%8C (Sep. 22, 2021)
-
國立國會圖書館(2010)。OCR 用いたデジタル画像の全文テキスト実施結果報告書。檢自 https://www.ndl.go.jp/jp/preservation/digitization/fulltextreport.html (Sep. 22, 2021)
-
陳建名,江文瑩(2016)。光學閱讀辨識系統在普查業務的應用概況。主計月刊,729,62-65。
-
蔡瀚緯(2017)。臺北市,國立政治大學。
-
賴敏軒(2011)。臺北市,國立臺灣師範大學。
-
顧力仁(2001)。中文古籍全文資料庫建置比較研究。國家圖書館館刊,90(2),197-216。
|