Title

Implementation and Evaluation of a Retrieval-based Chinese Humor Chatbot

Parallel Title

基於檢索方法的中文幽默對話系統之建置應用與評估

DOI

10.6182/jlis.202012_18(2).073

Authors

曾元顯(Yuen-Hsien Tseng);許瑋倫(Wei-Lun Hsu);吳玟萱(Wun-Syuan Wu);古怡巧(Yi-Ciao Gu);陳學志(Hsueh-Chih Chen)

Keywords

Computational Humor ; Chinese Humorous Dialogue ; Humor Corpus ; Dialogue System ; Icebreaker Chatbot ; 計算幽默 ; 中文幽默對話 ; 幽默語料 ; 對話系統 ; 破冰機器人

Journal

圖書資訊學刊 (Journal of Library and Information Studies)

Volume and Issue / Publication Date

Vol. 18, No. 2 (2020/12/01)

Pages

73 - 101

Language

English; Traditional Chinese

English Abstract

This research constructs a humor corpus, develops related technologies, and implements a retrieval-based "icebreaker chatbot" that lets users find relevant jokes to relax an unduly formal atmosphere when interacting with people; it then evaluates the system's effectiveness. Following the iterative steps of the information system development research method, Word2Vec-based query expansion, frequent-keyword prompts, and random recommendation of good jokes were added in response to user feedback. As a result, the proportion of user queries that failed to find a joke fell from 25.4% to 16.7%, and the icebreaking effect achieved rose from 27.9% to 39.9%. This research not only compiled a corpus of nearly 5,000 Chinese jokes but also built a Chinese humor dialogue system; both have been released at https://github.com/SamTseng/icebreaker for future use and verification. The empirical experience and implications of this study are that automatically enriching the joke corpus while ensuring its humor quality, and providing a recommendation service, are important R&D efforts for improving the effectiveness of such services.

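Chinese Abstract (English translation)

This study develops a humor corpus and related technologies and implements a retrieval-based "icebreaker chatbot" system that lets users find relevant jokes through dialogue, so as to break a chilly atmosphere and liven up unfamiliar or awkward situations when interacting with others; it then evaluates how well the system works. Following the iterative steps of the information system development research method, and after user feedback, Word2Vec query expansion, keyword query prompts, and random recommendation of good jokes were added. These changes reduced the proportion of cases in which users could not find a joke from 25.4% to 16.7% and raised the icebreaking effect achieved by the system from 27.9% to 39.9%. Overall, this study not only collected and compiled a Traditional Chinese humor corpus of nearly 5,000 jokes but also built a Chinese humor dialogue application system; the corpus and code are publicly available at https://github.com/SamTseng/icebreaker. The conclusions offer empirical experience and implications: automatically enriching the joke corpus while ensuring its level of humor, and providing recommendation functions, are important R&D tasks for improving the effectiveness of such services.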
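
The retrieval flow summarized in the abstract (keyword lookup, Word2Vec-style query expansion when nothing matches, and a randomly recommended good joke as a last resort) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy jokes, the hand-made three-dimensional vectors that stand in for a trained Word2Vec model, the similarity threshold, and the names `expand_query` and `find_jokes` are all assumptions made for illustration.

```python
import math
import random

# Hypothetical toy corpus; the real corpus of nearly 5,000 jokes is not reproduced here.
JOKES = [
    {"text": "Why did the cat sit on the computer? To keep an eye on the mouse.",
     "keywords": {"cat", "computer", "mouse"}, "good": True},
    {"text": "Why did the teacher wear sunglasses? Her students were so bright.",
     "keywords": {"teacher", "students"}, "good": False},
]

# Hand-made vectors standing in for a trained Word2Vec model (assumption).
VECTORS = {
    "cat": (0.90, 0.10, 0.00),
    "kitten": (0.85, 0.15, 0.05),
    "computer": (0.10, 0.90, 0.10),
    "laptop": (0.15, 0.85, 0.10),
    "teacher": (0.00, 0.10, 0.90),
    "professor": (0.05, 0.15, 0.90),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def expand_query(terms, top_n=2, min_sim=0.95):
    """Add up to top_n nearest neighbours of each known term to the query."""
    expanded = set(terms)
    for term in terms:
        if term not in VECTORS:
            continue  # out-of-vocabulary terms cannot be expanded
        neighbours = sorted(
            ((cosine(VECTORS[term], vec), word)
             for word, vec in VECTORS.items() if word != term),
            reverse=True,
        )
        expanded.update(word for sim, word in neighbours[:top_n] if sim >= min_sim)
    return expanded


def find_jokes(query, rng=random):
    """Return (jokes, mode), where mode is 'exact', 'expanded', or 'recommended'."""
    terms = set(query.lower().split())
    hits = [joke for joke in JOKES if joke["keywords"] & terms]
    if hits:
        return hits, "exact"
    expanded = expand_query(terms)
    hits = [joke for joke in JOKES if joke["keywords"] & expanded]
    if hits:
        return hits, "expanded"
    # Nothing matched even after expansion: recommend a random good joke.
    good = [joke for joke in JOKES if joke["good"]]
    return [rng.choice(good)], "recommended"


if __name__ == "__main__":
    for query in ("cat", "kitten", "weather"):
        jokes, mode = find_jokes(query)
        print(query, "->", mode, "->", jokes[0]["text"])
```

In this sketch a query such as "kitten" has no direct keyword match but reaches the cat joke through expansion, while an out-of-vocabulary query such as "weather" falls through to the recommendation step. In a real system the neighbour lookup would presumably come from a trained embedding model rather than a hand-written table.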

Subject Classification: Humanities > Library and Information Science
Cited by
  1. 黃淑齡,王昱鈞(2023)。詞嵌入應用於佛學研究—兼論詞嵌入模型評估。數位典藏與數位人文,12,43-82。