English Abstract
This research explores the use of the GPT-2 deep learning model for economic news generation and evaluation. After fine-tuning GPT-2 on about 300,000 news articles totaling 150 million words, 15 news articles were generated with the model. Together with 15 real news articles written by journalists, the 30 articles were given to 12 subjects, who rated the credibility of each on a 1-to-5 scale. The 8 subjects with degrees in economics-related fields were more capable of distinguishing the human-composed news (HCN) from the computer-generated news (CGN), while the 4 subjects from non-economics backgrounds discriminated poorly, and one was unable to tell the HCN from the CGN at all. Among the 15 HCN articles, 1 was rated as non-genuine news, with an average credibility of 2.92 (below 3), owing to weak logic and strong subjectivity. Among the 15 CGN articles, 2 were rated as genuine news, with an average credibility of 3.33 (above 3), because their content was reasonable and their details logically consistent. Comparing these two articles with the training corpus shows that the computer's ability to substitute and retouch text can deceive professionals. Most of the CGN articles, however, were spotted, mainly because of obvious factual flaws and incorrect figures such as dates and stock codes. The research also explores automatic detection of computer-generated news with a BERT-based neural network model. BERT made only 2 false predictions on the 30 articles above, outperforming the collective prediction of the 12 subjects, which contained 5 errors. Further large-scale experiments show that BERT can reach an F-score of 0.96.
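The F-score cited for BERT is the harmonic mean of precision and recall. A minimal sketch of the metric follows; the counts are hypothetical, chosen only to illustrate how an F-score of 0.96 can arise, and are not taken from the study's data:

```python
# F1 score: harmonic mean of precision and recall, the metric used to
# evaluate the BERT-based detector. All counts here are hypothetical.

def f1_score(tp: int, fp: int, fn: int) -> float:
    """Compute F1 from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp)  # fraction of flagged articles that are truly generated
    recall = tp / (tp + fn)     # fraction of generated articles that were flagged
    return 2 * precision * recall / (precision + recall)

# Example: a detector that catches 24 of 25 generated articles (1 missed)
# and raises 1 false alarm on human-written news.
score = f1_score(tp=24, fp=1, fn=1)
print(round(score, 2))  # 0.96
```

With precision and recall both at 24/25 = 0.96, the harmonic mean is likewise 0.96, matching the reported figure.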
References
- 楊德倫、曾元顯(2020)。建置與評估文字自動生成的情感對話系統。教育資料與圖書館學,57(3),355-378。doi: 10.6120/JoEMLS.202011_57(3).0048.RS.CM【Yang, T.-L., & Tseng, Y.-H. (2020). Development and evaluation of emotional conversation system based on automated text generation. Journal of Educational Media & Library Sciences, 57(3), 355-378. doi: 10.6120/JoEMLS.202011_57(3).0048.RS.CM (in Chinese)】
- Allcott, H., & Gentzkow, M. (2017). Social media and fake news in the 2016 election. Journal of Economic Perspectives, 31(2), 211-236. doi: 10.1257/jep.31.2.211
- Conroy, N. K., Rubin, V. L., & Chen, Y. (2015). Automatic deception detection: Methods for finding fake news. In Proceedings of the Association for Information Science and Technology (pp. 1-4). St. Louis, MO: Association for Information Science and Technology. doi: 10.1002/pra2.2015.145052010082
- Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780. doi: 10.1162/neco.1997.9.8.1735
- LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444. doi: 10.1038/nature14539
- Ruchansky, N., Seo, S., & Liu, Y. (2017). CSI: A hybrid deep model for fake news detection. In E.-P. Lim & M. Winslett (Eds.), Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (pp. 797-806). New York, NY: Association for Computing Machinery. doi: 10.1145/3132847.3132877
- Shu, K., Cui, L., Wang, S., Lee, D., & Liu, H. (2019). dEFEND: Explainable fake news detection. In A. Teredesai & V. Kumar (Eds.), Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 395-405). New York, NY: Association for Computing Machinery. doi: 10.1145/3292500.3330935
- Shu, K., Sliva, A., Wang, S., Tang, J., & Liu, H. (2017). Fake news detection on social media: A data mining perspective. ACM SIGKDD Explorations Newsletter, 19(1), 22-36. doi: 10.1145/3137597.3137600
- Turing, A. M. (1950). Computing machinery and intelligence. Mind, LIX(236), 433-460. doi: 10.1093/mind/LIX.236.433
- 賴志遠(2018)。國際人工智慧政策推動現況。檢自https://portal.stpi.narl.org.tw/index/article/10418【Lai, Z.-Y. (2018). Current status of international artificial intelligence policy promotion. Retrieved from https://portal.stpi.narl.org.tw/index/article/10418 (in Chinese)】
- Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., … Amodei, D. (2020). Language models are few-shot learners. Retrieved from https://arxiv.org/abs/2005.14165
- ByteDance. (2019). WSDM - Fake news classification. Retrieved from https://www.kaggle.com/c/fake-news-pair-classification-challenge/
- Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. Retrieved from https://arxiv.org/pdf/1810.04805.pdf
- Du, Z., Chiu, H., hhou453, & lemon234071 (2019). GPT2-Chinese: Tools for training GPT2 model in Chinese language. Retrieved from https://github.com/Morizeyao/GPT2-Chinese
- Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Paper presented at the International Conference on Neural Information Processing Systems, Lake Tahoe, NV.
- McCarthy, J. (2007). Artificial intelligence. Retrieved from http://jmc.stanford.edu/artificial-intelligence/index.html
- Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training. Retrieved from https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf
- Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. Retrieved from https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
- Russell, S., & Norvig, P. (2009). Artificial intelligence: A modern approach (3rd ed.). Upper Saddle River, NJ: Prentice Hall.
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … Polosukhin, I. (2017). Attention is all you need. Paper presented at the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA. Retrieved from https://arxiv.org/abs/1706.03762