Title

運用深度學習建構高效率的行為辨識模型

Parallel title

Application of Deep Learning Technology to Construct Efficient Behavior Identification Models

Authors

高毅桓(Yi-Huan Kao);李維平(Wei-Ping Lee);林煦堯(Hsu-Yao Lin)

Keywords

Deep learning; 3D convolution; residual network; behavior recognition; video classification (深度學習; 三維卷積; 殘差網路; 行為辨識; 影像分類)

Journal

先進工程學刊 (Journal of Advanced Engineering)

Volume and issue / publication date

Vol. 17, No. 1 (2022/02/01)

Pages

1 - 8

Language

Traditional Chinese

Chinese abstract (English translation)

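Recognizing human behavior in video is a challenging and important task with a wide range of applications, such as abnormal-event detection in automated surveillance systems, sports analysis, and video classification. This study optimizes and modifies the 3D ResNet-18 model and proposes a simpler modular architecture with fewer hyperparameters. Results on the KTH and UCF-101 datasets show that the proposed algorithm reaches Top-1 accuracies of 96.3% and 60.01%, respectively. Compared with the 3D ResNet-18 model, the proposed model recognizes human behavior more accurately.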

English abstract

Recognizing human behavior in images is a challenging and important task. Image-based recognition of human behavior is gradually coming into everyday use, for example in detecting abnormal events in automatic monitoring systems, in sports analysis, and in video classification. This research optimizes and improves the 3D ResNet-18 model and proposes a simple modular architecture that requires less hyperparameter tuning. Experimental results on the KTH and UCF-101 datasets show that the improved algorithm achieves Top-1 accuracies of 96.3% and 60.01%, respectively. Compared with the original 3D ResNet-18 model, the improved model extracts more effective features and recognizes human behaviors better.
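
The Top-1 accuracy reported in the abstract is the fraction of test clips whose highest-scoring predicted class matches the ground-truth label. The sketch below illustrates that metric only. It is not the authors' evaluation code, and the function name `top1_accuracy` and the example scores and labels are hypothetical values chosen for illustration, not data from the paper.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the fraction of clips whose highest-scoring class equals the true label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one clip is required")
    correct = 0
    for clip_scores, label in zip(scores, labels):
        # Index of the largest score is the model's Top-1 prediction for this clip.
        predicted = max(range(len(clip_scores)), key=clip_scores.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Hypothetical class scores for four clips over three classes (not from the paper).
example_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
example_labels = [1, 0, 2, 2]

# Three of the four clips are classified correctly.
print(top1_accuracy(example_scores, example_labels))
```

Under this definition, a Top-1 accuracy of 96.3% on KTH means that the top prediction was correct for 96.3% of the evaluated clips.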

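Subject classification
Engineering > Engineering, Multidisciplinary
Engineering > Engineering, General
Engineering > Civil and Architectural Engineering
Engineering > Mechanical Engineering
Engineering > Chemical Industry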