Abstract
A multi-model fusion method based on LayoutLM+BiLSTM+CRF is proposed to address the problems of weak semantic information in documents and of semantic alignment across modalities. LayoutLM not only obtains feature representations with prior semantic knowledge from large unlabeled document collections but also fully exploits the structure and layout information of text sequences; a BiLSTM+CRF layer then performs entity recognition. On the FUNSD dataset, the results show that this method outperforms HMM, CRF, BiLSTM+CRF, and BERT. Relative to CRF, its F1 score increases by 27.76%, indicating that the method effectively improves document entity recognition.
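The pipeline described above can be sketched roughly as follows. This is a minimal sketch, not the authors' implementation: it assumes the Hugging Face transformers LayoutLMModel with the microsoft/layoutlm-base-uncased checkpoint and the pytorch-crf package, and the hyperparameters (e.g., lstm_hidden=256) are illustrative.

```python
# Minimal sketch of a LayoutLM -> BiLSTM -> CRF tagger (assumptions noted above).
import torch
import torch.nn as nn
from transformers import LayoutLMModel  # assumed encoder, not from the paper
from torchcrf import CRF                # pytorch-crf package, assumed CRF layer


class LayoutLMBiLSTMCRF(nn.Module):
    def __init__(self, num_labels: int, lstm_hidden: int = 256):
        super().__init__()
        # Pre-trained encoder: joint text + 2-D layout (bounding-box) embeddings.
        self.encoder = LayoutLMModel.from_pretrained(
            "microsoft/layoutlm-base-uncased")
        # BiLSTM re-contextualizes encoder states for sequence labeling.
        self.bilstm = nn.LSTM(
            self.encoder.config.hidden_size, lstm_hidden,
            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * lstm_hidden, num_labels)
        # CRF models label-transition structure (e.g., BIO constraints).
        self.crf = CRF(num_labels, batch_first=True)

    def forward(self, input_ids, bbox, attention_mask, labels=None):
        hidden = self.encoder(input_ids=input_ids, bbox=bbox,
                              attention_mask=attention_mask).last_hidden_state
        lstm_out, _ = self.bilstm(hidden)
        emissions = self.classifier(lstm_out)
        mask = attention_mask.bool()
        if labels is not None:
            # Training: negative log-likelihood of the gold label sequence.
            return -self.crf(emissions, labels, mask=mask, reduction="mean")
        # Inference: Viterbi decoding of the most likely label sequence.
        return self.crf.decode(emissions, mask=mask)
```

The CRF on top of the BiLSTM emissions is what distinguishes this head from per-token argmax classification: decoding searches for the globally best label sequence rather than choosing each tag independently.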
References
- Rusinol M, Benkhelfallah T, Poulain d'Andecy V. Field extraction from administrative documents by incremental structural templates[C]//2013 12th International Conference on Document Analysis and Recognition. IEEE, 2013: 1100-1104.
- Xu Y, Li M, Cui L, et al. LayoutLM: Pre-training of text and layout for document image understanding[C]//Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2020: 1192-1200.
- Zhou P, Shi W, Tian J, et al. Attention-based bidirectional long short-term memory networks for relation classification[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2016: 207-212.
- Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
- Dobruschin P L. The description of a random field by means of conditional probabilities and conditions of its regularity[J]. Theory of Probability & Its Applications, 1968, 13(2): 197-224.
- Jaume G, Ekenel H K, Thiran J P. FUNSD: A dataset for form understanding in noisy scanned documents[C]//2019 International Conference on Document Analysis and Recognition Workshops (ICDARW). IEEE, 2019, 2: 1-6.
- Yadav V, Bethard S. A survey on recent advances in named entity recognition from deep learning models[J]. arXiv preprint arXiv:1910.11470, 2019.
- Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[J]. arXiv preprint arXiv:1810.04805, 2018.
- Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems. 2017: 5998-6008.
- Medsker L R, Jain L C. Recurrent neural networks[J]. Design and Applications, 2001, 5: 64-67.