A Simple yet Effective Learnable Positional Encoding Method for Improving Document Transformer Model

Guoxin Wang; Yijuan Lu; Lei Cui; Tengchao Lv; Dinei Florencio; Cha Zhang

AACL 2022

Abstract

Positional encoding plays a key role in Transformer-based architectures: it indicates and embeds the sequential order of tokens. Understanding documents whose reading order is unreliable is a real challenge for document Transformer models. This paper proposes a simple and effective positional encoding method, learnable sinusoidal positional encoding (LSPE), which builds a learnable sinusoidal positional encoding feed-forward network. We apply LSPE to document Transformer models and pretrain them on document datasets, then finetune and evaluate them on document understanding tasks in the form, receipt, and invoice domains. Experimental results show that the proposed method not only outperforms other baselines but is also robust and stable when handling noisy data with incorrect order information.
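The abstract does not spell out the LSPE architecture, so the following is only a minimal sketch of the general idea it names: sinusoidal features of a position passed through a small feed-forward network whose weights would be learned. Everything here is an assumption made for illustration, not the authors' implementation: the class name LearnableSinusoidalPE, the helper names, the two-layer ReLU network, the layer sizes, the 1D integer position, and the choice to initialise the frequencies as in the standard fixed sinusoidal encoding. Only the forward pass is shown; training is omitted.

```python
import math
import random


def sinusoidal_features(position, frequencies):
    """Map a scalar position to [sin(w*p), cos(w*p)] for each frequency w."""
    feats = []
    for w in frequencies:
        feats.append(math.sin(w * position))
        feats.append(math.cos(w * position))
    return feats


def linear(x, weight, bias):
    """Dense layer; weight is a list of rows, one row per output unit."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
            for row, b in zip(weight, bias)]


class LearnableSinusoidalPE:
    """Hypothetical sketch: sinusoidal features followed by a small FFN.

    In a real model the frequencies and FFN weights would be trainable
    parameters updated by backpropagation; here they are plain lists.
    """

    def __init__(self, dim, hidden, seed=0):
        assert dim % 2 == 0, "dim must be even (sin/cos pairs)"
        rng = random.Random(seed)
        # Assumption: start from the standard fixed sinusoidal frequencies.
        self.frequencies = [10000.0 ** (-2.0 * i / dim) for i in range(dim // 2)]
        # Assumption: two-layer feed-forward network with small random weights.
        self.w1 = [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [[rng.gauss(0.0, 0.02) for _ in range(hidden)] for _ in range(dim)]
        self.b2 = [0.0] * dim

    def __call__(self, position):
        feats = sinusoidal_features(position, self.frequencies)
        hidden = [max(0.0, v) for v in linear(feats, self.w1, self.b1)]  # ReLU
        return linear(hidden, self.w2, self.b2)


if __name__ == "__main__":
    pe = LearnableSinusoidalPE(dim=8, hidden=16)
    # One positional vector per token position; these would be added to
    # (or combined with) the token embeddings in a Transformer.
    encodings = [pe(p) for p in range(4)]
    print(len(encodings), len(encodings[0]))
```

Because the mapping from position to embedding is a learned function of smooth sinusoidal features rather than a lookup table, nearby positions get related embeddings, which is one plausible reason such an encoding could tolerate noisy order information. That reading is an interpretation of the abstract, not a claim taken from the paper.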

Topics

Machine Learning > Core Methods > Representation Learning
Natural Language Processing > Resources & Methods > Text Representation
Artificial Intelligence > Core AI > Efficient Computing

Keywords

positional encoding, noisy data handling, document transformer, learnable encoding
