Large Sequence Representation Learning via Multi-Stage Latent Transformers

Ionut-Catalin Sandu; Daniel Voinea; Alin-Ionut Popa

COLING 2022

Abstract

We present LANTERN, a multi-stage transformer architecture for named-entity recognition (NER) designed to operate on arbitrarily long text sequences (i.e., more than 512 elements). Given an image of a form with structured text, our method uses language and spatial features to predict the entity tag of each text element. It avoids the quadratic computational cost of the attention mechanism by operating over a learned latent-space representation that encodes the input sequence via cross-attention, while the multi-stage encoding component refines the NER predictions. As a proxy task, we propose RADAR, an LSTM classifier operating at the character level, which predicts the relevance of a word with respect to the entity-recognition task. Additionally, we formulate a challenging novel NER use case: nutritional information extraction from food product labels. We created the TREAT dataset, which contains 11,926 images of food product labels with fully detailed annotations. On the proposed TREAT dataset, our method outperforms two competitive models designed for long sequences.
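To make the cost argument in the abstract concrete, the following is a minimal sketch of latent cross-attention in plain Python. It is not the LANTERN implementation: the single-head attention without learned projections, the random stand-in features, the sizes (`n_tokens`, `n_latents`, `dim`), and the decode step that maps latents back to per-token vectors are all assumptions made for illustration. The point is only that the score matrix has shape `n_latents x n_tokens`, so the cost grows linearly with sequence length for a fixed number of latents, instead of quadratically as in full self-attention.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def cross_attention(queries, keys_values):
    """Single-head attention without learned projections (illustrative only).

    queries is m x d, keys_values is n x d.
    Returns the m x d output and the m x n attention weights.
    """
    d = len(queries[0])
    scores = matmul(queries, transpose(keys_values))  # m x n
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, keys_values), weights


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_tokens, n_latents, dim = 1024, 16, 8        # hypothetical sizes
tokens = random_matrix(n_tokens, dim, rng)    # stand-in for token features
latents = random_matrix(n_latents, dim, rng)  # stand-in for a learned latent array

# Encode: the latents attend to the tokens (score matrix is 16 x 1024).
encoded, enc_weights = cross_attention(latents, tokens)
# Decode (assumed step): each token attends to the encoded latents.
decoded, dec_weights = cross_attention(tokens, encoded)

print(len(enc_weights), len(enc_weights[0]))  # 16 1024
print(len(decoded), len(decoded[0]))          # 1024 8
```

With these sizes the encode step computes 16 x 1024 scores, whereas full self-attention over the same tokens would compute 1024 x 1024. In a trained model the latent array and the query, key, and value projections would be learned parameters.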

Topics

Deep Learning > Architectures > Transformers
Computer Vision > Domain-Specific > Medical Imaging
Artificial Intelligence > Core AI > Large Language Models
Natural Language Processing > Applications > Named Entity Recognition

Keywords

named entity recognition, latent space, entity extraction, long sequence, latent transformer, multi-stage encoding, nutritional information extraction
