Text Representation

2246 directly classified papers

Papers

Agnus LLM: Robust and Flexible Entity Disambiguation with decoder-only Language Models IJCNLP 2025

Revisiting Word Embeddings in the LLM Era AACL 2025

On the Correspondence between the Squared Norm and Information Content in Text Embeddings EMNLP 2025

Agnus LLM: Robust and Flexible Entity Disambiguation with decoder-only Language Models AACL 2025

Rethinking Tokenization for Rich Morphology: The Dominance of Unigram over BPE and Morphological Alignment IJCNLP 2025

Egalitarian Language Representation in Language Models: It All Begins with Tokenizers COLING 2025

Comparable Corpora: Opportunities for New Research Directions COLING 2025

IndoMorph: a Morphology Engine for Indonesian COLING 2025

Cyber Protectors@DravidianLangTech 2025: Abusive Tamil and Malayalam Text Targeting Women on Social Media using FastText NAACL 2025

The iRead4Skills Intelligent Complexity Analyzer EMNLP 2025

Same Question, Different Words: A Latent Adversarial Framework for Prompt Robustness EMNLP 2025

Whose Palestine Is It? A Topic Modelling Approach to National Framing in Academic Research EMNLP 2025

Broken Words, Broken Performance: Effect of Tokenization on Performance of LLMs AACL 2025

UWBa at SemEval-2025 Task 7: Multilingual and Crosslingual Fact-Checked Claim Retrieval SEMEVAL 2025

ChuenSumi at SemEval-2025 Task 1: Sentence Transformer Models and Processing Idiomacity SEMEVAL 2025

Generating Text from Uniform Meaning Representation IJCNLP 2025

LangSAMP: Language-Script Aware Multilingual Pretraining ACL 2025

Sinhala Encoder-only Language Models and Evaluation ACL 2025

Exploring morphology-aware tokenization: A case study on Spanish language modeling EMNLP 2025

PosterSum: A Multimodal Benchmark for Scientific Poster Summarization IJCNLP 2025

Mapping semantic networks to Dutch word embeddings as a diagnostic tool for cognitive decline EMNLP 2025

ParsiPy: NLP Toolkit for Historical Persian Texts in Python NAACL 2025

Exploring the Integration of Eye Movement Data on Word Embeddings NAACL 2025

Py-Elotl: A Python NLP package for the languages of Mexico NAACL 2025

Learning Word Embeddings from Glosses: A Multi-Loss Framework for Arabic Reverse Dictionary Tasks EMNLP 2025