ACL 2025
RUC Team at SemEval-2025 Task 5: Fast Automated Subject Indexing: A Method Based on Similar Records Matching and Related Subject Ranking
Abstract
This paper presents MaRSI, an automatic subject indexing method designed to address the limitations of both traditional manual indexing and emerging GenAI technologies. To improve indexing accuracy in cross-lingual contexts and to balance efficiency against accuracy on large-scale datasets, MaRSI mimics the way human indexers learn from reference examples by constructing semantic indexes from pre-indexed documents. It computes similarity to retrieve relevant reference records, then merges and reranks their subjects to generate the indexing result. Experiments demonstrate that MaRSI outperforms supervised fine-tuning of LLMs on the same dataset, offering advantages in speed, effectiveness, and interpretability.
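The retrieve, merge, and rerank pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`embed`, `cosine`, `suggest_subjects`), the toy records, and the similarity-weighted voting scheme are all assumptions made for illustration, and a bag-of-words vector stands in for whatever semantic encoder MaRSI actually uses.

```python
import math
from collections import Counter, defaultdict


def embed(text):
    """Toy stand-in for a semantic encoder (assumption): lowercase token counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def suggest_subjects(query, records, k=2, top_n=3):
    """Retrieve the k most similar pre-indexed records, merge their subjects,
    and rank each subject by the summed similarity of the records carrying it.
    The voting scheme is a hypothetical choice, not taken from the paper."""
    q = embed(query)
    scored = [(cosine(q, embed(text)), subjects) for text, subjects in records]
    neighbours = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]

    votes = defaultdict(float)
    for similarity, subjects in neighbours:
        for subject in subjects:
            votes[subject] += similarity

    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(votes.items(), key=lambda pair: (-pair[1], pair[0]))
    return [subject for subject, score in ranked[:top_n] if score > 0]


# Hypothetical pre-indexed records: (text, assigned subjects).
RECORDS = [
    ("neural machine translation for low resource languages",
     ["machine translation", "neural networks"]),
    ("statistical methods for machine translation evaluation",
     ["machine translation", "evaluation"]),
    ("medieval church architecture in northern europe",
     ["architecture", "middle ages"]),
]

print(suggest_subjects("neural networks for machine translation", RECORDS))
```

Because every suggested subject traces back to specific retrieved records and their similarity scores, a sketch of this shape also hints at where the interpretability claimed in the abstract could come from.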
Topics
Artificial Intelligence > Core AI > Interpretability
Machine Learning > Core Methods > Classification
Machine Learning > Core Methods > Representation Learning
Machine Learning > Core Methods > Metric Learning
Machine Learning > Learning Types > Semi-Supervised Learning
Machine Learning > Application Areas > Domain Adaptation
Natural Language Processing > Applications > Information Retrieval
Natural Language Processing > Applications > Text Classification
Keywords
text classification
information retrieval
document retrieval
semantic indexing
semantic similarity
semantic retrieval
cross-lingual retrieval
similarity matching
subject indexing
reference retrieval
automated subject indexing
reference learning
record matching
automated indexing
subject ranking
similar records matching
record similarity
related subject ranking