RecBase: Generative Foundation Model Pretraining for Zero-Shot Recommendation

Sashuai Zhou; Weinan Gan; Qijiong Liu; Ke Lei; Jieming Zhu; Hai Huang; Yan Xia; Ruiming Tang; Zhenhua Dong; Zhou Zhao

EMNLP 2025

Abstract

Recent advances in LLM-based recommendation have shown promise, yet their cross-domain generalization is hindered by a fundamental mismatch between language-centric pretraining and the recommendation task. Existing methods, relying on language-level knowledge, fail to capture dynamic, item-level user interests across domains. To bridge this gap, we propose RecBase, a domain-agnostic foundational model pretrained with a recommendation-oriented objective. RecBase leverages a large-scale, heterogeneous, cross-domain corpus with unified textual representations and feature mappings to enhance cross-domain generalization. To further align item semantics across domains, we introduce a unified item tokenizer that encodes items into hierarchical concept identifiers, enabling structured representation and efficient vocabulary sharing. The model is trained using an autoregressive objective to capture complex item-level sequential patterns. On eight real-world datasets, our 1.5B-parameter model matches or surpasses the performance of LLM baselines up to 7B parameters in zero-shot and cross-domain recommendation tasks.
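The abstract does not say how the unified item tokenizer derives its hierarchical concept identifiers. As a rough illustration of the idea only, the sketch below assumes a residual-quantization scheme: each level picks the nearest entry in a small codebook, subtracts it, and passes the remainder to the next level. The codebooks, the two-dimensional embeddings, the function names, and the `<L0_1>` token format are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]

# Hypothetical hand-made codebooks (assumption, not from the paper):
# level 0 separates coarse concepts, level 1 refines within a concept.
CODEBOOKS: List[List[List[float]]] = [
    [[0.0, 0.0], [10.0, 10.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]


def nearest(codebook: Sequence[Vector], v: Vector) -> int:
    """Return the index of the codebook entry closest to v (squared L2)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], v)),
    )


def hierarchical_ids(embedding: Vector,
                     codebooks: Sequence[Sequence[Vector]]) -> Tuple[int, ...]:
    """Quantize an item embedding level by level on the running residual."""
    residual = list(embedding)
    ids = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        ids.append(idx)
        # Keep only what this level failed to explain for the next level.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(ids)


def to_tokens(ids: Sequence[int]) -> List[str]:
    """Render level-wise ids as vocabulary tokens (format is an assumption)."""
    return [f"<L{level}_{idx}>" for level, idx in enumerate(ids)]


item_a = hierarchical_ids([10.2, 10.9], CODEBOOKS)  # coarse concept 1
item_b = hierarchical_ids([0.9, 0.1], CODEBOOKS)    # coarse concept 0
item_c = hierarchical_ids([10.8, 10.1], CODEBOOKS)  # coarse concept 1
print(to_tokens(item_a), to_tokens(item_b), to_tokens(item_c))
```

In this toy setup, items A and C share their first-level token while differing at the second level, which is one way a small shared vocabulary could cover items from different domains. An autoregressive model would then predict the next item one level token at a time.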

Topics

Artificial Intelligence > Core AI > Foundation Models
Artificial Intelligence > Learning Paradigms > Transfer Learning
Artificial Intelligence > Learning Paradigms > Zero-Shot Learning

Keywords

zero-shot learning, autoregressive model, foundation model, recommendation system, cross-domain transfer, item tokenizer
