K-ON: Stacking Knowledge on the Head Layer of Large Language Model

Lingbing Guo; Yichi Zhang; Zhongpu Bo; Zhuo Chen; Mengshu Sun; Zhiqiang Zhang; Wen Zhang; Huajun Chen

2025 AAAI AAAI 2025

K-ON: Stacking Knowledge on the Head Layer of Large Language Model

Abstract

Abstract Recent advancements in large language models (LLMs) have significantly improved various natural language processing (NLP) tasks. Typically, LLMs are trained to predict the next token, aligning well with many NLP tasks. However, in knowledge graph (KG) scenarios, entities are the fundamental units and identifying an entity requires at least several tokens. This leads to a granularity mismatch between KGs and natural languages. To address this issue, we propose K-ON, which integrates KG knowledge into the LLM by employing multiple head layers for next k-step prediction. K-ON can not only generate entity-level results in one step, but also enables contrastive loss against entities, which is the most powerful tool in KG representation learning. Experimental results show that K-ON outperforms state-of-the-art methods that incorporate text and even the other modalities.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Natural Language Processing

🧭 Keyword Pioneer — entity-level prediction

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Lingbing Guo , Yichi Zhang , Zhongpu Bo , Zhuo Chen , Mengshu Sun , Zhiqiang Zhang , Wen Zhang , Huajun Chen

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning Machine Learning > Learning Types > Contrastive Learning Natural Language Processing > Resources & Methods > Large Language Models

Keywords

contrastive learning knowledge transfer knowledge graph large language model entity-level prediction

Download PDF

Related papers

BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving 2025

APIRL: Deep Reinforcement Learning for REST API Fuzzing 2025

Anywhere: A Multi-Agent Framework for User-Guided, Reliable, and Diverse Foreground-Conditioned Image Generation 2025

3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly Detection 2025

Collaborative Learning for 3D Hand-Object Reconstruction and Compositional Action Recognition from Egocentric RGB Videos Using Superquadrics 2025