AAAI 2025

Harnessing Language Model for Cross-Heterogeneity Graph Knowledge Transfer

Abstract

Heterogeneous graphs (HGs), which contain various node and edge types, are ubiquitous in real-world scenarios. Because label sparsity is a common problem in HGs, some researchers propose pretraining on source HGs to extract general knowledge and then fine-tuning on a target HG to transfer that knowledge. However, existing methods often assume that the source and target HGs share a single heterogeneity, meaning that they have the same types of nodes and edges. This assumption conflicts with real-world scenarios that require cross-heterogeneity transfer. Although a recent study has made preliminary attempts at cross-heterogeneity learning, its definition of general knowledge relies heavily on human knowledge, which lacks flexibility and leads to suboptimal transfer. To address this problem, we propose a novel Language Model-enhanced Cross-Heterogeneity learning model, namely LMCH. Specifically, we first design a metapath-based corpus construction method to unify HG representations as language. The corpora of source HGs are then used to fine-tune a pretrained Language Model (LM), enabling the LM to autonomously extract general knowledge across different HGs. Furthermore, to fully utilize the many unlabeled nodes in a few-labeled target HG, we propose an iterative training pipeline that uses an extra Graph Neural Network (GNN) predictor and is enhanced by LM-GNN contrastive alignment at the end of each iteration. Extensive experiments on four real-world datasets demonstrate the superior performance of LMCH over state-of-the-art methods.
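
The abstract does not specify how the metapath-based corpus is built, so the following is only a minimal sketch of one plausible reading: enumerate the node sequences that follow a given metapath and turn each sequence into a sentence. The toy graph, the node names and texts, the sentence template, and the helper names (`metapath_instances`, `verbalize`, `build_corpus`) are all assumptions made for illustration and are not taken from LMCH.

```python
from typing import Dict, List, Tuple

# Hypothetical toy heterogeneous graph (not from the paper).
NODE_TYPE: Dict[str, str] = {
    "a1": "author", "a2": "author",
    "p1": "paper", "p2": "paper",
}
NODE_TEXT: Dict[str, str] = {
    "a1": "Alice", "a2": "Bob",
    "p1": "Graph Learning", "p2": "Language Models",
}
# Typed adjacency lists: (source type, target type) -> node -> neighbors.
EDGES: Dict[Tuple[str, str], Dict[str, List[str]]] = {
    ("author", "paper"): {"a1": ["p1", "p2"], "a2": ["p1"]},
    ("paper", "author"): {"p1": ["a1", "a2"], "p2": ["a1"]},
}


def metapath_instances(start: str, metapath: List[str]) -> List[List[str]]:
    """Enumerate node sequences from `start` whose types follow `metapath`."""
    if NODE_TYPE[start] != metapath[0]:
        return []
    paths = [[start]]
    for src_t, dst_t in zip(metapath, metapath[1:]):
        extended = []
        for path in paths:
            for nb in EDGES.get((src_t, dst_t), {}).get(path[-1], []):
                extended.append(path + [nb])
        paths = extended
    return paths


def verbalize(path: List[str]) -> str:
    """Turn one metapath instance into a sentence (assumed template)."""
    parts = [f"{NODE_TYPE[n]} {NODE_TEXT[n]}" for n in path]
    sentence = parts[0]
    for i, part in enumerate(parts[1:]):
        sentence += (" links to " if i == 0 else ", which links to ") + part
    return sentence + "."


def build_corpus(metapath: List[str]) -> List[str]:
    """Collect one sentence per metapath instance, over all start nodes."""
    corpus = []
    for node in sorted(NODE_TYPE):
        for path in metapath_instances(node, metapath):
            corpus.append(verbalize(path))
    return corpus


if __name__ == "__main__":
    for line in build_corpus(["author", "paper", "author"]):
        print(line)
```

Under this reading, graphs with different node and edge types all reduce to plain sentences, which is what would let a single LM be fine-tuned on corpora from several source HGs. A real pipeline would likely sample metapath instances rather than enumerate them exhaustively, since the number of instances grows quickly with metapath length.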
