LLM-Enabled Scientific Knowledge Diffusion Analysis

Uttam Rao; Madhav Marathe

AAAI 2026

Abstract

Bibliometric and science-of-science studies have yielded valuable insights into co-authorship and citation networks, yet most analyses rely on static datasets and limited relation types. We introduce a multi-agent AI architecture that orchestrates specialized large language model (LLM) agents (ingestion, extraction, disambiguation, integration, and analysis) to build and query a comprehensive knowledge graph. Ingestion agents unify data from diverse sources such as OpenAlex, ORCID, ROR, USPTO, and custom web scrapers. Extraction agents harness LLMs to parse unstructured text. Disambiguation agents combine rule-based heuristics with LLM reasoning to resolve ambiguous authors and institutions. Integration agents assemble and cache a provenance-rich graph. An analysis agent translates natural language questions into graph queries and interprets results. This end-to-end pipeline produces a rich graph schema spanning authors, institutions, publications, patents, grants, topics, and temporal relations. Researcher mobility and knowledge diffusion are then modeled as timed automata, where each researcher node’s institutional transitions and accumulated attributes (such as publications, collaborators, and topic expertise) enable dynamic temporal reasoning. Results show that our multi-agent, graph-based system consistently outperforms standalone LLMs and research agents on complex temporal queries, entity disambiguation accuracy, and cross-entity reasoning while maintaining competitive efficiency. These capabilities position the system as a foundation for real-time, LLM-assisted knowledge analysis platforms that can support science policy, research evaluation, and meta-scientific inquiry.
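The temporal modeling described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it is a simplified stand-in for a timed automaton in which states are institutions, transitions are stamped with a year, and publications accumulate as dated attributes. The class `ResearcherTimeline`, the helper `topics_carried`, and all example names and years are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class ResearcherTimeline:
    """Hypothetical record: institutional moves plus dated publications."""
    name: str
    moves: list = field(default_factory=list)         # (start_year, institution)
    publications: list = field(default_factory=list)  # (year, topic)

    def add_move(self, year, institution):
        # A transition into `institution`, taken at time `year`.
        self.moves.append((year, institution))
        self.moves.sort()

    def add_publication(self, year, topic):
        self.publications.append((year, topic))

    def institution_at(self, year):
        """Institution active in `year`, or None before the first move."""
        years = [y for y, _ in self.moves]
        i = bisect_right(years, year)
        return self.moves[i - 1][1] if i else None

    def topics_by(self, year):
        """Topics accumulated up to and including `year`."""
        return {t for y, t in self.publications if y <= year}


def topics_carried(researcher, institution):
    """Topics published on before the researcher arrived at `institution`."""
    for start, inst in researcher.moves:
        if inst == institution:
            return researcher.topics_by(start - 1)
    return set()


# Made-up example data, for illustration only.
r = ResearcherTimeline("A. Example")
r.add_move(2010, "University X")
r.add_move(2016, "Institute Y")
r.add_publication(2012, "graph mining")
r.add_publication(2018, "LLM agents")

print(r.institution_at(2015))
print(topics_carried(r, "Institute Y"))
```

Under these assumptions, a question such as "which expertise did this researcher bring to Institute Y?" reduces to a lookup over timestamped transitions, which is the kind of temporal query the abstract attributes to the graph-based system.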

Topics

Artificial Intelligence > Core AI > Agent Systems
Artificial Intelligence > Core AI > Foundation Models
Artificial Intelligence > Core AI > Multi-Agent Systems
Machine Learning > Core Methods > Representation Learning
Natural Language Processing > Resources & Methods > Large Language Models

Keywords

entity disambiguation, knowledge graph, scientific knowledge, large language model, multi-agent system, bibliometric analysis, knowledge diffusion
