SKE-Layout: Spatial Knowledge Enhanced Layout Generation with LLMs

Junsheng Wang; Nieqing Cao; Yan Ding; Mengying Xie; Fuqiang Gu; Chao Chen

CVPR 2025

Abstract

Generating layouts from textual descriptions with large language models (LLMs) plays a crucial role in domains that require precise spatial reasoning, such as robotic object rearrangement and text-to-image generation. However, current methods struggle with the scarcity of real-world examples, the diversity of layout descriptions, and varying levels of granularity. To address these issues, a novel framework named Spatial Knowledge Enhanced Layout (SKE-Layout) is introduced. SKE-Layout integrates mixed spatial knowledge sources, leveraging both real and synthetic data to enrich spatial context. It uses diverse representations tailored to specific tasks and employs contrastive learning and multitask learning for accurate spatial knowledge retrieval. The framework generates more accurate and fine-grained visual layouts for object rearrangement and text-to-image generation tasks, achieving improvements of 5%-30% over existing methods.
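As a rough illustration of what contrastive learning for knowledge retrieval can look like, the sketch below pairs an InfoNCE-style loss with a cosine-similarity retriever. The abstract does not specify the encoder, loss, or data format used by SKE-Layout, so everything here is an assumption: the InfoNCE objective is a common stand-in, and the names `cosine`, `info_nce_loss`, `retrieve`, and the toy two-dimensional embeddings are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(query, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss (assumed here, not taken from the paper).

    The loss is small when the query embedding is closer to the positive
    (matching) knowledge entry than to any of the negative entries.
    """
    candidates = [positive] + list(negatives)
    logits = [cosine(query, c) / temperature for c in candidates]
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-softmax probability assigned to the positive (index 0).
    return log_sum_exp - logits[0]


def retrieve(query, knowledge_base, k=1):
    """Return the names of the k entries most similar to the query.

    knowledge_base is a list of (name, embedding) pairs; the structure
    is hypothetical and only serves this example.
    """
    ranked = sorted(
        knowledge_base,
        key=lambda item: cosine(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]
```

Under this setup, training would lower `info_nce_loss` for description and layout-example pairs that belong together, so that `retrieve` later surfaces the most relevant spatial examples to place in the LLM prompt.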


Topics

Artificial Intelligence > Core AI > Foundation Models
Artificial Intelligence > Core AI > Large Language Models
Artificial Intelligence > Core AI > Reasoning
Machine Learning > Learning Types > Contrastive Learning
Machine Learning > Learning Paradigms > Multi-Task Learning
Natural Language Processing > Generation > Text Generation

Keywords

contrastive learning, multitask learning, spatial reasoning, layout generation, large language model, object rearrangement, spatial knowledge, spatial knowledge enhancement
