AAAI 2025

CollageNoter: Real-Time and Adaptive Collage Layout Design for Screenshot-Based E-Note-Taking

Abstract

To ease the processing of complex multi-modal documents (e.g., e-books and long web pages), users can efficiently take digital screenshots of key parts and reorganize them into a new collage E-Note. Existing methods for assisting collage layout design primarily employ a semantic relevance-first strategy that arranges related content together. Though capable, this strategy cannot ensure the visual readability of screenshots and may conflict with natural human reading patterns. In this paper, we introduce CollageNoter, a real-time collage layout design method that adapts to various devices (e.g., laptops, tablets, and phones) and offers users visually and cognitively well-organized screenshot-based E-Notes. Specifically, we construct a novel two-stage pipeline for collage design, comprising 1) readability-first layout generation and 2) cognitive-driven layout adjustment. In addition, to achieve real-time response and adaptive model training, we propose a cascade transformer-based layout generator named CollageFormer and a size-aware collage layout builder for automatic dataset construction. Extensive experimental results confirm the effectiveness of CollageNoter.
