Research Explorer
Transformers
9294 directly classified papers
Papers per year
2011: 1
2014: 2
2015: 6
2016: 17
2017: 67
2018: 156
2019: 404
2020: 769
2021: 1217
2022: 1446
2023: 1628
2024: 1574
2025: 1647
2026: 360
Papers
Object-Centric Framework for Video Moment Retrieval
AAAI 2026
Improved Masked Image Generation with Knowledge-Augmented Token Representations
AAAI 2026
FloorPlanFormer: Multi-Task Transformer Network for Floor Plan Recognition with Outer-to-Inner Feature Refinement
AAAI 2026
MoFu: Scale-Aware Modulation and Fourier Fusion for Multi-Subject Video Generation
AAAI 2026
Vision-MoR: Scaling Vision Transformer via Patch-Level Mixture-of-Recursions
AAAI 2026
Boosting Resolution Generalization of Diffusion Transformers with Randomized Positional Encodings
AAAI 2026
Gait Transformer: End-to-End Transformer Backbone for Gait Recognition
AAAI 2026
DAPE: Harmonizing Content-Position Encoding for Versatile Dense Visual Prediction
AAAI 2026
UltraGen: High-Resolution Video Generation with Hierarchical Attention
AAAI 2026
Fair Facial Attribute Recognition via Group-Decoupled Vision Transformer with Mask-Guided Correlation Suppression
AAAI 2026
CLIPPan: Adapting CLIP as a Supervisor for Unsupervised Pansharpening
AAAI 2026
Circuit-Think: A Multimodal Reasoning Framework for Automated Circuit-to-Netlist Translation with Trajectory-Guided Reinforcement Learning
AAAI 2026
Reliable-View 2D-3D Key-Part Aligned Transformer with Reinforced Masking for 3D Point Cloud Understanding
AAAI 2026
State-Space Hierarchical Compression with Gated Attention and Learnable Sampling for Hour-Long Video Understanding in Large Multimodal Models
AAAI 2026
AbductiveMLLM: Boosting Visual Abductive Reasoning Within MLLMs
AAAI 2026
Sortblock: Similarity-Aware Feature Reuse for Diffusion Model
AAAI 2026
LoGoSeg: Integrating Local and Global Features for Open-Vocabulary Semantic Segmentation
AAAI 2026
LAMIC: Layout-Aware Multi-Image Composition via Scalability of Multimodal Diffusion Transformer
AAAI 2026
Empowering DINO Representations for Underwater Instance Segmentation via Aligner and Prompter
AAAI 2026
DCMM-Transformer: Degree-Corrected Mixed-Membership Attention for Medical Imaging
AAAI 2026
TraveLLaMA: A Multimodal Travel Assistant with Large-Scale Dataset and Structured Reasoning
AAAI 2026
On Model and Data Scaling for Skeleton-based Self-Supervised Gait Recognition
AAAI 2026
Zero-Reference Joint Low-Light Enhancement and Deblurring via Visual Autoregressive Modeling with VLM-Derived Modulation
AAAI 2026
Empowering Semantic-Sensitive Underwater Image Enhancement with VLM
AAAI 2026
Segment and Matte Anything in a Unified Model
AAAI 2026