TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators

Jianling Li; Shangzhan Li; Zhenye Gao; Qi SHI; Yuxuan Li; Zefan Wang; Jiacheng Huang; Haojie Wang; Jianrong Wang; Xu Han; Zhiyuan Liu; Maosong Sun

ACL 2025

Abstract

Triton, a high-level Python-like language designed for building efficient GPU kernels, is widely adopted in deep learning frameworks due to its portability, flexibility, and accessibility. However, programming and parallel optimization still require considerable trial and error from Triton developers. Despite advances in large language models (LLMs) for conventional code generation, these models struggle to generate accurate, performance-optimized Triton code, as they lack awareness of its specifications and the complexities of GPU programming. More critically, there is an urgent need for systematic evaluations tailored to Triton. In this work, we introduce TritonBench, the first comprehensive benchmark for Triton operator generation. TritonBench features two evaluation channels: a curated set of 184 real-world operators from GitHub and a collection of operators aligned with PyTorch interfaces. Unlike conventional code benchmarks prioritizing functional correctness, TritonBench also profiles efficiency performance on widely deployed GPUs aligned with industry applications. Our study reveals that current state-of-the-art code LLMs struggle to generate efficient Triton operators, highlighting a significant gap in high-performance code generation.

Authors

Jianling Li, Shangzhan Li, Zhenye Gao, Qi SHI, Yuxuan Li, Zefan Wang, Jiacheng Huang, Haojie Wang, Jianrong Wang, Xu Han, Zhiyuan Liu, Maosong Sun

Topics

Artificial Intelligence > Core AI > Foundation Models
Natural Language Processing > Resources & Methods > Large Language Models

Keywords

code generation, large language model, GPU kernel
