AlphaBorno at BLP-2025 Task 2: Code Generation with Structured Prompts and Execution Feedback

Mohammad Ashfaq Ur Rahman; Muhtasim Ibteda Shochcho; Md Fahim

2025 AACL AACL 2025

AlphaBorno at BLP-2025 Task 2: Code Generation with Structured Prompts and Execution Feedback

Abstract

AbstractThis paper explores various prompting strategies in the BLP-2025 Shared Task 2, utilizing a pipeline that first translates Bangla problem descriptions into English with GPT-4o,then applies techniques like zero-shot, few-shot,chain of thought, synthetic test case integration, and a self-repair loop. We evaluated fourLLMs (GPT-4o, Grok-3, Claude 3.7 Sonnet,and Qwen2.5-Coder 14B). Our findings revealthat while traditional methods like few-shotand chain-of-thought prompting provided inconsistent gains, the integration of explicit unittests delivered a substantial performance boostacross all models. The most effective strategycombined zero-shot prompting with these synthetic tests and a self-repair loop, leading GPT4o to achieve a top Pass@1 score of 72.2%.These results represent the value of using explicit constraints and iterative feedback in codegeneration, offering a solid framework that improves the model’s code generation capabilities.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Natural Language Processing

🧭 Keyword Pioneer — self-repair loop

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Mohammad Ashfaq Ur Rahman , Muhtasim Ibteda Shochcho , Md Fahim

Topics

Artificial Intelligence > Core AI > Agent Systems Artificial Intelligence > Core AI > Multi-Agent Systems Artificial Intelligence > Learning Paradigms > Few-Shot Learning Machine Learning > Learning Types > Self-Supervised Learning Machine Learning > Learning Types > Zero-Shot Learning Natural Language Processing > Generation > Text Generation Machine Learning > Learning Types > Few-Shot Learning Natural Language Processing > Applications > Text Generation

Keywords

zero-shot learning few-shot learning prompt engineering code generation prompting strategy zero-shot prompting chain of thought large language model execution feedback self-repair loop synthetic test case

Download PDF

Related papers

Judging the Judges: A Systematic Study of Position Bias in LLM-as-a-Judge 2025

Counterfactual Evaluation for Blind Attack Detection in LLM-based Evaluation Systems 2025

Enhancing Training Data Quality through Influence Scores for Generalizable Classification: A Case Study on Sexism Detection 2025

CtrlShift: Steering Language Models for Dense Quotation Retrieval with Dynamic Prompts 2025

A Diagnostic Framework for Auditing Reference-Free Vision-Language Metrics 2025