Learning from Few Samples: A Novel Approach for High-Quality Malcode Generation

Haijian Ma; Daizong Liu; Xiaowen Cai; Pan Zhou; Yulai Xie

EMNLP 2025

Abstract

Intrusion Detection Systems (IDS) play a crucial role in network security defense. However, training IDS detection models is hampered by a shortage of adequately labeled malicious samples. To address this challenge, this paper introduces GANGRL-LLM, a novel semi-supervised framework that integrates Generative Adversarial Networks (GANs) with Large Language Models (LLMs) to enhance both malicious code generation and SQL Injection (SQLi) detection in few-sample learning scenarios. Specifically, the framework adopts a collaborative training paradigm in which (1) the GAN-based discriminator improves its malicious pattern recognition through adversarial learning on generated samples and a limited set of real samples, and (2) the LLM-based generator refines the quality of its malicious code synthesis using reward signals from the discriminator. The experimental results demonstrate that, even with a limited number of labeled samples, the training framework is highly effective in enhancing both malicious code generation and detection capabilities. This dual enhancement offers a promising basis for developing adaptive defense systems capable of countering evolving cyber threats.

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Machine Learning > Learning Types > Semi-Supervised Learning
Computer Science > Applications > Cybersecurity
Machine Learning > Learning Types > Few-Shot Learning
Deep Learning > Learning Types > Adversarial Learning
Machine Learning > Learning Paradigms > Semi-Supervised Learning

Keywords

semi-supervised learning, few-shot learning, generative adversarial network, intrusion detection, malicious code generation, few-sample learning, SQL injection, intrusion detection system
