Large Language Models for Code Review and Bug Detection
Depani Krish Sunilbhai
Abstract
Large Language Models (LLMs) have emerged as transformative tools for automated code review and bug detection across multiple programming languages. This comprehensive analysis examines specialized model training approaches for identifying security vulnerabilities and logic errors in source code spanning 14 programming languages: Go, C, C++, Java, JavaScript, TypeScript, Haskell, PHP, HTML, CSS, C#, Rust, Kotlin, and SQL. Recent transformer-based models demonstrate up to 67% accuracy in vulnerability detection under context-rich evaluation frameworks, achieving precision of 0.8 in specialized applications. This research evaluates training methodologies ranging from fine-tuning to pre-training, and analyzes model architectures from encoder-only systems to hybrid graph neural networks. Performance assessment across comprehensive benchmarks, including CodeXGLUE and real-world datasets, reveals that hybrid approaches combining LLMs with traditional static analysis tools outperform either method alone. However, significant challenges persist, including context window limitations, false positive rates of 32-35%, and computational scalability issues for large codebases. The study concludes with future research directions encompassing formal verification integration, automated code repair capabilities, and domain-specific model specialization strategies.
Copyright
Copyright © 2025 Depani Krish Sunilbhai. This is an open access article distributed under the Creative Commons Attribution License.