Enhancing Patch Validation in Automated Program Repair through Large Language Models

Investigate and develop methods to confirm the correctness of patches generated by automated program repair (APR) systems, leveraging the capabilities of large language models (LLMs).

Faults (also known as bugs) in software systems can affect large groups of people and lead to massive financial damage. Correcting such bugs accounts for a significant portion of overall software development costs. Automated program repair (APR) techniques aim to reduce these costs by automatically generating program patches (edits to the code) that remove bugs from software systems. While APR systems can produce multiple candidate patches for a single bug, validating the correctness of these patches remains a critical challenge. Relying solely on test suites has been shown to be insufficient, since a patch that passes all tests may still be incorrect. APR researchers therefore still rely on manual reviews, which are time-consuming and prone to human error and inconsistency.

This project aims to research potential solutions for improving the validation of patches generated by APR approaches. Given their advanced coding capabilities, LLMs could offer a nuanced and systematic way to assess patch correctness. The research explores whether LLMs (or other approaches) can accurately evaluate patch correctness and thereby reduce the dependence on manual validation.
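To make the problem concrete, the following minimal Python sketch shows how a patch can pass every available test and still be incorrect. The function names, the bug, and the three-case test suite are all hypothetical and chosen only for illustration; they are not taken from any particular APR system or benchmark.

```python
def buggy_max(values):
    # Hypothetical bug: starting from 0 gives the wrong answer
    # for lists that contain only negative numbers.
    best = 0
    for v in values:
        if v > best:
            best = v
    return best


def plausible_patch(values):
    # Hypothetical overfitting patch: it special-cases the one failing
    # test input instead of fixing the underlying fault.
    if values == [-3, -1, -2]:
        return -1
    return buggy_max(values)


def correct_patch(values):
    # Hypothetical correct patch: initialise from the first element.
    best = values[0]
    for v in values[1:]:
        if v > best:
            best = v
    return best


# Hypothetical test suite shipped with the program: (input, expected output).
TEST_SUITE = [([1, 5, 3], 5), ([0], 0), ([-3, -1, -2], -1)]


def passes_suite(candidate):
    # A candidate is "plausible" if it passes every available test.
    return all(candidate(list(inp)) == expected for inp, expected in TEST_SUITE)


if __name__ == "__main__":
    for fn in (buggy_max, plausible_patch, correct_patch):
        print(fn.__name__, "passes suite:", passes_suite(fn))
    # An input that is not covered by the test suite separates the two patches.
    held_out = [-7, -4]
    print("plausible_patch:", plausible_patch(held_out))
    print("correct_patch:", correct_patch(held_out))
```

Both patches pass the test suite, so the tests alone cannot tell them apart; only an input outside the suite, or a review of the code itself, reveals that the first patch does not fix the fault.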

Goal

Analyse the potential uses of LLMs in patch validation, in particular, for identifying patches that pass available automated tests but are not correct.
Develop methods to evaluate whether two patches resolve the same issue.
Propose strategies to reduce the need for manual assessment of patches.
Investigate the effectiveness of the proposed approach.
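
As an illustration of the second goal, one simple non-LLM baseline for comparing two patches is differential testing: run both on many generated inputs and look for a disagreement. The sketch below is an assumption made for illustration and not the method the project prescribes; the three candidate patches and the helper find_disagreement are hypothetical. Agreement on sampled inputs is only evidence that two patches behave alike, not a proof of equivalence.

```python
import random


def patch_a(values):
    # Hypothetical candidate patch: uses the built-in max.
    return max(values)


def patch_b(values):
    # Hypothetical candidate patch: sorts and takes the last element.
    return sorted(values)[-1]


def patch_c(values):
    # Hypothetical overfitted candidate: never returns a negative result.
    return max(0, max(values))


def find_disagreement(f, g, trials=1000, seed=0):
    # Search random non-empty integer lists for an input on which f and g differ.
    # Returns the first such input found, or None if none was found.
    rng = random.Random(seed)
    for _ in range(trials):
        values = [rng.randint(-10, 10) for _ in range(rng.randint(1, 5))]
        if f(list(values)) != g(list(values)):
            return values
    return None


if __name__ == "__main__":
    print("a vs b:", find_disagreement(patch_a, patch_b))
    print("a vs c:", find_disagreement(patch_a, patch_c))
```

A witness input returned for a pair of patches shows that they do not implement the same fix. When no witness is found, the question remains open, which is one place where an LLM-based assessment of the patches could complement input sampling.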

Learning outcome

Proficiency with large language models through hands-on experience.
Expertise in machine learning and its application to code development tasks.
An understanding of automated program repair, its potential uses, and its limitations.
The ability to design, implement, and evaluate novel approaches in software engineering.

Qualifications

Interested in deep learning, in particular, natural language processing, and machine learning for source code analysis (an area also known as ML4code).
Interested in experimenting with APR approaches, reusing and building models for source code, as well as evaluating them.
Preferably, knowledge of Python and LaTeX.

Supervisors

Leon Moonen
Fernando Vallecillos Ruiz
