MENTOR: Automated Feedback for Introductory Programming Exercises, PhD Thesis

Abstract

The growing demand for programming education has given rise to many kinds of online evaluation, such as Massive Open Online Courses (MOOCs) focused on introductory programming assignments (IPAs). Because these courses enroll large numbers of students, one of their main challenges is providing valuable, personalized feedback. This thesis presents MENTOR, a semantic automated program repair (APR) framework designed to provide automated feedback for introductory programming exercises. MENTOR addresses this challenge by generating candidate repairs for faulty student programs, validating these semantic repairs by execution on a test suite, and highlighting the faulty statements to students. This work makes scientific contributions in several areas, including program clustering and analysis, automated fault localization, and program repair. MENTOR advances the state of the art in these areas and provides an innovative, practical framework that can be deployed in educational environments. Unlike symbolic repair tools such as Clara and Verifix, which require correct implementations with identical control flow graphs (CFGs), MENTOR’s Large Language Model (LLM)-based approach enables flexible repairs without strict structural alignment. MENTOR clusters successful submissions regardless of their CFGs and employs a Graph Neural Network (GNN)-based variable alignment module for enhanced accuracy. MENTOR’s fault localization module, CFaults, uses MaxSAT techniques to pinpoint buggy code segments precisely. MENTOR’s program fixer integrates Formal Methods (FM) and LLMs through a Counterexample Guided Inductive Synthesis (CEGIS) loop that iteratively refines repairs.
This work also proposes GitSEED, a language-agnostic automated assessment tool that enhances student learning by providing personalized feedback on code submissions and that integrates CFaults for fault detection in student code. Experimental results on C-Pack-IPAs show that MENTOR significantly improves repair success rates, achieving 64.4%, compared with 6.3% for Verifix and 34.6% for Clara.
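
The CEGIS loop mentioned in the abstract can be illustrated with a toy sketch. This is not MENTOR's implementation: the fixed candidate list is a hypothetical stand-in for an LLM proposing repairs, the reference function stands in for the expected behavior encoded in a test suite, and all names (`find_counterexample`, `cegis_repair`, `candidates`) are assumptions made for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Program = Callable[[int], int]


def find_counterexample(
    candidate: Program, reference: Program, inputs: Sequence[int]
) -> Optional[int]:
    """Verification step: return the first input where the candidate
    disagrees with the reference, or None if it passes every input."""
    for x in inputs:
        if candidate(x) != reference(x):
            return x
    return None


def cegis_repair(
    candidates: Sequence[Tuple[str, Program]],
    reference: Program,
    inputs: Sequence[int],
    max_iters: int = 10,
) -> Tuple[Optional[str], List[int]]:
    """Toy CEGIS loop: propose a candidate consistent with all known
    counterexamples, verify it, and record a new counterexample on failure."""
    counterexamples: List[int] = []
    for _ in range(max_iters):
        # Synthesis step (hypothetical stand-in for an LLM proposal):
        # pick the first candidate that agrees with the reference on
        # every counterexample collected so far.
        proposal = next(
            (
                (name, prog)
                for name, prog in candidates
                if all(prog(x) == reference(x) for x in counterexamples)
            ),
            None,
        )
        if proposal is None:
            return None, counterexamples  # no candidate fits the evidence
        name, prog = proposal
        cex = find_counterexample(prog, reference, inputs)
        if cex is None:
            return name, counterexamples  # candidate passes all inputs
        counterexamples.append(cex)  # refine the next proposal
    return None, counterexamples


# Hypothetical exercise: compute the absolute value of an integer.
candidates = [
    ("return x", lambda x: x),
    ("return -x", lambda x: -x),
    ("return x if x >= 0 else -x", lambda x: x if x >= 0 else -x),
]
repair, seen = cegis_repair(candidates, abs, inputs=[-2, 0, 3])
print(repair, seen)
```

Each failed verification adds one counterexample, which rules out every candidate that shares the same fault before the next proposal is made. In this sketch that pruning is a simple filter over a fixed list; how a real system feeds counterexamples back to its proposer is outside what the abstract specifies.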

Pedro Orvalho
Research Associate

My research interests include Automated Reasoning, Program Repair and Program Synthesis.
