Source Themes | Pedro Orvalho

Counterexample Guided Program Repair Using Zero-Shot Learning and MaxSAT-based Fault Localization, AAAI 2025

In this paper, we propose a novel approach that combines formal methods (FM) based fault localization with large language models (LLMs), via zero-shot learning, to enhance automated program repair (APR) for introductory programming assignments (IPAs). Our method uses MaxSAT-based fault localization to identify the buggy parts of a program, then presents the LLM with a program sketch devoid of these buggy statements. This hybrid approach follows a Counterexample Guided Inductive Synthesis (CEGIS) loop to iteratively refine the program: we ask the LLM to synthesize the missing parts, which are then checked against a test suite. If the suggested program is incorrect, a counterexample from the test suite is fed back to the LLM for revised synthesis.
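
The CEGIS loop described above can be illustrated with a minimal sketch. It is not the paper's implementation: the LLM is replaced by a hypothetical `toy_propose` function that picks the first candidate consistent with the counterexamples seen so far, and the names `cegis_repair`, `CANDIDATE_FILLS`, and `TESTS` are assumptions made for illustration only.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Test = Tuple[int, int]  # (input, expected output)
Program = Callable[[int], int]


def cegis_repair(
    propose: Callable[[List[Test]], Optional[Program]],
    test_suite: Iterable[Test],
    max_rounds: int = 10,
) -> Optional[Program]:
    """Generic CEGIS loop: propose a candidate, check it, feed back a counterexample."""
    tests = list(test_suite)
    counterexamples: List[Test] = []
    for _ in range(max_rounds):
        candidate = propose(counterexamples)
        if candidate is None:
            return None  # the synthesizer ran out of candidates
        # Verification step: look for the first failing test.
        failing = next(((x, y) for x, y in tests if candidate(x) != y), None)
        if failing is None:
            return candidate  # the candidate passes the whole test suite
        counterexamples.append(failing)
    return None


# Hypothetical stand-in for the LLM: a fixed pool of fillings for the sketch's hole.
CANDIDATE_FILLS: List[Program] = [
    lambda x: x,
    lambda x: x + 1,
    lambda x: 2 * x,
    lambda x: x * x,
]


def toy_propose(counterexamples: List[Test]) -> Optional[Program]:
    """Return the first filling that agrees with every counterexample seen so far."""
    for fill in CANDIDATE_FILLS:
        if all(fill(x) == y for x, y in counterexamples):
            return fill
    return None


TESTS: List[Test] = [(0, 0), (1, 1), (2, 4), (3, 9)]
repaired = cegis_repair(toy_propose, TESTS)
```

In this toy run the loop rejects two candidates, each time adding the failing test as a counterexample, before it returns a filling that passes every test in `TESTS`.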

InvAASTCluster: On Applying Invariant-Based Program Clustering to Introductory Programming Assignments, JSS 2025

This paper proposes InvAASTCluster, a novel approach for program clustering that uses dynamically generated program invariants to cluster semantically equivalent IPAs.
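
As a rough illustration of clustering by dynamically observed invariants, the sketch below runs each program on a few inputs, keeps the candidate invariants that hold on every run, and groups programs with identical invariant sets. The candidate invariants, the sample programs, and all function names are assumptions for illustration; they do not reflect how InvAASTCluster generates or represents invariants.

```python
from collections import defaultdict
from typing import Callable, Dict, FrozenSet, List

# Hypothetical candidate invariants relating an input x to the program output.
CANDIDATE_INVARIANTS: Dict[str, Callable[[int, int], bool]] = {
    "out >= 0": lambda x, out: out >= 0,
    "out >= x": lambda x, out: out >= x,
    "out in (x, -x)": lambda x, out: out in (x, -x),
}


def observed_invariants(program: Callable[[int], int], inputs: List[int]) -> FrozenSet[str]:
    """Keep the candidate invariants that hold on every observed execution."""
    return frozenset(
        name
        for name, holds in CANDIDATE_INVARIANTS.items()
        if all(holds(x, program(x)) for x in inputs)
    )


def cluster_by_invariants(
    programs: Dict[str, Callable[[int], int]], inputs: List[int]
) -> List[List[str]]:
    """Group program names whose observed invariant sets are identical."""
    clusters: Dict[FrozenSet[str], List[str]] = defaultdict(list)
    for name, program in programs.items():
        clusters[observed_invariants(program, inputs)].append(name)
    return sorted(sorted(names) for names in clusters.values())


PROGRAMS: Dict[str, Callable[[int], int]] = {
    "abs_if": lambda x: x if x >= 0 else -x,  # absolute value via a conditional
    "abs_max": lambda x: max(x, -x),          # absolute value via max
    "square": lambda x: x * x,                # a semantically different program
}

clusters = cluster_by_invariants(PROGRAMS, [-3, -1, 0, 2, 5])
```

The two syntactically different absolute-value programs land in the same cluster, while `square` is separated because it violates one of the candidate invariants on the sample inputs.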

MENTOR: Automated Feedback for Introductory Programming Exercises, PhD Thesis

This PhD thesis presents MENTOR, a semantic automated program repair (APR) framework designed to provide Automated Feedback for Introductory Programming Exercises.

GitSEED: A Git-backed Automated Assessment Tool for Software Engineering and Programming Education, SIGCSE Virtual 2024

This paper introduces GitSEED, a language-agnostic automated assessment tool designed for Programming Education and Software Engineering (SE) and backed by GitLab.

CFaults: Model-Based Diagnosis for Fault Localization in C with Multiple Test Cases, FM 2024

This paper introduces a novel fault localization approach for C programs with multiple faults. CFaults leverages Model-Based Diagnosis (MBD) with multiple observations and aggregates all failing test cases into a unified MaxSAT formula. Consequently, our method guarantees consistency across observations and simplifies the fault localization procedure.
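
The benefit of aggregating all failing tests into one optimization problem can be seen on a toy example. The sketch below is an assumption-laden illustration, not CFaults itself: it replaces the MaxSAT solver with brute-force enumeration, models a three-statement program in Python rather than C, and lets a "relaxed" (suspected faulty) statement take any value from a small assumed domain. A diagnosis is a set of relaxed statements under which every failing test can produce its expected output; the minimum-size diagnoses correspond to keeping as many statements healthy as possible.

```python
from itertools import combinations, product
from typing import List, Sequence, Set, Tuple

Test = Tuple[int, Tuple[int, int]]  # (input, expected pair of outputs)

N_STATEMENTS = 3
DOMAIN = range(-10, 11)  # assumed finite domain for values of relaxed statements


def run(x: int, relaxed: Set[int], free: Sequence[int]) -> Tuple[int, int]:
    """Toy program; a relaxed statement yields a free value instead of its own."""
    a = free[0] if 0 in relaxed else abs(x) + 1  # fault: intended x + 1
    b = free[1] if 1 in relaxed else a * 2       # correct statement
    c = free[2] if 2 in relaxed else x + 2       # fault: intended x * x
    return (b, c)


def consistent(relaxed: Set[int], test: Test) -> bool:
    """True if some choice of free values makes the test produce its expected output."""
    x, expected = test
    choices = [DOMAIN if i in relaxed else (0,) for i in range(N_STATEMENTS)]
    return any(run(x, relaxed, free) == expected for free in product(*choices))


def minimum_diagnoses(tests: List[Test]) -> List[Tuple[int, ...]]:
    """Smallest sets of relaxed statements consistent with all given tests at once."""
    for size in range(N_STATEMENTS + 1):
        found = [
            subset
            for subset in combinations(range(N_STATEMENTS), size)
            if all(consistent(set(subset), t) for t in tests)
        ]
        if found:
            return found
    return []


FAILING_TESTS: List[Test] = [(-1, (0, 1)), (3, (8, 9))]
per_test = [minimum_diagnoses([t]) for t in FAILING_TESTS]  # one problem per test
aggregated = minimum_diagnoses(FAILING_TESTS)               # one unified problem
```

Diagnosing each failing test separately yields single-statement diagnoses that do not explain the other test, whereas the aggregated problem returns only diagnoses that are consistent with both observations and that cover both faults.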

C-Pack of IPAs: A C90 Program Benchmark of Introductory Programming Assignments, APR 2024

C-Pack of IPAs is a C90 program benchmark of Introductory Programming Assignments (IPAs) that contains semantically correct, semantically incorrect, and syntactically incorrect programs, plus a test suite for each IPA.

Graph Neural Networks For Mapping Variables Between Programs, ECAI 2023

In this work, we propose using graph neural networks (GNNs) to map the set of variables between two programs based on both programs' abstract syntax trees (ASTs). To demonstrate the strength of variable mappings, we present three use-cases of these mappings on the task of program repair to fix well-studied and recurrent bugs among novice programmers in introductory programming assignments (IPAs).

UpMax: User partitioning for MaxSAT, SAT 2023

This paper proposes a new framework called UpMax that decouples the partitioning procedure from the MaxSAT solving algorithms. As a result, new partitioning procedures can be defined independently of the MaxSAT algorithm to be used. Moreover, this decoupling also allows users who build new MaxSAT formulas to propose partition schemes based on their knowledge of the problem to be solved.

MultIPAs: Applying Program Transformations to Introductory Programming Assignments for Data Augmentation, ESEC/FSE 2022

This paper presents MultIPAs, a program transformation tool that can augment IPA benchmarks by (1) applying six syntactic mutations that conserve the program's semantics and (2) applying three semantic mutilations that introduce faults in the IPAs.

Project Proposal: Learning Variable Mappings to Repair Programs, AITP 2022

In this position paper, we propose to learn how to map the set of variables between different small imperative programs based on both programs' abstract syntax trees (ASTs) using graph neural networks (GNNs).