Model-Based Diagnosis for Software

Localising system faults has long been recognised as one of the most time-consuming and costly tasks in software engineering.

Given a buggy system, fault localisation (FL) refers to the process of identifying program locations that may cause faulty behaviour (bugs).

More specifically, given a faulty system and a set of observations (e.g., failing test cases), formula-based fault localisation (FBFL) methods encode the localisation problem as a series of constraint optimisation problems, identifying a minimal set of faulty statements (diagnoses) within the system, as illustrated below:

FBFL approaches are typically grounded in the theory of Model-Based Diagnosis (MBD), which has been successfully applied to restore consistency across various domains, including Boolean circuits, C software, knowledge bases, and spreadsheets.
Traditionally, these methods compute a minimal diagnosis by considering each failing test case individually, rather than simultaneously across all failures.

⚙️ The CFaults Project

CFaults introduces a novel approach to fault localisation that leverages Maximum Satisfiability (MaxSAT), the optimisation variant of SAT, together with Model-Based Diagnosis (MBD) to perform multi-test-case FBFL for C programs.

Our approach is implemented in a publicly available tool, CFaults, designed to see faults in C.

Unlike existing methods, CFaults consolidates all failing test cases into a unified MaxSAT formula, ensuring consistency and precision across the fault localisation process.
In our paper, we demonstrate that CFaults produces only minimal diagnoses of faulty statements, while other FBFL methods for C software (e.g., BugAssist and SNIPER) tend to output redundant diagnoses.

🔍 System Overview

CFaults comprises six main steps:

Program unrolling
Program instrumentalisation
Bounded model checking (via CBMC)
Encoding to MaxSAT
Oracle resolution (MaxSAT solver)
Diagnosis refinement

The process begins by unrolling and instrumentalising C programs at the code level, ensuring independence from the model checker. CFaults then employs CBMC to generate a trace formula of the program, which is subsequently encoded into MaxSAT to identify the minimal set of faulty statements.

Explore CFaults!

🚀 Recent Use Cases of CFaults

CFaults has been successfully applied in multiple research directions, demonstrating its flexibility across educational and AI-driven settings:

🎓 Providing Feedback in Programming Education

In Spring 2024, CFaults was integrated with GitSEED, a GitLab-backed automated assessment tool for software engineering and programming education, to provide automated feedback in a first-year undergraduate C programming course.

Among the findings, 69% of surveyed students reported that the hints generated by CFaults were helpful in identifying and correcting their code errors. These localised hints encouraged self-directed debugging and a deeper understanding of programming concepts.

Overall, this study demonstrated CFaults’ value as a pedagogical tool, reinforcing its potential as a bridge between formal reasoning and effective programming education.

🤖 Guiding LLMs for Automated Program Repair

CFaults has also been integrated with Large Language Models (LLMs) via zero-shot learning to enhance automated program repair for C assignments.

In this hybrid framework, CFaults first identifies faulty statements and produces a program sketch with those statements removed.
This approach follows a Counterexample-Guided Inductive Synthesis (CEGIS) loop:

The LLM synthesises code to fill in the missing parts.
The result is validated against a test suite.
If incorrect, a counterexample is provided to guide refinement.

This iterative process enables LLMs to repair more programs and produce smaller, higher-quality fixes, outperforming both purely neural methods and state-of-the-art symbolic repair tools.

🐍 Formula-Based Fault Localisation for Python

CFaults’ underlying principles have also been extended beyond C.

PyVeritas is a framework that uses LLMs for high-level transpilation from Python to C, followed by bounded model checking and MaxSAT-based fault localisation via CFaults.

This approach enables verification and bug localisation for Python code using existing C verification tools.
Empirical evaluations on Python benchmarks demonstrate that LLM-based transpilation achieves 80–90% accuracy for some models, opening a new pathway for combining language models with formal verification.

CFaults exemplifies how symbolic reasoning and constraint optimisation can serve as the foundation for trustworthy AI-assisted software development, spanning applications from education to automated repair and cross-language verification.

References

Pedro Orvalho, Mikoláš Janota, Vasco Manquinho (2024). CFaults: Model-Based Diagnosis for Fault Localization in C with Multiple Test Cases. In FM 2024.
PDF Code Zenodo project Poster Preprint DOI
Pedro Orvalho, Marta Kwiatkowska, Mikoláš Janota, and Vasco Manquinho (2025). Model-Based Diagnosis with Multiple Observations: A Unified Approach for C Software and Boolean Circuits. In arXiv 2025. PDF.
Pedro Orvalho, Mikoláš Janota, Vasco Manquinho (2025). Counterexample Guided Program Repair Using Zero-Shot Learning and MaxSAT-based Fault Localization. In AAAI 2025.
PDF Code Zenodo project Poster Preprint DOI
Pedro Orvalho (2025). MENTOR: Automated Feedback for Introductory Programming Exercises. PhD Thesis.
PDF Code IST Project
Pedro Orvalho, Mikoláš Janota, Vasco Manquinho (2024). GitSEED: A Git-backed Automated Assessment Tool for Software Engineering and Programming Education. In SIGCSE Virtual 2024.
PDF Code Zenodo project Preprint DOI
Pedro Orvalho, Marta Kwiatkowska (2026). PyVeritas: On Verifying Python via LLM-Based Transpilation and Bounded Model Checking for C. In P-AI-FM @ AAAI 2026.
project Preprint

I am always excited to explore new ideas together! Feel free to reach out 📧 if you are interested in collaborating on this research topic!