Code Mutations

Large Language Models Are Not (Yet) Robust in Understanding Code Against Semantics-Preserving Mutations

In this paper, we assess whether state-of-the-art LLMs can reason about Python programs or are simply guessing. We apply five semantics-preserving code mutations, which maintain program …

Pedro Orvalho

• Jul 13, 2026 • 1 min read

LLMs for Code Understanding

From Brittle LLM Code Reasoning to MaxSAT-Based Verified Repairs @ UCL

In this talk, we examine the limitations of Large Language Models (LLMs) in semantic code reasoning, showing that their predictions may change under semantics-preserving code …

Pedro Orvalho

• May 20, 2026 • 1 min read

LLMs for Code Understanding

Are Large Language Models Robust in Understanding Code Against Semantics-Preserving Mutations? @ Oxford

In this talk, I will present our evaluation of whether state-of-the-art LLMs with up to 8B parameters can reason about Python programs or are simply guessing.

Pedro Orvalho

• May 15, 2025 • 1 min read