Code Mutations

Are Large Language Models Robust in Understanding Code Against Semantics-Preserving Mutations?

In this talk, I will present our evaluation of whether state-of-the-art LLMs with up to 8B parameters can actually reason about Python programs or are simply guessing.
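To make the term concrete, here is a minimal sketch of one semantics-preserving mutation, variable renaming, applied to a toy Python function. Everything in it is illustrative and assumed rather than taken from the talk: the `RenameVariables` class, the `mutate` and `run` helpers, the `total` function, and the choice of renaming as the mutation. It requires Python 3.9+ for `ast.unparse`.

```python
import ast


class RenameVariables(ast.NodeTransformer):
    """Hypothetical helper: rename identifiers using a fixed mapping."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        # Covers variable reads and writes in the function body.
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node):
        # Covers function parameters.
        node.arg = self.mapping.get(node.arg, node.arg)
        return node


# Toy program, invented for this illustration.
ORIGINAL = """
def total(values):
    acc = 0
    for v in values:
        acc += v
    return acc
"""


def mutate(source, mapping):
    """Return the source with variables renamed; behaviour should not change."""
    tree = RenameVariables(mapping).visit(ast.parse(source))
    return ast.unparse(tree)


def run(source, func_name, *args):
    """Execute the source in a fresh namespace and call one of its functions."""
    namespace = {}
    exec(source, namespace)
    return namespace[func_name](*args)


# The new names are fresh, so they cannot clash with existing names or builtins.
MUTATED = mutate(ORIGINAL, {"values": "x0", "acc": "x1", "v": "x2"})

if __name__ == "__main__":
    print(MUTATED)
    print(run(ORIGINAL, "total", [1, 2, 3]) == run(MUTATED, "total", [1, 2, 3]))
```

Both versions compute the same result on every input, so a model that truly reasons about the program should give the same answer for each. A change in its prediction after the renaming suggests reliance on surface cues such as identifier names.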

Pedro Orvalho

May 15, 2025