From Brittle LLM Code Reasoning to MaxSAT-Based Verified Repairs @ UCL
In this talk, we examine the limitations of Large Language Models (LLMs) in semantic code reasoning, showing that their predictions may change under semantics-preserving code …
Understanding the reasoning and robustness of AI systems, such as Large Language Models (LLMs), is critical for ensuring their reliable use in programming tasks. While recent …
In this talk, I will present our evaluation of whether state-of-the-art LLMs with up to 8B parameters can reason about Python programs or are simply guessing.