Artificial Intelligence for Reliable Code
Understanding the reasoning and robustness of AI systems, such as Large Language Models (LLMs), is critical for ensuring their reliable use in programming tasks. While recent …

In this talk, I will present our evaluation of whether state-of-the-art LLMs with up to 8B parameters can reason about Python programs or are simply guessing.