AI can improve on code it writes, but you have to know how to ask
LLMs do more for developers who already know what they're doing
Large language models (LLMs) will write better code if you ask them, though it takes some software development experience to do so effectively – which limits the utility of AI code help for novices.
Max Woolf, senior data scientist at Buzzfeed, on Thursday published an experiment in LLM prompting, to see whether LLMs can optimize the code they suggest on demand.
Woolf explained, "If code can indeed be improved simply through iterative prompting such as asking the LLM to 'make the code better' — even though it’s very silly — it would be a massive productivity increase. And if that’s the case, what happens if you iterate on the code too much?"
This is what happened: Anthropic's Claude was tasked with writing Python code to find the difference between the smallest and the largest numbers whose digits sum up to 30, given a list of one million random integers between 1 and 100,000. The LLM returned a working solution.
The initial solution, which Woolf characterized as something a novice programmer might write, took an average of 657 milliseconds to run on an Apple M3 Pro MacBook Pro.
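For context, a naive solution to that task might look something like the sketch below – an illustration of the problem itself, not Woolf's or Claude's actual code:

```python
import random

def digit_sum(n: int) -> int:
    # Sum the decimal digits of n via string conversion
    return sum(int(d) for d in str(n))

def find_difference(numbers: list[int]) -> int:
    # Collect the numbers whose digits sum to 30, then take max minus min
    matching = [n for n in numbers if digit_sum(n) == 30]
    return max(matching) - min(matching) if matching else 0

numbers = [random.randint(1, 100_000) for _ in range(1_000_000)]
print(find_difference(numbers))
```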
And when asked to "write better code," Claude responded with optimized code that performed 2.7x faster.
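The iterations themselves aren't reproduced here, but the kind of optimization that typically yields that sort of speedup – precomputing digit sums for the bounded value range and scanning the list once – looks roughly like this hedged sketch, not Claude's actual output:

```python
def find_difference_optimized(numbers: list[int]) -> int:
    # Values are capped at 100,000, so precompute every possible digit sum once
    digit_sums = [sum(int(d) for d in str(i)) for i in range(100_001)]
    lo = hi = None
    for n in numbers:
        if digit_sums[n] == 30:
            # Track min and max in a single pass instead of building a list
            if lo is None or n < lo:
                lo = n
            if hi is None or n > hi:
                hi = n
    return hi - lo if lo is not None else 0
```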
Asked again, Claude made further improvements, returning code that incorporated multithreading for a 5.1x performance improvement over the initial implementation, though at the cost of introducing errors that required fixing.
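Claude's parallel code isn't reproduced here; as a hedged sketch of how such work can be split across parallel workers (using a process pool, since CPU-bound Python typically gains little from threads), chunked parallelism looks something like this:

```python
from concurrent.futures import ProcessPoolExecutor

def chunk_min_max(chunk: list[int]):
    # Return (min, max) of values in this chunk whose digits sum to 30, or None
    matching = [n for n in chunk if sum(int(d) for d in str(n)) == 30]
    return (min(matching), max(matching)) if matching else None

def find_difference_parallel(numbers: list[int], workers: int = 4) -> int:
    # Split the list into roughly equal chunks, one per worker
    size = -(-len(numbers) // workers)  # ceiling division
    chunks = [numbers[i:i + size] for i in range(0, len(numbers), size)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        results = [r for r in pool.map(chunk_min_max, chunks) if r is not None]
    if not results:
        return 0
    return max(hi for _, hi in results) - min(lo for lo, _ in results)

# Note: on platforms that spawn worker processes, the call above must sit
# under an `if __name__ == "__main__":` guard.
```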
Iterations three and four returned speedups of 4.1x and 99.7x over the initial implementation – a slight step back from the second round, followed by a dramatic leap.
Woolf then repeated the experiment using "prompt engineering," which simply means providing the LLM with more detail about what's expected and how to proceed. This was done in part by modifying Claude's system prompt – available via the API as a way to set standing rules for the model – to require things like specific code-efficiency strategies.
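To illustrate what that looks like in practice, here is a minimal sketch of setting a system prompt through Anthropic's Python SDK – the model name and prompt wording are placeholders, not Woolf's actual configuration:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# The system prompt sets standing rules; the user message carries the actual task
response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # placeholder model name
    max_tokens=2048,
    system=(
        "You are an expert Python programmer. All code you write is maximally "
        "optimized: prefer precomputation and efficient data structures."
    ),
    messages=[
        {"role": "user", "content": "Write Python code to solve this problem: ..."},
    ],
)
print(response.content[0].text)
```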
"Although it's both counterintuitive and unfun, a small amount of guidance asking the LLM specifically what you want, and even giving a few examples of what you want, will objectively improve the output of LLMs more than the effort needed to construct said prompts," observes Woolf in his write-up.
Prompt engineering yielded more sophisticated, faster code, though with more bugs.
"In all, asking an LLM to 'write code better' does indeed make the code better, depending on your definition of better," Woolf concludes. "Through the use of the generic iterative prompts, the code did objectively improve from the base examples, both in terms of additional features and speed.
"Prompt engineering improved the performance of the code much more rapidly and consistently, but was more likely to introduce subtle bugs as LLMs are not optimized to generate high-performance code. As with any use of LLMs, your mileage may vary, and in the end it requires a human touch to fix the inevitable issues no matter how often AI hypesters cite LLMs as magic."
Woolf concludes that LLMs won't replace software engineers anytime soon, because a software engineering background is necessary to distinguish good code from bad and to understand domain-specific constraints.
A recent research paper from computer scientists at Northeastern University, Wellesley College, and Oberlin College offers support for this view. The paper, titled "Substance Beats Style: Why Beginning Students Fail to Code with LLMs," examines whether prompt style – the arrangement of words – or prompt substance – the terms used to frame the problem – matters more.
"Overall, our findings support the view that the information content of prompts is more important than wording," conclude authors Francesca Lucchetti, Zixuan Wu, Arjun Guha, Molly Q Feldman, and Carolyn Jane Anderson.
In other words, to get a good answer from an LLM, it helps to have a strong background in the topic of inquiry. So experienced developers will get better results asking LLMs for help than neophytes. ®