This profiler chatbot promises to help speed up your Python – we can believe it
Scalene, Scalene, Scalene, Scalene, I'm beggin' of you please improve my code
Interview To make Python code run faster, you can now get performance optimization advice from the Scalene Python profiler and its associated chatbot – and mostly its recommendations help.
A Python profiler provides statistical data about how Python code runs, such as the number of calls, the time spent executing functions, and so on. Armed with this information, a developer has a chance to find and mitigate potential performance bottlenecks. For example, profiling data might suggest that a given CPU-bound program could run faster with parallelization, to better use multiple CPU cores.
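As a minimal illustration of that kind of data — using Python's built-in cProfile module rather than Scalene — profiling a toy recursive function reports per-function call counts and cumulative times:

```python
import cProfile
import io
import pstats

def fib(n):
    # Deliberately naive recursion: an obvious hotspot for a profiler to find.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

profiler = cProfile.Profile()
profiler.enable()
fib(15)
profiler.disable()

# Render the statistics (call counts, total and cumulative time) to a string.
buf = io.StringIO()
stats = pstats.Stats(profiler, stream=buf)
stats.sort_stats("cumulative").print_stats(5)
print(buf.getvalue())
```

The report shows that nearly all calls land in fib itself, which is the cue to memoize or rewrite it iteratively. Scalene goes further than this, attributing time line by line rather than function by function.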
Scalene, developed by University of Massachusetts Amherst computer science professor Emery Berger and graduate students Sam Stern and Juan Altmayer Pizzorno, debuted in 2020 and has been improved repeatedly since then. It's been downloaded more than 675,000 times since its initial release.
The profiling tool is described by its creators as "a high-performance CPU, GPU and memory profiler for Python that does a number of things that other Python profilers do not and cannot do. It runs orders of magnitude faster than many other profilers while delivering far more detailed information."
It didn't always incorporate generative AI advice; it began its dalliance with chatty large language models in January. Initially set up to work with OpenAI's GPT-3, it was subsequently tweaked to connect to GPT-3.5 and, in July, was further revised to converse with GPT-4.
That happened shortly after Berger and his co-authors presented a paper [PDF] on the project at the 17th USENIX Symposium on Operating Systems Design and Implementation, one that would end up being awarded best paper at the conference.
"The first version of Scalene was released in 2020, but that version of Scalene was completely different from what it is today," Berger told The Register in an email. "The AI features were incorporated in January. The most recent version was last week. We did issue a major release shortly after the conference that improves the AI-based optimizations."
- Modular finds its Mojo, a Python superset with C-level speed
- Python is getting faster: Major performance tweaks on horizon
- Faster Python: Mark Shannon, author of newly endorsed plan, speaks to The Register
- Sneaky Python package security fixes help no one – except miscreants
Scalene's AI ties enhance usability, though what really makes this Python profiler stand out is the breadth of its capabilities. Where many profilers are either memory-only or CPU-only, Scalene handles both, as well as GPU reporting.
It reports on lines and functions, it works without modifying code, and it supports Python threads. It also has various capabilities that, it's claimed, are not present in other Python profilers, such as support for the multiprocessing library, breaking out time spent on Python code from time spent on library native code, reporting memory use over time per line or function, reporting megabytes copied per second, and detecting lines responsible for memory leaks.
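That Python-versus-native split matters because the same computation can land in either bucket. A standard-library-only sketch of the distinction (an illustration of the idea, not Scalene output): summing a list with an interpreted loop keeps the time in Python bytecode, while the C-implemented built-in sum() spends it in native code.

```python
import time

data = list(range(1_000_000))

# Pure-Python loop: the time here is spent executing interpreter bytecode,
# which a Python-aware profiler attributes to Python code.
start = time.perf_counter()
total_loop = 0
for x in data:
    total_loop += x
loop_time = time.perf_counter() - start

# The built-in sum() is implemented in C: same result, but the time is
# spent inside native code, outside the interpreter loop.
start = time.perf_counter()
total_native = sum(data)
native_time = time.perf_counter() - start

print(f"loop: {loop_time:.4f}s  native sum(): {native_time:.4f}s")
```

A profiler that lumps both together can't tell you whether rewriting your Python would help; one that separates them, as Scalene claims to, can.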
Many programming languages execute code faster than Python, but Berger believes Python's ubiquity and vast ecosystem justify acceleration efforts.
"Python is one of the most popular languages today," he said. "It gives you access to tons of libraries, especially for AI and data science, and is hence (and for other reasons) more approachable and much more convenient to use than high-performance alternatives. It's practically the lingua franca for AI and data science and much else, and vastly easier to use than, say, Java, C++, or Rust, for example."
Berger said that neither Mojo nor Codon are workable replacements for Python because they require people to rewrite their code.
"Mojo isn't actually even publicly available yet ... I'm looking forward to using it – I am on the wait list – but getting high performance in Mojo will, as I understand it, still require rewriting code to provide static type information in order to get it to perform well," said Berger. "Mojo is meant to be a superset of Python, but Codon is not compatible with much existing Python code, which I think will present a serious barrier."
That said, Berger suggested that Python's reputation for slowness isn't entirely deserved because one of its major draws is the ability to use compiled libraries implemented using high-performance languages and frameworks, such as C++ and CUDA.
"If you are writing code in Python but making heavy use of these native libraries, your code will scream," he said.
Scalene can also make your code scream, or so users report and the paper describes. For instance, in a post to the Scalene repo last year, Chris Wilhelm, an engineer at Semantic Scholar, reported that one of the company's machine learning models had become too expensive to run and jeopardized a related product.
"We generated a set of test data and ran our models with Scalene mounted – the html output was able to pinpoint our squeakiest wheels and help us validate our changes were having an impact," he wrote. "The process was iterative, precise and repeatable. In the end, we were able to reduce costs by a staggering 92 percent."
That was before Scalene added support for connecting to an OpenAI chatbot model. By adding an OpenAI API key to Scalene, users can have profiling data turned into specific advice about code changes from a not entirely unreliable chatbot.
The text output from the AI model might look like, for example, "Vectorize the code to reduce the number of loops and improve performance." And that would be followed by suggested code to do so.
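For illustration — this is a hypothetical before-and-after in the spirit of such a suggestion, not actual Scalene output — vectorizing typically means replacing an explicit Python loop with a NumPy array operation:

```python
import numpy as np

def scale_and_sum_loop(values, factor):
    # Original: one interpreted loop iteration per element.
    total = 0.0
    for v in values:
        total += v * factor
    return total

def scale_and_sum_vectorized(values, factor):
    # Suggested rewrite: the multiply and sum run as native NumPy
    # operations over the whole array at once.
    return float(np.sum(np.asarray(values) * factor))
```

Both return the same result; the vectorized version simply moves the per-element work out of the interpreter, which is why this is such a common optimization for numeric Python.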
"It doesn’t always work in that it sometimes produces optimizations that would change the behavior of the program (this is something we are addressing and should be out soonish), but the optimizations are generally 'on point,'" said Berger.
The Scalene web interface demo includes the appropriate warning: "Note that optimizations are AI-generated and may not be correct."
Using Scalene is just a matter of installing it – which isn't necessarily trivial given the Python ecosystem's varied package management options – and then running scalene your_program.py from the command line rather than python your_program.py.
"Scalene identifies where the inefficiencies are in your program," said Berger. "You can then click on a button and get AI-based suggested optimizations. So in short, you already basically can push a button and get better performance." ®