The quest for faster Python: Pyston returns to open source, Facebook releases Cinder, or should devs just use PyPy?

Official CPython is slow, but there are many ways to get better performance


Facebook has released Cinder, used internally in Instagram to improve Python performance, while another faster Python, called Pyston, has released version 2.2 and made the project open source (again).

Python is the world's second most popular programming language (after JavaScript) according to some surveys; but it is by no means the fastest. A glance at benchmarks tells us that Python 3 computation is often many times slower than compiled languages like C and Go, or JIT (Just-in-Time) compiled languages like Java and JavaScript.

One reason is that CPython, the official implementation of Python, is an interpreter for a dynamic language, and Python's creator Guido van Rossum has resisted optimising it for performance, saying in 2014 that "Python is about having the simplest, dumbest compiler imaginable, and the official runtime semantics actively discourage cleverness in the compiler like parallelizing loops or turning recursion into loops."
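The "turning recursion into loops" remark is easy to see in practice, since CPython performs no tail-call elimination. The sketch below is purely illustrative: the function names and the chosen depth are our own, not anything from van Rossum's post. The recursive version exhausts the interpreter's recursion limit, while the hand-written loop does the same work without trouble.

```python
import sys


def count_down_recursive(n: int) -> int:
    # A tail call: a "clever" compiler could rewrite this as a loop.
    # CPython does not, so every call consumes a new stack frame.
    if n == 0:
        return 0
    return count_down_recursive(n - 1)


def count_down_loop(n: int) -> int:
    # The same logic with the recursion turned into a loop by hand.
    while n > 0:
        n -= 1
    return 0


# Illustrative depth: just beyond the interpreter's current limit.
depth = sys.getrecursionlimit() + 100

try:
    count_down_recursive(depth)
    recursion_failed = False
except RecursionError:
    recursion_failed = True

print("recursive version hit the limit:", recursion_failed)
print("loop version returned:", count_down_loop(depth))
```

The limit can be raised with `sys.setrecursionlimit`, but the interpreter itself will not do the rewrite for you.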

He argued that Python developers should write performance-critical code in C or use a JIT-compiled implementation like PyPy, which claims to be on average 4.2 times faster than CPython – though there are some differences between PyPy and CPython.

The demand for faster Python has nonetheless inspired other projects. Facebook has released Cinder as open source, a project described as "Instagram's internal performance-oriented production version of CPython 3.8." Optimizations in Cinder include "bytecode inline caching, eager evaluation of coroutines, a method-at-a-time JIT, and an experimental bytecode compiler."

The Cinder JIT "supports almost all Python opcodes, and can achieve 1.5-4x speed improvements," according to the project's documentation.

Facebook emphasises that although it runs Cinder in production, the project "is not polished or documented for anyone else's use," and specifically refuses to commit to fixing reported bugs or providing any support.

Another limitation is that Cinder is only used on x64 Linux, and "anything else (including OS X) probably won't work." At the same time, the team said that "our goal in making this code available is a unified faster CPython."

There does seem to be an element of pushing the code out and hoping that others will pick it up and make it something more useful to the Python community.

One important aspect of Cinder is the use of "Static Python," which sounds like a contradiction since Python is a dynamic language. The idea is to add type annotations to Python code so that normal Python syntax can be compiled to type-checked bytecode by the Cinder compiler, enabling better optimization. Performance, says the team, is similar to that of Cython modules, Cython being a static compiler that turns Python-like code into C extension modules.
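To give a flavour of what annotated code looks like, here is a minimal sketch using only standard Python type annotations. It is not Cinder-specific: the `dot` function is our own example, and stock CPython simply ignores the annotations at runtime. The point is that the types are written in ordinary Python syntax, which is the information a type-aware compiler such as Cinder's could in principle use.

```python
from typing import List


def dot(xs: List[float], ys: List[float]) -> float:
    # Every variable here has a declared type, so a static compiler
    # could check the arithmetic ahead of time. CPython itself does
    # not enforce or exploit these annotations.
    total: float = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total


print(dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # 32.0
```

Because the annotations are plain Python, the same file still runs unchanged on an ordinary interpreter.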

Dropbox is another high-profile company which once used Python heavily but wanted better performance, and in 2014 came up with Pyston, saying at the time that "hitting our performance targets can sometimes become prohibitively difficult when staying on Python."

Pyston is a method-at-a-time JIT, whereas PyPy is a tracing JIT, meaning that it traces through the code to optimize specific code paths and loops, rather than simply compiling each method.

In 2017 Dropbox ended its involvement with Pyston, writing its performance-critical code in other languages such as Go instead. Kevin Modzelewski, formerly a principal engineer at Dropbox, founded an independent Pyston project. Pyston 2 was rewritten and released as a binary, but Modzelewski said that "since compiler projects are expensive and we no longer have benevolent corporate sponsorship, it is currently closed-source while we iron out our business model."

Performance issues?

Those business challenges have now been overcome, since Pyston 2.2 is now available and is open source. Pyston 2.2 is "30 per cent faster than stock Python on our web server benchmarks," Modzelewski said, adding that "Pyston can thrive on an open-source business model, primarily by starting with support services."

The project aims to be highly compatible, so that it is a drop-in replacement for CPython – provided it is on an x86-64 system, as other architectures are not supported. Compatibility includes CPython C extensions. Benchmarks show Pyston improving on CPython 3.8 in most cases, often substantially, but not to the same extent as PyPy. The trade-off appears to be compatibility versus performance.

In May 2020 AI specialists DLabs tested JavaScript versus Python performance for machine learning, using Node 12.16.1 for JavaScript and Python 3.7.6. The results seem surprising: although JavaScript benefits from an excellent JIT in Node (which uses the V8 engine as used by Google Chrome), Python easily outperformed it. "The learnings from the tests I ran are stark. JavaScript couldn’t get close to Python’s tasks — across the board. JavaScript’s computational performance is still much better than Python’s. However, the maturity of the libraries — which often have underlying modules written in C — means that operations on large datasets can offer so much more than sheer computational power," said developer Krzysztof Miśtal.
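The effect of C-backed code can be glimpsed even without third-party libraries. The rough sketch below (the function name, data size, and repeat count are arbitrary choices of ours, not taken from the DLabs tests) times a loop executed by the interpreter against the built-in `sum()`, which CPython implements in C. Timings will vary by machine and interpreter, so none are quoted here.

```python
import timeit


def python_level_sum(values):
    # Each iteration of this loop is executed as interpreted bytecode.
    total = 0
    for v in values:
        total += v
    return total


data = list(range(100_000))

# Time 20 runs of each approach over the same list.
loop_time = timeit.timeit(lambda: python_level_sum(data), number=20)
builtin_time = timeit.timeit(lambda: sum(data), number=20)

print(f"interpreted loop: {loop_time:.4f}s")
print(f"built-in sum():   {builtin_time:.4f}s")
```

On stock CPython the built-in would typically be expected to come out ahead, while a JIT-compiled runtime may narrow the gap.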

Perhaps Pyston would have been even quicker; but Miśtal's experience demonstrates that Python performance is not always a problem, since library developers have followed van Rossum's advice and written performance-critical code in C. Those using Python for general-purpose work, where more of the time is spent in Python code itself, are likely to get more benefit from a faster runtime. ®

PS: We've also been alerted to Pyjion, a JIT extension for CPython that compiles your Python code into native CIL and executes it using the .NET 5 CLR.

Biting the hand that feeds IT © 1998–2021