PostgreSQL 18 eyes analytics boost and distributed future

Async I/O and UUID v7 highlights of the September release, though some SQL features are delayed

Users and developers can expect the release of PostgreSQL 18 in September, the new iteration of the popular open source database, promising new features to enhance analytics and distributed architectures.

In July, the Stack Overflow survey showed that PostgreSQL had extended its lead as the most popular database for professional developers, 58 percent of whom use it, compared to 40 percent using MySQL and 30 percent using Microsoft SQL Server. The figure opting to use PostgreSQL is up from 52 percent last year.

But despite its popularity, PostgreSQL has not been well known as an analytics system, and is more likely to be used for online transaction processing (OLTP).

That might be about to change with new features in PostgreSQL 18, according to Umair Shahid, founder of PostgreSQL advisory company Stormatics and former head of PostgreSQL at Percona, an open source software consultancy.

Set for general availability later this month, PostgreSQL 18 will introduce a new asynchronous I/O subsystem that will allow read statements to be called from disk more quickly.

"With asynchronous I/O, what the CPU can now do is issue a read for data from disk, and instead of waiting for that data to come back, it can asynchronously issue multiple such read requests and take them in simultaneously," Shahid said. "The feature allows PostgreSQL to leverage the CPU more efficiently, instead of bottlenecking on I/O and disk, and especially with the read commands in analytical workloads that require lots of joins, and lots of parallel workload. It can significantly improve performance."

The asynchronous I/O subsystem would go a long way in helping out with analytical workloads, with tests showing performance improvements of 200 to 300 percent in some analytical workloads, Shahid said.

However, he reckons the feature was only designed for modern hardware, such as SSDs or premium high-speed disks. Developers should also be aware that the amount of concurrency and parallelism can be controlled to fit the hardware environment.

Shahid added that the universally unique identifier (UUID) employed for distributed computing had been upgraded from version 4 to 7 in the transition to PostgreSQL 18 to improve database index performance.

"With UUID v7, you get the ability to sort your UUIDs so, in effect, you can now actually implement global sequences in a distributed system, which goes one step further in helping with the ability to set up a distributed database with multiple active nodes with PostgreSQL, open source PostgreSQL built right into the core engine."

A number of startups have based products around the idea of distributed PostgreSQL. Both CockroachDB and YugabyteDB have PostgreSQL front ends – with varying degrees of compatibility – and distributed back ends. Meanwhile, PlanetScale has built a service on PostgreSQL with its proprietary operator, while pgEdge promises a multi-master distributed database that lets users run master nodes across multiple locations.

However, developers will have to wait a little longer for some features in the pipeline. Support for bitemporal queries in the system and their application in SQL syntax, as well as syntax support for Property Graph Queries, did not make the cut this time.

Alastair Turner, Technology Evangelist at Percona, said a year ago it looked like these features were progressing and would make it to the release. "I thought a year ago it looked like it was moving very well, the addition to the SQL standard that does graph syntax. That really was just a question of committer." ®

More about

TIP US OFF

Send us news


Other stories you might like