Inside the Hekaton: SQL Server 2014's database engine deconstructed

Nadella's database squares the circle of cheap memory vs speed

It's 1996 and Mission: Impossible has just arrived on cinema screens. RAM is $10 per megabyte and falling. Against this backdrop, Microsoft has quietly embarked on its own seemingly impossible mission: a complete rewrite of a little-known database, to be released two years later as SQL Server 7.

Redmond's new database will compete with the entrenched kings of the database world, Oracle and DB2.

Fast-forward to 2014.

MI:4 has come and gone. RAM costs $10 a gigabyte. And now we have the latest incarnation of one of the industry's best-selling databases: SQL Server 2014. Microsoft's freshly minted chief executive Satya Nadella presides over the launch at an overly orchestrated corporate event in San Francisco, California, intended to demonstrate his company is all over "big data."

SQL Server is consistently one of the strongest growers in Microsoft's product portfolio, and Microsoft's share of RDBMS revenue has risen steadily over nearly two decades. Exact numbers vary according to the analyst you buy into, but what's not in dispute is that Microsoft has clawed out a third-place position behind IBM and market leader Oracle, both of which have been at this for longer.

Arguably, the only thing that's prevented SQL Server from doing better is the fact it's only available for Windows.

Market share aside, perhaps the two most significant changes in those 18 years have been technological: the increase in server processing power and the thousandfold fall in the price of RAM. These two factors are the major drivers behind SQL Server 2014's key theme of performance.

Memory is the new disk

When designing this release, Microsoft re-evaluated the reasons data is stored on disk rather than in memory. The assumptions that were valid when SQL Server 7 was written no longer hold true: main memory was then expensive and limited, and CPUs and operating systems had only 32-bit address spaces. That re-evaluation is reflected in the three key features of this release.

Main memory is now huge, dense, and most importantly, very cheap. Durable flash memory has followed, and will almost certainly continue to follow, the same curve. Whilst 32-bit Windows coped with at most 3GB of usable memory, CPUs are now 64-bit and Windows Server 2012 can address 4TB of RAM. Memory sizes are growing at such a rapid rate, it's estimated that more than 95 per cent of disk-based online transaction processing (OLTP) databases are small enough to be stored in memory.

Conversely, there are problems with disk. Data retrieval latency is orders of magnitude slower than memory. We're talking milliseconds compared to nanoseconds, a million-fold difference.

Also, as processors have reached commercially viable speed limits, CPU manufacturers have pipelined and scaled out their processors to increase computational throughput. In 1996, dual-processor systems were a rare novelty and hugely expensive. In 2014, finding 16, 24 or 32 cores inside a database server is a given.

These parallel giants consume data at a staggering rate, and spend much of their time waiting on the disk subsystem to supply it. Meanwhile, disk-oriented mechanisms have evolved within the database engine to ensure data updates behave in the consistent way we expect, maintaining ACID compliance.

It is time to rethink why data is stored on slow, unreliable disk for anything other than recovery and archive purposes.


In SQL Server 2014, Microsoft has started again. The Hekaton database engine – an addition to rather than a replacement for the core SQL Server database engine – now has the capability to store tables in memory. In contrast to Oracle's TimesTen and IBM's SolidDB, SQL Server's in-memory database engine is fully integrated into the core product, and at no additional cost. This makes adoption attractive both technically and financially.

Migration one table at a time is practical, reducing risk without requiring expensive consultancy or new skill sets - and, in contrast to its competitors, either all tables or a single table at a time can be migrated. All common data types are supported; only the less often used XML, CLR and (max) types are excluded. Either the table schema alone, or schema and data, can be persisted in memory - the former useful where temporary data does not have to survive a restart, such as staging tables.
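As a rough sketch, declaring a memory-optimised table in SQL Server 2014 looks like this. It assumes the database has already been given a memory-optimised filegroup; the table, column and bucket-count values here are purely illustrative:

```sql
-- Assumes the database already contains a MEMORY_OPTIMIZED_DATA filegroup.
-- Table name, columns and bucket count are illustrative.
CREATE TABLE dbo.SessionCache
(
    SessionId INT NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    UserName  NVARCHAR(100) NOT NULL,
    LastSeen  DATETIME2 NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON,
      DURABILITY = SCHEMA_ONLY);  -- schema persists; rows do not survive a restart
```

Swapping `DURABILITY = SCHEMA_ONLY` for `SCHEMA_AND_DATA` gives the fully durable variant, where data changes are logged to disk as described below.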

Full ACID compliance is ensured through optimistic multi-version concurrency control (MVCC). Row versioning and time-stamping are used rather than page locks, a technique now feasible because each row is stored in memory rather than on disk. The result is the elimination of locking and blocking, and of the waits and timeouts associated with them - a major bane of any production database administrator's life.
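In practice the MVCC behaviour surfaces as snapshot semantics: each transaction reads the row versions current at its start, and two transactions updating the same row produce an immediate write-conflict error for one of them rather than one blocking the other. A minimal sketch, using a hypothetical memory-optimised table - the `SNAPSHOT` hint is needed when interpreted T-SQL touches a memory-optimised table under the default READ COMMITTED level:

```sql
-- dbo.SessionCache is a hypothetical memory-optimised table.
BEGIN TRANSACTION;
    -- Reads row versions as of the transaction's start; never blocks readers.
    UPDATE dbo.SessionCache WITH (SNAPSHOT)
    SET    LastSeen  = SYSUTCDATETIME()
    WHERE  SessionId = 42;
COMMIT;
-- A concurrent transaction updating the same row would fail immediately
-- with a write-conflict error and must be retried, rather than waiting on a lock.
```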

As a result, disk read waits are completely eliminated, as all reads are now performed from memory. Schema changes still have to be logged to disk, though this is not a big deal as they are comparatively rare. If the schema-and-data durability option is used, data changes are also logged to disk to provide the durability required for ACID compliance and recovery.

In this case, checkpointing and log writes still have to be performed to maintain durability, but both are now implemented much more efficiently and, in a well-configured system, rarely present a bottleneck.
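The checkpoint files for durable in-memory tables live in a dedicated filegroup, which is a one-off piece of setup per database. A sketch, with illustrative database, filegroup and path names:

```sql
-- One-off setup: durable memory-optimised data is checkpointed to files
-- held in a dedicated filegroup. All names and paths here are illustrative.
ALTER DATABASE SalesDB
    ADD FILEGROUP imoltp_fg CONTAINS MEMORY_OPTIMIZED_DATA;

ALTER DATABASE SalesDB
    ADD FILE (NAME = 'imoltp_data', FILENAME = 'C:\Data\imoltp_data')
    TO FILEGROUP imoltp_fg;
```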

Hekaton's in-memory features are supported by traditional backup and restore and by SQL Server's high availability options, AlwaysOn and log shipping. Puzzlingly, replication is not supported. Neither is database mirroring, though that is more understandable, as mirroring is slated for deprecation.

A new migration aid, the Analysis, Migration and Reporting tool, is included as part of SQL Server Management Studio to assist the DBA in converting a schema to in-memory tables. With a few clicks, this generates a report containing recommendations and any necessary changes for successful migration.

Although the tool is of great assistance, the DBA still needs to acquire some new knowledge to transition data from disk to memory-resident tables, particularly regarding the restrictions imposed on table schemas. Developers are unlikely to persevere with these restrictions in the current release, but Microsoft's recent track record with SQL Server is to ship functionality and reliability first, then add GUI-based front-end management tools later - the way it should be.

In the final pre-manufacturing release, the most notable issues are the lack of support for the TRUNCATE TABLE and MERGE statements, and a clutch of disallowed query hints that might cause existing code to break. There are other minor restrictions, but Microsoft needs to address these keyword constraints, as Hekaton will probably be adopted first by data warehouse customers, where both statements are likely to be more prevalent. The migration wizard's recommendations also need some work, but this will no doubt improve in time.

