Microsoft's next major version of its Entity Framework (EF) database library for .NET will have long-term support and attempt to match rival Dapper for performance – an attempt, said senior program manager Jeremy Likness, that "will likely not be fully achieved."
Entity Framework is Microsoft's Object-Relational Mapping (ORM) library, and sits on top of ADO.NET, a lower-level database library.
The theory behind using an ORM is that it relieves developers of much of the tedious and error-prone work of writing code for CRUD (Create, Retrieve, Update, Delete) operations against databases. Using an ORM, developers can work with classes representing their business objects and ask the ORM to save and retrieve them as needed.
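To make the idea concrete, here is a minimal sketch in Python using only the standard library's `sqlite3` module. It is not EF, and `TinyOrm` and `Customer` are hypothetical names invented for illustration; the point is only that the calling code handles objects while the mapper generates the SQL.

```python
import sqlite3
from dataclasses import dataclass, fields, astuple


@dataclass
class Customer:
    id: int
    name: str
    city: str


class TinyOrm:
    """Hypothetical toy mapper: one table per dataclass, named after the class."""

    def __init__(self, conn):
        self.conn = conn

    def create_table(self, cls):
        # Derive the column list from the dataclass fields.
        cols = ", ".join(f.name for f in fields(cls))
        self.conn.execute(f"CREATE TABLE IF NOT EXISTS {cls.__name__} ({cols})")

    def save(self, obj):
        # Generate a parameterised INSERT so the caller never writes SQL.
        cls = type(obj)
        marks = ", ".join("?" for _ in fields(cls))
        self.conn.execute(
            f"INSERT INTO {cls.__name__} VALUES ({marks})", astuple(obj)
        )

    def get(self, cls, id_):
        # Generate a SELECT and turn the row back into an object.
        cols = ", ".join(f.name for f in fields(cls))
        row = self.conn.execute(
            f"SELECT {cols} FROM {cls.__name__} WHERE id = ?", (id_,)
        ).fetchone()
        return cls(*row) if row else None


conn = sqlite3.connect(":memory:")
orm = TinyOrm(conn)
orm.create_table(Customer)
orm.save(Customer(1, "Ada", "London"))
loaded = orm.get(Customer, 1)
```

A real ORM adds change tracking, relationships, migrations and query translation on top of this, which is where both the convenience and the overhead come from.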
A complication is that the transition from EF (which also runs on .NET Framework) to EF Core (which runs only on .NET Core) was bumpy. The last version of EF was 6.2 in 2017. EF Core was a complete rewrite and not a drop-in replacement. Some features of EF, like the ability to update model classes from the database, have been on the EF Core backlog for over six years. On the other hand, EF Core has some features EF lacks.
EF Core 6.0 is scheduled for November 2021, according to Likness, and will be an LTS (long-term support) release to tie in with .NET 6 (note that all future versions of .NET are based on .NET Core, not .NET Framework).
New features include support for SQL Server temporal tables (tables that keep a history of how data changes); JSON column support; better migration support (the business of updating a database from the object model in code); and support for all queries that work in EF 6.x. There are also plans to improve the Cosmos DB provider, this being for Microsoft's multi-model Azure database service, and full free-text search for SQLite and SQL Server.
Work is also being carried out on ADO.NET, Likness said, including a new batching API for sending multiple queries in one operation, and improvements are being made to the SQLite provider to support connection pooling and prepared statements.
Experimental features, with "no concrete deliverables planned", include a rewritten SQL Server driver using "modern .NET features" called SQLServer.Core; and better support for GraphQL in .NET.
Trying to catch Dapper
Performance is another big issue. Plans include compiled models, and making EF Core "work better with linkers and AOT", where AOT is ahead-of-time compilation. "We plan to match Dapper performance on the TechEmpower Fortunes benchmark," proclaimed Likness, adding: "This is a significant challenge which will likely not be fully achieved."
ORMs vary in their level of abstraction. EF is highly abstracted. An alternative in the .NET world is Dapper, which is lightweight, describing itself as a "simple object mapper for .NET". Dapper still requires the developer to write SQL, but wraps the business of mapping results to and from .NET objects. Performance can be close to raw ADO.NET.
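The Dapper approach can be sketched in a few lines. The following is not Dapper itself but a Python illustration using the standard `sqlite3` module; the `query` helper and the `Fortune` class are hypothetical. The developer supplies the SQL, and the helper does nothing except map each result row onto an object by column name.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Fortune:
    id: int
    message: str


def query(conn, cls, sql, params=()):
    """Hypothetical helper: run hand-written SQL and map each row to cls."""
    cur = conn.execute(sql, params)
    # cursor.description holds one entry per result column; item 0 is its name.
    names = [d[0] for d in cur.description]
    return [cls(**dict(zip(names, row))) for row in cur]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fortune (id INTEGER PRIMARY KEY, message TEXT)")
conn.executemany(
    "INSERT INTO fortune VALUES (?, ?)", [(1, "alpha"), (2, "beta")]
)

# The SQL is written by the developer; only the mapping is automated.
fortunes = query(
    conn, Fortune, "SELECT id, message FROM fortune WHERE id > ? ORDER BY id", (0,)
)
```

Because there is no query translation or change tracking between the call and the database, there is little left to slow it down, which is why this style can stay close to raw database access.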
A look at the TechEmpower benchmarks shows that the top performing .NET stack for single queries is ASP.NET Core with ADO, scoring 318,164. The Dapper ORM scores from 247,280. The best performing EF result is 116,496. Note that this is with PostgreSQL; Microsoft's SQL Server is not included in the benchmarks for licensing reasons.
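To put those figures in proportion, a quick calculation using only the three scores quoted above (the variable names are ours) shows how far apart the stacks are:

```python
# Single-query scores quoted in the text above.
ado_score = 318_164
dapper_score = 247_280
ef_score = 116_496

# Each stack as a fraction of the one it is being compared with.
dapper_vs_ado = dapper_score / ado_score
ef_vs_ado = ef_score / ado_score
ef_vs_dapper = ef_score / dapper_score

print(f"Dapper reaches {dapper_vs_ado:.0%} of the raw ADO score")
print(f"EF reaches {ef_vs_ado:.0%} of the raw ADO score")
print(f"EF reaches {ef_vs_dapper:.0%} of the Dapper score")
```

On these numbers Dapper gives up under a quarter of raw ADO throughput, while EF would need to more than double its score to draw level with Dapper.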
In order to catch up with Dapper, EF Core would have to eliminate its overhead versus simply sending SQL to the database server. It is unlikely that the team can do this without bypassing features that are key to the value EF provides. Bypassing them is already possible using raw SQL queries in EF, but that does not address performance issues with EF as normally used.
ORMs are an example of what developer Joel Spolsky dubbed a "leaky abstraction."
An abstraction, he said, is "a simplification of something much more complicated that is going on under the covers" and leaks represent "the things that the abstraction can't quite protect you from."
Spolsky described SQL, the common language of database queries, as a leaky abstraction, which makes an ORM a leaky abstraction of a leaky abstraction.
Performance is often an issue. An ORM has to allow for all sorts of possible cases, and is prone to retrieving more data than the application actually needs, slowing it down. In many cases this does not much matter as it is not a bottleneck, but it is critical in a context like a busy web application.
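The over-fetching problem is easy to demonstrate. This sketch uses Python's standard `sqlite3` module rather than EF, and the `article` table and `payload_size` helper are hypothetical. It compares loading whole rows, as a naive entity load would, with selecting only the column a listing page actually needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
)
# 100 articles, each with a short title and a 10,000-character body.
conn.executemany(
    "INSERT INTO article (title, body) VALUES (?, ?)",
    [(f"Title {i}", "x" * 10_000) for i in range(100)],
)


def payload_size(rows):
    """Rough size of a result set: total characters across all values."""
    return sum(len(str(value)) for row in rows for value in row)


# Whole entities: every column comes back, including the large body.
whole = conn.execute("SELECT * FROM article").fetchall()

# Projection: only the column a list of headlines needs.
titles = conn.execute("SELECT title FROM article").fetchall()

print(payload_size(whole), "characters fetched as whole rows")
print(payload_size(titles), "characters fetched as titles only")
```

Both queries return the same number of rows, but the first drags every article body across the connection for no benefit.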
A second issue is that ORMs have their own complexity that can be as bad as the complexity they are trying to protect you from.
If you read up on many-to-many relationships in EF, you might conclude that SQL's JOIN syntax is no harder to learn and more precise to use. A counter-argument is that an ideal ORM could generate better optimised SQL than most developers can achieve.
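For comparison, here is what a many-to-many relationship looks like in plain SQL, run through Python's standard `sqlite3` module. The `post`, `tag` and `post_tag` tables are hypothetical examples, not taken from any EF documentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    -- Junction table: one row per (post, tag) pairing.
    CREATE TABLE post_tag (
        post_id INTEGER REFERENCES post(id),
        tag_id INTEGER REFERENCES tag(id),
        PRIMARY KEY (post_id, tag_id)
    );
    INSERT INTO post VALUES (1, 'EF Core 6 plans'), (2, 'Dapper basics');
    INSERT INTO tag VALUES (1, 'orm'), (2, 'performance');
    INSERT INTO post_tag VALUES (1, 1), (1, 2), (2, 1);
    """
)

# Two JOINs walk from posts, through the junction table, to tags.
rows = conn.execute(
    """
    SELECT p.title, t.name
    FROM post AS p
    JOIN post_tag AS pt ON pt.post_id = p.id
    JOIN tag AS t ON t.id = pt.tag_id
    ORDER BY p.id, t.id
    """
).fetchall()
```

The query states exactly which tables are touched and how they are linked, which is the precision the paragraph above refers to.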
How do developers optimise EF in their applications? In part, by learning how to get the best from it; in part by inspecting the SQL that EF generates. This means that expert EF users still need to understand SQL.
The performance and complexity of EF, once developers get beyond simple applications, are not a small matter. Microsoft pushes developers towards EF as well as SQL Server in how it presents and documents the .NET platform.
Therefore EF performance also impacts how .NET performance is perceived. EF will always have a role for high productivity in some use cases, but given the excellent results from Dapper, Microsoft might consider giving this alternative more prominence, rather than attempting to match it but planning to fail. ®
Dapper originated at Stack Overflow, the developer favourite copy-and-paste question-and-answer site, which was co-founded by Spolsky. It was first developed by Sam Saffron, who went on to co-found Discourse, a Ruby on Rails discussion application.