MS poised to switch Windows file systems with Blackcomb
At long last, Cairo...?
The pieces are coming together for a clear shift in the way Microsoft does data storage. It is coming full circle, right back to the unified storage model of Cairo, promised but not delivered back in the early 1990s as the super-OS built upon Windows NT.
As I have been postulating for a while in my columns in PC Pro, Microsoft needs to complete its storage story. It has too many ways of storing relatively similar items, and a simplification is long overdue.
This is coming, and it lays out the roadmap for the next 4-5 years of Windows operating system changes. From the users' perspective, the changes will be small or even invisible - Microsoft is a past master of changing things only as quickly as the majority of the user base can manage.
From a technical perspective, the changes are quite profound. They focus on four main areas: Active Directory, SQL Server, Exchange Server and then the file system itself. As you will see, there is a clear interweaving of the changes to both BackOffice products, and to the base OS.
Microsoft recently admitted at the Barcelona TechEd conference that one of the design goals for Yukon, the codename for SQL Server 2003 due in Summer 2003 after a year-long beta programme, was to make it better at handling semi-structured and unstructured data. There can be no question as to its performance today when running as an RDBMS store - the TPC-C and SAP benchmarks clearly show that Microsoft is a leading candidate in server speed. But SQL Server has never been optimised to work well on more randomly structured data. The pending announcement is vitally important for Microsoft, because the rumours suggest that Kodiak, the next major release of Exchange Server, will use Yukon as its store rather than the current JET/EDB/ESE storage engine. If that is the case, then Yukon will need to handle the extraordinary storage flexibility that Exchange Server offers today.
The first real .NET server
Yukon will be the first full-scale real .NET server from Microsoft. In other words, the first BackOffice engine that has been built using Visual Studio .NET and the Common Runtime. We know this because Microsoft has admitted that Transact SQL (TSQL) will become yet another .NET language, in the same vein as Visual Basic and C#. In other words, Yukon runs with the Common Runtime as an integral part of its engine design, and you will be able to use any .NET language to write stored procedures that sit close to the raw engine.
It is worth noting that both the Exchange Server and SQL Server teams are now reporting to the same key people inside Microsoft, clearly indicating that a major rationalisation is taking place on the storage front.
But the move of Exchange Server off the JET/ESE store has some other profound impacts. Active Directory in Windows 2000 and XP uses JET/ESE as its store. Since AD is the grown-up child of the directory service found in Exchange Server 4/5/5.5, this is to be expected - it came from a JET background and has stayed in that storage area under Windows 2000.
The move of Active Directory from JET/ESE to SQL Server is something that Microsoft simply refuses to discuss, even privately. It is probably the most sensitive topic around in the Windows Server arena. The reason is clear: Windows XP Server will ship at the end of this year and will continue to use the JET/ESE engine for one final fling. Only when we get to the next release of Windows Server, codenamed Longhorn and due in mid 2003, will we see steps to move Active Directory onto SQL Server Yukon. Obviously Microsoft doesn't want anything to confuse the XP Server launch, and thus storage and directory services are going to be a story for 2002, and most definitely not something to be discussed before the end of this year.
You might think that this is the end of the story, but I think Microsoft has wider and more profound changes planned. Drive M: in Exchange Server 2000 allows you to mount the storage engine as a drive letter on the server, and to drill into it just as if it was file system storage. You can go into the tree, and share your mailbox at the filesystem level, complete with all normal access and security facilities. This works by having a driver which plugs into the top of the JET/ESE storage engine which then plugs into the bottom of the Windows 2000 driver stack.
Drive M: is a profoundly important technology to Microsoft. Today, it offers an extremely powerful replicating file system, which runs between your Exchange 2000 servers. Put some data into the London office server, and it will appear in the New York server automatically, with your system administrators controlling the replication schedules and rules. It allows you to publish straight out via IIS 5.0 - just one mouse click is required to turn a Public Folder into a Web site for Intranet, Extranet or Internet use. And it has extra document management facilities over and above those found in NTFS.
In the mid betas of Windows 2000, we saw a service called NSS - Native Storage Services, which allowed Office documents to be decomposed into NTFS Streams and thus stored in the NTFS file system more efficiently. NSS disappeared in late betas, and was promised to return in XP - it hasn't done so. At the time, the reason for the removal of NSS was that the HFS (Hierarchical Filing System) capabilities of Windows 2000 clashed with NSS - HFS worked at the whole file level, whereas the raison d'etre of NSS was to work at the sub-file level.
The fact that NSS is still 'missing in action', coupled with the work done on Drive M:, its associated indexing capabilities and the first-cut work done on Tahoe, now makes it clear that Microsoft is doing no more real work on the NTFS file system.
Flipping the switch with Blackcomb
Because of this, I think Microsoft is going to pull a major switch in Blackcomb, the release of Windows XP due in 2004/5. Yukon will be out and stable, and Kodiak will be sitting on top of Yukon's store. Active Directory will be in Yukon storage too. Drive M: will still be of great importance to corporate users.
At this point, Microsoft turns the whole NTFS/SQL Server model on its head. Instead of SQL Server using storage file space provided by NTFS, SQL Server itself becomes the base storage engine, and NTFS becomes an API-compatible driver into the store - just like Drive M: today. In other words, the machine boots SQL Server and NTFS is an old compatibility API for those applications that still need to manipulate files through a filing system API.
This is perfectly possible to do. With the final killing-off of real-mode and 16-bit code with the death of Windows 98/ME and the arrival of Windows XP, Microsoft can have any size of boot-strap loader it likes. There is no need for the native store to be NTFS - it could just as easily be SQL Server that starts as the native store, with the NTFS driver coming in as a service. For desktop and laptop users, Microsoft might call it MSDE, as with the present lightweight SQL Server for those machines.
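To make the inversion concrete, here is a minimal sketch of a file-style API sitting on top of a database store rather than the other way round. It is purely illustrative: it uses Python's built-in sqlite3 module as a stand-in for the store, and the class name `StoreBackedFiles`, the `items` table and the three methods are hypothetical names invented for this example, not anything Microsoft has described.

```python
import sqlite3


class StoreBackedFiles:
    """Hypothetical file-style facade over a relational table (illustration only)."""

    def __init__(self, conn):
        self.conn = conn
        # The "file system" is just one table: a path and its contents.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS items ("
            "path TEXT PRIMARY KEY, data BLOB NOT NULL)"
        )

    def write_file(self, path, data):
        # Creating or overwriting a file becomes an upsert into the store.
        self.conn.execute(
            "INSERT OR REPLACE INTO items (path, data) VALUES (?, ?)",
            (path, data),
        )
        self.conn.commit()

    def read_file(self, path):
        # Opening a file becomes a keyed lookup.
        row = self.conn.execute(
            "SELECT data FROM items WHERE path = ?", (path,)
        ).fetchone()
        if row is None:
            raise FileNotFoundError(path)
        return bytes(row[0])

    def list_dir(self, prefix):
        # Listing a folder becomes a prefix query over the path column.
        prefix = prefix.rstrip("/") + "/"
        rows = self.conn.execute(
            "SELECT path FROM items WHERE substr(path, 1, ?) = ? ORDER BY path",
            (len(prefix), prefix),
        )
        return [r[0] for r in rows]


fs = StoreBackedFiles(sqlite3.connect(":memory:"))
fs.write_file("/projects/x/plan.txt", b"draft plan")
fs.write_file("/projects/x/budget.txt", b"numbers")
print(fs.read_file("/projects/x/plan.txt"))
print(fs.list_dir("/projects/x"))
```

An application calling `read_file` or `list_dir` never sees SQL, which is the point: the file API survives as a compatibility layer while the store underneath is a database.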
The changes this will bring will be profound. By having everything within one store, it will be possible to treat all data as if it were one data type - all queries can be "in process", and thus much faster than today's inefficient mish-mash of queries that spread across a wide number of stores. The performance of servers and desktops in 2005 will be dramatically higher than today, especially in the 64-bit world. So issuing huge queries against a uniform store will be entirely possible. From the user's perspective, nothing could be simpler - "find me stuff to do with project X, where the management team includes Fred and Emily" is not fanciful at all - after all, English Query is now a robust technology in the SQL Server product group.
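As a rough illustration of what a single query over a uniform store might look like, the sketch below puts mail and documents in one table and the management teams in another, then answers the "Fred and Emily" question in one pass. Everything here is assumed for the example: sqlite3 again stands in for the store, and the `items` and `project_managers` tables, the sample rows and the `find_items` function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, kind TEXT, title TEXT, project TEXT);
CREATE TABLE project_managers (project TEXT, person TEXT);
INSERT INTO items (kind, title, project) VALUES
  ('mail', 'Kickoff notes', 'X'),
  ('document', 'Budget.xls', 'X'),
  ('mail', 'Lunch?', 'Y');
INSERT INTO project_managers (project, person) VALUES
  ('X', 'Fred'), ('X', 'Emily'), ('Y', 'Fred');
""")


def find_items(conn, managers):
    """Return (kind, title) for items in projects managed by ALL named people."""
    managers = sorted(set(managers))  # drop duplicates so the count check holds
    placeholders = ",".join("?" for _ in managers)
    sql = f"""
        SELECT i.kind, i.title
        FROM items AS i
        WHERE i.project IN (
            SELECT project
            FROM project_managers
            WHERE person IN ({placeholders})
            GROUP BY project
            HAVING COUNT(DISTINCT person) = ?
        )
        ORDER BY i.id
    """
    return conn.execute(sql, (*managers, len(managers))).fetchall()


results = find_items(conn, ["Fred", "Emily"])
print(results)
```

Mail and documents come back from the same query because they live in the same store; with separate stores, the same question would need one lookup per store and a merge step in the application.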
From a historical perspective, the vision of Cairo will have truly come full circle. Back then, Microsoft promised a proper directory service, messaging, cross-machine working, a fluid and customisable user interface experience, and unified storage. Active Directory gives the first, Exchange Server gives the second, Common Runtime/SOAP/XML gives the third, HTML/XML/XSL/IE gives the fourth, and Yukon's children give the last.
It really is a case of Cairo Comes Home. ®