Writing history with Microsoft's Office lock-in

No XML please, we're arbitrary


Sometimes, very small decisions can have a very big impact on how people work in the future. So join us, on a journey into the future: a story that begins with a little fudge.

In a little-noticed move, Microsoft has slid on its commitment to produce open standard file formats for its Office products.

By maintaining a proprietary binary format that frequently changes, Microsoft has kept the exit costs high for potential defectors. However, Microsoft has for a long time touted its investment in XML as a sign of its commitment to openness.

You must remember that XML has always had a "feature" which distinguishes it from SGML, its much more complicated publishing predecessor. SGML insisted on leaving nothing to chance, but an XML parser can, even with no DTD (Document Type Definition) file to consult, happily munch its way through a "well formed" XML document, leaving alone the many elements which have not been defined.
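For the curious, here is a minimal sketch of that relaxed attitude, using the XML parser in Python's standard library. The two sample documents and the helper name `is_well_formed` are our own inventions for illustration and have nothing to do with WordML or anything Microsoft ships. The parser is given no DTD and no schema; it swallows a tag nobody ever defined, and only objects when the tags are closed in the wrong order.

```python
import xml.etree.ElementTree as ET

# A made-up document: no DTD, no schema, and a tag nobody ever defined.
WELL_FORMED = "<memo><VULTURE>Top Secret</VULTURE><body>Hello</body></memo>"

# Much the same thing, but with the tags closed in the wrong order.
MALFORMED = "<memo><VULTURE>Top Secret</memo></VULTURE>"


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text, False if it raises."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(WELL_FORMED))  # True
print(is_well_formed(MALFORMED))    # False
```

Note what the parser did not do: it never asked what a VULTURE is, or whether a memo is allowed to contain one.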

"Well formed" means that the document will parse without errors - it doesn't mean that the document will make any sense.

Some of our schemas are missing

Microsoft has made a curious choice. It has backed away from implementing an OASIS-defined industry standard by flying a populist flag. Microsoft will offer "freedom" to its users by letting them roll their own schemas.

Microsoft has done so by playing a six-cup shell game. There will be six versions of Microsoft Office 2003, but only two will support user-defined schemas. Can you guess under which two cups the schemas are hiding?

We'll tell you. Office Enterprise and Office Professional. As Joe Wilcox notes in this article, it's the first time such important functionality has been isolated in one variant of the suite.

For the rest of the time, you will be using Microsoft's own schema, WordML. But this is only open in the sense that XML is open.

So when you read a statement from Redmond (via Joe) that, "...when you are using Word in Office XP or the Standard version of Office 2003, the WordML--Microsoft's XML schema, which is 100 percent compliant with industry standards for XML--is saving the formatting of the Word doc," you can hear the sound of a wooden nose growing [*].

A splendid summary of the state of affairs can be found at XML Deviant, a column penned by Kendall Grant Clark.

Clark cites Mike Champion, who asks, "what is the point of storing data in XML if the schema [WordML] is so hideous and proprietary that no one can use it without proprietary API support?"

So in the future, you may be faced with two flavors of nonsense: XML Word documents that have been mangled by Microsoft's XML-creation tools, and XML Word documents that have been mangled by users who add their own non-standard elements (such as our Top Secret "VULTURE" tag).

Put your hands where we can see them

Now then. Microsoft argues, with some justification, that its binary Office format is superior technology to "open" and interoperable Unix file systems. The Unix people have barely got round to starting to discuss a Peace Process for Metadata. Microsoft offers a richer format: it supports multiple data streams, and allows all kinds of interesting compound documents to be created.

But if Microsoft had taken note of the responsibilities that go with the power it wields, it would have documented the format and submitted it to a recognized standards body. It could then compete on its own skills as the best implementer of its home grown format.

No XML please, we're arbitrary

(Kendall's must-read column goes on to other areas, such as the quality of WordML, and the market power that Microsoft as a producer of XML content will have on the language, which is an interesting discussion in itself.)

The user-defined schemas come with a very curious choice of name.

Forgive us for taking part in what looks like a semantic Jihad in recent weeks - yes, there are other useful ways of looking at the world - but sometimes the choice of language tells us a lot.

Microsoft calls these user-defined schemas "arbitrary schemas".

Remember me not

A very telling quote in Joe's piece comes from Jean Paoli, XML tribal elder and Microsoft's man in XML-land.

Paoli appears to have given up the pretense of Microsoft using XML as a document format at all.

"I'm out of the business of creating formats. Our focus on Office is on data exchange."

Data exchange. There's a good subject.

Let's add the factor "time" into the context. It's already quite hard for you to read EBCDIC documents, unless you have terminal access to an IBM mainframe - or the right IBM mainframe - as there were several EBCDICs and not all were compatible with each other. (Sound familiar?)
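To make the point concrete, here is a small sketch using two of the EBCDIC code pages that ship with Python's standard codecs, `cp037` and `cp500`. The sample text is our own, and we are assuming these two code pages are fair stand-ins for the family squabble described above. The same text is written under one code page and read back under the other.

```python
# Two of the EBCDIC code pages that ship with Python's standard codecs.
CODE_PAGES = ("cp037", "cp500")

text = "memo[1]!"

# Encode the same text under each code page.
encoded = {page: text.encode(page) for page in CODE_PAGES}

# Plain letters agree between the two code pages...
letters_agree = "memo".encode("cp037") == "memo".encode("cp500")

# ...but the brackets and the exclamation mark do not, so a document
# written under one code page comes back altered under the other.
read_wrong = encoded["cp037"].decode("cp500")

print(letters_agree)       # True
print(read_wrong == text)  # False
```

Nothing fails and nothing warns. The letters survive, the punctuation quietly turns into something else, and the reader is left to guess which EBCDIC the author had in mind.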

Simon Phipps, who works for Sun but here is speaking for himself, makes an important point:

"We continue to live in a world where all our know-how is locked into binary files in an unknown format. If our documents are our corporate memory, Microsoft still has us all condemned to Alzheimer's."

He has identified that if we want our data to live on, we need Microsoft to live on too, to help us read it.

So regarding data exchange, who is exchanging what with whom here?

We need our history and our historians. And by ensuring data formats are vendor specific, we're already defining the constraints under which future historians will operate. ®

[*] Creative readers are encouraged to submit entries for what this may sound like, please - no files larger than 35kb.



Biting the hand that feeds IT © 1998–2021