Swiss startup balesio, staffed by all of nine people, has devised a penalty-free way of reducing unstructured data file sizes without altering the original file format, meaning no rehydration or decompression is needed to read the reduced size files.
Its Native File Optimisation (NFO) software technology analyses unstructured data files and re-stores their contents in a visually lossless manner with a smaller file size. The optimised and reduced files can still be read by their originating applications, such as Microsoft PowerPoint, SharePoint and Excel.
This approach is very much like that of the Dell-acquired Ocarina, which optimised and reduced the size of various image formats in a visually lossless manner but required an Ocarina reader to read the optimised files.
Christoph Schmid, balesio's chief operating officer and sales VP, says Ocarina started with image optimisation and then moved into unstructured files, whereas balesio started with Microsoft format unstructured data files – now including PowerPoint, Excel and SharePoint – and has progressed into PDFs and various image formats.
Schmid says balesio's NFO software can recover 50 to 95 per cent of an unstructured file's disk capacity by storing its contents more efficiently. For example, repeated elements on a PowerPoint deck, such as logos, need only be stored once. Imported image files often have colour and resolution attributes that can be scaled back, reducing the image's file size without compromising its visual integrity for the human eye.
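To make the "store repeated elements once" idea concrete, here is a minimal sketch, not balesio's actual method. It assumes the deck is a ZIP-based container (as .pptx files are) and simply hashes each member to count how many bytes are taken up by repeat copies of identical content. The function names and the demo deck's member names are hypothetical, invented for illustration.

```python
import hashlib
import io
import zipfile


def duplicate_payload_bytes(container: bytes) -> int:
    """Return the uncompressed bytes held by repeat copies of identical members."""
    seen = set()
    wasted = 0
    with zipfile.ZipFile(io.BytesIO(container)) as archive:
        for info in archive.infolist():
            data = archive.read(info.filename)
            digest = hashlib.sha256(data).hexdigest()
            if digest in seen:
                # An identical payload was already counted once, so this copy
                # is a candidate for being stored only once.
                wasted += len(data)
            else:
                seen.add(digest)
    return wasted


def make_demo_deck() -> bytes:
    """Build a toy ZIP container holding the same 'logo' three times (illustrative only)."""
    logo = b"LOGO" * 1000  # 4,000 bytes standing in for an image
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr("ppt/media/image1.png", logo)
        archive.writestr("ppt/media/image2.png", logo)
        archive.writestr("ppt/media/image3.png", logo)
        archive.writestr("ppt/slides/slide1.xml", b"<slide/>")
    return buf.getvalue()


if __name__ == "__main__":
    print(duplicate_payload_bytes(make_demo_deck()), "bytes are repeat copies")
```

A real optimiser would also have to rewrite the references that point at the removed copies, and the image rescaling Schmid describes is a separate step that this sketch does not attempt.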
The company produces various FILEminimizer software applications, which can be run on PCs and servers to reduce file sizes. Developers have access to an SDK. The company started up in 2006 providing eLearning products, and then looked for ways to reduce the size of the files involved. Balesio focused on this application, became incorporated in 2008 and has shipped – wait for it – more than 4.5 million copies of its FILEminimizer software. It claims up to 2,000 large accounts are using multiple copies of its applications. The majority are in Europe with some in Japan and America, and include General Electric, Lafarge, and Hyundai. Word of mouth has been responsible, Schmid says, for multiple purchases within balesio's customers.
As a consequence of this sales volume, balesio is entirely self-funded and profitable – it is a venture-capital-free zone.
Balesio's claims are that its technology is highly efficient, completely open and has a one-shot approach, optimising and reducing a file once and forever, with no lock-in to a balesio reader. It says it optimises primary file data.
We asked how efficient it was compared to NetApp's A-SIS deduplication. Schmid said that, according to reports, A-SIS can return 60 per cent dedupe ratios for VMware virtual machine files but only 5 to 10 per cent for general unstructured data files.
We can intuitively conceive that A-SIS, as a block-level deduplication technology, is not file-content-aware in the way balesio is. Schmid sums up the NetApp-balesio optimisation efficiency comparison like this:
Taking a "classic" primary storage share with 75-80 per cent unstructured files, we can achieve a data reduction of 50-85 per cent of that, compared to the 5-10 per cent that NetApp A-SIS is doing. Even if the remaining 20 per cent could be massively deduped by NetApp, I believe it would not achieve our level of realised storage space savings.
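Schmid's arithmetic can be checked with a rough sketch. The function below is hypothetical and only weights each reduction rate by its share of the storage pool. Applying the 60 per cent VMware figure to the non-unstructured remainder is our assumption for "massively deduped", not something Schmid stated, and the balesio cases assume no saving at all on that remainder.

```python
def overall_saving(unstructured_share, unstructured_reduction, other_reduction=0.0):
    """Fraction of total capacity saved across the whole share.

    unstructured_share     -- fraction of the share that is unstructured files
    unstructured_reduction -- fraction saved on those unstructured files
    other_reduction        -- fraction saved on everything else (assumed)
    """
    other_share = 1.0 - unstructured_share
    return (unstructured_share * unstructured_reduction
            + other_share * other_reduction)


if __name__ == "__main__":
    # balesio, worst and best case from the quoted ranges, remainder untouched
    balesio_low = overall_saving(0.75, 0.50)
    balesio_high = overall_saving(0.80, 0.85)

    # A-SIS at its quoted best on unstructured data (10 per cent), with the
    # remainder deduped at an assumed 60 per cent
    asis_best = overall_saving(0.75, 0.10, other_reduction=0.60)

    print(f"balesio: {balesio_low:.1%} to {balesio_high:.1%}")
    print(f"A-SIS (generous assumption): {asis_best:.1%}")
```

On these assumptions the generous A-SIS case saves roughly 22.5 per cent of the whole share, against 37.5 per cent for balesio's lowest quoted case, which is consistent with Schmid's claim.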
We note balesio optimises within a file, without looking for or finding redundancies across several files or across multiple balesio instances. This is an intrinsic feature of the product as it optimises the way an application stores data and removes redundant information within a file, rather than looking for repeating patterns of data within a data stream as simple compression does, or repeating patterns of information across multiple files or block groups, as deduplication technology does.
The company says it can flatten the rate of unstructured data growth in storage capacity terms and does so, it appears, better than any other supplier. It says that it actually helps performance, instead of hindering it, because smaller files are quicker to load, faster to back up, and consume fewer network resources when sent between computers, whether within or between data centres, or from a data centre to a hosting centre or the cloud.
There are free trial offers of up to 12 optimisations on balesio's website. A single user FILEminimizer Office licence costs £34.95 and multi-user licence costs have volume discount curves. It seems worth a trial at least to see if you can turn your giant unstructured data files into reduced gnomic Swiss instances of their former selves. ®