It's all got complicated: The costs of data recovery

Do the sums, folks. You won't be disappointed

Data protection companies have multiplied in the past 10 years and they are locked in a bitter battle for market share.

Before Amazon arrived on the scene, the main options were owning all links in the data protection chain or renting or leasing appliances, with or without attached offsite services.

Thanks to Amazon-style public clouds we have a third option: storage owned by someone else. This comes in three varieties: the cloud provider offers a software/service combo; you provide the software and rent the cloud storage; or a third-party provider offers both software and storage.

While the basic "how" of data protection has evolved little beyond these three options, there has been a proliferation of different ways to license it.

Mine, all mine

The easiest and simplest type of data protection is the one you own from start to finish. You supply two sites, you buy the hardware that fills them, you purchase the software that makes the backups do their thing and you supply the bandwidth to connect them. When your storage is full, you buy more, whether that is working storage or your data protection array.

Simple enough – except for the part where it is not. All the little details of everything intersect with all the details of everything else. And trying to figure out what you need to buy can be a mind-altering omnishambles.

How often does everything need to be backed up? Different documents from different sources have different relevance half lives, and not all databases are created equal.

The local file share with all the safety PDFs that nobody reads gets updated once a year before the auditor shows up, so you are probably fine if you don't back that up every night. But the financials database needs to be as close to real time as possible lest you irritate the people who sign your pay cheques.

What does your second site look like? If you are a small company it might look a lot like a drawer in the sysadmin's garage. If you are big enough, that second site could be a shiny data centre with all the fixings and its own guard detail. Or it could be anything in between.

What your second site is determines how you get the data there. Do you have to send a tape or hard drive by courier or have someone take it home? Can you schlep it across the internet? What's the bandwidth like between the sites?

Are you rotating tapes or using a fixed storage device? Are you using deduplication or compression at either end? What is providing this service – the storage device or the data protection software?

And how do these considerations affect the licensing of the data protection software? Is software charged per server? Per TB? Per feature, for example extra for deduplication?

Could your supplier charge you several times per server – say, one licence for an SQL instance on a server and another to back up Windows files?
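To see how much the licensing model alone can move the bill, here is a minimal sketch that prices the same estate two ways. Every figure and name in it (`Estate`, `PER_SERVER`, `PER_SQL_AGENT`, `PER_TB`, `DEDUPE_ADDON`) is an invented placeholder for illustration, not a real vendor quote.

```python
from dataclasses import dataclass


@dataclass
class Estate:
    servers: int          # machines being protected
    protected_tb: float   # total data under protection
    sql_instances: int    # workloads that may need their own licence
    wants_dedupe: bool    # whether the deduplication add-on is needed


# Hypothetical annual prices, made up for this example.
PER_SERVER = 400.0
PER_SQL_AGENT = 250.0
PER_TB = 120.0
DEDUPE_ADDON = 1500.0


def per_server_cost(e: Estate) -> float:
    """Per-server licensing, with a second charge for each SQL instance."""
    cost = e.servers * PER_SERVER + e.sql_instances * PER_SQL_AGENT
    return cost + (DEDUPE_ADDON if e.wants_dedupe else 0.0)


def per_tb_cost(e: Estate) -> float:
    """Per-TB licensing, with deduplication as a paid feature."""
    cost = e.protected_tb * PER_TB
    return cost + (DEDUPE_ADDON if e.wants_dedupe else 0.0)


estate = Estate(servers=20, protected_tb=40.0, sql_instances=4, wants_dedupe=True)
print(per_server_cost(estate))  # 10500.0
print(per_tb_cost(estate))      # 6300.0
```

With these made-up numbers the per-TB model wins, but add a few dozen terabytes of file shares and the answer flips.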

Change one thing and it cascades. The costs have to be redone, the calculations rerun from the beginning. Owning the whole stack makes it easy to know what the costs are and that they won't change – but it can also lead to nervous breakdowns.

Appliances to let

And so the appliance was born. Sure, you could roll your own software, but why bother? With an appliance you can just buy a widget from a data-protection provider and get unlimited protection, features, servers and what have you.

That is until the appliance is full, and then you need another appliance, the cost of which bears little relationship to the hardware it runs on.

The appliance will almost certainly be a re-badged Supermicro affair and the actual cost per raw TB is materially insignificant. Unfortunately, data protection companies need to pay their staff too so they must add margin on top of the hardware.

The margin can be as high as five times the cost of the hardware, and that makes the incremental upgrade cost hard to swallow for smaller businesses.
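A rough sum shows why that stings. The sketch below assumes the margin is added on top of the hardware cost, so a margin of five times the hardware means paying six times the hardware price in total. The $4,000 box and its 48 raw TB are hypothetical figures.

```python
def appliance_price(hardware_cost: float, margin_multiple: float) -> float:
    """Hardware cost plus a vendor margin expressed as a multiple of that cost."""
    return hardware_cost * (1.0 + margin_multiple)


def price_per_raw_tb(hardware_cost: float, margin_multiple: float, raw_tb: float) -> float:
    """What each raw terabyte costs once the margin is included."""
    return appliance_price(hardware_cost, margin_multiple) / raw_tb


# Hypothetical appliance: $4,000 of hardware holding 48 raw TB.
print(appliance_price(4000.0, 5.0))         # 24000.0
print(price_per_raw_tb(4000.0, 5.0, 48.0))  # 500.0
```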

In addition, you usually still have to worry about that pesky second site. You might choose to put the appliance on the second site and stream the data over the internet as the backups occur.

You might back up to a local appliance and sync with an offsite one, or you might have a data protection gateway that you back up to locally, which then deduplicates and compresses the data before firing it off to a second site.

And that second site could again be anything. It could be a second location your company owns, some space in a colocation facility, or perhaps a copy of the data is kicked up to a public cloud provider, either by you as a target or as part of a service offered by the appliance vendor.

Decisions, decisions. Appliances take a lot of the complexity out of calculating what you will ultimately pay but they do not eliminate all of it, and they are not necessarily cheaper than rolling your own.

Cloud provider appeal

In this context, the cloud provider solution starts to make a lot of sense, even if it is likely to cost you more than you bargained for.

When the cloud provider supplies the data protection software, you are rarely nickel-and-dimed for features. You get software (or a cloud gateway appliance) that does X and it will cost you Y per TB per month. There is not a lot of wiggle room on the interpretation side, nor on the implementation side.

At first blush, data protection services operated by the cloud provider are surely a sanity-saving salve for swamped sysadmins? If only.

The problem is that you still have to get the data there. That means you are still provisioning the bandwidth to schlep everything to your second site, which in this case happens to be the cloud. And when a restore event does occur will you get back the relevant data in time to be useful?

There is also that awkward part where all your backups are now just a cracked password away from going bye-bye, or from having someone download everything and read through all your preciouses.

Double up

Two-factor authentication is a good start, but for the critical stuff you might consider more than one cloud provider.

Also factor in that data protection storage never gets smaller. If you pay $5 this month, next month it will be $5 plus whatever was added. The month after that it will be $5 plus whatever was added plus whatever was added after that. So on and so forth forever.
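The ratchet is easy to put into numbers. This is a minimal sketch assuming a flat price per TB and a constant amount of new data each month. The 10TB starting point, 2TB monthly growth and $5 per TB are illustrative figures only.

```python
def monthly_bills(start_tb: int, added_tb_per_month: int,
                  price_per_tb: int, months: int) -> list:
    """Bill for each month when stored data only ever grows."""
    bills = []
    stored = start_tb
    for _ in range(months):
        bills.append(stored * price_per_tb)
        stored += added_tb_per_month  # nothing is ever deleted
    return bills


bills = monthly_bills(start_tb=10, added_tb_per_month=2, price_per_tb=5, months=12)
print(bills[0], bills[-1])  # 50 160
print(sum(bills))           # 1260, against 600 if the data had stayed at 10TB
```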

Oh, and then there is the cost of restores. Cloud service providers can be somewhat asymmetrical about the cost of restoring data. Right when the chips are down and you are at your most desperate, they hit you with a restore fee that is substantially more than the cost of storage or bandwidth to get the stuff up there. Ouch.

Of course, the data protection companies saw a way here of not being driven out of business by the public cloud providers, and thus we have our last category of data protection to consider.

Public cloud storage gets cheaper when you buy in bulk and are prepared to make longish-term commitments. A file storage company such as Sync, my preferred provider, can buy its storage from cloud providers in bulk and manage it with its own software.

Sync uses this purchasing power to provide a secure Dropbox-like service and do so at ridiculous prices such as $50 a year for 500GB of storage.

Why can it do this? Because nobody uses that full 500GB, because it can compress data as it goes into the storage and because one person not using up all the bandwidth factored in to that $50 a year means someone else can go a little over.

It is sort of like the way we used to buy insurance before some git legalised tracking us 24/7, so that insurers no longer carry any real risk or share the costs among subscribers.

It should be clear where I am going with this: the world's data protection companies have glommed onto public cloud storage as a means of eliminating a big whack of complexity for the customer, but are finding they can only provide value by engaging in a price war. No, the data protection industry never did learn anything from history.

At the moment, most data protection companies are not quite so far along. They charge you based on the cost of storage to them, with a surcharge on top for software and support services. They typically bury the recovery costs charged by the service provider by assuming that everyone will have to recover 100 per cent of their data X number of times a year.
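As a sketch of how that bundling might work, assume the provider spreads the expected annual restore fees evenly over 12 monthly bills. The function name and all prices are hypothetical, and real providers may amortise the cost differently.

```python
def bundled_price_per_tb_month(storage_cost: float, surcharge: float,
                               restore_fee_per_tb: float,
                               full_restores_per_year: float) -> float:
    """Monthly price per TB with assumed full restores baked in."""
    # Restore fees the provider expects to pay per TB per year,
    # spread evenly across the 12 monthly bills.
    amortised_restore = restore_fee_per_tb * full_restores_per_year / 12
    return storage_cost + surcharge + amortised_restore


# Hypothetical figures: $4 storage, $3 software and support,
# $24 per TB restore fee, one full restore assumed per year.
print(bundled_price_per_tb_month(4.0, 3.0, 24.0, 1))  # 9.0
```

In this example a customer who never restores anything still pays the $2 per TB per month set aside for restores.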

It's a battle

In the end, data protection companies eke out only a fixed margin above the costs of the raw cloud storage. Since they all long ago gave up on ease of use as a differentiator, and everyone more or less has the same features, you end up with a huge swathe of data protection providers that all cost about the same and deliver roughly the same product.

Since the public cloud providers don't offer a whole lot of flexibility in how they price storage, bandwidth or restore (especially if you are using "slow" storage, like Amazon's Glacier), the world's data protection companies have decided that the new battleground is to be around the methods by which they tweak that risk-sharing algorithm.

Among these providers, another Canadian company, Asigra, catches my eye. Asigra has decided to take a middle road between the sort of raw pricing you get from a cloud provider directly and the fully shared cost model used by its competitors.

Asigra examines your backup and recovery events and assigns your organisation a recovery performance score (RPS), which it measures on a scale of 0-10. The RPS is basically a ranking of what percentage of your total data you needed to recover that year.

The less you recover, the lower your cost per TB of backup storage. Only successful recoveries count towards the RPS and the single largest recovery event in each term is excluded from the calculation.

The comparison between data protection and insurance makes a lot more sense now. Asigra's approach sounds rather a lot like the grid system my province in Canada adopted to prevent some pretty ruthless discrimination by our insurance companies. It is an approach that worked out rather well in the end.

Now, some companies might freak out a bit about this model because they do recovery drills. These may require restoration of 100 per cent of the data, and wouldn't that push a company into the highest RPS bracket?

Apparently, Asigra has that covered too, with recovery drills being priced separately from production storage and recovery events and not contributing to the RPS position at all.
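The rules described here can be sketched in a few lines. To be clear, this is not Asigra's formula: the linear mapping from fraction recovered to a 0-10 score is an assumption made for illustration, as are the `RecoveryEvent` type and the `recovery_score` function. Only the three exclusions (failed recoveries, drills and the single largest event) come from the description above.

```python
from dataclasses import dataclass


@dataclass
class RecoveryEvent:
    tb_recovered: float
    successful: bool = True
    is_drill: bool = False


def recovery_score(events, total_protected_tb: float) -> float:
    """Illustrative 0-10 score from the share of data recovered in a term."""
    # Only successful, non-drill recoveries count.
    counted = sorted(
        e.tb_recovered for e in events if e.successful and not e.is_drill
    )
    # The single largest counted event is excluded.
    counted = counted[:-1]
    fraction = sum(counted) / total_protected_tb
    # Assumed linear mapping onto 0-10; the real scale is not published here.
    return min(10.0, round(10.0 * fraction, 1))


events = [
    RecoveryEvent(5.0),
    RecoveryEvent(20.0),                    # largest, so excluded
    RecoveryEvent(10.0, successful=False),  # failed, so ignored
    RecoveryEvent(100.0, is_drill=True),    # drill, so ignored
    RecoveryEvent(15.0),
]
print(recovery_score(events, total_protected_tb=100.0))  # 2.0
```

Under these assumptions the full-restore drill and the 20TB event leave the score untouched, which is the behaviour the pricing model promises.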

So the pricing model war of the cloud storage era has officially begun. I suspect that cloud backups are about to evolve from a one-size-fits-all model into an industry where you can find a financial model that is actually tailored to your company's needs and to the business value. ®

Biting the hand that feeds IT © 1998–2022