Mmm, yes. 11-nines data durability? Mmmm, that sounds good. Except it's virtually meaningless

No one can agree on how it's calculated


Analysis What do data durability numbers actually mean? Azure boasts of 12 and even 16 nines of durability, while Amazon S3, Google Cloud Platform, and Backblaze tout 11 nines.

Data durability is a fancy way of promising you'll keep someone's data intact, and not allow the bits and bytes to degrade through media decay, drive loss, array loss, data center loss, power loss, or some other corrupting influence. Offering 99.999999999 per cent annual durability means you expect to lose 0.000000001 per cent of stored stuff a year.
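
To make that arithmetic concrete, here is a minimal Python sketch. It assumes "N nines" means each stored object has an independent annual loss probability of 10^-N, which is only one possible reading. The function names and the ten-million-object store are illustrative, not taken from any provider.

```python
def annual_loss_fraction(nines: int) -> float:
    """Fraction of stored objects expected to be lost per year.

    ASSUMPTION: "N nines" of durability means a per-object annual
    loss probability of 10**-N.
    """
    return 10.0 ** -nines


def expected_annual_losses(objects: int, nines: int) -> float:
    """Average number of objects lost per year under that assumption."""
    return objects * annual_loss_fraction(nines)


def years_per_lost_object(objects: int, nines: int) -> float:
    """Average number of years between single-object losses."""
    return 1.0 / expected_annual_losses(objects, nines)


if __name__ == "__main__":
    objects = 10_000_000  # hypothetical store size
    print(f"Loss fraction per year: {annual_loss_fraction(11):.0e}")
    print(f"Objects lost per year:  {expected_annual_losses(objects, 11):.6f}")
    print(f"Years per lost object:  {years_per_lost_object(objects, 11):,.0f}")
```

Note that the fraction 10^-11 is the same quantity as 0.000000001 per cent; the two differ only by the factor of 100 between a fraction and a percentage.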

There are two general ways to improve data durability. The first is to use algorithms, along with extra information about the data, to detect corruption and restore files and objects if some portions are lost to bit rot. Erasure coding is one such method, and Reed-Solomon coding is a widely used form of it.
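
As a rough illustration of the idea, the sketch below uses a single XOR parity fragment, the simplest possible erasure code. It is not Reed-Solomon and can only rebuild one lost fragment. The function names and sample data are made up for the example.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("fragments must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(fragments: List[bytes]) -> bytes:
    """Extra information stored alongside the data: XOR of all fragments."""
    return reduce(xor_bytes, fragments)


def recover(fragments: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild exactly one missing fragment (marked as None)."""
    missing = [i for i, frag in enumerate(fragments) if frag is None]
    if len(missing) != 1:
        raise ValueError("single parity can rebuild exactly one lost fragment")
    survivors = [frag for frag in fragments if frag is not None]
    # XOR of the parity with every surviving fragment yields the lost one.
    rebuilt = reduce(xor_bytes, survivors, parity)
    restored = list(fragments)
    restored[missing[0]] = rebuilt
    return restored


if __name__ == "__main__":
    data = [b"dura", b"bili", b"ty!!"]
    parity = make_parity(data)
    damaged = [data[0], None, data[2]]  # pretend one drive died
    print(recover(damaged, parity))
```

Reed-Solomon schemes generalize this by storing several parity fragments, so that more than one simultaneous loss can be repaired.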

The second way is to store multiple copies of the data across multiple locations, allowing you to overcome individual drive and array failures all the way to data centers being flooded, torched by rioters, shattered by earthquakes, or eating a nuke. This is redundancy.

Given these two approaches are standard for hyperscale cloud giants, how do these providers calculate their data durability? Good question. We at least know the result represents the period of time you would need to wait before some data is lost. For example, Amazon states for its S3 cloud storage service:

Amazon S3 Standard, S3 Standard–IA, S3 One Zone-IA, and Amazon Glacier are all designed to provide 99.999999999 per cent durability of objects over a given year. This durability level corresponds to an average annual expected loss of 0.000000001 per cent of objects. For example, if you store 10,000,000 objects with Amazon S3, you can on average expect to incur a loss of a single object once every 10,000 years.

As you can see, it's defined here as the length of time we would have to wait before an object we store in S3 is lost, and represented statistically. Another way of saying it, according to Backblaze this week, is if you store one million objects for ten million years, you would expect to lose one file in that time. That's what 11 nines durability means, according to Backblaze.

Meanwhile, David Friend of Wasabi blogged late last year: "If you gave Amazon or Wasabi 1 million files to store, statistically they would lose one file every 659,000 years." Friend added that if you store 1PB of data, with 1.2 billion objects, with 11 nines of durability, you would lose 0.12 files per year, meaning one file lost every eight years. "The problem is that you won't know you've lost files until you try to use them," he noted.

Great, now we have different interpretations of 11 nines data durability. Which is right? And how is it really calculated? Well, here's the rub: there is no industry standard way to calculate it. We can't reliably compare AWS, Azure, Backblaze, GCP, or Wasabi data durability numbers to see which is best or worst or the costliest.
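
One way to see the disagreement is to take each quoted example at face value and work backwards to the per-object annual loss rate it implies. The Python sketch below does that, assuming "nines" means the negative base-10 logarithm of that rate. The helper name and the labels are illustrative.

```python
import math


def implied_nines(objects: float, losses: float, years: float) -> float:
    """Nines implied by losing `losses` objects out of `objects` over `years`.

    ASSUMPTION: nines = -log10(per-object annual loss probability).
    """
    annual_loss_per_object = losses / (objects * years)
    return -math.log10(annual_loss_per_object)


# (objects stored, objects lost, years elapsed), as quoted in the article
CLAIMS = {
    "Amazon S3: 10m objects, one lost per 10,000 years": (10_000_000, 1, 10_000),
    "Backblaze: 1m objects, one lost per 10m years": (1_000_000, 1, 10_000_000),
    "Wasabi: 1m files, one lost per 659,000 years": (1_000_000, 1, 659_000),
    "Wasabi: 1.2bn objects, 0.12 lost per year": (1_200_000_000, 0.12, 1),
}

if __name__ == "__main__":
    for label, (objects, losses, years) in CLAIMS.items():
        print(f"{label}: {implied_nines(objects, losses, years):.2f} nines")
```

Under that assumption, the four statements do not all work out to the same number of nines, which is the point: each one is described as "11 nines" all the same.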

The probabilities of data loss can vary with the number of file and object fragments, drives used, failure rates, and rebuild time. Drive failure rates are tricky to factor in, as disks exhibit a bathtub curve effect – having a higher likelihood of failure when they are first turned on and at the forecasted end of their usable life.

In explaining one way to calculate data durability, Backblaze CTO Brian Wilson presented a Poisson distribution method using drives that have, for simplicity's sake, a constant failure probability over their life. He assumed the average rebuild time to achieve complete parity for any given Backblaze B2 object with a failed drive is 6.5 days. He also took the annualized failure rate of a drive to be 0.81 per cent, which is cut to an effective 0.41 per cent by having an outside agency, DriveSavers, recover some data from failed drives.

The annualized drive failure rate is therefore 0.41 per cent, or 0.0041 expressed as a fraction. Backblaze can tolerate three drive failures occurring before the first failed drive is rebuilt; under this model, data is lost only if a fourth drive fails within that window. The result is 11 nines. Backblaze also published on GitHub its method for using a binomial distribution to calculate durability.
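
The Python sketch below shows how a calculation of this kind can be set up, using the 0.41 per cent failure rate, the 6.5-day rebuild window, and the three-failure tolerance quoted above. It is not Backblaze's published code. The group size of 20 drives is an assumption not stated in this article, and treating the year as a series of independent rebuild windows is a simplification.

```python
import math

AFR = 0.0041           # annualized drive failure rate (0.41 per cent), from the article
REBUILD_DAYS = 6.5     # average rebuild window in days, from the article
TOLERATED = 3          # failures survivable within one window, from the article
DRIVES_PER_GROUP = 20  # ASSUMPTION: drives sharing one object's fragments


def window_failure_prob(afr: float = AFR, days: float = REBUILD_DAYS) -> float:
    """Chance that one drive fails during a single rebuild window."""
    return afr * days / 365.0


def poisson_tail(lam: float, k_min: int, terms: int = 40) -> float:
    """P(X >= k_min) for X ~ Poisson(lam).

    Summed term by term, because 1 - P(X < k_min) loses precision
    when the tail is this small.
    """
    return sum(math.exp(-lam) * lam ** k / math.factorial(k)
               for k in range(k_min, k_min + terms))


def binomial_tail(n: int, p: float, k_min: int) -> float:
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
               for k in range(k_min, n + 1))


def annual_loss(per_window_loss: float, days: float = REBUILD_DAYS) -> float:
    """Chance of at least one fatal window in a year of independent windows."""
    windows = 365.0 / days
    return -math.expm1(windows * math.log1p(-per_window_loss))


def nines(loss_probability: float) -> float:
    return -math.log10(loss_probability)


if __name__ == "__main__":
    p = window_failure_prob()
    fatal = TOLERATED + 1  # one failure more than the group can absorb
    poisson = annual_loss(poisson_tail(DRIVES_PER_GROUP * p, fatal))
    binomial = annual_loss(binomial_tail(DRIVES_PER_GROUP, p, fatal))
    print(f"Poisson estimate:  {nines(poisson):.2f} nines")
    print(f"Binomial estimate: {nines(binomial):.2f} nines")
```

With these inputs, both estimates land close to 11 nines. Changing the assumed group size, rebuild time, or failure rate moves the answer, which is part of why numbers from different providers are hard to compare.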

Wilson said it is likelier that other things will happen before a cloud storage system loses its data to bit rot; for example, a data center could be blown up during armed conflict. Earthquakes, floods, asteroids, pests, and other "acts of God" could destroy one or multiple facilities. Or there could be a prolonged credit card billing problem, and your account data is deleted as a result of non-payment. Whatever you can imagine happening, it's probably more likely than losing information to bit rot.

It's unlikely Amazon, Azure, and Google will reveal the basis of their data durability calculations just because minnow Backblaze shook a stick at them this week. The moral is that we're not necessarily comparing apples with apples when looking at costs for 11 nines data durability from cloud storage providers. Sup their data with a long spoon. ®



Biting the hand that feeds IT © 1998–2022