Back up for a minute – Backblaze HD reliability stats show oldies can be goodies
Failure can be factored in when you're doing your sums, says vendor
Cloud storage and backup provider Backblaze has released a comprehensive report detailing reliability statistics for the hard drives it operated during the whole of 2021, with an interesting finding on its older kit.
The report leaves out boot drives (which are SSDs) and focuses on Backblaze's data drives, of which the firm had 203,168 under management as of December 31, 2021. These included various models sourced from all four major vendors: HGST, Seagate, Toshiba, and WDC. Of that figure, 409 drives were also excluded, either because they were used for evaluation or because Backblaze had fewer than 60 examples of that model, leaving a total of 202,759 hard drives.
Backblaze found that the drive model exhibiting the lowest failure rate was a 6TB Seagate drive (model ST6000DX000), with an annualised failure rate (AFR) of just 0.11 per cent. These also turned out to be the oldest in the Backblaze inventory, with an average age of over 80 months.
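For readers who want to check the sums, Backblaze's published reports define AFR as failures divided by drive-years of service, where drive-years are drive days divided by 365. A minimal Python sketch of that calculation follows; the function name and the sample cohort are illustrative and not taken from the report.

```python
def annualised_failure_rate(failures: int, drive_days: int) -> float:
    """Return AFR as a percentage: failures per drive-year of service."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years * 100


# Hypothetical cohort: 1,000 drives running for a full year with 10 failures.
example_afr = annualised_failure_rate(failures=10, drive_days=1_000 * 365)
print(f"{example_afr:.2f} per cent")
```

Because the denominator is drive days rather than a simple drive count, a model that joined the fleet part-way through the year only contributes the days it was actually in service.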
The report also called out two drive models that are currently performing well but which were new to Backblaze in 2021, and so have not clocked up as many in-service hours. The 16TB WD drive cohort (model WUH721816ALE6L0) has an average age of 5.06 months and an AFR of 0.14 per cent, while the 16TB Toshiba drive cohort (model MG08ACA16TE) has an average age of 3.57 months and an AFR of 0.91 per cent.
Across all drive models, the AFR during 2021 was 1.01 per cent, slightly higher than the 0.93 per cent that Backblaze reported for 2020. However, both years were down from the rate of 1.83 per cent in 2019, which the firm said indicates that the downward trend is not an anomaly.
Looking back at all the different drive models aggregated together by manufacturer over the last three years appears to show that Seagate's failure rates have fluctuated the most on a quarter by quarter basis, while HGST drives consistently delivered a failure rate of under 1 per cent.
[Table: failure rates by manufacturer, quarter by quarter, Q1 2019 to Q4 2021]
Meanwhile, the 2021 figures show that annual failure rates for the larger drive models (12TB, 14TB, and 16TB) all come in below the annual average of 1.01 per cent across all drives, compared with 1.27 per cent for those of 10TB and below.
But the larger drives also tend to be the newer drives in service with Backblaze, and so are less likely to fail than older drives. The oldest model of large drive has an average age of 33 months, while the newest model of 10TB and below has an average age of 44.9 months.
"Right now the larger capacity drives are all performing better than the average. But, as I noted all of the large capacity drive models are much younger than the smaller capacity drives we have," said cloud storage evangelist Andy Klein, adding: "I would expect the larger capacity drives to do the same, that is, fail more often as they age."
Backblaze also managed to solve an issue it had previously reported with 14TB Seagate drives (model ST14000NM0138) failing at a higher than expected rate.
It had the failed drives examined by fault analysis specialists, and a decision to upgrade the firmware of all the units still operating saw the quarterly failure rate drop from 6.29 per cent in Q3 to 4.66 per cent in Q4, stabilising the rapid rise in failures, Backblaze said.
If any readers are wondering why Backblaze doesn't just focus on the drives that are most reliable, that isn't the way it works, because price is always a factor when procuring storage, Klein explained.
"Our architecture is built assuming drives will fail. We have multiple layers of protection to keep data safe when that happens. This allows us the flexibility to purchase drives based on price as long as they perform reasonably well in our environment," he said.
In addition, when there is a need to replace failed drives, the firm may not be able to purchase the same model again.
Backblaze said that its Vault storage architecture revolves around mixing and matching different drive models. A Vault is apparently composed of 60 tomes, with each tome made up of 20 drives. Each tome is made up of a single drive model, but the tomes making up an entire Vault may represent different drive models, and even different drive sizes. This allows the firm to be less reliant on any particular drive model.
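Taking the quoted figures at face value, the arithmetic works out to 1,200 drives per Vault. The sketch below models that layout in Python purely for illustration; the class, the constant names, and the placeholder model names are made up and say nothing about how Backblaze's own software represents a Vault.

```python
from dataclasses import dataclass

TOMES_PER_VAULT = 60  # figure quoted in the article
DRIVES_PER_TOME = 20  # figure quoted in the article


@dataclass(frozen=True)
class Tome:
    drive_model: str  # every drive in a tome is the same model
    drive_count: int = DRIVES_PER_TOME


def drives_in_vault(tomes: list[Tome]) -> int:
    """Total drives across a Vault's tomes, checking the quoted tome count."""
    if len(tomes) != TOMES_PER_VAULT:
        raise ValueError(f"expected {TOMES_PER_VAULT} tomes, got {len(tomes)}")
    return sum(t.drive_count for t in tomes)


# Hypothetical mix: two placeholder drive models sharing one Vault.
vault = [Tome("MODEL-A")] * 40 + [Tome("MODEL-B")] * 20
total_drives = drives_in_vault(vault)
models_in_use = {t.drive_model for t in vault}
print(total_drives, sorted(models_in_use))
```

Keeping the model on the tome rather than on the Vault mirrors the point made above: a single Vault can hold several drive models so long as each tome is uniform.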
The Backblaze report also includes a table of annualised failure rate figures for every drive model it had in production use as of December 31, 2021, covering a reporting period that started way back in March 2013.
According to Backblaze, the lifetime AFR for all the drives in the table is 1.4 per cent, and continues to go down year after year. At the end of 2020, the comparable figure was 1.54 per cent and a year earlier the figure stood at 1.62 per cent.
However, Backblaze cautions that some of the drives in the table have a fairly wide confidence interval (greater than 0.5) which implies that there is not enough data about that model of drive's performance to have a reasonable level of confidence in the AFR listed, either because the drives are too new or there are too few of that model deployed.
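To see why a small or young cohort produces a wide interval, here is a rough Python sketch that treats failures as a Poisson count and uses a normal approximation for a 95 per cent interval. That is an assumption made for illustration and not necessarily the method Backblaze uses, and both cohorts are invented.

```python
import math


def afr_interval(failures: int, drive_days: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate interval for AFR (per cent), assuming Poisson failures.

    Normal approximation: count +/- z * sqrt(count). It is crude for very
    small counts, so treat the result as indicative only.
    """
    drive_years = drive_days / 365
    half_width = z * math.sqrt(failures)
    low = max(failures - half_width, 0.0) / drive_years * 100
    high = (failures + half_width) / drive_years * 100
    return low, high


# Two hypothetical cohorts with the same 1 per cent AFR but different exposure.
small = afr_interval(failures=4, drive_days=400 * 365)
large = afr_interval(failures=400, drive_days=40_000 * 365)

for name, (low, high) in (("small", small), ("large", large)):
    print(f"{name}: {low:.2f} to {high:.2f} per cent (width {high - low:.2f})")
```

Both cohorts have the same point estimate, but the one with a hundred times more drive days gives a far narrower range, which is the effect the caution about new or thinly deployed models describes.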
Backblaze will be disclosing the annual failure rates for its SSDs in a separate post sometime in the next few weeks. The hard drive data is also available to download from Backblaze for anyone who wishes to analyse the figures for themselves. ®