Yes, it's true: Hard drive failures creep up as disks age

Some stats for those wondering when their models will be next

Cloud storage and backup provider Backblaze has published its latest quarterly report detailing the reliability of disk storage deployed in its datacenters.

Backblaze had a total of 219,444 hard drives and SSDs in its bit barns at the end of calendar Q2, of which 4,020 were boot drives, so the company has focused the report on the 215,424 data drives it has under management. A separate report on the reliability of the SSDs will be published later.

For the three-month period, Backblaze says that the annualized failure rate (AFR) across all drives in its datacenters increased to 1.46 percent, compared with 1.22 percent in the previous quarter and 1.01 percent a year earlier, in Q2 of 2021.

However, the company said this appears to be related simply to the aging of its entire drive fleet and it expected this AFR number to go down as older drives are retired over the coming year.

For example, Backblaze previously reported that the drive model exhibiting the lowest failure rate was a 6TB Seagate drive (ST6000DX000), also the oldest in the company’s inventory at the time. During the quarter, these drives did experience a couple of failures, but these were the first since Q3 of last year, and the drives now boast an average age of 86.7 months of service.

Hard drive quarterly failure rates for Q2. Source: Backblaze

“At some point in the future we can expect these drives will be cycled out, but with their lifetime AFR at just 0.87 percent, they are not first in line,” said Backblaze’s principal cloud storage evangelist Andy Klein.

However, the next-oldest drives in Backblaze’s portfolio are 4TB Toshiba drives (model MD04ABA400V), which have been in service for an average of 85.3 months and recorded zero failures during Q2. The AFR for these drives is just 0.79 percent, but this comes with the caveat that the lifetime confidence interval gap for them is 1.3 percent, which means Backblaze lacks enough data to be confident of the true AFR.

Other drives that Backblaze reveals as having zero failures during Q2 are an 8TB HGST model (HUH728080ALE604), as well as some 14TB and 16TB Toshiba drives (MG07ACA14TEY and MG08ACA16TA respectively). But, as with the 4TB Toshiba model, these drives have very wide confidence interval gaps because of a relatively limited number of data points.

Some older drives on Backblaze’s books that are definitely showing their age are 4TB Seagate models (ST4000DM000), which have increased their failure rate over the past four quarters and have now reached 3.42 percent. The company said it has initiated its drive cloning program for these drives as part of its data durability program, and the drives will be cycled out over the next few months.

For the lifetime failure rates of drives, Backblaze looked at a total of 215,011 hard drives, excluding those being used for testing purposes or models for which the company did not have at least 60 drives in operation.

Hard drive failure rates, Q2

Backblaze found that the lifetime AFR across all these drives is 1.39 percent, which is the same as during the previous quarter and down from the 1.45 percent found during the same quarter in 2021.

The three models with the highest failure rates in the table are an 8TB HGST model (HUH728080ALE604) at 6.26 percent; a Seagate 14TB model (ST14000NM0138) at 4.86 percent; and Toshiba’s 16TB (MG08ACA16TA) at 3.57 percent.

However, Backblaze cautions that the number of “drive days” (the number of days all the drives of a specific model were operational during the defined period) for these is on the low side, leading to a wide gap between the low and high confidence interval values and therefore lower confidence in those AFR figures.
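To see why a low drive-day count undermines confidence, here is a minimal Python sketch. It assumes AFR is calculated as failures per drive-year of operation (failures divided by drive days over 365, expressed as a percentage), which is how Backblaze has described its method; the function names and all of the figures are hypothetical and chosen purely for illustration.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return AFR as a percentage: failures per drive-year of operation.

    Assumed formula: failures / (drive_days / 365) * 100.
    """
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years * 100


def swing_from_one_failure(failures: int, drive_days: int) -> float:
    """How far AFR moves (in percentage points) if one more drive fails."""
    return (annualized_failure_rate(failures + 1, drive_days)
            - annualized_failure_rate(failures, drive_days))


# Hypothetical models: one with few drive days, one with many.
small_fleet_swing = swing_from_one_failure(failures=2, drive_days=20_000)
large_fleet_swing = swing_from_one_failure(failures=200, drive_days=2_000_000)

print(f"Small fleet: one extra failure moves AFR by {small_fleet_swing:.3f} points")
print(f"Large fleet: one extra failure moves AFR by {large_fleet_swing:.5f} points")
```

With so few drive days, a single extra failure shifts the small fleet's figure by a far larger amount than it does the large fleet's, which is the intuition behind the wide confidence intervals.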

For this reason, the company has drawn up another table comprising drives which have a minimum drive days value of one million and are larger than 8TB in capacity. This cuts the number of models down to 13, and the model with the highest failure rate here is a 12TB Seagate model (ST12000NM0007) with an AFR of 2.03 percent.
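The filtering described above can be sketched in a few lines of Python. This is an illustration only: the `ModelStats` record, the model names, and every number in it are hypothetical, and the AFR formula (failures per drive-year, as a percentage) is an assumption rather than something taken from the report's tables.

```python
from dataclasses import dataclass


@dataclass
class ModelStats:
    model: str        # hypothetical model name
    capacity_tb: int  # drive capacity in TB
    drive_days: int   # total days all drives of this model were operational
    failures: int     # failures recorded over the same period

    @property
    def afr(self) -> float:
        # Assumed formula: failures per drive-year, as a percentage.
        return self.failures / (self.drive_days / 365) * 100


# Hypothetical fleet data, not Backblaze's figures.
fleet = [
    ModelStats("MODEL-A", 12, 5_000_000, 250),
    ModelStats("MODEL-B", 8, 3_000_000, 300),   # excluded: not larger than 8TB
    ModelStats("MODEL-C", 16, 200_000, 20),     # excluded: too few drive days
    ModelStats("MODEL-D", 14, 2_000_000, 50),
]

MIN_DRIVE_DAYS = 1_000_000
MIN_CAPACITY_TB = 8  # models must be larger than this

eligible = [
    m for m in fleet
    if m.drive_days >= MIN_DRIVE_DAYS and m.capacity_tb > MIN_CAPACITY_TB
]
worst = max(eligible, key=lambda m: m.afr)

print(f"{len(eligible)} models qualify; highest AFR is {worst.model} at {worst.afr:.2f} percent")
```

Note that MODEL-C has the highest raw AFR in this made-up fleet but is dropped by the drive-days threshold, which is exactly the kind of low-confidence figure the filter is meant to screen out.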

Hard drive failure rates, Q2

Backblaze said it now classifies drive failures in its datacenters into two categories: reactive and proactive. The first is the traditional failure mode, where a drive has failed outright, is no longer functioning, or is not responding to commands. A proactive failure is where failure is judged to be imminent, and the drive is removed before it stops functioning. This is based on errors reported by the drive and confirmed by examining its onboard SMART statistics.

For those interested in analyzing the data for themselves, Backblaze makes the complete data set used to create this latest report available from its Hard Drive Test Data page. ®
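As a starting point for that kind of analysis, here is a rough Python sketch that tallies drive days and failures per model and derives an AFR. It assumes the data arrives as CSV with one row per drive per day and with columns named `date`, `serial_number`, `model`, `capacity_bytes`, and `failure` (1 on the day a drive fails, 0 otherwise); check those names against the published schema before relying on them. The sample rows and model names are invented.

```python
import csv
import io
from collections import defaultdict


def afr_by_model(rows) -> dict[str, float]:
    """Compute AFR (percent) per model from daily drive records.

    Assumes each row is one drive on one day, so the row count per model
    equals its drive days, and that AFR = failures / (drive_days / 365) * 100.
    """
    drive_days = defaultdict(int)
    failures = defaultdict(int)
    for row in rows:
        model = row["model"]
        drive_days[model] += 1
        failures[model] += int(row["failure"])
    return {m: failures[m] / (drive_days[m] / 365) * 100 for m in drive_days}


# Invented sample in the assumed layout; real files would be opened with open().
sample_csv = """date,serial_number,model,capacity_bytes,failure
2022-04-01,SN1,MODEL-A,12000000000000,0
2022-04-02,SN1,MODEL-A,12000000000000,0
2022-04-01,SN2,MODEL-A,12000000000000,0
2022-04-02,SN2,MODEL-A,12000000000000,1
2022-04-01,SN3,MODEL-B,16000000000000,0
"""

result = afr_by_model(csv.DictReader(io.StringIO(sample_csv)))
for model, afr in sorted(result.items()):
    print(f"{model}: {afr:.1f} percent")
```

The tiny sample produces an absurdly high AFR for MODEL-A because four drive days is nowhere near enough data, which echoes the report's own caution about low drive-day counts.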


Biting the hand that feeds IT © 1998–2022