Backblaze starts tracking hot drives as world preps for rising global temperatures
Quarterly stats to show when units exceed manufacturer max temp
Cloud storage and backup provider Backblaze has released its latest drive statistics report, introducing tracking of temperature data for drives and failure rates for each datacenter.
Backblaze, which focuses on cloud-based storage services, issues regular reports on the health and status of the fleet of storage devices under its management, providing useful insights into drive behavior.
For its calendar Q3 Drive Stats report, the company looked at temperature data for its drives and found that a small number of devices had exceeded the manufacturer’s maximum level for at least one day.
According to Backblaze, the maximum recommended temperature for most drives is 60°C (140°F), except for the 12TB, 14TB, and 16TB Toshiba drives it runs, for which the maximum is 55°C (131°F). Only a tiny percentage of its fleet exceeded this: 354 out of the 259,533 data drives in operation during Q3, or roughly 0.14 percent.
Of those 354, two of the drives failed, both 4TB Seagate models (ST4000DM000). This is just a small number, and temperature fluctuation is an everyday part of running datacenters, yet Backblaze said that its engineers are looking into the root causes to ensure it is prepared for rising global temperatures.
From Q4 onwards, Backblaze will also remove the other 352 drives that exceeded their temperature limit from its regular Drive Stats annualized failure rate (AFR) calculations and use them to create a separate cohort it will track, called Hot Drives. This will let Backblaze compare their failure rates against those of drives operating within the manufacturer's specifications.
Although there are currently just a limited number of drives in the Hot Drives cohort, it could deliver some insight into whether being exposed to high temperatures can cause a drive to fail more often, Backblaze said.
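Backblaze's published AFR figure is failures normalized by drive-years of operation: failures divided by (drive days / 365), expressed as a percentage. A minimal sketch of that calculation, with made-up numbers for illustration:

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return AFR as a percentage: failures per drive-year of operation.

    drive_days is the total number of days all drives in the cohort
    were in service during the period being measured.
    """
    drive_years = drive_days / 365
    return (failures / drive_years) * 100

# Illustrative (invented) cohort: 35 failures over 869,365 drive days
# comes out to roughly 1.47 percent annualized.
print(round(annualized_failure_rate(35, 869_365), 2))
```

Splitting the Hot Drives into their own cohort simply means running this calculation over two separate pools of failures and drive days, so a temperature-driven difference in AFR would show up directly.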
For this quarter (Q3), there are three new data fields that Backblaze has begun populating in the Drive Stats data it publishes: the Backblaze datacenter where the drive is installed, the cluster ID identifying the group of storage servers it belongs to, and the physical location (slot) within a storage server where the drive is fitted.
Those datacenters are located in Sacramento (California), Phoenix (Arizona), Reston (Virginia), and Amsterdam (Netherlands), and are identified by the values ams5, iad1, phx1, sac0, and sac2 – Sacramento accounting for two of them.
According to Backblaze, sac0 has the highest AFR of all the datacenters, but it also has the oldest drives, which are, on average, nearly twice as old as those in the next-oldest facility, sac2. Another factor could be that sac0 has some of the oldest Storage Pods, including a handful of 45-drive units, which Backblaze said it is in the process of replacing.
Backblaze cautions that this data covers Q3 only, and includes all the data drives, even models represented by fewer than 60 drives. The company said that as it tracks this data in future, it hopes to glean some insights into whether different datacenters really do have different drive failure rates, and, if so, why.
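With those new fields in the published dataset, per-datacenter comparisons become a straightforward group-by over the daily snapshot CSVs. A rough sketch, assuming column names (`datacenter`, `cluster_id`, `pod_slot_num`) matching the fields described above and an invented three-row sample:

```python
import csv
import io

# Hypothetical mini-sample in the shape of a daily Drive Stats snapshot;
# the datacenter/cluster_id/pod_slot_num column names are assumptions.
SAMPLE = """\
date,serial_number,model,failure,datacenter,cluster_id,pod_slot_num
2023-09-30,S1,ST4000DM000,1,sac0,100,12
2023-09-30,S2,ST4000DM000,0,sac2,101,3
2023-09-30,S3,TOSHIBA MG07ACA14TA,0,ams5,200,44
"""

def failures_by_datacenter(csv_text: str) -> dict[str, int]:
    """Count reported failures per datacenter in one day's snapshot."""
    counts: dict[str, int] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        dc = row["datacenter"]
        counts[dc] = counts.get(dc, 0) + int(row["failure"])
    return counts

print(failures_by_datacenter(SAMPLE))
```

Accumulating these counts alongside drive days over a full quarter would yield the per-datacenter AFR figures Backblaze discusses.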
As for the rest of the report, this covers the drives in Backblaze's storage infrastructure that were being tracked at the end of Q3 2023: 263,992 in total, of which 4,459 are used as boot drives and another 449 are excluded for various reasons, leaving 259,084 hard drives across 32 different models.
Backblaze said that the AFR for all drives in this quarter was 1.47 percent, down from 2.2 percent in the previous quarter and 1.65 percent a year ago.
- Lessons to be learned from Google and Oracle's datacenter heatstroke
- Google: We had to shut down a datacenter to save it during London's heatwave
- Europe wants more cities to use datacenter waste heating. How's that going?
- Cloud upstart offers free heat if you host its edge servers
- Do SSD failures follow the bathtub curve? Ask Backblaze
That 2.2 percent in Q2 was attributed by Backblaze to the overall aging of the drive fleet, and in particular some 8TB, 10TB, and 12TB drive models pushing the figure up. The company now believes that Q2 was an anomaly, as the bulk of the drive models experienced a decrease in AFR, including the suspect 8TB, 10TB, and 12TB drive models.
During Q3, six different drive models managed zero drive failures, but of those, only the 6TB Seagate drives (ST6000DX000) had clocked up over 50,000 drive days of operation, with an average of 101 months in service.
The data also includes results for some Western Digital 22TB drives (WUH722222ALE6L4), but these are in a Backblaze Vault of 1,200 drives installed in September, and so only have one day of service each in this report – with zero failures.
As ever, Backblaze has made the data set available on its website, and it's free to download and analyze with the proviso that anyone doing so cites Backblaze as the source and does not sell the data on. ®