Cisco warns of premature DIMM failures
Blames manufacturing error, urges checks on 16, 32, and 64GB DIMMs made in mid-to-late 2020
Cisco says some of its DIMMs are failing prematurely due to a manufacturing error, and has advised users to replace the memory to avoid server failures.
The affected components are a limited number of 16, 32, and 64GB DIMMs manufactured in the middle to end of 2020. The company has provided a Serial Number Validation Tool (requires login) so users can check whether their DIMMs are from the faulty batch.
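For admins with a large fleet, a quick pre-filter can cut down the list of serial numbers worth feeding into Cisco's tool. What follows is only a minimal sketch: the `Dimm` record, its field names, and the `shortlist` helper are hypothetical and not part of any Cisco software, and the manufacturing window is left for the caller to supply because the notice, as reported here, gives only "middle to end of 2020." The validation tool remains the authority on which parts are affected.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Tuple

# Capacities named in the notice, in GB.
AFFECTED_CAPACITIES_GB = {16, 32, 64}


@dataclass(frozen=True)
class Dimm:
    """Hypothetical inventory record; adapt the fields to your own asset data."""
    serial: str
    capacity_gb: int
    manufactured: Optional[date] = None  # None when the date is unknown


def shortlist(
    inventory: Iterable[Dimm],
    window: Optional[Tuple[date, date]] = None,
) -> List[Dimm]:
    """Return DIMMs worth checking in the vendor's validation tool.

    A DIMM is kept when its capacity matches one named in the notice.
    If a (start, end) window is given, DIMMs with a known manufacturing
    date outside it are dropped; DIMMs with an unknown date are kept,
    since they cannot be ruled out.
    """
    result = []
    for dimm in inventory:
        if dimm.capacity_gb not in AFFECTED_CAPACITIES_GB:
            continue
        if window is not None and dimm.manufactured is not None:
            start, end = window
            if not (start <= dimm.manufactured <= end):
                continue
        result.append(dimm)
    return result
```

The filter errs on the side of inclusion: anything it cannot rule out stays on the list, so the final verdict still comes from the serial number check.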
The flawed DIMMs exhibit persistent correctable memory errors that, if left untreated, could cause an unexpected server reset.
"If encountered during Power-On Self-Test (POST), the DIMM will be mapped out and the total available memory reduced. In some cases, a boot error might be seen," cautioned Cisco in the notice on Friday.
The company also warns that the extent of the correctable errors can be masked by various operating system or DIMM Reliability, Availability and Serviceability (RAS) features, so it's best not to judge the component reliability on its error count.
As for the replacement parts, they can be ordered through Cisco.
"A replacement DIMM placed in the same slot as a previously failed DIMM might not immediately show as healthy. If a DIMM does not come up healthy on the first boot after the replacement, verify the physical DIMM seating," warned the manufacturer in its workaround directions.
Cisco also recommends running memory diagnostics before placing servers into production to mitigate early runtime errors.
The company said it has taken action to fix the manufacturing process to ensure new DIMMs work correctly.
Last week, Cisco said that after two years of work it had an analytics engine that could predict network issues before they happen and, potentially, even fix them in future.
Cisco told The Register this predictive analytics engine "will power a broad range of products and services over the next few years."
Unfortunately, these faulty DIMMs are not yet among the problems that the company's AI crystal ball can foresee. ®