Going strictly hands-off: Managing your data centre from afar

Techniques for saving your sanity, and your job


If your core servers – and hence your core applications – live in a data centre, then by definition they're not on your premises.

In many cases they may be hundreds of miles away – in fact, in a previous life, my employer's most distant data centre was six time zones away in the US Midwest.

This means that you don't have the option of wandering into the server room and power-cycling something; instead you need to work hard to make your systems manageable from afar.

Documentation

Absolutely core to the remote management of data centres is accurate, complete and rigorously updated documentation. You can't just nip in and have a peek at stuff, so you need to be able to rely completely on the documentation for information about the system.

Every connection – power, serial, LAN, the lot – needs to be rigorously documented and a regime of capital punishment initiated to deter people from not updating the docs when something changes.

It only takes one undocumented power or LAN change to make your world fall apart when you confidently but inadvertently disable something crucial because the docs differed from reality.
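One way to catch the docs drifting from reality is to keep the connection records in a machine-readable form and compare them with what is actually observed on site. The following is a minimal sketch in Python, not a real tool: the `Connection` record, the `find_discrepancies` helper and all device and port names are invented for illustration.

```python
from dataclasses import dataclass


# Hypothetical record for one documented connection (power, serial or LAN).
@dataclass(frozen=True, order=True)
class Connection:
    device: str    # e.g. "db01" (made-up name)
    port: str      # e.g. "psu-left" or "eth0"
    kind: str      # "power", "serial" or "lan"
    far_end: str   # what the cable plugs into at the other end


def find_discrepancies(documented, observed):
    """Compare the documented connections with those observed on site.

    Returns a dict with two sorted lists:
      "undocumented": seen on site but absent from the docs
      "stale":        in the docs but not seen on site
    """
    documented, observed = set(documented), set(observed)
    return {
        "undocumented": sorted(observed - documented),
        "stale": sorted(documented - observed),
    }


if __name__ == "__main__":
    docs = [
        Connection("db01", "psu-left", "power", "pdu-a/4"),
        Connection("db01", "eth0", "lan", "sw1/12"),
    ]
    on_site = [
        Connection("db01", "psu-left", "power", "pdu-a/4"),
        Connection("db01", "eth0", "lan", "sw1/14"),  # someone moved the cable
    ]
    for label, items in find_discrepancies(docs, on_site).items():
        for conn in items:
            print(label, conn)
```

How the observed list gets populated (a site visit, a switch query, a remote-hands audit) is left open here; the point is only that a mismatch is flagged before you act on the docs.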

Similarly, document the front and rear panels of all the devices in the cabinets, along with the possible statuses of all the flashing lights and what each means: we'll come to why in a moment.

The flipside of documentation is the labelling of everything in the data centre cabinets. Unless you're geographically very close to your data centre, you're likely from time to time to call on the data centre provider's staff to do something for you – install a new LAN connection, or maybe fit a replacement hot-swap power supply when an old one dies.

So give your devices names, and label them on the front and back. Label every cable a few inches from each end (not right at the end – you won't be able to get at the labels to read them).

And this is why I've had you document the front and rear panels: if your server has two hot-swap power supplies and one dies, you need to be absolutely certain to tell the provider's “intelligent hands” person which one to pull out.

And of course because you documented all the LED status options, you can get him or her to double check before pulling: “It's the one on the left, but before you pull please confirm that the light's flashing yellow, as that signifies it's the failed unit.”
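If the panel documentation is kept as structured data, the instruction for the remote hands can be generated from it rather than recalled from memory. The sketch below assumes a made-up `PANEL_DOCS` table and a hypothetical `remote_hands_instruction` helper; the device name, positions and LED meanings are examples only and should come from the vendor's manual for real kit.

```python
# Hypothetical panel documentation: (device, component) -> position and LED meanings.
PANEL_DOCS = {
    ("db01", "psu-left"): {
        "position": "left",
        "leds": {"solid green": "healthy", "flashing yellow": "failed"},
    },
    ("db01", "psu-right"): {
        "position": "right",
        "leds": {"solid green": "healthy", "flashing yellow": "failed"},
    },
}


def remote_hands_instruction(device, component, expected_state="failed"):
    """Build an instruction that asks the engineer to confirm the LED first."""
    entry = PANEL_DOCS[(device, component)]  # KeyError means it was never documented
    matches = [led for led, meaning in entry["leds"].items()
               if meaning == expected_state]
    if len(matches) != 1:
        # Refuse to guess if the docs are ambiguous about what the LED should show.
        raise ValueError(f"no unique LED pattern documented for {expected_state!r}")
    return (
        f"On {device}, it's the unit on the {entry['position']}, "
        f"but before you pull it please confirm that the light is {matches[0]}."
    )


if __name__ == "__main__":
    print(remote_hands_instruction("db01", "psu-left"))
```

Raising an error when the docs are missing or ambiguous is deliberate: it is better to discover a documentation gap at your desk than while someone is standing at the cabinet with a hand on the wrong power supply.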

Monitoring

Next on the list we have another core aspect of stuff being a long way away: you can't just whack another disk into the box if you run out of space.

There are so many monitoring tools on the market – and so many free ones – that there's no excuse for not monitoring your data centre to death, both to check that everything's healthy and to do capacity planning and usage trending for key resources.
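Usage trending can be as simple as fitting a straight line to recent measurements and seeing when it crosses the capacity limit. The following is a rough sketch under stated assumptions: `days_until_full` is a hypothetical helper, the samples are invented, and a linear fit is only one crude model of growth.

```python
def days_until_full(samples, capacity_gb):
    """Estimate days until a disk fills, from (day_number, used_gb) samples.

    Fits a least-squares straight line through the samples and projects it
    forward to the capacity. Returns None if usage is flat or shrinking.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")

    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must span more than one day")

    slope = sum((x - mean_x) * (y - mean_y) for x, y in samples) / sxx
    if slope <= 0:
        return None  # no growth, so no projected fill date

    intercept = mean_y - slope * mean_x
    full_day = (capacity_gb - intercept) / slope   # day the fitted line hits capacity
    latest_day = max(x for x, _ in samples)
    return full_day - latest_day


if __name__ == "__main__":
    # Invented readings: 10 GB of growth per day on a 200 GB volume.
    readings = [(0, 100), (1, 110), (2, 120), (3, 130)]
    print(days_until_full(readings, capacity_gb=200))
```

With a remote site, the useful number is not the fill date itself but whether it is further away than the lead time for getting a new disk shipped and fitted.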

Run up proper monitoring, preferably in a form that doesn't rely on the data centre being fully functional. How can you send an alert that everything went down if the monitoring server's on one of the boxes that went down? Maybe you could even look to one of the many cloud services that offers system monitoring?
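One common pattern for this is a "dead man's switch": each host checks in periodically with a watcher that lives outside the data centre, and the watcher raises the alarm when the check-ins stop. A minimal sketch of the watcher's core test follows, assuming a hypothetical `silent_hosts` helper, made-up host names and an arbitrary five-minute threshold.

```python
from datetime import datetime, timedelta, timezone


def silent_hosts(last_heartbeat, now, max_silence=timedelta(minutes=5)):
    """Return the hosts that have not checked in within max_silence.

    last_heartbeat maps host name -> datetime of its most recent check-in.
    This is meant to run somewhere outside the data centre, so it still
    works when everything inside has gone dark.
    """
    return sorted(
        host for host, seen in last_heartbeat.items()
        if now - seen > max_silence
    )


if __name__ == "__main__":
    now = datetime(2022, 7, 1, 12, 0, tzinfo=timezone.utc)
    heartbeats = {
        "db01": now - timedelta(minutes=1),    # checked in recently
        "web01": now - timedelta(minutes=12),  # gone quiet
    }
    for host in silent_hosts(heartbeats, now):
        print(f"ALERT: no heartbeat from {host}")
```

Because the alert is triggered by the absence of a signal, a total outage at the site looks the same to the watcher as a single dead server, only louder.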

