10 things you need to avoid SNAFUs in your data centre

You won't believe No.8 (OK, you will)


Despite my apparently youthful good looks, I've been in the IT industry since 1989. Which means I've been around the block a bit, and have learned rather a lot of lessons – some of them the hard way. To save you having to find them out for yourself, here are ten to be going on with.

1. Always carry a torch in your laptop bag

Even in the brightest of data centres, the racks your kit lives in have more than their fair share of dark recesses. This makes reading the serial numbers or port markings off the back of devices more than a little challenging, and so a mandatory inclusion in your laptop bag is a little Maglite torch. And while you're at it, a little dentist-style mirror on a stick wouldn't go amiss either. Oh, and if you're thinking of using the torch on your £650 iPhone: you'll change your mind when you've dropped it in the rack and scratched it to death.

2. What does the plug look like?

If you have equipment delivered directly to your data centre, double-check how it gets its power and what cables come with it. I've had LAN switches delivered to US data centres with European-style plugs on the cables, which was a pain in the butt. I've discovered the hard way that the Cisco 3750-X has a different power inlet from its predecessor (it's an IEC C16, not a C14, so it takes a C15 lead rather than the usual C13) so when you upgrade you need new cables too. And the worst of all worlds is where you have a device that has a proprietary power brick – because the bricks just don't fit standard rack power strips. All of these things are surmountable, but don't wait to find out until you get to the data centre and you can't install anything.

3. Keep an offline copy of your docs

If your network's died, or the server's gone pear-shaped, you'll need your diagrams and reference sheets. And it's no good if they're sitting on the fileserver that just turned up its toes. Keep an up-to-date copy of the documents somewhere that you know you'll be able to get at them: on your laptop, or maybe in a cloud repository like Google Drive. If your docs repository is SharePoint-based then SharePoint Workspace lets you keep a synced copy on your PC.

4. Tidy cabling isn't dull

Neat cabling looks fab. It also makes it a breeze to trace cables, to install new stuff, and most importantly to get old stuff out. If your racks are a rats' nest of cables then the chances are you won't be able to pull out unwanted cabling – so it'll just sit there with the ends dangling in the breeze and generally getting in the way. Take your time to make cabling neat, and use cable management attachments in the racks to help you.

5. See the problem for yourself

My dad was an engineer, and he taught me a lesson that I use at least once a week: if someone tells you they have a problem, see for yourself before believing them. “The printer's faulty” could mean that the printer's faulty – but it could equally mean that the spooler has crashed, or they've done something daft to their driver, or even that they've knocked the LAN cable out of their PC so it can't see anything on the network (including the printer). You're the IT person, so you probably know best.

6. Spend the money and double-connect everything

If you care about your servers staying accessible, double-connect them to the network. If the switches support it then team the network adaptors with LACP/EtherChannel, but if not then use the teaming software that comes with the adaptors to run the ports as a team in active/passive mode. Oh, and have two NICs (there's no point double-connecting stuff to a dual-port NIC because if the NIC dies, the server's off the network) and connect them into two switches.
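
If you keep your patching records in anything machine-readable, a check like the following can flag the servers that aren't properly double-connected. This is only a minimal sketch: the (server, NIC, switch) record layout and all the names in it are made up for illustration, not taken from any particular inventory tool.

```python
from collections import defaultdict


def single_points_of_failure(links):
    """Return {server: [reasons]} for servers lacking two NICs or two switches.

    `links` is a list of (server, nic, switch) tuples, a hypothetical
    format chosen for this example.
    """
    nics = defaultdict(set)
    switches = defaultdict(set)
    for server, nic, switch in links:
        nics[server].add(nic)
        switches[server].add(switch)

    problems = {}
    for server in nics:
        reasons = []
        if len(nics[server]) < 2:
            reasons.append("single NIC")
        if len(switches[server]) < 2:
            reasons.append("single switch")
        if reasons:
            problems[server] = reasons
    return problems


# Hypothetical patching records: db01 uses both ports of one dual-port NIC.
links = [
    ("web01", "nic0", "sw-a"),
    ("web01", "nic1", "sw-b"),
    ("db01", "nic0", "sw-a"),
    ("db01", "nic0", "sw-b"),
]
print(single_points_of_failure(links))
```

Here web01 passes, while db01 is flagged because both of its links hang off the same NIC.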

7. Record every change

Change control is an absolute must. Have a log of changes, and be rigorous about updating it. If people don't update it, be firm with them (make it a disciplinary offence – it's truly that serious). What's the first thing you ask when a problem is reported? “Has anything changed?” You need to be confident of the answer your documentation gives you.
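
The log doesn't need to be fancy to answer that question. As a rough sketch, assuming each entry records when, who, which system and what (the `Change` fields and the 24-hour window are illustrative choices, not a standard), the lookup could be as simple as:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Change:
    """One change-log entry; the fields are assumptions for this example."""
    when: datetime
    who: str
    system: str
    summary: str


def changed_since(log, system, incident_time, window=timedelta(hours=24)):
    """Answer "has anything changed?" for one system, newest change first."""
    start = incident_time - window
    hits = [c for c in log
            if c.system == system and start <= c.when <= incident_time]
    return sorted(hits, key=lambda c: c.when, reverse=True)


# Hypothetical log entries.
log = [
    Change(datetime(2021, 12, 1, 10, 0), "bob", "mail", "renewed certificate"),
    Change(datetime(2021, 12, 6, 18, 0), "alice", "mail", "applied patches"),
    Change(datetime(2021, 12, 7, 8, 0), "carol", "web", "code drop"),
]
for change in changed_since(log, "mail", datetime(2021, 12, 7, 9, 0)):
    print(change.when, change.who, change.summary)
```

The frozen dataclass is a small nod to the point above: entries get added, not quietly edited afterwards.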

8. Stuff goes wrong, even if it's always gone right

Be as cautious about the hundredth time you do something as you were about the first. Familiarity leads to complacency, and it's easy to assume that just because something has always worked as expected it'll continue to do so. So your Web site code drop has never failed … until you renamed the old version and ran out of disk space whilst copying the new one over. Or maybe you've never done the task on 1 March in a leap year before. Don't be paranoid about it, of course – but don't be blasé either.
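
The disk-space case is one you can check for before you start copying. Here's a minimal pre-flight sketch using Python's standard `shutil.disk_usage`; the function names and the 1.5x headroom factor are assumptions for illustration, so pick a margin that suits your own deployments.

```python
import os
import shutil


def tree_size(path):
    """Total size in bytes of the regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):  # skip symlinks, which may dangle
                total += os.path.getsize(full)
    return total


def has_room(required_bytes, free_bytes, headroom=1.5):
    """True if free space covers the requirement with a safety margin."""
    return free_bytes >= required_bytes * headroom


def safe_to_deploy(new_release_dir, target_dir, headroom=1.5):
    """Check the target volume can hold the new release before copying."""
    free = shutil.disk_usage(target_dir).free
    return has_room(tree_size(new_release_dir), free, headroom)
```

Running `safe_to_deploy()` as the first step of the drop turns a half-copied Web site into a refusal to start, which is a much nicer failure to have.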

9. Understand the dependencies

It's surprisingly common to carry out two apparently unrelated changes only to discover that there's a relationship between them after all. And you usually find this out when something goes wrong. Most common is where you have two changes and both of them experience unrelated problems that need the on-call engineer to have a shufti, and all of a sudden you're Googling for the phone number of the Shit Creek paddle shop as two teams depend on one person. Consider the dependencies of activities whether they go well or badly.
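
One way to spot this in advance is to list what each planned change would need if it went badly, then look for overlaps. The sketch below assumes a simple made-up schedule format (start and end as hours on a shared clock, plus a set of people and systems each change relies on); none of the names come from a real tool.

```python
from itertools import combinations


def shared_dependencies(changes):
    """Return (change_a, change_b, shared_needs) for changes whose time
    windows overlap and which rely on at least one common person or system.

    `changes` maps a change name to a dict with "start", "end" and "needs".
    """
    clashes = []
    for (a, ca), (b, cb) in combinations(sorted(changes.items()), 2):
        overlap = ca["start"] < cb["end"] and cb["start"] < ca["end"]
        shared = ca["needs"] & cb["needs"]
        if overlap and shared:
            clashes.append((a, b, sorted(shared)))
    return clashes


# Hypothetical change schedule; hours count from midnight on day one.
changes = {
    "firewall-upgrade": {"start": 22, "end": 24,
                         "needs": {"on-call engineer", "core switch"}},
    "db-patch": {"start": 23, "end": 25,
                 "needs": {"on-call engineer", "dba"}},
    "web-drop": {"start": 10, "end": 11,
                 "needs": {"on-call engineer"}},
}
print(shared_dependencies(changes))
```

With this schedule the firewall upgrade and the database patch are flagged, because both lean on the same on-call engineer in the same hour.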

10. Keep the vendors sweet

You can have a contractual relationship with your vendors, and you need to be professional when dealing with them, but there's no harm keeping on the right side of the techies you deal with day-to-day. So when I had a two-hour-response SLA on my phone system and something non-fatal threw an alert at going-home time, I'd often say to them: “No worries, come at 9am tomorrow”. And when they were in doing upgrades I'd provide the coffee and pizza. The amount of free advice and ad-hoc consultancy I got in return was amazing. ®

Biting the hand that feeds IT © 1998–2021