10 things you need to avoid SNAFUs in your data centre
You won't believe No.8 (OK, you will)
Despite my apparently youthful good looks, I've been in the IT industry since 1989. Which means I've been around the block a bit, and have learned rather a lot of lessons – some of them the hard way. To save you having to find them out for yourself, here are ten to be going on with.
1. Always carry a torch in your laptop bag
Even in the brightest of data centres, the racks your kit lives in have more than their fair share of dark recesses. This makes reading the serial numbers or port markings off the back of devices more than a little challenging, and so a mandatory inclusion in your laptop bag is a little Maglite torch. And while you're at it, a little dentist-style mirror on a stick wouldn't go amiss either. Oh, and if you're thinking of using the torch on your £650 iPhone: you'll change your mind when you've dropped it in the rack and scratched it to death.
2. What does the plug look like?
If you have equipment delivered directly to your data centre, double-check how it gets its power and what cables come with it. I've had LAN switches delivered to US data centres with European-style plugs on the cables, which was a pain in the butt. I've discovered the hard way that the Cisco 3750-X has a different power inlet from its predecessor (it's an IEC C16, not a C14) so when you upgrade you need new cables too: ones with a C15 connector rather than the usual C13. And the worst of all worlds is where you have a device that has a proprietary power brick – because they just don't fit standard rack power strips. All of these things are surmountable, but don't wait to find out until you get to the data centre and you can't install anything.
3. Keep an offline copy of your docs
If your network's died, or the server's gone pear-shaped, you'll need your diagrams and reference sheets. And it's no good if they're sitting on the fileserver that just turned up its toes. Keep an up-to-date copy of the documents somewhere that you know you'll be able to get at them: on your laptop, or maybe in a cloud repository like Google Drive. If your docs repository is SharePoint-based then SharePoint Workspace lets you keep a synced copy on your PC.
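If you'd rather not rely on remembering to copy things by hand, a small script run on a schedule can keep the offline copy fresh. What follows is only a rough sketch in Python, assuming the docs live in a folder your laptop can reach; the function name sync_docs and the paths in the usage note are made up for illustration.

```python
import shutil
from pathlib import Path


def sync_docs(source: Path, dest: Path) -> list[Path]:
    """Copy files that are missing or out of date in dest; return what was copied."""
    copied = []
    for src_file in source.rglob("*"):
        if not src_file.is_file():
            continue
        target = dest / src_file.relative_to(source)
        # Copy only if the offline copy is missing or older than the original.
        if not target.exists() or src_file.stat().st_mtime > target.stat().st_mtime:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_file, target)  # copy2 also copies the modification time
            copied.append(target)
    return copied
```

Called as something like sync_docs(Path("//fileserver/docs"), Path.home() / "offline-docs") (both paths hypothetical), it only ever adds or refreshes files, so a document deleted on the server stays in the offline copy until you tidy it up yourself.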
4. Tidy cabling isn't dull
Neat cabling looks fab. It also makes it a breeze to trace cables, to install new stuff, and most importantly to get old stuff out. If your racks are a rats' nest of cables then the chances are you won't be able to pull out unwanted cabling – so it'll just sit there with the ends dangling in the breeze and generally getting in the way. Take your time to make cabling neat, and use cable management attachments in the racks to help you.
5. See the problem for yourself
My dad was an engineer, and he taught me a lesson that I use at least once a week: if someone tells you they have a problem, see for yourself before believing them. “The printer's faulty” could mean that the printer's faulty – but it could equally mean that the spooler has crashed, or they've done something daft to their driver, or even that they've knocked the LAN cable out of their PC so it can't see anything on the network (including the printer). You're the IT person, so you probably know best.
6. Spend the money and double-connect everything
If you care about your servers staying accessible, double-connect them to the network. If they support it then team the network adaptors with LACP/EtherChannel (bear in mind that a LACP team split across two switches generally needs the switches to be stacked or to support multi-chassis link aggregation), but if not then use the teaming software that comes with the adaptors to run the ports as a team in active/passive mode. Oh, and have two NICs (there's no point double-connecting stuff to a dual-port NIC because if the NIC dies, the server's off the network) and connect them into two switches.
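The two-NICs-into-two-switches rule is easy to check against your patching records. Here's a minimal sketch in Python, assuming you can export each connection as a server, NIC and switch triple; the Link type, the function name and the device names are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    server: str
    nic: str     # physical adaptor, e.g. "nic0" (not the port on it)
    switch: str  # switch the port is patched into


def single_points_of_failure(links: list[Link]) -> dict[str, list[str]]:
    """Return, per server, the reasons it is not properly double-connected."""
    by_server: dict[str, list[Link]] = {}
    for link in links:
        by_server.setdefault(link.server, []).append(link)

    problems: dict[str, list[str]] = {}
    for server, server_links in by_server.items():
        reasons = []
        if len({l.nic for l in server_links}) < 2:
            reasons.append("all links share one NIC")
        if len({l.switch for l in server_links}) < 2:
            reasons.append("all links share one switch")
        if reasons:
            problems[server] = reasons
    return problems


links = [
    Link("web1", "nic0", "sw-a"), Link("web1", "nic1", "sw-b"),
    Link("db1", "nic0", "sw-a"), Link("db1", "nic0", "sw-b"),  # dual-port NIC
]
print(single_points_of_failure(links))
```

Here web1 passes, while db1 is flagged because both of its links hang off the same physical NIC even though they go to different switches.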
7. Record every change
Change control is an absolute must. Have a log of changes, and be rigorous about updating it. If people don't update it, be firm with them (make it a disciplinary offence – it's truly that serious). What's the first thing you ask when a problem is reported? “Has anything changed?” You need to be confident of the answer your documentation gives you.
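A change log doesn't need to be fancy to answer "Has anything changed?". As a sketch only, here is an append-only log kept as one JSON record per line, written in Python; the field names and both function names are assumptions for illustration, and a real change-control system would add approvals and much more.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def record_change(log_path: Path, who: str, system: str, summary: str) -> dict:
    """Append one change record to the log and return it."""
    entry = {
        "when": datetime.now(timezone.utc).isoformat(),
        "who": who,
        "system": system,
        "summary": summary,
    }
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def changes_since(log_path: Path, since: datetime) -> list[dict]:
    """Return every logged change at or after `since` (a timezone-aware datetime)."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as f:
        entries = [json.loads(line) for line in f if line.strip()]
    return [e for e in entries if datetime.fromisoformat(e["when"]) >= since]
```

When a fault is reported, changes_since(log, now - timedelta(days=2)) gives you the list to work through, which is only as trustworthy as people's discipline in calling record_change.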
8. Stuff goes wrong, even if it's always gone right
Be as cautious about the hundredth time you do something as you were about the first. Familiarity leads to complacency, and it's easy to assume that just because something has always worked as expected it'll continue to do so. So your Web site code drop has never failed … until you renamed the old version and ran out of disk space whilst copying the new one over. Or maybe you've never done the task on 1 March in a leap year before. Don't be paranoid about it, of course – but don't be blasé either.
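The disk space example is the sort of thing a pre-flight check catches cheaply. A minimal sketch in Python, assuming the new release sits in a local folder and that 10 per cent headroom is enough (that figure is a guess for illustration, as are the function names):

```python
import shutil
from pathlib import Path


def dir_size(path: Path) -> int:
    """Total size in bytes of all regular files under path."""
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())


def has_room_for(release: Path, target_volume: Path, headroom: float = 0.10) -> bool:
    """True if target_volume has space for release plus a safety margin.

    headroom is the extra fraction of the release size to keep spare.
    """
    needed = dir_size(release) * (1 + headroom)
    free = shutil.disk_usage(target_volume).free
    return free >= needed
```

Run it before the copy starts and refuse to go ahead if it returns False; remember that if you renamed the old version rather than deleting it, the old copy is still eating space on the same volume.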
9. Understand the dependencies
It's surprisingly common to carry out two apparently unrelated changes only to discover that there's a relationship between them after all. And you usually find this out when something goes wrong. Most common is where you have two changes and both of them experience unrelated problems that need the on-call engineer to have a shufti, and all of a sudden you're Googling for the phone number of the Shit Creek paddle shop as two teams depend on one person. Consider the dependencies of activities whether they go well or badly.
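One way to spot this before the night in question is to list what each change would need if it went wrong, then look for overlaps. The following is a sketch in Python under that assumption; the Change type, its fields and the sample names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import combinations


@dataclass(frozen=True)
class Change:
    name: str
    start: datetime
    end: datetime
    depends_on: frozenset[str]  # people, teams or systems needed if it goes badly


def shared_dependencies(changes: list[Change]) -> list[tuple[str, str, set[str]]]:
    """Find pairs of changes whose windows overlap and that lean on the same thing."""
    clashes = []
    for a, b in combinations(changes, 2):
        overlaps = a.start < b.end and b.start < a.end
        shared = a.depends_on & b.depends_on
        if overlaps and shared:
            clashes.append((a.name, b.name, set(shared)))
    return clashes


changes = [
    Change("firewall upgrade", datetime(2024, 3, 1, 22), datetime(2024, 3, 2, 1),
           frozenset({"on-call engineer", "core switch"})),
    Change("database patch", datetime(2024, 3, 1, 23), datetime(2024, 3, 2, 2),
           frozenset({"on-call engineer", "SAN"})),
]
print(shared_dependencies(changes))
```

With the sample data the two changes overlap between 23:00 and 01:00 and both fall back on the on-call engineer, so the pair is reported with that one shared dependency.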
10. Keep the vendors sweet
You can have a contractual relationship with your vendors, and you need to be professional when dealing with them, but there's no harm keeping on the right side of the techies you deal with day-to-day. So when I had a two-hour-response SLA on my phone system and something non-fatal threw an alert at going-home time, I'd often say to them: “No worries, come at 9am tomorrow”. And when they were in doing upgrades I'd provide the coffee and pizza. The amount of free advice and ad-hoc consultancy I got in return was amazing. ®