Consolidating databases has significant storage benefits – and therefore everyone should be doing it
Save yourself money, save yourself space
Register Debate Welcome to The Register Debate in which we pitch our writers against each other on contentious topics in IT and enterprise tech, and you – the reader – decide the winning side. The format is simple: a motion is proposed, for and against arguments are published today, then another round of arguments on Wednesday, and a concluding piece on Friday summarizing the brouhaha and the best reader comments.
During the week you can cast your vote using the embedded poll, choosing whether you're in favor or against the motion. The final score will be announced on Friday, revealing whether the for or against argument was most popular. It's up to our writers to convince you to vote for their side.
For this debate, the motion is: Consolidating databases has significant storage benefits, therefore everyone should be doing it.
Earlier today, Chris Mellor argued against the motion. And now, arguing FOR the motion, is DAVE CARTWRIGHT...
Storage presents us with a major problem: it constantly gets smaller, faster and cheaper.
What’s that? You see smaller, faster and cheaper as a good thing? You’ll be telling me next that you think it’s great that CPUs are getting faster, with the price per CPU cycle heading constantly downhill. (And just to be clear: that’s not great either).
Some of us learned about technology in the days when you had to be mindful of how you used it. When you had to consider the complexity of your algorithm – because if you didn’t write an efficient one, you could be claiming your pension before the program finished. And you made darned sure that you stored as few copies of your data as you could, because you didn’t have much storage; part of that limitation was the technical limits of the hardware, but most of it was the sheer cost of the stuff.
If you have limitations, you become mindful of those limitations. And if you don’t start with that mindfulness you soon acquire it, because your program run never completes, or runs out of memory, or fills the disk up. Sadly, these days the technology is so fast, cheap and forgiving that you can use it inefficiently and it’ll save your bacon through raw speed and size … most of the time, anyway.
This leads to the problem of not being efficient in the way you use storage. One of the classic reasons is that, if you need to do some development work on a database you’ve not used before, it’s easier to take a copy of the database to work on than to go through access management procedures to get at the existing one. Does that copy ever get removed? No, generally not. It does, however, get increasingly out of date and irrelevant – and when you decide the live version has crossed that watershed of being too different from your copy, it’s time for a new copy … but of course you daren’t remove the old one for fear of losing a table, or a view, or a stored procedure that you “might need one day”.
And if you’re cluttering up your storage with multiple copies of the same thing – each of which could be tens or hundreds of gigabytes – that’s not a patch on the sin of failing to consolidate databases that are inherently different but whose contents overlap. Although some of your databases will have no data in common (the HR database tends to be pretty exclusive, for instance) you’ll often have pockets of the same data in different systems – customer data, product data, pricing data, the list goes on.
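That overlap is easy to measure once you look for it. A minimal sketch, using Python's sqlite3 with hypothetical CRM and billing databases (the schemas, table names, and customer rows are invented for illustration), shows how quickly two systems turn out to be storing the same people twice:

```python
import sqlite3

def make_db(rows):
    """Build a throwaway in-memory database with a customers table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE customers (email TEXT PRIMARY KEY, name TEXT)")
    db.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    return db

# Two hypothetical systems, each keeping its own copy of customer data.
crm = make_db([("ann@example.com", "Ann"), ("bob@example.com", "Bob"),
               ("carol@example.com", "Carol")])
billing = make_db([("bob@example.com", "Bob"), ("carol@example.com", "Carol"),
                   ("dan@example.com", "Dan")])

# Pull each system's customer keys and intersect them to find the
# records being stored (and maintained, and backed up) twice over.
crm_emails = {row[0] for row in crm.execute("SELECT email FROM customers")}
billing_emails = {row[0] for row in billing.execute("SELECT email FROM customers")}
overlap = crm_emails & billing_emails
total = crm_emails | billing_emails

print(f"{len(overlap)} of {len(total)} customers stored twice")  # 2 of 4
```

In a real estate of dozens of systems, the same comparison across customer, product, and pricing keys is usually the first step of a consolidation exercise.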
Leaving aside the data protection nightmares of understanding what data you have and maintaining its accuracy, you’re also wasting storage. And by playing the “yet another copy” game across all those databases (and don’t kid yourself that you won’t), you’re using space that you’ll mostly never claim back.
Consolidating databases makes the data easier to maintain, easier to keep in line with regulatory and data protection requirements, and all that fun stuff. The exercise can also improve performance, because you’ll be making sure you index things properly and writing queries that touch fewer data stores. And it saves you a boatload of storage – which means it also saves you the money you’d otherwise have spent on that extra space.
Oh, and if you’re thinking to yourself: “Hey, my SAN kit de-duplicates the data before it hits the disk, so I don’t have to worry about consolidating and de-duping it myself” … that’s true, if you have an on-premises SAN. But these days the data’s probably in the cloud, and it’s the service provider that’s making the storage savings, not you.
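For the curious, block-level deduplication of the kind SAN kit performs is conceptually simple: carve the data into fixed-size chunks, store each unique chunk once, and keep a list of hashes as the logical layout. A minimal sketch (the block size and sample data are invented for illustration, and real arrays use far more sophisticated chunking):

```python
import hashlib

def dedupe(data: bytes, block_size: int = 8):
    """Split data into fixed-size blocks, storing each unique block once."""
    store, index = {}, []
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # physical copy: unique blocks only
        index.append(digest)             # logical layout: just a hash list
    return store, index

# Two "copies" of the same database dump deduplicate almost perfectly:
dump = b"customer-table" * 4            # 56 bytes
store, index = dedupe(dump + dump)      # 112 bytes logically

logical = len(index) * 8
physical = sum(len(b) for b in store.values())
print(f"logical {logical} bytes, physical {physical} bytes")
```

The point of the article stands, though: when the storage lives in someone else’s cloud, it’s their dedup engine reaping this saving, not your budget.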
Consolidate your data. Save yourself space.
Oh, and save yourself money. And make your systems better and more efficient. Why wouldn’t you? ®
Cast your vote below. You can track the progress of the debate right here.