Register Debate You’d think debating the benefits of database consolidation for storage would be a relatively straightforward affair. Not when it’s a Register Debate.
This week our writers turned their attention to the following motion: Consolidating databases has significant storage benefits, therefore everyone should be doing it. In the process, they conjured up images of hideous chimeras, slated inefficient programming, and drew a straight line between DevOps practices and the perfect barbershop experience.
Maybe it’s not such a surprise. This is an area that encompasses your company’s precious data and whole thickets of thorny hardware and software engineering problems. Make the wrong choice and your company could be subsidising your vendor account manager’s holiday villa for years to come. There’s a lot at stake here.
Database consolidation is a server issue, not a storage game
First to take the floor on Monday was El Reg’s storage supremo Chris Mellor, arguing against the motion because, “The idea that consolidating databases has significant storage benefits and therefore everyone should be doing it is missing the point.” Switching to an all flash array, for example, is not an issue of consolidation, Chris argued, “It’s database acceleration.”
“Database consolidation onto fewer servers saves server cost because you need fewer servers, and also saves database instance licensing expense as you need fewer per-server instance licenses,” he concluded. “There is no storage benefit here but the potentially significant server-based benefits make database consolidation an attractive idea that can serve you right.”
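To put rough numbers on Chris’s point, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption made up for illustration: the `Estate` class, the `consolidation_saving` helper, and all of the cost figures are hypothetical and come from nobody's price list. It models his claim only, that squeezing the same databases onto fewer servers cuts server and per-server licence spend while the stored data, and so the storage bill, stays the same.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Estate:
    """A hypothetical database estate; all figures are illustrative."""
    servers: int                 # number of database servers
    server_cost: float           # assumed annual cost per server
    licence_cost: float          # assumed annual per-server instance licence
    storage_tb: float            # total data stored, in TB
    storage_cost_per_tb: float   # assumed annual cost per TB

    def annual_cost(self) -> float:
        # Server and licence costs scale with server count;
        # storage cost scales with the amount of data held.
        per_server = self.server_cost + self.licence_cost
        return self.servers * per_server + self.storage_tb * self.storage_cost_per_tb


def consolidation_saving(before: Estate, servers_after: int) -> float:
    """Annual saving from running the same data on fewer servers."""
    # Only the server count changes; storage_tb is deliberately left alone.
    after = replace(before, servers=servers_after)
    return before.annual_cost() - after.annual_cost()


if __name__ == "__main__":
    before = Estate(servers=10, server_cost=5_000.0, licence_cost=20_000.0,
                    storage_tb=50.0, storage_cost_per_tb=300.0)
    print(consolidation_saving(before, servers_after=4))
```

With those invented inputs, going from ten servers to four saves 150,000 a year, all of it from servers and licences. The storage line does not move, which is the nub of the argument against the motion. Dave’s counter-argument, that consolidation also removes duplicate copies of data, would show up as a smaller `storage_tb`, which this sketch does not attempt to model.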
Arguing for the motion was Dave Cartwright, who is a chartered engineer, a chartered IT pro, and a member of the British Computer Society. Dave took a long view of the issue, noting sagely that: “Some of us learned about technology in the days when you had to be mindful of how you used it… you made darned sure that you stored as few copies of your data as you could, because you didn’t have much storage; part of this limitation was the technical limits of the hardware, but most of it was the sheer cost of the stuff.”
These days, he argued, “the technology is so fast, cheap and forgiving that you can use it inefficiently and it’ll save your bacon through raw speed and size … most of the time, anyway.” Because, if we’re honest, we all know that devs have been quietly copying parts of databases, or departments have been spinning up their own stores, often overlapping info with other departments, and no one ever deletes any of this, because... well, just in case.
The result? Wasted storage obviously, but also information audit challenges, data protection issues, and all the other problems that spring from an incontinent approach to data and storage.
Ultimately, Dave said, consolidating databases can address all of those issues, save a “boatload of storage,” and probably improve performance as you will “be making sure you index stuff properly and write queries to access fewer data stores.”
And who wouldn’t want all of that? Well, not all Reg readers. You can see some of the most upvoted comments in the box below, but suffice to say the phrase “eggs in one basket” popped up a couple of times, along with Oracle RDB and UNIVACs. And commenter PeterCorless raised a series of points, including the observation that “there's no way your standard ERP system is keeping up with the raw rate of ingestion and analytics of IIoT. And no way the CFO is going to let quarter close be impacted because someone's trying to run an ad hoc data query on the ERP system.”
So it was no surprise that Chris weighed back into the fray on Wednesday with a nightmarish vision of just what could happen if you really think through database consolidation.
Trying to consolidate RDBMSes and NoSQL stores – for example – into a single database, on a single storage vault, is “an impractical curiosity” akin to “trying to combine a horse and a fish, and building a noisy crowded zoo” to keep them in. Just think of the mess. Apart from ACID and CAP issues, the poor storage admins face the problems of disparate metadata and log data, as well as sizing and IO processing challenges.
Or, as Chris summed it up, horses can’t live in the sea with fish, or fish on the land with horses. (We now fully expect a database consolidation startup to appear called Seahorse. Or Landfish.)
After this nightmarish image of database chimeras prowling around expanding menageries, it was down to El Reg’s APAC editor Simon Sharwood to tie things up by turning the argument on its head, then giving it a good haircut into the bargain.
That’s because Simon used the example of his barber’s app, which shows a real-time queue, allows him to choose a cut in advance, and book and pay for it. The only part that doesn’t rely on a database – for now at least – is the part where scissors actually meet hair and the client says no, they don’t need something for the weekend.
We don’t normally consider the implications of DevOps for barbering, but, Simon argued: “In this world where software rules and businesses bend over backwards for developers, simplicity is valuable. Which is why database consolidation is a fine thing.”
Yes, this might send the ops team reaching for a hot towel, but “smart organisations don’t let it get to the stage where they are caught in a web of legacy tech…hostage to a shrinking pool of tech and services vendors who can ratchet up prices.”
In the end, and with upwards of 300 readers taking part, it seems the readers came down in favour of the motion. But each vote reflects the state of play in each voter’s own organisation, at least to an extent. So, are customers and/or devs setting the agenda at the majority of organisations? And does this mean a focus on application delivery and customer experience trumps the views – and bitter experience – of storage/ops folks? Sounds like another debate topic. Expect fireworks. ®
Top comments upvoted by you, selected by us
"Err, no. I deal with a couple of legacy databases, Oracle RDB, and a hierarchical database that originated on UNIVACs. Neither of those is going to be on the table for consolidation. …. There are a few reasons for consolidation, but there are many reasons to refrain. Packing your favorite bowl or cup in your attic chest of porcelain means that you will constantly dis/reassemble the contents, and things will likely get broken. Certain architectural aspects become brittle and very difficult to change" – Chasil
"In a hypothetical situation and hear me out on this, having everything in one database makes queries a bitch especially when every person and their dog are doing it. It also leaves you wide open to user errors if you don't set the permissions right which I know can be an issue with multiple databases and yes I have heard of backups but it's just too risky. Divide and conquer I say. There is a reason we don't put our eggs in one basket. Best case is also local duplication for anything that doesn't require real time access. This is just my opinion on the matter" – Anonymous Coward
"Consolidation is putting all your eggs in one basket Any breakage and nothing works. It also means that the one database/cluster/... does all the work ie a higher workload than when it is distributed. One humongous machine might be more costly than several smaller ones -- maybe" – Alain williams
"Have to agree with Chris... Here's a guy who actually knows what he's talking about. But here's the thing... I don't know if the question is being framed properly. When you say 'database consolidation, what do you mean exactly. Yes, it's a strange question, but think about it. You have databases that are OLTP transaction processing systems of truth. Then you have Data Warehouses (OLAP) that are used to drive analytics. Then you have Data Lakes which in itself is a Data Warehouse consolidation by removing the silos. (Here the number of DWs goes down, but the storage requirements go up. ) And it's not just the CPUs getting better, or storage, but also networking. 40GbE is becoming Cisco's norm. 100GbE is also there... But at 40GbE you can start to consider data fabric as your storage layer. The issue is cost versus density and performance has to be evaluated on a case by case basis. The networking also allows for a segregation of COTS and specialty hardware to get the most bang for your buck. You can weave a GPU appliance into your data fabric and then consolidate compute servers using K8s to allow distributed OLTP RDBMs to take better advantage of the hardware. (This is where the network can be a bottleneck. )
"What's interesting and a side note... when you look at this... its in *YOUR* Data Center. Not on the cloud. (Although it could be in the Cloud too.) These advances will spell a down turn in the cloud over the next 5 years. Thats not to say that there won't be a reason for cloud but more of a hybrid approach. Just some random thought from someone who's been around this for far too long but too broke to retire. ;-)" – Mike the FlyingRat
"Improving tech is bad? On the argument that improving tech results in poor, lazy coding: Not quite. In current development cycles, delivering something on time (aka AGILE) is important. What is delivered is less so, but to hit these targets, developers take short cuts, write lazy code and rely on the tech to cover for them - who has time to optimise the code when you have to deliver in three days?
"So what's written is 'good enough' in that it works thanks to the faster CPU, and it doesn't matter that it takes more space 'cause disk is cheap, right? And when does the optimisation happen? When do we get to go back to make sure it's efficient? When something breaks and we have no choice." – Helcat