Storage consolidation: Why different flavors of database need different types of storage

You can bring a horse to water but you can't turn it into a fish

Register Debate Welcome to The Register Debate in which we pitch our writers against each other on contentious topics in IT and enterprise tech, and you – the reader – decide the winning side. The format is simple: a motion is proposed, for and against arguments are published today, then another round of arguments on Wednesday, and a concluding piece on Friday summarizing the brouhaha and the best reader comments.

During the week you can cast your vote using the embedded poll below, choosing whether you're in favor or against the motion. The final score will be announced on Friday, revealing whether the for or against argument was more popular. It's up to our writers to convince you to vote for their side.

For this debate, the motion is: Consolidating databases has significant storage benefits, therefore everyone should be doing it.

Previously, we've heard from Dave Cartwright arguing in favor of the motion. And now, arguing for a second time AGAINST the motion, our Storage Editor CHRIS MELLOR...

Consolidating different types of databases – such as RDBMSes and NoSQL stores with distributed databases and data warehouses – into a single database entity is virtually impossible, and consolidating them all into a single storage vault is an impractical curiosity that’s not suited to business. It's like trying to combine a horse and a fish, and building a noisy, crowded zoo, respectively.

Let’s look at the fishy equine affair first.

An SQL relational database typically uses ACID transactions while a NoSQL database typically does not. ACID – that's Atomicity, Consistency, Isolation, and Durability – means that when records are added, changed or deleted, the many users of the database all see the same record value. Consider a bank moving money from one customer account to another in a database. The transaction needs to complete in full, or not at all, so that all authorized users of the database see the same accurate account details all the time.
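The bank transfer above can be sketched in a few lines. This is a minimal illustration rather than how any particular bank or RDBMS does it: it assumes Python's built-in sqlite3 module, and the accounts table, the account names and the helper functions are hypothetical.

```python
import sqlite3

def make_demo_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Debit src and credit dst as one unit: both updates commit, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Raising here undoes the debit that was just applied.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
        return True
    except ValueError:
        return False

def balances(conn):
    """Return all balances as a dict keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

If the transfer fails part-way through, the rollback leaves both balances as they were, so no reader ever sees money that has left one account without arriving in the other.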

To meet ACID requirements, relational databases go to great lengths to ensure transaction validity and completeness, compensating for mishaps such as power and connection failures. NoSQL and other distributed databases do not do this. They provide no more than two out of the following three CAP properties:

  • Consistency: Every read request returns the most recent write or an error.
  • Availability: Every read request gets a response, but with no guarantee that it reflects the most recent write.
  • Partition tolerance: The system keeps operating even when messages sent between nodes are dropped or delayed.

NoSQL database read requests can return inconsistent data, while RDBMS read requests return consistent data. It is simply impossible to combine the two database types; a horse cannot be a fish, and vice versa.
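To see how a read can come back inconsistent, here is a toy model of a store that favors availability. It is a sketch under stated assumptions, not a description of any real NoSQL product: the ReplicatedStore class, its two in-memory nodes and the explicit replicate() step are all hypothetical stand-ins for asynchronous replication.

```python
class ReplicatedStore:
    """Toy two-node key-value store with asynchronous replication."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = []  # writes the primary has accepted but the replica has not seen

    def write(self, key, value):
        # The write is acknowledged as soon as the primary has it.
        self.primary[key] = value
        self.pending.append((key, value))

    def read_from_primary(self, key):
        return self.primary.get(key)

    def read_from_replica(self, key):
        # Always answers (availability), but the answer may be stale.
        return self.replica.get(key)

    def replicate(self):
        # Stand-in for the background process that ships writes to the replica.
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()
```

Write a value, replicate it, then write a new value: until replicate() runs again, the replica still serves the old value while the primary serves the new one. That window is the inconsistency an ACID database is built to rule out.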

If you host the different database types in a single storage pool, you get over the ACID-CAP incompatibilities but a whole new set of problems comes along.

Building a database zoo

Cramming different database types into a single storage pool will result in database and storage admins untangling a complex sizing and SAN performance problem. It will be like designing and building a zoo that can hold all manner of creatures for years to come.

Different database types have different storage needs for metadata, record table data, log data, and so forth, and each will grow at its own rate. Trying to arrive at a total size will be like trying to size a zoo’s capacity for all future insects, snakes, other reptiles, grazing animals, rodents and wild cats, kangaroos, and gorillas. No zoo has unlimited space, and compromises abound.
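Even the simplest version of that sizing sum shows why the answer is shaky. The sketch below assumes each database grows at a steady compound annual rate, which real workloads rarely do; the function name and the dictionary keys (data_gb, log_gb, metadata_gb, annual_growth) are hypothetical.

```python
def projected_capacity_gb(databases, years):
    """Sum each database's footprint after `years` of compound annual growth.

    Each entry in `databases` is a dict with hypothetical keys:
    data_gb, log_gb, metadata_gb (current sizes) and annual_growth (0.2 = 20%).
    """
    total = 0.0
    for db in databases:
        current = db["data_gb"] + db["log_gb"] + db["metadata_gb"]
        total += current * (1 + db["annual_growth"]) ** years
    return total
```

Every growth rate fed into this is a guess from a different DBA, and the errors compound along with the data.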

Also, the storage controllers must be sized correctly to handle the IO processing burden. Calculating that burden, given the database types present and the level of use of each type, will be a complex and lengthy job as the different DBAs each run their numbers and the storage admins try to satisfy all of their requirements. Diagnosing specific problems, such as a slow response from one particular database, will be tremendously difficult down the road because of all the variables involved.

The storage network links also have to be sized correctly; this is particularly important in the distributed database consolidation case, and the databases' scalability needs must be considered, too. One database could hog SAN resources and be a noisy neighbor to the others, causing the other systems to perform badly. The different databases could require unique security regimes and mixing them in one pool could break policy.
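The noisy neighbor effect can be shown with some back-of-the-envelope arithmetic. This sketch assumes an oversubscribed pool simply shares its IOPS budget in proportion to demand; real arrays and SANs schedule IO in more elaborate ways, and the function and parameter names here are hypothetical.

```python
def share_iops(capacity_iops, demand_iops):
    """Split a fixed IOPS budget across databases.

    Assumption: when total demand exceeds capacity, each database receives a
    share proportional to what it asked for. Otherwise everyone gets its demand.
    """
    total = sum(demand_iops.values())
    if total <= capacity_iops:
        return dict(demand_iops)
    scale = capacity_iops / total
    return {name: demand * scale for name, demand in demand_iops.items()}
```

Under this assumption, a pool rated at 10,000 IOPS with a transactional database asking for 4,000 and a warehouse load asking for 16,000 leaves the transactional system with half of what it needs, even though its own demand never changed.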

Altogether this could be a mind-bogglingly complex exercise.

Different database types cannot be combined into a single, all-singing, all-dancing uber-database. It’s theoretical madness. And different database types should not be consolidated into a single storage pool. That’s practical madness. They each need their own storage.

Different database types need to operate individually and have their own storage resources. Horses cannot be combined with fish into a single creature, and horses can’t live in the sea with fish, or fish on the land with horses. ®

Cast your vote below: are you for or against the motion? You can track the progress of the debate right here.
