As technology evolves, bottlenecks in the infrastructure move around. The switch speed leapfrogs the server speed, then the servers are upgraded with faster LAN cards and the spinning disks in the SAN become the weak link, so you upgrade and find that the SAN fabric is holding you back.
How does everything interact? And as the various bits of the hardware keep overtaking each other, can the software keep up or are we simply wasting our time and money?
Since we can't afford to have infinite speed in every part of the infrastructure, where do we spend and where do we hold back?
The usual setup
In the average infrastructure I tend to find:
- Servers connected by Gigabit Ethernet, with a growing amount of 10GbE
- Storage connected by 8Gbps Fibre Channel, with a growing amount of 10GbE iSCSI and some 16Gbps Fibre Channel
- Storage that is primarily spinning disk with a growing amount of flash
What is important is that all these technologies are extremely well established. Perhaps with the exception of Fibre Channel – which many of us found daunting at first but warmed to once we had played with it a bit – there is nothing I have mentioned that most infrastructure engineers find particularly difficult.
The 802.3an (10GbE over twisted pair copper) standard has been around since 2006, and 8Gbps Fibre Channel has been with us a few months longer. So as well as being well understood, they are cheap. And basic 1Gbps Ethernet is regarded as a staple – the laptop I am typing this on has a Gigabit port in it, based on a standard that came along 15 years ago.
What is also important in the Ethernet world is the existence of channel bonding – notably LACP and EtherChannel. So if your Gigabit Ethernet network is creaking a little, simply run up a bonded pair and you have doubled your bandwidth with a piece of network string and a couple of commands on the LAN switches and server NICs.
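That "couple of commands" is barely an exaggeration. Here is a minimal sketch, assuming a Linux server with two NICs and a Cisco-style switch; the interface names, bond name and port numbers are all illustrative, not taken from any particular kit:

```shell
# Server side: bond two NICs into an LACP (802.3ad) aggregate.
# eth0/eth1 and bond0 are placeholder names.
ip link add bond0 type bond mode 802.3ad
ip link set eth0 down
ip link set eth0 master bond0
ip link set eth1 down
ip link set eth1 master bond0
ip link set bond0 up

# Switch side (Cisco-style IOS, shown as comments since it is not
# a shell command): put the two server-facing ports into an LACP
# port channel.
#   interface range GigabitEthernet0/1 - 2
#    channel-group 1 mode active
```

The switch negotiates the aggregate with the server over LACP, and traffic is then hashed across both links.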
The reason many companies have not yet adopted 10GbE for their LANs is that although 1Gbps is not enough, a 4Gbps EtherChannel more than suffices. I have seen a lot of data centres where 10GbE implementations have crept first into the storage network, not the data LAN side.
We have talked about connectivity, now let's talk about the servers you are hanging on the end of the string. Servers have three main elements: processors, memory and storage.
I am going to get shot by the CPU police here, but the only fact that I think matters about processors (at least those for servers) is that with each new generation they get faster.
Yes, the vendors also strive to decrease the power consumption per CPU cycle, and of course the speed increase is due only to ridiculously difficult innovations in miniaturisation, pipelining, on-board cache and the like. But the bottom line is that processors get faster without really costing more.
The story is similar with RAM. As time goes by, you can put more and more RAM into a single slot in a server at a lower and lower cost per gigabyte, but generally speaking the speed of the memory is not your first consideration when you are buying the kit. Quantity is what you look for in RAM.
Although there is an overhead of a few per cent of your processing power and RAM if you layer virtualisation on top of the hardware, it is well worth the technological cost because of the ability to allocate resource dynamically and automatically to the virtual servers that sit on top.
It means that in extreme circumstances you can give a single virtual machine almost all the resource of a potentially socking big physical host if it needs it for a short while. No need for bottlenecks there, then.
Which brings us to storage. Now, my preference is for my servers to boot from a pair of mirrored on-board hot-swap disks, for two simple reasons: it doesn't cost much more than a diskless machine, and it means you can boot the beast even when it is not connected to the core storage.
Ah, the storage. I mentioned that as part of the server infrastructure, but in fact some kind of shared storage is the order of the day in most organisations to give best bang for the buck.
Storage is an interesting paradox. If you go for on-board storage you will waste money whatever you do. Buy a server with modest capacity for disks and you will kick yourself when you run out and have to move to a bigger box (or do a bare-metal rebuild having swapped out all the small disks for bigger ones).
Buy a server with spare disk slots that you don't fill and you will take up rack space that costs you money. Buy a server with plenty of disk slots and fill them with disks and you end up with puddles of unused storage all over your data centre.
The alternative, though, is to use shared storage to minimise wastage. The problem is that this resource has to be fast enough to keep up with the servers that are sharing it. That means three locations for the potential bottleneck: the SAN, the storage chassis setup and the disks themselves.
There is no reason why the SAN should be slow. 10GbE (or even bonded Gigabit links) should easily keep you going with the average storage array, and as we have said this type of networking is cheap. And if you like Fibre Channel, a 16Gbps implementation is not the cheapest but it works well and is fast.
The disks are another thing. Disk technology is the most prominent area where progress is relentless, with new and faster kit appearing every few months, or at worst every couple of years. The price of the new stuff remains high for a long time before the technology becomes sufficiently commoditised.
Think of how the cloud works: all the big providers let you choose solid state disk for a premium price or spinning disk for much less – you get to limit your systems' throughput in the interests of cost management.
Everything else is pretty scalable without breaking the bank, partly because of modest unit costs and partly because in a pay-as-you-go model you can crank the processing power up and down and pay only for what you use.
The same logic applies to your on-premise infrastructure. I have already noted that by using a virtualisation layer on your server hardware you can make the most of the CPU and RAM resource available: you can over-provision your virtual machines so that when machine A pauses to think about what to do next, machines B, C and D can nick the resource and use it.
But those machines will seldom be able to shift their disk usage up and down in the same way: regardless of how much I/O happens, most of what the servers write to the storage will sit on the disks for days, weeks or months.
Pinpoint the pinch point
Realistically, then, storage is always going to be the pinch point. Why do you think the storage vendors make such a big deal about the quality-of-service features of their storage subsystems?
“You can guarantee your database server the IOPS it needs,” they proclaim, meaning: “Your storage will be so contended that you need to be able to make some guarantees that your core systems will have fast enough access.”
And upgrading storage is not a trivial task: while you can whack in a new CPU card and add it to your ESXi host in minutes, or stuff in some 128GB DIMMs and have them immediately available, upgrading storage is a far more involved and much longer process (and it needs extra rack space and power into the bargain).
There is absolutely no excuse for having the pinch point of your infrastructure in the LAN. Insight (other kit vendors exist) will sell you a two-port 10GbE LAN card for £350, and a 24-port 10GbE switch for less than £4,000. So if the network is slowing you down, you are doing something wrong.
Server power is also very cheap to expand when things start to get congested, starting with memory. Assuming you have bought kit with a bit of expansion in mind (particularly with regard to spare RAM slots), expanding is a ridiculously cheap thing to do. Just £1,500 for a 128GB DDR4 module? That is less than £12 per gigabyte. No excuse for a bottleneck there either, then.
And even if you have to add to the processing power of your server estate, you are not breaking the bank there either. Adding a twin-processor server to the estate won't need you to grope down the back of the company sofa for stray 50p pieces, particularly if you maximise the use of the hardware through virtualisation and dynamic resource allocation.
The SAN? Well, if it is iSCSI then the 10GbE costs mentioned above apply; and if it is Fibre Channel then there is a cost but not a vast one (£1,000 for a two-port 16Gbps HBA, for example, is not that hideous).
Which leaves us with the storage. The vendor has given you the opportunity to share out the IOPS, which means it knows it is never going to keep up with demand.
If you want to add this year's technology then you have either to pull out the old storage, put in the new, and restore from a backup in a socking big downtime period; or find the space and power to put in the new stuff, copy the data while online, then either decom the old stuff or more likely demote it to less critical tasks.
Worse still, the problem is compounded by storage being the fastest-evolving technology. You are not going to sit with your Gigabit Ethernet LAN or SAN, waiting for the successor to 10GbE, as it is not imminent and anyway stuffing 10GbE in its place is easy.
Similarly few people wait for the processor after next from Intel or AMD, at least not at a server level. You just go with the most sensible of the current options.
Maybe with Fibre Channel you will wait for the new 32Gbps or 128Gbps variants, but on balance you probably won't because although they are projected for 2016 nothing is certain and they will be expensive. So if Fibre Channel is your choice you will whack in the 16Gbps flavour right now.
But with storage: do you go for SAS or SATA, or 7.2k, 10k or 15k spinning disk, or one of today's flavours of SSD, or one of the new flavours that are promised for a few months' time?
The temptation is strong to stick with something slow and likely to be a bottleneck rather than leap too soon into the next generation. It is something I used to see 20 years ago: the team I supported were forever agonising over whether to keep their crappy PowerBook while waiting for the next whizz-bang colour/hi-res Duo, or to upgrade now and gain speed but live with low-res greyscale.
Storage, then, is where your infrastructure bottleneck is inevitably destined to appear. Spend your money ensuring that the other, much cheaper areas of your world are kept up to speed because if you don't, you are doing something wrong.
And learn to make the most of your analysis tools and the features the vendors give you to tune and customise the I/O of the kit. Although the data river narrows when it hits the storage world, it is probably still wide enough for your purposes if you use it wisely.
Before we finish, let me sow a little seed of an idea about where all this disk I/O comes from: applications.
There are some amazingly well-written applications. Take the popular database management systems: SQL Server, Oracle and even the likes of MySQL and PostgreSQL are tuned wonderfully, with super-efficient code layers for accessing memory and disk storage.
Yes, they take a lot of memory and processor cycles, but the reward in terms of number-crunching and data-handling power is immense.
And then we let developers write code on them. SQL Server uses its indexes brilliantly – until some numpty of a developer writes code that misses the indexes and does half-million-row table scans.
Oracle's PL/SQL is an elegant, usable language that lets you do unbelievably funky things – including writing order-n-squared nested loops-within-loops if nobody ever taught you how to design algorithms.
There are some brilliant developers in this world. I have had the privilege of working with several. But my goodness, there are some awful ones whose code takes hopelessness to new levels.
It is said that there are more stars in the universe than grains of sand on Earth. That may be true, but I'd warrant that across the planet the count of misused IOPS and CPU cycles is higher still.
So yes, your storage is rightly the bottleneck in your infrastructure. But I bet that is being exacerbated by crap code. ®