Does your company really need all that storage?
How to get the best performance
I was chatting not long ago to a sales guy from one of the big storage vendors. The market he serves is one of those where some of his customers buy a petabyte at a time – which I am sure he is happy about when it comes to hitting his sales quotas.
The thing is, though, most companies have much more modest requirements. These “normal” companies, as I think of them, have a storage requirement of a few terabytes, or perhaps a few dozen terabytes, and budgets that don't have incomprehensible numbers of zeroes on the end.
How, then, do they implement their storage requirements in a way that performs well and doesn't cost a vast amount of money?
It sounds obvious, but shared storage is an absolute must – despite the fact that it may well be slower on average than direct-attached storage (DAS).
What you lose in performance you more than make up for in flexibility: with DAS you invariably have the problem of one server being close to capacity while half a dozen others have hundreds of gigabytes free.
Shared storage lets you run more servers using less total disk space, with the ability to expand any given server's allocation from the central pool on demand (and often without even needing a server reboot).
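The pool-based allocation described above can be sketched as a toy model (all names and figures here are invented for illustration, not any vendor's API):

```python
class StoragePool:
    """Toy model of a shared pool that grows server volumes on demand.

    Illustrative only: real arrays do this through their management
    interface, often with thin provisioning underneath.
    """

    def __init__(self, total_gb):
        self.total_gb = total_gb
        self.allocations = {}  # server name -> allocated GB

    def allocate(self, server, gb):
        """Grow a server's volume from the free pool, if space remains."""
        used = sum(self.allocations.values())
        if used + gb > self.total_gb:
            raise ValueError("pool exhausted")
        self.allocations[server] = self.allocations.get(server, 0) + gb

    def free_gb(self):
        return self.total_gb - sum(self.allocations.values())


# Six servers share one 10TB pool instead of six over-sized DAS arrays.
pool = StoragePool(total_gb=10_000)
for name in ("web1", "web2", "db1", "db2", "mail", "files"):
    pool.allocate(name, 1_000)
pool.allocate("db1", 500)  # db1 grows on demand, no reboot needed
```

The point is that db1's growth comes out of the shared free space rather than requiring its own over-provisioned local disks.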
And because you need less storage overall you have the option to spend some of what you have saved on faster disks for the shared resource.
Fibre Channel or Ethernet?
It is not so many years since fast storage access meant Fibre Channel connectivity and expensive SAN technologies. These days Fibre Channel switches are a commodity and have plummeted in price – and meanwhile 10Gbps Ethernet has come along and is dead cheap to buy.
Furthermore there is nothing to prevent you from booting servers over an Ethernet LAN using iSCSI, the only caveat being that (unsurprisingly) each server will need a network adaptor that supports iSCSI booting.
Such beasties are not expensive: a quick search on my supplier's website threw up a two-port 10GbE iSCSI-capable NIC for less than £400.
Of course, you only need iSCSI-bootable NICs if you are booting physical servers from SAN storage. If you have a virtual environment, then the chances are the host servers each have a pair of small internal boot disks (mirrored for resilience) from which the hypervisor boots locally. All the supplementary storage on which the virtual machines rely is mounted normally through iSCSI.
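On a Linux hypervisor host, mounting that supplementary iSCSI storage is a two-step job with the standard open-iscsi tools. The portal address and IQN below are placeholders for whatever your SAN presents:

```shell
# Discover targets offered by the SAN, then log in to one of them.
# 192.0.2.10 and the IQN are placeholder values, not real kit.
iscsiadm -m discovery -t sendtargets -p 192.0.2.10:3260
iscsiadm -m node -T iqn.2001-04.com.example:storage.vm-data \
         -p 192.0.2.10:3260 --login
# The LUN now appears as a local block device (e.g. /dev/sdb),
# ready to format and mount like any internal disk.
```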
There are two types of access to shared storage. First, and most important for our purposes, is block-level storage: presenting the shared disks via a SAN so they can be mounted in the fashion of a local disk at operating system driver level.
This is what we have been talking about thus far: internal disks and iSCSI/Fibre Channel connected storage use block-level access. If a disk is mounted at block level on a server, the apps running on that server see it just as if it were a local disk.
The second access method, file-level storage, is what you need in order to provide networked storage to users. This is the level at which file-sharing protocols such as CIFS/SMB and NFS work, so to offer file shares at user level (for example shared data drives for various teams to use for collaboration) you will need to present some kind of file-level volumes.
The most performant option is a storage system that can natively present file systems using file-level protocols. Client machines connect directly over the network to the storage device. This is hooked into the corporate directory service so it can enforce permissions based on user and group information.
The alternative is to present the file stores using a file server, which mounts the storage at block level and presents it with file-level protocols.
The advantage is that the storage subsystem is generally cheaper if it only has to support block-level presentation. The disadvantage is the slight inefficiency of having two network hops between client and disk and the necessity to use a clustered pair of servers to avoid a single point of failure.
IT writers bang on about multi-tiering of storage: building your storage arrays as an appropriately balanced collection of different types of disk.
So you start with super-fast flash storage for your high-performance applications, then high-rpm spinning disk for the next tier, perhaps another layer of slower disk below that, and usually finish up with some cheap SATA-connected 5,400rpm disk in a final tier for disk-to-disk backups and general storage.
Meanwhile, back in the real world we find that nobody really does this. The more tiers you have the more work there is to manage it all, and the more you end up juggling volumes between storage types – particularly as the faster storage starts filling up because you have overdone the provisioning.
Most people will go for just a couple of layers – one nice and fast and the other cheap and cheerful – because for general file, print and email operation the latter is all you really need.
Of course, if you have vast numbers of servers, so that the storage's IOPS capacity is spread too thinly to keep up, then slow disk is less attractive.
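A back-of-envelope calculation shows how quickly that happens. The per-disk figures below are rough illustrative assumptions, not vendor specs:

```python
# How thinly does an array's aggregate IOPS spread across servers?
# All figures are illustrative assumptions.

def iops_per_server(disk_iops, disk_count, server_count):
    """Aggregate IOPS of the array divided evenly across servers."""
    return disk_iops * disk_count / server_count

# Twelve 7,200rpm disks at roughly 75 IOPS each, shared by 10 servers:
light = iops_per_server(disk_iops=75, disk_count=12, server_count=10)

# The same shelf shared by 90 servers leaves each one a sliver:
heavy = iops_per_server(disk_iops=75, disk_count=12, server_count=90)

print(light, heavy)  # 90.0 vs 10.0 IOPS per server
```

Ten IOPS per server is not much to run anything on, which is exactly when the cheap tier stops being cheap and starts being a bottleneck.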
But so long as you are not overdoing it then there is every chance that you can have sensibly priced storage for general use with a bit of high-speed stuff in there for the heavy applications.
Need for speed
Defining a storage volume should be about more than just deciding which speed disks you go for and how big it needs to be.
When you are choosing a brand of storage, you need to disregard the ones that allow you only this basic functionality and look for one that lets you shape the speed each volume supports.
All the decent storage subsystems on the market these days let you configure constraints on the throughput of each volume you define.
Any storage device has a finite capacity for data transfer, and defining the characteristics of each volume makes sure that the key applications are guaranteed the bandwidth they need and the low-priority apps can't simply grab all of the I/O cycles.
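The arithmetic behind those guarantees is simple enough to sketch. This toy check (volume names and throughput figures are invented) verifies that the guaranteed bandwidth you hand out never exceeds what the array can actually sustain, while caps on low-priority volumes may happily oversubscribe it:

```python
# Sanity-check per-volume bandwidth guarantees against the array's
# total throughput. Names and figures are illustrative only.

ARRAY_MBPS = 1_000  # total throughput the storage device can sustain

volumes = {
    "erp-db":   {"guaranteed": 400, "cap": 600},  # key application
    "file-srv": {"guaranteed": 150, "cap": 300},
    "test-lab": {"guaranteed": 50,  "cap": 200},  # low priority
}

def guarantees_fit(vols, total):
    """Guarantees must never be oversubscribed; caps may be."""
    return sum(v["guaranteed"] for v in vols.values()) <= total

assert guarantees_fit(volumes, ARRAY_MBPS)  # 600MB/s guaranteed: fits
```

With guarantees under the total, the key applications always get their share; the caps just stop the test lab grabbing every spare I/O cycle.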
It is absolutely essential to dedicate a LAN (or at the very least a VLAN) to your block-level storage network. Letting storage and user connections share a LAN is a recipe for unpredictable behaviour as demand ebbs and flows.
LAN switches – even 10GbE ones – are highly affordable, and dedicating a pair of speedy 10GbE switches (you need to make it resilient, remember?) to your storage LAN will pay dividends in performance.
In the same vein, you must also have a separate backup LAN or VLAN. This is probably even more important than the separate storage LAN, because as we all know, when the backup kicks in the network takes a socking big hit and things slow down hideously.
So keep your storage, your data and your backups separate and you will keep performance up.
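If your switches are already trunking tagged traffic, the host side of that separation is a few lines of standard iproute2 configuration on Linux. The interface name, VLAN IDs and addresses below are placeholders:

```shell
# Separate storage and backup VLANs on one 10GbE interface.
# eth0, the VLAN IDs and the addresses are placeholder values.
ip link add link eth0 name eth0.20 type vlan id 20   # storage VLAN
ip link add link eth0 name eth0.30 type vlan id 30   # backup VLAN
ip addr add 10.0.20.5/24 dev eth0.20
ip addr add 10.0.30.5/24 dev eth0.30
ip link set eth0.20 up && ip link set eth0.30 up
```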
Spot the bottleneck
No IT infrastructure has an infinite amount of bandwidth and resource. Hence there will always be a bottleneck somewhere; the overall performance of the installation will be limited by one system or another.
I would argue that the bottleneck should be in the most expensive component: you wouldn't want a cheap switch, say, preventing you from accessing an expensive storage array at its maximum speed.
Given that storage is likely to be one of the expensive parts of your infrastructure, it is not a bad thing if it is the component that defines the speed limit of the overall setup.
It means that you have a fairly straightforward value judgment to make if you decide that more speed is needed. If you can't live without it then you have to find the money, but at least you are safe in the knowledge that the kit between server and storage will let you access the pricey bit effectively.
The final aspect of making your storage perform is to monitor it. Make sure you have monitors on all the ports and links between critical devices, examine the data regularly, and use what you find to change the configuration.
I mentioned earlier that you should be able to shape the access speed for each volume, but you should also regularly check the actual behaviour. If a volume is hitting its limit and another is getting nowhere near, that is your chance to tune the behaviour to maximise effectiveness – and you can't do this without sensible monitoring tools.
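That tuning loop can be sketched in a few lines: compare each volume's observed peak throughput with its configured limit and flag the ones that need more headroom against the ones that can donate it. The sample figures are invented for illustration:

```python
# Compare observed peak throughput with each volume's configured limit
# and flag rebalancing candidates. Sample data is invented.

observed = {            # volume -> (limit MB/s, measured peak MB/s)
    "erp-db":   (600, 595),
    "file-srv": (300, 80),
    "backup":   (400, 120),
}

def rebalance_hints(stats, hot=0.9, cold=0.5):
    """Volumes running above `hot` of their limit need more headroom;
    those below `cold` of it can donate some."""
    starved = [v for v, (lim, seen) in stats.items() if seen >= hot * lim]
    donors  = [v for v, (lim, seen) in stats.items() if seen <= cold * lim]
    return starved, donors

starved, donors = rebalance_hints(observed)
print(starved, donors)  # ['erp-db'] ['file-srv', 'backup']
```

Here erp-db is pressing against its 600MB/s limit while file-srv and backup sit well under theirs – exactly the situation where shifting throughput between volumes pays off.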
For a “normal” company, storage performance is really not hard. Requirements tend not to be extreme, so two tiers of storage is generally absolutely fine. Don't worry about splashing out on Fibre Channel SAN kit – 10GbE IP-based storage is generally plenty.
Concentrate your spend on getting storage with the features and control you need, particularly the ability to configure different volumes with different levels of throughput.
And use monitoring tools continually to observe behaviour. Update the configuration to turn down under-used volumes so you can increase the throughput of those whose performance boundaries are being pushed.