When it comes to storage, which benchmarks to use, how to configure them and how to interpret the results have been the subject of many a heated debate.
Benchmarks are supposed to provide empirical data that can be used as evidence for drawing rational conclusions. Of course, if you torture data long enough it will confess to anything you want, and not everyone wants the same thing.
In today's storage world, flash is the new hotness. SSDs are different from traditional "spinning rust" hard drives. Despite this, they are largely used for the same tasks.
This alone gives rise to epic discussions about whether or not benchmarking tools are valid, even if (or especially if) they haven't been tuned to be SSD-aware.
What is SSD awareness? In SSDs, reads and writes have different effects on flash cells. In turn, these cells provide asymmetric performance between reads and writes.
What is more, writes of different sizes and patterns are handled differently. As you can see, SSD awareness in benchmarking can make a big difference to the numbers you get.
On one level, SSD awareness is useful. If you are testing SSDs that are directly attached to the computer via any of the exploding number of different interconnects, then it is reasonable to make sure that TRIM is respected, assuming that the ultimate use of the SSD will be on a TRIM-aware operating system.
Of course, what the operating system does or does not support matters little if the operating system doesn't have direct control over the SSD in question. If the SSD happens to be attached to a RAID controller, a host bus adapter (HBA) that doesn't expose the full capabilities of the SSD, or is attached via remote-storage protocols such as iSCSI, then operating system support doesn't matter.
Add in the presence of different types of flash-caching software and the effects that hybrid storage arrays and server SANs have upon mixed storage configurations, and measuring what your actual workload needs becomes far more important than raw I/O numbers.
Raw benchmarks don't care much about what is under the hood, nor do they try to simulate a given workload. They just spam traffic at the system's I/O driver and report back how much traffic it took to make the thing cry.
Raw benchmarks offer only the most basic nerd knobs to twiddle: sequential versus random, block size and read-to-write ratio.
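To make those three knobs concrete, here is a minimal Python sketch of how a raw benchmark might turn them into a stream of I/O requests. It is an illustration only, not how Vdbench, Iometer or fio are actually implemented: the Knobs class and the plan_ios function are hypothetical names, and the sketch only plans operations and offsets rather than touching a disk.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Knobs:
    """The three basic settings a raw benchmark exposes (hypothetical type)."""
    block_size: int       # bytes per I/O request
    read_fraction: float  # 0.0 = all writes, 1.0 = all reads
    sequential: bool      # True = walk the region in order, False = random


def plan_ios(knobs, region_bytes, count, seed=0):
    """Return a list of (op, offset) tuples describing `count` requests.

    Offsets are always aligned to the block size and stay inside the
    test region. A fixed seed keeps the plan repeatable between runs.
    """
    blocks = region_bytes // knobs.block_size
    if blocks < 1:
        raise ValueError("test region is smaller than one block")
    rng = random.Random(seed)
    plan = []
    for i in range(count):
        # Sequential access wraps around the region; random picks any block.
        block = i % blocks if knobs.sequential else rng.randrange(blocks)
        op = "read" if rng.random() < knobs.read_fraction else "write"
        plan.append((op, block * knobs.block_size))
    return plan


if __name__ == "__main__":
    # 70 per cent reads, 4KB blocks, random access over a 1MB region.
    mixed = Knobs(block_size=4096, read_fraction=0.7, sequential=False)
    for op, offset in plan_ios(mixed, region_bytes=1_000_000, count=5):
        print(op, offset)
```

Everything else a raw benchmark does is bookkeeping around this loop: issue each request, time it, and add up the results.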
Vdbench was a popular open-source tool but it has rapidly fallen from favour. Oracle bought it, and immediately ruined it with unpalatable licensing and generally being Oracle about the whole thing.
Vdbench is different from other benchmarks in that it is designed to run against the raw storage. Think \\.\PhysicalDriveX in Windows or /dev/sdX in Linux. No buffering from the operating system, just raw performance numbers. It is primarily used to test cache misses.
Iometer is the benchmarking tool that just won't die. Intel started the project and then abandoned it because it had got to the point where it did everything Intel wanted it to do.
Intel donated it to the Open Source Development Lab, for which it has the community's lasting gratitude. The community has continued its development.
Iometer is the tool you use when you want to test storage as it will be delivered by the operating system. It is not the raw storage approach of Vdbench. It tests by creating a huge great big file on a target volume, then doing all of its reads and writes from that file.
The open-source tool fio is an up-and-coming raw benchmark that is worthy of a look by anyone serious about benchmarking. There is a great deep dive written up by Spencer Hayes that provides all the info needed, assuming you are familiar with at least one of the other raw benchmarking tools.
All these tools have their pros and their cons. On the pro side, if you know what you are doing these tools will give you the most accurate information on the fundamental storage performance of a given storage device.
On the con side, they tell you absolutely nothing about the workloads that will run on top of that storage. Let's take a run-of-the-mill file server as an example.
I have a Synology RS3614SX+. With the right configuration and the right SSDs I can gleefully get 500,000 IOPS out of the thing, or just under 4GBps using iSCSI. Not bad for a NAS that runs for only $6000 (before drives).
Now, fill that system with files and try the same thing. You are emphatically not going to pull 4GBps of file transfer off that unit, even with four 10GbE SFP+ cables directly connecting the server to the NAS. Not going to happen: raw storage speed is one thing and metadata parsing is another*.
Move from "all reads" or "all writes" to "mixed read and write" and you will cut the IOPS in half. Run mixed workloads – a few iSCSI LUNs and some file hosting on the side – and you get even lower results.
After months of testing with real-world workloads I came to the conclusion that I can reliably get about 1200MBps at around 100,000 IOPS off the Synology RS3614SX+ using one vendor's SSDs before the latency falls off a cliff.
Using SSDs from another vendor I drive that up to about 1500MBps, and a third vendor gave me around 1400MBps.
That is still a heck of a deal for a $6000 NAS, but it is a far cry from the headline speeds advertised by Synology (439,000 IOPS or 3600MBps) or my own peak tested speeds.
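IOPS and throughput are tied together by the average size of each I/O, so it is worth doing the arithmetic on the figures above. The short Python sketch below does that. It assumes 1MB means 1,000,000 bytes and ignores protocol overhead, and the function names are mine rather than anything from a benchmarking tool.

```python
def avg_io_size_bytes(throughput_mbps, iops):
    """Average bytes moved per I/O, given throughput in MBps and IOPS.

    Assumes decimal megabytes (1MB = 1,000,000 bytes).
    """
    return throughput_mbps * 1_000_000 / iops


def line_rate_mbps(links, gbit_per_link):
    """Theoretical aggregate network line rate in MBps, before overhead."""
    return links * gbit_per_link * 1000 / 8


if __name__ == "__main__":
    # Real-world figure quoted above: 1200MBps at about 100,000 IOPS.
    print(avg_io_size_bytes(1200, 100_000))   # 12000.0 bytes per I/O
    # Vendor headline figure: 3600MBps and 439,000 IOPS.
    print(avg_io_size_bytes(3600, 439_000))   # a little over 8200 bytes
    # Four 10GbE links between the server and the NAS.
    print(line_rate_mbps(4, 10))              # 5000.0 MBps
```

On these assumptions the headline figures work out to roughly 8KB per I/O, while the real-world workloads averaged about 12KB, so the two sets of numbers are not measuring quite the same thing. Four 10GbE links also top out at 5000MBps on paper, so the network is not what stops you pulling 4GBps of file transfer.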
In addition to raw speed, you need to worry about latency. As you push a storage device to the redline, latency will increase. In practice you want to model how your storage system's latency responds as the load rises.
Turn the knobs a bit at a time and figure out where the cliff is. Set all your monitoring to alert you just before you hit it; base those monitors on the total IOPS, or total throughput, and you might be able to avoid real latency crashes that affect running workloads.
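As a rough sketch of that process, the Python below takes a set of (IOPS, latency) measurements from a stepped test, finds the last load level that stayed inside a latency budget, and sets an alert level a little below it. The sample numbers are invented for illustration and are not measurements from any real unit; the 5ms budget, the 10 per cent headroom and the function names are likewise assumptions.

```python
def find_cliff(samples, max_latency_ms):
    """Return the highest IOPS level measured before latency broke the budget.

    `samples` is a list of (iops, latency_ms) pairs from a stepped test.
    Returns None if even the lightest load exceeded the budget.
    """
    safe = None
    for iops, latency_ms in sorted(samples):
        if latency_ms > max_latency_ms:
            break  # everything beyond this point is over the cliff
        safe = iops
    return safe


def alert_threshold(cliff_iops, headroom_pct=10):
    """IOPS level at which monitoring should warn, just short of the cliff."""
    return cliff_iops * (100 - headroom_pct) // 100


if __name__ == "__main__":
    # Invented example data: latency creeps up, then jumps.
    samples = [
        (20_000, 0.8),
        (40_000, 0.9),
        (60_000, 1.1),
        (80_000, 1.6),
        (100_000, 2.4),
        (120_000, 9.5),
    ]
    cliff = find_cliff(samples, max_latency_ms=5.0)
    print(cliff)                   # 100000
    print(alert_threshold(cliff))  # 90000
```

The smaller the steps between test runs, the more precisely you can place the cliff, and the less headroom you need to leave in the alert.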
Beyond both speed and latency is disk utilisation. When working on a demo for Synology, I was very unhappy that I couldn't push the RS3614SX+ over 80 per cent utilisation.
The Synology folks actually thought that was pretty good as they typically see only 70 to 75 per cent on most of their lab tests, and far lower utilisation by most people in the field.
I kept fussing over the thing for months until I finally pushed it to between 90 and 95 per cent consistently. It took a little more tuning to ensure that I realised the increased performance without experiencing a latency cascade, but I did manage to get it to stably deliver at more than 90 per cent disk utilisation with acceptable latency.
This is the power that raw benchmarks can provide. I now know the exact storage profile of that unit. I know exactly what it can deliver and under what circumstances.
When virtual machines are running off the storage it provides I know what to expect and where to start looking – at the application, the operating system, the hypervisor or the storage unit – if performance is not meeting expectations.