Automated storage tiering (ATS) has been one of the most discussed storage array topics of the past year.
It refers to the ability of a storage array to move chunks of data between different disk types and RAID levels in order to strike the right balance between performance and space usage and to avoid so-called hot spots.
The pioneer in this arena was Compellent (now acquired by Dell), which has delivered this kind of functionality since 2004, but now all the top-tier vendors have similar functionality in their products. EMC, HDS, HP-3Par, IBM, and NetApp call it by different names and implement it in their own ways, owing to very different architectures and points of view regarding the virtualisation layers at the storage level. This article will therefore not point out which one is best; my objective is a 360° review of the most important metrics to evaluate when you compare ATS implementations.
LUN or sub-LUN
In the early days, some vendors tried to sell a "semi-automatic" feature that moved LUNs from fast disks (FC) to cheap, high-capacity disks (SATA) when the I/O requests fell below a certain threshold. This approach is very simple, but it is almost unusable in real life because of the large data movements involved and the lack of granularity: the risk is moving a LUN down and then having to move it back up at the next I/O peak. Imagine what that means when you have a multi-terabyte LUN in an IOPS-stressed array!
Modern sub-LUN implementations of ATS are considerably smarter than the earlier ones: a LUN is divided into chunks, and each chunk can be placed on a different tier of disk, sometimes even protected with a different RAID level.
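To make the sub-LUN idea concrete, here is a minimal sketch of a LUN whose chunks are tracked individually. It does not reflect any vendor's implementation: the 1 MiB chunk size, the tier names and the `Lun` class with its methods are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 1024 * 1024  # hypothetical 1 MiB chunk; real sizes are vendor-specific


@dataclass
class Lun:
    """Illustrative LUN that records the tier of each chunk separately."""
    size_bytes: int
    default_tier: str = "SATA"
    # chunk index -> tier name; chunks not listed here sit on default_tier
    placement: dict = field(default_factory=dict)

    def chunk_of(self, offset: int) -> int:
        """Return the index of the chunk that contains a byte offset."""
        if not 0 <= offset < self.size_bytes:
            raise ValueError("offset outside the LUN")
        return offset // CHUNK_SIZE

    def tier_of(self, offset: int) -> str:
        """Return the tier currently holding the chunk at this offset."""
        return self.placement.get(self.chunk_of(offset), self.default_tier)

    def move_chunk(self, chunk: int, tier: str) -> None:
        """Record that a single chunk now lives on another tier."""
        self.placement[chunk] = tier


lun = Lun(size_bytes=10 * CHUNK_SIZE)
lun.move_chunk(3, "SSD")                  # only chunk 3 moves, not the whole LUN
print(lun.tier_of(3 * CHUNK_SIZE + 42))   # SSD
print(lun.tier_of(0))                     # SATA
```

The point of the sketch is that moving one chunk touches 1 MiB of data, while the rest of the LUN stays where it is.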
Engines and algorithms
Each vendor has developed its own movement engine to migrate data chunks between tiers on a schedule, in response to events, and/or according to access frequency. Each implementation depends on the overall architecture but, in general, we can consider two approaches: "all in the array" or "something out of the array".
When the implementation is "all in the array", all the monitoring capabilities live inside the controllers and their software, and all the functionality is independent of the host software and operating system. When some parts of the movement process are implemented outside the array (e.g. agents on the O/S to monitor IOPS, or an external analytics tool), the data movements are at risk because the environment is more complex to manage and depends on external factors.
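As a rough illustration of access-frequency-based movement, the sketch below runs one monitoring interval over a set of chunks. Real engines are proprietary and far more sophisticated; the thresholds, the three tier names and the `retier` function are assumptions made up for this example.

```python
TIERS = ["SATA", "FC/SAS", "SSD"]  # ordered from slowest to fastest
PROMOTE_AT = 100  # hypothetical: accesses per interval at or above which a chunk moves up
DEMOTE_AT = 10    # hypothetical: accesses per interval at or below which a chunk moves down


def retier(placement, access_counts):
    """Return a new chunk -> tier map after one monitoring interval.

    Each chunk moves at most one tier up or down per interval.
    """
    new_placement = {}
    for chunk, tier in placement.items():
        level = TIERS.index(tier)
        hits = access_counts.get(chunk, 0)
        if hits >= PROMOTE_AT and level < len(TIERS) - 1:
            level += 1  # hot chunk: promote one tier
        elif hits <= DEMOTE_AT and level > 0:
            level -= 1  # cold chunk: demote one tier
        new_placement[chunk] = TIERS[level]
    return new_placement


placement = {0: "SATA", 1: "FC/SAS", 2: "SSD"}
counts = {0: 500, 1: 50, 2: 3}
print(retier(placement, counts))
```

Here chunk 0 is promoted, chunk 1 stays put and chunk 2 is demoted. In an "all in the array" design the counters would be collected by the controllers themselves; in the other approach they would come from host agents or an external tool.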
Granularity means efficiency, and efficiency usually means better savings. Chunk sizes vary widely, from 512KB to 1GB, and some vendors are already talking about going down to 32KB blocks. Granularity is important because:
- The finer the granularity, the less data you move in the backend;
- Small chunks can be moved up and down relatively often, allowing better data placement and faster, finer tuning.
The risks with big chunks are:
- Before moving a large amount of data, the algorithm has to wait longer, with the risk of moving the data when it is already too late;
- If you have a small amount of active data, such as a few megabytes in a large LUN, you risk having gigabytes of data moved to the upper, costlier tier.
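The effect of chunk size on promoted data can be checked with a little arithmetic. The sketch below assumes a worst case in which every hot region falls in a different chunk and does not straddle a chunk boundary; the workload figures (100 hot regions of 64KB each) and the `promoted_bytes` helper are invented for illustration.

```python
import math

KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def promoted_bytes(hot_regions, region_bytes, chunk_bytes):
    """Bytes moved to the upper tier when whole chunks must be promoted.

    Assumes each hot region sits in its own chunk(s) and is chunk-aligned.
    """
    chunks_per_region = math.ceil(region_bytes / chunk_bytes)
    return hot_regions * chunks_per_region * chunk_bytes


hot_data = 100 * 64 * KB  # 6.25 MB of data that is actually active
small = promoted_bytes(100, 64 * KB, 512 * KB)
large = promoted_bytes(100, 64 * KB, 1 * GB)

print(hot_data / MB)  # 6.25
print(small / MB)     # 50.0  (MB promoted with 512KB chunks)
print(large / GB)     # 100.0 (GB promoted with 1GB chunks)
```

Under these assumptions the same 6.25 MB of active data pulls 50 MB into the upper tier with 512KB chunks, but 100 GB with 1GB chunks.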
How many tiers do you need?
Normally you'll find that finer granularity also means more tiers supported. The most comprehensive implementations support many tiers for each LUN, spanning SSDs, FC/SAS and SATA with different RAID levels and data placement optimisations on the hard disk drives. Other vendors have preferred to implement a simpler two-tier scheme; it is less efficient, for sure, but it can be just "good enough" in some environments.
Demote or promote?
Another important point to evaluate is where data is first placed in the array upon a write operation. In the more granular ATS implementations, you will typically find that data is written to the fastest tier and demoted to the slower tiers later, giving you maximum performance immediately. Some implementations, though, write the data to the slowest tier and then, if the data is heavily accessed, promote it to an upper tier (most of the time SSDs).
I think the first approach is the best one for performance, but the second can result in better savings because you can use different technologies: if you use the first tier for writes (and it is made of SSDs) you need SLC technology to achieve the best results in terms of IOPS and resiliency, but if you write to lower tiers and hot data blocks are then promoted to SSDs you can safely use cheaper MLC drives.
The right use of SSDs
SSD is a very helpful technology for solving performance problems, but it isn't cheap, especially if you need a lot of space! With automated tiering you buy only the amount of SSD needed to serve the most accessed data blocks. In a well-configured multi-tier system with automated tiering enabled, the design resembles a pyramid (with SSD on top, FC/SAS in the middle, and SATA as the larger base).
Ease of use and the risks
Using ATS is very simple: normally you define a profile and apply it to a LUN via a GUI, and the system adapts itself to write the data in the right place. That's all!
The risk is that ATS is too simple: if you create the wrong profiles, or the whole array isn't correctly configured (in terms of quality and quantity of disks), you will get poor results that are unexpected and hard to diagnose.
Enrico Signoretti is the CEO of Cinetica, a small consultancy firm in Italy that offers services to medium/large companies in finance, manufacturing, and outsourcing. The company has partnerships with Oracle, Dell, VMware, Compellent and NetApp.