In prior entries we discussed some of the basic capabilities of storage virtualization: thin provisioning and, in some cases, the ability to use dissimilar storage hardware. A newer feature is the ability to leverage a virtualized storage system's granular understanding of blocks of data to automatically move data segments to faster or slower storage tiers based on access patterns. This capability is known as automated storage tiering.
Automated storage tiering is particularly attractive as customers consider the use of solid state disk (SSD). SSD is known for its performance, but it is also known for its premium price. Customers need to balance the performance of SSD with its cost. The good news is that for most environments, at any given moment, less than 10% of the data is currently active or is going to be active.
This means that if there were a way to make sure that your most active data is always on solid state storage, you could get almost the full benefit of SSD while buying SSD capacity equal to only about 10% of your primary storage capacity. For example, a 1TB SSD tier could support a 10TB primary storage tier built on mechanical drives. The problem is, of course, how do you get that data to the high speed tier of storage?
Automated tiering is one solution and of course it is one that the storage virtualization vendors would love for you to adopt.
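The 10% rule of thumb above is simple arithmetic, and it can be sketched in a few lines. This is only an illustration: the function name and the `active_fraction` parameter are made up for this example, and the 10% default is the estimate quoted above, not a measured value for any particular environment.

```python
def ssd_tier_capacity_tb(primary_capacity_tb, active_fraction=0.10):
    """Estimate the SSD capacity needed to hold the active share of the data.

    active_fraction is an assumption (the roughly 10% figure cited in the
    text); measure your own environment before sizing a real tier.
    """
    if not 0 < active_fraction <= 1:
        raise ValueError("active_fraction must be in the range (0, 1]")
    return primary_capacity_tb * active_fraction


# The example from the text: a 10TB primary tier at 10% active data.
print(ssd_tier_capacity_tb(10))
```

If your active data set is closer to 20% than 10%, the same call with `active_fraction=0.20` shows how quickly the SSD bill grows.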
While each automated tiering solution is a little different, in general they work by monitoring data segments on disk. When a data segment reaches a certain level of read intensity, it is promoted to the SSD tier. When that data cools, or when another data segment is getting even more read activity, the segment is demoted to the mechanical tier. In some systems, this demotion can be to one of several mechanical tiers, from 15K SAS to high capacity SATA, for example.
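To make the promote and demote cycle concrete, here is a minimal sketch of the general idea described above. It is not any vendor's algorithm. The class name, the fixed read threshold, the single mechanical tier, and the "halve the counters" cooling step are all assumptions chosen to keep the example short.

```python
from collections import defaultdict


class TieringSketch:
    """Toy model: promote hot segments to SSD, demote cooler ones to disk."""

    def __init__(self, ssd_slots, promote_threshold):
        self.ssd_slots = ssd_slots                  # segments the SSD tier can hold
        self.promote_threshold = promote_threshold  # reads needed before promotion
        self.reads = defaultdict(int)               # read count per segment
        self.ssd = set()                            # segments currently on SSD

    def record_read(self, segment):
        """Count a read and return the tier that served it."""
        served_from = "ssd" if segment in self.ssd else "disk"
        self.reads[segment] += 1
        if served_from == "disk" and self.reads[segment] >= self.promote_threshold:
            self._try_promote(segment)
        return served_from

    def _try_promote(self, segment):
        if self.ssd_slots == 0:
            return
        if len(self.ssd) < self.ssd_slots:
            self.ssd.add(segment)
            return
        # SSD is full: displace the coolest resident only if the candidate
        # is getting strictly more read activity.
        coolest = min(self.ssd, key=lambda s: self.reads[s])
        if self.reads[segment] > self.reads[coolest]:
            self.ssd.remove(coolest)  # demotion to the mechanical tier
            self.ssd.add(segment)

    def cool(self, factor=0.5):
        """Age all counters, then demote segments that fell below the threshold."""
        for seg in list(self.reads):
            self.reads[seg] = int(self.reads[seg] * factor)
        for seg in [s for s in self.ssd if self.reads[s] < self.promote_threshold]:
            self.ssd.remove(seg)
```

A real system would track activity over time windows and move data in the background, but the shape is the same: count reads, compare against a threshold, and swap out the coolest resident when the fast tier is full.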
There are a few downsides to this approach. First, most automated tiering techniques act only on reads, meaning that all inbound writes go to mechanical disk first. Newly written data must then go through the read analysis before it can be promoted to SSD. This is a safe way to leverage SSD: there is limited risk of data loss because the first copy of data always lands on disk.
The downside is that writes are more expensive than reads for a disk system to handle, and in this model they get no help from the SSD tier. It also means that before a block is promoted to the SSD tier, some number of reads has to be served from mechanical disk.
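A quick way to see the cost of read-driven promotion is to count where a block's reads are served. The sketch below assumes a hypothetical fixed promotion threshold, immediate promotion once that threshold is hit, and room on the SSD tier; real products will differ on all three.

```python
def read_split(total_reads, promote_threshold):
    """Return (disk_reads, ssd_reads) for one block under a read-count threshold.

    Assumes the block starts on mechanical disk (all writes land there first)
    and is promoted right after its promote_threshold-th read.
    """
    disk_reads = min(total_reads, promote_threshold)
    ssd_reads = total_reads - disk_reads
    return disk_reads, ssd_reads


# A block read 100 times with a threshold of 5 reads.
disk, ssd = read_split(100, 5)
print(disk, ssd)
```

For a block that is read many times, the mechanical reads are a small tax. For a block that is read only a handful of times before it cools, the SSD tier may never help at all.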
Second, this results in a lot of work for the storage controller, probably more than it has ever done before. It has to constantly analyze all the data that it stores, identify certain segments for promotion, and then actually promote or demote the data. Combine this with all the other work that a storage controller has to do (snapshots, thin provisioning, replication and so on) and it can be overwhelmed.
In addition to the work of analyzing and moving data, there is also the issue that automated tiering sends a lot of write traffic to a storage medium that is very sensitive to high write traffic: flash SSD. The life expectancy of flash is always a concern in these high turnover situations.
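The flash wear concern can be roughed out with back-of-the-envelope arithmetic. The sketch below uses a common simplification: total writable data is capacity times rated program/erase cycles, divided by the daily write volume that promotions generate. The function name, the parameters, and the sample numbers are all hypothetical, and the model assumes perfectly even wear.

```python
def flash_life_years(capacity_gb, rated_pe_cycles, promoted_gb_per_day,
                     write_amplification=1.0):
    """Rough estimate of flash lifetime under tiering-driven write traffic.

    All inputs are assumptions to be replaced with the vendor's figures
    and your own measured turnover.
    """
    if promoted_gb_per_day <= 0 or write_amplification <= 0:
        raise ValueError("daily writes and write amplification must be positive")
    total_writable_gb = capacity_gb * rated_pe_cycles
    daily_flash_writes_gb = promoted_gb_per_day * write_amplification
    return total_writable_gb / daily_flash_writes_gb / 365


# Hypothetical: 1,000GB tier, 3,000 rated cycles, 2,000GB promoted per day.
print(round(flash_life_years(1000, 3000, 2000), 1))
```

The point of the exercise is the relationship, not the numbers: double the daily turnover in the SSD tier and the estimated life is cut in half.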
Despite these downsides, automated tiering is a quick and easy way for users to avoid the work of analyzing applications and determining which data sets should be manually placed on SSD. As you can imagine, automation of tiering is now a very popular subject, and there are more options available than just the automated tiering found in storage virtualization solutions. In our next entry, we will compare the different types of SSD automation solutions.
George Crump is lead analyst of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. Find Storage Switzerland's disclosure statement here.