In our last entry, we covered some of the differences between automated tiering and caching. An area that deserves specific examination is how each of these technologies deals with inbound writes. Writes are one of the most resource-consuming things that a storage system has to do, and how these technologies help you--or not--with write I/O is important in product selection.
As I said in my last post, every vendor's implementation of these technologies is slightly different and there will be exceptions. However, many caching and automated tiering technologies do not, as a rule, accelerate write I/O. Many automated tiering systems send all inbound writes to the mechanical hard drive first, and many cache systems are read-only, meaning again that all writes go to disk first.
Either of these schemes--going to disk first or disk only--is a very safe, conservative way to handle your data and that is not necessarily a bad thing. You probably still will see a write performance boost because now that all or most reads are coming from solid state, the mechanical disk tier is left only to have to handle inbound write I/O. Basically you have decreased its workload substantially.
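To put rough numbers on that reduced workload, here is a back-of-the-envelope sketch. The function name and the example figures are hypothetical, chosen only for illustration, and the model assumes every write still lands on disk while only SSD read misses fall through to it.

```python
def disk_iops_after_caching(total_iops, read_fraction, ssd_hit_rate):
    """Estimate the I/O load left on the mechanical tier.

    Assumes a read-only cache: all writes go to disk, and only the
    reads that miss the SSD tier reach the disk as well.
    """
    reads = total_iops * read_fraction
    writes = total_iops - reads
    read_misses = reads * (1 - ssd_hit_rate)
    return writes + read_misses


# Illustrative figures only: 1,000 IOPS, 75% reads, 75% of reads served from SSD.
remaining = disk_iops_after_caching(1000, 0.75, 0.75)
print(f"Disk tier now handles about {remaining:.0f} of 1000 IOPS")
```

In this made-up case the disk tier is left with well under half of the original load, which is why writes can feel faster even though nothing is accelerating them directly.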
Sending all write I/O to disk first, though, means that before data can be promoted to the SSD tier, it has to be accessed. In some systems that use a heat map type of approach this warming process can take hours or even days to occur. But there are ways around this while still keeping your data safe. For example, some cache systems have the ability to store all inbound writes in solid state disk (SSD) as well as write them to the hard disk tier. The idea here is that what you just wrote to storage is also the most likely to be requested next. This method gives you the faster read response without having to wait for the data to warm up to be cached, while still giving you the protection of having the data on disk.
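The difference between the two approaches can be sketched in a few lines. This is a toy model, not any vendor's implementation: the class name, the cache_on_write switch, and the promote_after threshold (a stand-in for a heat map) are all assumptions made for illustration.

```python
class CachedVolume:
    """Toy SSD cache in front of a hard disk tier (illustrative only)."""

    def __init__(self, cache_on_write, promote_after=3):
        self.disk = {}   # block -> data, the protected copy
        self.ssd = {}    # block -> data, the fast copy
        self.heat = {}   # block -> disk reads since the last write
        self.cache_on_write = cache_on_write
        self.promote_after = promote_after

    def write(self, block, data):
        self.disk[block] = data           # every write is persisted on disk
        if self.cache_on_write:
            self.ssd[block] = data        # also keep a copy in SSD
        else:
            self.ssd.pop(block, None)     # drop any stale cached copy
            self.heat[block] = 0          # the block must warm up again

    def read(self, block):
        if block in self.ssd:
            return self.ssd[block], "ssd"
        data = self.disk[block]
        self.heat[block] = self.heat.get(block, 0) + 1
        if self.heat[block] >= self.promote_after:
            self.ssd[block] = data        # hot enough: promote to SSD
        return data, "disk"


read_only = CachedVolume(cache_on_write=False)
read_only.write(7, b"payload")
print([read_only.read(7)[1] for _ in range(4)])  # ['disk', 'disk', 'disk', 'ssd']

write_through = CachedVolume(cache_on_write=True)
write_through.write(7, b"payload")
print(write_through.read(7)[1])                  # ssd
```

In both modes the data is on disk before the write returns; the only thing that changes is whether the first read has to wait for the block to warm up.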
As we discuss in our recent article How Data Centers Can Benefit From SSD Today, the next level of write performance is to actually cache or store all inbound writes to the SSD tier. This would mean storing writes to the cache or SSD tier first and then, if a data block does not get requested or modified, gradually demoting it to the mechanical drive tier. Though flash memory is slower at writes than it is at reads, its write speed is still substantially faster than that of hard disk drives. If you can get that performance boost safely, a write-heavy application might see a significant improvement.
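A minimal sketch of that SSD-first flow follows. The class name, the idle_limit parameter, and the logical clock passed in as now are hypothetical; real products decide when to demote in their own ways.

```python
class WriteBackTier:
    """Toy SSD-first tier: writes land in SSD, idle blocks move to disk."""

    def __init__(self, idle_limit):
        self.ssd = {}          # block -> data
        self.disk = {}         # block -> data
        self.last_touch = {}   # block -> logical time of last read or write
        self.idle_limit = idle_limit

    def write(self, block, data, now):
        self.ssd[block] = data             # acknowledged from SSD, not yet on disk
        self.disk.pop(block, None)         # any older disk copy is now stale
        self.last_touch[block] = now

    def read(self, block, now):
        if block in self.ssd:
            self.last_touch[block] = now   # a request keeps the block in SSD
            return self.ssd[block]
        return self.disk[block]

    def demote_idle(self, now):
        """Move blocks untouched for idle_limit ticks to the disk tier."""
        idle = [b for b, t in self.last_touch.items()
                if now - t >= self.idle_limit]
        for block in idle:
            self.disk[block] = self.ssd.pop(block)
            del self.last_touch[block]
        return idle


tier = WriteBackTier(idle_limit=10)
tier.write("a", b"1", now=0)
tier.write("b", b"2", now=0)
tier.read("a", now=8)              # "a" is still in demand
print(tier.demote_idle(now=10))    # ['b']
```

Note that until a block is demoted, the SSD holds the only copy. That is the source of both the speed and the risk.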
Designing a system to send all writes to SSD first is not difficult. The important part is making sure that the SSD tier is reliable because if that SSD tier or cache system fails then data could be lost. This is particularly important for caching vendors because their solutions often are standalone systems that improve performance across a variety of mechanical storage hardware. They have to have their cache mirrored and clustered in such a way that a failure on one device does not mean the loss of data.
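The mirroring requirement can be illustrated with a small sketch. The CacheNode and MirroredWriteCache names are made up for this example, and it deliberately ignores what a real product must also handle, such as resynchronizing a failed node or dropping back to disk-first writes while the mirror is degraded.

```python
class CacheNode:
    """One SSD cache device (illustrative stand-in)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.alive = True

    def store(self, block, data):
        if not self.alive:
            raise RuntimeError(f"{self.name} is down")
        self.blocks[block] = data


class MirroredWriteCache:
    """Acknowledge a write only after both cache nodes hold it."""

    def __init__(self, primary, mirror):
        self.nodes = [primary, mirror]

    def write(self, block, data):
        for node in self.nodes:
            node.store(block, data)   # raises if either node is unavailable
        return True                   # both copies exist before the ack

    def read(self, block):
        for node in self.nodes:
            if node.alive and block in node.blocks:
                return node.blocks[block]
        raise KeyError(block)


primary, mirror = CacheNode("ssd-a"), CacheNode("ssd-b")
cache = MirroredWriteCache(primary, mirror)
cache.write(42, b"order record")
primary.alive = False                 # simulate a device failure
print(cache.read(42))                 # b'order record'
```

The write costs two SSD operations instead of one, but a single device failure no longer takes the only copy of the data with it.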
For auto-tiering solutions provided by the hardware manufacturer, we would expect them to leverage the same high availability that their storage system already offers for the mechanical tier, which typically means dual controllers, dual power supplies, and so on. Interestingly, I know of only a few storage system vendors that start with inbound writes in flash. The rest cite reducing write traffic to the flash (and so increasing its life) as the primary motivation for not storing writes to SSD first.
Whether you cache writes or not is going to be largely dependent on your environment. If you have a read-heavy workload, then the extra effort to get write caching might not be worth it to you. If you have a write-heavy workload, then it might significantly increase performance.
There are several other points of discussion around write acceleration that we will get into in the future, including how write acceleration impacts the virtual server environment and how operating systems or file systems could make write I/O less resource-consuming. Next, though, we will discuss eliminating auto-tiering and caching issues altogether with SSD-only systems or direct placement of data on SSD appliances.
George Crump is lead analyst of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. Find Storage Switzerland's disclosure statement here.