EMC's announcements of FASTcache and sub-LUN FAST, the feature formerly known as FAST 2.0, have me thinking once again about how to get the best bang for the big bucks flash memory will cost you. The whole idea of automated tiering is supposed to be moving the hot data to flash while leaving the less frequently accessed cold data on spinning disks. My question is: how do you determine which data is hot?
Clearly, automated tiering requires that the storage system collect some stats on access frequency. The simplest approach is to keep, for each data block, several days' worth of daily I/O counts or a moving average of I/Os per hour. An admin could then create a policy that, for selected volumes, moves blocks with higher access counts up to the flash tier and colder blocks down to the trash tier of high-capacity SAS drives.
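No vendor publishes the details of its tiering engine, so purely as an illustration of that average-based ranking, here's a rough Python sketch. The class, the 72-hour window, and the flash_capacity_blocks parameter are all my invention, not anybody's actual implementation:

```python
from collections import defaultdict, deque

class BlockHeatTracker:
    """Keep a moving window of per-block hourly I/O counts (hypothetical)."""

    def __init__(self, window_hours=72):
        # block_id -> last N hourly I/O counts for that block
        self.history = defaultdict(lambda: deque(maxlen=window_hours))

    def record_hour(self, hourly_io_counts):
        """hourly_io_counts: dict mapping block_id -> I/Os seen this hour."""
        for block_id, count in hourly_io_counts.items():
            self.history[block_id].append(count)

    def average_iops_per_hour(self, block_id):
        samples = self.history[block_id]
        return sum(samples) / len(samples) if samples else 0.0


def plan_tiering(tracker, block_ids, flash_capacity_blocks):
    """Promote the highest-average blocks to flash; everything else stays on disk."""
    ranked = sorted(block_ids, key=tracker.average_iops_per_hour, reverse=True)
    return ranked[:flash_capacity_blocks], ranked[flash_capacity_blocks:]
```

Run that every night and the blocks with the highest average access counts float up to flash. Which is exactly where the trouble with averages starts.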
Average temperatures are all well and good, but anyone who has been to camp in the mountains knows that the average temperature doesn't say enough about the weather to know how to dress. Even if the average temperature for a given day is 70 degrees, it can be 70 degrees all day, or 50 degrees at 7 AM and 90 degrees at 4 PM. Similarly, some workloads may be so bursty that they generate many I/Os per minute occasionally, but not enough across an entire day to land in the hot 5-10 percent we can afford to put in flash. Moving those 90-degree blocks to flash may have a bigger impact on application performance and cost than moving the data that's the metaphorical equivalent of a day in Honolulu, where it's 74-85 degrees all day, every day.

Then there are the periodic loads. Things like weekly data warehouse cube builds, end-of-month processing, class registration and the like can be predicted. When the tiering process runs on Friday night, or on the last day of the month, I might want to direct it to use the access metadata from last Saturday, when the data warehouse was being loaded, or from last month's end.
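Extending the sketch above, and again as an illustration of what I'd like to see rather than anything a vendor has described, a policy engine could rank blocks by their hottest hour instead of their average, or rank them using only the metadata from an admin-chosen window:

```python
from collections import defaultdict

def peak_iops_per_hour(history, block_id):
    """Score a block by its hottest hour in the window rather than its average,
    so a block that bursts hard for an hour a day isn't hidden behind
    twenty-three idle hours."""
    samples = history.get(block_id, [])
    return max(samples) if samples else 0.0

def rank_blocks_from_window(io_log, window_start, window_end):
    """Rank blocks using only access metadata recorded in a chosen window,
    e.g. last Saturday's cube build or last month's close, instead of
    whatever happened most recently."""
    totals = defaultdict(int)
    for timestamp, block_id, io_count in io_log:
        if window_start <= timestamp < window_end:
            totals[block_id] += io_count
    return sorted(totals, key=totals.get, reverse=True)
```

Either way, the point is the same: the ranking function and the time window are policy decisions, and I'd like to be the one making them.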
Finally, we have to consider block sizes. Keeping access metadata on every addressable block in a system would quickly exhaust the array's CPU and eat enough capacity that users would notice they're getting 5-15 percent less space than they used to. Bigger blocks simplify the tiering bookkeeping, but they also drag cooler data along with the hot data, reducing the flash tier's effectiveness. Vendors haven't talked much about block size in tiering yet, but I imagine most are using 64 KB-4 MB blocks that align with their RAID stripes.
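Some back-of-the-envelope arithmetic shows why the granularity matters. The 100 TB array and 16 bytes of counters per extent below are my assumptions, not vendor numbers:

```python
def metadata_overhead(capacity_bytes, extent_bytes, bytes_per_counter=16):
    """Rough estimate of the memory needed just for access counters
    at a given tiering granularity (assumed figures, not vendor data)."""
    extents = capacity_bytes // extent_bytes
    return extents, extents * bytes_per_counter

# A hypothetical 100 TB array tracked at different granularities
for label, extent in (("512 B", 512), ("4 KB", 4 * 2**10),
                      ("64 KB", 64 * 2**10), ("1 MB", 2**20), ("1 GB", 2**30)):
    count, overhead = metadata_overhead(100 * 2**40, extent)
    print(f"{label:>6} extents: {count:>16,} counters, "
          f"{overhead / 2**20:,.1f} MB of metadata")
```

At sector granularity the counters alone run into terabytes; at 1 GB extents they're trivial, but every promotion hauls a lot of lukewarm data into flash along for the ride.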
Caching is sounding simpler all the time, isn't it?