Network Computing is part of the Informa Tech Division of Informa PLC

SSD Failure Rates

Ever since SSDs began their slow march toward mainstream storage,
there has been a constant chorus of concern about SSD failure rates
and questions about whether the technology is ready for the enterprise.
Most of the concern centers on how many writes an SSD can sustain.
Vendors of enterprise SSDs have gone to great lengths to make sure
that the SSDs used in today's data centers do not wear out prematurely
under write load. With improvements in NAND quality and in the
capabilities of the Flash controller, Flash SSDs from the right vendor
should outlive most mechanical drives.

Although some are deserved, many of the concerns about SSD failure rates
seem to come from personal experience with consumer-grade Flash, such as
the CompactFlash cards found in cameras or inexpensive thumb drives, which
have had a sketchy past. Interestingly, if your personal experience was
with a mid-1990s RAM-based SSD, you would have the opposite reaction.
RAM-based SSDs have always proven to be extremely reliable. Flash is
different. Because it stores data persistently, a memory cell that
already holds data has to be erased before it can be written again, and
each of those cycles wears the cell a little. The primary concern around
Flash-based SSDs is therefore write endurance. In other words, how many
write cycles can they handle? This is typically addressed in two ways:
first by the type of Flash used and second by the intelligence of the
controller. There are two types of Flash technology: single-level cell
(SLC) and multi-level cell (MLC). SLC stores one bit per cell (two
states), whereas most MLC drives store two bits per cell (four states).
As a result, MLC offers higher capacity but is more prone to failure,
while SLC is more expensive but more reliable and has a much higher
write endurance. Most enterprise drives are SLC-based for this reason.
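
To make the SLC/MLC distinction concrete, here is a minimal Python sketch of how bits per cell relate to the number of charge states a cell must distinguish. It assumes the common definitions of SLC as one bit per cell and MLC as two bits per cell; the function names and the cell count are hypothetical and chosen only for illustration.

```python
def states_per_cell(bits_per_cell: int) -> int:
    """Distinct charge levels a cell must hold to encode the given bits."""
    return 2 ** bits_per_cell


def capacity_bits(cell_count: int, bits_per_cell: int) -> int:
    """Raw capacity of an array of identical cells, in bits."""
    return cell_count * bits_per_cell


if __name__ == "__main__":
    cells = 1_000_000  # hypothetical cell count, for illustration only
    for name, bits in (("SLC", 1), ("MLC", 2)):
        print(f"{name}: {states_per_cell(bits)} states per cell, "
              f"{capacity_bits(cells, bits):,} bits in {cells:,} cells")
```

The same number of cells yields twice the capacity with MLC, but each cell has to tell four charge levels apart instead of two, which is one way to see why MLC trades reliability for density.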

All Flash drives have a controller. The controller handles how data is
written to the Flash drive, and it is one of the bigger differentiators
of SSD quality. One of the controller's primary functions is to make
sure that data is written evenly to each cell on the drive, a technique
known as wear leveling. This keeps one group of cells from wearing out
before the other cells on the drive. The controller also manages spare
cells on the drive, making those available as other cells wear out. The
controller may even group
the writes so that it can write larger blocks of data to the drive, which
further helps with both performance and write longevity. Flash
controller technology has become so advanced that a few vendors now
offer MLC-based SSDs to the enterprise. They can do this by leveraging
all of the controller techniques described above. While I'm not
sure that MLC will replace SLC in the enterprise, we may see a tiering of
SSD, just like we have today in hard drives.
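
The even-write and spare-cell behavior described above can be sketched as a toy model. This is not how any particular vendor's controller works; the class name, the block-level granularity, and the numbers are all assumptions made for illustration. Every write goes to the least-worn block, and a block that reaches its erase limit is replaced from a pool of spares.

```python
class ToyWearLeveler:
    """Toy model: each write goes to the least-worn usable block."""

    def __init__(self, active_blocks: int, spare_blocks: int, max_erases: int):
        self.max_erases = max_erases              # assumed per-block erase limit
        self.erase_counts = [0] * active_blocks   # wear of each in-service block
        self.spares_left = spare_blocks           # unused replacement blocks

    def write_block(self) -> None:
        if not self.erase_counts:
            raise RuntimeError("no usable blocks left")
        # Wear leveling: pick the block with the fewest erases so far.
        slot = min(range(len(self.erase_counts)),
                   key=lambda i: self.erase_counts[i])
        self.erase_counts[slot] += 1
        if self.erase_counts[slot] >= self.max_erases:
            if self.spares_left > 0:
                # Swap in a fresh spare for the worn-out block.
                self.spares_left -= 1
                self.erase_counts[slot] = 0
            else:
                # No spares remain, so the block is simply retired.
                self.erase_counts.pop(slot)


def writes_until_exhausted(leveler: ToyWearLeveler) -> int:
    """Count how many block writes succeed before the drive runs out."""
    writes = 0
    while True:
        try:
            leveler.write_block()
        except RuntimeError:
            return writes
        writes += 1


if __name__ == "__main__":
    drive = ToyWearLeveler(active_blocks=4, spare_blocks=2, max_erases=100)
    print(writes_until_exhausted(drive))
```

With these toy numbers, all six physical blocks absorb their full 100 erases, for 600 writes in total. A controller that kept rewriting a single hot block would exhaust that block after only 100.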

While there are other factors in delivering reliable SSD technology, the
controller intelligence that the SSD supplier uses may be the key
differentiator in reducing SSD failure rates. As we discuss in our
article "Pay Attention to Flash Controllers when Comparing SSD Systems,"
given the capabilities of today's controller technology, it is reasonable
to expect seven to ten years of life out of today's SSDs, which is
really beyond the service life of any enterprise primary storage device.
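
A lifetime figure like this can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is a rough model, not the method used in the article cited above: it assumes ideal wear leveling, and the capacity, erase-cycle rating, daily write volume, and write amplification factor in the example are hypothetical inputs rather than measured values.

```python
def estimated_life_years(capacity_gb: float,
                         erase_cycles_per_cell: float,
                         host_writes_gb_per_day: float,
                         write_amplification: float = 1.0) -> float:
    """Rough endurance estimate, assuming writes are spread perfectly evenly.

    write_amplification is the ratio of data physically written to Flash
    to data written by the host (1.0 means no extra internal writes).
    """
    # Total host data the drive can absorb before cells reach their rating.
    total_host_writes_gb = capacity_gb * erase_cycles_per_cell / write_amplification
    return total_host_writes_gb / host_writes_gb_per_day / 365


if __name__ == "__main__":
    # Hypothetical drive: 100 GB, cells rated for 100,000 erase cycles,
    # 1,000 GB of host writes per day, write amplification of 2.
    years = estimated_life_years(100, 100_000, 1_000, write_amplification=2.0)
    print(f"{years:.1f} years")
```

Plugging in a lower erase-cycle rating or a heavier write load shortens the estimate proportionally, which is why both the type of Flash and the controller's efficiency matter to the final number.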