Flash Memory Summit 2015

I have posted my presentations from the 2015 Flash Memory Summit in the Library.

In my Touch Rate talk I showed how Touch Rate can be applied to hybrid storage systems. As mentioned in the talk, Tom Coughlin and I will be working on a 2nd Touch Rate whitepaper covering this new material. I will post the new paper when it’s complete. This will include an updated version of the spreadsheet supporting hybrid (cached or tiered) systems.

Flash Temperature Testing and Modeling

Temperature Testing

This is an expanded version of the temperature modeling section from my Flash Memory Summit 2014 Tutorial T1.

An accurate temperature model is vital for flash devices, as most vendors rely on accelerated temperature testing to verify retention capabilities. I tested flash at the SSD level, as this is how the devices are integrated into storage systems. All tests were performed using devices supporting a host-managed interface.

Flash Memory Summit 2014 Spreadsheet

I am posting the spreadsheet that accompanies the Flash Memory Summit 2014 Seminar A chart deck. This is a LibreOffice 4 spreadsheet. Note that it includes a macro that extends the range of the cumulative binomial distribution. Interestingly, LibreOffice suffers from the same lack of precision that Excel does. The macro is provided as-is, and is not guaranteed to work in all cases. [Translation: I haven't had time to debug it, and the error handling in LibreOffice Basic is abysmal.]
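For anyone who hits the same precision wall, the usual trick is to work with the logarithms of the individual binomial terms rather than the probabilities themselves. The sketch below is not the spreadsheet macro, just a minimal Python illustration of the idea; the example numbers are made up.

```python
import math

def log_binom_coeff(n, k):
    """Natural log of C(n, k) via log-gamma; stable for large n."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def log10_binom_tail(n, k, p):
    """log10 of P(X >= k) for X ~ Binomial(n, p), for the regime n*p << k.

    Each term is handled in log space, so tail probabilities far below the
    ~1e-308 floor of double precision (where spreadsheet BINOMDIST-style
    functions give up) still come back as a finite log10 value.
    """
    if k <= 0:
        return 0.0
    # In the n*p << k regime the largest term is the first one (i = k).
    m = log_binom_coeff(n, k) + k * math.log(p) + (n - k) * math.log1p(-p)
    total, i = 1.0, k + 1
    while i <= n:
        t = log_binom_coeff(n, i) + i * math.log(p) + (n - i) * math.log1p(-p)
        if t - m < -60:            # remaining terms are negligible
            break
        total += math.exp(t - m)
        i += 1
    return (m + math.log(total)) / math.log(10)

# Illustrative only: 3 or more failures among 1000 components, each with a
# 1e-12 failure probability, comes out around 10^-27.8.
print(log10_binom_tail(1000, 3, 1e-12))
```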

The spreadsheet shows how to compute reliability for flash-based storage systems, and compares various RAID architectures when using a DNR ECC approach (described in the chart deck).
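As a toy version of that kind of comparison (illustrative numbers only; the spreadsheet's model also accounts for the DNR ECC behavior), the group-level failure probability is just a binomial tail: the chance that more drives fail within the exposure window than the code tolerates.

```python
import math

def p_data_loss(n_drives, p_fail, tolerated):
    """Probability that more than `tolerated` of n_drives fail within the
    exposure window, assuming independent failures."""
    return sum(math.comb(n_drives, k) * p_fail**k * (1 - p_fail)**(n_drives - k)
               for k in range(tolerated + 1, n_drives + 1))

# Hypothetical 10-drive group, 1e-3 chance a drive fails during the window.
for name, tolerated in [("RAID 5 (1 erasure)", 1),
                        ("RAID 6 (2 erasures)", 2),
                        ("triple parity (3 erasures)", 3)]:
    print(f"{name}: P(data loss) ~ {p_data_loss(10, 1e-3, tolerated):.1e}")
```

For very large sector counts and very small error probabilities this direct sum underflows, which is where the log-space approach above (and the spreadsheet macro) comes in.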

Note: I have seen a lot of talk in the industry about how new systems use erasure codes instead of RAID. Technically, all RAID designs use erasure codes, even mirroring (that's called a replication code). I use the term RAID loosely to describe a system utilizing an erasure code to protect against data loss.

The PMDS codes described here are erasure codes designed to protect against simultaneous device loss and sector loss with high data efficiency. You can read about them here, here and here.
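To give a rough sense of the efficiency argument (my own toy numbers and my reading of the construction; see the papers for the actual definitions): a PMDS-style layout spends one parity strip per row plus a handful of global parities, so its overhead sits between single- and double-parity RAID while still covering the combined drive-plus-sector failure case.

```python
def storage_efficiency(m_rows, n_drives, row_parities, global_parities):
    """Fraction of an m_rows x n_drives array left for user data when each
    row carries `row_parities` parity strips plus `global_parities` extra
    parities shared across the whole array (PMDS-style accounting)."""
    total = m_rows * n_drives
    parity = m_rows * row_parities + global_parities
    return (total - parity) / total

# Hypothetical 16-row stripe across 10 drives:
print(storage_efficiency(16, 10, 1, 0))   # 0.90  RAID 5-style: 1 erasure per row
print(storage_efficiency(16, 10, 1, 2))   # ~0.89 PMDS-style: 1 per row + 2 anywhere
print(storage_efficiency(16, 10, 2, 0))   # 0.80  RAID 6-style: 2 erasures per row
```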

You can download the RAID Reliability Spreadsheet.

The spreadsheet is provided under GPLv2.

Flash Memory Summit 2014

FMS 2014

I will be at the 2014 Flash Memory Summit presenting in two tutorials.

The first is Pre-Conference Seminar A, Making Error Correcting Codes Work for Flash Memory.

The second is Tutorial T-1, Measuring Reliability in SSD Storage Systems.

I think you'll find them both enlightening. In the first, I will be presenting on the benefits of optimizing error correction at the flash device level in concert with a RAID system. In the second, I will cover material not yet presented here on SSD reliability measurements and present an empirical temperature acceleration model for flash, derived from device-level measurements. Don't miss it!

Series SSD: 7. Bit error rate – cycling data (endurance)

Program-Erase Cycling data (endurance)

Author's note: I am working diligently to get all the data together into a consumable format. Since this is turning out to be a time-consuming process given the volume of data, I will be updating this post as I get the data ready. Once the data is posted, I'll get back to the analysis.

Update March 2013

I finally got all the cycling data for the 3xnm class devices collated, so you can preview it while I write up the analysis. As you can see, there is quite a bit of data. I have posted the data in high-res image galleries so you can see the details.

The Cycling Test

As described in chapter 6, the cycling tests are designed to rapidly reach a given program-erase cycle count, at which point data aging tests are performed. Unless noted, the erase-to-write dwell time is 250 ms if the data isn't going to be read, and 360 ms if it will be read. The write-to-read dwell time is 500 ms.
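For concreteness, a single cycling run has roughly the shape of the sketch below. The device access calls are placeholders for the host-managed commands described in chapter 6, and the read sampling interval is mine; only the dwell times come from the description above.

```python
import time

# Placeholders for the host-managed (direct erase/write/raw-read) commands;
# the real tests issue these through the SSD's special microcode interface.
def erase_block(block): ...
def write_block(block, data): ...
def raw_read_block(block): ...       # raw read, no ECC applied

def cycle_block(block, data, target_cycles, read_every=100):
    """Drive one block to target_cycles P-E cycles, sampling the raw bit
    error rate every read_every cycles (the sampling interval is illustrative)."""
    results = []                      # (cycle count, raw BER) pairs
    n_bits = len(data) * 8
    for cycle in range(1, target_cycles + 1):
        will_read = (cycle % read_every == 0)
        erase_block(block)
        # Erase-to-write dwell: 250 ms normally, 360 ms when a read follows.
        time.sleep(0.360 if will_read else 0.250)
        write_block(block, data)
        if will_read:
            time.sleep(0.500)         # write-to-read dwell
            readback = raw_read_block(block)
            errors = sum(bin(a ^ b).count("1") for a, b in zip(data, readback))
            results.append((cycle, errors / n_bits))
    return results
```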

The data is presented as raw bit error rate vs. program-erase cycle count.

70C raw cycling data

The following gallery shows the raw cycling data at 70C for a set of 3xnm devices. Cycle measurements here were taken out to 14k cycles. These devices have a specified 3k program-erase cycle limit, a 70C maximum operating temperature, and 1-year retention.

The astute observer may have noticed some non-uniformity in the bit error noise characteristics. Namely, there are spikes in the bit error rate. I have decided to call these error fountains. I plan to devote a chapter to this phenomenon.

60C raw cycling data

The following gallery shows the raw cycling data at 60C for a set of 3xnm devices. Cycle measurements here were taken out to 20k cycles. These devices have a specified 3k program-erase cycle limit, a 70C maximum operating temperature, and 1-year retention.

I have run some of these tests to higher cycling counts than the 70C data. Again, the bit error noise characteristics aren't always uniform. Some of the test parameters were adjusted in some of these tests. I'll point these out where they show changes in behavior.

40C raw cycling data

The following gallery shows the raw cycling data at 40C for a set of 3xnm devices. Cycle measurements here were taken out to 20k cycles. These devices have a specified 3k program-erase cycle limit, a 70C maximum operating temperature, and 1-year retention.

I have run some of these tests to higher cycling counts than the 70C data. Again, the bit error noise characteristics aren’t always uniform. Some of the test parameters were adjusted in some of these tests. I’ll point these out where they show changes in behavior.

30C raw cycling data

The following gallery shows the raw cycling data at 30C (well, 28C if you want to get technical) for a set of 3xnm devices. Cycle measurements here were taken out to 14k cycles. These devices have a specified 3k program-erase cycle limit, a 70C maximum operating temperature, and 1-year retention.

Again, the bit error noise characteristics aren’t always uniform. Some of the test parameters were adjusted in some of these tests. I’ll point these out where they show changes in behavior.

Series SSD: 6. The test methodology

A philosophical point

When testing devices, it is usually best to test beyond the specifications. This is especially important when architecting systems, as it allows for a better understanding of behavior in the region of the failure limits. I also find it better to interpolate behavior than to extrapolate it. Finally, we may learn something interesting!

The test methodology

I have measured the error rate behavior for a set of MLC NAND SSDs with real-time aging. I have looked at 5xnm, 4xnm and 3xnm class consumer-grade devices. I haven’t had the chance to measure 2xnm devices yet. I chose consumer-grade flash as this is the dominant flash on the market, and one of the goals is to learn how to use such devices in enterprise systems.

I have created a test suite to measure the error rate surface in multiple dimensions. It can measure the bit error rate as a function of the data age (retention), the P-E cycle count (endurance), the number of reads since the last write (read disturb), and temperature. I'm not looking at other effects such as write disturb here. Off-the-shelf SSDs with a special microcode load are used. The special features include turning off wear leveling, providing direct read, write and erase operations, and raw (no ECC) read. The latter allows a bit-for-bit compare to be used to determine the bit error rate. The SSD controller still determines how the write, read and erase operations are performed. All the SSDs were purchased through retail channels, and thus should be representative of what is available.
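The bit-for-bit compare itself is just bookkeeping: count the mismatched bits between the written pattern and the raw readback, then divide by the number of bits compared. A minimal sketch (the device I/O is whatever the host-managed interface provides, so only the comparison is shown):

```python
def raw_bit_error_rate(written: bytes, readback: bytes) -> float:
    """Raw bit error rate from a bit-for-bit compare of the written data
    against a raw (no-ECC) read of the same sectors."""
    if len(written) != len(readback):
        raise ValueError("buffers must be the same length")
    bit_errors = sum(bin(a ^ b).count("1") for a, b in zip(written, readback))
    return bit_errors / (len(written) * 8)

# Example: one flipped bit in a 4 KiB buffer gives an RBER of about 3e-5.
written = bytes(4096)
readback = bytearray(written)
readback[100] ^= 0x08            # deliberately corrupt a single bit
print(raw_bit_error_rate(written, bytes(readback)))
```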

In the test procedure, the cycling and aging are all performed at the same temperature, as is likely to be the case in an enterprise application. This is distinct from many other test methods, where the device is cycled at room temperature and then the retention is measured at high temperature. For example, the JEDEC JESD218 specifies this type of test. I don't feel that this approach adequately reflects the operational environment. Enterprise SSDs are typically installed in rack mount systems in a data center, where the temperature is controlled. I think the test method should replicate the operating conditions as closely as practicable. A further concern with JESD218 is the assumption that accelerated testing is valid, and that the Arrhenius model applies with a 1.1eV activation energy. You can tell from the surface equation of chapter 5 that the observed temperature dependence is not Arrhenius. I will cover this in detail in a later chapter.
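For reference, the acceleration factor implied by that assumption comes straight from the Arrhenius equation, and with a 1.1eV activation energy the numbers get large very quickly. The temperatures below are only examples, not values taken from JESD218.

```python
import math

K_B_EV = 8.617e-5    # Boltzmann constant in eV/K

def arrhenius_af(t_use_c: float, t_stress_c: float, ea_ev: float = 1.1) -> float:
    """Arrhenius acceleration factor between a use temperature and a stress
    (bake) temperature, both in Celsius, for activation energy ea_ev."""
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return math.exp((ea_ev / K_B_EV) * (1.0 / t_use - 1.0 / t_stress))

# Example: an 85C retention bake standing in for 40C use conditions.
print(arrhenius_af(40.0, 85.0))   # roughly 170x at 1.1 eV
```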

In enterprise applications, SSDs aren't usually "rode hard and put up wet". That is to say, they aren't used heavily and then left idle. This might be more true of a light IT workload, such as a laptop, where a device might be used for a short period, with long idle periods. However, if the laptop was used hard and then shut down, the SSD might be "put up wet" in that it has no time to perform post-activity cleanup once the power is removed. It might also heat up for a while as the cooling fans are turned off.

When I presented some of this data at the Non-Volatile Memories Workshop in March 2012, a couple of people from flash vendors offered the opinion that my test methodology was flawed. I won't identify them or their companies, but they said measuring non-recoverable read errors (sector failures) was not a valid method for determining SSD reliability. It is clear to me that measuring sector failures is a valid measure of system reliability; NRRE testing of storage media reliability in systems has been used for decades (hard disk, tape, etc.). (If a retention failure doesn't result in a sector loss, then what exactly is the error?) I bring this anecdote up because I think it will help my readers understand why there may be large differences between how the SSDs behave and how they are specified. If the manufacturers don't believe measuring non-recoverable read errors is valid, then perhaps they themselves don't do it. This by itself could explain how the measured behavior deviates from the specification.

JEDEC Solid State Technology Association, "Solid-State Drive (SSD) Requirements and Endurance Test Method", JESD218, September 2010.

Series SSD: 5. The error rate surface for MLC NAND flash

Update July 2014: modified bit-error equation

I have modified the MLC equation based on examination of further data.

Expected behavior of the MLC error rate surface

NAND flash has rather complex bit error rate behavior compared with magnetic recording. In the case of hard disks, the bit error rate behavior tends to be a constant, without strong dependence on other factors. For a given bit, the error rate doesn't depend on the number of write cycles or the age of the data. Unfortunately, the same can't be said for flash. Flash has a complex multi-dimensional error rate surface.
