Series SSD: 6. The test methodology

Operation

The test suite was designed primarily to measure the bit error rate surface of SSDs, so I try to measure as many parameters as possible. As mentioned above, I want to measure as closely as possible to the manner in which a device will be used or tested in a manufacturing environment. Thus, we perform IO and age the device at a constant temperature, which is representative of how an enterprise drive will be used.

The primary parameters for the surface are the P-E cycle count (endurance), the data age (retention), and temperature. Secondary parameters include read disturb and dwell time. I haven’t yet designed a test I like for measuring write disturb at the SSD level.

Initialization files are used to define the test setup. Important parameters include:

  • read_interval – the number of write IOs between data reads. The test doesn’t have to perform a read after every write operation, which allows for faster tests. Typical value is 10.
  • age_limit_h – the target test duration in hours. For cache applications, 360 hours is typical.
  • cdwell_ms – the erase to write dwell time in ms when not reading data (just cycling). Typical value is 250.
  • wdwell_ms – the erase to write dwell time in ms when reading the data. This value allows the erase to “settle” prior to data being read. It is distinct from cdwell_ms because no measurement is made when the former is used. Typical value is 360, unless performing a dwell test.
  • rdwell_ms – the write-to-read dwell time in ms. Typical value is 500ms.
  • eccreadlim – the number of error bits in a sector beyond which we also perform a normal ECC read to check for data corruption.
  • cooldown – takes two parameters, an age in hours and a duration in hours. When a cool down age is reached, the stripe is quiesced and no IO is performed for the specified duration. This helps isolate read disturb effects. A test can have multiple cool downs; for cache application tests, there are typically two: (72,24) and (168,24).
  • stripe – a stripe test definition has 4 parameters: the lba, the cycle target, the aging read interval in seconds, and the pattern type flag. The typical aging read interval is 360 seconds, or 10 reads/hour. Multiple stripes are defined in the test.
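As a concrete illustration, here is a sketch of what such an initialization file might look like, assuming a simple key/value format. The parameter names come from the list above, but the layout, the stripe syntax, and the specific lba, eccreadlim, and cycle-target values are my own invention:

```
read_interval = 10
age_limit_h   = 360
cdwell_ms     = 250
wdwell_ms     = 360
rdwell_ms     = 500
eccreadlim    = 8            # hypothetical threshold
cooldown      = 72,24        # quiesce at 72 h of age, for 24 h
cooldown      = 168,24       # quiesce again at 168 h, for 24 h
# stripe = lba, cycle target, aging read interval (s), pattern flag
stripe        = 0x000000, 1000, 360, 0
stripe        = 0x040000, 10000, 360, 0
```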

The test for each stripe is broken into two phases: cycling and aging. When cycling, the stripe of erase blocks is erased, written in page-sequential order and optionally read in page-sequential order. This process repeats at the rate specified by the various dwell settings or as limited by the device. Once a stripe is cycled, it transitions to aging and is never written again. In aging, the data is read in page-sequential order for the entire set of erase blocks as specified by the test parameters. Once a stripe has reached a cooldown interval, it is not accessed until the specified duration has expired.
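The cycling phase can be sketched in a few lines of Python. This is a minimal sketch, not the actual test program: the device interface (erase/write/read methods) and all names are my assumptions, but the dwell and read-interval logic follows the parameter descriptions above.

```python
import time

def run_stripe_cycling(dev, stripe, cfg, sleep=time.sleep):
    """Sketch of one stripe's cycling phase (hypothetical interface)."""
    for cycle in range(1, stripe["cycle_target"] + 1):
        dev.erase(stripe["lba"])
        # Only every read_interval-th cycle is measured (read back).
        measuring = (cycle % cfg["read_interval"] == 0)
        # wdwell lets the erase "settle" before a measured read;
        # cdwell is the shorter dwell used when just cycling.
        sleep((cfg["wdwell_ms"] if measuring else cfg["cdwell_ms"]) / 1000.0)
        dev.write_sequential(stripe["lba"])
        if measuring:
            sleep(cfg["rdwell_ms"] / 1000.0)  # write-to-read dwell
            dev.read_sequential(stripe["lba"])
    # After cycling, the stripe transitions to aging and is never
    # written again (aging phase not shown).
```

With read_interval = 10, a 1000-cycle stripe performs 1000 erase/write pairs but only 100 measured reads.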

Multiple stripes are processed in the order specified in the initialization file.

Data Logged

Well, there’s just too much of it! A typical surface test generates between 20 and 100 GB of data, depending on the conditions. Running multiple devices at multiple temperatures runs the size up quickly; I currently have about 3 TB of raw data.

For surface analysis, the primary data is captured in the cycle log. The cycle log captures error counts on all read operations during a stripe test; there is one file for each stripe. The primary data captured includes the read time in real time, the cycle count of the read, the age of the data, the number of reads since the last write of the stripe, the temperature, the number of sectors successfully read, and the total error count for the stripe. Additionally, some summary data is included: a sector error histogram providing the number of sectors with a given number of bit errors, and a spatial histogram giving the number of bit errors by sector location. Many of the devices have a stripe size of 16 erase blocks, 256 pages per erase block and 8 sectors per page, for a total of 32k sectors in a stripe. The spatial histogram bins on 128-sector (16-page) boundaries.
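The stripe geometry and the spatial histogram binning can be checked with a few lines of arithmetic. In this sketch the function name and the per-sector input format are my own; the dimensions are those given above:

```python
# Stripe geometry from the text above.
BLOCKS_PER_STRIPE = 16
PAGES_PER_BLOCK   = 256
SECTORS_PER_PAGE  = 8
SECTORS_PER_BIN   = 128   # 16 pages per spatial-histogram bin

SECTORS_PER_STRIPE = BLOCKS_PER_STRIPE * PAGES_PER_BLOCK * SECTORS_PER_PAGE  # 32768
N_BINS = SECTORS_PER_STRIPE // SECTORS_PER_BIN                               # 256

def spatial_histogram(errors_by_sector):
    """Bin per-sector bit-error counts on 128-sector boundaries.

    errors_by_sector maps a sector index (0..32767) to its bit error count.
    """
    bins = [0] * N_BINS
    for sector, nerr in errors_by_sector.items():
        bins[sector // SECTORS_PER_BIN] += nerr
    return bins
```

So a stripe yields 256 spatial-histogram bins, each summarizing 16 pages.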

A second log type is the sector log. This keeps the bit error counts directly for each sector on each read. The third log type is the pattern log. An early implementation of the pattern log kept both the written data pattern and the read data pattern, so that the bit errors could be analyzed without knowledge of the test program. While nice in principle, it turned out to be too capacity intensive; in fact, the logs got so large that they became the performance limiter. The current version instead stores a list of errors, each as a signed integer. The absolute value is the bit location, and the sign represents the type of error (0 to 1 or 1 to 0). While this gives the errors, it doesn’t record the entire sector’s data pattern; to get that, we need to go back to the test parameters to recreate the contents. The sector and pattern log files are useful for creating the heat maps.
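The signed-integer encoding might look like the following sketch. The sign convention and the 1-based offset (needed so that bit location 0 still has a usable sign) are my assumptions; the post doesn’t specify either detail:

```python
def encode_error(bit_location, zero_to_one):
    """Encode one bit error as a signed integer.

    |value| is the bit location (stored 1-based here, an assumption),
    and the sign encodes the error direction: positive for a 0-to-1
    flip, negative for 1-to-0 (convention assumed).
    """
    v = bit_location + 1
    return v if zero_to_one else -v

def decode_error(v):
    """Recover (bit_location, zero_to_one) from the signed encoding."""
    return abs(v) - 1, v > 0
```

This records where each error occurred and in which direction, but not the surrounding data pattern, which has to be regenerated from the test parameters.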

The Cool Down

The cool down was devised as a means to separate read disturb from aging. Unfortunately, we must read the data to measure the error rate (unless someone has an entangled quantum SSD…). If we read at a constant rate, there would be no way to deconvolve these two effects. The cool down provides us with a means to measure aging only, without read disturb. Thus, we are able to separate the effects in a single test. Cool downs will be visible in the aging charts as gaps in the data.
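The cool-down schedule amounts to a simple check against the stripe’s age. A sketch, where the function name and hour-based interface are my assumptions:

```python
def in_cooldown(age_h, cooldowns):
    """Return True if the stripe should be quiesced (no IO) at age age_h.

    cooldowns is a list of (start_age_h, duration_h) pairs, e.g. the
    cache-application defaults [(72, 24), (168, 24)] mentioned above.
    """
    return any(start <= age_h < start + dur for start, dur in cooldowns)
```

Any read scheduler that skips reads while in_cooldown() is true produces exactly the gaps described: error-rate samples with known data age but zero intervening reads.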
