Wear leveling and access patterns
Raw access patterns
Let’s use the term raw access patterns to refer to data requests as they are received by the device. A block storage device is generally viewed by the attached system as a linear address space in which all the blocks are essentially equivalent. Such is the case with HDDs (although performance varies modestly with physical block location, which some systems exploit). Applications will use the address space any way they deem fit, which generally results in non-uniform access patterns.
For solid-state devices, we will concentrate on the write behavior as it has the greatest impact on reliability. Some data locations will be written much more frequently than others. They might contain meta-data such as access time stamps, for example. Other locations will be written very rarely. We can plot such access patterns as a histogram.
Figure 3 shows an example raw-access histogram. At a given time, most of the blocks might have moderate P-E cycle counts. Here, most blocks have fewer than 100 P-E cycles. If we assume the device has a specified P-E cycle limit of 5,000, we can see there might already be some blocks near this limit.
As the usage increases, some blocks rapidly approach the P-E cycle limit. This can be seen in Figure 4.
In Figure 4, the red curve shows how the histogram might look at a later time after the device has undergone further use. The distribution will likely shift to the right and broaden. We can easily see that some blocks have exceeded the 5,000 P-E cycle limit, even though most blocks are still under 1,000 P-E cycles. Thus, if direct raw access patterns were allowed, a flash device would wear out very rapidly.
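To make the shape of such a histogram concrete, here is a minimal sketch of a skewed raw write workload. The workload is an assumption made purely for illustration (1,000 blocks, 500,000 block writes, with 10 "hot" blocks receiving 20% of them), and the function names `raw_pe_counts` and `histogram` are hypothetical helpers, not part of any real tool. Each write is assumed to cost one P-E cycle on the block it targets.

```python
from collections import Counter

PE_CYCLE_LIMIT = 5000  # example limit used in the text


def raw_pe_counts(num_blocks, total_writes, hot_blocks, hot_share_pct):
    """Per-block P-E counts when every write erases and reprograms its own block.

    Assumed workload: the first `hot_blocks` blocks receive `hot_share_pct`
    percent of all writes; the remaining writes are spread over the other blocks.
    """
    hot_writes = total_writes * hot_share_pct // 100
    cold_writes = total_writes - hot_writes
    cold_blocks = num_blocks - hot_blocks
    counts = [0] * num_blocks
    for i in range(hot_writes):
        counts[i % hot_blocks] += 1
    for i in range(cold_writes):
        counts[hot_blocks + i % cold_blocks] += 1
    return counts


def histogram(counts, bin_width):
    """Map the lower edge of each P-E bin to the number of blocks in that bin."""
    bins = Counter((c // bin_width) * bin_width for c in counts)
    return dict(sorted(bins.items()))


counts = raw_pe_counts(num_blocks=1000, total_writes=500_000,
                       hot_blocks=10, hot_share_pct=20)
print(histogram(counts, bin_width=1000))
print("blocks over the limit:", sum(c > PE_CYCLE_LIMIT for c in counts))
```

Under these assumed numbers the average block has seen only 500 P-E cycles, yet the ten hot blocks sit at 10,000, twice the limit. This is the long high-cycle tail that the histograms describe.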
There is also a second effect which acts as a multiplier, due to the erase block size being substantially larger than the typical system IO size. Most erase blocks in MLC flash are about 1 MB in size, while 4 kB is a common size for a write operation in an enterprise storage system. Thus, if the system were to write 4 kB directly to the flash, a full 1 MB would need to be erased each time. Since there is likely other data in the erase block, this data would need to be copied and re-written after the erase. Thus, we’d have a P-E cycle for each 4 kB operation, affecting an entire 1 MB, which would rapidly wear out the device. Using the 5,000 P-E cycle example, no sector on the device could be written more than 5,000 times. Not very useful.
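The size of this multiplier can be worked out directly from the figures above. The sketch below assumes binary units (1 MB = 1024 × 1024 bytes, 4 kB = 4096 bytes) and the worst case described in the text, where every small host write costs a full P-E cycle on its erase block. The function names are hypothetical.

```python
ERASE_BLOCK_BYTES = 1024 * 1024  # ~1 MB erase block (example from the text)
HOST_WRITE_BYTES = 4 * 1024      # 4 kB host write (example from the text)
PE_CYCLE_LIMIT = 5000            # example P-E cycle limit from the text


def erase_multiplier(erase_block_bytes, host_write_bytes):
    """Bytes erased and reprogrammed per byte the host actually wrote."""
    return erase_block_bytes // host_write_bytes


def host_bytes_per_block_lifetime(host_write_bytes, pe_limit):
    """Host data one erase block absorbs if each small write costs a P-E cycle."""
    return host_write_bytes * pe_limit


multiplier = erase_multiplier(ERASE_BLOCK_BYTES, HOST_WRITE_BYTES)
naive_bytes = host_bytes_per_block_lifetime(HOST_WRITE_BYTES, PE_CYCLE_LIMIT)
ideal_bytes = ERASE_BLOCK_BYTES * PE_CYCLE_LIMIT  # whole block filled per cycle

print("multiplier:", multiplier)
print("host bytes per block, direct 4 kB writes:", naive_bytes)
print("host bytes per block, full-block writes:", ideal_bytes)
```

With these numbers each erase block wears out after absorbing roughly 20 MB of host data, instead of the roughly 5 GB it could take if every P-E cycle filled the whole block: a factor of 256.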
The rapid wear-out of some data locations can be addressed through a technique called wear leveling. Wear leveling uses a translation layer (the flash translation layer, or FTL, in SSD parlance) which converts the logical addresses received from a host system to physical addresses at the device level. The logical-to-physical mapping is managed at the device level, allowing the write load to be spread more evenly across the device blocks.
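The idea of a logical-to-physical mapping can be sketched in a few lines. This is a deliberately simplified illustration, not how any particular FTL works: it assumes block-granularity mapping, a device with more physical blocks than logical ones, and a policy that always writes to the least-worn free block. It ignores garbage collection and the relocation of rarely written data. The class name `SimpleFTL` and its methods are hypothetical.

```python
import heapq


class SimpleFTL:
    """Toy translation layer: redirect each write to the least-worn free block."""

    def __init__(self, num_logical, num_physical):
        if num_physical <= num_logical:
            raise ValueError("need at least one spare physical block")
        self.num_logical = num_logical
        self.erase_counts = [0] * num_physical
        self.l2p = {}  # logical block -> physical block
        # Min-heap of (erase_count, physical_block) for blocks holding no live data.
        self.free = [(0, p) for p in range(num_physical)]
        heapq.heapify(self.free)

    def write(self, logical):
        """Write one logical block and return the physical block that was used."""
        if not 0 <= logical < self.num_logical:
            raise IndexError("logical block out of range")
        _, new_phys = heapq.heappop(self.free)
        self.erase_counts[new_phys] += 1  # count one P-E cycle per write (simplification)
        old_phys = self.l2p.get(logical)
        self.l2p[logical] = new_phys
        if old_phys is not None:
            # The previous copy is now stale, so its block returns to the free pool.
            heapq.heappush(self.free, (self.erase_counts[old_phys], old_phys))
        return new_phys


ftl = SimpleFTL(num_logical=4, num_physical=8)
for _ in range(16):
    ftl.write(0)  # the host hammers a single logical address
print(ftl.erase_counts)
```

Although the host writes the same logical address 16 times, the wear is shared across all eight physical blocks in this toy model. Written in place, a single block would have taken all 16 cycles.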
The goal of wear leveling is to maximize the available program-erase cycles for a device and mitigate the effects of the large erase block. There are many approaches, and we won’t go into the details. We are not looking to examine particular algorithms; rather we hope to understand how wear leveling impacts reliability.
Figure 5 shows an example P-E histogram for a wear-leveled device. Ideally, wear leveling will make the width of the distribution substantially narrower than the raw distribution. More importantly, it is designed to significantly limit the tail at high cycle count. This prevents blocks from reaching the P-E cycle limits at low total device usage.
Ideally, the distribution moves to the right as the cycle count increases, but its width does not grow; it stays a fixed percentage of the total P-E cycles. Thus the useful life of the device can be significantly increased, as the total number of P-E cycles is larger before any block reaches the P-E cycle limit. This is illustrated in Figure 6.
Unlike raw access patterns, wear leveling is designed to keep the distribution shape more constant as the device is used. Thus, the distribution mostly shifts to higher P-E cycle count without an increase in the width, and more particularly without an increase in the high cycle count tail.
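A rough back-of-the-envelope comparison shows why a narrow distribution extends device life. The sketch below rests on two simplifying assumptions that are not taken from any measured device: in the raw case the hottest block receives one out of every `hottest_one_in` block writes, and in the wear-leveled case the most-worn block stays a fixed `spread_pct` percent above the mean cycle count. Both function names are hypothetical.

```python
def raw_lifetime_writes(pe_limit, hottest_one_in):
    """Total block writes before the hottest block reaches the P-E limit.

    Assumes the hottest block takes 1 of every `hottest_one_in` writes.
    """
    return pe_limit * hottest_one_in


def leveled_lifetime_writes(pe_limit, num_blocks, spread_pct):
    """Total block writes before the most-worn block reaches the P-E limit.

    Assumes the most-worn block is always `spread_pct` percent above the mean.
    """
    return pe_limit * num_blocks * 100 // (100 + spread_pct)


raw = raw_lifetime_writes(pe_limit=5000, hottest_one_in=50)
leveled = leveled_lifetime_writes(pe_limit=5000, num_blocks=1000, spread_pct=10)
print("raw:", raw)
print("wear-leveled:", leveled)
print("gain: about %.0fx" % (leveled / raw))
```

With these assumed values the wear-leveled device sustains roughly 18 times as many block writes before any block hits the 5,000-cycle limit. The exact factor depends entirely on the assumed workload skew and distribution width.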