Sunday, March 23, 2014

SSD Flash Controller




Every SSD includes a controller, i.e. an embedded processor that executes firmware-level code. The controller is one of the most important factors in SSD performance.

Functions:
  1. Error correction (ECC)
  2. Wear leveling
  3. Bad block mapping
  4. Read scrubbing and read disturb management
  5. Read and write caching
  6. Garbage collection
  7. Encryption



SandForce Controller

SandForce initially released the SF-1000 family of SSD Processors and split them into enterprise and client computing applications. The SF-1500 was the enterprise product and the SF-1200 the client focused product.

In October 2010, SandForce introduced its second-generation SSD controllers, the SF-2000 family, focused on enterprise applications. Enhancements included SATA 3.0 (6 Gbit/s), faster speeds, security, and data protection features. The client version of this second-generation line was introduced in February 2011 with most of the same enhancements seen in the SF-2500.

Launched in November 2013, the SF 3700 family of controllers supports triple-level cell flash for high-capacity drives and NVM Express for improved performance at the high end. Sample engineering boards with the PCIe x4 (gen 2) model of this controller reached 1,800 MB/sec sequential read/write speeds and 150K/80K random IOPS. A Kingston HyperX "prosumer" product using this controller was showcased at the Consumer Electronics Show 2014 and promised similar performance. Mushkin also showcased products using the SF 3700 series at CES, highlighting their M.2 Helix series at up to 480GB (512GiB) and up to 2TB in the 2.5 inch format.
The SF 3700 family consists of the following announced models:
SF3719 — SATA 6Gbit/s + x2 PCIe; "entry level" product with identical connectivity but announced to have fewer firmware features than the "mainstream" SF3729; precise differences in features not yet disclosed
SF3729 — SATA 6Gbit/s + x2 PCIe
SF3739 — x4 PCIe (gen 2); support for optional battery or supercapacitor “full power fail” protection
SF3759 — “full enterprise feature set” (no further details released yet)


All these models are actually made of the same die (produced in a 40 nm process), an area of which goes unused in the lower-end products. The RAISE technology in the SF 3700 series was upgraded from protecting against a single page or block failure (in the previous series) to "multiple pages and blocks or up to a full die" with the so-called RAISE level 2. Additionally, the new chips reserve less than a full die for redundancy (so-called "fractional RAISE").

http://www.anandtech.com/show/7520/lsi-announces-sandforce-sf3700-sata-and-pcie-in-one-silicon



Bad Block Management


With use, the memory cells that form the blocks of the NAND Flash memory array can wear out. Most NAND Flash devices also contain some initial bad blocks within the memory array. These blocks are typically marked as bad by the manufacturer, indicating that they should not be used in any system.

Bad Blocks are blocks that contain one or more invalid bits whose reliability is not guaranteed. Bad Blocks may be present when the device is shipped, or may develop during the lifetime of the device.

Devices with Bad Blocks have the same quality level and the same AC and DC characteristics as devices where all the blocks are valid. A Bad Block does not affect the performance of valid blocks because it is isolated from the bit line and common source line by a select transistor.

Bad Block Management, Block Replacement and the Error Correction Code software are necessary to manage the error bits in NAND Flash devices.


Recognizing Bad Blocks
After the original bad-block table is created, any blocks that go bad later must also be added to the “invalid block list”. In general, for SLC large-page (2,112-byte) devices, any block where the 1st and 6th bytes (or the 1st word) in the spare area of the 1st page do not contain FFh is a bad block. A block that fails permanently must therefore be placed in the bad block table, whereas a temporary error can be corrected by the Flash controller. If the Flash Translation Layer addresses one of the Bad Blocks, the Bad Block Management program redirects it to a good block.
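The marker check described above can be sketched in a few lines. This is a minimal illustration, not a vendor routine: the function name is hypothetical, and it assumes an 8-bit device on which a block counts as bad if either the 1st or the 6th spare byte of its first page differs from FFh.

```python
# Hypothetical helper: decide whether a block is marked bad, based on the
# spare area of its first page (large-page SLC convention described above).
GOOD_MARKER = 0xFF
MARKER_OFFSETS = (0, 5)  # 1st and 6th spare bytes, zero-based (assumption: x8 device)

def is_marked_bad(first_page_spare: bytes) -> bool:
    """Return True if any marker byte in the spare area differs from FFh."""
    return any(first_page_spare[i] != GOOD_MARKER for i in MARKER_OFFSETS)

good_spare = bytes([0xFF] * 64)                  # untouched 64-byte spare area
bad_spare = bytes([0x00]) + bytes([0xFF] * 63)   # 1st byte cleared by the marker
print(is_marked_bad(good_spare))  # False
print(is_marked_bad(bad_spare))   # True
```

The exact marker location differs between device families, so the offsets should always be taken from the datasheet of the part in use.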



Block Replacement
NAND devices provide a READ STATUS command that is issued after a PROGRAM or ERASE operation. It reports a failure in PROGRAM (ERASE) if at least one bit in the programmed (erased) page did not change from the “1” to the “0” state (“0” to the “1” state). Additional bad blocks are identified when attempts to program or erase return errors in the status register. Because the failure of a page program operation does not affect the data in other pages of the same block, the block can be replaced by reprogramming the current data and copying the rest of the replaced block to an available valid block.
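The replacement step can be illustrated with a toy in-memory model. Everything here is hypothetical (the `SimNand` class, the function name, the spare-block list), and the failure flag is assumed to be bit 0 of the status byte, which should be checked against the device datasheet. The sketch also assumes the replacement block itself programs without error.

```python
STATUS_FAIL = 0x01  # assumption: bit 0 of the status byte flags a PROGRAM/ERASE failure

class SimNand:
    """Toy in-memory NAND: blocks of pages, with some pages that refuse to program."""
    def __init__(self, blocks, pages_per_block, failing_pages=()):
        self.pages = [[None] * pages_per_block for _ in range(blocks)]
        self.failing = set(failing_pages)  # {(block, page), ...}

    def program(self, block, page, data):
        """Program one page and return a READ STATUS-style byte."""
        if (block, page) in self.failing:
            return STATUS_FAIL
        self.pages[block][page] = data
        return 0x00

    def read(self, block, page):
        return self.pages[block][page]

def program_with_replacement(nand, block, page, data, spare_blocks, bad_blocks):
    """Program a page; on failure, move the block's contents to a spare block."""
    status = nand.program(block, page, data)
    if not (status & STATUS_FAIL):
        return block
    new_block = spare_blocks.pop(0)
    # Copy the pages already stored in the failing block, then the current data.
    for p, old in enumerate(nand.pages[block]):
        if p != page and old is not None:
            nand.program(new_block, p, old)
    nand.program(new_block, page, data)
    bad_blocks.add(block)  # remember the block so it is never used again
    return new_block

nand = SimNand(blocks=4, pages_per_block=4, failing_pages={(0, 2)})
spares, bad = [3], set()
for page, data in enumerate([b"A", b"B", b"C"]):
    block = program_with_replacement(nand, 0, page, data, spares, bad)
print(block, sorted(bad))                   # 3 [0]
print([nand.read(3, p) for p in range(3)])  # [b'A', b'B', b'C']
```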

The Bad Block Table is created by reading all the spare areas in the NAND Flash memory. The table is then saved to a good block so that on rebooting the NAND Flash memory, the Bad Block Table is loaded into RAM. The blocks contained in the Bad Block Table are not addressable. So, if the Flash Translation Layer addresses one of the Bad Blocks, the Bad Block Management software redirects it to a good block.
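As a rough sketch of that flow, the following builds a table from the spare areas, packs it into bytes that could be written to a good block, and reads it back as it would be at boot. The function names and the on-flash format (a 16-bit count followed by 16-bit block numbers) are assumptions made for illustration. Real controllers use their own formats, usually with signatures and redundant copies.

```python
import struct

def build_bad_block_table(first_page_spares):
    """Scan each block's first-page spare area and collect the blocks marked bad.

    Assumption: a block is bad if its 1st or 6th spare byte is not FFh.
    """
    return [blk for blk, spare in enumerate(first_page_spares)
            if spare[0] != 0xFF or spare[5] != 0xFF]

def serialize_table(table):
    """Pack the table as a count followed by 16-bit little-endian block numbers."""
    return struct.pack(f"<H{len(table)}H", len(table), *table)

def load_table(raw):
    """Rebuild the in-RAM table from the bytes stored in flash."""
    (count,) = struct.unpack_from("<H", raw, 0)
    return list(struct.unpack_from(f"<{count}H", raw, 2))

spares = [bytes([0xFF] * 64) for _ in range(6)]
spares[4] = bytes([0x00] * 64)           # block 4 carries a bad-block marker
table = build_bad_block_table(spares)
raw = serialize_table(table)             # bytes that would be saved to a good block
print(table, load_table(raw))            # [4] [4]
```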

Blocks can be marked as bad and new blocks allocated using two general methods.

Skip Block Method
In the skip block method the algorithm creates the bad block table and when the target address corresponds to a bad block address, the data is stored in the next good block, skipping the bad block. When a bad block is generated during the lifetime of the NAND Flash device, its data is also stored in the next good block. In this case, the information that indicates which good block corresponds to each developed bad block also has to be stored in the NAND Flash device.
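A minimal sketch of the skipping step, assuming data is written block by block in ascending order. The function names are hypothetical, and the bookkeeping needed for blocks that go bad at runtime (the stored good-block-to-bad-block correspondence) is left out.

```python
def next_good_block(target, bad_blocks, total_blocks):
    """Return the first good block at or after target, skipping bad ones."""
    block = target
    while block < total_blocks:
        if block not in bad_blocks:
            return block
        block += 1
    raise RuntimeError("no good block available")

def place_sequential(num_chunks, bad_blocks, total_blocks):
    """Assign consecutive block-sized chunks to physical blocks, skipping bad ones."""
    placement = []
    target = 0
    for _ in range(num_chunks):
        block = next_good_block(target, bad_blocks, total_blocks)
        placement.append(block)
        target = block + 1
    return placement

# Blocks 2 and 5 are listed in the bad block table.
print(place_sequential(5, {2, 5}, 8))  # [0, 1, 3, 4, 6]
```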

Reserve Block Method
In the reserve block method, bad blocks are not skipped but replaced by good blocks by redirecting the FTL to a known free good block. For that purpose, the bad block management software creates two areas in the NAND Flash: the user addressable block area and the reserved block area, as shown in Fig. 6. The FTL can use the user addressable block area to store data, whereas the reserved block area is only used for bad block replacement and to save the bad block table, which also keeps track of the remapped developed bad blocks.
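The remapping idea can be sketched as follows. The class and its methods are hypothetical, the reserved area is assumed to sit at the end of the device, and storage of the bad block table inside the reserved area is not modelled.

```python
class ReserveBlockManager:
    """Toy remapper: user area first, reserved replacement blocks at the end."""

    def __init__(self, total_blocks, reserved_count, initial_bad=()):
        self.user_blocks = total_blocks - reserved_count
        # Reserved blocks that are themselves bad cannot serve as replacements.
        self.free_reserved = [b for b in range(self.user_blocks, total_blocks)
                              if b not in initial_bad]
        self.remap = {}  # bad user block -> replacement block in the reserved area
        for b in sorted(initial_bad):
            if b < self.user_blocks:
                self.mark_bad(b)

    def mark_bad(self, block):
        """Redirect a bad user block to the next free reserved block."""
        if not self.free_reserved:
            raise RuntimeError("no reserved blocks left")
        self.remap[block] = self.free_reserved.pop(0)

    def resolve(self, block):
        """Translate the block requested by the FTL into the block actually used."""
        return self.remap.get(block, block)

mgr = ReserveBlockManager(total_blocks=10, reserved_count=2, initial_bad={3})
print(mgr.resolve(3), mgr.resolve(4))  # 8 4
mgr.mark_bad(4)                        # block 4 develops a fault at runtime
print(mgr.resolve(4))                  # 9
```

Compared with skipping, this keeps the addresses of all good user blocks unchanged, at the cost of setting aside capacity for the reserved area.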

SSD NAND Flash Memory Layout

Source: anandtech.com

  1. Pages: multiple memory cells
       a. one page is the smallest structure which can be read or written

  2. Blocks: multiple pages
       a. one block is the smallest structure which can be erased
       b. e.g. one block = 128 pages of 4 KiB each (with MLC, 16,384 memory cells per page) → 512 KiB block
       c. newer SSDs (25nm/20nm Intel/Micron or 24nm/19nm Sandisk/Toshiba): one block = 256 pages of 8 KiB each → 2 MiB block

  3. Planes: multiple blocks make up a plane
       a. e.g. 1,024 blocks = 1 plane
       b. 25nm Intel/Micron: 1 plane = 2 GiByte

  4. Dies: multiple planes make up a die
       a. e.g. 4 planes = 1 die
       b. Intel/Micron: dies with 64 GiBit (8 GiByte)
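The example figures in this list can be cross-checked with a few multiplications. The helper names below are made up for illustration, and the inputs are the 25nm Intel/Micron numbers quoted above.

```python
KIB = 1024

def geometry(page_kib, pages_per_block, blocks_per_plane, planes_per_die):
    """Return (block, plane, die) sizes in bytes for the given NAND geometry."""
    block = page_kib * KIB * pages_per_block
    plane = block * blocks_per_plane
    die = plane * planes_per_die
    return block, plane, die

def cells_per_page(page_kib, bits_per_cell):
    """Number of memory cells needed to hold one page."""
    return page_kib * KIB * 8 // bits_per_cell

block, plane, die = geometry(page_kib=8, pages_per_block=256,
                             blocks_per_plane=1024, planes_per_die=4)
print(block // 2**20, "MiB per block")   # 2 MiB per block
print(plane // 2**30, "GiB per plane")   # 2 GiB per plane
print(die // 2**30, "GiB per die")       # 8 GiB per die
print(cells_per_page(4, 2))              # 16384 (4 KiB page, MLC at 2 bits per cell)
```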


Complete Layout of SSD:


A typical 2Gb Single Level Cell (SLC) NAND Flash device is organized as 2,048 blocks, with 64 pages per block.
Each page is 2,112 bytes, consisting of a 2,048-byte data area and a 64-byte spare area.
The spare area is typically used for error correction code (ECC), wear-leveling, and other software overhead functions, although it is physically the same as the rest of the page.
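To make this organization concrete, the short sketch below checks that the data areas add up to 2 Gbit and splits a linear byte offset into block, page, and column. The function names and the flat addressing scheme are assumptions for illustration, and spare bytes are not counted in the offset.

```python
BLOCKS = 2048
PAGES_PER_BLOCK = 64
DATA_BYTES = 2048   # data area per page
SPARE_BYTES = 64    # spare area per page

def data_capacity_bits():
    """Total size of all data areas, in bits."""
    return BLOCKS * PAGES_PER_BLOCK * DATA_BYTES * 8

def locate(byte_offset):
    """Split a linear data offset into (block, page, column)."""
    page_index, column = divmod(byte_offset, DATA_BYTES)
    block, page = divmod(page_index, PAGES_PER_BLOCK)
    return block, page, column

print(data_capacity_bits() // 2**30)   # 2 (Gbit of data area)
print(DATA_BYTES + SPARE_BYTES)        # 2112 (bytes per full page)
print(locate(5000))                    # (0, 2, 904)
```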



Inside a standard SSD