New Mac Mini Has Modular Storage, 256GB Model Will Have Faster SSD
Read more of this story at Slashdot.
Sabrent's lineup of internal and external SSDs is popular among enthusiasts. The primary reason is the company's tendency to be among the first to market with products based on the latest controllers, while also delivering an excellent value proposition. The company has a long-standing relationship with Phison and adopts its controllers for many of their products. The company's 2 GBps-class portable SSD - the Rocket nano V2 - is based on Phison's U18 native controller. Read on for a detailed look at the Rocket nano V2 External SSD, including an analysis of its performance consistency, power consumption, and thermal profile.
The CXL consortium has had a regular presence at FMS (which rechristened itself from 'Flash Memory Summit' to the 'Future of Memory and Storage' this year). Back at FMS 2022, the consortium had announced v3.0 of the CXL specifications. This was followed by CXL 3.1's introduction at Supercomputing 2023. Having started off as a host to device interconnect standard, it had slowly subsumed other competing standards such as OpenCAPI and Gen-Z. As a result, the specifications started to encompass a wide variety of use-cases by building a protocol on top of the ubiquitous PCIe expansion bus. The CXL consortium comprises heavyweights such as AMD and Intel, as well as a large number of startup companies attempting to play in different segments on the device side. At FMS 2024, CXL had a prime position in the booth demos of many vendors.
The migration of server platforms from DDR4 to DDR5, along with the rise of workloads demanding large RAM capacity (but not particularly sensitive to either memory bandwidth or latency), has opened up memory expansion modules as one of the first set of widely available CXL devices. Over the last couple of years, we have had product announcements from Samsung and Micron in this area.
At FMS 2024, SK hynix was showing off their DDR5-based CMM-DDR5 CXL memory module with a 128 GB capacity. The company was also detailing their associated Heterogeneous Memory Software Development Kit (HMSDK) - a set of libraries and tools at both the kernel and user levels aimed at increasing the ease of use of CXL memory. This is achieved in part by considering the memory pyramid / hierarchy and relocating the data between the server's main memory (DRAM) and the CXL device based on usage frequency.
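The frequency-based relocation idea can be illustrated with a toy placement planner. This is only a sketch of the general hot/cold tiering concept, not HMSDK's actual API or algorithm; the function name `plan_placement`, the access log, and the DRAM page budget are all hypothetical.

```python
from collections import Counter

def plan_placement(access_log, dram_pages):
    """Split pages into a DRAM set and a CXL set by access frequency.

    Hypothetical helper: the hottest `dram_pages` pages stay in main
    memory, everything else is assigned to the CXL memory module.
    """
    counts = Counter(access_log)
    # Most frequently touched pages first; ties broken by page id.
    ranked = sorted(counts, key=lambda page: (-counts[page], page))
    return set(ranked[:dram_pages]), set(ranked[dram_pages:])

# Page 1 is touched three times, page 2 twice, pages 3 and 4 once each.
log = [1, 2, 1, 3, 1, 2, 4]
dram, cxl = plan_placement(log, dram_pages=2)
print("DRAM:", sorted(dram), "CXL:", sorted(cxl))
```

A real implementation would sample accesses in the kernel and migrate pages incrementally instead of re-ranking a complete log.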
The CMM-DDR5 CXL memory module comes in the EDSFF form-factor (E3.S 2T) with a PCIe 5.0 x8 host interface. The internal memory is based on 1α technology DRAM, and the device promises DDR5-class bandwidth and latency within a single NUMA hop. As these memory modules are meant to be used in datacenters and enterprises, the firmware includes features for RAS (reliability, availability, and serviceability) along with secure boot and other management features.
SK hynix was also demonstrating Niagara 2.0 - a hardware solution (currently based on FPGAs) to enable memory pooling and sharing - i.e., connecting multiple CXL memories to allow different hosts (CPUs and GPUs) to optimally share their capacity. The previous version only allowed capacity sharing, but the latest version also enables sharing of data. SK hynix had presented these solutions at the CXL DevCon 2024 earlier this year, but some progress seems to have been made in finalizing the specifications of the CMM-DDR5 at FMS 2024.
Micron had unveiled the CZ120 CXL Memory Expansion Module last year based on the Microchip SMC 2000 series CXL memory controller. At FMS 2024, Micron and Microchip had a demonstration of the module on a Granite Rapids server.
Additional insights into the SMC 2000 controller were also provided.
The CXL memory controller also incorporates DRAM die failure handling, and Microchip also provides diagnostics and debug tools to analyze failed modules. The memory controller also supports ECC, which forms part of the enterprise class RAS feature set of the SMC 2000 series. Its flexibility ensures that SMC 2000-based CXL memory modules using DDR4 can complement the main DDR5 DRAM in servers that support only the latter.
A few days prior to the start of FMS 2024, Marvell had announced a new CXL product line under the Structera tag. At FMS 2024, we had a chance to discuss this new line with Marvell and gather some additional insights.
Unlike other CXL device solutions focusing on memory pooling and expansion, the Structera product line also incorporates a compute accelerator part in addition to a memory-expansion controller. All of these are built on TSMC's 5nm technology.
The compute accelerator part, the Structera A 2504 (A for Accelerator), is a PCIe 5.0 x16 CXL 2.0 device with 16 integrated Arm Neoverse V2 (Demeter) cores at 3.2 GHz. It incorporates four DDR5-6400 channels with support for up to two DIMMs per channel along with in-line compression and decompression. The integration of powerful server-class Arm CPU cores means that the CXL memory expansion part scales the memory bandwidth available per core, while also scaling the compute capabilities.
Applications such as Deep-Learning Recommendation Models (DLRM) can benefit from the compute capability available in the CXL device. The scaling in the bandwidth availability is also accompanied by reduced energy consumption for the workload. The approach also contributes towards disaggregation within the server for a better thermal design as a whole.
The Structera X 2404 (X for eXpander) will be available as a PCIe 5.0 (single x16 or two x8) device with four DDR4-3200 channels (up to 3 DIMMs per channel). Features such as in-line (de)compression, encryption / decryption, and secure boot with hardware support are present in the Structera X 2404 as well. Compared to the 100 W TDP of the Structera A 2504, Marvell expects this part to consume around 30 W. The primary purpose of this part is to enable hyperscalers to recycle DDR4 DIMMs (up to 6 TB per expander) while increasing server memory capacity.
Marvell also has a Structera X 2504 part that supports four DDR5-6400 channels (with two DIMMs per channel for up to 4 TB per expander). Other aspects remain the same as that of the DDR4-recycling part.
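The quoted per-expander capacities can be sanity-checked against the channel and DIMM counts. The following is a back-of-the-envelope sketch; it assumes binary terabytes (1 TB = 1024 GB) and evenly populated slots, neither of which Marvell has stated, and the helper name is hypothetical.

```python
def per_dimm_capacity_gb(total_tb, channels, dimms_per_channel):
    """Return (DIMM slots, implied capacity per DIMM in GB).

    Assumes 1 TB = 1024 GB and identical DIMMs in every slot.
    """
    slots = channels * dimms_per_channel
    return slots, total_tb * 1024 / slots

# DDR4 expander: four channels, up to 3 DIMMs per channel, up to 6 TB.
ddr4 = per_dimm_capacity_gb(6, channels=4, dimms_per_channel=3)
# DDR5 expander: four channels, two DIMMs per channel, up to 4 TB.
ddr5 = per_dimm_capacity_gb(4, channels=4, dimms_per_channel=2)
print(ddr4, ddr5)
```

Under these assumptions, both maximum configurations work out to the same DIMM size, with the DDR4 part gaining its extra capacity purely from the third DIMM per channel.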
The company stressed some unique aspects of the Structera product line - the inline compression optimizes available DRAM capacity, and the 3 DIMMs per channel support for the DDR4 expander maximizes the amount of DRAM per expander (compared to competing solutions). The 5nm process lowers the power consumption, and the parts support accesses from multiple hosts. The integration of Arm Neoverse V2 cores appears to be a first for a CXL accelerator, and enables delegation of compute tasks to improve overall performance of the system.
While Marvell announced specifications for the Structera parts, it does appear that sampling is at least a few quarters away. One of the interesting aspects about Marvell's roadmaps / announcements in recent years has been their focus on creating products tuned to the demands of high-volume customers. The Structera product line is no different - hyperscalers are hungry to recycle their DDR4 memory modules and apparently can't wait to get their hands on the expander parts.
CXL is just starting its slow ramp-up, and the hockey stick segment of the growth curve is definitely not in the near term. However, as more host systems with CXL support start to get deployed, products like the Structera accelerator line start to make sense from a server efficiency viewpoint.
When Western Digital introduced its Ultrastar DC SN861 SSDs earlier this year, the company did not disclose which controller it used for these drives, which led many observers to presume that WD was using an in-house controller. But a recent teardown of the drive shows that is not the case; instead, the company is using a controller from Fadu, a South Korean company founded in 2015 that specializes in enterprise-grade turnkey SSD solutions.
The Western Digital Ultrastar DC SN861 SSD is aimed at performance-hungry hyperscale datacenters and enterprise customers that are adopting PCIe Gen5 storage devices these days. And, as uncovered in photos from a recent Storage Review article, the drive is based on Fadu's FC5161 NVMe 2.0-compliant controller. The FC5161 utilizes 16 NAND channels supporting an ONFi 5.0 2400 MT/s interface, and features a combination of enterprise-grade capabilities (OCP Cloud Spec 2.0, SR-IOV, up to 512 namespaces for ZNS support, flexible data placement, NVMe-MI 1.2, advanced security, telemetry, power loss protection) not available on other off-the-shelf controllers - or on any previous Western Digital controllers.
The Ultrastar DC SN861 SSD offers sequential read speeds up to 13.7 GB/s as well as sequential write speeds up to 7.5 GB/s. As for random performance, it boasts up to 3.3 million random 4K read IOPS and up to 0.8 million random 4K write IOPS. The drives are available in capacities between 1.6 TB and 7.68 TB with a rating of one or three drive writes per day (DWPD) over five years, and in U.2 and E1.S form-factors.
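A DWPD rating translates into total lifetime writes with simple arithmetic. The sketch below is illustrative only: the pairing of capacity points with endurance ratings is an assumption (the text does not say which capacities carry which rating), and a year is taken as 365 days.

```python
def lifetime_writes_tb(capacity_tb, dwpd, years=5, days_per_year=365):
    """Total host writes (in TB) implied by a DWPD rating over the warranty period."""
    return capacity_tb * dwpd * days_per_year * years

# Assumed pairings, for illustration only.
largest_1dwpd = lifetime_writes_tb(7.68, dwpd=1)   # roughly 14,016 TB (about 14 PB)
smallest_3dwpd = lifetime_writes_tb(1.6, dwpd=3)   # roughly 8,760 TB (about 8.8 PB)
print(round(largest_1dwpd), round(smallest_3dwpd))
```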
While the two form factors of the SN861 share a similar technical design, Western Digital has tailored each version for distinct workloads: the E1.S supports FDP and performance enhancements specifically for cloud environments. By contrast, the U.2 model is geared towards high-performance enterprise tasks and emerging applications like AI.
Without a doubt, Western Digital's Ultrastar DC SN861 is a feature-rich high-performance enterprise-grade SSD. It has another distinctive feature: a 5W idle power consumption, which is rather low by the standards of enterprise-grade drives (e.g., it is 1W lower compared to the SN840). While the difference with predecessors may be just 1W, hyperscalers deploy thousands of drives and for their TCO every watt counts.
Western Digital's Ultrastar DC SN861 SSDs are now available for purchase to select customers (such as Meta) and to interested parties. Prices are unknown, but they will depend on such factors as volumes.
Sources: Fadu, Storage Review
As the deployment of PCIe 5.0 picks up steam in both datacenter and consumer markets, PCI-SIG is not sitting idle, and is already working on getting the ecosystem ready for the updates to the PCIe specifications. At FMS 2024, some vendors were even talking about PCIe 7.0 with its 128 GT/s capabilities despite PCIe 6.0 not even starting to ship yet. We caught up with PCI-SIG to get some updates on its activities and have a discussion on the current state of the PCIe ecosystem.
PCI-SIG has already made the PCIe 7.0 specifications (v 0.5) available to its members, and expects full specifications to be officially released sometime in 2025. The goal is to deliver a 128 GT/s data rate with up to 512 GBps of bidirectional traffic using x16 links. Similar to PCIe 6.0, this specification will also utilize PAM4 signaling and maintain backwards compatibility. Power efficiency as well as silicon die area are also being kept in mind as part of the drafting process.
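The 512 GBps figure follows directly from the per-lane data rate. A minimal sketch of the arithmetic, assuming one bit per transfer per lane and ignoring FLIT, FEC, and protocol overhead (so these are raw numbers, not achievable payload throughput):

```python
def raw_bandwidth_gbytes(gt_per_s, lanes, directions=2):
    """Raw link bandwidth in GB/s: transfers/s x lanes x directions / 8 bits per byte."""
    return gt_per_s * lanes * directions / 8

per_direction = raw_bandwidth_gbytes(128, lanes=16, directions=1)  # 256.0 GB/s
bidirectional = raw_bandwidth_gbytes(128, lanes=16)                # 512.0 GB/s
print(per_direction, bidirectional)
```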
The move to PAM4 signaling brings higher bit-error rates compared to the previous NRZ scheme. This made it necessary to adopt a different error correction scheme in PCIe 6.0 - instead of operating on variable length packets, PCIe 6.0's Flow Control Unit (FLIT) encoding operates on fixed size packets to aid in forward error correction. PCIe 7.0 retains these aspects.
The integrators list for the PCIe 6.0 compliance program is also expected to come out in 2025, though initial testing is already in progress. This was evident from the FMS 2024 demo involving Cadence's 3nm test chip for its PCIe 6.0 IP offering along with Teledyne LeCroy's PCIe 6.0 analyzer. These timelines track well with the specification completion dates and compliance program availability for previous PCIe generations.
We also received an update on the optical workgroup - while being optical-technology agnostic, the WG also intends to develop technology-specific form-factors including pluggable optical transceivers, on-board optics, co-packaged optics, and optical I/O. The logical and electrical layers of the PCIe 6.0 specifications are being enhanced to accommodate the new optical PCIe standardization and this process will also be done with PCIe 7.0 to coincide with that standard's release next year.
The PCI-SIG also has ongoing cabling initiatives. On the consumer side, we have seen significant traction for Thunderbolt and external GPU enclosures. However, even datacenters and enterprise systems are moving towards cabling solutions as it becomes evident that disaggregation of components such as storage from the CPU and GPU is better for thermal design. Additionally, maintaining signal integrity over longer distances becomes difficult for on-board signal traces. Cabling internal to the computing systems can help here.
OCuLink emerged as a good candidate and was adopted fairly widely as an internal link in server systems. It has even made an appearance in mini-PCs from some Chinese manufacturers in its external avatar for the consumer market, albeit with limited traction. As speeds increase, a widely-adopted standard for external PCIe peripherals (or even connecting components within a system) will become imperative.
The growth in the enterprise SSD (eSSD) market has outpaced that of the client SSD market over the last few years. The requirements of AI servers for both training and inference have been the major impetus on this front. In addition to the usual vendors like Samsung, Solidigm, Micron, Kioxia, and Western Digital serving the cloud service providers (CSPs) and the likes of Facebook, a number of companies have been at work inside China to service the burgeoning eSSD market within.
In our coverage of the Microchip Flashtec 5016, we had noted Longsys's use of Microchip's SSD controllers to prepare and market enterprise SSDs under the FORESEE brand. Long before that, two companies - DapuStor and Memblaze - started releasing eSSDs specifically focusing on the Chinese market.
There are two drivers for the current growth spurt in the eSSD market. On the performance side, usage of eTLC behind a Gen 5 controller is allowing vendors to advertise significant benefits over the Gen 4 drives in the previous generation. At the same time, a capacity play is happening where there is a race to cram as much NAND as possible into a single U.2 / EDSFF enclosure. QLC is being used for this purpose, and we saw a number of such 128 TB-class eSSDs on display at FMS 2024.
DapuStor and Memblaze have both been relying on SSD controllers from Marvell for their flagship drives. Their latest product iterations for the Gen 5 era use the Marvell Bravera SC5 controller. Similar to the Flashtec controllers, these are not meant to be turnkey solutions. Rather, the SSD vendor has considerable flexibility in implementing specific features for their desired target market.
At FMS 2024, both DapuStor and Memblaze were displaying their latest solutions for the Gen 5 market. Memblaze was celebrating the sale of 150K+ units of their flagship Gen 5 solution - the PBlaze7 7940 incorporating Micron's 232L 3D eTLC with Marvell's Bravera SC5 controller. This SSD (available in capacities up to 30.72 TB) boasts 14 GBps reads / 10 GBps writes along with random read / write performance of 2.8 M / 720K IOPS - all with a typical power consumption south of 16 W. Additionally, the support for some NVMe features such as software-enabled flash (SEF) and zoned namespaces (ZNS) had helped Memblaze and Marvell receive a 'Best of Show' award under the 'Most Innovative Customer Implementation' category.
DapuStor had their current lineup on display (including the Haishen H5000 series with the same Bravera SC5 controller). Additionally, the company had an unannounced proof-of-concept 61.44 TB QLC SSD on display. Despite the label carrying the Haishen5 series tag (its current members all use eTLC NAND), this sample comes with QLC flash.
DapuStor has already invested resources into implementing the flexible data placement (FDP) NVMe feature into the firmware of this QLC SSD. The company also had an interesting presentation session dealing with usage of CXL memory expansion to store the FTL for high-capacity enterprise SSDs - though this is something for the future and not related to any current product in the market.
Having established themselves within the Chinese market, both DapuStor and Memblaze are looking to expand in other markets. Having products with leading performance numbers and features in the eSSD growth segment will stand them in good stead in this endeavor.
At FMS 2024, Phison devoted significant booth space to their enterprise / datacenter SSD and PCIe retimer solutions, in addition to their consumer products. As a controller / silicon vendor, Phison had historically been working with drive partners to bring their solutions to the market. On the enterprise side, their tie-up with Seagate for the X1 series (and the subsequent Nytro-branded enterprise SSDs) is quite well-known. Seagate supplied the requirements list and had a say in the final firmware before qualifying the drives themselves for their datacenter customers. Such qualification involves a significant resource investment that is possible only by large companies (ruling out most of the tier-two consumer SSD vendors).
Phison had demonstrated the Gen 5 X2 platform at last year's FMS as a continuation of the X1. However, with Seagate focusing on its HAMR ramp, and also fighting other battles, Phison decided to go ahead with the qualification process for the X2 platform themselves. In the bigger scheme of things, Phison also realized that the white-labeling approach to enterprise SSDs was not going to work out in the long run. As a result, the Pascari brand was born (ostensibly to make Phison's enterprise SSDs more accessible to end consumers).
Under the Pascari brand, Phison has different lineups targeting different use-cases: from high-performance enterprise drives in the X series to boot drives in the B series. The AI series comes in variants supporting up to 100 DWPD (more on that in the aiDAPTIV+ subsection below).
The D200V Gen 5 took pole position in the displayed drives, thanks to its leading 61.44 TB capacity point (a 122.88 TB drive is also being planned under the same line). The use of QLC in this capacity-focused line brings down the sustained sequential write speeds to 2.1 GBps, but these are meant for read-heavy workloads.
The X200, on the other hand, is a Gen 5 eTLC drive boasting up to 8.7 GBps sequential writes. It comes in read-centric (1 DWPD) and mixed workload variants (3 DWPD) in capacities up to 30.72 TB. The X100 eTLC drive is an evolution of the X1 / Seagate Nytro 5050 platform, albeit with newer NAND and larger capacities.
These drives come with all the usual enterprise features, including power-loss protection and FIPS certifiability. Though Phison didn't advertise this specifically, newer NVMe features like flexible data placement should become part of the firmware features in the future.
Though not strictly an enterprise demo, Phison did have a station showing 100 GBps+ sequential reads and writes using a normal desktop workstation. The trick was installing two HighPoint Rocket 1608A add-in cards (each with eight M.2 slots) and placing the 16 M.2 drives in a RAID 0 configuration.
HighPoint Technology and Phison have been working together to qualify E26-based drives for this use-case, and we will be seeing more on this in a later review.
One of the more interesting demonstrations in Phison's booth was the aiDAPTIV+ Pro suite. At last year's FMS, Phison had demonstrated a 40 DWPD SSD for use with Chia (thankfully, that fad has faded). The company has been working on the extreme endurance aspect and moved it up to 60 DWPD (which is standard for the SLC-based cache drives from Micron and Solidigm).
At FMS 2024, the company took this SSD and added a middleware layer on top to ensure that workloads remain more sequential in nature. This drives up the endurance rating to 100 DWPD. Now, this middleware layer is actually part of their AI training suite targeting small businesses and medium enterprises that do not have the budget for a full-fledged DGX workstation, or for on-premises fine-tuning.
Re-training models by using these AI SSDs as an extension of the GPU VRAM can deliver significant TCO benefits for these companies, as the costly AI training-specific GPUs can be replaced with a set of relatively low-cost off-the-shelf RTX GPUs. This middleware comes with licensing aspects that are essentially tied to the purchase of the AI-series SSDs (that come with Gen 4 x4 interfaces currently in either U.2 or M.2 form-factors). The use of SSDs as a caching layer can enable fine-tuning of models with a very large number of parameters using a minimal number of GPUs (not having to use them primarily for their HBM capacity).
Samsung had quietly launched its BM1743 enterprise QLC SSD last month with a hefty 61.44 TB SKU. At FMS 2024, the company had the even larger 122.88 TB version of that SSD on display, alongside a few recorded benchmarking sessions. Compared to the previous generation, the BM1743 comes with a 4.1x improvement in I/O performance, improvement in data retention, and a 45% improvement in power efficiency for sequential writes.
The 128 TB-class QLC SSD boasts of sequential read speeds of 7.5 GBps and write speeds of 3 GBps. Random reads come in at 1.6 M IOPS, while 16 KB random writes clock in at 45K IOPS. Based on the quoted random write access granularity, it appears that Samsung is using a 16 KB indirection unit (IU) to optimize flash management. This is similar to the strategy adopted by Solidigm with IUs larger than 4K in their high-capacity SSDs.
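The effect of a larger indirection unit on mapping-table size, and the bandwidth implied by the quoted random write figure, can be worked out as follows. This is a rough sketch: it assumes the 122.88 TB capacity is in decimal bytes and that the mapping table holds one entry per IU. The per-entry size is not computed because it is not public.

```python
CAPACITY_BYTES = 122_880_000_000_000  # 122.88 TB, assuming decimal terabytes

def mapping_entries(capacity_bytes, iu_bytes):
    """Number of logical-to-physical entries if each IU gets one entry."""
    return capacity_bytes // iu_bytes

def random_write_bytes_per_s(iops, io_bytes):
    """Throughput implied by an IOPS figure at a given transfer size."""
    return iops * io_bytes

entries_4k = mapping_entries(CAPACITY_BYTES, 4 * 1024)    # 30 billion entries
entries_16k = mapping_entries(CAPACITY_BYTES, 16 * 1024)  # 7.5 billion entries
write_rate = random_write_bytes_per_s(45_000, 16 * 1024)  # about 737 MB/s
print(entries_4k, entries_16k, write_rate)
```

Moving from a 4 KB to a 16 KB IU cuts the number of mapping entries by a factor of four, which is the main attraction for very high capacity drives. The trade-off is that host writes smaller than the IU incur read-modify-write overhead.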
A recorded benchmark session on the company's PM9D3a 8-channel Gen 5 SSD was also on display.
The SSD family is being promoted as a mainstream option for datacenters, and boasts of sequential reads up to 12 GBps and writes up to 6.8 GBps. Random reads clock in at 2 M IOPS, and random writes at 400 K IOPS.
Available in multiple form-factors up to 32 TB (M.2 tops out at 2 TB), the drive's firmware includes optional support for flexible data placement (FDP) to help address the write amplification aspect.
The PM1753 is the current enterprise SSD flagship in Samsung's lineup. With support for 16 NAND channels and capacities up to 32 TB, this U.2 / E3.S SSD has advertised sequential read and write speeds of 14.8 GBps and 11 GBps respectively. Random reads and writes for 4 KB accesses are listed at 3.4 M and 600 K IOPS.
Samsung claims a 1.7x performance improvement and a 1.7x power efficiency improvement over the previous generation (PM1743), making this TLC SSD suitable for AI servers.
The 9th Gen. V-NAND wafer was also available for viewing, though photography was prohibited. Mass production of this flash memory began in April 2024.
A few years back, the Japanese government's New Energy and Industrial Technology Development Organization (NEDO) allocated funding for the development of green datacenter technologies. With the aim to obtain up to 40% savings in overall power consumption, several Japanese companies have been developing an optical interface for their enterprise SSDs. And at this year's FMS, Kioxia had their optical interface on display.
For this demonstration, Kioxia took its existing CM7 enterprise SSD and created an optical interface for it. A PCIe card with on-board optics developed by Kyocera is installed in the server slot. An optical interface allows data transfer over long distances (it was 40m in the demo, but Kioxia promises lengths of up to 100m for the cable in the future). This allows the storage to be kept in a separate room with minimal cooling requirements compared to the rack with the CPUs and GPUs. Disaggregation of different server components will become an option as very high throughput interfaces such as PCIe 7.0 (with 128 GT/s rates) become available.
The demonstration of the optical SSD showed a slight loss in IOPS performance, but a significant advantage in the latency metric over the shipping enterprise SSD behind a copper network link. Obviously, there are advantages in wiring requirements and signal integrity maintenance with optical links.
Being a proof-of-concept demonstration, we do see the requirement for an industry-standard approach if this were to gain adoption among different datacenter vendors. The PCI-SIG optical workgroup will need to get its act together soon to create a standards-based approach to this problem.
At FMS 2024, the technological requirements from the storage and memory subsystem took center stage. Both SSD and controller vendors had various demonstrations touting their suitability for different stages of the AI data pipeline - ingestion, preparation, training, checkpointing, and inference. Vendors like Solidigm have different types of SSDs optimized for different stages of the pipeline. At the same time, controller vendors have taken advantage of one of the features introduced recently in the NVM Express standard - Flexible Data Placement (FDP).
FDP involves the host providing information / hints about the areas where the controller could place the incoming write data in order to reduce the write amplification. These hints are generated based on specific block sizes advertised by the device. The feature is completely backwards-compatible, with non-FDP hosts working just as before with FDP-enabled SSDs, and vice-versa.
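Why placement hints reduce write amplification can be seen in a toy model. The sketch below is not the NVMe FDP command interface; it simply fills fixed-size "reclaim units" either in arrival order or grouped by a host-supplied stream tag, then counts how many still-valid pages would have to be copied out when one stream's data is deleted. The stream names, unit size, and function names are all hypothetical.

```python
def fill_reclaim_units(writes, ru_size, use_hints):
    """Place each write (tagged with its stream) into reclaim units.

    With hints, every stream gets its own open unit; without hints,
    all streams share a single open unit in arrival order.
    """
    open_units, closed = {}, []
    for stream in writes:
        key = stream if use_hints else "shared"
        unit = open_units.setdefault(key, [])
        unit.append(stream)
        if len(unit) == ru_size:
            closed.append(unit)
            open_units[key] = []
    closed.extend(unit for unit in open_units.values() if unit)
    return closed

def relocation_cost(units, dead_stream):
    """Live pages that must be copied to reclaim units holding dead data."""
    return sum(
        sum(1 for s in unit if s != dead_stream)
        for unit in units
        if dead_stream in unit
    )

# Interleaved short-lived "log" writes and long-lived "db" writes.
writes = ["log", "db"] * 4
mixed = relocation_cost(fill_reclaim_units(writes, 4, use_hints=False), "log")
hinted = relocation_cost(fill_reclaim_units(writes, 4, use_hints=True), "log")
print(mixed, hinted)  # 4 0
```

In this model, the hinted layout lets the unit holding only log data be erased without copying anything, while the mixed layout forces every long-lived page to be rewritten.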
Silicon Motion's MonTitan Gen 5 Enterprise SSD Platform was announced back in 2022. Since then, Silicon Motion has been touting the flexibility of the platform, allowing its customers to incorporate their own features as part of the customization process. This approach is common in the enterprise space, as we have seen with Marvell's Bravera SC5 SSD controller in the DapuStor SSDs and Microchip's Flashtec controllers in the Longsys FORESEE enterprise SSDs.
At FMS 2024, the company was demonstrating the advantages of flexible data placement by allowing a single QLC SSD based on their MonTitan platform to take part in different stages of the AI data pipeline while maintaining the required quality of service (minimum bandwidth) for each process. The company even has a trademarked name (PerformaShape) for the firmware feature in the controller that allows the isolation of different concurrent SSD accesses (from different stages in the AI data pipeline) to guarantee this QoS. Silicon Motion claims that this scheme will enable its customers to get the maximum write performance possible from QLC SSDs without negatively impacting the performance of other types of accesses.
Silicon Motion and Phison have market leadership in the client SSD controller market with similar approaches. However, their enterprise SSD controller marketing couldn't be more different. While Phison has gone in for a turnkey solution with their Gen 5 SSD platform (to the extent of not adopting the white label route for this generation, and instead opting to get the SSDs qualified with different cloud service providers themselves), Silicon Motion is opting for a different approach. The flexibility and customization possibilities can make platforms like the MonTitan appeal to flash array vendors.
At FMS 2024, Kioxia had a proof-of-concept demonstration of their proposed new RAID offload methodology for enterprise SSDs. The impetus for this is quite clear: as SSDs get faster in each generation, RAID arrays have a major problem of maintaining (and scaling up) performance. Even in cases where the RAID operations are handled by a dedicated RAID card, a simple write request in, say, a RAID 5 array would involve two reads and two writes to different drives. In cases where there is no hardware acceleration, the data from the reads needs to travel all the way back to the CPU and main memory for further processing before the writes can be done.
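The two reads and two writes come from the standard RAID 5 read-modify-write parity update: read the old data and old parity, then write the new data and the recomputed parity. A minimal sketch of that XOR arithmetic follows; it is the textbook computation, not Kioxia's offload implementation, and the helper names and block contents are made up for illustration.

```python
def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def raid5_small_write(old_data, old_parity, new_data):
    """New parity from the two blocks read back plus the incoming data."""
    return xor_blocks(old_parity, old_data, new_data)

# A three-data-drive stripe with illustrative contents.
d0, d1, d2 = bytes([1, 2, 3, 4]), bytes([5, 6, 7, 8]), bytes([9, 10, 11, 12])
parity = xor_blocks(d0, d1, d2)

# Overwrite d1: read d1 and parity, then write new_d1 and new_parity.
new_d1 = bytes([0xAA, 0xBB, 0xCC, 0xDD])
new_parity = raid5_small_write(d1, parity, new_d1)

# The shortcut matches a full recomputation over the stripe.
print(new_parity == xor_blocks(d0, new_d1, d2))  # True
```

The XOR step is the work that, without hardware acceleration, forces the read data to travel to the CPU and main memory before the writes can be issued.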
Kioxia has proposed the use of the PCIe direct memory access feature along with the SSD controller's controller memory buffer (CMB) to avoid the movement of data up to the CPU and back. The required parity computation is done by an accelerator block resident within the SSD controller.
In Kioxia's PoC implementation, the DMA engine can access the entire host address space (including the peer SSD's BAR-mapped CMB), allowing it to receive and transfer data as required from neighboring SSDs on the bus. Kioxia noted that their offload PoC saw close to 50% reduction in CPU utilization and upwards of 90% reduction in system DRAM utilization compared to software RAID done on the CPU. The proposed offload scheme can also handle scrubbing operations without taking up the host CPU cycles for the parity computation task.
Kioxia has already taken steps to contribute these features to the NVM Express working group. If accepted, the proposed offload scheme will be part of a standard that could become widely available across multiple SSD vendors.
Western Digital's BiCS8 218-layer 3D NAND is being put to good use in a wide range of client and enterprise platforms, including WD's upcoming Gen 5 client SSDs and 128 TB-class datacenter SSD. On the external storage front, the company demonstrated four different products: for card-based media, 4 TB microSDUC and 8 TB SDUC cards with UHS-I speeds, and on the portable SSD front we had two 16 TB drives. One will be a SanDisk Desk Drive with external power, and the other in the SanDisk Extreme Pro housing with a lanyard opening in the case.
All of these are using BiCS8 QLC NAND, though I did hear booth talk (as I was taking leave) that they were not supposed to divulge the use of QLC in these products. The 4 TB microSDUC and 8 TB SDUC cards are rated for UHS-I speeds. They are being marketed under the SanDisk Ultra branding.
The SanDisk Desk Drive is an external SSD with a 18W power adapter, and it has been in the market for a few months now. Initially launched in capacities up to 8 TB, Western Digital had promised a 16 TB version before the end of the year. It appears that the product is coming to retail quite soon. One aspect to note is that this drive has been using TLC for the SKUs that are currently in the market, so it appears unlikely that the 16 TB version would be QLC. The units (at least up to the 8 TB capacity point) come with two SN850XE drives. Given the recent introduction of the 8 TB SN850X, an 'E' version with tweaked firmware is likely to be present in the 16 TB Desk Drive.
The 16 TB portable SSD in the SanDisk Extreme housing was a technology demonstration. It is definitely the highest capacity bus-powered portable SSD demonstrated by any vendor at any trade show thus far. Given the 16 TB Desk Drive's imminent market introduction, it is just a matter of time before the technology demonstration of the bus-powered version becomes a retail reality.
Kioxia's booth at FMS 2024 was a busy one with multiple technology demonstrations keeping visitors occupied. A walk-through of the BiCS 8 manufacturing process was the first to grab my attention. Kioxia and Western Digital announced the sampling of BiCS 8 in March 2023. We had touched briefly upon its CMOS Bonded Array (CBA) scheme in our coverage of Kioxia's 2Tb QLC NAND device and coverage of Western Digital's 128 TB QLC enterprise SSD proof-of-concept demonstration. At Kioxia's booth, we got more insights.
Traditionally, fabrication of flash chips involved placement of the associated logic circuitry (CMOS process) around the periphery of the flash array. The process then moved on to putting the CMOS under the cell array, but the wafer development process was serialized with the CMOS logic getting fabricated first followed by the cell array on top. However, this has some challenges because the cell array requires a high-temperature processing step to ensure higher reliability that can be detrimental to the health of the CMOS logic. Thanks to recent advancements in wafer bonding techniques, the new CBA process allows the CMOS wafer and cell array wafer to be processed independently in parallel and then pieced together, as shown in the models above.
The BiCS 8 3D NAND incorporates 218 layers, compared to 112 layers in BiCS 5 and 162 layers in BiCS 6. The company decided to skip over BiCS 7 (or, rather, it was probably a short-lived generation meant as an internal test vehicle). The generation retains the four-plane charge trap structure of BiCS 6. In its TLC avatar, it is available as a 1 Tbit device. The QLC version is available in two capacities - 1 Tbit and 2 Tbit.
Kioxia also noted that while the number of layers (218) doesn't compare favorably with the latest layer counts from the competition, its lateral scaling / cell shrinkage has enabled it to be competitive in terms of bit density as well as operating speeds (3200 MT/s). For reference, the latest shipping NAND from Micron - the G9 - has 276 layers with a bit density in TLC mode of 21 Gbit/mm2, and operates at up to 3600 MT/s. However, its 232L NAND operates only up to 2400 MT/s and has a bit density of 14.6 Gbit/mm2.
It must be noted that the CBA hybrid bonding process has advantages over the current processes used by other vendors - including Micron's CMOS under array (CuA) and SK hynix's 4D PUC (periphery-under-chip) developed in the late 2010s. It is expected that other NAND vendors will also move eventually to some variant of the hybrid bonding scheme used by Kioxia.
At FMS 2024, Phison gave us the usual updates on their client flash solutions. The E31T Gen 5 mainstream controller has already been seen at a few tradeshows starting with Computex 2023, while the USB4 native flash controller for high-end PSSDs was unveiled at CES 2024. The new solution being demonstrated was the E29T Gen 4 mainstream DRAM-less controller. Phison believes that there is still performance to be eked out on the Gen 4 platform with a low-cost DRAM-less solution.
Phison NVMe SSD Controller Comparison
Controller | E31T | E29T | E27T | E26 | E18
Market Segment | Mainstream Consumer | Mainstream Consumer | Mainstream Consumer | High-End Consumer | High-End Consumer
Manufacturing Process | 7nm | 12nm | 12nm | 12nm | 12nm
CPU Cores | 2x Cortex R5 | 1x Cortex R5 | 1x Cortex R5 | 2x Cortex R5 | 3x Cortex R5
Error Correction | 7th Gen LDPC | 7th Gen LDPC | 5th Gen LDPC | 5th Gen LDPC | 4th Gen LDPC
DRAM | No | No | No | DDR4, LPDDR4 | DDR4
Host Interface | PCIe 5.0 x4 | PCIe 4.0 x4 | PCIe 4.0 x4 | PCIe 5.0 x4 | PCIe 4.0 x4
NVMe Version | NVMe 2.0 | NVMe 2.0 | NVMe 2.0 | NVMe 2.0 | NVMe 1.4
NAND Channels, Interface Speed | 4 ch, 3600 MT/s | 4 ch, 3600 MT/s | 4 ch, 3600 MT/s | 8 ch, 2400 MT/s | 8 ch, 1600 MT/s
Max Capacity | 8 TB | 8 TB | 8 TB | 8 TB | 8 TB
Sequential Read | 10.8 GB/s | 7.4 GB/s | 7.4 GB/s | 14 GB/s | 7.4 GB/s
Sequential Write | 10.8 GB/s | 6.5 GB/s | 6.7 GB/s | 11.8 GB/s | 7.0 GB/s
4KB Random Read IOPS | 1500k | 1200k | 1200k | 1500k | 1000k
4KB Random Write IOPS | 1500k | 1200k | 1200k | 2000k | 1000k
Compared to the E27T, the key update is the use of a newer LDPC engine that enables better SSD lifespan as well as compatibility with the latest QLC flash, along with additional power optimizations.
The company also had a U21 USB4 PSSD reference design (complete with a MagSafe-compatible casing) on display, along with the usual CrystalDiskMark benchmark results. We were given to understand that PSSDs based on the U21 controller are very close to shipping into retail.
Phison has been known for taking the lead in introducing SSD controllers based on the latest and greatest interface options - be it PCIe 4.0, PCIe 5.0, or USB4. The competition usually takes the form of tier-one vendors opting for their in-house solutions, or Silicon Motion stepping in with a more power-efficient solution a few quarters down the line, after the market takes off. With the E29T, Phison is aiming to ensure that they still have a viable play in the mainstream Gen 4 market, with their latest LDPC engine and support for the highest available NAND flash speeds.
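To put the channel counts and interface speeds from the comparison table in context, the short sketch below compares the aggregate raw NAND bus bandwidth of each controller against the raw bandwidth of its PCIe x4 host link. It assumes an 8-bit NAND bus per channel (one byte per transfer) and 128b/130b PCIe line encoding, and it ignores protocol overhead and the throughput of the NAND array itself, so the figures are ceilings rather than achievable numbers. The function and dictionary names are illustrative only.

```python
# Back-of-the-envelope bandwidth ceilings. Assumptions: 8-bit NAND bus per
# channel, 128b/130b PCIe encoding, no protocol or NAND array overhead.

PCIE_GT_PER_LANE = {3: 8.0, 4: 16.0, 5: 32.0}  # GT/s per lane, by generation


def nand_bus_gbytes(channels: int, mts: int) -> float:
    """Aggregate raw NAND bus bandwidth in GB/s (1 byte per transfer)."""
    return channels * mts / 1000.0


def pcie_x4_gbytes(gen: int) -> float:
    """Raw PCIe x4 bandwidth in GB/s after 128b/130b encoding."""
    return PCIE_GT_PER_LANE[gen] * 4 * (128 / 130) / 8


# (NAND channels, NAND MT/s, PCIe generation), taken from the table above.
CONTROLLERS = {
    "E31T": (4, 3600, 5),
    "E29T": (4, 3600, 4),
    "E27T": (4, 3600, 4),
    "E26": (8, 2400, 5),
    "E18": (8, 1600, 4),
}


def narrower_side(name: str) -> str:
    """Report which raw ceiling is lower: the host link or the NAND bus."""
    channels, mts, gen = CONTROLLERS[name]
    nand = nand_bus_gbytes(channels, mts)
    host = pcie_x4_gbytes(gen)
    return "host link" if nand > host else "NAND bus"


if __name__ == "__main__":
    for name, (channels, mts, gen) in CONTROLLERS.items():
        print(
            f"{name}: NAND bus {nand_bus_gbytes(channels, mts):.1f} GB/s, "
            f"PCIe {gen}.0 x4 {pcie_x4_gbytes(gen):.2f} GB/s, "
            f"narrower side: {narrower_side(name)}"
        )
```

Under these assumptions, four channels at 3600 MT/s offer more raw NAND bandwidth than a PCIe 4.0 x4 link can carry, which is consistent with the E29T's rated sequential read speed sitting just below the Gen 4 link ceiling.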
Microchip recently announced the availability of their second PCIe Gen 5 enterprise SSD controller - the Flashtec 5016. Like the 4016, this is also a 16-channel controller, but there are some key updates:
Microchip's enterprise SSD controllers provide a high level of flexibility to SSD vendors by providing them with significant horsepower and accelerators. The 5016 includes Cortex-A53 cores for SSD vendors to run custom applications relevant to SSD management. However, compared to the Gen4 controllers, there are two additional cores in the CPU cluster. The DRAM subsystem includes ECC support (both out-of-band and inline, as desired by the SSD vendor).
At FMS 2024, the company demonstrated an application of the neural network engines embedded in the Gen5 controllers. Controllers usually employ a 'read-retry' operation with altered read-out voltages for flash reads that do not complete successfully. Microchip implemented a machine learning approach to determine the read-out voltage based on the health history of the NAND block using the NN engines in the controller. This approach delivers tangible benefits for read latency and power consumption (thanks to a smaller number of errors on the first read).
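The difference between a fixed read-retry table and a predicted read-out voltage can be illustrated with a toy simulation. The sketch below is not Microchip's implementation: the wear model, the retry table, the tolerance, and the two-weight linear predictor (a stand-in for the neural network) are all assumptions made for illustration, and every name in it is hypothetical.

```python
from dataclasses import dataclass

# Hypothetical retry table: read-voltage offsets (arbitrary DAC steps) tried
# in order when a read fails. Real tables are vendor- and NAND-specific.
RETRY_OFFSETS = [0, -2, 2, -4, 4, -6, 6, -8, 8]


@dataclass
class Block:
    pe_cycles: int        # program/erase cycles seen by the block
    retention_days: int   # days since the data was written


def optimal_offset(block: Block) -> float:
    """Toy wear model (assumption): the ideal offset drifts downward with
    P/E cycles and retention time."""
    return -(block.pe_cycles / 1000.0) - (block.retention_days / 30.0)


def read_ok(block: Block, offset: float, tolerance: float = 1.0) -> bool:
    """A read succeeds if the offset is within `tolerance` of the ideal."""
    return abs(offset - optimal_offset(block)) <= tolerance


def reads_with_retry_table(block: Block) -> int:
    """Classic read-retry: walk the fixed table until a read succeeds."""
    for attempt, offset in enumerate(RETRY_OFFSETS, start=1):
        if read_ok(block, offset):
            return attempt
    return len(RETRY_OFFSETS) + 1  # table exhausted; heavier recovery needed


def fit_linear(history):
    """Least-squares fit of offset ~ w_pe * pe_cycles + w_ret * retention_days.
    `history` holds (pe_cycles, retention_days, offset_that_worked) tuples."""
    sxx = sum(p * p for p, _, _ in history)
    syy = sum(r * r for _, r, _ in history)
    sxy = sum(p * r for p, r, _ in history)
    sxt = sum(p * t for p, _, t in history)
    syt = sum(r * t for _, r, t in history)
    det = sxx * syy - sxy * sxy
    return (sxt * syy - syt * sxy) / det, (syt * sxx - sxt * sxy) / det


def reads_with_prediction(block: Block, weights) -> int:
    """Try the predicted offset first; fall back to the table on failure."""
    w_pe, w_ret = weights
    predicted = w_pe * block.pe_cycles + w_ret * block.retention_days
    if read_ok(block, predicted):
        return 1
    return 1 + reads_with_retry_table(block)


# Health history, generated here from the toy wear model itself.
HISTORY = [
    (pe, ret, optimal_offset(Block(pe, ret)))
    for pe, ret in [(1000, 30), (2000, 10), (500, 60), (4000, 0)]
]
WEIGHTS = fit_linear(HISTORY)

if __name__ == "__main__":
    worn = Block(pe_cycles=3000, retention_days=90)
    print("retry table reads:", reads_with_retry_table(worn))
    print("predicted reads:  ", reads_with_prediction(worn, WEIGHTS))
```

In this toy model, a worn block forces the fixed table through several failed reads before it reaches a workable offset, while the predictor lands within tolerance on the first attempt. The latency and power benefits described above come from eliminating those extra reads.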
The 4016 and 5016 come with a single-chip root of trust implementation for hardware security. A secure boot process with dual-signature authentication ensures that the controller firmware is not maliciously altered in the field. The company also brought out the advantages of their controller's implementation of SR-IOV, flexible data placement, and zoned namespaces along with their 'credit engine' scheme for multi-tenant cloud workloads. These aspects were also brought out in other demonstrations.
Microchip's press release included quotes from the usual NAND vendors - Solidigm, Kioxia, and Micron. On the customer front, Longsys has been using Flashtec controllers in their enterprise offerings along with YMTC NAND. It is likely that this collaboration will continue further using the new 5016 controller.
Western Digital's FMS 2024 demonstrations included a preview of their upcoming PCIe 5.0 x4 M.2 2280 NVMe SSDs for mobile workstations and consumer desktops. The Gen 5 client SSD market has been dominated by solutions based on Phison's E26 controller. The first generation products launched with slower NAND flash, while the more recent ones have exceeded the 14 GBps barrier by utilizing Micron's 2400 MT/s 232L 3D TLC. Western Digital has been conservative over the last year or so by focusing more on the mainstream / mid-range market in terms of new product introductions (such as the WD Blue SN5000, WD_BLACK SN770M, and the WD Blue SN580). Their SSD lineup is due for an update with Gen 5 drives being sorely missed. The SSDs being demonstrated at FMS 2024 will end up doing just that.
Western Digital's technology demonstrations in this segment involved two different M.2 2280 SSDs - one for the performance segment, and another for the mainstream market. They both utilize in-house controllers - while the performance segment drive uses an 8-channel controller with DRAM for the flash translation layer, the mainstream one utilizes a 4-channel DRAM-less controller. Both drives being benchmarked live were equipped with BiCS8 218-layer 3D TLC.
Western Digital is touting the power efficiency of their platform as a key differentiator, promising south of 7W (performance drive) and 5W (mainstream DRAM-less drive) for the complete SSD under stressful traffic. This makes the drives suitable for use in mobile workstations, and a good fit for desktops as well.
Demonstrated performance numbers indicate almost 15 GBps sequential reads and 2M+ random read IOPS for the performance drive, and 10.7 GBps sequential reads for the mainstream version. Western Digital might have missed the Gen 5 bus as it started out slowly. However, the technology demonstrations with the in-house controller and NAND indicate that WD has caught up just as the Gen 5 market is about to take off.
Silicon Motion's SM2320 native USB 3.2 Gen 2x2 controller for USB flash drives and portable SSDs has enjoyed great market success with a large number of design wins over the last few years. Silicon Motion proudly displayed a selection of products based on the SM2320 on the show floor at FMS 2024.
The SM2320 went into mass production in Q3 2021. Since then, the NAND flash market has seen considerable change. QLC is becoming more and more reliable and common, leading to the launch of high-capacity cost-effective 4 TB and 8 TB SSDs. Newer NAND generations with flash operating at higher speeds have also made an appearance.
The SM2320, fabricated in TSMC's 28nm node, supported four channels of NAND flash running at up to 800 MT/s. The new SM2322 uses the same process node and retains support for the same number of flash channels and chip enables (8 CEs per channel). However, the NAND can now operate at up to 1200 MT/s.
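The channel and chip-enable figures translate into capacity points with some simple arithmetic. The sketch below assumes one NAND die per chip enable and treats 1 Tbit as 1/8 TB; both are simplifications, and actual products may stack multiple dies per CE or leave some CEs unpopulated. The function names are illustrative only.

```python
# Capacity and NAND bus arithmetic for a 4-channel, 8-CE-per-channel
# controller. Assumption: one die per chip enable unless stated otherwise.

CHANNELS = 4
CES_PER_CHANNEL = 8


def ce_targets() -> int:
    """Total chip-enable targets addressable by the controller."""
    return CHANNELS * CES_PER_CHANNEL


def raw_capacity_tb(die_tbit: float, dies_per_ce: int = 1) -> float:
    """Raw capacity in TB, treating 1 Tbit as 1/8 TB."""
    return ce_targets() * dies_per_ce * die_tbit / 8


def nand_bus_speedup(old_mts: int = 800, new_mts: int = 1200) -> float:
    """Ratio of per-channel NAND interface speeds (SM2322 vs. SM2320)."""
    return new_mts / old_mts


if __name__ == "__main__":
    print("CE targets:", ce_targets())
    print("1 Tbit dies:", raw_capacity_tb(1), "TB")
    print("2 Tbit dies:", raw_capacity_tb(2), "TB")
    print("NAND bus speedup:", nand_bus_speedup())
```

With these assumptions, 32 chip enables populated with 1 Tbit dies yield 4 TB of raw flash, and 2 Tbit dies yield 8 TB, while the move from 800 MT/s to 1200 MT/s is a 1.5x increase in per-channel NAND interface speed.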
The SM2322 also improves the QLC support, thanks to the implementation of a better ECC scheme. While the SM2320 opted for a 2KB LDPC implementation, the SM2322 goes in for a 4KB LDPC solution. The use of a larger region enables extension of the NAND's useful life.
The SM2322 and SM2320 packages are similar in size, and Silicon Motion expects PSSD designs using the SM2320 to adopt the SM2322 with different NAND (higher capacity / speeds) using the same enclosure. Products based on the SM2322 are expected to appear in the market before the end of the year.