
JEDEC Extends DDR5 Memory Specification to 8800 MT/s, Adds Anti-Rowhammer Features

April 22, 2024 at 12:00

When JEDEC released its DDR5 specification (JESD79) back in 2020, the standards-setting organization defined precise specs for modules with speed bins of up to 6400 MT/s, while leaving the spec open to further expansion with faster memory as technology progressed. Now, a bit more than three-and-a-half years later, the standards body and its members are gearing up to release a faster generation of DDR5 memory, which is being laid out in the newly updated JESD79-5C specification. The latest iteration of the DDR5 spec defines official DDR timing specifications up to 8800 MT/s, as well as adding some new features when it comes to security.

Diving in, the new specification outlines settings for memory chips (on all types of memory modules) with data transfer rates up to 8800 MT/s (AKA DDR5-8800). This suggests that all members of the JEDEC committee that sets the specs for DDR5 (including memory chip makers and memory controller designers) agree that DDR5-8800 is a viable extension of the DDR5 specification, both from a performance and a cost point of view. Meanwhile, the addition of higher speed bins is perhaps enabled by another feature introduced in this latest specification: the Self-Refresh Exit Clock Sync for I/O training optimization.

JEDEC DDR5-A Specifications

Speed Bin      Data Rate (MT/s)   CL-tRCD-tRP (cycles)   Absolute Latency (ns)   Peak BW (GB/s)
DDR5-3200 A    3200               22-22-22               13.75                   25.6
DDR5-3600 A    3600               26-26-26               14.44                   28.8
DDR5-4000 A    4000               28-28-28               14.00                   32.0
DDR5-4400 A    4400               32-32-32               14.55                   35.2
DDR5-4800 A    4800               34-34-34               14.17                   38.4
DDR5-5200 A    5200               38-38-38               14.62                   41.6
DDR5-5600 A    5600               40-40-40               14.29                   44.8
DDR5-6000 A    6000               42-42-42               14.00                   48.0
DDR5-6400 A    6400               46-46-46               14.38                   51.2
DDR5-6800 A    6800               48-48-48               14.12                   54.4
DDR5-7200 A    7200               52-52-52               14.44                   57.6
DDR5-7600 A    7600               54-54-54               14.21                   60.8
DDR5-8000 A    8000               56-56-56               14.00                   64.0
DDR5-8400 A    8400               60-60-60               14.29                   67.2
DDR5-8800 A    8800               62-62-62               14.09                   70.4

When it comes to the JEDEC standard for DDR5-8800, it sets relatively loose timings of CL62-62-62 for A-grade devices and CL78-77-77 for lower-end C-grade ICs. Unfortunately, the laws of physics driving DRAM cells have not improved much over the last couple of years (or decades, for that matter), so memory chips still must operate with similar absolute latencies, driving up the relative CAS latency. In this case, roughly 14 ns remains the gold standard, with CAS latencies at the new speeds being set to hold absolute latencies around that mark. But in exchange for systems willing to wait a bit longer (in terms of cycles) for a result, the new spec improves the standard's peak memory bandwidth by 37.5%.
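
The table's derived columns follow from two simple formulas: absolute latency is CAS cycles divided by the memory clock (half the data rate), and peak bandwidth is the data rate times the 8-byte channel width. A minimal Python sketch of our own, reproducing the figures above:

```python
# DDR5 timing arithmetic: absolute latency and peak bandwidth per speed bin.
# DDR transfers data on both clock edges, so the I/O clock (in GHz) is half
# the data rate (in MT/s) divided by 1000.

def absolute_latency_ns(data_rate_mts: int, cas_cycles: int) -> float:
    """CAS latency in nanoseconds: cycles divided by the clock rate in GHz."""
    clock_ghz = data_rate_mts / 2 / 1000
    return cas_cycles / clock_ghz

def peak_bandwidth_gbs(data_rate_mts: int, bus_bytes: int = 8) -> float:
    """Peak bandwidth of one 64-bit (8-byte) wide channel, in GB/s."""
    return data_rate_mts * bus_bytes / 1000

# DDR5-8800 A-grade: CL62 at 8800 MT/s
print(absolute_latency_ns(8800, 62))   # ~14.09 ns
print(peak_bandwidth_gbs(8800))        # 70.4 GB/s
# Bandwidth gain over the previous DDR5-6400 ceiling:
print(8800 / 6400 - 1)                 # 0.375, i.e. 37.5%
```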

These, of course, are just the timings set in the JEDEC specification, which is primarily of concern to server vendors. So we'll have to see just how much harder consumer memory manufacturers can push things with their XMP/EXPO-profiled memory. Extreme overclockers are already hitting speeds as high as 11,240 MT/s with current-generation DRAM chips and CPUs, so there may be some more headroom to play with in the next generation.

Meanwhile, on the security front, the updated spec makes a couple of changes that appear to have been put in place to address rowhammer-style exploits. The big item here is Per-Row Activation Counting (PRAC), which, true to its name, enables DDR5 to keep a count of how often a row has been activated. Using this information, memory controllers can then determine if a memory row has been excessively activated and is at risk of causing a neighboring row's bits to flip, at which point they can back off to let the neighboring row properly refresh and the data re-stabilize.
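
Since the specification itself isn't public to us, the following is only a conceptual Python sketch of the counting logic described above; the threshold value and the API are entirely hypothetical, not from the JEDEC spec:

```python
# Conceptual sketch of Per-Row Activation Counting (PRAC) as described above.
# The real mechanism lives in the DRAM and memory controller; the threshold
# here is a made-up illustrative value, not a number from the JEDEC spec.
from collections import defaultdict

ACT_THRESHOLD = 50_000  # hypothetical activations tolerated between refreshes

class PracTracker:
    def __init__(self) -> None:
        self.activations = defaultdict(int)  # row address -> activation count

    def on_activate(self, row: int) -> bool:
        """Count an ACT command; return True if the controller should back
        off and let the row's neighbors refresh before activating again."""
        self.activations[row] += 1
        return self.activations[row] >= ACT_THRESHOLD

    def on_refresh(self, row: int) -> None:
        """A refresh re-stabilizes the neighborhood; reset the row's counter."""
        self.activations[row] = 0

tracker = PracTracker()
for _ in range(ACT_THRESHOLD):
    must_back_off = tracker.on_activate(row=0x1A2B)
print(must_back_off)  # True: this row risks disturbing its neighbors
```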

Notably here, the JEDEC press release doesn't use the rowhammer name at any point (unfortunately, we haven't been able to see the specification itself). But based on the description alone, this is clearly intended to thwart rowhammer attacks, since these normally operate by forcing a bit flip between refreshes through a large number of activations.

Digging a bit deeper, PRAC seems to be based on a recent Intel patent, Perfect Row Hammer Tracking with Multiple Count Increments (US20220121398A1), which describes a very similar mechanism under the name "Perfect row hammer tracking" (PRHT). Notably, the Intel patent calls out that this technique has a performance cost associated with it, because it increases the overall row cycle time. Ultimately, as the vulnerability underpinning rowhammer is a matter of physics (cell density) rather than logic, it's not too surprising to see that any mitigation of it comes with a cost.

The updated DDR5 specification also deprecates support for Partial Array Self Refresh (PASR) within the standard, citing security concerns. PASR is primarily aimed at power efficiency for mobile memory to begin with, and as a refresh-related technology, it presumably overlaps some with rowhammer, be it as a means to attack memory or as an obstruction to defending against rowhammer. Either way, with mobile devices increasingly moving to low-power optimized LPDDR technologies anyhow, the deprecation of PASR does not immediately look like a major concern for consumer devices.

SK Hynix and TSMC Team Up for HBM4 Development

April 19, 2024 at 15:00

SK hynix and TSMC announced early on Friday that they had signed a memorandum of understanding to collaborate on developing the next-generation HBM4 memory and advanced packaging technology. The initiative is designed to speed up the adoption of HBM4 memory and solidify SK hynix's and TSMC's leading positions in high-bandwidth memory and advanced processor applications.

The primary focus of SK hynix's and TSMC's initial efforts will be to enhance the performance of the HBM4 stack's base die, which (if we put it very simply) acts like an ultra-wide interface between memory devices and host processors. With HBM4, SK hynix plans to use one of TSMC's advanced logic process technologies to build base dies to pack additional features and I/O pins within the confines of existing spatial constraints. 

This collaborative approach also enables SK hynix to customize HBM solutions to satisfy diverse customer performance and energy efficiency requirements. SK hynix has been touting custom HBM solutions for a while, and teaming up with TSMC will undoubtedly help with this.

"TSMC and SK hynix have already established a strong partnership over the years. We've worked together in integrating the most advanced logic and state-of-the art HBM in providing the world's leading AI solutions," said Dr. Kevin Zhang, Senior Vice President of TSMC's Business Development and Overseas Operations Office, and Deputy Co-Chief Operating Officer. "Looking ahead to the next-generation HBM4, we're confident that we will continue to work closely in delivering the best-integrated solutions to unlock new AI innovations for our common customers."

Furthermore, the collaboration extends to optimizing the integration of SK hynix's HBM with TSMC's CoWoS advanced packaging technology. CoWoS is among the most popular specialized 2.5D packaging process technologies for integrating logic chips and stacked HBM into a unified module.

For now, it is expected that HBM4 memory will be integrated with logic processors using direct bonding. However, some of TSMC's customers might prefer to use an ultra-advanced version of CoWoS to integrate HBM4 with their processors.

"We expect a strong partnership with TSMC to help accelerate our efforts for open collaboration with our customers and develop the industry's best-performing HBM4," said Justin Kim, President and the Head of AI Infra at SK hynix. "With this cooperation in place, we will strengthen our market leadership as the total AI memory provider further by beefing up competitiveness in the space of the custom memory platform."

Corsair Enters Workstation Memory Market with WS Series XMP/EXPO DDR5 RDIMMs

April 12, 2024 at 12:30

Corsair has introduced a family of registered memory modules with ECC that are designed for AMD's Ryzen Threadripper 7000 and Intel's Xeon W-2400/3400-series processors. The new Corsair WS DDR5 RDIMMs with AMD EXPO and Intel XMP 3.0 profiles will be available in kits of up to 256 GB capacity and at speeds of up to 6400 MT/s.

Corsair's family of WS DDR5 RDIMMs includes 16 GB modules operating at up to 6400 MT/s with CL32 latency as well as 32 GB modules functioning at 5600 MT/s with CL40 latency. At present, Corsair offers a quad-channel 64 GB kit (4×16 GB, up to 6400 MT/s), a quad-channel 128 GB kit (4×32 GB, 5600 MT/s), an eight-channel 128 GB kit (8×16 GB, 5600 MT/s), and an eight-channel 256 GB kit (8×32 GB, 5600 MT/s); it remains to be seen whether the company will expand the lineup.

Corsair's WS DDR5 RDIMMs are designed for AMD's TRX50 and WRX90 platforms as well as Intel's W790 platform, and are therefore compatible with AMD's Ryzen Threadripper 7000 and Threadripper PRO 7000WX-series as well as Intel's Xeon W-2400/3400-series CPUs. The modules feature both AMD EXPO and Intel XMP 3.0 profiles to easily set their beyond-JEDEC-spec settings, and come with thin heat spreaders made of pyrolytic graphite sheet (PGS), which offers higher thermal conductivity than copper or aluminum of the same thickness. For now, Corsair does not disclose which RCD and memory chips its registered memory modules use.

Unlike many of its rivals among leading DIMM manufacturers, Corsair did not introduce its enthusiast-grade RDIMMs when AMD and Intel released their Ryzen Threadripper and Xeon W-series platforms for extreme workstations last year. It is hard to tell what the reason for that is, but perhaps the company wanted to gain experience working with modules featuring registered clock drivers (RCDs) as well as AMD's and Intel's platforms for extreme workstations.

The result of the delay looks to be quite rewarding: unlike modules from its competitors, which feature either AMD EXPO or Intel XMP 3.0 profiles, Corsair's WS DDR5 RDIMMs come with both. While this may not be important in the DIY market, where people know exactly what they are buying for their platform, it is a great feature for system integrators, who can use Corsair WS DDR5 RDIMMs both for their AMD Ryzen Threadripper and Intel Xeon W-series builds, something that greatly simplifies their inventory management.

Since Corsair's WS DDR5 RDIMMs are aimed at workstations and are tested to offer reliable performance beyond JEDEC specifications, they are quite expensive. The cheapest 64 GB DDR5-5600 CL40 kit costs $450, the fastest 64 GB DDR5-6400 CL32 kit is priced at $460, whereas the highest end 256 GB DDR5-5600 CL40 kit is priced at $1,290.

Report: Impact of Taiwanese Earthquake on DRAM Output to be Negligible in Q2

April 10, 2024 at 22:00

Following the magnitude 7.2 earthquake that struck Taiwan on April 3, 2024, there was immediate concern over what impact this could have on chip production within the country. Even for a well-prepared country like Taiwan, the tremor was the strongest quake to hit the region in 25 years, making it no small matter. But, according to research compiled by TrendForce, the impact on DRAM production will not be significant. The market tracking company believes that the Taiwanese DRAM industry has remained largely unaffected, primarily due to robust earthquake preparedness measures.

There are four memory makers in Taiwan: Micron, the sole member of the "big three" memory manufacturers on the island, runs two fabs, while the smaller players are Nanya (which has one fab), Winbond (which makes specialty memory at one fab), and PSMC (which produces specialty memory at one plant). The study found that these DRAM producers quickly resumed full operations, though they had to throw away some wafers. Overall, the earthquake is estimated to have only a negligible impact on Q2 DRAM production, on the order of 1%, TrendForce claims.

In fact, as Micron ramps up production of DRAM on its 1-alpha and 1-beta process technologies, its memory bit production increases, which will positively affect the supply of commodity DRAM in Q2 2024.

Following the earthquake, there was a temporary halt in quotations for both the contract and spot DRAM markets. However, the spot market quotations have already largely resumed, while contract prices have not fully restarted. Notably, Micron and Samsung ceased issuing quotes for mobile DRAM immediately after the earthquake, with no updates provided as of April 8th. In contrast, SK hynix resumed quotations for smartphone customers on the day of the earthquake and proposed more moderate price adjustments for Q2 mobile DRAM.

TrendForce anticipates a seasonal contract price increase for Q2 mobile DRAM of between 3% and 8%. This moderate increase is partly due to SK hynix's more restrained pricing strategy, which is likely to influence overall pricing strategies across the industry. The earthquake's impact on server DRAM primarily affected Micron's advanced fabrication nodes, potentially leading to a rise in final sale prices for Micron's server DRAM, according to TrendForce. However, the exact direction of future prices remains to be seen.

Meanwhile, DRAM fabs outside of Taiwan were not directly affected by the quake. This includes Micron's HBM production line in Hiroshima, Japan, and Samsung's and SK hynix's HBM lines in South Korea, all of which are apparently operating business as usual.

In general, the DRAM industry has shown resilience in the face of the earthquake, with minimal disruptions and a quick recovery. Abundant inventory levels for DDR4 and DDR5, coupled with weak demand, suggest that any slight price elevations caused by the earthquake should normalize quickly. The only potential outlier here is DDR3, which is nearing the end of its commercial life and whose production is already decreasing.

Samsung Unveils CXL Memory Module Box: Up to 16 TB at 60 GB/s

April 3, 2024 at 12:00

Composable disaggregated data center infrastructure promises to change the way data centers for modern workloads are built. However, to fully realize the potential of new technologies, such as CXL, the industry needs brand-new hardware. Recently, Samsung introduced its CXL Memory Module Box (CMM-B), a device that can house up to eight CXL Memory Module – DRAM (CMM-D) devices and add plenty of memory connected using a PCIe/CXL interface.

Samsung's CXL Memory Module Box (CMM-B) is the first device of this type to accommodate up to eight 2 TB E3.S CMM-D memory modules and add up to 16 TB of memory to up to three modern servers with appropriate connectors. As far as performance is concerned, the box can offer up to 60 GB/s of bandwidth (which aligns with what a PCIe 5.0 x16 interface offers) and 596 ns latency. 

From a pure performance point of view, one CXL Memory Module Box is slower than a dual-channel DDR5-4800 memory subsystem. Yet the unit is still considerably faster than even advanced SSDs. At the same time, it provides very decent capacity, which is often just what the doctor ordered for many applications.
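
As a rough sanity check on that positioning, here is a small Python sketch of the bandwidth arithmetic; the CMM-B and PCIe figures come from the paragraphs above, while the SSD number is an illustrative assumption:

```python
# Rough bandwidth comparison for the positioning described above (our math).
cmm_b = 60.0                        # GB/s, Samsung's quoted CMM-B bandwidth
pcie5_x16 = 16 * 32 / 8             # ~64 GB/s raw for PCIe 5.0 x16 (32 GT/s/lane)
ddr5_dual = 2 * 4800 * 8 / 1000     # 76.8 GB/s for dual-channel DDR5-4800
nvme_ssd = 14.0                     # GB/s, a fast PCIe 5.0 x4 SSD (assumed figure)

print(f"CMM-B: {cmm_b} GB/s vs PCIe 5.0 x16 raw: {pcie5_x16:.0f} GB/s")
print(f"Dual-channel DDR5-4800: {ddr5_dual:.1f} GB/s")  # faster than CMM-B
print(f"Fast NVMe SSD: ~{nvme_ssd} GB/s")               # far slower than CMM-B
```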

The Samsung CMM-B is compatible with the CXL 1.1 and CXL 2.0 protocols. It consists of a rack-scale memory bank (CMM-B), several application hosts, Samsung Cognos management console software, and a top-of-rack (ToR) switch. The device was developed in close collaboration with Supermicro, so expect this server maker to offer the product first.

Samsung's CXL Memory Module Box is designed for applications that need a lot of memory, such as AI, data analytics, and in-memory databases, albeit not at all times. CMM-B allows the necessary memory to be dynamically allocated to a system when it needs it, and then lets that DRAM be used by other machines afterwards. As a result, datacenter operators can save money on procuring expensive memory (16 TB of memory costs a lot), reduce power consumption, and add flexibility to their setups.

HBM Revenue Poised To Cross $10B as SK hynix Predicts First Double-Digit Revenue Share

March 28, 2024 at 12:00

Offering some rare insight into the scale of HBM memory sales – and into their growth in the face of unprecedented demand from AI accelerator vendors – SK hynix recently disclosed that it expects HBM sales to make up "a double-digit percentage of its DRAM chip sales" this year. Which, if it comes to pass, would represent a significant jump in sales for the high-bandwidth, high-priced memory.

As first reported by Reuters, SK hynix CEO Kwak Noh-Jung has commented that he expects HBM sales to constitute a double-digit percentage of the company's DRAM chip sales in 2024. This prediction lines up with estimates from TrendForce, which believes that, industry-wide, HBM will account for 20.1% of DRAM revenue in 2024, more than doubling HBM's 8.4% revenue share in 2023.

And while SK hynix does not break down its DRAM revenue by memory type on a regular basis, a bit of extrapolation indicates that the company is on track to take in billions in HBM revenue for 2024 – having likely already crossed the billion-dollar mark in 2023. Last year, SK hynix's DRAM revenue totaled $15.941 billion, according to Statista and TrendForce. So SK hynix only needs 12.5% of its 2024 revenue to come from HBM (assuming flat or positive revenue overall) in order to pass $2 billion in HBM sales. And even this is a low-ball estimate.

Overall, SK hynix currently commands about 50% of the HBM market, having largely split the market with Samsung over the last couple of years. Given that share, and that DRAM industry revenue is expected to increase to $84.150 billion in 2024, SK hynix could earn as much as $8.45 billion on HBM in 2024 if TrendForce's estimates prove accurate.
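
The extrapolation in the last two paragraphs is easy to reproduce; here is a short Python sketch using only the figures cited above (everything else is our arithmetic):

```python
# Reproducing the revenue extrapolation from the figures cited above.
sk_dram_2023 = 15.941        # $B, SK hynix 2023 DRAM revenue (Statista/TrendForce)
hbm_share_2024 = 0.201       # TrendForce: HBM share of industry DRAM revenue, 2024
dram_industry_2024 = 84.150  # $B, expected industry DRAM revenue in 2024
sk_hbm_market_share = 0.50   # SK hynix's approximate share of the HBM market

# If 12.5% of SK hynix's 2024 DRAM revenue (assumed flat vs. 2023) is HBM:
print(sk_dram_2023 * 0.125)                 # ~1.99 -> roughly $2B in HBM sales

# Industry-wide HBM revenue, and SK hynix's slice at ~50% share:
hbm_industry = dram_industry_2024 * hbm_share_2024
print(hbm_industry)                          # ~16.9 ($B)
print(hbm_industry * sk_hbm_market_share)    # ~8.45 ($B)
```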

It should be noted that with demand for AI servers at record levels, all three leading makers of DRAM are poised to increase their HBM production capacity this year. Most notable here is a nearly-absent Micron, who was the first vendor to start shipping HBM3E memory to NVIDIA earlier this year. So SK hynix's near-majority of the HBM market may falter some this year, though with a growing pie they'll have little reason to complain. Ultimately, if sales of HBM reach $16.9 billion as projected, then all memory makers will be enjoying significant HBM revenue growth in the coming months.

Sources: Reuters, TrendForce

GDDR7 Approaches: Samsung Lists GDDR7 Memory Chips on Its Product Catalog

March 27, 2024 at 19:00

Now that JEDEC has published the specification for GDDR7 memory, memory manufacturers are beginning to announce their initial products. The first out of the gate for this generation is Samsung, which has quietly added its GDDR7 products to its official product catalog.

For now, Samsung lists two GDDR7 devices on its website: 16 Gbit chips rated for a data transfer rate of up to 28 GT/s, and a faster version running at up to 32 GT/s (which is in line with the initial parts that Samsung announced in mid-2023). The chips feature a 512M x32 organization and come in a 266-pin FBGA package. The chips are already sampling, so Samsung's customers – GPU vendors, AI inference vendors, network product vendors, and the like – should already have GDDR7 chips in their labs.

The GDDR7 specification promises a maximum per-chip capacity of 64 Gbit (8 GB) and data transfer rates of up to 48 GT/s. Meanwhile, first-generation GDDR7 chips (as announced so far) will feature a rather moderate capacity of 16 Gbit (2 GB) and a data transfer rate of up to 32 GT/s.
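
To put those chip-level numbers in graphics-card terms, here is a small back-of-the-envelope Python sketch; the card configurations are purely illustrative assumptions, not announced products:

```python
# Back-of-the-envelope GDDR7 card math from the chip specs above.
# Each x32 chip contributes 32 bits of bus width; the bus widths below are
# hypothetical card configurations chosen only for illustration.
def card(bus_width_bits: int, data_rate_gts: float, chip_gbit: int = 16):
    chips = bus_width_bits // 32                     # one x32 device per 32 bits
    capacity_gb = chips * chip_gbit / 8              # Gbit -> GB
    bandwidth_gbs = bus_width_bits * data_rate_gts / 8
    return chips, capacity_gb, bandwidth_gbs

print(card(256, 32.0))   # (8, 16.0, 1024.0): 16 GB at ~1 TB/s
print(card(128, 28.0))   # (4, 8.0, 448.0): 8 GB at 448 GB/s
```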

Performance-wise, the first generation of GDDR7 should provide a significant improvement in memory bandwidth over GDDR6 and GDDR6X. However, capacity/density improvements will not come until memory manufacturers move to their next-generation EUV-based process nodes. As a result, the first GDDR7-based graphics cards are unlikely to sport any memory capacity improvements. Though looking a bit farther down the road, Samsung and SK Hynix have previously told Tom's Hardware that they intend to reach mass production of 24 Gbit GDDR7 chips in 2025.

Otherwise, it is noteworthy that SK Hynix also demonstrated its GDDR7 chips at NVIDIA's GTC last week. So Samsung's competition should be close behind in delivering samples, and eventually mass production memory.

Source: Samsung (via @harukaze5719)

Report: SK Hynix Mulls Building $4 Billion Advanced Packaging Facility in Indiana

March 26, 2024 at 23:00

SK hynix is considering whether to build an advanced packaging facility in Indiana, reports the Wall Street Journal. If the company proceeds with the plan, it intends to invest $4 billion in it and construct one of the world's largest advanced packaging facilities. But to accomplish the project, SK hynix expects it will need help from the U.S. government.

Acknowledging the report but stopping short of confirming the company's plans, a company spokeswoman told the WSJ that SK hynix "is reviewing its advanced chip packaging investment in the U.S., but hasn’t made a final decision yet."

Companies like TSMC and Intel spend billions on advanced packaging facilities, but so far, no company has announced a chip packaging plant worth quite as much as SK hynix's $4 billion. The field of advanced packaging – CoWoS, passive silicon interposers, redistribution layers, die-to-die bonding, and other cutting-edge technologies – has seen an explosion in demand in the last half-decade. As bandwidth advances with traditional organic packaging are largely played out, chip designers have needed to turn to more complex (and difficult to assemble) technologies in order to wire up an ever-larger number of signals at ever-higher transfer rates. Which has turned advanced packaging into a bottleneck for high-end chip and accelerator production, driving a need for additional packaging facilities.

If SK hynix approves the project, the advanced packaging facility is expected to begin operations in 2028 and could create as many as 1,000 jobs. With an estimated cost of $4 billion, the plant is poised to become one of the largest advanced packaging facilities in the world.

Meanwhile, government backing is thought to be essential for investments of this scale, with potential state and federal tax incentives, according to the report. These incentives form part of a broader initiative to bolster the U.S. semiconductor industry and decrease dependence on memory produced in South Korea.

SK hynix is the world's leading producer of HBM memory, and is one of the key HBM suppliers to NVIDIA. Next generations of HBM memory (including HBM4 and HBM4E) will require even closer collaboration between chip designers, chipmakers, and memory makers. Therefore, packaging HBM in America could be a significant benefit for NVIDIA, AMD, and other U.S. chipmakers.

Investing in the Indiana facility would be a strategic move by SK hynix to enhance its advanced chip packaging capabilities in general, and to demonstrate its dedication to the U.S. semiconductor industry.

Construction of $106B SK hynix Mega Fab Site Moving Along, But At Slower Pace

March 23, 2024 at 12:00

When a major industry slowdown occurs, big companies tend to slow down their mid-term and long-term capacity-related investments. This is exactly what happened to SK hynix's Yongin Semiconductor Cluster, a major project announced in April 2021 and valued at $106 billion. While development of the site itself has been largely completed, only 35% of the initial shell building has been constructed, according to the Korean Ministry of Trade, Industry, and Energy.

"Approximately 35% of Fab 1 has been completed so far and site renovation is in smooth progress," a statement by the Korean Ministry of Trade, Industry, and Energy reads. "By 2046, over KRW 120 trillion ($90 billion today, $106 billion in 2021) in investment will be poured to complete Fabs 1 through 4, and construction of Fab 1's production line will commence in March next year. Once completed, the infrastructure will rank as the world's largest three-story fab."

The new semiconductor fabrication cluster, announced by SK hynix almost exactly three years ago, is primarily meant to make DRAM for PCs, mobile devices, and servers using advanced extreme ultraviolet (EUV) lithography process technologies. The cluster, located near Yongin, South Korea, is intended to consist of four large fabs situated on a 4.15 million m² site. With a planned capacity of approximately 800,000 wafer starts per month (WSPM), it is set to be one of the world's largest semiconductor production hubs.

With that said, SK hynix's construction progress has been slower than the company first projected. The first fab in the complex was originally meant to come online in 2025, with construction starting in the fourth quarter of 2021. However, SK hynix began to cut its capital expenditures in the second half of 2022, and the Yongin Semiconductor Cluster project fell victim to that cut. To be sure, the site continues to be developed, just at a slower pace, which is why only some 35% of the first fab's shell has been built at this point.

If completed as planned in 2021, the first phase of SK hynix Yongin operations would have been a major memory production facility costing $25 billion, equipped with EUV tools, and capable of 200,000-WSPM, according to reports from 2021.

Sources: Korean Ministry of Trade, Industry, and Energy; ComputerBase

Micron Samples 256 GB DDR5-8800 MCR DIMMs: Massive Modules for Massive Servers

March 22, 2024 at 20:00

Micron this week announced that it had begun sampling its 256 GB multiplexer combined ranks (MCR) DIMMs, the company's highest-capacity memory modules to date. These brand-new DDR5-based MCRDIMMs are aimed at next-generation servers, particularly those powered by Intel's Xeon Scalable 'Granite Rapids' processors, which are set to support 12 or 24 memory slots per socket. Using these modules can enable datacenter machines with 3 TB or 6 TB of memory, with the combined ranks allowing for effective data rates of DDR5-8800.

"We also started sampling our 256 GB MCRDIMM module, which further enhances performance and increases DRAM content per server," said Sanjay Mehrotra, chief executive of Micron, in prepared remarks for the company's earnings call this week.

In addition to announcing sampling of these modules, Micron also demonstrated them at NVIDIA's GTC conference, where server vendors and customers alike are abuzz at building new servers for the next generation of AI accelerators. Our colleagues from Tom's Hardware have managed to grab a couple of pictures of Micron's 256 GB DDR5-8800 MCR DIMMs.


Image Credit: Tom's Hardware

Apparently, Micron's 256 GB DDR5-8800 MCRDIMMs come in two variants: a taller module with 80 DRAM chips distributed on both sides, and a standard-height module using 2Hi stacked packages. Both are based on monolithic 32 Gb DDR5 ICs and are engineered to cater to different server configurations, with the standard-height MCRDIMM addressing 1U servers. The taller version consumes about 20 W of power, which is in line with expectations, as a 128 GB DDR5-8000 RDIMM consumes around 10 W in DDR5-4800 mode. We have no idea about the power consumption of the version that uses 2Hi packages, though we'd expect it to run a little hotter and be harder to cool.


Image Credit: Tom's Hardware

Multiplexer Combined Ranks (MCR) DIMMs are dual-rank memory modules featuring a specialized buffer that allows both ranks to operate simultaneously. The buffer enables the two physical ranks to operate as though they were separate modules working in parallel, which allows for concurrent retrieval of 128 bytes of data from both ranks per clock cycle (compared to 64 bytes per cycle for regular memory modules), effectively doubling the performance of a single module. Of course, since the module retains the physical interface of standard DDR5 modules (i.e., 72 bits), the buffer works with the host at a very high data transfer rate to pass that fetched data to the host CPU. These speeds exceed the standard DDR5 specifications, reaching 8800 MT/s in this case.
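
In bandwidth terms, the scheme described above works out as in the following Python sketch; the per-rank data rate is our inference from the 8800 MT/s host-side figure, not a Micron-confirmed number:

```python
# Bandwidth arithmetic for an MCR DIMM as described above.
# Assumption: each physical rank runs at half the host-side rate, and the
# buffer multiplexes both ranks so that the host interface runs at the
# combined 8800 MT/s.
host_rate_mts = 8800
rank_rate_mts = host_rate_mts // 2   # 4400 MT/s per rank (our inference)
data_width_bytes = 8                 # 64 data bits of the 72-bit interface

per_rank = rank_rate_mts * data_width_bytes / 1000   # 35.2 GB/s per rank
combined = host_rate_mts * data_width_bytes / 1000   # 70.4 GB/s to the host
print(per_rank, combined)
# Per access: both ranks return 64 bytes concurrently -> 128 bytes fetched,
# versus 64 bytes for a conventional RDIMM.
```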

While MCR DIMMs are slightly more complex than regular RDIMMs, they increase the performance and capacity of the memory subsystem without increasing the number of memory modules involved, which makes it easier to build server motherboards. These modules are poised to play a crucial role in enabling the next generation of servers to handle increasingly demanding applications, particularly in the AI field.

Sources: Tom's Hardware, Micron

Micron Sells Out Entire HBM3E Supply for 2024, Most of 2025

March 22, 2024 at 15:00

Being the first company to ship HBM3E memory has its perks for Micron: the company has revealed that it has managed to sell out its entire supply of the advanced high-bandwidth memory for 2024, while most of its 2025 production has been allocated as well. Micron's HBM3E memory (or, as Micron alternatively calls it, HBM3 Gen2) was one of the first to be qualified for NVIDIA's updated H200/GH200 accelerators, so it looks like the DRAM maker will be a key supplier to the green company.

"Our HBM is sold out for calendar 2024, and the overwhelming majority of our 2025 supply has already been allocated," said Sanjay Mehrotra, chief executive of Micron, in prepared remarks for the company's earnings call this week. "We continue to expect HBM bit share equivalent to our overall DRAM bit share sometime in calendar 2025."

Micron's first HBM3E product is an 8-Hi 24 GB stack with a 1024-bit interface, 9.2 GT/s data transfer rate, and a total bandwidth of 1.2 TB/s. NVIDIA's H200 accelerator for artificial intelligence and high-performance computing will use six of these cubes, providing a total of 141 GB of accessible high-bandwidth memory.
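
Those headline figures are easy to verify with a couple of lines of arithmetic; a minimal Python sketch of our own, using only the numbers quoted above:

```python
# HBM3E stack bandwidth and H200 capacity totals from the figures above.
interface_bits = 1024    # per-stack interface width
rate_gts = 9.2           # data transfer rate per pin
stack_gb = 24            # capacity of one 8-Hi stack
stacks = 6               # cubes per H200 accelerator

stack_bw_tbs = interface_bits * rate_gts / 8 / 1000   # ~1.18 TB/s per stack
total_raw_gb = stacks * stack_gb                      # 144 GB raw capacity
print(f"{stack_bw_tbs:.2f} TB/s per stack")           # rounds to the ~1.2 TB/s quoted
print(f"{total_raw_gb} GB raw; NVIDIA quotes 141 GB accessible on H200")
```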

"We are on track to generate several hundred million dollars of revenue from HBM in fiscal 2024 and expect HBM revenues to be accretive to our DRAM and overall gross margins starting in the fiscal third quarter," said Mehrotra.

The company has also begun sampling its 12-Hi 36 GB stacks, which offer 50% more capacity. These KGSDs will ramp in 2025 and will be used for the next generations of AI products. Meanwhile, it does not look like NVIDIA's B100 and B200 are going to use 36 GB HBM3E stacks, at least initially.

Demand for artificial intelligence servers set records last year, and it looks like it is going to remain high this year as well. Some analysts believe that NVIDIA's A100 and H100 processors (as well as their various derivatives) commanded as much as 80% of the entire AI processor market in 2023. And while this year NVIDIA will face tougher competition from AMD, AWS, D-Matrix, Intel, Tenstorrent, and other companies on the inference front, it looks like NVIDIA's H200 will still be the processor of choice for AI training, especially for big players like Meta and Microsoft, who already run fleets consisting of hundreds of thousands of NVIDIA accelerators. With that in mind, being a primary supplier of HBM3E for NVIDIA's H200 is a big deal for Micron as it enables it to finally capture a sizeable chunk of the HBM market, which is currently dominated by SK Hynix and Samsung, and where Micron controlled only about 10% as of last year.

Meanwhile, since every DRAM device inside an HBM stack has a wide interface, it is physically bigger than regular DDR4 or DDR5 ICs. As a result, the ramp of HBM3E memory will affect bit supply of commodity DRAMs from Micron, the company said.

"The ramp of HBM production will constrain supply growth in non-HBM products," Mehrotra said. "Industrywide, HBM3E consumes approximately three times the wafer supply as DDR5 to produce a given number of bits in the same technology node."

SK Hynix Mulls 'Differentiated' HBM Memory Amid AI Frenzy

March 1, 2024 at 19:30

SK Hynix and AMD were at the forefront of the memory industry with the first generation of high bandwidth memory (HBM) back in 2013 – 2015, and SK Hynix is still leading this market in terms of share. In a bid to maintain and grow its position, SK Hynix has to adapt to the requirements of its customers, particularly in the AI space, and to do so it's mulling over how to make 'differentiated' HBM products for large customers.

"Developing customer-specific AI memory requires a new approach as the flexibility and scalability of the technology becomes critical," said Hoyoung Son, the head of Advanced Package Development at SK Hynix in the status of a vice president

When it comes to performance, HBM memory with a 1024-bit interface has been evolving fairly fast: it started with a data transfer rate of 1 GT/s in 2014 – 2015 and reached upwards of 9.2 GT/s – 10 GT/s with the recently introduced HBM3E memory devices. With HBM4, the memory is set to transition to a 2048-bit interface, which will ensure steady bandwidth improvement over HBM3E.

But there are customers which may benefit from differentiated (or semi-custom) HBM-based solutions, according to the vice president.

"For implementing diverse AI, the characteristics of AI memory also need to become more varied," Hoyoung Son said in an interview with BusinessKorea. "Our goal is to have a variety of advanced packaging technologies capable of responding to these changes. We plan to provide differentiated solutions that can meet any customer needs."

With a 2048-bit interface, many (if not the vast majority) of HBM4 solutions will likely be custom or at least semi-custom, based on what we know from official and unofficial information about the upcoming standard. Some customers might want to keep using interposers (though this time they are going to get very expensive), while others will prefer to install HBM4 modules directly on logic dies using direct bonding techniques, which are also expensive.

Making differentiated HBM offerings requires sophisticated packaging techniques, including (but certainly not limited to) SK Hynix's Advanced Mass Reflow Molded Underfill (advanced MR-MUF) technology. Given the company's vast experience with HBM, it may well come up with something else, especially for differentiated offerings.

"For different types of AI to be realized, the characteristics of AI memory also need to be more diverse," the VP said. "Our goal is to have a range of advanced packaging technologies to respond to the shifting technological landscape. Looking ahead, we plan to provide differentiated solutions to meet all customer needs."

Sources: BusinessKorea, SK Hynix

Samsung Launches 12-Hi 36GB HBM3E Memory Stacks with 10 GT/s Speed

February 27, 2024 at 12:00

Samsung announced late on Monday the completion of the development of its 12-Hi 36 GB HBM3E memory stacks, just hours after Micron said it had kicked off mass production of its 8-Hi 24 GB HBM3E memory products. The new memory packages, codenamed Shinebolt, increase peak bandwidth and capacity compared to their predecessors, codenamed Icebolt, by over 50% and are currently the world's fastest memory devices.

As the description suggests, Samsung's Shinebolt 12-Hi 36 GB HBM3E stacks pack twelve 24 Gb memory devices on top of a logic die featuring a 1024-bit interface. The new 36 GB HBM3E memory modules feature a data transfer rate of 10 GT/s and thus offer a peak bandwidth of 1.28 TB/s per stack, the industry's highest per-device (or rather, per-module) memory bandwidth.

Meanwhile, keep in mind that developers of HBM-supporting processors tend to be cautious, so they will likely run Samsung's HBM3E at lower data transfer rates, partly because of power consumption and partly to ensure ultimate stability for artificial intelligence (AI) and high-performance computing (HPC) applications.

Samsung HBM Memory Generations

                              HBM3E        HBM3        HBM2E        HBM2
                              (Shinebolt)  (Icebolt)   (Flashbolt)  (Aquabolt)
Max Capacity                  36 GB        24 GB       16 GB        8 GB
Max Bandwidth Per Pin         9.8 Gb/s     6.4 Gb/s    3.6 Gb/s     2.0 Gb/s
Number of DRAM ICs per Stack  12           12          8            8
Effective Bus Width           1024-bit     1024-bit    1024-bit     1024-bit
Voltage                       ?            1.1 V       1.2 V        1.2 V
Bandwidth per Stack           1.225 TB/s   819.2 GB/s  460.8 GB/s   256 GB/s

To make its Shinebolt 12-Hi 36 GB HBM3E memory stacks, Samsung had to use several advanced technologies. First, the 36 GB HBM3E memory products are based on memory devices made on Samsung's 4th-generation 10nm-class (14 nm) fabrication technology, which uses extreme ultraviolet (EUV) lithography.

Second, to ensure that 12-Hi HBM3E stacks have the same z-height as 8-Hi HBM3 products, Samsung used its advanced thermal compression non-conductive film (TC NCF), which allowed it to achieve the industry's smallest gap between memory devices at seven micrometers (7 µm). By shrinking the gaps between DRAMs, Samsung increases vertical density and mitigates die warping. Furthermore, Samsung uses bumps of various sizes between the DRAM ICs: smaller bumps are used in areas for signaling, while larger ones are placed in spots that require heat dissipation, which improves thermal management.

Samsung estimates that its 12-Hi HBM3E 36 GB modules can increase the average speed of AI training by 34% and expand the number of simultaneous users of inference services by more than 11.5 times, though the company has not elaborated on the size of the LLM behind these estimates.

Samsung has already begun providing samples of the HBM3E 12H to customers, with mass production scheduled to commence in the first half of this year.

Source: Samsung

Micron Kicks Off Production of HBM3E Memory

February 26, 2024 at 17:00

Micron Technology on Monday said that it had initiated volume production of its HBM3E memory. The company's HBM3E known good stack dies (KGSDs) will be used for Nvidia's H200 compute GPU for artificial intelligence (AI) and high-performance computing (HPC) applications, which will ship in the second quarter of 2024.

Micron has announced that it is mass-producing 24 GB 8-Hi HBM3E devices with a data transfer rate of 9.2 GT/s and a peak memory bandwidth of over 1.2 TB/s per device. Compared to HBM3, HBM3E increases the data transfer rate and peak memory bandwidth by a whopping 44%, which is particularly important for bandwidth-hungry processors like Nvidia's H200.

Nvidia's H200 product relies on the Hopper architecture and offers the same computing performance as the H100. Meanwhile, it is equipped with 141 GB of HBM3E memory featuring bandwidth of up to 4.8 TB/s, a significant upgrade from 80 GB of HBM3 and up to 3.35 TB/s bandwidth in the case of the H100.

Micron's memory roadmap for AI is further solidified with the upcoming release of a 36 GB 12-Hi HBM3E product in March 2024. Meanwhile, it remains to be seen where those devices will be used.

Micron uses its 1β (1-beta) process technology to produce its HBM3E, which is a significant achievement for the company, as it is deploying its latest production node for data center-grade products, a testament to the maturity of the manufacturing technology.

Starting mass production of HBM3E memory ahead of competitors SK Hynix and Samsung is a significant achievement for Micron, which currently holds a 10% market share in the HBM sector. This move is crucial for the company, as it allows Micron to introduce a premium product earlier than its rivals, potentially increasing its revenue and profit margins while gaining a larger market share.

"Micron is delivering a trifecta with this HBM3E milestone: time-to-market leadership, best-in-class industry performance, and a differentiated power efficiency profile," said Sumit Sadana, executive vice president and chief business officer at Micron Technology. "AI workloads are heavily reliant on memory bandwidth and capacity, and Micron is very well-positioned to support the significant AI growth ahead through our industry-leading HBM3E and HBM4 roadmap, as well as our full portfolio of DRAM and NAND solutions for AI applications."

Source: Micron
