Samsung Demos In-Memory Processing

October 12, 2021 - Author: Bob Wheeler


As High Bandwidth Memory (HBM) proliferates in high-performance computing, Samsung is turning the tables by demonstrating a processor-in-memory (PIM) version of its popular Aquabolt HBM2. At Hot Chips, it disclosed HBM-PIM architecture details as well as early performance results using an undisclosed GPU and a Xilinx Alveo card, both populated with HBM-PIM stacks dubbed Aquabolt-XL. It also presented simulations for a conceptual LPDDR5-PIM, which would address low-power applications.

The standard Aquabolt module comprises eight DRAM die stacked on a buffer die. To create Aquabolt-XL, Samsung replaced the bottom four die with special PIM-DRAM die, leaving the rest of the stack unmodified. The PIM-DRAM inserts 32 processing units between even/odd bank pairs, enabling bank-level parallelism. Replacing only half the die reduces risk and allows performance comparisons between HBM and HBM-PIM in a single stack. Aquabolt-XL preserves HBM2 timing, so it can drop into existing designs.

To evaluate Aquabolt-XL’s performance benefits, Samsung ran BLAS microbenchmarks as well as full neural-network models, using GPUs with standard HBM2 as a baseline. Unsurprisingly, workloads with high last-level-cache miss rates benefited most from moving operations to PIM. Samsung also worked with Xilinx to build a prototype Virtex UltraScale+ FPGA with two Aquabolt-XL stacks. The results show that PIM is no panacea, but it can greatly accelerate memory-bound workloads. Samsung isn’t the first to demonstrate PIM-DRAM, but as the leading DRAM vendor, its market influence is undeniable.
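
The distinction between memory-bound and compute-bound workloads can be illustrated with a rough arithmetic-intensity estimate. The sketch below is not Samsung's methodology; it assumes an idealized model in which each operand crosses the memory interface exactly once, and the function names and the peak-throughput and bandwidth figures are hypothetical placeholders rather than Aquabolt-XL or GPU specifications.

```python
# Rough arithmetic-intensity model (FLOPs per byte of DRAM traffic).
# Assumption: every operand is read or written exactly once, i.e. perfect caching.

def gemv_intensity(m, n, bytes_per_elem=2):
    """y = A @ x with A of shape (m, n); fp16 elements by default."""
    flops = 2 * m * n                               # one multiply and one add per matrix element
    bytes_moved = bytes_per_elem * (m * n + n + m)  # read A and x, write y
    return flops / bytes_moved

def gemm_intensity(m, n, k, bytes_per_elem=2):
    """C = A @ B with A of shape (m, k) and B of shape (k, n)."""
    flops = 2 * m * n * k
    bytes_moved = bytes_per_elem * (m * k + k * n + m * n)  # read A and B, write C
    return flops / bytes_moved

def is_memory_bound(intensity, peak_flops, peak_bandwidth):
    """A kernel is memory-bound when its intensity falls below the machine's ridge point."""
    return intensity < peak_flops / peak_bandwidth

# Hypothetical accelerator: 100 TFLOP/s peak compute, 1 TB/s memory bandwidth.
PEAK_FLOPS = 100e12
PEAK_BANDWIDTH = 1e12

gemv = gemv_intensity(4096, 4096)
gemm = gemm_intensity(4096, 4096, 4096)

print(f"GEMV intensity: {gemv:.2f} FLOP/byte, memory-bound: "
      f"{is_memory_bound(gemv, PEAK_FLOPS, PEAK_BANDWIDTH)}")
print(f"GEMM intensity: {gemm:.2f} FLOP/byte, memory-bound: "
      f"{is_memory_bound(gemm, PEAK_FLOPS, PEAK_BANDWIDTH)}")
```

Under these assumptions, a matrix-vector product performs about one operation per byte fetched and sits far below the ridge point, while a large matrix-matrix product reuses each operand many times and does not. Kernels of the first kind are the ones that stand to gain from computing next to the DRAM banks.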

Subscribers can view the full article in the Microprocessor Report.

