D-Matrix Transforms SRAMs for AI
The AI-processor startup has emerged from stealth mode with a chiplet-based inference engine that employs digital in-memory computing (IMC) to run Transformer models such as BERT and GPT-3.
D-Matrix is an AI-processor startup developing chiplets that employ digital in-memory computing (IMC) to accelerate inference on Transformer-based models. Its first chiplet, Nighthawk, is just a proof of concept (PoC), but the company aims to scale the architecture for data-center inference on PCIe accelerator cards and in servers.
Nighthawk is a tiny 10mm² die built in 6nm TSMC technology that integrates four copies of the base neural-network accelerator, which D-Matrix calls a slice, along with a SiFive S76 CPU to control the chiplet. Each slice includes two of the company's Apollo compute cores, a global memory (GM) for storing input and output data, and a NoC interface that connects with the chiplet's other units. The Apollo core has eight digital-IMC units that operate in parallel, each executing 64×64 matrix multiplications using a variety of floating-point and integer formats. Each core can execute 2,048 INT8 multiply-accumulate (MAC) operations per cycle, yielding 33 trillion operations per second (TOPS) for the entire chiplet running at 1GHz.
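The quoted 33 TOPS follows directly from the per-core MAC rate. A quick back-of-the-envelope check, using only the figures in the article (the constant names are illustrative, and each MAC is counted as two operations, one multiply plus one add, per the usual TOPS convention):

```python
# Sanity-check Nighthawk's quoted throughput from the article's figures.
SLICES_PER_CHIPLET = 4
CORES_PER_SLICE = 2
MACS_PER_CORE_PER_CYCLE = 2048   # INT8 multiply-accumulates per core
OPS_PER_MAC = 2                  # one multiply + one add per MAC
CLOCK_HZ = 1e9                   # 1GHz

tops = (SLICES_PER_CHIPLET * CORES_PER_SLICE *
        MACS_PER_CORE_PER_CYCLE * OPS_PER_MAC * CLOCK_HZ) / 1e12
print(f"{tops:.3f} TOPS")  # 32.768, which the article rounds to 33
```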
D-Matrix is demonstrating Nighthawk to customers, but that chiplet won't enter production. In 2H22, the company plans to add a second chiplet, called Jayhawk, to its demonstration platform; it will implement die-to-die interfaces based on the Open Domain-Specific Architecture (ODSA) standard. For the complete product, however, customers must wait until 2H23 for the Corsair chiplet. That device will integrate 512 IMC units, eight times as many as Nighthawk, boosting performance to more than 200 TOPS. Also, by year-end, D-Matrix will selectively license its design as an intellectual-property (IP) macro to customers that help kick-start its software ecosystem.
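The Corsair projection is consistent with a straight linear scaling of the IMC count. A minimal sketch, assuming Corsair keeps Nighthawk's 1GHz clock and per-IMC throughput (neither is stated in the article, so the resulting figure is an estimate, not a disclosed spec):

```python
# Nighthawk: 4 slices x 2 cores x 8 IMC units per core = 64 IMC units.
NIGHTHAWK_IMCS = 4 * 2 * 8
NIGHTHAWK_TOPS = 32.768          # derived from the per-core MAC rate
CORSAIR_IMCS = 512               # per the article

scale = CORSAIR_IMCS / NIGHTHAWK_IMCS        # the article's "eight times"
corsair_tops = NIGHTHAWK_TOPS * scale
print(f"{scale:.0f}x -> {corsair_tops:.0f} TOPS")  # 8x -> 262 TOPS (>200)
```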