Intel Gaudi2 Nears Nvidia H100 on LLMs

Industry leaders duke it out on the LLM test added to the latest version of the MLPerf data-center inference benchmark while also pursuing efficiency gains.
Joseph Byrne

Intel’s Gaudi2 AI accelerator nearly matches the performance of Nvidia’s H100 (Hopper) on the large language model test added to the newest MLPerf inference benchmark. It’s the closest any company has come to toppling Nvidia’s position on the performance leaderboard on any test.

Nvidia is improving as well; the company released the first MLPerf inference data for its Grace Hopper module, showing it speeds up AI performance by 10%. Google previewed TPUv5e scores for the first time, revealing its decision to focus on efficiency instead of per-chip performance. Published by MLCommons, v3.1 of the semiannual MLPerf inference results also includes updated scores for Qualcomm’s inference accelerator and Intel’s most recent Xeon processors (Sapphire Rapids) running with no external offload.

Like the most recent MLPerf training benchmark, the new data-center inference suite updates the recommendation model to DLRMv2 and adds a large language model (LLM) test. The changes to the recommender mirror those made on the training side. The LLM inference test, however, is unique: it runs GPT-J 6B to perform text summarization.

Subscribers can view the full article in the TechInsights Platform.

