Hopper Leads MLPerf Training Scores
The newest MLPerf training results include scores for Nvidia’s Hopper H100. Depending on the subtest, the H100 is 7–160% faster than its predecessor. No other AI processor is as fast.
Having already topped the MLPerf Inference benchmarks, Nvidia’s Hopper H100 has jumped to the head of the MLPerf Training leaderboard. The GPU is only slightly faster than the vendor’s Ampere A100 on the MiniGo test but is on average twice as fast on the remaining seven workloads. No competing chip has better results.
Developed by the MLCommons consortium, the open-source MLPerf Training suite represents a spectrum of applications. Submissions come from the community—typically, hardware vendors and cloud-service providers—and can include results from a subset of the benchmark tests. November’s v2.1 results are the second ones using the Training 2.0 suite (see MPR Jul 2022, “Gaudi2 Makes Impressive MLPerf Debut”).
The MLPerf suite avoids massive models, such as GPT-3 and Switch-XXL, that AI-processor clusters typically handle. Such models would highlight improvements in system-level interconnects, which are often the performance bottleneck rather than the raw computational resources.
In addition to the H100 results, the November v2.1 release includes updated data for Gaudi2 from Intel subsidiary Habana (see MPR Jun 2022, “Habana Gaudi2 Triples Performance”) and the first submission for Intel’s Sapphire Rapids server processor. MLCommons categorized both the H100 and Sapphire Rapids results as previews because they’re not based on generally available hardware.