NVIDIA Mellanox MQM8790-HS2F InfiniBand Switch in Production | Optimizing Low‑Latency Interconnect for RDMA/HPC/AI

May 27, 2026


As large language model training and exascale HPC simulations push GPU clusters toward tens of thousands of nodes, traditional Ethernet fabrics struggle with tail latency and incast congestion. A national AI computing center recently tackled this challenge by deploying NVIDIA Mellanox MQM8790-HS2F InfiniBand switches as the backbone of its 800‑node GPU expansion. This article walks through that deployment, from the original bottlenecks to the measured gains, and shows how RDMA and in‑network computing were used to optimize cluster interconnect performance.

Background & Challenge: When Network Becomes the AI Bottleneck

The center’s legacy 400‑node cluster ran on 100Gb/s RoCEv2 Ethernet. As workloads shifted from CNN models to trillion‑parameter LLMs, cross‑node communication latency rose sharply: during All‑Reduce operations, network wait time consumed over 40% of total iteration time. Architects needed a platform delivering sub‑microsecond latency, lossless flow control, and native RDMA support, all while reusing existing QSFP56 optics. After evaluating multiple alternatives, they selected the MQM8790-HS2F InfiniBand switch for its 200Gb/s HDR bandwidth and 40‑port high‑density design.

Solution & Deployment: HDR Fat‑Tree Built on MQM8790-HS2F

The new interconnect adopts a two‑layer fat‑tree topology, deploying 24 units of the NVIDIA Mellanox MQM8790-HS2F across core and leaf layers. Each switch provides 40 QSFP56 ports running at 200Gb/s per direction, for an aggregate (bidirectional) non‑blocking switching capacity of 16Tb/s. Engineers followed the MQM8790-HS2F datasheet and specifications to enable adaptive routing and advanced congestion control. Every GPU node connects via HDR ConnectX‑6 adapters, using native InfiniBand RDMA for zero‑copy data transfers that remove over 95% of CPU involvement in communication.
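The port and capacity figures above can be checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes a textbook two‑tier non‑blocking fat‑tree with one HDR link per host. It does not attempt to reproduce the center's actual 24‑switch layout, which the article does not detail.

```python
# Back-of-the-envelope figures for 40-port HDR switches.
# Port count and line rate come from the article; the fat-tree formula
# is the generic two-tier non-blocking case, not the center's wiring.

PORTS_PER_SWITCH = 40   # QSFP56 ports per switch
PORT_RATE_GBPS = 200    # HDR line rate, per direction


def aggregate_capacity_tbps(ports: int, rate_gbps: int) -> float:
    """Bidirectional capacity: every port sends and receives at line rate."""
    return ports * rate_gbps * 2 / 1000


def max_nonblocking_hosts(ports: int) -> int:
    """Two-tier non-blocking fat-tree: each leaf splits its ports evenly
    between hosts and uplinks, and a spine's port count caps the leaf count."""
    hosts_per_leaf = ports // 2
    max_leaves = ports
    return hosts_per_leaf * max_leaves


print(aggregate_capacity_tbps(PORTS_PER_SWITCH, PORT_RATE_GBPS))  # 16.0
print(max_nonblocking_hosts(PORTS_PER_SWITCH))                    # 800
```

Under these assumptions, 40 ports at 200Gb/s in each direction give the quoted 16Tb/s, and 800 hosts is the upper bound for a two‑tier non‑blocking design built from 40‑port switches.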

The center also reserved several MQM8790-HS2F units for an in‑network computing zone. With SHARP (Scalable Hierarchical Aggregation and Reduction Protocol), collective operations such as All‑Reduce are offloaded from the servers to the switch fabric itself. In 128‑GPU training, this cut communication time by 32% without any code changes to the AI framework.
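To make the idea of hierarchical aggregation concrete, here is a toy Python model of an All‑Reduce performed in a tree of aggregation points. It is not the SHARP API and does not reflect how the switches implement it in hardware. The function `tree_allreduce` and its `radix` parameter are hypothetical names chosen for illustration.

```python
from typing import List


def tree_allreduce(host_vectors: List[List[float]], radix: int) -> List[List[float]]:
    """Toy model of in-network reduction: each aggregation point sums up to
    `radix` child vectors and forwards one partial sum upward; the root's
    total is then handed back to every host."""
    if radix < 2:
        raise ValueError("radix must be at least 2")
    if not host_vectors:
        raise ValueError("need at least one host vector")

    level = [list(v) for v in host_vectors]
    while len(level) > 1:
        # One tree level: combine each group of `radix` vectors element-wise.
        level = [
            [sum(vals) for vals in zip(*level[i:i + radix])]
            for i in range(0, len(level), radix)
        ]

    total = level[0]
    # Every host receives its own copy of the reduced result.
    return [list(total) for _ in host_vectors]


gradients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
print(tree_allreduce(gradients, radix=2)[0])  # [16.0, 20.0]
```

In this model each host sends its data once and receives the result once, and the summation happens at the intermediate aggregation points. That is the property that lets an in‑network scheme take reduction work off the servers.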

Results & Gains: Lower Latency, Higher Throughput, Controlled TCO

Post‑deployment metrics showed dramatic improvements:

  • Point‑to‑point latency: MPI ping‑pong tests measured ~0.9μs on 200Gb/s HDR links, 65% lower than the legacy RoCE setup.
  • Collective communication efficiency: At 512‑GPU scale, All‑Reduce completed in 18.3ms, a 52% reduction compared to the previous baseline.
  • Network utilization: Adaptive routing kept load balance across links above 92%, with almost no congestion hotspots.
  • Procurement & operations: The per‑port price of the MQM8790-HS2F was roughly 12% lower than competing 200G solutions. The switch also takes standard QSFP56 modules, allowing full reuse of existing cabling.
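The article reports the new figures and the percentage reductions but not the legacy values themselves. The short calculation below derives the implied baselines. These are back‑calculated estimates, not measurements from the center, and `implied_baseline` is a hypothetical helper name.

```python
def implied_baseline(new_value: float, reduction_pct: float) -> float:
    """Given a post-upgrade value and the stated percentage reduction,
    return the baseline value it implies."""
    return new_value / (1 - reduction_pct / 100)


# Inputs are taken from the bullet list; outputs are derived estimates.
legacy_latency_us = implied_baseline(0.9, 65)      # point-to-point latency
legacy_allreduce_ms = implied_baseline(18.3, 52)   # 512-GPU All-Reduce

print(f"implied legacy latency:    {legacy_latency_us:.2f} us")    # 2.57 us
print(f"implied legacy All-Reduce: {legacy_allreduce_ms:.1f} ms")  # 38.1 ms
```

Since the 0.9μs figure is itself approximate, the derived values should be read as rough orders of magnitude only.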

“After moving to the MQM8790-HS2F InfiniBand switch solution, we finally achieved near‑linear scaling (0.9 efficiency) on trillion‑parameter model training,” said the center’s lead architect. “Network is no longer the bottleneck, so we can focus on model architecture innovation instead of communication scheduling.”

Conclusion & Outlook: Core Building Block for Exascale Interconnects

This real‑world case demonstrates that the MQM8790-HS2F offers much more than higher port density. With 200Gb/s HDR, native RDMA, SHARP in‑network compute, and adaptive routing, it directly addresses low‑latency interconnect pain points in today’s AI/HPC clusters. Whether you are planning a university supercomputing center or upgrading an enterprise AI cloud, the NVIDIA Mellanox MQM8790-HS2F offers a balanced path combining performance, compatibility, and cost predictability. The switch is now in volume production. For detailed design references, request the official MQM8790-HS2F datasheet from NVIDIA’s partner portal. For current inventory or purchase inquiries, contact authorized solution providers for pricing and technical support.