
Announced at the GTC 2024 AI developers conference, NVIDIA's X800 series is a new generation of networking switches built for massive-scale AI. Comprising NVIDIA Quantum-X800 InfiniBand and NVIDIA Spectrum-X800 Ethernet, the series delivers end-to-end 800Gb/s throughput, pushing networking performance for cloud computing, HPC workloads, and AI applications to new heights.
The Quantum-X800 platform raises the bar for dedicated AI infrastructure performance. It pairs the NVIDIA Quantum Q3400 switch with the NVIDIA ConnectX-8 SuperNIC to deliver end-to-end 800Gb/s throughput. Compared with the previous generation, this represents a 5x increase in bandwidth capacity and a 9x increase in In-Network Computing, reaching 14.4 Tflops with NVIDIA's Scalable Hierarchical Aggregation and Reduction Protocol (SHARPv4).
The Spectrum-X800 platform brings significantly improved networking performance to AI clouds and enterprise infrastructure. Pairing the Spectrum SN5600 800Gb/s switch with the NVIDIA BlueField-3 SuperNIC, it delivers the rich feature set that large enterprises and multi-tenant generative AI (GenAI) clouds require.
NVIDIA's Spectrum-X800 is designed to maximize network performance, enabling faster processing, analysis, and execution of AI workloads and thereby accelerating the development, deployment, and time to market of AI solutions. Built specifically for multi-tenant environments, Spectrum-X800 provides performance isolation for each tenant's AI workloads, improving service quality and customer satisfaction.
“NVIDIA Networking is central to the scalability of our AI supercomputing infrastructure,” said Gilad Shainer, Senior Vice President of Networking at NVIDIA. “NVIDIA X800 switches are end-to-end networking platforms that enable us to achieve trillion-parameter-scale generative AI essential for new AI infrastructures.”
NVIDIA Software Support
To maximize performance for trillion-parameter AI models, NVIDIA offers an extensive collection of network acceleration libraries, software development kits, and administration applications.
The NVIDIA Collective Communications Library (NCCL) extends GPU parallel computing tasks to the Quantum-X800 network fabric, harnessing its In-Network Computing capabilities with SHARPv4 FP8 support to boost performance for large model training and generative AI.
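To make NCCL's role concrete, the following is a minimal sketch of a multi-GPU all-reduce using PyTorch's NCCL backend; collectives like this are the operations that a SHARP-capable fabric can offload to the switches. The launch command, tensor size, and overall setup are illustrative assumptions, not NVIDIA-documented configuration for Quantum-X800.

```python
# Minimal sketch: an all-reduce over NCCL, the kind of collective that
# SHARP-capable fabrics can accelerate in-network. Launch with, for example:
#   torchrun --nproc_per_node=8 allreduce_demo.py
import os

import torch
import torch.distributed as dist


def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process.
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # NCCL is the backend that carries collectives over InfiniBand or Ethernet.
    dist.init_process_group(backend="nccl")

    # Each rank contributes a gradient-like tensor; all_reduce sums it across
    # every GPU in the job (the reduction step SHARP can perform in-network).
    grad = torch.ones(1 << 20, device="cuda") * dist.get_rank()
    dist.all_reduce(grad, op=dist.ReduceOp.SUM)

    if dist.get_rank() == 0:
        print(f"all-reduce complete, first element = {grad[0].item()}")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```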
The improved programmability offered by NVIDIA's full-stack software approach increases the flexibility, responsiveness, and reliability of data center networks, boosting overall operational efficiency and meeting the demands of modern services and applications.
Microsoft Azure, Oracle Cloud Infrastructure, and CoreWeave are among the early adopters of Quantum InfiniBand and Spectrum-X Ethernet.
“AI is a powerful tool to turn data into knowledge. Behind this transformation is the evolution of data centers into high-performance AI engines with increased demands for networking infrastructure,” said Nidhi Chappell, Vice President of AI Infrastructure at Microsoft Azure. “With new integrations of NVIDIA networking solutions, Microsoft Azure will continue to build the infrastructure that pushes the boundaries of cloud AI.”
NVIDIA Ecosystem Momentum
Leading infrastructure and system vendors worldwide, including Aivres, DDN, Dell Technologies, Eviden, Hitachi Vantara, Hewlett Packard Enterprise, Lenovo, Supermicro, and VAST Data, will offer NVIDIA's Quantum-X800 and Spectrum-X800 starting next year.