OpenAI and Tech Giants Launch MRC, a Network Protocol for AI Supercomputers
A Major Technological Partnership for OpenAI
OpenAI recently announced a collaboration with leaders in the tech industry, including AMD, Broadcom, Intel, Microsoft, and NVIDIA. Together, they have developed a new network protocol called MRC (Multipath Reliable Connection).
Enhancing AI Supercomputer Performance
The MRC protocol is designed to optimize data transfers between GPUs in supercomputers dedicated to artificial intelligence. By distributing the packets of a transfer across hundreds of simultaneous paths, MRC aims to make these transfers faster, more predictable, and more resilient. Spreading traffic this way also reduces congestion, because no single path has to carry an entire transfer.
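To illustrate why spreading packets over many paths reduces congestion, here is a minimal sketch that compares pinning each flow to one path with spraying every packet across all paths. It is a toy model, not MRC's actual algorithm: the function names, the modulo stand-in for a hash, and the flow sizes are all assumptions made for the example.

```python
from collections import Counter

def single_path_load(flows, num_paths):
    """Pin each flow to one path; return the packet count per path.

    `flow_id % num_paths` is a stand-in for a real flow hash (assumption).
    """
    load = Counter()
    for flow_id, packets in flows.items():
        load[flow_id % num_paths] += packets
    return [load[p] for p in range(num_paths)]

def sprayed_load(flows, num_paths):
    """Spray the packets of every flow round-robin over all paths."""
    load = [0] * num_paths
    next_path = 0
    for packets in flows.values():
        for _ in range(packets):
            load[next_path] += 1
            next_path = (next_path + 1) % num_paths
    return load

# Hypothetical workload: flow_id -> number of packets.
# Three large flows happen to map to the same path when pinned.
flows = {0: 1000, 4: 1000, 8: 1000, 1: 10}
num_paths = 4

pinned = single_path_load(flows, num_paths)
sprayed = sprayed_load(flows, num_paths)

print("busiest path, one path per flow:", max(pinned))
print("busiest path, packet spraying:  ", max(sprayed))
```

In this toy workload the pinned assignment piles all three large flows onto one path, while spraying keeps every path within one packet of the average. A real protocol also has to handle packets arriving out of order, which this sketch ignores.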
According to OpenAI, when a path, link, or switch fails, MRC can detect the failure and route around it within microseconds, whereas traditional networks may need several seconds to stabilize after a failure.
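The failover idea can be sketched with a toy sender that stops using a path as soon as a packet on it fails and retries on the next healthy path. The `MultipathSender` class and its methods are hypothetical, and failure "detection" here is just a lookup in a set of broken paths; the article does not describe how MRC actually detects failures.

```python
class MultipathSender:
    """Toy sender that rotates over healthy paths (hypothetical, not the MRC API)."""

    def __init__(self, num_paths):
        self.healthy = list(range(num_paths))
        self._cursor = 0

    def mark_failed(self, path):
        """Remove a path from rotation once a failure is observed on it."""
        if path in self.healthy:
            self.healthy.remove(path)

    def next_path(self):
        """Pick the next healthy path in round-robin order."""
        if not self.healthy:
            raise RuntimeError("no healthy paths left")
        path = self.healthy[self._cursor % len(self.healthy)]
        self._cursor += 1
        return path

    def send_all(self, packets, broken_paths):
        """Deliver every packet, retrying on another path after a failure.

        `broken_paths` simulates the network: sending on one of these
        paths counts as a detected failure (assumption for the sketch).
        """
        delivered = {}
        for pkt in packets:
            while True:
                path = self.next_path()
                if path in broken_paths:
                    self.mark_failed(path)  # route around it from now on
                    continue
                delivered[pkt] = path
                break
        return delivered

sender = MultipathSender(num_paths=4)
delivered = sender.send_all(packets=range(8), broken_paths={2})
print("paths still in use:", sender.healthy)
print("packet -> path:", delivered)
```

Only the one packet that first hits the broken path needs a retry. Every later packet skips that path without any network-wide reconvergence, which is the property the paragraph above attributes to MRC.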
Advantages of the MRC Protocol
Thanks to its multi-plane design, MRC can connect over 100,000 GPUs with just two levels of Ethernet switches, compared with the three or four levels required by conventional 800 Gb/s networks. Fewer switch levels mean lower energy consumption, fewer components, and a lower overall network cost.
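The level counts above can be checked with back-of-the-envelope arithmetic. The sketch below uses the standard capacity formula for a non-blocking folded-Clos (fat-tree) network, 2 × (radix / 2)^levels endpoints. The switch capacity of 51.2 Tb/s and the split of each 800 Gb/s link into eight 100 Gb/s planes are illustrative assumptions, not figures from the article.

```python
def max_endpoints(radix, levels):
    """Endpoints supported by a non-blocking folded-Clos network.

    Standard fat-tree result: 2 * (radix / 2) ** levels,
    i.e. radix**2 / 2 for two levels and radix**3 / 4 for three.
    """
    return 2 * (radix // 2) ** levels

def levels_needed(radix, endpoints):
    """Smallest number of switch levels that can connect `endpoints`."""
    levels = 1
    while max_endpoints(radix, levels) < endpoints:
        levels += 1
    return levels

SWITCH_CAPACITY_GBPS = 51_200  # assumed switch capacity (51.2 Tb/s)
TARGET_GPUS = 100_000

# Conventional design: every switch port runs at 800 Gb/s.
radix_800g = SWITCH_CAPACITY_GBPS // 800   # 64 ports
# Multi-plane design (assumed split): eight 100 Gb/s planes per GPU link,
# so the same switch exposes eight times as many, slower ports.
radix_100g = SWITCH_CAPACITY_GBPS // 100   # 512 ports

print("800G ports per switch:", radix_800g,
      "-> levels needed:", levels_needed(radix_800g, TARGET_GPUS))
print("100G ports per switch:", radix_100g,
      "-> levels needed:", levels_needed(radix_100g, TARGET_GPUS))
```

Under these assumptions a 64-port network tops out at 65,536 endpoints with three levels, so it needs a fourth (or oversubscription) to reach 100,000, while a 512-port plane reaches 131,072 endpoints with only two.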
Deployment and Immediate Impact
The MRC protocol is already operational on OpenAI's NVIDIA GB200 supercomputers, which are used for training advanced models. Among these installations are the Oracle Cloud Infrastructure site in Abilene, Texas, as well as Microsoft's Fairwater supercomputers.
During the training of a recent model for ChatGPT and Codex, OpenAI had to restart four level-1 switches. Thanks to MRC, it did not need to coordinate the restart with the training run, an operation that could otherwise have disrupted ongoing work.
Publication and Contributions
The MRC specification was made public today via the Open Compute Project (OCP), accompanied by a research paper detailing its features. In addition to OpenAI, AMD, Broadcom, Intel, Microsoft, and NVIDIA contributed to the protocol's development.