Brief IA

Zyphra, Cohere, and Poolside: The Revolution of Open Models

🔬 Research · Tom Levy

Key Takeaways

1. The ecosystem of open models is expanding with players like Zyphra, Cohere, and Poolside, diversifying the offerings.
2. NVIDIA launches the model Nemotron-3-Ultra-550B-A55B-BF16, using the OpenMDW license for its model weights.
3. Cohere releases Command A+ under the Apache 2.0 license, offering multi-modal and multi-lingual capabilities.

💡 Why it matters: The diversification of open models enhances innovation and accessibility in the field of AI, preventing technological concentration.

📄 Full Analysis

The Evolution of the Open Models Ecosystem

The open models ecosystem is undergoing a notable transformation, characterized by an increasing diversification of the actors involved. Once dominated by a few large companies, primarily Chinese, this field is now witnessing the emergence of niche companies worldwide. This evolution reflects a trend towards greater diversity in model development, although the precise motivations of companies often remain opaque.

The "Pure" Model Manufacturers

Among the key players are the "pure" model manufacturers, whose primary goal is to develop cutting-edge models. These companies include well-known names like DeepSeek, Zhipu, and Minimax in China, as well as Western firms such as Poolside, Arcee, and Zyphra. Additionally, sovereign AI players like Cohere, Sovereign, Mistral, and Trillion Labs are gaining importance. A recent incident involving Mythos has raised awareness among some decision-makers, which could stimulate interest in the development of sovereign models.

The Motivations of Tech Giants

Large tech companies, such as Alibaba with Qwen, Google with Gemma, and NVIDIA, have varied motivations for their model releases. Alibaba, for instance, uses these releases to promote its closed models, while NVIDIA benefits from an open models ecosystem that drives the use of its GPUs. This approach contrasts with the era of Llama models, when the motivations for open releases were less clear and ultimately faded.

Product Companies and AI

Some companies, like JetBrains, Zed, Krea, and Photoroom, integrate AI as a central component of their products. To avoid dependency on closed models or to offer unique solutions, they develop specialized and smaller models tailored to their needs. The open-sourcing of these models does not compromise their profitability but allows them to remain competitive.

Diversity as a Strength of the Ecosystem

This diversity in the development of open models is a major strength of the ecosystem. It is reflected in the technical reports of launches, which reuse training methods, architectural choices, and data from other open models. Attempts to restrict this ecosystem have proven not only ineffective but also potentially dangerous, as they could concentrate AI development in the hands of a few dominant players.

Notable Models of This Year

NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16

NVIDIA has released an advanced version of its Nemotron series that uses LatentMoE technology to surpass comparable models in speed. Most of the data for this model is open source, and NVIDIA has adopted the OpenMDW license, which is designed specifically for model weights, in place of its custom license.

Command A+ by CohereLabs

CohereLabs recently released its flagship model, Command A+, under the Apache 2.0 license. This decision marks a welcome shift from previous versions, which were released under a non-commercial license. Command A+ offers multi-modal, multi-lingual, and agentic capabilities and can run on a single B200 GPU with 4-bit weights.
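To see why 4-bit weights matter for fitting a large model on one GPU, here is a rough back-of-the-envelope sketch. The article does not state Command A+'s parameter count, so the 100B figure below is purely hypothetical, and the estimate covers weights only (no KV cache, activations, or quantization metadata).

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9.
    """
    if params_billion <= 0 or bits_per_weight <= 0:
        raise ValueError("params_billion and bits_per_weight must be positive")
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    hypothetical_params_b = 100  # assumption for illustration, not Command A+'s real size
    for bits in (16, 8, 4):
        gb = weight_memory_gb(hypothetical_params_b, bits)
        print(f"{hypothetical_params_b}B params at {bits}-bit: ~{gb:g} GB of weights")
```

Going from 16-bit to 4-bit cuts the weight footprint by a factor of four, which is the kind of reduction that can bring a large model within the memory of a single accelerator.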

GLM-5.2 by zai-org

GLM-5.2 continues to impress with its performance, competing with the best available closed models. Since its launch, download figures show popularity comparable to that of GLM-5, confirming its utility for everyday work.

ZAYA1-74B-preview by Zyphra

Zyphra, known for its innovative architectural choices, has released new models, including a 74B-A4B MoE and an 8B-A0.6B MoE. These models are the result of extensive research and intensive use of AMD GPUs.
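Size tags such as 74B-A4B follow a common naming convention for mixture-of-experts models: the first number is the total parameter count and the number after "A" is the count active per token. The sketch below assumes that convention (the article does not spell it out) and uses hypothetical helper names to compute the active fraction for the tags mentioned in this article.

```python
import re

# Assumed convention: "<total>B-A<active>B", e.g. "74B-A4B".
_MOE_SIZE = re.compile(r"(\d+(?:\.\d+)?)B-A(\d+(?:\.\d+)?)B")


def parse_moe_size(name: str) -> tuple[float, float]:
    """Return (total_billion, active_billion) parsed from a model name or size tag."""
    match = _MOE_SIZE.search(name)
    if match is None:
        raise ValueError(f"no '<total>B-A<active>B' tag in {name!r}")
    return float(match.group(1)), float(match.group(2))


def active_fraction(name: str) -> float:
    """Share of parameters that are active per token, under the assumed convention."""
    total, active = parse_moe_size(name)
    return active / total


if __name__ == "__main__":
    for tag in ("NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16", "74B-A4B", "8B-A0.6B"):
        total, active = parse_moe_size(tag)
        print(f"{tag}: {active:g}B of {total:g}B active ({active / total:.1%})")
```

Under this reading, each of these models activates only a small fraction of its weights per token, which is the usual argument for MoE inference speed.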

Laguna-M.1 by Poolside

Poolside has also released its flagship model under the Apache 2.0 license and has committed to keeping future releases open. This strategy aims to release increasingly powerful models while adhering to the principles of openness.

General-Purpose Models

Kimi-K2.7-Code by moonshotai

This update to Kimi emphasizes token efficiency, thereby improving the overall performance of the model.

Step-3.7-Flash by stepfun-ai

Step-Flash has been updated to excel particularly in mathematical applications, reinforcing its position in this domain.

Nemotron-Labs-Diffusion-14B by NVIDIA

NVIDIA has also introduced an experimental model, Nemotron-Labs-Diffusion-14B, which can be used in three different modes: autoregressive, diffusion, and auto-speculation, each suited for specific use cases.
