OpenAI Challenges Nvidia with Jalapeño Chip to Cut Costs

Key Takeaways

1. OpenAI has developed the Jalapeño chip with Broadcom to reduce its infrastructure costs.

2. Keeping ChatGPT's servers running cost OpenAI $8.4 billion last year.

3. The Jalapeño chip is designed specifically for large language model inference rather than for general-purpose AI tasks.

Why it matters: OpenAI aims to compete with industry giants by better controlling its costs and improving its infrastructure.

OpenAI and Its Quest to Reduce Costs

Faced with high infrastructure costs, OpenAI has begun developing its own custom chip, named Jalapeño, in partnership with Broadcom. This application-specific integrated circuit (ASIC) is a strategic attempt to reduce the massive expense of relying on third-party hardware.

Nvidia currently dominates the market, with a profit margin of 75% on its high-end processors. OpenAI, by comparison, keeps only 33 cents of profit for every dollar it generates once its substantial operating expenses are accounted for. Running large language models at this scale is expensive.

Last year, keeping ChatGPT's servers responsive cost OpenAI $8.4 billion. With the weekly user base now at 900 million, that cost is expected to rise to around $14 billion this year. Over an eight-year period, OpenAI plans to invest approximately $1.4 trillion in computing power, a bold bet for a company currently generating $25 billion in annual revenue.
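To put these figures side by side, here is a minimal back-of-the-envelope sketch. It uses only the numbers quoted above; the even spread of the $1.4 trillion over eight years and the per-user division are simplifying assumptions made for illustration, not figures reported by OpenAI.

```python
# Figures as quoted in the article, in US dollars (users are a head count).
SERVING_COST_LAST_YEAR = 8.4e9
SERVING_COST_THIS_YEAR = 14e9      # projected
WEEKLY_USERS = 900e6
COMPUTE_PLAN_TOTAL = 1.4e12
COMPUTE_PLAN_YEARS = 8
ANNUAL_REVENUE = 25e9


def cost_summary():
    """Derive a few ratios from the reported figures."""
    # Assumption: the compute plan is spread evenly across its eight years.
    avg_annual_spend = COMPUTE_PLAN_TOTAL / COMPUTE_PLAN_YEARS
    return {
        # Year-over-year growth in serving cost, as a fraction.
        "cost_growth": SERVING_COST_THIS_YEAR / SERVING_COST_LAST_YEAR - 1,
        # Projected serving cost divided by the current weekly user count.
        "cost_per_weekly_user": SERVING_COST_THIS_YEAR / WEEKLY_USERS,
        # Projected serving cost as a fraction of current annual revenue.
        "cost_share_of_revenue": SERVING_COST_THIS_YEAR / ANNUAL_REVENUE,
        "avg_annual_compute_spend": avg_annual_spend,
        # Average planned yearly compute spend as a multiple of revenue.
        "compute_spend_vs_revenue": avg_annual_spend / ANNUAL_REVENUE,
    }


if __name__ == "__main__":
    for name, value in cost_summary().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions, serving costs grow by roughly 67% year over year, come to about $15.56 per weekly user, and would consume 56% of current revenue, while the compute plan averages $175 billion a year, about seven times current annual revenue.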

The Design of the Jalapeño Chip

The Jalapeño chip, described as OpenAI's first "intelligence processor," is designed specifically for large language model inference rather than for general-purpose AI tasks. OpenAI provided the basic architectural design, tailored to its own models and serving systems, while Broadcom handled the silicon engineering and the integration of high-performance networking.

TSMC manufactures the chip in Taiwan, and Celestica builds the card and rack systems. According to OpenAI, early lab samples are already running advanced workloads, including an unreleased model named GPT-5.3-Codex-Spark, at the targeted clock frequency and power.

Richard Ho, head of hardware programs at OpenAI, explained that the chip architecture minimizes data movement so that real-world utilization comes closer to the chip's theoretical peak performance. Unlike general-purpose accelerators, the architecture is balanced specifically to address the data-movement bottlenecks of interactive LLM serving.

To achieve this goal at scale, the platform integrates Broadcom's Tomahawk network silicon directly into the design, allowing custom processors to communicate efficiently across vast data center environments.
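The data-movement argument attributed to Richard Ho can be illustrated with the standard roofline model, in which attainable throughput is the lesser of peak compute and memory bandwidth multiplied by arithmetic intensity (FLOPs performed per byte moved). The sketch below is a generic illustration, not Jalapeño's actual design: the peak and bandwidth values are hypothetical, and the intensity estimate for token decoding ignores KV-cache and activation traffic.

```python
def attainable_flops(peak_flops, mem_bandwidth, arithmetic_intensity):
    """Roofline bound: min(peak compute, bandwidth * FLOPs-per-byte)."""
    return min(peak_flops, mem_bandwidth * arithmetic_intensity)


def utilization(peak_flops, mem_bandwidth, arithmetic_intensity):
    """Fraction of theoretical peak that the roofline bound allows."""
    return attainable_flops(peak_flops, mem_bandwidth, arithmetic_intensity) / peak_flops


def decode_intensity(batch_size, bytes_per_weight=2):
    """Rough FLOPs per byte for one decode step.

    Simplification: each weight is read once and used for one
    multiply-add (2 FLOPs) per sequence in the batch.
    """
    return 2 * batch_size / bytes_per_weight


# Hypothetical accelerator, NOT Jalapeño specifications.
PEAK_FLOPS = 1.0e15      # 1 PFLOP/s
MEM_BANDWIDTH = 4.0e12   # 4 TB/s

if __name__ == "__main__":
    for batch in (1, 64, 256):
        u = utilization(PEAK_FLOPS, MEM_BANDWIDTH, decode_intensity(batch))
        print(f"batch={batch:4d}  utilization={u:.1%}")
```

With these made-up numbers, small-batch interactive decoding is limited by memory bandwidth and reaches only a small fraction of peak compute. Any design change that moves fewer bytes per FLOP raises arithmetic intensity and narrows that gap, which is the kind of balance the article describes.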

Vertical Integration as a Strategic Lever

By developing its own silicon, OpenAI is moving from being a pure software layer to being a vertically integrated infrastructure company. This full-stack strategy spans the entire pipeline, from chip architecture to software kernels, memory systems, network planning, and the final application layer. Like Apple, which optimizes its infrastructure around its own products, OpenAI can now tune its infrastructure to its specific needs.

This vertical integration creates a self-reinforcing loop. More efficient infrastructure lowers the cost of training and serving models, which makes the service more affordable, which in turn increases user volume and generates revenue to reinvest in the next generation of custom infrastructure.

Overcoming Competitors' Head Start

By introducing its own chip, OpenAI is entering a field where its main competitors already have a head start. Google, for example, began deploying its Tensor Processing Units (TPUs) in 2015 and now controls about a quarter of the global AI computing capacity, outside of Nvidia's supply.

Amazon has already shipped over a million of its custom chips, while Meta and Microsoft continue to develop their own infrastructure. Greg Brockman, president and co-founder of OpenAI, stated that Jalapeño is part of the company's long-term full-stack infrastructure strategy to make computing more abundant. By designing more of the stack itself, OpenAI can deliver more intelligence with greater efficiency.

To close this time gap, OpenAI accelerated the development of its Jalapeño chip, moving from initial design to manufacturing tape-out in just nine months. Engineering teams achieved this timeline by using OpenAI's own language models to automate and optimize certain parts of the hardware design process.

This approach creates an unusual feedback loop: the models that serve users also help build the physical infrastructure that will support future iterations. Initial deployment of the hardware in data centers is expected to begin by the end of 2026.

Hock Tan, CEO of Broadcom, confirmed that the deployment will roll out alongside infrastructure partners, including Microsoft, to prepare for the integration of data centers at a gigawatt scale.
