OpenAI & Broadcom unveil 'Jalapeno', their custom AI chip for LLMs

OpenAI and Broadcom have unveiled ‘Jalapeno,’ OpenAI’s first custom AI processor for LLM inference. Taken from initial design to tape-out in nine months, it shows performance per watt substantially better than the current state-of-the-art in early lab tests and will be deployed at gigawatt scale with partners starting by the end of 2026.

OpenAI and Broadcom have unveiled Jalapeno, OpenAI’s first custom Intelligence Processor, designed from scratch for large language model inference. The chip was delivered to OpenAI’s leadership after a nine-month design-to-tape-out cycle and will be deployed at gigawatt scale with data centre partners across multiple generations, starting by the end of 2026.


According to a press release by Broadcom, early lab tests show Jalapeno running ML workloads at production target frequency and power, with performance per watt substantially better than the current state-of-the-art. Broadcom said Jalapeno marks the start of a multi-generation compute platform built with OpenAI.

A Multi-Generation Roadmap

“Our collaboration with OpenAI represents a fundamental commitment to scaling the physical infrastructure required for the next decade of AI,” said Hock Tan, President and CEO of Broadcom. “This is just the beginning of a multi-generation roadmap. By co-developing our industry-leading silicon directly with OpenAI, we are enabling the deployment of gigawatt-scale data centres with Microsoft and other partners beginning in 2026,” the company added.

The chip was co-developed from initial design to manufacturing tape-out in just nine months, with Broadcom contributing silicon implementation expertise, board and rack integration, high-performance networking and scalable production systems.

A Blank-Slate Design for LLM Inference

Broadcom described the accelerator as purpose-built for its workload. “Jalapeno is a blank-slate design for modern LLM inference, not a general-purpose accelerator adapted from earlier AI workloads,” the company said.

The architecture reduces data movement and balances compute, memory and networking resources to achieve utilisation much closer to theoretical peak performance. Broadcom’s silicon implementation and networking technologies, including Tomahawk networking silicon, help bring the platform to large-scale production.

Engineering samples are already running ML workloads, including GPT-5.3-Codex-Spark, in the lab at production target frequency and power. Broadcom highlighted that OpenAI designed the chip around its understanding of LLM fundamentals, kernels, serving systems and product needs, while Broadcom and Celestica industrialised the platform.

High Performance for Interactive AI

Broadcom said Jalapeno is “designed to be the best inference platform for LLMs.” The chip combines the power and throughput of today’s leading AI accelerators with latency closer to that of the fastest specialised inference systems. That makes it suited for interactive LLM products at scale across ChatGPT, Codex, the API and future agentic products.

While OpenAI is still measuring final performance, Broadcom noted that early testing shows Jalapeno will deliver performance per watt substantially better than the current state-of-the-art, with a detailed technical report on performance to be presented in the coming months.

The companies said the custom ASIC program reflects one of the fastest development cycles ever in advanced semiconductors. Broadcom said the speed reflects deep software-hardware co-development with OpenAI’s engineering teams and the use of OpenAI models to accelerate parts of the design and optimisation process.

Jalapeno is the first step in a platform that combines OpenAI-designed accelerators with Broadcom silicon and connectivity technologies and Celestica’s board, rack and system expertise for initial deployment by the end of 2026. (ANI)

(Except for the headline, this story has not been edited by Asianet Newsable English staff and is published from a syndicated feed.)
