OpenAI and Broadcom Reveal Jalapeño: OpenAI's First Custom AI Inference Chip

Patrick94 · **Yesterday** at 01:23 PM

On June 24th, OpenAI and Broadcom unveiled the design of Jalapeño, OpenAI's first custom AI chip, developed in partnership with the semiconductor giant. The chip is optimised specifically for inference workloads, meaning running trained models to generate responses rather than training new models from scratch. OpenAI has received initial samples and is testing them, with deployment planned by the end of the year. The design targets high power efficiency and performance for large language model inference, and OpenAI engineers used Broadcom's AI tools during the chip design process itself.

This move is significant because it reduces OpenAI's dependence on Nvidia for inference compute. Training still requires Nvidia's H100 and B100 GPUs because they are simply the best available hardware for that workload, but inference is a different problem, where custom silicon tuned for specific model architectures can offer substantial efficiency advantages. Every inference request costs money and energy, and at ChatGPT's scale of over a billion monthly users, even small efficiency gains translate to enormous savings.

The parallel with what Google has done with TPUs is clear. Google spent years and billions building custom inference silicon, and it became one of their key competitive advantages in AI infrastructure. Microsoft has been building the Maia inference chip. Amazon has Trainium and Inferentia. OpenAI is the last major AI lab to move toward custom silicon, and Jalapeño puts them on the same path as their infrastructure competitors.

OpenAI unveils its first custom chip, built by Broadcom | TechCrunch

techcrunch.com
