OpenAI collaborates with Broadcom to launch self-developed inference chip Jalape ñ o--Seetao

Reading this article requires

7 Minute

OpenAI and Broadcom have officially reached a deep cooperation to jointly launch the Jalape ñ o AI inference chip optimized for large language models, which is also the first artificial intelligence accelerator in the multi generation computing platform planned by both parties.

This chip was designed from scratch by OpenAI based on its own accumulation of underlying operating logic for large models. Broadcom was responsible for chip fabrication and industrial mass production, while New Meiya undertook the integration of circuit boards, racks, and overall systems, forming a complete implementation link among the three parties. At present, the chip engineering samples have been tested for machine learning workloads in OpenAI laboratory, and the frequency and power consumption performance have reached the preset indicators.

Complete the entire process from design to production in 9 months

The development and implementation speed of this chip far exceeds the industry's conventional pace, taking only 9 months from initial architecture design to final chip production. During the process, OpenAI directly used its own large-scale modeling capabilities to assist in the simulation and optimization of some chip design stages, greatly reducing the cycle of iterative verification in traditional chip development.

Unlike the design concept of general AI chips, Jalape ñ o has been fully focused on targeted optimization for large language model inference scenarios since its inception. It has made targeted architecture adjustments for core operator operations, data handling, network transmission, and actual service modes. In early testing, the running efficiency of core workloads has already approached the theoretical limit of hardware.

Launch gigawatt level data center deployment by the end of 2026

The core positioning of this chip is to undertake large model inference tasks. After its implementation, it can directly make ChatGPT's response speed faster and improve the efficiency of code generation tools such as Codex. The core goal of OpenAI's push for full stack self-developed chips is to break free from the performance bottleneck of general computing power, provide more abundant AI computing power supply, and ultimately lower inference costs, making higher-level AI services affordable for both ordinary users and enterprises. Keywords: AI inference chip, artificial intelligence

According to the current deployment pace, the first batch of Jalape ñ o will be launched by the end of 2026, directly connected to a gigawatt scale supercomputing data center, and will continue to iterate along the multi generation product roadmap in the future. At present, the competition for general AI chips in the industry is still ongoing, and the performance of this targeted optimization inference chip in actual implementation still needs to be tested in real business scenarios after large-scale deployment.Editor/Cheng Liting

Retrieve password