OpenAI, the artificial intelligence research and deployment company, is accelerating the pace of innovation in AI-powered coding with its new GPT-5.3-Codex-Spark model. What sets this iteration apart isn’t just improved performance, but a strategic shift away from reliance on Nvidia, traditionally the dominant provider of chips for AI workloads. The new model, launched on February 12, 2026, runs on hardware from Cerebras Systems, a move signaling a growing diversification of OpenAI’s infrastructure and a focus on ultra-low latency for developers. This development comes amid a competitive landscape where speed is paramount in the rapidly evolving world of AI coding assistants.
The demand for faster AI coding tools is driven by the increasing usefulness of these agents in software development. Tools like OpenAI’s Codex and Anthropic’s Claude Code are now capable of rapidly building prototypes, interfaces, and boilerplate code, fundamentally changing how developers work. Latency – the time it takes for the model to respond – has become a critical differentiator, as quicker response times allow for faster iteration and more efficient coding. The ability to generate over 1,000 tokens per second is a key metric in this race, though OpenAI acknowledges that speed can sometimes come with trade-offs in capability.
GPT-5.3-Codex-Spark’s launch follows a period of rapid iteration for OpenAI’s Codex line. The company released GPT-5.2 in December 2025, following an internal “code red” memo issued by CEO Sam Altman in response to competitive pressure from Google, demonstrating the intensity of the competition in the AI space. This latest release underscores OpenAI’s commitment to staying ahead in the development of AI coding assistants.
A Plate-Sized Processor Powers the Next Generation
The core of this speed boost lies in the hardware. Codex-Spark runs on Cerebras’ Wafer Scale Engine 3 (WSE-3), a massive chip roughly the size of a dinner plate. Cerebras has been developing this technology since at least 2022, focusing on low-latency AI workloads. The partnership between OpenAI and Cerebras, announced in January 2026, represents a significant step in diversifying OpenAI’s hardware infrastructure.
While 1,000 tokens per second is a notable achievement, Cerebras has demonstrated even greater speeds with other models. The company has measured 2,100 tokens per second on Llama 3.1 70B and reported 3,000 tokens per second on OpenAI’s own open-weight gpt-oss-120B model. This suggests that Codex-Spark’s comparatively lower speed may be a deliberate choice, reflecting the complexity and specific requirements of the coding model.
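To put these throughput figures in practical terms, the short Python sketch below estimates how long each rate would take to stream a response of a given size. The 3,000-token response length and the function names are illustrative assumptions, not figures from OpenAI or Cerebras, and the calculation ignores time-to-first-token and network overhead.

```python
# Throughput figures quoted in the text, in tokens per second.
# Codex-Spark is reported at "over 1,000", so 1,000 is treated as a floor.
THROUGHPUT_TOKENS_PER_SEC = {
    "GPT-5.3-Codex-Spark": 1000,
    "Llama 3.1 70B": 2100,
    "gpt-oss-120B": 3000,
}


def generation_seconds(num_tokens: int, tokens_per_sec: float) -> float:
    """Seconds needed to stream num_tokens at a steady rate.

    Simplification: ignores time-to-first-token and network overhead.
    """
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return num_tokens / tokens_per_sec


def compare(num_tokens: int) -> dict[str, float]:
    """Estimated streaming time per model for a response of num_tokens."""
    return {
        name: generation_seconds(num_tokens, rate)
        for name, rate in THROUGHPUT_TOKENS_PER_SEC.items()
    }


if __name__ == "__main__":
    # 3,000 tokens is a hypothetical response size chosen for illustration.
    for name, seconds in compare(3000).items():
        print(f"{name}: {seconds:.2f} s")
```

Under these assumptions, the gap between the slowest and fastest quoted rates amounts to about two seconds on a response of that size, which helps explain why a lower throughput figure can still feel close to instantaneous in an interactive coding session.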
Beyond Nvidia: A Strategic Shift in Infrastructure
OpenAI’s move to incorporate Cerebras’ technology is part of a broader strategy to reduce its dependence on Nvidia. Over the past year, the company has forged partnerships with other major players in the chip industry. In October 2025, OpenAI signed a multi-year deal with AMD, and in November, it struck a $38 billion cloud computing agreement with Amazon. OpenAI is actively designing its own custom AI chip, with plans for fabrication by TSMC.
This diversification comes after a planned $100 billion infrastructure deal with Nvidia failed to materialize, though Nvidia has since committed to a $20 billion investment. Reuters reported that OpenAI expressed concerns about the speed of certain Nvidia chips for inference tasks – the very type of workload that Codex-Spark is designed to excel at.
The Trade-offs of Speed
While speed is a crucial factor, OpenAI acknowledges that it may come at a cost. Developers who rely on AI coding assistants for real-time suggestions will appreciate the faster response times, but speed doesn’t necessarily equate to perfection. Output arriving at 1,000 tokens per second can feel like a “rip saw” rather than a carefully guided tool, so developers need to stay vigilant and review the generated code carefully.
The move to utilize Cerebras’ WSE-3 and diversify its chip suppliers positions OpenAI to maintain a competitive edge in the rapidly evolving AI landscape. The company’s continued focus on reducing latency and improving the performance of its coding tools will be critical as AI coding assistants become increasingly integrated into the software development process. OpenAI is expected to provide further updates on its hardware strategy and the performance of Codex-Spark in the coming months.
