How Choke Points Are Reshaping AI Economics

For the past two years, the narrative surrounding generative artificial intelligence has focused almost entirely on software: the brilliance of the architectures, the surprising emergence of reasoning, and the race to build the largest possible model. But as the industry moves from experimental prototypes to massive industrial deployment, the primary challenge has shifted from the digital to the physical. The era of frictionless scaling is over, and the AI supply crunch has arrived.

The industry is hitting a series of “choke points”—physical and resource-based limits that cannot be solved by writing better code. While the demand for intelligence is effectively infinite, the resources required to produce it—high-end semiconductors, stable electricity, and high-quality human data—are strictly finite. This shift is fundamentally altering the economics of AI, moving the competitive advantage away from those with the best algorithms and toward those who control the physical supply chain.

This transition marks a pivot from the “software era” of AI to an “infrastructure era.” In this new landscape, the most critical metrics are no longer just parameter counts or benchmark scores, but megawatts of power, hectares of data center land, and the availability of advanced packaging for chips.

The hardware bottleneck: beyond the GPU

The most visible choke point has been the scarcity of high-end graphics processing units (GPUs). NVIDIA continues to dominate this space, with its H100 and H200 chips serving as the gold standard for training large language models. However, the bottleneck is not just the chips themselves, but the specialized manufacturing processes required to build them.

A critical constraint is “CoWoS” (Chip on Wafer on Substrate), an advanced packaging technology provided primarily by Taiwan Semiconductor Manufacturing Company (TSMC). Without this packaging, high-bandwidth memory (HBM) cannot be integrated with the processor, rendering the chip unusable for AI workloads. While TSMC has aggressively expanded its capacity, lead times for these components remain a primary constraint for startups and sovereign AI projects attempting to build their own clusters.

The result is a tiered system of access. The “compute-rich”—a handful of hyperscalers like Microsoft, Google, and Meta—have secured long-term supply agreements and the capital to build massive clusters. The “compute-poor,” including most academic institutions and smaller startups, are finding themselves priced out or relegated to slower, older hardware, creating a widening gap in the ability to conduct frontier research.

The energy wall and the nuclear pivot

Even if the chip supply were solved tomorrow, the industry faces a more systemic crisis: the electrical grid. AI workloads are dramatically more energy-intensive than traditional cloud computing. By common estimates, a single generative AI query can require roughly ten times the electricity of a standard Google search, and the data centers required to house these models are straining regional power grids to their limits.
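The arithmetic behind those grid concerns is easy to sketch. The figures below are illustrative assumptions (per-query energy varies widely by model, hardware, and deployment), but they show how quickly per-query watt-hours compound into utility-scale demand:

```python
# Back-of-envelope energy math. All inputs are illustrative assumptions,
# not measured values; real figures vary widely by model and deployment.
WH_PER_SEARCH = 0.3      # assumed watt-hours for a conventional web search
WH_PER_AI_QUERY = 3.0    # assumed watt-hours for a generative AI query
QUERIES_PER_DAY = 1e9    # hypothetical daily query volume

daily_kwh = QUERIES_PER_DAY * WH_PER_AI_QUERY / 1_000
continuous_mw = daily_kwh / 24 / 1_000  # average draw implied by that load

print(f"Per-query multiple vs. search: {WH_PER_AI_QUERY / WH_PER_SEARCH:.0f}x")
print(f"Daily consumption: {daily_kwh:,.0f} kWh")
print(f"Implied continuous draw: {continuous_mw:.0f} MW")
```

At these assumed numbers, a single popular service implies a continuous draw on the order of a hundred megawatts, the scale of a mid-sized power plant, which is why siting decisions now start with the utility rather than the real estate.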

This energy demand has led to an unexpected convergence between Big Tech and the nuclear power industry. In a landmark move to secure a carbon-free, 24/7 energy source, Microsoft entered into a 20-year power purchase agreement with Constellation Energy in September 2024 to restart a reactor at the Three Mile Island nuclear plant. This deal underscores a growing reality: the ability to scale AI is now directly tied to the ability to secure massive amounts of baseload power.

The constraints are not just about generation, but transmission. In hubs like Northern Virginia, the world’s largest data center market, the time it takes to get a new project connected to the power grid has stretched into years. This “gridlock” is forcing companies to look toward decentralized power solutions or relocate their infrastructure to regions with untapped energy reserves, regardless of where their engineers are located.

Comparing the primary AI choke points

Key constraints affecting AI scaling:

| Resource | Primary Bottleneck | Current Industry Solution |
| --- | --- | --- |
| Compute | Advanced packaging (CoWoS) | Diversifying chip architectures; custom silicon (TPUs/LPUs) |
| Energy | Grid capacity & baseload stability | Nuclear restarts; small modular reactors (SMRs) |
| Data | Exhaustion of high-quality text | Synthetic data; licensing proprietary archives |
| Space | Data center zoning & cooling | Liquid cooling; edge computing deployment |

The data depletion crisis

While chips and power are physical constraints, the industry is also hitting a “data wall.” Large language models (LLMs) have been trained on nearly the entire available corpus of high-quality, human-generated text on the public internet. Research from organizations like Epoch AI suggests that the stock of high-quality linguistic data may be exhausted in the coming years, potentially slowing the rate of model improvement.
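The timeline math behind such projections is straightforward compound growth. Here is a minimal sketch using placeholder figures, not Epoch AI's actual estimates, to show why even a very large stock of text can be consumed within a few years:

```python
import math

# Toy exhaustion projection. STOCK_TOKENS and CURRENT_USE are placeholder
# assumptions for illustration, not Epoch AI's published estimates.
STOCK_TOKENS = 3e14     # assumed stock of high-quality public text, in tokens
CURRENT_USE = 1.5e13    # assumed tokens consumed by a frontier run today
GROWTH_PER_YEAR = 2.0   # assumed annual growth in tokens consumed per run

# Years until a single run's demand matches the total stock, under
# pure compounding of per-run consumption.
years = math.log(STOCK_TOKENS / CURRENT_USE, GROWTH_PER_YEAR)
print(f"High-quality stock exhausted in ~{years:.1f} years at {GROWTH_PER_YEAR}x/yr")
```

The logarithm is the whole point of the exercise: at a 2x annual growth rate, even multiplying the available stock tenfold buys only a few additional years.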

This has sparked a desperate search for new fuel. Companies are now aggressively pursuing two paths: licensing private, high-value data from publishers and archives, and generating “synthetic data”—data produced by AI to train other AI. The latter carries the risk of “model collapse,” a phenomenon in which errors in AI-generated data are amplified across successive generations of models, degrading quality and increasing hallucinations.
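The mechanism behind model collapse can be seen in a deliberately simplified toy: fit a distribution to data, sample from the fit, refit on those samples, and repeat. Nothing below reflects any production training pipeline; it only shows how estimation error compounds when each generation learns solely from the previous one's output:

```python
import numpy as np

# Toy model-collapse simulation: each "generation" is trained only on
# samples produced by the previous generation's fitted model.
rng = np.random.default_rng(seed=42)
human_data = rng.normal(loc=0.0, scale=1.0, size=500)  # original "human" data

mu, sigma = human_data.mean(), human_data.std()
print(f"gen  0: mean={mu:+.3f}  std={sigma:.3f}")
for gen in range(1, 11):
    synthetic = rng.normal(mu, sigma, size=500)  # next gen's training data
    mu, sigma = synthetic.mean(), synthetic.std()
    print(f"gen {gen:2d}: mean={mu:+.3f}  std={sigma:.3f}")
# With no fresh human data entering the loop, the estimated parameters
# drift as a random walk and the fitted variance decays slowly in
# expectation: the statistical core of the collapse phenomenon.
```

One widely discussed mitigation is to keep fresh human data in the training mix rather than relying on synthetic output alone, which is part of why the licensing deals described above are advancing in tandem with generation pipelines.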

The economics of data are shifting from “scraping” to “buying.” The era of free data is ending, replaced by high-priced licensing deals that further consolidate power among the wealthiest AI labs, which can afford to pay for premium datasets.

What this means for the economy

The AI supply crunch is transforming AI from a software-margin business into a capital-intensive infrastructure business. When the primary constraints are physical, the “moats” around companies are no longer just their intellectual property, but their physical assets. The companies that own the land, the power contracts, and the chip allocations hold the real leverage.

For the broader economy, this means the “democratization of AI” may be a misnomer. While the tools are accessible via API, the ability to create the next generation of frontier models is becoming restricted to an elite group of entities with the balance sheets to manage these physical choke points. This creates a systemic risk where a few companies control the foundational intelligence layers of the global economy.

The next critical marker for the industry will be the volume rollout of NVIDIA’s next-generation Blackwell architecture and the subsequent reporting on whether its projected efficiency gains can offset the rising costs of power and data. Industry analysts will be watching the quarterly earnings of the major hyperscalers to see whether capital expenditure on infrastructure continues to accelerate or the physical limits of the grid begin to force a slowdown in deployment.
