Source: International Business Times UK
Original Author: Anastasia Matveeva
Translated and Compiled by: Gonka.ai

AI is expanding at an astonishing rate, but its underlying economic logic is far more fragile than it appears on the surface. When three cloud giants control two-thirds of the world's computing power, when training costs surge toward 100 million dollars, and when inference bills catch startups off guard — the true cost of this computing arms race is quietly reshaping the entire value distribution of the AI industry.
This article does not ask who will build the most advanced models. Instead, it explores a more fundamental question: Is the current economic model of AI infrastructure truly sustainable at scale? And how will changes in the way computing power is allocated reshape the distribution of value across the entire market?
1. The Hidden Costs of Intelligence
Training a cutting-edge large model often costs tens of millions or even hundreds of millions of dollars. Anthropic has publicly stated that training Claude 3.5 Sonnet cost “tens of millions of dollars”, while its CEO Dario Amodei has estimated that training the next generation of models may approach 1 billion dollars. According to industry media reports, training GPT-4 may have cost more than 100 million dollars.
However, training costs are just the tip of the iceberg. The real structural pressure comes from inference costs — that is, the expenses incurred every time the model is called. According to OpenAI's publicly available API pricing, inference is charged per million tokens. For high-usage applications, this means that even before scaling, daily inference costs could already reach thousands of dollars.
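To make the order of magnitude concrete, the short sketch below estimates a daily inference bill from per-million-token pricing. The request volume, token counts, and blended rate are illustrative assumptions for this article, not any provider's actual quote.

```python
# Back-of-the-envelope estimate of a daily inference bill.
# All numbers below are illustrative assumptions, not real price quotes.

def daily_inference_cost(requests_per_day: int,
                         tokens_per_request: int,
                         usd_per_million_tokens: float) -> float:
    """Estimated daily spend in USD for a token-metered inference API."""
    total_tokens = requests_per_day * tokens_per_request
    return total_tokens / 1_000_000 * usd_per_million_tokens

# Hypothetical mid-sized app: 200k requests/day, ~2,000 tokens per request,
# at an assumed blended rate of $5 per million tokens.
cost = daily_inference_cost(200_000, 2_000, 5.0)
print(f"Estimated daily inference cost: ${cost:,.0f}")  # ~$2,000 per day
```

Even under these modest assumptions, the bill lands in the thousands of dollars per day, and it grows linearly with usage.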
AI is often described as a kind of software. In economic terms, however, it increasingly resembles capital-intensive infrastructure: high upfront investment combined with ongoing operating expenses.
This shift in economic structure is quietly changing the competitive landscape of the AI industry. Those who can afford computing power are the giants that have already built large-scale infrastructure, while small startups fighting to survive watch their margins steadily eroded by inference bills.
2. Capital Intensity and Market Concentration
According to Holori's 2026 Cloud Market Analysis, AWS currently holds about 33% of the global cloud market share, Microsoft Azure approximately 22%, and Google Cloud about 11%. These three companies together control roughly two-thirds of the global cloud infrastructure, on which the vast majority of AI workloads run.
What this concentration really means is that when OpenAI's API goes down, thousands of products are affected at once, and when a major cloud provider fails, services across industries and regions grind to a halt.
This concentration shows no sign of easing; if anything, infrastructure spending keeps expanding. Nvidia's data center business alone now generates annual revenue exceeding 80 billion dollars, reflecting sustained demand for high-performance GPUs.
More concerning is a hidden structural inequality. According to SEC filings and market reports, leading labs like OpenAI and Anthropic can secure GPU resources at close-to-cost rates of $1.30–$1.90 per hour through multi-billion dollar “equity-for-computing” agreements. In contrast, medium and small enterprises without strategic partnerships with Nvidia, Microsoft, or Amazon are forced to purchase at retail prices exceeding $14 per hour — a premium of up to 600%.
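A rough calculation shows how quickly that gap compounds at fleet scale. The fleet size and round-the-clock utilization below are hypothetical; the hourly rates are the ones cited above.

```python
# Rough comparison of annual GPU rental spend at near-cost vs. retail rates.
# Fleet size is hypothetical; hourly rates are the figures cited in the text.

HOURS_PER_YEAR = 24 * 365

def annual_spend(gpus: int, usd_per_gpu_hour: float) -> float:
    """Annual rental cost in USD, assuming round-the-clock utilization."""
    return gpus * HOURS_PER_YEAR * usd_per_gpu_hour

fleet = 512  # hypothetical cluster size
near_cost = annual_spend(fleet, 1.90)   # insider "equity-for-computing" rate
retail = annual_spend(fleet, 14.00)     # retail on-demand rate

print(f"Near-cost rate: ${near_cost / 1e6:,.1f}M per year")
print(f"Retail rate:    ${retail / 1e6:,.1f}M per year")
print(f"Premium:        {(retail - near_cost) / near_cost:.0%}")  # ~637%
```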
This pricing gap has been driven by Nvidia's recent $40 billion in strategic investments in leading labs. Access to AI infrastructure is increasingly determined by capital-intensive procurement agreements rather than open market competition.
During the early adoption phase, this concentration can look "efficient." At scale, however, it brings pricing risk, supply bottlenecks, and infrastructure dependence, three vulnerabilities that compound one another.
3. The Overlooked Energy Dimension
The cost issue of AI infrastructure has another often-overlooked dimension: energy.
According to data from the International Energy Agency (IEA), data centers currently account for about 1–1.5% of global electricity consumption, while AI-driven demand growth may significantly raise this percentage in the coming years.
This means the economics of computing power cannot be treated purely as a financial question; it is also an infrastructure and energy challenge. As AI workloads keep expanding, the geopolitical weight of electricity supply will only grow: the country that can deliver the most stable computing power at the lowest energy cost gains a structural advantage in the industrial competition of the AI age.
When Jensen Huang announced at GTC26 that Nvidia's order visibility had surpassed 1 trillion dollars, he was describing not just the commercial success of a single company but a larger process in which civilization converts electricity, land, and scarce minerals into intelligent computing power.
4. Rethinking Infrastructure Mechanisms
While centralized data centers continue to expand, another line of exploration is quietly emerging, one that attempts to fundamentally redefine how computing resources are coordinated.
Decentralized Inference: A Structural Alternative
The Gonka protocol is a representative effort in this direction. It is a decentralized network designed for AI inference, and its core design goal is to minimize network synchronization and consensus overhead so that as much computing capacity as possible goes to real AI workloads.
At the governance level, Gonka adopts the principle of “one computing unit, one vote”: governance weight is determined by verifiable computing contributions rather than capital holdings. At the technical level, the protocol uses short performance measurement intervals (known as Sprints) that require participants to demonstrate real GPU computing power in real time through a transformer-based proof-of-work (PoW) mechanism.
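The article does not describe Gonka's implementation in detail, so the sketch below is only a conceptual illustration of the “one computing unit, one vote” idea: governance weight is derived from compute verified within a sprint, not from token holdings. All names and fields here are hypothetical.

```python
# Conceptual sketch of "one computing unit, one vote": voting weight is
# proportional to compute verified during a sprint, not to capital held.
# This is not Gonka's actual implementation; all names are hypothetical.

from dataclasses import dataclass

@dataclass
class SprintResult:
    participant: str
    verified_flops: float  # compute proven via the PoW challenge this sprint
    tokens_held: float     # deliberately ignored when computing voting weight

def governance_weights(results: list[SprintResult]) -> dict[str, float]:
    """Map each participant to a voting share based only on verified compute."""
    total = sum(r.verified_flops for r in results)
    if total == 0:
        return {r.participant: 0.0 for r in results}
    return {r.participant: r.verified_flops / total for r in results}

sprint = [
    SprintResult("node-a", verified_flops=8.0e15, tokens_held=1_000_000),
    SprintResult("node-b", verified_flops=2.0e15, tokens_held=50_000_000),
]
print(governance_weights(sprint))  # node-a: 0.8, node-b: 0.2, regardless of tokens
```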
The point of this design is that nearly 100% of the network's computing power goes to the AI inference workloads themselves rather than being consumed by consensus maintenance, coordination traffic, and other infrastructure overhead.
The Economic Logic of Distributed Computing Power
From an economic perspective, the value proposition of decentralized computing networks operates on three layers.
The first is the cost layer. The pricing structure of centralized cloud service providers inherently includes substantial depreciation of fixed assets, data center operating costs, and shareholder profit expectations. Decentralized networks can significantly reduce this cost by monetizing idle GPU resources. For example, the inference services offered through Gonka's USD billing gateway GonkaGate are priced at about $0.0009 per million tokens — whereas centralized service providers like Together AI charge approximately $1.50 for similar models (such as DeepSeek-R1), demonstrating a gap exceeding a thousandfold.
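Taking the quoted prices at face value, the comparison reduces to simple arithmetic. The monthly token volume below is a hypothetical workload; both prices simply restate the figures cited above and are not independently verified benchmarks.

```python
# Restating the per-million-token prices quoted above as a monthly bill.
# The workload is hypothetical; the prices are the article's figures.

prices_usd_per_m_tokens = {
    "GonkaGate (decentralized)": 0.0009,
    "Together AI (centralized)": 1.50,
}

monthly_tokens = 10_000_000_000  # hypothetical workload: 10B tokens per month

for provider, price in prices_usd_per_m_tokens.items():
    bill = monthly_tokens / 1_000_000 * price
    print(f"{provider}: ${bill:,.2f} per month")

ratio = (prices_usd_per_m_tokens["Together AI (centralized)"]
         / prices_usd_per_m_tokens["GonkaGate (decentralized)"])
print(f"Price ratio: ~{ratio:,.0f}x")  # ~1,667x
```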
The second is the supply elasticity layer. The computing power supply from centralized providers is rigid, with scaling cycles measured in months or even quarters. Participants in decentralized networks can flexibly join or exit according to demand fluctuations, theoretically providing a quicker response to demand peaks — just as Amazon Web Services emerged in response to peak holiday traffic demands, the peaks and valleys of AI inference also require elastic infrastructure to accommodate them.
The third is the sovereignty layer. This dimension is especially pronounced from the perspective of sovereign nations. When a government's public services depend heavily on an external cloud service provider, that computing dependency becomes a strategic vulnerability. Decentralized networks offer a potential way out: local data centers can act as nodes in a global distributed network, preserving data sovereignty while earning sustainable commercial returns by supplying computing power to the global market.
5. The Moment of Value Distribution Reconstruction
Returning to the core question posed at the beginning of the article: Is the current economic model of AI infrastructure sustainable at scale?
The answer is: sustainable for the leading players; increasingly unsustainable for everyone else.
AWS, Azure, and Google Cloud have built a moat through decades of capital accumulation, and their scale advantage is nearly unshakable in the short term. However, this structural advantage also means that pricing power, data access, and infrastructure reliance are highly concentrated in a few private entities.
Historically, each major technological infrastructure monopoly has eventually led to the emergence of alternative distributed architectures — the internet itself was a rebellion against telecom monopolies, BitTorrent disrupted centralized content distribution, and Bitcoin challenged the concentration of currency issuance.
The decentralization of AI infrastructure may not be an ideological choice but an economic inevitability — when the costs of centralization are high enough to drive large-scale user migration, the demand for alternatives will genuinely explode. Jensen Huang's analogy that “every financial crisis drives more people toward Bitcoin” similarly applies to the computing market.
The emergence of DeepSeek has already proven one thing: in a world where open-source models approach the capabilities of closed-source frontier models, inference cost becomes the core variable determining how quickly AI applications scale. Whoever can supply inference computing power at the lowest cost and highest availability holds the entry ticket to this competition.
Conclusion: The Infrastructure War Has Just Begun
The next stage of competition in AI will not be decided on a leaderboard of model capabilities; it will play out in the economics of infrastructure.
The giants that concentrate computing power hold capital and scale advantages but also carry heavy fixed-cost structures and pricing pressure. Decentralized networks are entering the market with extremely low marginal costs but still need to prove they can clear real business thresholds in stability, usability, and ecosystem scale.
The two paths will coexist for a long time and exert pressure on each other. The tension between centralization and decentralization will be one of the most important structural themes to track in the AI industry over the next five years.
This infrastructure war has just begun.
About the Author
Anastasia Matveeva is a senior product manager and researcher at Product Science, and is also one of the co-founders of the Gonka protocol. Her research focuses on machine learning infrastructure, large language model inference, and distributed computing systems.
She graduated with a PhD in Mathematics from the Polytechnic University of Catalonia (UPC Barcelona) and has worked as a researcher and lecturer at the same institution. Since joining Product Science in 2021, she has led the development of a set of AI engineering tools currently adopted by over a hundred engineers and utilized by multiple Fortune 500 companies.
About Gonka.ai
Gonka is a decentralized network aimed at providing efficient AI computing power, designed to maximize the utilization of global GPU resources for meaningful AI workloads. By eliminating centralized gatekeepers, Gonka offers developers and researchers permissionless access to computing resources while rewarding all participants with its native token GNK.
Gonka is incubated by the American AI developer Product Science Inc., founded by Web2 industry veterans including the Liberman siblings, former core product directors at Snap Inc. The company raised $18 million in 2023 and plans to raise an additional $51 million in 2025, with investors including OpenAI backer Coatue Management, Solana backer Slow Ventures, Bitfury, K5, Insight, and Benchmark Partners, among others. Early contributors to the project include well-known companies spanning Web2 and Web3, such as 6blocks, Hard Yaka, and Gcore.
Official Website | Github | X | Discord | Telegram | Whitepaper | Economic Model | User Manual
Disclaimer: This article represents only the personal views of the author and does not reflect the position or views of this platform. It is shared for informational purposes only and does not constitute investment advice to anyone. Any dispute between users and the author is unrelated to this platform. If any article or image on this page involves infringement, please send the relevant proof of rights and proof of identity to support@aicoin.com, and the platform's staff will review the matter.