
Goldman Sachs Deep Report: Coming Turning Point - Decoding the AI Agent Economy

Foresight News
2 hours ago
The decline in underlying computing costs has outpaced the drop in token prices, and the inflection point for gross margins among ultra-large cloud vendors and large model providers may arrive in the next 3 to 12 months.

Written by: Bu Shuqing

Source: Wall Street Journal

Agentic AI is shifting the artificial intelligence industry's narrative from cost to profit. Goldman Sachs believes that with token consumption poised for a leap in growth, underlying computing costs are now falling faster than token prices, and the gross-margin inflection point for ultra-large cloud vendors and large model providers may arrive in the next 3 to 12 months.

According to the Wind Trading Desk, Goldman Sachs released a report on May 5 projecting that by 2030, consumer- and enterprise-side AI agents will collectively drive a 24-fold increase in global token consumption over 2026 levels, reaching approximately 120 trillion tokens per month; if enterprise agents reach peak adoption by 2040, the multiple expands further to 55 times.

Meanwhile, Goldman Sachs' inferred price and cost curves show that token pricing for mainstream large models has stabilized after declining about 40% per year, and in some cases has even ticked up slightly, while the computing cost per token of chips from Nvidia, AMD, Google (TPU), and Trainium continues to fall at 60% to 70% per year, opening up profit space for the industry. Large-scale capital expenditures on AI infrastructure may thus gain more sustainable economic support from improving profit margins.

Token Economics Inflection Point: Cost Declines Faster than Prices, Profit Space is Opening Up

The core argument of the Goldman Sachs report is that the AI industry is transitioning from a stage of "uncertain inference economics that could dilute profits" to a new stage of "token increments garnering attractive marginal profits."

In the first phase of the AI cycle, investors generally viewed computing power and tokens as cost drivers—more usage meant more inference load, more accelerators, more electricity, and higher capital expenditures. Goldman Sachs' inferred price and cost curves, however, indicate that this logic is changing.

Although the token pricing of mainstream large models has significantly decreased, it has now stabilized, and in some situations, there has even been a rebound; meanwhile, the total cost per token of Nvidia, Google TPU (Broadcom), AMD, and Trainium (Marvell) continues to drop rapidly and consistently. If token pricing stabilizes at levels above token costs, the increase in the adoption rate of agentic AI will lead to positive profit expansion rather than just revenue growth.
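The arithmetic behind this claim can be sketched in a few lines. The starting price and cost below are illustrative assumptions, not figures from the report; only the decline rates echo the article (pricing roughly stable, per-token compute cost falling 60% to 70% per year).

```python
# Sketch of the report's core claim: if token prices have stabilized while
# per-token compute costs keep falling ~60-70%/year, the spread (gross
# margin per token) widens each year.
# Starting values ($/million tokens) are illustrative assumptions.

def project(price0, cost0, price_decline, cost_decline, years):
    """Return (year, price, cost, margin) per million tokens for each year."""
    rows = []
    price, cost = price0, cost0
    for year in range(years + 1):
        rows.append((year, price, cost, price - cost))
        price *= (1 - price_decline)
        cost *= (1 - cost_decline)
    return rows

# Assumed: price near its floor (0% further decline, per the "stabilized"
# observation); cost falling 65%/year (midpoint of the 60-70% range).
for year, price, cost, margin in project(2.00, 1.50, 0.0, 0.65, 3):
    print(f"year {year}: price ${price:.2f}/M, cost ${cost:.2f}/M, margin ${margin:.2f}/M")
```

Under these assumptions the per-token margin more than triples within three years, which is the mechanism behind the "positive profit expansion rather than just revenue growth" argument.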

Goldman Sachs further points out that agentic AI may form a self-reinforcing economic flywheel: lower computing costs per token generate richer, more complex agents; richer agents consume more tokens through longer contexts, more iterations, more validation, and continuous monitoring; higher utilization rates improve the economics of AI infrastructure, thereby supporting providers' ongoing investments in model quality and distribution capability. Goldman Sachs believes that this flywheel is drastically different from the mainstream narrative that "AI use will lead to unsustainable cost burdens."

However, Goldman Sachs also warns of risks: not all AI workloads are guaranteed to reach a positive profit inflection point. For highly commoditized pure-text chatbots, competition may still force token pricing to fall faster than computing costs.

Consumer-side Agents: From Fragmented Conversations to "Constant" Assistants, Token Consumption Will Increase by 12 Times

Goldman Sachs estimates that by 2030, consumer-side AI agents can increase global token consumption by 12 times, adding approximately 60 trillion tokens per month.

The report categorizes consumer-side agents into two types: "on-demand" agents, like OpenAI Operator, Claude Code, and other browser-based agents, which autonomously plan, execute, and return results after users initiate tasks; and "constant" agents, like email monitoring, schedule management, or digital life assistants that continuously run in the background. Goldman Sachs believes that the biggest leap in token consumption will occur when agents transition from user-initiated tasks to continuous background operations—where agents continuously monitor context and take proactive actions when necessary.

According to simulated data, an ordinary LLM chatbot consumes about 1,000 tokens per session, an embedded Copilot over 5,000 tokens per day, and a constant-type agent more than 100,000 tokens per day.

Goldman Sachs estimates that daily AI query volume will increase from approximately 5 billion in 2025 to around 23 billion by 2030, with up to 30% flowing into agents in areas like search, shopping, travel, email, and personal productivity. Meanwhile, the share of traditional search engines in query volume is expected to drop from 68% in 2025 to 36% in 2030, while LLM native applications' share will rise from 12% to 31%.
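As a quick back-of-envelope consistency check on these consumer-side figures (my arithmetic, not part of the report):

```python
# Implied growth rate of daily AI queries, 2025 -> 2030, and the upper bound
# on agent-routed queries, using only the figures quoted in the article.

queries_2025, queries_2030 = 5e9, 23e9
cagr = (queries_2030 / queries_2025) ** (1 / 5) - 1   # ~36%/year implied
agent_queries_2030 = queries_2030 * 0.30              # "up to 30%" upper bound

print(f"implied query CAGR: {cagr:.1%}")
print(f"agent-routed queries in 2030: {agent_queries_2030 / 1e9:.1f}B/day")
```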

Enterprise-side Agents: Workflow Complexity Drives Token Intensity, Consumption May Reach 55 Times by 2040

Goldman Sachs expects that enterprise-side AI agents will become the largest token multiplier, driving a 24-fold increase in global token consumption by 2030 and further rising to 55 times at peak adoption in 2040, at which point enterprise workloads will account for more than 70% of total global token usage.

Enterprise-side agents are more token-intensive than consumer-side agents because of the complexity and precision their workflows demand—monitoring tasks, retrieving context, inferring anomalies, validating outputs, updating systems, and continuously reporting issues throughout the workday. Enterprise agents also often involve heavier multimodal inputs (voice, images, documents, screen activity, application data, logs, and structured system records), which significantly increases token intensity.

Goldman Sachs quantified the token consumption of simulated agents across different professions.

The results indicate that a programming agent consumes about 7 million tokens daily at an API cost of around $13 per day, far below labor costs, which explains why agent adoption is fastest in software development. A call-center agent consumes about 2 million tokens daily, but if it relies on real-time voice processing, costs can soar to $92 per day, making full voice automation economically uncompetitive. A data-entry agent consumes about 25 million tokens daily at about $60 per day, still below labor costs.
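The per-day dollar figures follow from simple arithmetic: daily token volume times a blended API rate. The rates below are values backed out from the article's own numbers, not quoted prices:

```python
# Daily API cost = tokens per day x blended rate per million tokens.
# Token volumes come from the article; the ~$1.9-2.4/M rates are implied
# values inferred from its dollar figures, not published price sheets.

def daily_api_cost(tokens_per_day, usd_per_million_tokens):
    return tokens_per_day / 1e6 * usd_per_million_tokens

# Programming agent: ~7M tokens/day at an implied ~$1.86/M -> ~$13/day.
print(round(daily_api_cost(7_000_000, 1.86)))
# Data-entry agent: ~25M tokens/day at an implied ~$2.40/M -> ~$60/day.
print(round(daily_api_cost(25_000_000, 2.40)))
```

The same function makes the voice caveat concrete: at a real-time-voice rate an order of magnitude higher, the 2M-token call-center workload jumps from a few dollars to the $92-per-day range the report cites.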

Goldman Sachs points out that the adoption rate of enterprise-side agents will depend on four variables: token volume, API costs, modal combinations, and implementation complexity. Workflows that are text-based and have a mature tool ecosystem will scale first; those primarily focused on voice or deeply integrated with backend systems may progress more slowly.

From the adoption-curve perspective, Goldman Sachs believes enterprise-side agentic AI is most likely to follow an S-curve, with peak adoption of about 35% to 40% among knowledge workers reached in approximately 15 years, faster than the historical median for technology diffusion (29 years).
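A minimal logistic sketch of that S-curve: only the ~37.5% peak (midpoint of 35% to 40%) and the ~15-year horizon come from the article; the midpoint year and steepness are assumptions chosen so the curve is near its ceiling at year 15.

```python
import math

# Logistic S-curve for enterprise agent adoption among knowledge workers.
# peak, midpoint, and steepness are parameters; only peak ~0.375 and the
# ~15-year horizon are grounded in the report's figures.

def adoption(year, peak=0.375, midpoint=7.5, steepness=0.6):
    """Share of knowledge workers using agents `year` years in."""
    return peak / (1 + math.exp(-steepness * (year - midpoint)))

for year in (0, 5, 10, 15):
    # rises from under 1% toward the ~37.5% ceiling by year 15
    print(f"year {year:2d}: {adoption(year):.1%}")
```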

Sustainability of Capital Expenditures: Profit Improvement Provides Greater Space for Ultra-large Cloud Vendors

A key investment conclusion from Goldman Sachs' report is that improving profit margins for ultra-large cloud vendors will make the current high infrastructure investments more sustainable, thereby alleviating market concerns over the return on AI capital expenditure.

The report notes that operators are still supply-constrained in meeting current and future computing demands, with both Google and Meta raising their capital expenditure forecasts for the fiscal year 2026. Amazon's management also reiterated their strategy of maintaining high capital expenditures after their first-quarter earnings report. Goldman Sachs expects that as the profit inflection point approaches, investors will increasingly seek evidence of return visibility.

Regarding specific names, Goldman Sachs' core logic for Amazon lies in reaccelerating AWS revenue growth (28% year-on-year in the first quarter) and a $364 billion order backlog; its view on Google rests on cloud revenue growing 63% year-on-year in the first quarter, with the backlog nearly doubling to about $460 billion; its judgment on Meta is based on advertising growth significantly outpacing the overall digital advertising industry, plus AI computing power's ongoing contribution to user engagement and ad monetization.

In the software sector, Goldman Sachs believes that lower token costs make it easier for software vendors to embed agents into existing products without significantly impacting gross margins, while also supporting pricing based on outcomes, productivity, or work units rather than merely seat counts, thereby expanding the addressable market for software. For IT service companies, as agents transition AI consumption from standalone tools to enterprise-level, high-integration workflow transformations, the demand for integration, governance, and orchestration will significantly increase, with Accenture seen as a major beneficiary of this trend.

