深潮TechFlow | Feb 11, 2026 01:51
The next earthquake in AI: why is the real danger not the SaaS killers, but the computing power revolution?

Recently, the entire technology and investment community has been watching the same thing: how AI applications are "killing" traditional SaaS. Ever since @AnthropicAI's Claude Cowork demonstrated how easily it can write your emails, build your PPTs, and analyze your Excel spreadsheets, a "software is dead" panic has been spreading.

This is indeed scary, but if your gaze stops here, you may be missing the real earthquake. It is as if all of us were looking up at a drone dogfight in the sky while no one notices that the continental plate beneath our feet is quietly moving.

The real storm is hidden beneath the surface, in a corner most people cannot see: the computing power foundation that supports the entire AI world is undergoing a "silent revolution". And this revolution may end the grand party so carefully hosted by NVIDIA (@nvidia), the "shovel seller" of AI, earlier than anyone imagined.

Two intersecting paths of revolution

This revolution is not a single event. It is woven from two seemingly independent technological routes, which act like two armies closing in from both sides, forming a pincer attack on Nvidia's GPU dominance.

The first path is the algorithmic slimming revolution.

Have you ever wondered whether a super brain really needs to mobilize every brain cell when it thinks about a problem? Obviously not. DeepSeek grasped this and built its models on the MoE (Mixture of Experts) architecture.

You can picture it as a company with hundreds of experts from different fields. Each time a meeting is called to solve a problem, only the two or three most relevant people are invited, rather than having everyone brainstorm together. This is the cleverness of MoE: it lets a large model activate only a small portion of its "experts" for each computation, greatly saving computing power.
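To make the "invite only the two or three most relevant experts" idea concrete, here is a minimal sketch of top-k expert routing in plain Python. It is an illustration under stated assumptions, not DeepSeek's implementation: the names `route_top_k` and `moe_forward`, the 8 toy experts, and the random gate scores are all hypothetical. In a real model the gate scores come from a learned router applied to the input.

```python
import math
import random


def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, gate_scores, k):
    """Run only the selected experts and blend their outputs by weight."""
    chosen = route_top_k(gate_scores, k)
    output = sum(w * experts[i](x) for i, w in chosen)
    return output, [i for i, _ in chosen]


# Toy setup (assumption): 8 experts, each just scales its input.
NUM_EXPERTS = 8
TOP_K = 2
experts = [lambda x, s=i + 1: s * x for i in range(NUM_EXPERTS)]

# Stand-in for a learned router: random scores, fixed seed for repeatability.
random.seed(0)
gate_scores = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

output, active = moe_forward(1.0, experts, gate_scores, TOP_K)
active_fraction = len(active) / NUM_EXPERTS
print(f"active experts: {active}, share of experts used: {active_fraction:.0%}")
```

Only the chosen experts are ever called, so the compute per input scales with `TOP_K` rather than with `NUM_EXPERTS`. That is the sense in which a model's total size and its per-query cost can come apart.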
The results speak for themselves. DeepSeek-V2 nominally has 236 billion parameters (its "experts"), but only 21 billion of them are activated for each token it processes, less than 9% of the total. Yet its performance is comparable to that of GPT-4, which is assumed to run at 100% full power. What does this mean? The capability of an AI is being decoupled from the computing power it consumes!

In the past, we all assumed that the stronger the AI, the more cards it burned. Now DeepSeek shows that with clever algorithms, the same effect can be achieved at one tenth of the cost. This directly puts a huge question mark over the must-have status of Nvidia GPUs.

The second path is the hardware "lane change" revolution.

AI work is divided into two stages: training and inference. Training is like going to school, requiring the model to read tens of thousands of books. Here the GPU, a parallel computing card that wins by brute force, really is useful. But inference, our everyday use of AI, places more emphasis on response speed.

The GPU has an inherent flaw in inference: its memory (HBM) sits outside the compute chip, so data incurs latency travelling back and forth. It is like a chef whose ingredients are stored in a refrigerator next door: every time he stir-fries a dish he has to run over to fetch them, and no matter how fast he runs, it is never fast enough. Companies such as Cerebras and Groq have started from scratch and designed dedicated inference chips that build the memory (SRAM) directly into the chip, so the ingredients are always at hand, achieving "zero latency" access.

The market has already voted with real money. OpenAI, while complaining about the inference performance of Nvidia's GPUs, turned around and signed a $10 billion contract with Cerebras specifically to rent its inference services. Nvidia panicked as well and spent $20 billion on Groq so as not to fall behind on this new track.
When the two roads intersect: a cost avalanche

Now, let's put these two things together: take an algorithmically "slimmed-down" DeepSeek model and run it on a "zero latency" Cerebras chip. What will happen?

A cost avalanche.

First, the slimmed-down model is very small and can be loaded entirely into the chip's built-in memory at once. Second, without the bottleneck of external memory, the AI's response speed becomes astonishingly fast. The end result: training costs fall by 90% thanks to the MoE architecture, and inference costs fall by another order of magnitude thanks to dedicated hardware and sparse computation. By this calculation, the total cost of owning and running a world-class AI may be only 10% to 15% of that of a traditional GPU solution.

This is not an improvement; it is a paradigm shift.

The rug is quietly being pulled from under NVIDIA's throne

Now you should understand why this is more deadly than the "Cowork panic".

NVIDIA's trillion-dollar market value today is built on a simple story: AI is the future, and the future of AI must rely on its GPUs. But the foundation of this story is now being shaken.

In the training market, even if Nvidia keeps its monopoly, the overall size of the market may shrink significantly if customers can get the job done with one tenth of the cards.

In the inference market, which is ten times larger than training, NVIDIA not only lacks an absolute advantage but also faces encirclement by a host of formidable rivals such as Google and Cerebras. Even its biggest customer, OpenAI, is defecting.

Once Wall Street realizes that Nvidia's "shovel" is no longer the only option, or even the best one, what will happen to a valuation built on the expectation of a "permanent monopoly"? I think everyone knows the answer.
So the biggest black swan of the next six months may not be some AI application killing off an incumbent, but a seemingly insignificant piece of technology news: a new paper on the efficiency of MoE algorithms, say, or a report showing a sharp rise in the market share of dedicated inference chips, quietly announcing that the computing power war has entered a new stage.

When the "shovel seller's" shovel is no longer the only choice, his golden age may also be coming to an end.