To grasp the scale of this shift, one must first look at what underlies the models: the "token." It is AI's unit of account, a small building block of text or code. For a sense of scale, 1,000 tokens correspond to roughly 750 words.

The consumption curves crossed in February 2026, according to data from OpenRouter. The platform, which aggregates more than 300 AI models for developers, now serves as the reference barometer of real-world token usage worldwide.

While American models (OpenAI, Google, Anthropic) keep their lead in high-value tasks, Chinese production has moved to a purely industrial scale. According to Liu Liehong, head of the National Data Administration, daily token consumption in China now exceeds 140 trillion: more than 1,000 times the 100 billion processed daily in early 2024, and 40% more than at the end of 2025.
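The orders of magnitude quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures cited in the article; the variable names are illustrative, and the end-of-2025 level is inferred from the 40% increase rather than reported. Because consumption is said to "exceed" 140 trillion, both results are best read as lower bounds.

```python
# Figures quoted in the article, in tokens per day.
EARLY_2024 = 100e9             # 100 billion
NOW = 140e12                   # 140 trillion (stated as a floor)
GROWTH_SINCE_END_2025 = 0.40   # 40% increase

# Current level as a multiple of the early-2024 level.
multiple = NOW / EARLY_2024

# Implied end-of-2025 level: an inference, not a reported figure.
implied_end_2025 = NOW / (1 + GROWTH_SINCE_END_2025)

print(f"Multiple of early 2024: {multiple:,.0f}x")
print(f"Implied end-2025 level: {implied_end_2025 / 1e12:.0f} trillion tokens/day")
```

On these figures the multiple comes out well above the "more than 1,000 times" threshold quoted in the text.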

The Agentic AI Tidal Wave


This explosion in consumption is driven by the arrival of agents. Unlike traditional chatbots, these programs are designed to carry out a single, recurring task, such as real-time calendar optimization or automated drafting of meeting reports. By following a precise decision path to fulfill a specific command, agentic AI becomes a true industrial production tool. The change in scale is stark: while summarizing a simple text takes about 30,000 tokens, a programming or data analysis job entrusted to an agent can consume up to 20 million in a single iteration, according to the Financial Times.
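To make the jump concrete, the following minimal sketch compares the two workloads using the article's own figures, plus the rule of thumb of 750 words per 1,000 tokens. The constant names are illustrative, and the 20 million figure is an upper bound for one agent iteration, not an average.

```python
SUMMARY_TOKENS = 30_000        # summarizing a simple text (article's figure)
AGENT_TOKENS = 20_000_000      # upper bound for one agent iteration (FT figure)
WORDS_PER_1000_TOKENS = 750    # rule of thumb quoted in the article

# How many simple summaries fit in one agent iteration.
ratio = AGENT_TOKENS / SUMMARY_TOKENS

# Rough text volume handled by that single iteration.
agent_words = AGENT_TOKENS * WORDS_PER_1000_TOKENS / 1000

print(f"One agent iteration ~ {ratio:.0f} summaries")
print(f"Equivalent text volume ~ {agent_words / 1e6:.0f} million words")
```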

In this context, price per token becomes the deciding factor, and developers have taken note. Terry Zhang, a Hong Kong-based professional, explains that he has shifted 80% of his workload to the Kimi model (Moonshot), reserving American models for critical tasks only. The shift shows in the rankings: since late February, Chinese models have held the top three spots worldwide by call volume on OpenRouter. The leader, MiniMax M2.5, posted 476% usage growth in one month.

Price War: The 1-to-10 Cost Differential Freezing the Market


Price remains the crux of the battle. While Chinese models like DeepSeek V3.2 or MiniMax M2.5 are priced between $0.10 and $0.30 per million tokens, the American competition operates on a different scale. The recent GPT-5.4, launched in early March, and Claude 4.6 Sonnet cost between $0.90 and $3 per million tokens for input (the data sent to the AI, such as a question or document), and up to $15 for output (the response generated by the AI). This 1-to-10 ratio is freezing the market. In concrete terms, for the price of a single response generated by a high-end American model, a company can process 10 times the data volume with Chinese solutions, making competition nearly impossible on large-scale projects.
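The price gap is easiest to see on a fixed workload. The sketch below is illustrative only: it applies the per-million-token prices quoted above to a hypothetical 20-million-token job (the agent figure cited earlier) and assumes, for simplicity, that the whole job is billed at a single rate. The function name `cost_usd` and the labels are not taken from any provider's API.

```python
def cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost in USD of processing `tokens` at a given price per million tokens."""
    return tokens / 1_000_000 * price_per_million

# Hypothetical workload: the 20M-token agent run cited in the article.
WORKLOAD = 20_000_000

# USD per million tokens, as quoted in the article.
PRICES = {
    "Chinese models (low)": 0.10,
    "Chinese models (high)": 0.30,
    "US models, input (low)": 0.90,
    "US models, input (high)": 3.00,
    "US models, output (max)": 15.00,
}

for label, price in PRICES.items():
    print(f"{label:<26} ${cost_usd(WORKLOAD, price):>7.2f}")

# Input-price ratios behind the 1-to-10 figure.
low_ratio = PRICES["US models, input (low)"] / PRICES["Chinese models (low)"]
high_ratio = PRICES["US models, input (high)"] / PRICES["Chinese models (high)"]
print(f"Input price ratio: {low_ratio:.0f}x to {high_ratio:.0f}x")
```

On these numbers the 1-to-10 ratio describes input pricing; the gap widens further when the American output price is set against the Chinese range.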

This Chinese lead is politically driven. During the "Two Sessions" in March 2026 (Beijing's major annual legislative meeting), the government made "compute-electricity synergy" a priority. According to Liu Liehong, converting raw electricity into AI services multiplies its export value by 22. To understand the stakes: where China earns $1 by selling raw electricity to its neighbors, it can collect up to $22 by using that same power to run its servers and sell AI responses (tokens) to international clients.

Energy Brute Force at the Service of Silicon

This strategy relies on an absence of energy constraints. According to SemiAnalysis, China has added the equivalent of the entire US electrical grid to its own since 2011, allowing it to prioritize scale over efficiency. Huawei's CloudMatrix 384 illustrates the trade-off: unable to produce a chip as powerful as Nvidia's, China compensates with numbers, linking 384 processors through a fully optical interconnect. The system delivers 300 PFLOPs of raw compute (one petaflop is one quadrillion floating-point operations per second), nearly double Nvidia's GB200, but at the cost of 4.1x the electricity consumption.
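The efficiency penalty implied by these two ratios can be estimated directly. This is a rough sketch: it rounds "nearly double" up to exactly 2x, which is an assumption, and takes the 4.1x power figure as quoted. Since the real compute ratio is somewhat below 2, the true penalty would be slightly larger than the result shown.

```python
COMPUTE_RATIO = 2.0   # assumption: "nearly double" rounded up to 2x
POWER_RATIO = 4.1     # electricity consumption relative to the GB200 system

# Energy spent per unit of compute, relative to the Nvidia system.
energy_per_flop_ratio = POWER_RATIO / COMPUTE_RATIO

print(f"Energy per FLOP: about {energy_per_flop_ratio:.2f}x the Nvidia system")
```

In other words, each unit of compute costs roughly twice the electricity, which only makes economic sense where power is cheap and plentiful.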

In a country where power is abundant, this inefficiency becomes an industrial weapon for sustaining record token volumes. The airtightness of the hardware blockade, however, is being questioned. Suspected repackaging networks for critical components such as high-bandwidth memory (HBM) point to a porosity that, combined with mixture-of-experts (MoE) architectures, would explain the technical viability of this mass production. This pairing of brute force and the gray market keeps prices 10 times lower than Silicon Valley's, at the cost of a record energy intensity that is openly accepted as the price of freezing the global market.

Hardware Independence Facing the Reliability Challenge


Finally, one strategic bottleneck remains: hardware independence. To meet the competition, Alibaba unveiled its XuanTie C950 processor earlier this month, built on a 5-nanometer process and specifically dedicated to AI agents. The processor is based on the RISC-V architecture, an open standard that allows China to bypass Western technology licenses. The group has also reorganized its teams within the Alibaba Token Hub (ATH), a new division led by CEO Eddie Wu.

But this race for efficiency carries major financial risks. In February 2026, Knowledge Atlas Technology, better known as Zhipu AI and the first of the six Chinese "AI Tigers" to go public in Hong Kong, in January, buckled under the weight of global demand. Investors reacted harshly: a major service degradation was followed by a drop of nearly 23% in its share price in a single session, according to the FT. Within hours, some $10bn in market capitalization was wiped out, according to the South China Morning Post, illustrating the fragility of these ultra-optimized architectures in the face of record consumption peaks.

This crash, together with the incident involving Alibaba's ROME agent, which diverted GPU power to mine cryptocurrency, illustrates the potential security flaws of these low-cost systems. The objective nevertheless remains clear: to turn AI into a mass export service through platforms like Accio Work, which lets companies entrust commercial tasks to AI agents. With daily consumption now exceeding 140 trillion tokens in China, the battle for control of the world's digital fuel has only just begun.