
Claude Code quota evaporates into thin air at 20x the normal rate, and the official advice is "use it sparingly"

2026/04/04 02:28
👤ODAILY

A full record of the Claude Code cache bug


4-17%. That was Claude Code's cache read rate over the past month. The normal level is 97-99%.

This means that when you resume a previous session, Claude Code does not pick up the context it already processed. Instead, it reprocesses the entire history from the top every time, consuming 10-20 times the normal amount of tokens. You think you are continuing a conversation; in reality, you are starting a brand-new one every time.

This figure comes from proxy monitoring by the independent developer ArkNill. By setting up a transparent proxy, he logged every request between Claude Code and the Anthropic API, and found at least two client-side cache bugs that prevented the API server from matching the cached conversation prefix, forcing a complete token rebuild on every round.
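A cache read rate like ArkNill's can be derived from the `usage` object that the Anthropic Messages API attaches to each response. The sketch below uses the documented usage field names, but the sample numbers are illustrative, not ArkNill's actual data:

```python
# Hedged sketch: computing a cache read rate from the `usage` object the
# Anthropic Messages API returns with each response. The field names
# (input_tokens, cache_read_input_tokens, cache_creation_input_tokens) are
# the documented usage fields; the sample numbers below are made up.

def cache_read_rate(usages):
    """Fraction of prompt tokens served from the cache across requests."""
    read = sum(u["cache_read_input_tokens"] for u in usages)
    total = sum(
        u["input_tokens"]
        + u["cache_read_input_tokens"]
        + u["cache_creation_input_tokens"]
        for u in usages
    )
    return read / total if total else 0.0

# A healthy session: almost every round reuses the cached prefix.
healthy = [
    {"input_tokens": 500, "cache_read_input_tokens": 60_000, "cache_creation_input_tokens": 500},
    {"input_tokens": 400, "cache_read_input_tokens": 61_000, "cache_creation_input_tokens": 400},
]

# A bugged session: only the system prompt ever hits the cache, and the
# whole conversation history is rewritten into the cache every round.
bugged = [
    {"input_tokens": 500, "cache_read_input_tokens": 14_500, "cache_creation_input_tokens": 88_000},
    {"input_tokens": 400, "cache_read_input_tokens": 14_500, "cache_creation_input_tokens": 88_000},
]

print(f"healthy: {cache_read_rate(healthy):.1%}")  # healthy: 98.5%
print(f"bugged:  {cache_read_rate(bugged):.1%}")   # bugged:  14.1%
```

The bugged example is constructed so the cache reads stay pinned at the ~14,500-token system prompt described later in the article, which is what drives the hit rate into the 4-17% band.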

The figure above compares the cache read rate across three stages. Between v2.1.69 and v2.1.89 (the bug's lifetime), the bugged versions managed a cache read rate of only 4-17%. After v2.1.90 fixed one of the key bugs, the cold-start cache read rate recovered to 47-99.7%. By v2.1.91, the steady-state cache read rate was back to 97-99%.

One detail in the chart is worth noting: the range for v2.1.90 is very wide (47-99.7%) because a resumed conversation still needs to "warm up", with low hits in the first rounds before quickly returning to normal levels. In the bugged versions, that warm-up never happens: cache reads stay stuck at the roughly 14,500 tokens of the system prompt, and the entire conversation history is billed at full price every round.


28 days, 20 versions

This bug is not the kind that gets introduced in one update and fixed in the next. According to the npm registry's release history, v2.1.69, which introduced the bug, was published on March 4, and v2.1.90, which fixed it, was published on April 1. 28 days, with 20 versions in between.

The timeline reveals an interesting detail. When the bug landed on March 4, users did not immediately complain en masse; complaints did not erupt until March 23, almost three weeks later. The reason, according to GitHub issue #41930, is that from March 13 to 28 Anthropic ran a promotion doubling limits (2x quota during off-peak hours), which objectively masked the bug's impact. Once the promotion ended, the cache bug's consumption fell back onto the normal billing baseline, and users' quotas "evaporated" in an instant.

Anthropic's response was slow in coming. On March 26, three days after the complaints erupted, engineer Tariq Shihipar announced on his personal X account that the limits for peak hours (5am-11am PT) had been tightened. On March 30, Anthropic admitted on Reddit that "users reached the limit at a much faster rate than expected" and said the issue had been given the team's highest priority. Not until April 1 did team member Lydia Hallie release the official findings.

Throughout the process, Anthropic published no blog post, sent no emails, and did not update its status page. All official communication happened through engineers' personal social media posts and a handful of Reddit comments.


How much was lost, and how fast?

GitHub issue #41930 collects hundreds of user reports. The most extreme case is a Max 20x subscriber ($200/month) whose five-hour rolling window was completely depleted in 19 minutes. A Max 5x user ($100/month) reported the 5-hour window running out in 90 minutes. Another report says a single "hello" consumed 13% of a session's quota. One Pro user ($20/month) said on Discord that his quota "ran out on Monday and did not reset until Saturday", leaving only 12 usable days out of 30.

According to ArkNill's benchmark tests, on the bugged v2.1.89 the full quota of a Max 20x plan is exhausted in about 70 minutes. He also calculated the cost of a single resume operation for a session with a 500K-token context at approximately $0.15, since the entire prompt has to be reprocessed.
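Figures like these can be sanity-checked with simple arithmetic. A minimal sketch, assuming the published per-million-token cache prices for a Sonnet-class Anthropic model; both the prices and the 500K context size are illustrative assumptions, not ArkNill's verified methodology:

```python
# Hedged back-of-the-envelope check of the resume cost, assuming
# Sonnet-class pricing: cache read $0.30 per million tokens, cache write
# $3.75 per million tokens. These numbers are assumptions for illustration.

CONTEXT_TOKENS = 500_000
PRICE_CACHE_READ_PER_M = 0.30   # $ per million tokens served from cache
PRICE_CACHE_WRITE_PER_M = 3.75  # $ per million tokens rewritten into cache

# Healthy resume: the whole conversation prefix hits the cache.
cost_cached = CONTEXT_TOKENS / 1_000_000 * PRICE_CACHE_READ_PER_M

# Bugged resume: a full cache miss re-bills the entire prefix as a cache write.
cost_miss = CONTEXT_TOKENS / 1_000_000 * PRICE_CACHE_WRITE_PER_M

print(f"cached resume:    ${cost_cached:.2f}")  # cached resume:    $0.15
print(f"full-miss resume: ${cost_miss:.2f}")    # full-miss resume: $1.88
print(f"cost multiplier:  {cost_miss / cost_cached:.1f}x")
```

Under these assumed prices, every bugged resume costs roughly an order of magnitude more than a cached one, which is consistent with the 10-20x consumption users reported.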


"You're holding it wrong"

Lydia Hallie's findings confirmed two points: the peak-hour limits had indeed been tightened, and the 1-million-token context does increase per-conversation consumption. She acknowledged that the team had fixed some bugs, but emphasized that "none of the bugs resulted in extra charges".

She then made four recommendations:

1. Use Sonnet 4.6 instead of Opus (Opus consumes roughly twice as much)

2. Reduce extended thinking, or turn it off when deep reasoning is not required

3. Do not resume a long session that has been idle for more than an hour; open a new session instead

4. Set the environment variable CLAUDE_CODE_AUTO_COMPACT_WINDOW=200000 to limit the size of the context window.
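If the variable name above is accurate (it is as reported here, not verified against Claude Code's current documentation), it can be set for a single shell session like this:

```shell
# Set the reported auto-compact threshold for the current shell session.
# CLAUDE_CODE_AUTO_COMPACT_WINDOW is the variable name as reported above;
# verify it against the current Claude Code docs before relying on it.
export CLAUDE_CODE_AUTO_COMPACT_WINDOW=200000
echo "$CLAUDE_CODE_AUTO_COMPACT_WINDOW"
```

To make the setting persistent, the same `export` line would go in the shell's startup file (e.g. `~/.bashrc` or `~/.zshrc`).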

No mention was made of any form of quota restoration or compensation.

AI podcast host Alex Volkov summed up the response as "You're holding it wrong", pointing out that Anthropic itself made the 1-million-token context the default, promoted Opus as its flagship model, sold these capabilities as selling points, and is now advising paying users not to use them.

The "no extra charges" claim also sits uneasily with Claude Code's own changelog. Just one day before Lydia posted her response, v2.1.90 fixed a cache bug introduced in v2.1.69: when resuming a session with --resume, requests that should have hit the cache instead triggered a full-price cache miss on the entire prompt. This confirmed billing anomaly went unmentioned in Lydia's response.

By contrast, OpenAI's Codex ran into similar abnormal-consumption problems earlier. OpenAI's approach was to reset user quotas, issue credits, and announce in March that the Codex limits would be lifted. Anthropic's approach is to advise users to downgrade models, turn off features, and limit context, attributing the problem to how users use the product.

Anthropic sells subscriptions priced from $20 to $200 per month on the promise of "the strongest model + the largest context + the deepest reasoning". A 28-day cache bug let paying users' quotas evaporate at 10-20 times the normal rate, and the official response boils down to "use it sparingly".
