Anthropic Acknowledges Excessive Charges as Users Report Up to 20x Overbilling

Anthropic's Claude Code faces backlash for excessive billing, with users reporting charges up to 20 times higher than expected due to bugs.

Anthropic Acknowledges Billing Issues

The issue of excessive charges in Claude Code is not an isolated experience. Following a wave of complaints from Reddit users about Claude Code’s excessive billing, Anthropic has finally responded:

We have noticed that users are reaching the usage limits in Claude Code much faster than expected. The team is urgently investigating this issue, which is currently our highest priority, and we will update you as soon as possible.

In short: there is a problem, it is significant, and they are working on it.

Interestingly, many users see this less as a genuine official response than as a forced admission.

The Trigger

The situation escalated when a user reverse-engineered Claude Code, discovered two independent bugs, and posted the findings on Reddit. These bugs can silently break the prompt cache, inflating billing by 10 to 20 times without the user’s awareness.

So, it’s not that users are overusing the service; rather, Claude Code is “secretly overcharging.”

Insufficient Plans

Tokens are often likened to the utilities of a new era, but paying for them feels more like early phone bills or data plans: it seems like you haven’t used much, yet it’s never enough.

A subscription costing over $100 a month is already not cheap, and now it appears to be overcharging on top of that. Users report that a simple greeting can consume 13% of the quota.

Working for just 11 minutes can burn through 23% of the allowance.

The most extreme report: a single prompt consumed 31% of the quota.

Even the top-tier $200-a-month plan fares little better, hitting its limit in just three and a half hours.

A user subscribed to Claude Pro (annual fee of $200) complained on Discord:

I usually hit my limit by Monday and have to wait until Saturday for it to reset… I can only use it for about 12 days out of 30.

At this rate, Claude Code becomes nearly “unusable” within a couple of days each cycle. For those who rely on AI for work, losing access is more frustrating than the financial cost.

This issue isn’t unique to Anthropic; many users report that the major providers often exhaust monthly quotas within the first three weeks, even without heavy usage.

Systemic Problem

However, this time Claude Code seems to have crossed a line: Reddit is flooded with complaints about the issue.

It is now clear that quotas running out too fast is not just an individual experience but a systemic problem. No wonder Anthropic felt compelled to respond.

According to analysis by The Register, there may be three main reasons for this issue:

  1. Last week, Anthropic announced that it would reduce quotas during peak times. During peak hours, the same usage now draws down a smaller available quota, so it feels like it runs out faster.

    • Peak hours are from 08:00 to 14:00 ET on weekdays.
  2. March 28 was also the last day of a Claude promotion under which users could double their usage quota during off-peak hours. With the promotion over, quotas have reverted to normal levels, which reads as a noticeable cut in available usage.

  3. The two bugs discovered by Reddit users:

    • A sentinel-string replacement mechanism in the standalone binary rewrites prompt content mid-conversation, breaking the cache and disrupting the billing logic.
    • The `resume` parameter always causes a cache miss (since v2.1.69).

Together, these bugs prevent the prompt cache from working, so the same context is re-billed at full price on every request, silently inflating token consumption by a factor of 10 or more.
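To see why a broken cache is so expensive, here is a back-of-the-envelope sketch. The dollar prices, token counts, and the `turn_cost` function are all illustrative assumptions, not Anthropic's actual billing code; the multipliers reflect Anthropic's published pricing, where cache reads cost roughly 0.1x the base input rate and cache writes roughly 1.25x.

```python
# Illustrative sketch of prompt-cache economics (hypothetical numbers).
BASE_INPUT = 3.00                 # assumed base input price, $ per million tokens
CACHE_READ = 0.1 * BASE_INPUT     # cache reads ~10% of base input price
CACHE_WRITE = 1.25 * BASE_INPUT   # cache writes ~125% of base input price

def turn_cost(context_tokens: int, new_tokens: int, cache_works: bool) -> float:
    """Dollar cost of one turn that resends the whole conversation context."""
    if cache_works:
        # Prior context is a cheap cache read; only the new tokens are written.
        return (context_tokens * CACHE_READ + new_tokens * CACHE_WRITE) / 1e6
    # Cache miss: the entire context is re-billed at the full input price.
    return (context_tokens + new_tokens) * BASE_INPUT / 1e6

# A long coding session: 150k tokens of accumulated context, 500 new per turn.
with_cache = turn_cost(150_000, 500, cache_works=True)
without_cache = turn_cost(150_000, 500, cache_works=False)
print(f"cached: ${with_cache:.4f}  uncached: ${without_cache:.4f}  "
      f"ratio: {without_cache / with_cache:.1f}x")
```

With these assumed numbers, every turn of a long session costs nearly 10x more once the cache stops working, and the gap widens as the context grows, which is consistent with the 10–20x figures users reported.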

Some users have reported that downgrading to an older version has improved their experience significantly.

Downgrading to 2.1.34 made a noticeable difference. Have any users tried this? Let’s discuss.

Continuity Over Capability

In essence, the current issue with Claude is not about the strength of the model but whether users can rely on it consistently.

The better the model, the more it gets used, and the more critical it becomes to workflows. Even as costs bite, many users are still willing to pay for high-quality responses, and the capabilities of Anthropic’s models are well recognized.

The problem is that consumption is rising without a matching improvement in experience: costs climb, user feedback goes unanswered, and delivery of fixes lags behind.

Users have pointed out that Anthropic’s customer service struggles with even the most basic token-management questions.

More critically, users had been reporting the problem for days, and the bugs were identified two days ago, yet the only response so far has been that the team is investigating.

In contrast, competitors like OpenClaw are making regular updates, often fixing issues overnight.

This raises a very real question: in the age of AI, the capability of the model may no longer be the rarest commodity. What is more scarce is the ability to deliver consistently, respond quickly to users, and take feedback seriously.

By the way, if Claude’s code is all AI-written, perhaps they should hire more people for customer support.
