OBLAIDISH NEWS

PromptCrunch cuts input token costs 75% for long LLM chats

PromptCrunch, a drop-in proxy, trims input tokens by up to 75% for long Claude Code sessions, cutting the input cost of a benchmarked session from $0.18 to about $0.05 [Dev.to].

PromptCrunch launched on June 15, 2026 as a drop-in proxy that reduces input tokens by up to 75% for long Claude Code sessions [Dev.to]. It removes superseded code fragments, collapses stale tool output, and replaces older turns with concise summaries. In a benchmark, a 30-turn Claude Code session consumed 3 k input tokens with PromptCrunch, down from 12 k without it, a 75% cut [Dev.to]. With provider-side caching already active, the additional savings from PromptCrunch were 7-10%.
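To make the idea concrete, here is a minimal Python sketch of history compaction along the lines described above. It is not PromptCrunch's implementation, which the article does not show. The `Turn` type, the `compact_history` function, and its parameters are hypothetical names chosen for illustration, and the "summary" is a simple truncation where a real tool would presumably use something smarter. The sketch covers only two of the three techniques: collapsing old tool output and replacing older turns with a summary.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    role: str      # "user", "assistant", or "tool" (hypothetical schema)
    content: str


def compact_history(turns, keep_recent=4, preview_chars=40):
    """Keep the last `keep_recent` turns verbatim and fold the rest into one summary turn."""
    split = max(len(turns) - keep_recent, 0)
    if split == 0:
        # Nothing is old enough to compact.
        return list(turns)

    older, recent = turns[:split], turns[split:]
    lines = []
    for turn in older:
        if turn.role == "tool":
            # Stale tool output is collapsed to a one-line note.
            lines.append(f"tool: [output omitted, {len(turn.content)} chars]")
        else:
            # Placeholder "summary": just the first few characters of the turn.
            lines.append(f"{turn.role}: {turn.content[:preview_chars]}")

    summary = Turn("user", "Summary of earlier turns:\n" + "\n".join(lines))
    return [summary] + list(recent)
```

In this sketch, `keep_recent` decides how much of the conversation is sent verbatim. Everything older is reduced to one short turn, which is where the input-token saving would come from.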

Claude Code's input price is $0.015 per 1 k tokens [Anthropic Pricing]. At that rate, the benchmarked session's input bill drops from $0.18 (12 k tokens) to $0.045 (3 k tokens), or about $0.05, a saving of roughly $0.13 per session. Setup takes two lines of code, and the user's API key passes straight through, so PromptCrunch stores no credentials.
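The per-session figures can be checked with a few lines of Python. This is a sketch that takes the price and token counts quoted in the article as given. The `input_cost` helper is a name made up for this example.

```python
PRICE_PER_1K_INPUT = 0.015  # USD per 1 k input tokens, the figure quoted in the article


def input_cost(tokens, price_per_1k=PRICE_PER_1K_INPUT):
    """Input-token cost in USD for a session of the given size."""
    return tokens / 1000 * price_per_1k


before = input_cost(12_000)  # unoptimized 30-turn session
after = input_cost(3_000)    # same session after compaction
saving = before - after

print(f"before ${before:.3f}, after ${after:.3f}, "
      f"saved ${saving:.3f} ({saving / before:.0%})")
```

Unrounded, the optimized session costs $0.045 and the saving is $0.135. The $0.05 and $0.13 figures in the text are these values rounded to the cent.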

The approach has three main benefits. It makes costs more predictable for multi-turn agents, because the input bill no longer balloons with conversation history. It makes experimentation cheaper: a free $5 credit and a pay-as-you-go model let teams prototype long-running sessions without fearing runaway costs. And it shifts responsibility for optimization from the provider to the client, letting developers control the optimization horizon. The proxy does add a new failure point and a modest latency penalty, but the savings outweigh those downsides for most internal tooling and exploratory projects.
