Caveman for Claude: The Smartest Way to Cut Tokens Without Losing Quality

Most people think saving tokens in AI means sacrificing quality.

That’s wrong.

The caveman project proves the opposite:
you can reduce tokens by ~65–75% while keeping the same technical meaning intact.

This isn’t a writing trick.
It’s a compression strategy for language itself.

The Core Idea: Remove Grammar, Keep Meaning

Caveman works on a simple principle:

Remove predictable words. Keep unpredictable information.

In normal AI responses:

  • Fillers
  • Politeness
  • Explanations around explanations

All of this costs tokens — but adds little value.

Caveman strips that away.

Example from the repo:

Normal (69 tokens):
“Your React component is re-rendering because you’re creating a new object reference…”

Caveman (19 tokens):
“New object ref each render. Inline prop = new ref = re-render. Use useMemo.”

Same fix.
~70% fewer tokens.
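The mechanics behind that compressed fix can be sketched in plain JavaScript (React itself omitted; the reference comparison below stands in for React's shallow prop diffing, and all names are illustrative):

```javascript
// Why an inline object prop forces a re-render: React compares props by
// reference, and an object literal produces a fresh reference on every render.
function render() {
  return { style: { color: "red" } }; // new object each call
}

console.log(render().style === render().style); // false -> "changed" prop, re-render

// useMemo-style fix: cache the object so the reference stays stable.
let cached;
function renderMemo() {
  if (!cached) cached = { color: "red" }; // computed once, reused after
  return { style: cached };
}

console.log(renderMemo().style === renderMemo().style); // true -> no spurious re-render
```

Nineteen tokens of caveman output point at exactly this: new ref per render, stable ref via memoization.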

What Caveman Actually Is

Caveman is not just a style — it’s a plugin + system behavior:

  • Forces Claude to respond in compressed language
  • Keeps technical accuracy intact
  • Supports multiple intensity levels
  • Includes tools like memory compression

It even compresses internal files like CLAUDE.md to reduce recurring token costs across sessions.

Download it from https://github.com/JuliusBrussee/caveman/tree/main and run it locally.

Why It Works So Well

Traditional prompting assumes:

More context = better output

Caveman flips that:

Better signal = less noise

It works because:

  • LLMs already understand context deeply
  • Redundant grammar is predictable
  • Meaning lives in keywords, not sentences

This aligns with “semantic compression” — removing structure while preserving meaning.
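As a toy illustration of that idea (this is not the plugin's actual algorithm), even a crude filter that drops high-predictability words shrinks a sentence while leaving the load-bearing keywords intact:

```javascript
// Crude "semantic compression": drop common filler/function words,
// keep the low-predictability keywords that carry the meaning.
const FILLER = new Set([
  "the", "a", "an", "is", "are", "you", "your", "that", "this",
  "because", "please", "just", "very", "really", "it", "of",
]);

function caveman(sentence) {
  return sentence
    .toLowerCase()
    .split(/\s+/)
    .filter((word) => !FILLER.has(word.replace(/[^a-z]/g, "")))
    .join(" ");
}

console.log(caveman("Your component is re-rendering because you are creating a new object"));
// → "component re-rendering creating new object"
```

An LLM reading the filtered version still recovers the full meaning, because the removed words were predictable from context.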

Where Caveman Delivers Maximum Value

1. Coding & Debugging

  • Faster responses
  • Cleaner fixes
  • Less scrolling

Example:
“Auth bug. Expiry check wrong. Use < not <=.”
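Expanded, that one-liner might correspond to something like this (hypothetical function names; the repo gives only the compressed hint, and the assumed reading is that a token should be invalid at the exact expiry instant):

```javascript
const nowSeconds = () => Math.floor(Date.now() / 1000);

// Buggy: token treated as still valid at the exact expiry instant.
function isValidBuggy(expiresAt, now = nowSeconds()) {
  return now <= expiresAt;
}

// Fixed per the caveman hint ("Use < not <="): invalid the moment
// `now` reaches `expiresAt`.
function isValid(expiresAt, now = nowSeconds()) {
  return now < expiresAt;
}

console.log(isValidBuggy(1000, 1000)); // true  (bug)
console.log(isValid(1000, 1000));      // false (correct)
```

Nine tokens, and a developer can reconstruct the whole diff.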

2. AI Agents & Automation

Token savings compound over time:

  • Every API call cheaper
  • Faster pipelines
  • Scalable systems
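A rough back-of-the-envelope shows how the savings compound (the volumes and price below are made-up placeholders, not real API rates):

```javascript
// Estimate monthly dollar savings from compressed output, assuming a flat
// per-token price and a fixed reduction ratio (~70%, per the repo example).
function monthlySavings({ callsPerDay, tokensPerCall, reduction, pricePerMTok }) {
  const tokensSaved = callsPerDay * 30 * tokensPerCall * reduction;
  return (tokensSaved / 1_000_000) * pricePerMTok; // dollars per month
}

// Hypothetical agent fleet: 10k calls/day, 500 output tokens each,
// 70% reduction, $15 per million output tokens:
console.log(monthlySavings({
  callsPerDay: 10_000,
  tokensPerCall: 500,
  reduction: 0.7,
  pricePerMTok: 15,
})); // → 1575
```

At pipeline scale, the same percentage cut turns into a recurring line item, which is why the savings matter more for agents than for one-off chats.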

3. Memory Optimization

Caveman compresses project memory files:

  • ~46% average reduction in stored tokens
  • Lower cost every session
  • Same instructions preserved

4. Commit Messages & Dev Workflows

Caveman enforces:

  • No fluff
  • Focus on “why”
  • Strict brevity

This improves clarity, not just cost.

The Hidden Insight Most People Miss

Caveman is not about “talking dumb.”

It’s about forcing precision.

When you remove:

  • filler
  • tone
  • repetition

You’re left with:

  • pure intent
  • pure logic

That often makes outputs better, not worse.

The Trade-Off (Be Honest About This)

Caveman is not universal.

It performs poorly when you need:

  • storytelling
  • emotional tone
  • persuasive writing
  • branding

Also:

Sometimes you spend more tokens instructing caveman mode than you save in output.

So the real play is:

  • Use caveman for execution
  • Use normal language for communication

Final Take

Caveman is part of a bigger shift:

From:

“How do I prompt better?”

To:

“How do I remove everything unnecessary?”

If you’re running:

  • AI workflows
  • automation systems
  • dev pipelines

Then caveman isn’t optional — it’s cost infrastructure.