Most people think saving tokens in AI means sacrificing quality.
That’s wrong.
The caveman project proves the opposite:
you can reduce tokens by ~65–75% while keeping the same technical meaning intact.
This isn’t a writing trick.
It’s a compression strategy for language itself.
The Core Idea: Remove Grammar, Keep Meaning
Caveman works on a simple principle:
Remove predictable words. Keep unpredictable information.
In normal AI responses:
- Fillers
- Politeness
- Explanations around explanations
All of this costs tokens — but adds little value.
Caveman strips that away.
Example from the repo:
Normal (69 tokens):
“Your React component is re-rendering because you’re creating a new object reference…”
Caveman (19 tokens):
“New object ref each render. Inline prop = new ref = re-render. Use useMemo.”
Same fix.
~70% fewer tokens.
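The savings above can be sanity-checked with a rough sketch. Real token counts depend on the model's tokenizer; this uses the common ~4-characters-per-token heuristic, and the longer string is an illustrative stand-in for the repo's full example, not a verbatim quote.

```python
# Rough token-savings estimate. Real counts require the model's
# tokenizer; ~4 characters per token is only a common approximation.

def approx_tokens(text: str) -> int:
    """Approximate token count: roughly 4 characters per token."""
    return max(1, len(text) // 4)

# Illustrative stand-in for a verbose answer (not the repo's exact text):
normal = ("Your React component is re-rendering because you're creating "
          "a new object reference on every render, which React treats as "
          "a changed prop.")
caveman = "New object ref each render. Inline prop = new ref = re-render. Use useMemo."

saved = 1 - approx_tokens(caveman) / approx_tokens(normal)
print(f"approx savings: {saved:.0%}")
```

The heuristic is crude, so the exact percentage will differ from the repo's measured ~70%, but the direction is the same.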
What Caveman Actually Is
Caveman is not just a style — it’s a plugin + system behavior:
- Forces Claude to respond in compressed language
- Keeps technical accuracy intact
- Supports multiple intensity levels
- Includes tools like memory compression
It even compresses internal files like CLAUDE.md to reduce recurring token costs across sessions.
Download it and run it on your machine: https://github.com/JuliusBrussee/caveman/tree/main
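The "intensity levels" idea can be sketched as system-prompt directives. This is a hypothetical illustration; the names, wording, and mechanism of the actual caveman plugin may differ.

```python
# Hypothetical sketch of intensity levels as system-prompt directives.
# The real caveman plugin's configuration and wording may differ.

INTENSITY_PROMPTS = {
    1: "Prefer concise answers. Drop filler and politeness.",
    2: "Telegraphic style. Short clauses. Keywords over sentences.",
    3: "Maximum compression. Fragments only. No articles, no hedging.",
}

def build_system_prompt(base: str, intensity: int) -> str:
    """Prepend a compression directive to an existing system prompt."""
    directive = INTENSITY_PROMPTS.get(intensity, INTENSITY_PROMPTS[1])
    return f"{directive}\n\n{base}"

print(build_system_prompt("You are a coding assistant.", 3))
```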
Why It Works So Well
Traditional prompting assumes:
More context = better output
Caveman flips that:
Better signal = less noise
It works because:
- LLMs already understand context deeply
- Redundant grammar is predictable
- Meaning lives in keywords, not sentences
This aligns with “semantic compression” — removing structure while preserving meaning.
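A toy version of that principle: drop predictable filler words, keep the information-bearing keywords. Caveman's real rules are far richer than this word list; this just shows the mechanism.

```python
# Toy illustration of semantic compression: strip predictable filler,
# keep keywords. The real caveman rules are much richer than this list.

FILLER = {"the", "a", "an", "is", "are", "that", "which", "you",
          "your", "because", "basically", "just", "really", "very"}

def compress(sentence: str) -> str:
    """Keep only words that are not in the filler set."""
    kept = [w for w in sentence.split() if w.lower().strip(".,") not in FILLER]
    return " ".join(kept)

print(compress("The bug happens because you are mutating the state directly."))
```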
Where Caveman Delivers Maximum Value
1. Coding & Debugging
- Faster responses
- Cleaner fixes
- Less scrolling
Example:
“Auth bug. Expiry check wrong. Use < not <=.”
2. AI Agents & Automation
Token savings compound over time:
- Every API call cheaper
- Faster pipelines
- Scalable systems
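The compounding is easy to see with back-of-envelope math. All numbers here are illustrative, not real API rates or measured workloads.

```python
# Back-of-envelope cost math: per-call token savings compound across
# an automated pipeline. Prices and volumes are illustrative only.

def monthly_cost(calls: int, tokens_per_call: int, usd_per_1k_tokens: float) -> float:
    """Total monthly spend for a given call volume and response size."""
    return calls * tokens_per_call / 1000 * usd_per_1k_tokens

CALLS = 100_000          # hypothetical API calls per month
PRICE = 0.01             # illustrative $ per 1K output tokens

normal = monthly_cost(CALLS, 400, PRICE)      # verbose responses
compressed = monthly_cost(CALLS, 120, PRICE)  # ~70% fewer output tokens

print(f"normal: ${normal:,.2f}/mo, caveman: ${compressed:,.2f}/mo")
```

Same pipeline, same answers, a fraction of the output bill. At scale the difference is the whole point.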
3. Memory Optimization
Caveman compresses project memory files:
- ~46% average reduction in stored tokens
- Lower cost every session
- Same instructions preserved
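The memory savings recur because files like CLAUDE.md are re-sent as context every session. A quick illustration of the arithmetic, with a made-up file size and session count (only the ~46% figure comes from the repo):

```python
# Illustrative arithmetic for compressed memory files. The memory file
# is re-sent each session, so a one-time compression pays back every
# time. File size and session count are made up; ~46% is from the repo.

memory_tokens = 8_000        # hypothetical uncompressed CLAUDE.md size
reduction = 0.46             # average reduction reported by the repo
sessions_per_month = 300     # hypothetical usage

saved_per_session = round(memory_tokens * reduction)
saved_per_month = saved_per_session * sessions_per_month
print(saved_per_session, "tokens/session ->", saved_per_month, "tokens/month")
```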
4. Commit Messages & Dev Workflows
Caveman enforces:
- No fluff
- Focus on “why”
- Strict brevity
This improves clarity, not just cost.
The Hidden Insight Most People Miss
Caveman is not about “talking dumb.”
It’s about forcing precision.
When you remove:
- filler
- tone
- repetition
You’re left with:
- pure intent
- pure logic
That often makes outputs better, not worse.
The Trade-Off (Be Honest About This)
Caveman is not universal.
It performs poorly when you need:
- storytelling
- emotional tone
- persuasive writing
- branding
Also:
Sometimes you spend more tokens instructing caveman mode than you save in output.
So the real play is:
- Use caveman for execution
- Use normal language for communication
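That instruction-overhead trade-off has a simple break-even point: caveman only pays off when the output savings exceed the fixed token cost of the caveman instructions themselves. The numbers below are hypothetical, and the sketch ignores input-vs-output pricing differences.

```python
# Break-even sketch: caveman wins only when output savings exceed the
# fixed token cost of its own instructions. Ignores the fact that
# input and output tokens are usually priced differently.

def breakeven_output_tokens(instruction_overhead: int, savings_rate: float) -> float:
    """Minimum output size (tokens) before caveman saves anything.

    savings_rate * output == instruction_overhead  =>  solve for output.
    """
    return instruction_overhead / savings_rate

# Hypothetical: 150 instruction tokens, ~70% output reduction.
print(breakeven_output_tokens(150, 0.70))
```

Below that output size, the instructions cost more than they save; short one-off questions are better asked plainly.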
Final Take
Caveman is part of a bigger shift:
From:
“How do I prompt better?”
To:
“How do I remove everything unnecessary?”
If you’re running:
- AI workflows
- automation systems
- dev pipelines
Then caveman isn’t optional — it’s cost infrastructure.
