CLI Auth in 28 Minutes
A complex CLI authentication system planned and built in a single session with 97.7% prompt cache hit rate.
Blueprint + Implementation in 28 Minutes on Opus — Cost Analysis
A complex CLI authentication system — OAuth 2.1 PKCE, token management, slash commands, shell hooks, and a curl installer — planned and built in a single session with 97.7% prompt cache hit rate.

What Happened
A developer needed to add CLI authentication to an existing plugin suite. This involved:
- Planning the architecture (OAuth flow, token storage, command structure, installer packaging)
- Reading 15+ files across 2 repositories to understand existing patterns
- Running 8 code graph queries for impact analysis
- Answering 4 interactive scope questions during planning
- Handling 2 mid-planning scope expansions from the developer
- Writing a 500-line blueprint with per-file implementation specs
- Implementing 8 files (807 lines of code) across 2 repositories
- Verifying all files with syntax checks and functional tests
Total time: 28 minutes and 8 seconds. Session cost: ~$7.75 (Claude Code UI) / $12.17 (Opus 4.6 API rates).
The Task
Build a CLI authentication system for a Claude Code plugin marketplace:
| Component | What it does |
|---|---|
composure-auth.mjs (359 lines) | OAuth 2.1 PKCE flow — generates challenge, opens browser, runs localhost callback server, exchanges code for tokens |
composure-token.mjs (206 lines) | Token read/write/refresh utilities — dual-mode (CLI + importable module) |
commands/auth.md (57 lines) | Unified /auth slash command with login/logout/status/upgrade subcommands |
hooks/auth-check.sh (43 lines) | SessionStart hook — checks auth status, attempts silent token refresh |
public/install.sh (110 lines) | Curl one-liner installer — prerequisite checks, marketplace registration, plugin install |
api/v1/install/route.ts (32 lines) | API endpoint serving the install script with caching headers |
hooks/hooks.json | Edited — registered auth-check hook in position #2 of SessionStart |
plugin.json (x2) | Edited — added bin/ entries, commands/ path, version bump to 1.37.0 |
The blueprint also produced a detailed Phase 2 plan for repo split and content gating — planned but not yet executed.
Session Metrics
Timeline
| Milestone | Time (ET) | Elapsed |
|---|---|---|
| Blueprint invoked | 21:02:05 | 0:00 |
| Graph scan + findings presented | 21:10:41 | 8m 36s |
| Scope expansion addressed | 21:14:56 | 12m 51s |
| Blueprint document written (500 lines) | 21:19:50 | 17m 45s |
| Refinements applied | 21:24:31 | 22m 26s |
| Implementation started | 21:24:46 | 22m 41s |
| Phase 1 complete (8 files built + verified) | 21:30:13 | 28m 08s |
Token Usage
| Category | Tokens | Notes |
|---|---|---|
| Input (uncached) | 295 | New content each turn (tool results, user answers) |
| Output | 46,632 | Claude's responses, code generation, tool calls |
| Cache read | 16,992,855 | Context reused from prior turns |
| Cache creation | 400,665 | New context added to cache |
| Peak context | 146,297 | Size of the conversation at the last turn |
| API turns | 191 | Individual API calls during the blueprint window |
Operations
| Operation | Count |
|---|---|
| Files read (Read tool) | ~15 |
| Code graph queries (MCP) | 8 |
| Files created (Write tool) | 8 |
| Files edited (Edit tool) | 3 |
| Interactive questions (AskUserQuestion) | 6 |
| Agent spawns | 1 (research, 3m 47s) |
| Lines of code written | 807 |
| Blueprint document lines | ~500 |
Cost Breakdown
Actual Cost (with prompt caching, Opus 4.6)
| Category | Tokens | Rate | Cost |
|---|---|---|---|
| Input (uncached) | 295 | $5.00/MTok | $0.00 |
| Output | 46,632 | $25.00/MTok | $1.17 |
| Cache read | 16,992,855 | $0.50/MTok | $8.50 |
| Cache write (5m) | 400,665 | $6.25/MTok | $2.50 |
| Total | $12.17 |
The $7.75 figure shown in Claude Code's UI reflects subscription pricing. The token counts above are from the session JSONL file, priced at Opus 4.6 API rates (2026). The important comparison is the ratio between cached and uncached costs.
Hypothetical Cost (without caching)
If every turn sent the full context as uncached input:
| Category | Tokens | Rate | Cost |
|---|---|---|---|
| All input (no cache) | 17,393,815 | $5.00/MTok | $86.97 |
| Output | 46,632 | $25.00/MTok | $1.17 |
| Total | $88.14 |
Cache Savings
| Metric | Value |
|---|---|
| Cache hit rate | 97.7% |
| Tokens served from cache | 16,992,855 |
| Cost with caching | $12.17 |
| Cost without caching | $88.14 |
| Savings | $75.97 (86% reduction) |
Why This Is Efficient
1. Direct file reads, not agent spawns
Composure's graph tells Claude exactly which files to read. Instead of spawning explore agents ($0.15+ per spawn, fresh context each time), files are read directly into the main conversation (~$0.005 per file). The crucial difference: files read into the main context get cached on subsequent turns. Agent contexts are ephemeral — they start fresh, can't reuse the parent's cache, and their results must be summarized back.
| Approach | 15 files read | Context reuse |
|---|---|---|
| Direct Read | ~$0.08 total | Cached for all 191 turns at $0.50/MTok |
| Agent spawns | ~$2.25+ total (15 x $0.15) | Zero — each spawn starts cold |
2. Graph-first discovery
The code graph answered structural questions in milliseconds that would otherwise require multiple file searches:
- "What files reference
no-bandaids.json?" → 89 results, instant - "What imports
plugin install?" → 16 results with file context - "What's the blast radius of changing
plugin.json?" → dependency map
Without the graph, these questions require recursive grep + file reads + manual dependency tracing — easily 20+ additional tool calls per question.
3. Prompt caching compounds over long conversations
Each API turn reuses the conversation history from cache. In a 191-turn session with ~140K context, the cache savings compound:
- Turn 1: ~14K tokens (no cache yet)
- Turn 50: ~50K cached, ~500 new
- Turn 100: ~77K cached, ~300 new
- Turn 191: ~143K cached, ~3K new
The longer the conversation runs (without agent spawns fragmenting the context), the higher the cache hit rate climbs. By the end, 97.7% of every API call's input was served from cache at 90% discount.
4. Blueprint prevents rework
The 17m 45s spent planning (graph scan, scope questions, impact analysis, per-file specs) prevented implementation missteps. Every file created matched the spec on the first attempt — no backtracking, no "actually that should be different" rewrites. Planning is cheap in tokens; rework is expensive.
What "28 Minutes" Actually Contains
This wasn't a simple "write 8 files" task. The session included:
- Project analysis — Read the hub structure, identified 2 repos, understood the existing OAuth backend
- Graph scan — 8 queries across a 4,155-node code graph (563 files, 9,219 edges)
- External research — Agent spawn to research CodeRabbit CLI packaging and Claude Code plugin commands
- 4 interactive scope decisions — Auth runtime, command naming, distribution model, token storage location
- 2 scope expansions — Developer added repo split requirement and config consolidation mid-planning
- Impact analysis — 89-file blast radius check for config references, 16-file install reference check
- 500-line blueprint — Per-file implementation specs, preservation boundaries, risks with mitigations, verification scenarios
- 2 blueprint refinements — Repo consolidation (3 repos → 2), installer URL flexibility
- 8 task creation with dependency graph (blocked-by relationships)
- 8 files written — 807 lines of production code (OAuth PKCE, token management, shell hooks, installer)
- Full verification — Syntax checks, JSON validation, functional tests, typecheck
Analyze Your Session Costs
Parse Claude Code session files to extract token usage, cache efficiency, and cost breakdowns with reproducible scripts.
User Dashboard in 30 Minutes
28 files created and edited, 1,831 lines of production code — blueprint planned and implemented in 30 minutes on Opus 4.6 with 98.5% cache hit rate.