Composure

CLI Auth in 28 Minutes

Blueprint + Implementation in 28 Minutes on Opus — Cost Analysis

A complex CLI authentication system — OAuth 2.1 PKCE, token management, slash commands, shell hooks, and a curl installer — planned and built in a single session with 97.7% prompt cache hit rate.

Blueprint cost efficiency — 28 minutes, 807 lines, 97.7% cache hit rate

What Happened

A developer needed to add CLI authentication to an existing plugin suite. This involved:

  • Planning the architecture (OAuth flow, token storage, command structure, installer packaging)
  • Reading 15+ files across 2 repositories to understand existing patterns
  • Running 8 code graph queries for impact analysis
  • Answering 4 interactive scope questions during planning
  • Handling 2 mid-planning scope expansions from the developer
  • Writing a 500-line blueprint with per-file implementation specs
  • Implementing 8 files (807 lines of code) across 2 repositories
  • Verifying all files with syntax checks and functional tests

Total time: 28 minutes and 8 seconds. Session cost: ~$7.75 (Claude Code UI) / $12.17 (Opus 4.6 API rates).

The Task

Build a CLI authentication system for a Claude Code plugin marketplace:

| Component | What it does |
| --- | --- |
| composure-auth.mjs (359 lines) | OAuth 2.1 PKCE flow: generates challenge, opens browser, runs localhost callback server, exchanges code for tokens |
| composure-token.mjs (206 lines) | Token read/write/refresh utilities, dual-mode (CLI + importable module) |
| commands/auth.md (57 lines) | Unified /auth slash command with login/logout/status/upgrade subcommands |
| hooks/auth-check.sh (43 lines) | SessionStart hook: checks auth status, attempts silent token refresh |
| public/install.sh (110 lines) | Curl one-liner installer: prerequisite checks, marketplace registration, plugin install |
| api/v1/install/route.ts (32 lines) | API endpoint serving the install script with caching headers |
| hooks/hooks.json | Edited: registered auth-check hook in position #2 of SessionStart |
| plugin.json (x2) | Edited: added bin/ entries, commands/ path, version bump to 1.37.0 |
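
The write-up does not show the code inside composure-auth.mjs. As a rough illustration of what "generates challenge" means in a PKCE flow, here is a minimal Python sketch of the S256 method defined in RFC 7636. The function names are hypothetical, and the real module is JavaScript and may be structured differently.

```python
import base64
import hashlib
import secrets


def b64url(data: bytes) -> str:
    """Base64url-encode without '=' padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_verifier(num_bytes: int = 32) -> str:
    """Random code verifier; 32 random bytes encode to 43 characters."""
    return b64url(secrets.token_bytes(num_bytes))


def make_challenge(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(ASCII(verifier)))."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return b64url(digest)


if __name__ == "__main__":
    verifier = make_verifier()
    print("code_verifier: ", verifier)
    print("code_challenge:", make_challenge(verifier))
```

The client sends the challenge with the authorization request and keeps the verifier private until the token exchange, so an intercepted authorization code cannot be redeemed without it.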

The blueprint also produced a detailed Phase 2 plan for repo split and content gating — planned but not yet executed.

Session Metrics

Timeline

| Milestone | Time (ET) | Elapsed |
| --- | --- | --- |
| Blueprint invoked | 21:02:05 | 0:00 |
| Graph scan + findings presented | 21:10:41 | 8m 36s |
| Scope expansion addressed | 21:14:56 | 12m 51s |
| Blueprint document written (500 lines) | 21:19:50 | 17m 45s |
| Refinements applied | 21:24:31 | 22m 26s |
| Implementation started | 21:24:46 | 22m 41s |
| Phase 1 complete (8 files built + verified) | 21:30:13 | 28m 08s |
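
The elapsed column can be re-derived from the timestamps. This is a small sketch, assuming all milestones fall on the same day; the helper name is made up for illustration.

```python
from datetime import datetime

START = "21:02:05"  # "Blueprint invoked" from the timeline


def elapsed(start: str, end: str) -> str:
    """Format the gap between two same-day HH:MM:SS timestamps."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    minutes, seconds = divmod(int(delta.total_seconds()), 60)
    return f"{minutes}m {seconds:02d}s"


if __name__ == "__main__":
    for label, stamp in [
        ("Graph scan + findings presented", "21:10:41"),
        ("Blueprint document written", "21:19:50"),
        ("Phase 1 complete", "21:30:13"),
    ]:
        print(f"{label}: {elapsed(START, stamp)}")
```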

Token Usage

| Category | Tokens | Notes |
| --- | --- | --- |
| Input (uncached) | 295 | New content each turn (tool results, user answers) |
| Output | 46,632 | Claude's responses, code generation, tool calls |
| Cache read | 16,992,855 | Context reused from prior turns |
| Cache creation | 400,665 | New context added to cache |
| Peak context | 146,297 | Size of the conversation at the last turn |
| API turns | 191 | Individual API calls during the blueprint window |

Operations

| Operation | Count |
| --- | --- |
| Files read (Read tool) | ~15 |
| Code graph queries (MCP) | 8 |
| Files created (Write tool) | 8 |
| Files edited (Edit tool) | 3 |
| Interactive questions (AskUserQuestion) | 6 |
| Agent spawns | 1 (research, 3m 47s) |
| Lines of code written | 807 |
| Blueprint document lines | ~500 |

Cost Breakdown

Actual Cost (with prompt caching, Opus 4.6)

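| Category | Tokens | Rate | Cost |
| --- | --- | --- | --- |
| Input (uncached) | 295 | $5.00/MTok | $0.00 |
| Output | 46,632 | $25.00/MTok | $1.17 |
| Cache read | 16,992,855 | $0.50/MTok | $8.50 |
| Cache write (5m) | 400,665 | $6.25/MTok | $2.50 |
| Total | | | $12.17 |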
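
The line items follow directly from tokens times rate. The sketch below reproduces the arithmetic using only the token counts and per-MTok rates listed in the table; the dictionary and function names are illustrative.

```python
MTOK = 1_000_000

# (tokens, dollars per million tokens), copied from the table above.
USAGE = {
    "input_uncached": (295, 5.00),
    "output": (46_632, 25.00),
    "cache_read": (16_992_855, 0.50),
    "cache_write_5m": (400_665, 6.25),
}


def line_item_costs(usage):
    """Dollar cost per category: tokens * rate / 1,000,000."""
    return {name: tokens * rate / MTOK for name, (tokens, rate) in usage.items()}


costs = line_item_costs(USAGE)
total = sum(costs.values())

if __name__ == "__main__":
    for name, dollars in costs.items():
        print(f"{name:16s} ${dollars:6.2f}")
    print(f"{'total':16s} ${total:6.2f}")
```

Cache reads account for roughly 70% of the bill even at the discounted rate, because they make up nearly all of the input volume.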

The $7.75 figure shown in Claude Code's UI reflects subscription pricing. The token counts above are from the session JSONL file, priced at Opus 4.6 API rates (2026). The important comparison is the ratio between cached and uncached costs.

Hypothetical Cost (without caching)

If every turn sent the full context as uncached input:

| Category | Tokens | Rate | Cost |
| --- | --- | --- | --- |
| All input (no cache) | 17,393,815 | $5.00/MTok | $86.97 |
| Output | 46,632 | $25.00/MTok | $1.17 |
| Total | | | $88.14 |

Cache Savings

| Metric | Value |
| --- | --- |
| Cache hit rate | 97.7% |
| Tokens served from cache | 16,992,855 |
| Cost with caching | $12.17 |
| Cost without caching | $88.14 |
| Savings | $75.97 (86% reduction) |
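
For readers who want to check these figures, the sketch below derives the hit rate and the savings from the session's token counts. It takes the $12.17 cached cost as given and rounds each line item to cents before summing, which is how the tables arrive at $88.14. Variable names are illustrative.

```python
MTOK = 1_000_000

uncached_input = 295
cache_read = 16_992_855
cache_write = 400_665
output = 46_632

# Every input token, whether it was cached or not.
total_input = uncached_input + cache_read + cache_write

# Share of input tokens that were served from cache.
hit_rate = cache_read / total_input

# Hypothetical bill if all input were charged at the uncached $5.00/MTok rate.
input_cost = round(total_input * 5.00 / MTOK, 2)
output_cost = round(output * 25.00 / MTOK, 2)
without_cache = round(input_cost + output_cost, 2)

with_cache = 12.17  # actual cost, taken from the cost breakdown
savings = round(without_cache - with_cache, 2)
reduction = savings / without_cache

if __name__ == "__main__":
    print(f"hit rate:      {hit_rate:.1%}")
    print(f"without cache: ${without_cache:.2f}")
    print(f"savings:       ${savings:.2f} ({reduction:.0%})")
```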

Why This Is Efficient

1. Direct file reads, not agent spawns

Composure's graph tells Claude exactly which files to read. Instead of spawning explore agents ($0.15+ per spawn, fresh context each time), files are read directly into the main conversation (~$0.005 per file). The crucial difference: files read into the main context get cached on subsequent turns. Agent contexts are ephemeral — they start fresh, can't reuse the parent's cache, and their results must be summarized back.

| Approach | 15 files read | Context reuse |
| --- | --- | --- |
| Direct Read | ~$0.08 total | Cached for all 191 turns at $0.50/MTok |
| Agent spawns | ~$2.25+ total (15 x $0.15) | Zero: each spawn starts cold |

2. Graph-first discovery

The code graph answered structural questions in milliseconds that would otherwise require multiple file searches:

  • "What files reference no-bandaids.json?" → 89 results, instant
  • "What imports plugin install?" → 16 results with file context
  • "What's the blast radius of changing plugin.json?" → dependency map

Without the graph, these questions require recursive grep + file reads + manual dependency tracing — easily 20+ additional tool calls per question.
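
Composure's graph internals are not described here, so the following is only a generic sketch of how a "blast radius" query can work: index the dependency edges in reverse, then walk outward from the changed file. The file names and edges are invented for illustration and do not come from the actual repositories.

```python
from collections import defaultdict, deque


def build_reverse_index(edges):
    """Map each file to the set of files that depend on it directly."""
    dependents = defaultdict(set)
    for source, target in edges:  # edge means "source references target"
        dependents[target].add(source)
    return dependents


def blast_radius(dependents, changed):
    """Breadth-first walk over reverse edges: everything affected by a change."""
    affected = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for dependent in dependents.get(node, ()):
            if dependent != changed and dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical edges, for illustration only.
EDGES = [
    ("installer.sh", "config.json"),
    ("route.ts", "installer.sh"),
    ("hook.sh", "token.mjs"),
    ("auth.mjs", "token.mjs"),
]

if __name__ == "__main__":
    index = build_reverse_index(EDGES)
    print(sorted(blast_radius(index, "config.json")))
```

Because the reverse index is built once, each query only touches the files that are actually affected, which is why a prebuilt graph can answer these questions without repeated searches.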

3. Prompt caching compounds over long conversations

Each API turn reuses the conversation history from cache. In a 191-turn session whose context peaks at ~146K tokens, the cache savings compound:

  • Turn 1: ~14K tokens (no cache yet)
  • Turn 50: ~50K cached, ~500 new
  • Turn 100: ~77K cached, ~300 new
  • Turn 191: ~143K cached, ~3K new

The longer the conversation runs (without agent spawns fragmenting the context), the higher the cache hit rate climbs. Across the whole session, 97.7% of input tokens were served from cache at a 90% discount ($0.50 vs. $5.00 per MTok).
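
The compounding effect can be sketched with a toy model. The assumptions are simplifications, not measured behaviour: every turn resends the full prior context, all of it is a cache hit, and the context grows by a fixed number of new tokens per turn. The 14,000-token first turn and 191 turns come from the figures above; the 695 tokens per turn is an assumed value chosen so the final context lands near the reported ~146K peak.

```python
def simulate(turns: int, first_turn_tokens: int, new_per_turn: int):
    """Return the cumulative cache hit rate after each turn (toy model)."""
    rates = []
    context = 0        # tokens already in the conversation (cached prefix)
    cached_total = 0   # input tokens served from cache so far
    input_total = 0    # all input tokens sent so far
    for turn in range(1, turns + 1):
        new = first_turn_tokens if turn == 1 else new_per_turn
        cached_total += context        # prior context is read from cache
        input_total += context + new   # the whole context is sent each turn
        context += new                 # new tokens join the cached prefix
        rates.append(cached_total / input_total)
    return rates


if __name__ == "__main__":
    rates = simulate(turns=191, first_turn_tokens=14_000, new_per_turn=695)
    for turn in (1, 10, 50, 100, 191):
        print(f"turn {turn:3d}: cumulative hit rate {rates[turn - 1]:.1%}")
```

In this idealized model the cumulative rate starts at zero and rises on every turn, ending near 99%. The measured 97.7% is lower, which is expected, since a real session also incurs cache writes and the occasional miss that the model ignores.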

4. Blueprint prevents rework

The 17m 45s spent planning (graph scan, scope questions, impact analysis, per-file specs) prevented implementation missteps. Every file created matched the spec on the first attempt — no backtracking, no "actually that should be different" rewrites. Planning is cheap in tokens; rework is expensive.

What "28 Minutes" Actually Contains

This wasn't a simple "write 8 files" task. The session included:

  1. Project analysis — Read the hub structure, identified 2 repos, understood the existing OAuth backend
  2. Graph scan — 8 queries across a 4,155-node code graph (563 files, 9,219 edges)
  3. External research — Agent spawn to research CodeRabbit CLI packaging and Claude Code plugin commands
  4. 4 interactive scope decisions — Auth runtime, command naming, distribution model, token storage location
  5. 2 scope expansions — Developer added repo split requirement and config consolidation mid-planning
  6. Impact analysis — 89-file blast radius check for config references, 16-file install reference check
  7. 500-line blueprint — Per-file implementation specs, preservation boundaries, risks with mitigations, verification scenarios
  8. 2 blueprint refinements — Repo consolidation (3 repos → 2), installer URL flexibility
  9. 8 tasks created with a dependency graph (blocked-by relationships)
  10. 8 files written — 807 lines of production code (OAuth PKCE, token management, shell hooks, installer)
  11. Full verification — Syntax checks, JSON validation, functional tests, typecheck
