claude-controller — Sujith K. Surendran

The core insight was that “which backend should I use” is not a one-time setup question — it’s answered differently depending on the task, the available hardware, and what broke last session. A controller that treats backend selection as a first-class, repeatable workflow decision — rather than a config file you edit once and forget — fits how developers actually work.

The post-session tooling is where the project earns its depth. cc-health reads Claude Code’s own JSONL session logs and names failure modes precisely: a model stuck calling the same tool with identical parameters (TOOL_LOOP), raw <function=...> syntax leaking as plain text (RAW_XML), identical input token counts across all turns indicating context truncation (CONTEXT_CAP). Nine failure patterns in total, each with a plain-English label. That turns “the local model didn’t work” — which is where most debugging stops — into a diagnosable, actionable signal.

cc-diagnose completes the picture: VRAM headroom at various context sizes, GPU vs CPU layer split, cold-start latency, and prefill/decode token speed. Together the two tools make local model selection an informed choice rather than trial and error.