Plan: Kimi K2.6 Subagent Setup¶
Goal¶
Reduce Claude Code token consumption by delegating I/O-heavy and boilerplate tasks to Kimi K2.6 — a cheap (~1/100th the cost), OpenAI-compatible model with a 262k context window. Claude handles reasoning; Kimi handles bulk reads and simple generation.
Two CLI Scripts¶
scripts/ask-kimi¶
Routes bulk file reading and summarization to Kimi. Instead of Claude consuming thousands of tokens reading large files directly, it calls this script and gets back a focused summary.
Usage (from Claude via Bash):
python scripts/ask-kimi.py "Summarize the scheduler jobs in scheduler/jobs.py"
python scripts/ask-kimi.py "What columns does the PortfolioHolding model have?" --files db/models.py
python scripts/ask-kimi.py "Find all places that call build_features()" --files models/predictor.py models/trend_analyzer.py
scripts/kimi-write¶
Generates boilerplate — test scaffolding, translation strings, config entries, migration stubs. Claude reviews and refines the output rather than generating from scratch.
Usage:
python scripts/kimi-write.py "Add Korean translation strings for the new Monte Carlo page" --context dashboard/translations.py
python scripts/kimi-write.py "Write a pytest scaffold for portfolio/parser.py"
API Details¶
- Base URL:
https://api.moonshot.ai/v1(OpenAI-compatible) - Auth:
MOONSHOT_API_KEYenv var - Model:
kimi-k2.6 - Context: 262k tokens
- Pricing: $0.95/1M input, $4.00/1M output (cache hit: $0.16/1M input)
CLAUDE.md Routing Rules¶
Add a delegation section to CLAUDE.md that tells Claude when to use each script:
## Subagent Delegation (Token Efficiency)
Delegate to `ask-kimi` when:
- Reading files > 300 lines to answer a factual question
- Searching across multiple large files for a symbol or pattern
- Summarizing a module's current state before making changes
Delegate to `kimi-write` when:
- Generating test file scaffolding
- Adding translation string entries (en + ko pairs)
- Writing config boilerplate (new ticker batch, new sector keywords)
- Drafting docstrings or inline comments
Use Claude directly for:
- All architectural decisions
- Debugging logic errors
- Any task requiring cross-file reasoning or judgment
- Code that touches FEATURE_ORDER, scheduler timing, or DB schema
Implementation Checklist¶
- [ ] Add
MOONSHOT_API_KEYto.env - [ ] Write
scripts/ask-kimi.py(~60 lines, OpenAI SDK, stdin or--filesargs) - [ ] Write
scripts/kimi-write.py(~60 lines, OpenAI SDK, task + optional context file) - [ ] Add routing rules to
CLAUDE.md - [ ] Test
ask-kimion a large file (frontend/src/pages/Overview.tsxor similar) - [ ] Test
kimi-writeon a translation string task - [ ] Verify Kimi handles Korean text correctly in both scripts
Notes¶
- Both scripts use the standard
openaiPython SDK pointed atapi.moonshot.ai/v1— no new SDK needed beyond what may already be installed - Keep scripts simple and stateless — no session management, no streaming required
- If Kimi quality is insufficient for a task, fallback is just not using the script — Claude reads the file directly as normal