Claude Code GitHub Copilot Kilo Code Pi Agent

The New Inference Harness

From Copilot to Polymodel Routing —
how we're rebuilding the dev toolchain

PickYourTrail · Engineering

Why Are We Switching?

💸
Cost Control
OpenRouter routes to the cheapest capable model for each task. Pay per token, not per seat.
🧠
Model Flexibility
Access 200+ models (Minimax, Kimi, Grok, GLM…) from one API key. No vendor lock-in.
🎯
Right Model for the Right Job
Use heavy reasoning models for planning, fast cheap models for simple edits.
🔧
Developer Ownership
Full config control in kilo.ai config. Engineers own the model strategy.
⚠️ Claude Code & GitHub Copilot are great tools, but they're closed ecosystems with fixed pricing and no model routing.
PickYourTrail · Engineering

What is OpenRouter?

OpenRouter is a unified API gateway that sits between your tools and AI model providers. One key, one endpoint, hundreds of models.

🛠️
Your Tools
Kilo Code
Pi Agent
single API key
🔀
OpenRouter
Routes · Balances · Logs
best model wins
Minimax m2.7
Moonshot kimi-k2.6
xAI grok-4.3
Zhipu glm-5.1
OpenAI gpt-4o
Anthropic claude-3.5
PickYourTrail · Engineering

Kilo Code

VS Code extension · replaces Copilot · routes via OpenRouter
01
Install Kilo Code
Search "Kilo Code" in VS Code Extensions panel and install
VS Code → Extensions → Search "Kilo Code"
02
Add OpenRouter API Key
Kilo Code settings → Provider: OpenRouter → paste your API key
Settings → API Provider → OpenRouter
03
Drop in the Config
Save kilo.ai config JSON in your project root — it auto-detects it
kilo.ai config.json → project root
04
Start Coding
Open Kilo Code panel and start asking — it picks the right model automatically
Ctrl+Shift+K to open
💡 Kilo Code replaces Copilot — same IDE experience, but model-agnostic and cost-optimised via OpenRouter.
PickYourTrail · Engineering

Pi Agent Coding

Agentic AI coding · replaces Claude Code
🤖
Agentic Coding
Pi Agent can plan, write, run, and iterate code autonomously — not just autocomplete.
🔀
Multi-step Tasks
Give it a feature request; it breaks it into subtasks, writes code, and self-reviews.
🔌
OpenRouter Powered
Uses your kilo.ai model config — the orchestrator and plan models do the heavy lifting.
🧪
Replaces Claude Code
Fills the same role as Claude Code (terminal-based agent) but open and configurable.
HOW A TASK FLOWS
💬
You describe
natural language
📋
Pi plans
GLM-5.1
⌨️
Pi codes
Minimax M2.7
🐛
Pi debugs
Kimi K2.6
Done
reviewed & committed
PickYourTrail · Engineering

Using Kilo Code

Day-to-day commands & workflow inside VS Code
COMMANDS
Ctrl+L Open Chat Panel
Describe what you need. Kilo replies inline with code suggestions.
Select + chat Refactor / Fix
Highlight any block of code, open chat, and ask to refactor, fix, or explain.
Tab Accept Completion
Inline completions appear as you type — Tab to accept, Esc to dismiss.
/explain /fix /test /review
Slash commands for quick targeted actions in the chat panel.
EXAMPLE PROMPTS
// Quick edits
"Add error handling to this function"
"Write a unit test for UserService.create()"
// Refactor
"Refactor this into smaller functions"
"Convert this class component to a hook"
// Debug
"Why is this returning undefined?"
"Explain this stack trace"
✅ USE KILO WHEN
You're inside a file making targeted edits, completions, or reviews. Think: Copilot replacement.
PickYourTrail · Engineering

Using Pi Agent

Writing tasks, reviewing output, staying in control
TASK WORKFLOW
1
Write the task
Be specific — include file names, module, stack. Pi reads context from your repo.
2
Review the plan
Pi shows a step-by-step plan before writing a single line. Approve or edit it.
3
Watch it execute
Pi writes, runs, and self-corrects. You see diffs before they're applied.
4
Accept / reject diffs
You're always in control. Accept changes file-by-file or roll back entirely.
EXAMPLE TASK PROMPTS
// Feature work
"Add a POST /user/preferences endpoint
 with Zod validation in src/routes/"
// Cross-file refactor
"Migrate all axios calls in /services
 to use our new httpClient wrapper"
// Test generation
"Write integration tests for the
 BookingService covering edge cases"
✅ USE PI AGENT WHEN
The task spans multiple files or needs autonomous execution. Think: Claude Code replacement.
💡 PRO TIP
Name the files and modules explicitly. Pi performs significantly better with "in src/services/user.service.ts" than vague instructions.
PickYourTrail · Engineering

The Inference Mindset Shift

How to think about tasks now that speed is no longer free
⚠️  THE OLD HABIT  (Claude / GPT-4o)
🚀  "Just throw it at the model"
Frontier models are so fast you barely notice a bad prompt. Get a bad result? Re-run in 3 seconds.
📋  "Queue 5 tasks at once"
Speed lets engineers pipeline tasks without reading the output of the previous one carefully.
🙈  "I'll review it later"
Code gets merged without deeply understanding what was written. Technical debt accumulates silently.
🌊  "Vibe coding"
Prompts are vague. Context is missing. The model guesses — and you accept whatever it returns.
✅  THE NEW APPROACH  (OpenRouter stack)
🧠  Think first, prompt once
Inference latency is real. Spend 2 minutes writing a precise task — you save 10 minutes of re-runs.
👁️  Read the plan before approving
Pi Agent shows you a plan. Actually read it. Catching a wrong assumption at step 1 beats fixing 300 lines later.
📦  One task at a time
Complete and review one task fully before starting the next. Compounding errors across queued tasks are hard to untangle.
✍️  Own the code
Review every diff. If you can't explain what was written, don't merge it. You're still the engineer.
The bottleneck moved. With frontier models, the bottleneck was your thinking speed. With this stack, it's inference time. That's actually a feature — it gives you a moment to stay in the loop.
PickYourTrail · Engineering

Token Consumption & ROI

Tokens spent ≠ value delivered — learn to close the gap
🤔
Ask yourself after every task:
"How many tokens did I burn — and did the output actually land in production?"
🔴  LOW TOKEN ROI — warning signs
You ran the same task 4+ times
Each re-run multiplies token spend. The problem is almost always the original prompt — not the model.
You accepted code you can't explain
Tokens were spent generating it, but the output has zero ROI if it can't be reviewed, owned, or maintained.
The task generated code that wasn't shipped
Exploratory generation is fine — but if it's a pattern, you're using the AI to avoid thinking, not to accelerate it.
You dumped the whole codebase as context
Massive context = massive token bill. Surgical context (the right 2 files) almost always beats a full repo dump.
🟢  HIGH TOKEN ROI — what good looks like
One precise prompt → one usable output
You spent 30 seconds thinking before prompting. First result needed minor edits and was merged same day.
You reviewed the diff line by line
You understand every change. If a bug surfaces later, you can debug it without re-running the agent from scratch.
Context was scoped to what was needed
You pointed the agent at the right file and function. Not the whole src/ directory.
📐 Build this habit
Before starting a task, write one sentence: "I want X in file Y, using pattern Z." If you can't write it, you're not ready to prompt yet.
PickYourTrail · Engineering

The Model Config

kilo.ai config.json — your team's model strategy
1
{
2
  "model": "openrouter/minimax/minimax-m2.7",← primary model (code gen)
3
  "small_model": "openrouter/moonshotai/kimi-k2.6",← fast tasks
4
  "agent": {
5
    "plan":        { "model": "openrouter/z-ai/glm-5.1" },← planning
6
    "orchestrator":{ "model": "openrouter/z-ai/glm-5.1" },← coordination
7
    "code":        { "model": "openrouter/minimax/minimax-m2.7" },← code gen
8
    "debug":       { "model": "openrouter/moonshotai/kimi-k2.6" },← debugging
9
    "ask":         { "model": "openrouter/moonshotai/kimi-k2.6" }← quick Q&A
10
  }
11
}
Minimax M2.7 — heavyweight code
Kimi K2.6 — fast & cheap
GLM-5.1 — reasoning & orchestration
PickYourTrail · Engineering

Which Model Does What?

Right model for the right job — this is the whole point
🦁
Minimax M2.7
minimax · openrouter
1M ctx
Primary / Code
Massive context window, reasoning, excellent for long code files and complex generation tasks.
code generation complex tasks primary model
Kimi K2.6
moonshotai · openrouter
262K ctx
Fast / Debug / Ask
Fast and cost-efficient — perfect for iterative debugging, quick questions, and small edits.
debugging quick Q&A small_model
🧭
GLM-5.1
z-ai · openrouter
128K ctx
Orchestrator / Plan
Strong reasoning model for task planning, breaking down features into steps, and coordinating agents.
planning orchestration
🌟
Grok 4.3
x-ai · openrouter
256K ctx
Available / Spare
In the config and ready to use — assign to any agent role as needed for specific tasks.
configurable available
PickYourTrail · Engineering

Next Steps & Q&A

👤 Everyone
Get your OpenRouter API key from the team lead
👩‍💻 Engineers
Install Kilo Code VS Code extension
Drop kilo.ai config.json into your project
Try Pi Agent on a feature branch this sprint
ANTICIPATED Q&A
Q: What about existing Claude Code tasks?
Finish them up this sprint, then migrate to Pi Agent going forward.
Q: Will Copilot still work?
Yes during transition, but Kilo Code is the target. Please switch by end of sprint.
PickYourTrail · Engineering
1 / 12 ← → Space