Claude Code GitHub Copilot → Kilo Code Pi Agent

The New Inference Harness

From Copilot to Polymodel Routing —
how we're rebuilding the dev toolchain

PickYourTrail · Engineering

Why Are We Switching?

💸

Cost Control

OpenRouter routes to the cheapest capable model for each task. Pay per token, not per seat.

🧠

Model Flexibility

Access 200+ models (Minimax, Kimi, Grok, GLM…) from one API key. No vendor lock-in.

🎯

Right Model for the Right Job

Use heavy reasoning models for planning, fast cheap models for simple edits.

🔧

Developer Ownership

Full config control in kilo.ai config. Engineers own the model strategy.

⚠️ Claude Code & GitHub Copilot are great tools, but they're closed ecosystems with fixed pricing and no model routing.

PickYourTrail · Engineering

What is OpenRouter?

OpenRouter is a unified API gateway that sits between your tools and AI model providers. One key, one endpoint, hundreds of models.

🛠️

Your Tools

Kilo Code
Pi Agent

single API key

🔀

OpenRouter

Routes · Balances · Logs

best model wins

Minimax m2.7

Moonshot kimi-k2.6

xAI grok-4.3

Zhipu glm-5.1

OpenAI gpt-4o

Anthropic claude-3.5

PickYourTrail · Engineering

Kilo Code

VS Code extension · replaces Copilot · routes via OpenRouter

01

Install Kilo Code

Search "Kilo Code" in VS Code Extensions panel and install

VS Code → Extensions → Search "Kilo Code"

02

Add OpenRouter API Key

Kilo Code settings → Provider: OpenRouter → paste your API key

Settings → API Provider → OpenRouter

03

Drop in the Config

Save kilo.ai config JSON in your project root — it auto-detects it

kilo.ai config.json → project root

04

Start Coding

Open Kilo Code panel and start asking — it picks the right model automatically

Ctrl+Shift+K to open

💡 Kilo Code replaces Copilot — same IDE experience, but model-agnostic and cost-optimised via OpenRouter.

PickYourTrail · Engineering

Pi Agent Coding

Agentic AI coding · replaces Claude Code

🤖

Agentic Coding

Pi Agent can plan, write, run, and iterate code autonomously — not just autocomplete.

🔀

Multi-step Tasks

Give it a feature request; it breaks it into subtasks, writes code, and self-reviews.

🔌

OpenRouter Powered

Uses your kilo.ai model config — the orchestrator and plan models do the heavy lifting.

🧪

Replaces Claude Code

Fills the same role as Claude Code (terminal-based agent) but open and configurable.

HOW A TASK FLOWS

💬

You describe

natural language

→

📋

Pi plans

GLM-5.1

→

⌨️

Pi codes

Minimax M2.7

→

🐛

Pi debugs

Kimi K2.6

→

✅

Done

reviewed & committed

PickYourTrail · Engineering

Using Kilo Code

Day-to-day commands & workflow inside VS Code

COMMANDS

Ctrl+L Open Chat Panel

Describe what you need. Kilo replies inline with code suggestions.

Select + chat Refactor / Fix

Highlight any block of code, open chat, and ask to refactor, fix, or explain.

Tab Accept Completion

Inline completions appear as you type — Tab to accept, Esc to dismiss.

/explain /fix /test /review

Slash commands for quick targeted actions in the chat panel.

EXAMPLE PROMPTS

// Quick edits
"Add error handling to this function"
"Write a unit test for UserService.create()"
// Refactor
"Refactor this into smaller functions"
"Convert this class component to a hook"
// Debug
"Why is this returning undefined?"
"Explain this stack trace"

✅ USE KILO WHEN

You're inside a file making targeted edits, completions, or reviews. Think: Copilot replacement.

PickYourTrail · Engineering

Using Pi Agent

Writing tasks, reviewing output, staying in control

TASK WORKFLOW

1

Write the task

Be specific — include file names, module, stack. Pi reads context from your repo.

2

Review the plan

Pi shows a step-by-step plan before writing a single line. Approve or edit it.

3

Watch it execute

Pi writes, runs, and self-corrects. You see diffs before they're applied.

4

Accept / reject diffs

You're always in control. Accept changes file-by-file or roll back entirely.

EXAMPLE TASK PROMPTS

// Feature work
"Add a POST /user/preferences endpoint
 with Zod validation in src/routes/"
// Cross-file refactor
"Migrate all axios calls in /services
 to use our new httpClient wrapper"
// Test generation
"Write integration tests for the
 BookingService covering edge cases"

✅ USE PI AGENT WHEN

The task spans multiple files or needs autonomous execution. Think: Claude Code replacement.

💡 PRO TIP

Name the files and modules explicitly. Pi performs significantly better with "in src/services/user.service.ts" than vague instructions.

PickYourTrail · Engineering

The Inference Mindset Shift

How to think about tasks now that speed is no longer free

⚠️ THE OLD HABIT (Claude / GPT-4o)

🚀 "Just throw it at the model"

Frontier models are so fast you barely notice a bad prompt. Get a bad result? Re-run in 3 seconds.

📋 "Queue 5 tasks at once"

Speed lets engineers pipeline tasks without reading the output of the previous one carefully.

🙈 "I'll review it later"

Code gets merged without deeply understanding what was written. Technical debt accumulates silently.

🌊 "Vibe coding"

Prompts are vague. Context is missing. The model guesses — and you accept whatever it returns.

✅ THE NEW APPROACH (OpenRouter stack)

🧠 Think first, prompt once

Inference latency is real. Spend 2 minutes writing a precise task — you save 10 minutes of re-runs.

👁️ Read the plan before approving

Pi Agent shows you a plan. Actually read it. Catching a wrong assumption at step 1 beats fixing 300 lines later.

📦 One task at a time

Complete and review one task fully before starting the next. Compounding errors across queued tasks are hard to untangle.

✍️ Own the code

Review every diff. If you can't explain what was written, don't merge it. You're still the engineer.

⚡ The bottleneck moved. With frontier models, the bottleneck was your thinking speed. With this stack, it's inference time. That's actually a feature — it gives you a moment to stay in the loop.

PickYourTrail · Engineering

Token Consumption & ROI

Tokens spent ≠ value delivered — learn to close the gap

🤔

Ask yourself after every task:

"How many tokens did I burn — and did the output actually land in production?"

🔴 LOW TOKEN ROI — warning signs

You ran the same task 4+ times

Each re-run multiplies token spend. The problem is almost always the original prompt — not the model.

You accepted code you can't explain

Tokens were spent generating it, but the output has zero ROI if it can't be reviewed, owned, or maintained.

The task generated code that wasn't shipped

Exploratory generation is fine — but if it's a pattern, you're using the AI to avoid thinking, not to accelerate it.

You dumped the whole codebase as context

Massive context = massive token bill. Surgical context (the right 2 files) almost always beats a full repo dump.

🟢 HIGH TOKEN ROI — what good looks like

One precise prompt → one usable output

You spent 30 seconds thinking before prompting. First result needed minor edits and was merged same day.

You reviewed the diff line by line

You understand every change. If a bug surfaces later, you can debug it without re-running the agent from scratch.

Context was scoped to what was needed

You pointed the agent at the right file and function. Not the whole src/ directory.

📐 Build this habit

Before starting a task, write one sentence: "I want X in file Y, using pattern Z." If you can't write it, you're not ready to prompt yet.

PickYourTrail · Engineering

The Model Config

kilo.ai config.json — your team's model strategy

1{
2  "model": "openrouter/minimax/minimax-m2.7",← primary model (code gen)
3  "small_model": "openrouter/moonshotai/kimi-k2.6",← fast tasks
4  "agent": {
5    "plan":        { "model": "openrouter/z-ai/glm-5.1" },← planning
6    "orchestrator":{ "model": "openrouter/z-ai/glm-5.1" },← coordination
7    "code":        { "model": "openrouter/minimax/minimax-m2.7" },← code gen
8    "debug":       { "model": "openrouter/moonshotai/kimi-k2.6" },← debugging
9    "ask":         { "model": "openrouter/moonshotai/kimi-k2.6" }← quick Q&A
10  }
11}

Minimax M2.7 — heavyweight code

Kimi K2.6 — fast & cheap

GLM-5.1 — reasoning & orchestration

PickYourTrail · Engineering

Which Model Does What?

Right model for the right job — this is the whole point

🦁

Minimax M2.7

minimax · openrouter

1M ctx

Primary / Code

Massive context window, reasoning, excellent for long code files and complex generation tasks.

code generation complex tasks primary model

⚡

Kimi K2.6

moonshotai · openrouter

262K ctx

Fast / Debug / Ask

Fast and cost-efficient — perfect for iterative debugging, quick questions, and small edits.

debugging quick Q&A small_model

🧭

GLM-5.1

z-ai · openrouter

128K ctx

Orchestrator / Plan

Strong reasoning model for task planning, breaking down features into steps, and coordinating agents.

planning orchestration

🌟

Grok 4.3

x-ai · openrouter

256K ctx

Available / Spare

In the config and ready to use — assign to any agent role as needed for specific tasks.

configurable available

PickYourTrail · Engineering

Next Steps & Q&A

👤 Everyone

✓Get your OpenRouter API key from the team lead

👩‍💻 Engineers

✓Install Kilo Code VS Code extension

✓Drop kilo.ai config.json into your project

✓Try Pi Agent on a feature branch this sprint

ANTICIPATED Q&A

Q: What about existing Claude Code tasks?

Finish them up this sprint, then migrate to Pi Agent going forward.

Q: Will Copilot still work?

Yes during transition, but Kilo Code is the target. Please switch by end of sprint.

PickYourTrail · Engineering