Kimi-K2.6 · Cloudflare Workers AI

Claude Code alternative, on your own Cloudflare account.

An open-source terminal coding agent powered by Kimi K2.6, MCP tools, local memory, and direct Workers AI execution — no API middleman.

zsh
kimiflare Ready when you are.
  › Explain this codebase
  › Find and fix a bug
  › Refactor a file
Type a message or /help for commands · ctrl-c to exit · shift+tab to cycle modes
add a /health endpoint
────────────────────────────────────────
thinking… I'll need to read the server file first, then add the endpoint.
read(src/server.ts)
edit src/server.ts
Permission requested
tool: edit
action: edit src/server.ts
@@ -42,6 +42,10 @@
  app.get('/', …)
+  app.get('/health', (_, res) => res.json({ ok: true }))
  …
Allow once   Allow for this session   Deny
edit src/server.ts
Done — added /health that returns { ok: true }.
────────────────────────────────────────
■ Update landing page terminal (12s · ↑ 2.3k tokens)
Read current UI components
Update terminal simulation
Create PR
[edit] k2.6 · medium · thinking…
in 2,847 (1,203 cached) · out 412 · ctx 12% · 0.00321

Recently shipped

  • Cloudflare Code Mode support
  • Local structured agent memory
  • 70–90% token cost reduction work
  • Session compaction improvements

Coming next

  • OpenCode parity features
  • Cost attribution dashboard
  • Cloudflare Artifacts parallel agents

What it does

01

Plan / Edit / Auto modes

Plan mode blocks all mutating tools for safe research. Edit mode prompts per call. Auto mode approves everything for trusted tasks.

02

Live task panel

For multi-step work, the agent publishes a task list with progress icons, elapsed time, and token deltas, so you can see what is done and what is still pending.

03

Image understanding

Drop image paths (PNG, JPG, WebP, GIF, BMP up to 5 MB) into any prompt. The model sees them inline — perfect for UI reviews, diagrams, and screenshots.
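
The format and size limits above can be expressed as a small pre-flight check. This is a minimal sketch, not kimiflare's actual code: `isAcceptedImage` is a hypothetical helper, "5 MB" is assumed to mean 5 × 1024 × 1024 bytes, and accepting `.jpeg` alongside `.jpg` is also an assumption.

```typescript
// Hypothetical helper: names and exact limits are illustrative assumptions.
const IMAGE_EXTENSIONS = new Set(["png", "jpg", "jpeg", "webp", "gif", "bmp"]);
const MAX_IMAGE_BYTES = 5 * 1024 * 1024; // assumption: "5 MB" read as 5 MiB

function isAcceptedImage(path: string, sizeBytes: number): boolean {
  const dot = path.lastIndexOf(".");
  if (dot === -1) return false; // no extension at all
  const ext = path.slice(dot + 1).toLowerCase();
  return IMAGE_EXTENSIONS.has(ext) && sizeBytes <= MAX_IMAGE_BYTES;
}
```

A path that fails the check would simply be treated as ordinary prompt text in this sketch.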

04

14 terminal themes

The 14 themes include dark, light, high-contrast, dracula, nord, one-dark, monokai, solarized, tokyo-night, gruvbox, catppuccin, and rose-pine. Live preview with Ctrl+T.

05

Paste collapse

Large pastes collapse to [pasted N lines #id]. Full content still goes to the model — scrollback stays clean.
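
One way to picture paste collapse: keep two views of the same paste, a short placeholder for the scrollback and the full text for the model. The sketch below is an assumption-laden illustration; `collapsePaste`, the 10-line threshold, and the numeric id scheme are hypothetical, and only the `[pasted N lines #id]` format comes from the description above.

```typescript
// Hypothetical threshold: the real cutoff is not documented here.
const COLLAPSE_THRESHOLD = 10;

interface CollapsedPaste {
  display: string; // what the scrollback shows
  full: string;    // what is still sent to the model
}

function collapsePaste(text: string, id: number): CollapsedPaste {
  const lineCount = text.split("\n").length;
  if (lineCount <= COLLAPSE_THRESHOLD) {
    return { display: text, full: text }; // small pastes are shown as-is
  }
  return { display: `[pasted ${lineCount} lines #${id}]`, full: text };
}
```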

06

Type-ahead queue

Type your next prompt while the model is still working. Queued prompts fire in order. Ctrl-C aborts current and clears queue.
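
The queueing behaviour described above is essentially a FIFO with a clear operation. A minimal sketch follows; `PromptQueue` and its method names are hypothetical, and aborting the in-flight request on Ctrl-C is deliberately left out.

```typescript
// Hypothetical class: models only the queue, not the in-flight request.
class PromptQueue {
  private pending: string[] = [];

  // Called when the user submits while the model is still working.
  enqueue(prompt: string): void {
    this.pending.push(prompt);
  }

  // Called when the current turn finishes; returns the next prompt, if any.
  next(): string | undefined {
    return this.pending.shift();
  }

  // Called on Ctrl-C: drop everything that was queued.
  clear(): void {
    this.pending.length = 0;
  }

  get size(): number {
    return this.pending.length;
  }
}
```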

07

Smart context management

Compiled context extracts structured state and archives raw tool outputs as recallable artifacts. Auto-compaction kicks in at ~80% context usage, so earlier findings stay retrievable as the session grows.
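
The ~80% trigger can be thought of as a simple ratio check against the context window. In this sketch `shouldCompact` is a hypothetical name, the 262K window is assumed to mean 262,000 tokens, and how token usage is measured is not specified by the description above.

```typescript
// Assumptions: "262K" is taken as 262,000 tokens; 0.8 mirrors the "~80%" figure.
const CONTEXT_WINDOW_TOKENS = 262_000;
const COMPACT_AT_RATIO = 0.8;

function shouldCompact(
  usedTokens: number,
  windowTokens: number = CONTEXT_WINDOW_TOKENS,
): boolean {
  return usedTokens / windowTokens >= COMPACT_AT_RATIO;
}
```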

08

Streaming reasoning

Toggle the model's chain-of-thought with /reasoning or Ctrl-R. See how it thinks in real time.

09

Session persistence

Every turn is auto-saved. /resume lists past sessions with message counts in a paginated picker. Never lose your place.

10

Smart permissions

Bash session-allow is keyed by the command's first token, so approving one git command allows all git commands for the rest of the session. Write and edit show a unified diff before you approve.
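
Keying the allow-list by first token might look like the sketch below. `BashSessionAllow` is a hypothetical class, and tokenizing on whitespace is an assumption; this version does not parse shell operators such as `&&` or pipes, which a real implementation would need to consider.

```typescript
// Hypothetical class: illustrates first-token keying only.
class BashSessionAllow {
  private allowed = new Set<string>();

  private static firstToken(command: string): string {
    return command.trim().split(/\s+/)[0] ?? "";
  }

  // Called when the user picks "Allow for this session".
  allowForSession(command: string): void {
    const token = BashSessionAllow.firstToken(command);
    if (token !== "") this.allowed.add(token);
  }

  // True if a later command shares the first token of an approved one.
  isAllowed(command: string): boolean {
    return this.allowed.has(BashSessionAllow.firstToken(command));
  }
}
```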

11

262K context window

Read entire modules, large configs, and full stack traces without the model losing track. Direct to Cloudflare — no middleman.

12

MCP server integration

Plug in external tools via the Model Context Protocol — local stdio servers or remote SSE endpoints. GitHub, Sentry, docs search, databases, and more.

13

Prefix cache optimization

Deterministic system prompts, stable JSON serialization, and session-affinity headers maximize Workers AI prefix-cache hits. Cached tokens are billed at a discount — cost drops as the conversation grows.
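
Stable JSON serialization matters here because a prefix cache can only match when the request bytes are identical from turn to turn. One common way to get that is to sort object keys before serializing. The sketch below illustrates the general technique; `stableStringify` is a hypothetical helper, not necessarily how kimiflare does it.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with object keys in sorted order so equal values always
// produce byte-identical output, regardless of insertion order.
function stableStringify(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableStringify(item)).join(",")}]`;
  }
  const keys = Object.keys(value).sort();
  const parts = keys.map(
    (key) => `${JSON.stringify(key)}:${stableStringify(value[key] as Json)}`,
  );
  return `{${parts.join(",")}}`;
}
```

Two tool schemas built with different key orders then serialize to the same string, so the shared prompt prefix stays stable.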

14

Explicit cross-session memory

The agent never surveils your conversation. Memories are stored only when you ask — via remember, recall, and forget tools — with SQLite + embeddings for durable, privacy-respecting retrieval across sessions.
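
To make the remember / recall / forget trio concrete, here is a rough sketch of embedding-based retrieval. Everything in it is an illustration under stated assumptions: `MemoryStore` and `MemoryItem` are hypothetical, an in-memory array stands in for SQLite, embeddings are supplied by the caller, and ranking by cosine similarity is an assumed choice.

```typescript
interface MemoryItem {
  id: number;
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors; 0 if either has zero length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Hypothetical store: an array stands in for the SQLite table.
class MemoryStore {
  private items: MemoryItem[] = [];
  private nextId = 1;

  remember(text: string, embedding: number[]): number {
    const id = this.nextId++;
    this.items.push({ id, text, embedding });
    return id;
  }

  // Return the k memories most similar to the query embedding.
  recall(query: number[], k = 3): MemoryItem[] {
    return [...this.items]
      .sort((x, y) => cosine(y.embedding, query) - cosine(x.embedding, query))
      .slice(0, k);
  }

  forget(id: number): boolean {
    const before = this.items.length;
    this.items = this.items.filter((item) => item.id !== id);
    return this.items.length < before;
  }
}
```

Nothing is stored unless `remember` is called explicitly, which matches the opt-in behaviour described above.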

kimiflare Node.js TUI
user msg → agent loop → runKimi()
POST SSE to Workers AI
api.cloudflare.com @cf/moonshotai/kimi-k2.6
tool result ← tool executor ← tool_calls
permission modal for write / edit / bash
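
The flow in the diagram above can be sketched as a loop that alternates between the model and the tool executor until the model stops requesting tools. This is a simplified, synchronous illustration: the `Model` function stands in for `runKimi()` (which streams over SSE in the real tool), and all type names, the `maxSteps` guard, and the message shapes are assumptions.

```typescript
interface ToolCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

interface ModelReply {
  content: string;
  toolCalls: ToolCall[];
}

type ChatMessage =
  | { role: "user" | "assistant"; content: string }
  | { role: "tool"; toolCallId: string; content: string };

// Stand-ins for runKimi() and the tool executor (both hypothetical signatures).
type Model = (history: ChatMessage[]) => ModelReply;
type ToolExecutor = (call: ToolCall) => string;

function agentLoop(
  userMsg: string,
  model: Model,
  execute: ToolExecutor,
  maxSteps = 8,
): string {
  const history: ChatMessage[] = [{ role: "user", content: userMsg }];
  for (let step = 0; step < maxSteps; step++) {
    const reply = model(history);
    history.push({ role: "assistant", content: reply.content });
    if (reply.toolCalls.length === 0) return reply.content; // final answer
    for (const call of reply.toolCalls) {
      // A permission check for write / edit / bash would sit here.
      history.push({ role: "tool", toolCallId: call.id, content: execute(call) });
    }
  }
  throw new Error("step limit reached without a final answer");
}
```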

Three ways to work

01

Plan

Read-only research. Mutating tools are hard-blocked. Ask "plan a refactor" and the agent investigates without touching your filesystem. Review, then exit plan mode to execute.

02

Edit

Default mode. The agent calls tools freely for read-only work; mutating tools pause for your approval with a unified diff preview.

03

Auto

Autonomous execution. Every tool call is auto-approved. Use for trusted, well-scoped tasks. The agent still warns before irreversible actions.
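
The three modes reduce to a small decision table over (mode, tool). The sketch below is illustrative only: `decide`, `Mode`, and `Decision` are hypothetical names, and the set of mutating tools is assumed to be write, edit, and bash, as listed for the permission modal.

```typescript
type Mode = "plan" | "edit" | "auto";
type Decision = "allow" | "prompt" | "block";

// Assumption: these are the only mutating built-in tools.
const MUTATING_TOOLS = new Set(["write", "edit", "bash"]);

function decide(mode: Mode, tool: string): Decision {
  if (!MUTATING_TOOLS.has(tool)) return "allow"; // read-only tools run freely
  if (mode === "plan") return "block";           // plan: mutating tools are hard-blocked
  if (mode === "edit") return "prompt";          // edit: pause for user approval
  return "allow";                                // auto: every call is approved
}
```

How MCP-provided tools are classified is not covered by this sketch.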

Get started

bash
# Install
npm install -g kimiflare

# Run — onboarding will ask for your Cloudflare credentials
kimiflare

Or run without installing: npx kimiflare

bash
# Interactive TUI
kimiflare

# One-shot mode
kimiflare -p "summarize PLAN.md"

# Auto-approve for scripts
kimiflare -p "..." --dangerously-allow-all

# Override model
kimiflare --model @cf/moonshotai/kimi-k2.6

# Stream reasoning to stderr
kimiflare --reasoning

# Image understanding — reference images inline
kimiflare
kimiflare -p "explain this diagram.png"

Core tools

Tool        Permission   Description
read        auto         Read a text file (≤ 2 MB) with optional line range
write       prompt       Create or overwrite a file. Shows a diff before approval
edit        prompt       Replace an exact substring. Fails unless unique match
bash        prompt       Run a shell command. Session-allow keyed by first token
glob        auto         Match files by pattern, sorted by mtime
grep        auto         Regex search. Uses ripgrep if available
web_fetch   auto         Fetch a URL, convert HTML → markdown (≤ 100 KB)
tasks_set   auto         Publish a live task list for multi-step work

Plus LSP intelligence (hover, go-to-definition, references, diagnostics), cross-session memory (remember / recall / forget), and MCP extensibility for plugging in external tool servers.
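
The edit tool's "exact substring, unique match" rule is easy to illustrate. The following is a minimal sketch of that rule, not the tool's real implementation; `applyEdit` and its error messages are hypothetical.

```typescript
// Replace oldText with newText only if oldText occurs exactly once.
function applyEdit(content: string, oldText: string, newText: string): string {
  if (oldText.length === 0) throw new Error("oldText must not be empty");
  const first = content.indexOf(oldText);
  if (first === -1) throw new Error("no match found");
  const second = content.indexOf(oldText, first + 1);
  if (second !== -1) throw new Error("match is not unique");
  return content.slice(0, first) + newText + content.slice(first + oldText.length);
}
```

Requiring a unique match means the model has to quote enough surrounding context to pin down one location, which avoids silently editing the wrong occurrence.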