Open-source, powered by Kimi K2.7 on Cloudflare Workers AI — with AI Gateway observability.
/health that returns { ok: true }.
/cost
Plan mode blocks all mutating tools for safe research. Edit mode prompts per call. Auto mode approves everything for trusted tasks.
For multi-step work, the agent publishes a task list with progress icons, elapsed time, and token deltas, so you can see at a glance which steps are done and which remain.
Drop image paths (PNG, JPG, WebP, GIF, BMP up to 5 MB) into any prompt. The model sees them inline — perfect for UI reviews, diagrams, and screenshots.
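As a rough illustration of the limits above, a pre-flight check on an image path might look like the sketch below. The function and constant names are hypothetical, reading "5 MB" as 5 MiB is an assumption, and accepting `.jpeg` alongside `.jpg` is also an assumption; none of this is taken from kimiflare's source.

```typescript
// Hypothetical helper: names and exact rules are assumptions for illustration.
// ".jpeg" is accepted here in addition to ".jpg"; that is also an assumption.
const ALLOWED_IMAGE_EXTENSIONS = ["png", "jpg", "jpeg", "webp", "gif", "bmp"];
const MAX_IMAGE_BYTES = 5 * 1024 * 1024; // "5 MB" interpreted as 5 MiB

type ImageCheck = { ok: true } | { ok: false; reason: string };

function checkImageAttachment(path: string, sizeBytes: number): ImageCheck {
  // Take everything after the last dot as the extension, case-insensitively.
  const dot = path.lastIndexOf(".");
  const ext = dot === -1 ? "" : path.slice(dot + 1).toLowerCase();

  if (ALLOWED_IMAGE_EXTENSIONS.indexOf(ext) === -1) {
    return { ok: false, reason: `unsupported image type: ${ext || "(none)"}` };
  }
  if (sizeBytes > MAX_IMAGE_BYTES) {
    return { ok: false, reason: "image exceeds the 5 MB limit" };
  }
  return { ok: true };
}
```

A caller would pass the path from the prompt and the file size it read from disk, and only attach the image when `ok` is true.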
Fully customizable color palettes with WCAG contrast validation. Pick from built-in presets or define your own. Live preview with Ctrl+T.
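For readers defining their own palette, the check behind WCAG contrast validation can be sketched as follows. The relative-luminance and contrast-ratio formulas are the ones published in WCAG 2.x. The function names, the `#rrggbb` input format, and the choice of the AA threshold for normal text (4.5:1) are assumptions, since the page does not say which level kimiflare enforces.

```typescript
// Convert one 8-bit sRGB channel to its linear value (WCAG 2.x definition).
function channelToLinear(value8bit: number): number {
  const c = value8bit / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a "#rrggbb" color (input format is an assumption).
function relativeLuminance(hex: string): number {
  const n = parseInt(hex.replace("#", ""), 16);
  const r = (n >> 16) & 0xff;
  const g = (n >> 8) & 0xff;
  const b = n & 0xff;
  return (
    0.2126 * channelToLinear(r) +
    0.7152 * channelToLinear(g) +
    0.0722 * channelToLinear(b)
  );
}

// Contrast ratio between two colors, always >= 1.
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// WCAG AA asks for at least 4.5:1 for normal-size text.
function meetsAA(foreground: string, background: string): boolean {
  return contrastRatio(foreground, background) >= 4.5;
}
```

A custom theme could run `meetsAA` over each foreground and background pair and flag the pairs that fall below the threshold.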
Research the web, read GitHub repos, and fetch JavaScript-rendered pages — all without leaving your terminal.
Hover, go-to-definition, references, and diagnostics via Language Server Protocol. Auto-configured per project with an interactive wizard.
The agent picks the right skill depth for the task — from quick edits to deep research — with graceful preemption and visible TUI indicators.
Toggle the model's chain-of-thought with /reasoning or Ctrl+R. See how it thinks in real time.
Every turn is auto-saved. /resume lists past sessions with message counts in a paginated picker. Never lose your place.
Session-wide approval for Bash is keyed by the command's first token, so approving one git command allows all git commands for the rest of the session. Write and edit calls show a unified diff before you approve.
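To make the first-token keying concrete, here is a minimal sketch of a session allow-list. The class and method names are hypothetical, and splitting on whitespace is an assumption about how the first token is found; it is not kimiflare's actual implementation.

```typescript
// Hypothetical sketch: names are illustrative, not kimiflare's API.
class BashSessionAllowList {
  private allowed = new Set<string>();

  // The key is the first whitespace-separated token of the command line.
  private static keyFor(command: string): string {
    return command.trim().split(/\s+/)[0] ?? "";
  }

  // Record that the user approved this kind of command for the session.
  allowForSession(command: string): void {
    const key = BashSessionAllowList.keyFor(command);
    if (key !== "") this.allowed.add(key);
  }

  // True when a command with the same first token was approved earlier.
  isAllowed(command: string): boolean {
    return this.allowed.has(BashSessionAllowList.keyFor(command));
  }
}
```

One consequence of this design is worth keeping in mind: approving `git status` under such a scheme would also cover `git push`, because both share the first token `git`.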
With a 262K-token context window, the agent can read entire modules, large configs, and full stack traces without the model losing track. All on your Cloudflare account.
Plug in external tools via the Model Context Protocol — local stdio servers or remote SSE endpoints. GitHub, Sentry, docs search, databases, and more.
Every request flows through your own Cloudflare AI Gateway. Per-request logs, response caching with configurable TTL, authoritative per-turn cost from the logs API, and a per-feature breakdown in /cost. The status bar shows the gateway-confirmed cost and latency for the last turn.
The agent never surveils your conversation. Memories are stored only when you ask — via remember, recall, and forget tools — with SQLite + embeddings for durable, privacy-respecting retrieval across sessions.
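The remember, recall, and forget operations can be pictured with the sketch below. It is deliberately simplified: it keeps memories in an in-memory array rather than SQLite, it expects the caller to supply embedding vectors, and it ranks by cosine similarity, which is one common choice and an assumption here. All type and method names are hypothetical.

```typescript
// Hypothetical types; kimiflare's real schema is not documented on this page.
interface MemoryRecord {
  id: number;
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

class MemoryStore {
  private items: MemoryRecord[] = [];
  private nextId = 1;

  // Store a memory only when explicitly asked; returns its id.
  remember(text: string, embedding: number[]): number {
    const id = this.nextId++;
    this.items.push({ id, text, embedding });
    return id;
  }

  // Return the topK memories most similar to the query embedding.
  recall(queryEmbedding: number[], topK = 3): MemoryRecord[] {
    return this.items
      .map((m) => ({ m, score: cosineSimilarity(queryEmbedding, m.embedding) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, topK)
      .map((x) => x.m);
  }

  // Delete a memory by id; returns false if it did not exist.
  forget(id: number): boolean {
    const before = this.items.length;
    this.items = this.items.filter((m) => m.id !== id);
    return this.items.length < before;
  }
}
```

A durable version would persist the records and vectors in a database table instead of an array, which is where SQLite would come in.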
Plan mode: Read-only research. Mutating tools are hard-blocked. Ask "plan a refactor" and the agent investigates without touching your filesystem. Review, then exit plan mode to execute.
Edit mode (default): The agent calls tools freely for read-only work; mutating tools pause for your approval with a unified diff preview.
Auto mode: Autonomous execution. Every tool call is auto-approved. Use for trusted, well-scoped tasks. The agent still warns before irreversible actions.
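The plan, edit, and auto behaviors described above boil down to a small decision table. The sketch below expresses it as a function. The type names, the `mutating` flag, and the three decision labels are hypothetical and chosen for illustration; the warning that auto mode gives before irreversible actions is not modeled.

```typescript
// Hypothetical names; this only illustrates the policy described in the text.
type PermissionMode = "plan" | "edit" | "auto";
type PermissionDecision = "block" | "prompt" | "approve";

interface ToolCallInfo {
  name: string;
  mutating: boolean; // true for tools that write files or run commands
}

function decidePermission(
  mode: PermissionMode,
  call: ToolCallInfo
): PermissionDecision {
  // Read-only tools run freely in every mode.
  if (!call.mutating) return "approve";

  switch (mode) {
    case "plan":
      return "block"; // mutating tools are hard-blocked
    case "edit":
      return "prompt"; // pause for approval, with a diff preview
    case "auto":
      return "approve"; // every tool call is auto-approved
  }
}
```

Keeping the policy in one pure function like this makes it easy to test every mode and tool combination without starting an agent session.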
| Capability | kimiflare | Claude Code | Aider | Gemini CLI | Codex CLI |
|---|---|---|---|---|---|
| Open source | Yes (MIT) | No | Yes | Yes | Yes |
| Runs on your own cloud | Yes — your Cloudflare account | No — Anthropic API | BYO API key | No — Google API | No — OpenAI API |
| Default model | Kimi K2.7 (Workers AI) | Claude | Model-agnostic | Gemini | GPT / Codex |
| Context window | 262K tokens | 200K+ | Varies by model | 1M | Varies by model |
| Authoritative per-turn cost | Yes — AI Gateway logs | Estimated | Estimated | Estimated | Estimated |
| Image understanding | Yes | Yes | Partial | Yes | Yes |
| MCP extensibility | Yes | Yes | No | Yes | Yes |
| LSP code intelligence | Yes | Partial | No | No | No |
Comparison reflects publicly documented capabilities at time of writing and may change. See the detailed write-ups: Claude Code alternative · vs Claude Code · vs Aider · vs Gemini CLI · vs Codex CLI.
Or run without installing: npx kimiflare
kimiflare is an open-source, terminal-based AI coding agent — a self-hosted alternative to Claude Code — that runs on your own Cloudflare account. It is powered by the Kimi K2.7 model on Cloudflare Workers AI, with every request routed through your own Cloudflare AI Gateway for per-request logs, response caching, and authoritative cost.
Yes. kimiflare is an open-source Claude Code alternative. It offers the same terminal-first agentic workflow — reading and editing files, running commands, web research, and image understanding — but runs on Cloudflare Workers AI using Kimi K2.7 on your own account, and is MIT-licensed. See the Claude Code alternative page for a full breakdown.
Install kimiflare with npm install -g kimiflare (or run npx kimiflare), then run kimiflare. The onboarding flow connects your Cloudflare account and AI Gateway. All inference runs on Cloudflare Workers AI under your own credentials — no third-party API keys are required.
kimiflare uses Kimi K2.7 (@cf/moonshotai/kimi-k2.7-code) on Cloudflare Workers AI, with a 262K-token context window. You can override the model with the --model flag.
An AI coding harness is the program that wraps a large language model with tools, permissions, context management, and a user interface so it can act as an autonomous coding agent — reading and editing files, running commands, and researching the web. kimiflare is a Cloudflare-native, open-source coding harness for the terminal. Read more in What is an AI coding harness?
kimiflare itself is free and open source (MIT). You pay only Cloudflare's usage charges for Workers AI inference on your own account. Because every request flows through your Cloudflare AI Gateway, the /cost command shows authoritative per-turn cost taken from the gateway logs.