How to run an AI coding agent on your own Cloudflare account

You can run a Claude Code-style AI coding agent entirely on infrastructure you control. With kimiflare — an open-source (MIT) terminal coding agent — you install one npm package, connect your own Cloudflare account during onboarding, and every model call runs on Cloudflare Workers AI through your own AI Gateway. No third-party API keys, no estimated billing, and authoritative per-turn cost.

June 4, 2026 · ~6 min read

Short answer: Install kimiflare with npm install -g kimiflare, run kimiflare, and follow the onboarding to connect your Cloudflare account and AI Gateway. From then on, the agent reads and edits your code, runs commands, and answers questions using the Kimi K2.6 model on Workers AI — all under your own credentials, with logs and cost visible in your Cloudflare dashboard and the in-app /cost command.

Why run a coding agent on your own cloud?

Most hosted coding agents route your prompts and code through a vendor's servers and bill you on their terms. Running the agent on your own Cloudflare account changes the trade-offs:

Control. Inference runs on Cloudflare Workers AI under your account. You decide which model runs and you can see every request.
Privacy. Requests flow through your own Cloudflare AI Gateway rather than a separate vendor's pipeline. There are no third-party API keys beyond your Cloudflare credentials.
Authoritative cost. Because every call passes through your AI Gateway, the cost you see is gateway-confirmed — not a token estimate. The in-app /cost command reports cost per turn and per feature.
No extra vendor. You already trust and pay Cloudflare. There is no additional company in the loop to onboard, audit, or pay separately.

What you need

A Cloudflare account (the agent connects to it during onboarding and uses Workers AI and AI Gateway).
Node.js ≥ 20 installed locally.
A terminal on macOS, Linux, or Windows.

That's the entire prerequisite list. You do not need a separate model provider account or any API key other than your Cloudflare credentials.

Install kimiflare

bash

# Install globally

npm install -g kimiflare

# Launch — first run starts onboarding

kimiflare

Prefer not to install globally? Run it on demand with npx kimiflare. Either way you need Node.js ≥ 20.

Onboarding: connect your Cloudflare account

The first time you run kimiflare, onboarding walks you through connecting your Cloudflare account and setting up an AI Gateway. The AI Gateway is the piece that makes the rest of the experience work: it sits in front of Workers AI, records a log for every request, can cache responses, and reports real cost. Once onboarding finishes, you're in the terminal UI and ready to give the agent tasks.

What happens on each turn

When you ask kimiflare to do something, a single turn flows like this:

Your request leaves the terminal and goes to your own Cloudflare AI Gateway.
The gateway forwards it to Workers AI, where the Kimi K2.6 model (@cf/moonshotai/kimi-k2.6) runs with a 262K-token context window.
The model decides which tools to call — reading files, editing code, running shell commands, searching, and so on.
Tool results feed back into the model until the task is done, and the result lands back in your terminal.

Because the gateway is on the path of every call, you get logs and cost for free. If you ever want a different model, you can override the default with --model.

Cost and observability via AI Gateway

This is the practical payoff of running on your own cloud. Every request is logged in your Cloudflare AI Gateway, so you can inspect prompts, responses, latency, and spend directly in the Cloudflare dashboard. Inside the app, the /cost command surfaces authoritative cost broken down per turn and per feature — confirmed by the gateway rather than estimated from token counts. The gateway can also cache responses, which can reduce repeat spend.

Stay in control with Plan, Edit, and Auto modes

An autonomous agent that edits files and runs shell commands needs guardrails. kimiflare ships three modes so you can dial in how much it does on its own:

Plan — read-only. Mutating tools are hard-blocked, so the agent can explore and propose a plan without touching anything.
Edit — the default. Mutating tools prompt for approval, and edits are shown as a unified diff before they're applied.
Auto — auto-approves trusted tasks when you want the agent to move quickly.

A common workflow is to start in Plan to scope the work, switch to Edit to apply changes with diffs you can review, and reach for Auto only on routine, low-risk tasks.

First tasks to try

Once you're set up, these are good ways to get a feel for the agent:

Ask it to explain an unfamiliar module or trace how a feature works across files.
Have it add a small feature or fix a bug in Edit mode and review the diff before approving.
Run a one-shot task without the interactive UI: kimiflare -p "summarize the changes in the last commit".
Drop in a screenshot of a UI bug — image understanding works inline.
Check /cost after a few turns to see authoritative spend from your AI Gateway.

That's the whole loop: an open-source Claude Code alternative running on your own Cloudflare account, with the model, the logs, and the bill all in one place you already control.

Install kimiflare View on GitHub