How to run an AI coding agent on your own Cloudflare account

You can run a Claude Code-style AI coding agent entirely on infrastructure you control. With kimiflare, an open-source (MIT) terminal coding agent, you install one npm package, connect your own Cloudflare account during onboarding, and every model call runs on Cloudflare Workers AI through your own AI Gateway. You need no third-party API keys, and cost is reported per turn by the gateway rather than estimated.

June 4, 2026 · ~6 min read

Short answer: Install kimiflare with npm install -g kimiflare, run kimiflare, and follow the onboarding to connect your Cloudflare account and AI Gateway. From then on, the agent reads and edits your code, runs commands, and answers questions using the Kimi K2.6 model on Workers AI — all under your own credentials, with logs and cost visible in your Cloudflare dashboard and the in-app /cost command.

Why run a coding agent on your own cloud?

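Most hosted coding agents route your prompts and code through a vendor's servers and bill you on their terms. Running the agent on your own Cloudflare account changes the trade-offs: model calls run under your own credentials, every request is logged in your own AI Gateway, and cost is reported by the gateway rather than estimated.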

What you need

You need a Cloudflare account and Node.js ≥ 20 (which provides npm). That's the entire prerequisite list. You do not need a separate model provider account or any API key other than your Cloudflare credentials.

Install kimiflare

```bash
# Install globally
npm install -g kimiflare

# Launch; the first run starts onboarding
kimiflare
```

Prefer not to install globally? Run it on demand with npx kimiflare. Either way you need Node.js ≥ 20.
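If you want to confirm the Node.js requirement before installing, a small check like the following can help. This is a minimal sketch: the helper name node_major_ok is hypothetical and not part of kimiflare, and it only inspects the major version reported by node --version.

```bash
# Succeeds when a Node.js version string (e.g. "v20.11.1") has major version 20 or newer.
node_major_ok() {
  local version="${1#v}"        # strip the leading "v"
  local major="${version%%.*}"  # keep the text before the first dot
  case "$major" in
    ''|*[!0-9]*) return 1 ;;    # reject empty or non-numeric input
  esac
  [ "$major" -ge 20 ]
}

if command -v node >/dev/null 2>&1; then
  if node_major_ok "$(node --version)"; then
    echo "Node.js $(node --version) is new enough; run: npm install -g kimiflare"
  else
    echo "Node.js $(node --version) is too old; kimiflare needs >= 20"
  fi
else
  echo "Node.js not found; install Node.js >= 20 first"
fi
```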

Onboarding: connect your Cloudflare account

The first time you run kimiflare, onboarding walks you through connecting your Cloudflare account and setting up an AI Gateway. The AI Gateway is the piece that makes the rest of the experience work: it sits in front of Workers AI, records a log for every request, can cache responses, and reports real cost. Once onboarding finishes, you're in the terminal UI and ready to give the agent tasks.

What happens on each turn

When you ask kimiflare to do something, a single turn flows like this:

  1. Your request leaves the terminal and goes to your own Cloudflare AI Gateway.
  2. The gateway forwards it to Workers AI, where the Kimi K2.6 model (@cf/moonshotai/kimi-k2.6) runs with a 262K-token context window.
  3. The model decides which tools to call — reading files, editing code, running shell commands, searching, and so on.
  4. Tool results feed back into the model until the task is done, and the result lands back in your terminal.

Because the gateway is on the path of every call, you get logs and cost for free. If you ever want a different model, you can override the default with --model.
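To make steps 1 and 2 concrete, here is a rough sketch of what a single request through an AI Gateway to Workers AI can look like when sent by hand. It assumes Cloudflare's documented gateway URL shape for the Workers AI provider. The environment variable names (CF_ACCOUNT_ID, CF_GATEWAY_ID, CF_API_TOKEN) and both helper functions are hypothetical, and kimiflare's actual requests may differ, for example by including tool definitions.

```bash
# Build the AI Gateway endpoint for a Workers AI model.
gateway_url() {
  local account_id="$1" gateway_id="$2" model="$3"
  printf 'https://gateway.ai.cloudflare.com/v1/%s/%s/workers-ai/%s\n' \
    "$account_id" "$gateway_id" "$model"
}

# Send one chat-style prompt through the gateway.
# Calling this makes a real, billable request. The prompt is pasted into JSON
# as-is, so it must not contain double quotes or backslashes.
ask_once() {
  local prompt="$1"
  curl -sS "$(gateway_url "$CF_ACCOUNT_ID" "$CF_GATEWAY_ID" "@cf/moonshotai/kimi-k2.6")" \
    -H "Authorization: Bearer $CF_API_TOKEN" \
    -H "Content-Type: application/json" \
    -d "{\"messages\":[{\"role\":\"user\",\"content\":\"$prompt\"}]}"
}

# Print the endpoint only; no network call is made here.
gateway_url "my-account-id" "my-gateway" "@cf/moonshotai/kimi-k2.6"
```

Running ask_once "Say hello" with the three variables set should then show up as one log entry in that gateway, which is a quick way to confirm the gateway is wired up before relying on it for cost data.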

Cost and observability via AI Gateway

This is the practical payoff of running on your own cloud. Every request is logged in your Cloudflare AI Gateway, so you can inspect prompts, responses, latency, and spend directly in the Cloudflare dashboard. Inside the app, the /cost command surfaces authoritative cost broken down per turn and per feature — confirmed by the gateway rather than estimated from token counts. The gateway can also cache responses, which can reduce repeat spend.
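For requests you send to the gateway yourself, caching can be controlled per request with AI Gateway's cache headers. The sketch below assumes the documented cf-aig-cache-ttl and cf-aig-skip-cache headers. The cache_header helper is hypothetical, and whether kimiflare sets these headers itself is not something this article states.

```bash
# Pick the AI Gateway cache header for one request.
# A positive TTL (in seconds) asks the gateway to cache the response;
# anything else asks it to bypass the cache for this request.
cache_header() {
  local ttl="$1"
  if [ "$ttl" -gt 0 ] 2>/dev/null; then
    printf 'cf-aig-cache-ttl: %s\n' "$ttl"
  else
    printf 'cf-aig-skip-cache: true\n'
  fi
}

cache_header 3600   # cache identical requests for an hour
cache_header 0      # bypass the cache and run the model
```

The output can be passed straight to curl, for example with -H "$(cache_header 3600)" on a request to your gateway URL. Only identical requests are served from cache, so the savings apply mainly to repeated prompts.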

Stay in control with Plan, Edit, and Auto modes

An autonomous agent that edits files and runs shell commands needs guardrails. kimiflare ships three modes, Plan, Edit, and Auto, so you can dial in how much it does on its own.

A common workflow is to start in Plan to scope the work, switch to Edit to apply changes with diffs you can review, and reach for Auto only on routine, low-risk tasks.

First tasks to try

Once you're set up, these are good ways to get a feel for the agent:

That's the whole loop: an open-source Claude Code alternative running on your own Cloudflare account, with the model, the logs, and the bill all in one place you already control.
