Zero inference on our servers

Typillar runs zero inference on its own servers. Every token used to plan or write your product is generated on your side, either on your Cloudflare Workers AI account or through your own provider key. This is a deliberate architectural choice, not a configuration option.

What “zero inference” means

Typillar’s control plane decides what work to do and orchestrates it.
The thinking, meaning every model call that reasons or writes code, runs on your Cloudflare Workers AI account or against your own provider key.
No prompts or generated code are processed by a model hosted by Typillar, because Typillar hosts none.

See Bring your own models for the two ways inference can run on your side.
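
As a rough illustration of the split described above, the sketch below routes a model call to one of the two customer-side options and fails when neither is configured, rather than falling back to a vendor-hosted model. It is not Typillar's actual code: the `InferenceConfig` shape, the `resolveInferenceTarget` function, and the rule that Workers AI wins when both options are present are all assumptions made for the example.

```typescript
// Hypothetical configuration shape. Field names are illustrative only and
// are not taken from Typillar's real schema.
interface InferenceConfig {
  workersAi?: { accountId: string };
  providerKey?: { provider: string; apiKey: string };
}

// Where a model call will run. Both variants live on the customer's side.
type InferenceTarget =
  | { runsOn: "customer-workers-ai"; accountId: string }
  | { runsOn: "customer-provider-key"; provider: string };

// Decide where a model call runs. There is deliberately no third branch
// that falls back to a model hosted by the orchestrator: with nothing
// configured, the call fails instead of running somewhere else.
function resolveInferenceTarget(config: InferenceConfig): InferenceTarget {
  // Assumption for this sketch: Workers AI takes precedence if both are set.
  if (config.workersAi) {
    return { runsOn: "customer-workers-ai", accountId: config.workersAi.accountId };
  }
  if (config.providerKey) {
    // The API key itself is not copied into the routing decision.
    return { runsOn: "customer-provider-key", provider: config.providerKey.provider };
  }
  throw new Error("No customer-side inference configured");
}
```

The point of the design is the missing fallback: under these assumptions, an empty configuration is an error, not a reason to run the call on the control plane.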

Why it’s built this way

A clean trust boundary. Your prompts and code never pass through a model we host. The control plane coordinates; your infrastructure does the reasoning and the running.
Cost transparency. You pay for inference directly on your own account — no proxy, no markup, no hidden token bill.
Control. You choose the model and provider and can change or revoke the key whenever you want.

How it fits the bigger picture

Zero inference is the inference half of Typillar’s ownership split: you own the account, the repo, the data, and every token of inference; Typillar owns the agents and the orchestration that put those tokens to work. The control plane is the office; the thinking and the running happen at your address.
