F1 is a BYOK proxy for the Anthropic API. Change two settings in your SDK init (base_url and api_key) and get real USD spend per key, per model, per call. Plus 5 optimization insights that find your savings automatically.
Your app → F1 → Anthropic. One proxy, near-zero latency penalty.
F1 holds your Anthropic key encrypted at rest (AES-GCM, per-account HKDF key). You still pay Anthropic directly. F1 charges a flat $19/mo for observability.
Runs on Cloudflare Workers. Auth lookup is KV-cached (5min TTL). Usage write goes via ctx.waitUntil() — your latency is unchanged.
Per-model price table covers the full Claude 4.x family. Input tokens, output tokens, cache-write premium, cache-read discount — all computed correctly.
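As a rough illustration of how the four token classes could combine into one USD figure, here is a minimal sketch. The model names, the per-million-token prices, and the cache multipliers (1.25x for cache writes, 0.10x for cache reads) are assumptions made for the example; they are not taken from F1's actual price table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Illustrative prices in USD per million tokens (assumed values)."""
    input_per_mtok: float
    output_per_mtok: float
    cache_write_multiplier: float = 1.25  # assumed premium over the base input rate
    cache_read_multiplier: float = 0.10   # assumed discount relative to the base input rate


# Hypothetical price table; the model names are placeholders.
PRICES = {
    "example-sonnet": ModelPrice(input_per_mtok=3.00, output_per_mtok=15.00),
    "example-haiku": ModelPrice(input_per_mtok=1.00, output_per_mtok=5.00),
}


def call_cost_usd(model, input_tokens, output_tokens,
                  cache_write_tokens=0, cache_read_tokens=0):
    """Sum the cost of each token class for a single call."""
    price = PRICES[model]
    per_input_token = price.input_per_mtok / 1_000_000
    per_output_token = price.output_per_mtok / 1_000_000
    return (
        input_tokens * per_input_token
        + cache_write_tokens * per_input_token * price.cache_write_multiplier
        + cache_read_tokens * per_input_token * price.cache_read_multiplier
        + output_tokens * per_output_token
    )
```

For example, call_cost_usd("example-sonnet", 1000, 500, cache_write_tokens=2000, cache_read_tokens=10000) adds up the four components separately, which makes it easy to see how much of a call's cost comes from caching.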
SQL-only, deterministic. Top expensive calls, model-mix Haiku-savings estimate, prompt-caching candidates, batchable workloads, since-cutover comparison.
Usage events record token counts and USD — never prompt text. Body storage is off by default. You can verify this in the open-source Worker code.
Every decryption of your Anthropic key is logged. Your dashboard shows the full history — you can verify your key was only used when you made requests.
Set a monthly $ threshold. The moment your Anthropic spend crosses it, F1 emails you (and optionally pings a Slack or Discord webhook). You get one alert per month, and no further alerts until you re-arm. Free on every tier.
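The fire-once-then-re-arm behavior described above can be modeled as a small state machine. This is a sketch under assumptions, not F1's implementation: the class and method names are hypothetical, and treating "crosses" as "greater than or equal to" is a guess.

```python
from dataclasses import dataclass


@dataclass
class BudgetAlert:
    """Hypothetical model of a monthly spend alert that fires once until re-armed."""
    threshold_usd: float
    armed: bool = True

    def record_spend(self, month_to_date_usd: float) -> bool:
        """Return True only on the call where an alert should be sent."""
        if self.armed and month_to_date_usd >= self.threshold_usd:
            self.armed = False  # disarm so later spend updates stay silent
            return True
        return False

    def rearm(self) -> None:
        """Allow the next threshold crossing to alert again."""
        self.armed = True
```

Keeping the armed flag separate from the spend total means repeated usage writes after the crossing cannot trigger duplicate emails or webhook calls.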
All five rules run against your own usage data. No LLM, no guesses — just SQL.
Sorted by USD cost. Shows model, token counts, and (opt-in) a 200-char prompt preview. Your most expensive call in one click.
Sonnet calls with <500 output tokens are Haiku candidates. F1 estimates the monthly delta based on your actual traffic.
Typical saving: 60–80% on short-output calls
Detects identical long inputs (≥2k tokens) repeated ≥10× in 7 days. Shows estimated savings from adding cache_control.
Prompts repeated ≥5× in 24 hours. The Anthropic Batch API gives 50% off for async use — F1 flags exactly which patterns qualify.
Set your migration date. F1 shows cumulative spend vs what the $100 Max 5x Agent SDK credit cap would have allowed over the same window.
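To make the "just SQL" idea concrete, the sketch below runs two of the rules (Haiku candidates and prompt-caching candidates) against an in-memory SQLite database, used here as a stand-in for D1. The usage_events schema, the input_hash column, the model names, and the sample rows are all assumptions for illustration; only the thresholds (fewer than 500 output tokens, at least 2k input tokens, at least 10 repeats in 7 days) come from the descriptions above.

```python
import sqlite3

# Hypothetical schema: token counts and USD only, plus a hash of the input
# so repeats can be detected without storing prompt text.
SCHEMA = """
CREATE TABLE usage_events (
    ts TEXT NOT NULL,             -- UTC, 'YYYY-MM-DD HH:MM:SS'
    model TEXT NOT NULL,
    input_tokens INTEGER NOT NULL,
    output_tokens INTEGER NOT NULL,
    cost_usd REAL NOT NULL,
    input_hash TEXT NOT NULL
)
"""

# Rule: Sonnet calls with fewer than 500 output tokens are Haiku candidates.
HAIKU_CANDIDATES = """
SELECT COUNT(*) AS calls, COALESCE(SUM(cost_usd), 0.0) AS spend_usd
FROM usage_events
WHERE model LIKE '%sonnet%' AND output_tokens < 500
"""

# Rule: identical inputs of >= 2000 tokens repeated >= 10 times in 7 days.
CACHING_CANDIDATES = """
SELECT input_hash, COUNT(*) AS repeats
FROM usage_events
WHERE input_tokens >= 2000
  AND ts >= datetime(:now, '-7 days')
GROUP BY input_hash
HAVING COUNT(*) >= 10
ORDER BY repeats DESC
"""


def demo_connection():
    """Create an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    rows = []
    # Twelve identical long inputs with short outputs, inside the window.
    rows += [("2026-01-05 12:00:00", "example-sonnet", 2500, 100, 0.009, "abc")] * 12
    # Only three repeats and long outputs: matches neither rule.
    rows += [("2026-01-06 09:00:00", "example-sonnet", 3000, 900, 0.0225, "xyz")] * 3
    # Repeated often, but older than 7 days and not a Sonnet model.
    rows += [("2025-12-01 09:00:00", "example-haiku", 4000, 50, 0.004, "old")] * 20
    conn.executemany("INSERT INTO usage_events VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def haiku_candidates(conn):
    """Return (number of candidate calls, their total spend in USD)."""
    return conn.execute(HAIKU_CANDIDATES).fetchone()


def caching_candidates(conn, now):
    """Return (input_hash, repeats) rows for the 7 days ending at `now`."""
    return conn.execute(CACHING_CANDIDATES, {"now": now}).fetchall()
```

Because each rule is a plain aggregate query, the same input data always yields the same result, which is what makes the insights deterministic.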
You can read the code, check the audit log, and verify every subprocessor.
MIT-licensed. Every deploy is tagged on GitHub. The dashboard footer links to the exact commit running right now.
Usage events contain token counts and USD. Body storage is opt-in (200-char prefix only). Never in logs.
AES-GCM with an HKDF-derived per-account key. Decryption happens only in the Worker runtime, and the plaintext key is never logged.
Every decryption is logged. Your dashboard shows you exactly when and why your Anthropic key was used.
Cloudflare, Stripe, Brevo, Anthropic. No analytics SDKs, no tracking pixels, no third-party JS on the dashboard.
Cancel in Stripe Portal → 30-day grace → all account, key, and usage data hard-deleted from D1.
You pay Anthropic directly for tokens. F1 is a flat observability fee.
Questions? hello@mini-on-ai.com
Change two lines in your SDK init: set base_url to the F1 Worker URL and api_key to your F1 key. Every other call — models, messages, streaming, tools — works exactly the same. F1 forwards your request verbatim to Anthropic and records the usage on the way back.
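In Python, the two overrides might look like the sketch below. The Worker URL and the F1 key are placeholder values, and the helper function is hypothetical; the base_url and api_key keyword arguments are the ones accepted by the official anthropic client constructor.

```python
def f1_client_kwargs(worker_url: str, f1_key: str) -> dict:
    """Build the two overrides: where requests go, and which key they carry."""
    return {"base_url": worker_url.rstrip("/"), "api_key": f1_key}


# Placeholder values; substitute the Worker URL and key that F1 gives you.
kwargs = f1_client_kwargs("https://f1.example.workers.dev/", "f1_example_key")

# With the `anthropic` package installed, the client is then created as:
#   import anthropic
#   client = anthropic.Anthropic(**kwargs)
#   client.messages.create(...)  # same call as in a direct-to-Anthropic setup
```

Everything after client construction stays as it was, so the change is limited to wherever your app initializes the SDK.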
F1 stores your Anthropic key encrypted at rest using AES-GCM with a per-account HKDF-derived key. Decryption happens only inside the Cloudflare Worker runtime, in the hot path. The key is never logged, never written to KV, and never leaves Cloudflare's environment. You can verify this in the open-source Worker code.
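For readers unfamiliar with per-account key derivation, here is a minimal sketch of HKDF (RFC 5869) using only the Python standard library. The choice of SHA-256, the 32-byte output, the empty salt, the "f1-account-key:" info prefix, and the idea of a single master secret are all assumptions for illustration; the actual parameters live in the Worker code. The AES-GCM step itself is not shown, since the standard library has no AES implementation.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 digest size in bytes


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF extract-then-expand with HMAC-SHA256, following RFC 5869."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF-SHA256")
    # Extract: condense the input keying material into a pseudorandom key.
    prk = hmac.new(salt or b"\x00" * HASH_LEN, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output bytes are available.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def account_key(master_secret: bytes, account_id: str) -> bytes:
    """Derive a distinct 32-byte key per account (hypothetical info string)."""
    info = b"f1-account-key:" + account_id.encode("utf-8")
    return hkdf_sha256(master_secret, salt=b"", info=info, length=32)
```

Binding the account ID into the info parameter means two accounts never share an encryption key, even though both keys come from the same secret.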
No. Usage events record model, token counts, USD cost, status code, and latency. Prompt text is never stored by default. You can opt in to storing the first 200 characters of each input (for the "Top expensive prompts" insight) — off by default.
Near zero in the happy path. Auth uses a KV-cached key lookup with a 5-minute TTL (typically <1ms). Usage writes go via ctx.waitUntil() — they happen after the response is returned and don't block your call. Only the D1 key lookup on first request (cold KV miss) adds ~5ms.
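The caching behavior described above (serve from cache for 5 minutes, fall back to the slower lookup on a miss) can be modeled with a small TTL cache. This is a sketch of the general pattern in Python, not the Worker's code: the class name, the loader callback, and the injectable clock are all hypothetical.

```python
import time


class TTLCache:
    """Cache values for a fixed time, reloading from a slower source on a miss."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock    # injectable so expiry can be tested without sleeping
        self._entries = {}     # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]    # warm hit: no call to the slower source
        value = loader(key)    # cold miss or expired: pay the lookup cost once
        self._entries[key] = (now + self._ttl, value)
        return value
```

With a 300-second TTL, only the first request per key in each 5-minute window pays for the slower lookup; every other request is served from the cache.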
Yes — F1 proxies streaming responses transparently without buffering. In v1, token usage is not extracted from SSE streams (recorded as 0 tokens). Non-streaming calls always capture full usage. Streaming usage extraction is on the v1.1 roadmap.
Starting June 15, 2026, Anthropic moves claude -p, the Agent SDK, and Claude Code GitHub Actions into a separate credit pool ($100/mo on Max 5x, $200 on Max 20x). If your pipeline runs on claude -p, it hits that cap and stops. Migrating to F1 (via the direct Anthropic API with a base_url override) means you are billed at pay-as-you-go API rates and have full USD visibility. F1 helps you manage that cost precisely.
Yes, any time in Stripe Customer Portal. Cancel → 30-day grace period → all account data, keys, and usage events are hard-deleted. No questions asked.