diff --git a/api-reference/error-codes.mdx b/api-reference/error-codes.mdx index 378ae2b..b9ba958 100644 --- a/api-reference/error-codes.mdx +++ b/api-reference/error-codes.mdx @@ -106,8 +106,9 @@ Tenant-scoped provider config (`PUT /v1/tenants/me/provider`) — how the runner |---|---|---|---| | `UZ-PROVIDER-001` | 400 | credential_ref required when mode=self_managed | PUT body must name a vault `credential_ref` when `mode` is `self_managed` | | `UZ-PROVIDER-002` | 400 | Credential row not found in vault | The named `credential_ref` has no vault row in the tenant's primary workspace | -| `UZ-PROVIDER-003` | 400 | Credential JSON missing required field | Stored credential must include `provider`, `api_key`, and `model` (all non-empty) | -| `UZ-PROVIDER-004` | 400 | Model not in cached caps catalogue | Effective model is absent from `core.model_caps`. Pick one from the model-caps endpoint. | +| `UZ-PROVIDER-003` | 400 | Credential JSON missing required field | Stored credential must include `provider` and `model` (non-empty); `api_key` is required for a named provider, optional for an `openai-compatible` endpoint | +| `UZ-PROVIDER-004` | 400 | Model not in cached caps catalogue | Effective model is absent from `core.model_caps` — named providers only. An `openai-compatible` endpoint bypasses the catalogue (it bills against your own provider). Pick a catalogued model from the model-caps endpoint. | +| `UZ-PROVIDER-005` | 400 | Custom endpoint base_url invalid or unsafe | An `openai-compatible` credential's `base_url` must be `https` and must not target a loopback, private, link-local, or cloud-metadata host; a `base_url` set on a named (non-`openai-compatible`) provider is rejected too | ## API limits diff --git a/changelog.mdx b/changelog.mdx index b7b666d..458671f 100644 --- a/changelog.mdx +++ b/changelog.mdx @@ -22,6 +22,45 @@ export const STAGE_SELF_MANAGED_M66 = "$0.0001"; agentsfleet is in **stealth-mode testing** and pre-production. APIs and agent behavior may change between releases without long deprecation windows. Email [agentsfleet@agentmail.to](mailto:agentsfleet@agentmail.to) if you want a hand calibrating an agent or to join as a design partner. + + ## One terminal-native dashboard, and any OpenAI-compatible endpoint + + The dashboard now reads as one product instead of several half-finished pages: a single underline tab style, one content width with an ambient glow on every page, a description under each page title, and an ink primary button in light mode — with Billing, Models, and Credentials rebuilt on it. And own-key model setup is no longer limited to the named providers: you can point a fleet at any OpenAI-compatible URL — self-hosted vLLM, a gateway, OpenRouter — from the CLI or the dashboard, with the endpoint validated before any run. + + ## What's new + + - **Billing** — a balance card with a full-width consumption meter and a terminal-style usage ledger (`date · amount · type · description`) with an empty state; the per-seat grid is gone, replaced by a single "Pay as you go" row. + - **Models and Credentials are now two destinations** — the sidebar lists `Models` (`/settings/models`) and `Credentials` (`/credentials`) separately; the Models screen collapses to two option cards, where the active one reads "Active — nothing to do" and the action lives only on the option you'd switch to. + - **A credentials vault** — `/credentials` groups model-provider keys, custom `NAME=value` secrets a SKILL reads by name, and integrations. Provider keys stay write-only and masked (Replace, never reveal). GitHub, Zoho, and Slack render as **Planned**, with a hint to bridge them as a custom secret until first-class connectors land. + - **Custom — OpenAI-compatible model setup** — add a credential with a base URL and an optional key, then point own-key setup at it; the base URL rides inside the saved credential, so the activation request shape is unchanged. + - **One install flow** — the Dashboard and Fleets pages share a single minimal install experience (a template, an `owner/repo` GitHub source, or pasted `SKILL.md`). One click proceeds in place through live importing → creating → done states instead of a separate review page, and a fleet that is still installing shows its progress until it is ready, then opens into its steer/chat. + + ## API reference + + A self-managed credential may now carry an OpenAI-compatible endpoint in its stored JSON. The `PUT /v1/tenants/me/provider` body is unchanged — the URL lives in the referenced credential: + + ```json + { + "provider": "openai-compatible", + "api_key": "", + "model": "", + "base_url": "https://host/v1" + } + ``` + + - **`base_url` is validated before any run** — it must be `https` and resolve to a public host. A loopback, private, link-local, or cloud-metadata target is rejected with `UZ-PROVIDER-005` and never dialed, and a `base_url` set on a named (non-`openai-compatible`) provider is rejected too. + + ## Bug fixes + + - **Custom endpoints now activate** — selecting an OpenAI-compatible credential previously failed because its model is not in the platform rate catalogue. Self-managed endpoints bill against your own provider, so they no longer require a catalogued model. + - **Account-modal email is readable in dark mode** — the secondary identifier in the account modal mapped to a too-dim token; it now uses a readable one in both themes. + - **Outbound model calls pin the validated address** — the non-streaming provider dial connects to the address checked at validation time instead of re-resolving at connect, closing a DNS-rebinding gap on custom endpoints. Requests still reach only the validated host. + + ## CLI + + - **`agentsfleet credential add`** accepts `--provider openai-compatible --base-url --model [--api-key ]` — the key is optional (a keyless gateway), the model is required, and a non-`https` `--base-url` is rejected at parse time with no network call. **`agentsfleet tenant provider add --credential `** then activates that credential as the workspace's model. Tool secrets (a GitHub token, a Slack token) keep the existing `credential add --data=@-` form. + + ## Grafana dashboards for token throughput, credit drain, and run latency diff --git a/cli/agentsfleet.mdx b/cli/agentsfleet.mdx index 246f215..e75cc15 100644 --- a/cli/agentsfleet.mdx +++ b/cli/agentsfleet.mdx @@ -262,6 +262,17 @@ JSON Default behaviour is skip-if-exists. Use `--force` to overwrite for rotation. +For a **model-provider** credential (own-key model setup), use typed flags instead of `--data` — including any OpenAI-compatible endpoint: + +```bash +agentsfleet credential add my-gateway \ + --provider openai-compatible --base-url https://gateway.example.com/v1 \ + --model my-model \ + --api-key sk-... # optional — omit for a keyless gateway +``` + +`--base-url` is required and `https`-only (rejected at parse, no network call); `--model` is required; `--api-key` is optional for a keyless gateway. The typed flags and `--data` are mutually exclusive. + See [Workspace credentials](/fleets/credentials) for the full vault model. ### `agentsfleet credential show ` @@ -447,7 +458,7 @@ For per-fleet spend breakdowns, use `agentsfleet logs --json` and agg ## Tenant provider -By default your tenant runs on **platform-managed** inference — agentsfleet holds a Fireworks key for Kimi K2.6 and bills you for the tokens at retail rate, bundled into the run charge. You can flip the tenant to **self-managed** at any time: store your own provider credential (Anthropic, OpenAI, Fireworks, Together, Groq, Moonshot, OpenRouter) in the vault under a name you choose, then point the tenant at it. On self-managed runs the per-second run fee still applies, but there is no added per-token cost from agentsfleet — you pay your provider directly. See [pricing](https://agentsfleet.net/#pricing) for current rates. +By default your tenant runs on **platform-managed** inference — agentsfleet holds a Fireworks key for Kimi K2.6 and bills you for the tokens at retail rate, bundled into the run charge. You can flip the tenant to **self-managed** at any time: store your own provider credential — a named provider (Anthropic, OpenAI, Fireworks, Together, Groq, Moonshot) or [any **OpenAI-compatible** endpoint](/fleets/credentials#model-provider-credentials-own-key-model-setup) (a self-hosted vLLM server, a gateway, OpenRouter) — in the vault under a name you choose, then point the tenant at it. On self-managed runs the per-second run fee still applies, but there is no added per-token cost from agentsfleet — you pay your provider directly. See [pricing](https://agentsfleet.net/#pricing) for current rates. ### `agentsfleet tenant provider show` @@ -458,9 +469,9 @@ agentsfleet tenant provider show agentsfleet tenant provider show --json ``` -### `agentsfleet tenant provider add` +### `agentsfleet tenant provider add --credential ` -Activate a self-managed credential for the tenant. The credential must already exist in the workspace vault (`agentsfleet credential add `). The command validates the credential structure, resolves the model's context cap from the `cap.json` endpoint, and pins both into the tenant's provider row. +Activate a self-managed credential for the tenant. The credential must already exist in the workspace vault (`agentsfleet credential add `). The command validates the credential structure and pins the model into the tenant's provider row. For a named provider it resolves the model's context cap from the `cap.json` endpoint; an **OpenAI-compatible** endpoint bypasses the caps catalogue (it bills against your own provider, so its model need not be catalogued). A missing, malformed, or unsafe-`base_url` credential is rejected with a typed `UZ-PROVIDER-*` error — see [Error codes](/api-reference/error-codes). ```bash # 1. Store the provider credential diff --git a/fleets/credentials.mdx b/fleets/credentials.mdx index 74c0cd2..97d8f5c 100644 --- a/fleets/credentials.mdx +++ b/fleets/credentials.mdx @@ -56,6 +56,33 @@ Exit codes: - `4` — missing `` or `--data`. - `5` — local config error, or `--data` input that isn't readable JSON. +## Model-provider credentials (own-key model setup) + +The credentials above are **tool secrets** — a key your fleet reaches an external service with through `http_request` (a GitHub token, a Slack token). The vault also holds the **model-provider key**: the credential your fleet's LLM calls authenticate with when you bring your own provider instead of the platform default. + +Named providers (`anthropic`, `openai`, …) take a key and a model. You can also point a fleet at **any OpenAI-compatible endpoint** — a self-hosted vLLM server, a gateway, OpenRouter — with typed flags instead of `--data`: + +```bash +agentsfleet credential add my-gateway \ + --provider openai-compatible \ + --base-url https://gateway.example.com/v1 \ + --model my-model \ + --api-key sk-... # optional — omit for a keyless gateway +``` + +- `--provider openai-compatible` opts the credential into the custom-endpoint path; `--base-url` is then **required** and is valid only here. +- `--model` is **required** — every self-managed credential names the model it serves. +- `--api-key` is **optional** for a custom endpoint (a keyless gateway dials with no key); a named provider still requires one. +- The typed flags and `--data` are mutually exclusive — a credential is either a model provider or a raw-JSON tool secret, not both. + +The base URL is validated server-side before any run: it must be `https` and resolve to a **public** host. A loopback, private, link-local, or cloud-metadata target — or a `base_url` set on a named (non-`openai-compatible`) provider — is rejected with [`UZ-PROVIDER-005`](/api-reference/error-codes) and never dialed. The runner's egress allowlist is derived from that host, so requests reach only it. + +Activate the credential as the workspace's model with `agentsfleet tenant provider add --credential ` (or the dashboard's **Models → Custom — OpenAI-compatible** option); `agentsfleet tenant provider delete` resets to the platform default. The URL rides inside the stored credential, so activation references it by name — the `PUT /v1/tenants/me/provider` body is identical to a named provider. + + + A self-managed endpoint bills against **your** provider account, not agentsfleet's per-model rates — so its model does not need to appear in the platform catalogue. + + ## Listing credentials `agentsfleet credential list` returns the names and creation timestamps of every credential in the current workspace. **Values are never returned.** This is by design — once stored, a secret is only ever used by the firewall.