feat: add azure-foundry provider for Microsoft Foundry model access #950
guglxni wants to merge 5 commits
Introduce a first-class azure-foundry provider that targets Foundry's OpenAI v1-compatible route with api-key authentication, so users no longer need to hand-wire the generic openai provider for Azure deployments.

Closes MoonshotAI#918

Co-authored-by: Cursor <cursoragent@cursor.com>
🦋 Changeset detected
Latest commit: 1207f07
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f36bd80f8e
    const clientOpts: Record<string, unknown> = {
      apiKey: key,
      baseURL: baseUrl,
Require a Foundry base URL before building the client
When an azure-foundry provider is configured with an API key but no base_url/AZURE_FOUNDRY_BASE_URL, baseUrl remains undefined here, so the OpenAI SDK falls back to its default OpenAI host instead of failing fast. That sends the Foundry api-key header to the wrong endpoint and produces a confusing upstream auth error; this provider should reject missing/blank base URLs before constructing the client.
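The fail-fast behavior requested above could look roughly like the following. This is a minimal sketch, not the PR's actual code: the helper name `resolveFoundryClientOptions`, the `FoundryClientOptions` type, and the error messages are hypothetical, and it assumes the key and base URL have already been read from config or the environment.

```typescript
// Hypothetical shape of the options handed to the OpenAI SDK client.
interface FoundryClientOptions {
  apiKey: string;
  baseURL: string;
}

// Hypothetical helper: reject missing or blank values before any client is built,
// so the api-key header can never be sent to the SDK's default host.
function resolveFoundryClientOptions(
  apiKey: string | undefined,
  baseUrl: string | undefined,
): FoundryClientOptions {
  const key = apiKey?.trim();
  if (!key) {
    throw new Error("azure-foundry: an API key is required");
  }
  const url = baseUrl?.trim();
  if (!url) {
    throw new Error(
      "azure-foundry: base_url (or AZURE_FOUNDRY_BASE_URL) is required",
    );
  }
  return { apiKey: key, baseURL: url };
}
```

Validating before construction keeps the failure local and descriptive, instead of surfacing later as an upstream authentication error from an unrelated host.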
Require base_url before constructing the Foundry client so api-key auth never falls back to the default OpenAI host. Clamp completion budgets against Foundry's shared input+output context window and recover once when a model stalls after tool results without issuing further tool calls.

Addresses Codex review on MoonshotAI#950. Relates to MoonshotAI#918 and MoonshotAI#520.

Co-authored-by: Cursor <cursoragent@cursor.com>
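To illustrate the clamping idea in this commit: when input and output share one context window, the completion budget cannot exceed what the prompt leaves free. The sketch below is an assumption about the general technique, not the PR's implementation. The function name `clampCompletionBudget` is hypothetical, and returning 0 when the prompt already fills the window is one possible policy among several.

```typescript
// Hypothetical helper: clamp a requested completion budget so that
// promptTokens + completion never exceeds a shared context window.
function clampCompletionBudget(
  requestedCompletionTokens: number,
  promptTokens: number,
  maxContextSize: number,
): number {
  // Tokens still available for output once the prompt is accounted for.
  const remaining = Math.max(0, maxContextSize - promptTokens);
  // Never grant more than was requested, and never more than what is left.
  return Math.min(requestedCompletionTokens, remaining);
}
```

For example, with a 128000-token window and a 120000-token prompt, a request for 8192 completion tokens would be reduced to 8000, while a request for 4096 would pass through unchanged.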
Foundry deployments of Kimi-K2.x were using max_tokens, which shares the output budget with reasoning_content and can yield think-only responses. Use max_completion_tokens and thinking enablement like the native Kimi provider, honor explicit thinking-off over history auto-injection, and apply shared-window clamping against the correct completion field.
Microsoft Foundry exposes Kimi through the OpenAI chat-completions schema and rejects the Moonshot-proprietary `thinking` argument. Keep reasoning enabled via `reasoning_effort` and the max_completion_tokens split; only KimiChatProvider sends `thinking` on the native Moonshot API.
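The split described in this commit can be sketched as a small parameter builder. This is illustrative only: `buildReasoningParams`, `CompletionParams`, and the `"foundry" | "moonshot"` target labels are hypothetical names, and the assumption that the native path sends only `thinking` (without `reasoning_effort`) is a simplification not confirmed by the PR.

```typescript
type ReasoningEffort = "low" | "medium" | "high";

// Subset of request fields relevant to reasoning; other fields are omitted.
interface CompletionParams {
  max_completion_tokens: number;
  reasoning_effort?: ReasoningEffort;
  thinking?: { type: "enabled" };
}

// Hypothetical builder: Foundry gets only schema-standard fields,
// while the Moonshot-proprietary `thinking` argument stays on the native path.
function buildReasoningParams(
  target: "foundry" | "moonshot",
  maxCompletionTokens: number,
  effort: ReasoningEffort | "off",
): CompletionParams {
  const params: CompletionParams = {
    max_completion_tokens: maxCompletionTokens,
  };
  if (effort === "off") {
    return params; // explicit thinking-off: send no reasoning fields at all
  }
  if (target === "foundry") {
    params.reasoning_effort = effort;
  } else {
    params.thinking = { type: "enabled" };
  }
  return params;
}
```

Keeping the proprietary field out of the Foundry request avoids the schema rejection described above while still enabling reasoning through `reasoning_effort`.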
Related Issue
Resolve #918
Problem
Kimi Code has no first-class support for Microsoft Foundry model deployments. Users must hand-wire the generic `openai` provider against Foundry's OpenAI v1-compatible route, which is undocumented, fragile around auth (`api-key` vs Bearer), and poorly named for Foundry's multi-model catalog (GPT, DeepSeek, Llama, Mistral, etc.). Real-world usage already exists via this workaround (#520).

Foundry-hosted Kimi reasoning models (e.g. `Kimi-K2.6`) additionally hit think-only responses when wired through the generic OpenAI adapter: `max_tokens` shares the output budget with `reasoning_content`, so the model can finish reasoning without emitting visible text or tool calls.

What changed
- Add the `azure-foundry` provider type across kosong, agent-core, and oauth custom-registry wiring.
- Add `AzureFoundryChatProvider` targeting Foundry's OpenAI v1 route (`https://{resource}.openai.azure.com/openai/v1`) with `api-key` header auth, delegating streaming/tools/reasoning to the existing OpenAI chat-completions adapter.
- Require `base_url` before constructing the client so `api-key` auth never falls back to the default OpenAI host.
- Clamp completion budgets against Foundry's shared input+output context window (`max_context_size`).
- Use `max_completion_tokens` (visible output budget) plus `thinking: { type: 'enabled' }` alongside `reasoning_effort`, instead of `max_tokens`, which conflates reasoning and output.
- Honor explicit `withThinking('off')` over history-based `reasoning_effort` auto-injection.
- Apply `KIMI_MODEL_THINKING_KEEP` to Foundry-hosted Kimi models.
- Document the environment variables (`AZURE_FOUNDRY_API_KEY`, `AZURE_FOUNDRY_BASE_URL`) in English and Chinese provider/config docs.

Out of scope (follow-ups per #918): Entra ID token refresh, legacy deployment URLs with `api-version`, Foundry Agent Service APIs, and `/provider` catalog import (models.dev has no Azure entry).

Checklist
- `gen-changesets` skill, or this PR needs no changeset.
- `gen-docs` skill, or this PR needs no doc update.

Test plan
- `pnpm vitest run packages/kosong/test/azure-foundry.test.ts packages/kosong/test/kimi-reasoning.test.ts packages/kosong/test/shared-context-window.test.ts packages/kosong/test/catalog.test.ts packages/agent-core/test/harness/runtime-provider.test.ts packages/agent-core/test/config/kimi-env-params.test.ts`
- `pnpm --filter @moonshot-ai/kosong typecheck`
- `pnpm --filter @moonshot-ai/agent-core typecheck`
- `pnpm --filter @moonshot-ai/kimi-code-oauth typecheck`