Previously, only base64 data: URLs were handled in the OpenAI-to-Claude
and OpenAI-to-Gemini request translators. HTTP/HTTPS image URLs were
silently dropped, causing vision-capable models to respond with
"I don't see any image."
Add stream_options: { include_usage: true } to iFlow streaming requests
to get token usage data in the final streaming chunk. This fixes token
counts showing as 0 for iFlow streaming requests.
The option is only injected when streaming is enabled, body.messages exists
(OpenAI format), and the client hasn't already set stream_options.
Note: Applied only to the iFlow executor rather than BaseExecutor to avoid
affecting all providers globally; this gives more control and allows
testing with iFlow first.
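A minimal sketch of the injection guard, assuming the executor mutates the parsed OpenAI body; the function name is illustrative:

```js
// Inject stream_options only for OpenAI-format streaming requests where the
// client hasn't set it, so the final SSE chunk carries token usage.
function injectStreamUsage(body) {
  if (body.stream === true && Array.isArray(body.messages) && !body.stream_options) {
    body.stream_options = { include_usage: true };
  }
  return body;
}
```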
Fixes #74
Co-authored-by: Ibrahim Ryan <ryan@nuevanext.com>
Made-with: Cursor
- Add comboRotationState Map to track rotation per combo
- Add getRotatedModels() to rotate model order based on strategy
- Pass comboName and comboStrategy to handleComboChat()
- Add comboStrategy setting (default: fallback)
- Add UI toggle for Combo Round Robin in profile settings
When enabled, each request to a combo starts with a different provider
instead of always starting with the first one, distributing load evenly.
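A minimal sketch of the rotation; the "round-robin" strategy value is an assumption, while getRotatedModels and comboRotationState match the names above:

```js
// Per-combo rotation state: combo name -> next starting offset.
const comboRotationState = new Map();

// Rotate the model order so successive requests start at different
// providers; the fallback strategy keeps the configured order.
function getRotatedModels(comboName, models, comboStrategy) {
  if (comboStrategy !== "round-robin" || models.length < 2) return models;
  const offset = comboRotationState.get(comboName) ?? 0;
  comboRotationState.set(comboName, (offset + 1) % models.length);
  return [...models.slice(offset), ...models.slice(0, offset)];
}
```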
Co-authored-by: Antigravity Agent <antigravity@example.com>
Some upstream providers (e.g. Antigravity) return non-standard finish_reason
values like 'other' instead of the OpenAI-standard 'tool_calls' when the
model invokes tools. This causes downstream consumers (e.g. OpenClaw) to
fail to execute tool calls, breaking agentic sub-agent workflows.
Changes:
- nonStreamingHandler: post-translation guard that normalizes finish_reason
to 'tool_calls' when message.tool_calls is present (sketch below)
- sseToJsonHandler: accumulate tool_calls from streaming deltas in
parseSSEToOpenAIResponse; extract function_call items from Responses API
output in handleForcedSSEToJson
- openai-responses translator: use toolCallIndex to choose between
'tool_calls' and 'stop' in flush and response.completed events
Tested: 7 scenarios (non-stream text, single/multiple tool calls, stream
text/tool calls, multi-turn tool conversation, tools present but unused)
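A minimal sketch of the nonStreamingHandler guard from the first bullet; the response shape is the standard OpenAI chat completion format:

```js
// Post-translation guard: if the message carries tool_calls but the upstream
// reported a non-standard finish_reason (e.g. 'other'), normalize it.
function normalizeFinishReason(response) {
  for (const choice of response.choices ?? []) {
    const hasToolCalls =
      Array.isArray(choice.message?.tool_calls) &&
      choice.message.tool_calls.length > 0;
    if (hasToolCalls && choice.finish_reason !== "tool_calls") {
      choice.finish_reason = "tool_calls";
    }
  }
  return response;
}
```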
Kiro returns HTTP 400 with 'Improperly formed request (reset after Xs)'
when a model is not available on that account's subscription tier.
Previously this fell through to COOLDOWN_MS.transient (30s), causing
rapid retries across all accounts before failing; every account got
locked simultaneously with no actual fallback.
Treating this as paymentRequired (2min cooldown) ensures:
1. The model is locked on that account for 2min (proper cooldown)
2. The next available account is tried immediately
3. If all accounts hit the same 400, 9Router falls through to the
next provider in the combo
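A sketch of the classification, with cooldown values matching the description above; the function name and error-matching regex are illustrative:

```js
const COOLDOWN_MS = { transient: 30_000, paymentRequired: 120_000 };

// The subscription-tier 400 carries the distinctive
// 'Improperly formed request (reset after Xs)' text.
function pickCooldown(status, errorText) {
  if (status === 400 && /Improperly formed request \(reset after \d+s\)/.test(errorText)) {
    return COOLDOWN_MS.paymentRequired; // lock the model on this account for 2min
  }
  return COOLDOWN_MS.transient; // 30s default for other transient failures
}
```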
Fixes #384
Root cause: Codex/OpenAI Responses streams multiple alternating reasoning and
message output items. The first message block often has empty output_text; the
visible answer lives in a later message. Previous code used output.find() which
always picked the first (empty) message block.
Fix: walk the message items from the end and use the last message whose
extracted text is non-empty; fall back to the final message if all are empty.
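A sketch of the selection logic; extractText() stands in for the real text-extraction helper:

```js
// Prefer the last message item whose extracted text is non-empty; if every
// message block is empty, fall back to the final message item.
function pickVisibleMessage(output) {
  const messages = output.filter((item) => item.type === "message");
  if (messages.length === 0) return null;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (extractText(messages[i]).trim() !== "") return messages[i];
  }
  return messages[messages.length - 1];
}

// Stand-in: join the output_text parts of a Responses message item.
function extractText(message) {
  return (message.content ?? [])
    .filter((part) => part.type === "output_text")
    .map((part) => part.text)
    .join("");
}
```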
Note: Removed debug logging code from original PR #383 to keep implementation clean.
Co-authored-by: lokinh <locnh@uniultra.xyz>
Made-with: Cursor
- fixes #335: on transient 503/502/504, wait for a short cooldown (up to
5s) before falling back to the next combo model, giving the provider a
chance to recover rather than immediately skipping it (sketch below)
- fixes #334: when all combo models have no active credentials, return
503 (Service Unavailable) instead of 406 (Not Acceptable), which is
more accurate and retriable by clients
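A minimal sketch of the #335 behavior; names and the exact wait shape are illustrative:

```js
const TRANSIENT_WAIT_CAP_MS = 5_000;

// On a transient upstream failure, sleep for the (capped) cooldown before
// trying the next combo model, giving the provider a chance to recover.
async function waitBeforeFallback(status, cooldownMs) {
  if ([502, 503, 504].includes(status)) {
    await new Promise((resolve) =>
      setTimeout(resolve, Math.min(cooldownMs, TRANSIENT_WAIT_CAP_MS))
    );
  }
}
```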
Gemini API requires enum properties to have an explicit type:"string"
declaration. Without it, tool calls with enum parameters return 400
Bad Request. Fixes #359.
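A sketch of the schema normalization this implies; the recursive walk is illustrative:

```js
// Gemini rejects enum schema properties that omit an explicit type, so
// default any enum without one to "string" while walking the tool schema.
function fixEnumTypes(schema) {
  if (schema == null || typeof schema !== "object") return schema;
  if (Array.isArray(schema.enum) && !schema.type) schema.type = "string";
  for (const value of Object.values(schema)) {
    if (value && typeof value === "object") fixEnumTypes(value);
  }
  return schema;
}
```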
Add MiniMax-M2.7 to provider models and pricing config alongside
existing M2.5. M2.7 is the latest reasoning model with 204K context.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Simplify ANTIGRAVITY_HEADERS to dynamic User-Agent only
- Use IDE_TYPE, PLUGIN_TYPE enums and getPlatformEnum() in metadata
- Update antigravity baseUrl to sandbox endpoint
- Bump User-Agent version from 1.104.0 to 1.107.0
- Remove redundant header spread in AntigravityExecutor
Made-with: Cursor
Co-authored-by: Quan <quanle96@outlook.com>
PR: https://github.com/decolua/9router/pull/298
Thanks to @kwanLeeFrmVi for the original implementation. Here is a summary
of changes made during review integration:
- Replaced google-auth-library with jose (already a project dependency)
for SA JSON -> OAuth2 Bearer token minting (RS256 JWT assertion flow;
sketch after this list)
- Moved auth logic (parseSaJson, refreshVertexToken, token cache) from
executor into open-sse/services/tokenRefresh.js to match project pattern
- Fixed executor to use proxyAwareFetch instead of raw fetch (proxy support)
- Simplified buildUrl: use global aiplatform.googleapis.com endpoint for
both vertex (Gemini) and vertex-partner; removed region/modelFamily fields
- Added auto-detection of GCP project_id from raw API key via probe request
(vertex-partner only, cached per key)
- Added vertex/vertex-partner cases to /api/providers/validate/route.js
- Updated model lists based on live testing:
- vertex: gemini-3.1-pro-preview, gemini-3.1-flash-lite-preview,
gemini-3-flash-preview, gemini-2.5-flash (removed gemini-2.5-pro: 404)
- vertex-partner: deepseek-v3.2, qwen3-next-80b (instruct+thinking),
glm-5 (removed Mistral/Llama: not enabled in test project)
- gemini provider: added gemini-3.1-pro-preview, gemini-3.1-flash-lite-preview
- Removed bun.lock (project uses npm/package-lock.json)
- Removed region and modelFamily UI fields (global endpoint, auto-detect)
- Kiro token auto-refresh on AccessDeniedException (from commit 2)
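A condensed sketch of the jose-based minting flow referenced in the first bullet; refreshVertexToken matches the name above, and token caching and error handling are omitted for brevity:

```js
import { importPKCS8, SignJWT } from "jose";

// Mint an OAuth2 Bearer token from a service-account JSON string using the
// RS256 JWT assertion flow (no google-auth-library dependency).
async function refreshVertexToken(saJson) {
  const sa = JSON.parse(saJson);
  const key = await importPKCS8(sa.private_key, "RS256");
  const assertion = await new SignJWT({
    scope: "https://www.googleapis.com/auth/cloud-platform",
  })
    .setProtectedHeader({ alg: "RS256" })
    .setIssuer(sa.client_email)
    .setAudience("https://oauth2.googleapis.com/token")
    .setIssuedAt()
    .setExpirationTime("1h")
    .sign(key);

  // Exchange the signed assertion for a short-lived access token.
  const res = await fetch("https://oauth2.googleapis.com/token", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      grant_type: "urn:ietf:params:oauth:grant-type:jwt-bearer",
      assertion,
    }),
  });
  const { access_token } = await res.json();
  return access_token;
}
```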
Made-with: Cursor
- Guard data: [DONE] in github.js TransformStream with stream === true
(sketch below)
- Inject response_format as system prompt for Claude models via GitHub executor
Note: stream.js guards were skipped; createSSEStream is only called on true
streaming paths.
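A minimal sketch of the [DONE] guard, assuming the transform emits SSE text; makeTransform is an illustrative name:

```js
// Only emit the SSE terminator when the request was actually streaming, so
// the sentinel never leaks into a non-stream JSON body.
const makeTransform = (stream) =>
  new TransformStream({
    transform(chunk, controller) {
      controller.enqueue(chunk); // pass data through unchanged
    },
    flush(controller) {
      if (stream === true) controller.enqueue("data: [DONE]\n\n");
    },
  });
```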
Cherry-picked and adapted from PR #286 by @rothnic
https://github.com/decolua/9router/pull/286
Made-with: Cursor
- Respect Accept: application/json header to return non-streaming JSON
instead of SSE, fixing AI SDK generateObject/generateText compatibility
- Strip markdown code block markers (```json...```) from Claude
non-streaming responses to prevent JSON parse errors
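A sketch of the fence stripping; the regex is illustrative:

```js
// Strip a wrapping ```json ... ``` fence from a Claude text reply so a
// downstream JSON.parse succeeds; unfenced text passes through unchanged.
function stripCodeFence(text) {
  const match = text.trim().match(/^```(?:json)?\s*\n([\s\S]*?)\n?```$/);
  return match ? match[1] : text;
}
```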
Cherry-picked and adapted from PR #290 by @rothnic
https://github.com/decolua/9router/pull/290
Made-with: Cursor
Translates OpenAI response_format parameter into Claude-compatible system
prompt instructions, enabling structured JSON output for json_schema and
json_object types.
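A sketch of the translation; the exact instruction wording is illustrative:

```js
// Map OpenAI response_format onto a system-prompt instruction Claude can
// follow, covering both json_object and json_schema types.
function responseFormatToSystemPrompt(responseFormat) {
  if (!responseFormat) return null;
  if (responseFormat.type === "json_object") {
    return "Respond with a single valid JSON object and nothing else.";
  }
  if (responseFormat.type === "json_schema") {
    const schema = JSON.stringify(responseFormat.json_schema?.schema ?? {});
    return (
      "Respond with a single valid JSON object that conforms to this " +
      `JSON Schema, and nothing else: ${schema}`
    );
  }
  return null;
}
```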
Co-authored-by: Nick Roth <nlr06886@gmail.com>
Made-with: Cursor
- Added new provider models: DeepSeek 3.1, DeepSeek 3.2, and Qwen3 Coder Next.
- Implemented UI changes to support round-robin strategy with sticky limits in the provider detail page.
- Improved logging to display connection names instead of IDs for better clarity.
Match native GeminiCLI client fingerprint to avoid upstream rejection.
Also fix base executor to call transformRequest before buildHeaders so
subclasses can store model context for header generation.
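A simplified sketch of the corrected ordering; execute(), send(), and the modelContext field are illustrative names:

```js
// Base executor: transform the request before building headers so a
// subclass can stash model context that header generation depends on.
class BaseExecutor {
  async execute(request) {
    const body = await this.transformRequest(request); // may set this.modelContext
    const headers = this.buildHeaders(request); // can now read that context
    return this.send(body, headers);
  }
  async transformRequest(request) { return request; } // overridden by subclasses
  buildHeaders() { return {}; }
  async send(body, headers) { return { body, headers }; }
}
```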
Made-with: Cursor
- Centralize proxy management with reusable proxy pools
- Per-connection proxy binding with legacy fallback
- Add strictProxy option: fail hard instead of silently falling back to direct (sketch below)
- Resolve alicode-intl conflict: keep alicode-intl support + proxy support
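A minimal sketch of proxy resolution with strictProxy; structure and names are illustrative:

```js
// Resolve a proxy for a connection: per-connection binding first, then the
// legacy global pool. With strictProxy, a missing proxy is a hard error
// rather than a silent fall back to a direct connection.
function resolveProxy(connection, proxyPools, { strictProxy = false } = {}) {
  const proxy =
    proxyPools.get(connection.proxyPoolId) ?? proxyPools.get("legacy") ?? null;
  if (!proxy && strictProxy) {
    throw new Error(`No proxy available for connection ${connection.id}`);
  }
  return proxy; // null => direct connection (only when strictProxy is off)
}
```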
Made-with: Cursor