73 Commits

Author SHA1 Message Date
decolua
b72a443bd3 feat: add CommandCode provider support 2026-05-07 23:01:33 +07:00
decolua
d4bc42e1f5 feat: add STT support, Gemini TTS, and expand usage tracking
- Speech-to-Text: full pipeline with sttCore handler, /v1/audio/transcriptions
  endpoint, sttConfig for OpenAI, Gemini, Groq, Deepgram, AssemblyAI,
  HuggingFace, NVIDIA Parakeet; new 9router-stt skill
- Gemini TTS: add gemini provider with 30 prebuilt voices and TTS_PROVIDER_CONFIG
- Usage: implement GLM (intl/cn) and MiniMax (intl/cn) quota fetchers; refactor
  Gemini CLI usage to use retrieveUserQuota with per-model buckets
- Disabled models: lowdb-backed disabledModelsDb + /api/models/disabled route
- Header search: reusable Zustand store (headerSearchStore) wired into Header
- CLI tools: add Claude Cowork tool card and cowork-settings API
- Providers: introduce mediaPriority sorting in getProvidersByKind, add
  Kimi K2.6, reorder hermes, drop qwen STT kind
- UI: expand media-providers/[kind]/[id] page (+314), enhance OAuthModal,
  ModelSelectModal, ProviderTopology, ProxyPools, ProviderLimits
- Assets: refresh provider PNGs (alicode, byteplus, cloudflare-ai, nvidia,
  ollama, vertex, volcengine-ark) and add aws-polly, fal-ai, jina-ai, recraft,
  runwayml, stability-ai, topaz, black-forest-labs
2026-05-05 10:32:59 +07:00
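A minimal sketch of what a call to the new /v1/audio/transcriptions endpoint might look like, assuming the usual OpenAI-compatible multipart form; the base URL, API key, and model id below are placeholders, not values from this commit.

```js
// Hypothetical client call; base URL, key, and model are assumptions.
import fs from "node:fs";

const form = new FormData();
form.append("file", new Blob([fs.readFileSync("sample.wav")]), "sample.wav");
form.append("model", "whisper-1"); // assumed model id

const res = await fetch("http://localhost:3000/v1/audio/transcriptions", {
  method: "POST",
  headers: { Authorization: "Bearer sk-..." },
  body: form, // fetch sets the multipart boundary automatically
});
console.log((await res.json()).text);
```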
decolua
9c6be62a54 Feat : Skills 2026-05-04 11:29:02 +07:00
decolua
936d65ae1c Enhance chat handling and introduce Caveman feature
- Refactored handleChatCore to include Caveman functionality, allowing for terse-style system prompts to reduce output token usage.
- Updated APIPageClient to manage Caveman settings, including enabling/disabling and selecting compression levels.
- Adjusted AntigravityExecutor to consolidate function declarations for compatibility with Gemini.
- Removed unnecessary console logs during translator initialization across multiple routes.
2026-04-30 18:00:38 +07:00
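The Caveman feature is only described at a high level here; the sketch below illustrates the general idea of prepending a terse-style system prompt keyed by compression level. All names and prompt text are hypothetical, not the project's actual implementation.

```js
// Hypothetical sketch: prepend a terse-style system prompt so the model
// answers with fewer output tokens. Prompt text is illustrative only.
const CAVEMAN_PROMPTS = {
  light: "Be concise. Skip filler and restatement.",
  heavy: "Answer in terse fragments. No pleasantries, no recap, minimal words.",
};

function applyCaveman(body, level = "light") {
  if (!CAVEMAN_PROMPTS[level]) return body; // unknown level: leave untouched
  return {
    ...body,
    messages: [{ role: "system", content: CAVEMAN_PROMPTS[level] }, ...body.messages],
  };
}
```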
decolua
512e3de371 Update version to 0.4.9, enhance README with Trendshift badge, and add new embedding models to providerModels.js. Refactor TTS handling to support additional providers and improve API key validation for media providers. 2026-04-29 11:34:39 +07:00
decolua
8f81363675 Enhance token refresh functionality across multiple executors
- Updated refreshCredentials methods in various executors (Antigravity, Base, Default, Github, Kiro) to accept optional proxyOptions for improved proxy handling.
- Modified token refresh logic to utilize proxy-aware fetch for better network management.
- Enhanced usage retrieval functions to support proxy options, ensuring seamless integration with proxy configurations.
- Updated ModelSelectModal and ProviderInfoCard components to incorporate kind filtering for improved user experience in model selection.
- Added validation for API keys in the provider validation route, including support for webSearch/webFetch providers.
2026-04-28 17:28:57 +07:00
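A sketch of the proxy-aware refresh pattern this commit describes, assuming undici's ProxyAgent is available; the proxyOptions shape and token endpoint are illustrative, not the executors' actual code.

```js
// Sketch of a proxy-aware token refresh. The refresh URL and the
// proxyOptions shape are assumptions.
import { fetch, ProxyAgent } from "undici";

async function refreshCredentials(connection, proxyOptions = null) {
  const dispatcher = proxyOptions?.url ? new ProxyAgent(proxyOptions.url) : undefined;
  const res = await fetch("https://example.com/oauth/token", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      grant_type: "refresh_token",
      refresh_token: connection.refreshToken,
    }),
    dispatcher, // routes the refresh call through the configured proxy
  });
  if (!res.ok) throw new Error(`refresh failed: ${res.status}`);
  return res.json();
}
```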
lukmanfauzie
222e22fa53 Fix GitHub Copilot agent mode with Antigravity
Co-authored-by: Copilot <copilot@github.com>
2026-04-26 17:47:13 +08:00
decolua
83418e8a9d Add codex to image providers 2026-04-25 17:01:40 +07:00
decolua
0b8bed5793 Enhance image and embedding provider support
- Added new image models for GPT 5.2, 5.3, and 5.4, including capabilities for text-to-image and editing.
- Updated embedding handling to include optional dimensions in requests.
- Introduced support for custom embedding providers, allowing dynamic fetching and validation of custom nodes.
- Improved image generation handling with Codex integration, including progress tracking and error handling.
- Enhanced UI components to support adding custom embeddings and displaying their status.
2026-04-25 16:22:30 +07:00
decolua
cca615eaff - Cap maximum cooldown for rate limit handling in account unavailability and single-model chat flows
- Dynamic custom model fetching for model selection
2026-04-24 16:14:18 +07:00
decolua
030fb34f88 - Updated markAccountUnavailable function to accept resetsAtMs for precise cooldown management.
- Added email backfill functionality for Codex OAuth connections to improve account information accuracy.
2026-04-24 11:36:16 +07:00
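Taken together, the two commits above suggest a cooldown calculation along these lines: prefer the provider-supplied resetsAtMs, but clamp to a hard cap. The cap and default values in this sketch are assumptions.

```js
// Illustrative cooldown calculation; constants are assumed, not from the commit.
const MAX_COOLDOWN_MS = 5 * 60 * 1000; // assumed 5-minute cap
const DEFAULT_COOLDOWN_MS = 30 * 1000;

function computeCooldownMs(resetsAtMs) {
  const wait = resetsAtMs ? resetsAtMs - Date.now() : DEFAULT_COOLDOWN_MS;
  return Math.min(Math.max(wait, 0), MAX_COOLDOWN_MS); // never negative, never above cap
}
```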
decolua
5abc9e5c74 add GPT 5.5 model 2026-04-24 09:51:05 +07:00
decolua
45731ae639 feat: add OpenCode Go provider and support for custom models
- Introduced OpenCode Go provider with relevant configurations.
- Enhanced model management by allowing users to add and delete custom models.
- Updated UI components to support model selection for image types.
- Adjusted sidebar visibility to include image media kinds.
2026-04-22 14:16:21 +07:00
decolua
b669b6ffc1 Refactor error handling to config-driven approach with centralized error rules
Made-with: Cursor
2026-04-15 11:46:47 +07:00
decolua
6a6e2fcd77 Fix : noAuth support for providers and adjust MITM restart settings. 2026-04-14 10:14:50 +07:00
decolua
4c28a1671d Enhance provider models and chat handling with new thinking configurations 2026-04-13 12:04:57 +07:00
decolua
89eb26dee2 Enhance proxy functionality with Vercel relay support 2026-04-13 10:08:24 +07:00
decolua
b3feb96740 Enhance TTS functionality and security settings
- Integrated Google TTS languages from a separate module for better maintainability.
- Updated local device voice fetching to support both macOS and Windows, improving cross-platform compatibility.
- Enhanced dashboard route protection by adding dynamic settings for login requirements and tunnel access.
- Introduced UI elements for managing security settings related to API key requirements and dashboard access via tunnel.
- Added default TTS response example in the media provider page for better user guidance.
- Updated constants to reflect changes in TTS provider configurations.

This commit improves the overall user experience and security of the TTS features.
2026-04-11 14:56:35 +07:00
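The cross-platform voice fetching could plausibly shell out to standard OS tools, as in this rough sketch (`say -v '?'` on macOS, System.Speech via PowerShell on Windows); the project's actual commands and parsing may differ.

```js
// Rough sketch of local voice discovery per platform; parsing is simplified.
import { execSync } from "node:child_process";

function listLocalVoices() {
  if (process.platform === "darwin") {
    // macOS: `say -v ?` prints one voice per line, name first.
    return execSync("say -v '?'", { encoding: "utf8" })
      .split("\n").filter(Boolean)
      .map((line) => line.split(/\s{2,}/)[0].trim());
  }
  if (process.platform === "win32") {
    // Windows: query installed voices through System.Speech.
    const ps = "Add-Type -AssemblyName System.Speech; " +
      "(New-Object System.Speech.Synthesis.SpeechSynthesizer).GetInstalledVoices()" +
      " | ForEach-Object { $_.VoiceInfo.Name }";
    return execSync(`powershell -NoProfile -Command "${ps}"`, { encoding: "utf8" })
      .split(/\r?\n/).filter(Boolean);
  }
  return []; // other platforms: no local voices
}
```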
Omar Nahhas Sanchez
878cdf302b fix: only strip reasoning_content when content is non-empty (#542)
sseToJsonHandler.js unconditionally deleted reasoning_content from all
non-streaming responses (added for Firecrawl SDK compatibility). This
breaks thinking models (Qwen3.5, Claude extended thinking, etc.) where
the model may use all tokens for reasoning, leaving content empty.

When reasoning_content is stripped in that case, the response appears
completely empty to the client.

Fix: only strip reasoning_content when the response also has non-empty
content, so that reasoning output is preserved when it is the only
useful output.

Co-authored-by: Agent Zero <agent@agent-zero.local>
2026-04-10 10:28:58 +07:00
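A minimal sketch of the guard this fix describes:

```js
// Strip reasoning_content only when the message also carries non-empty
// content, so reasoning-only responses are preserved.
function maybeStripReasoning(message) {
  if (message.content && message.content.length > 0) {
    delete message.reasoning_content;
  }
  return message;
}
```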
decolua
3c96e8d6d1 Feat : TTS 2026-04-10 10:17:53 +07:00
decolua
401772cb9a Fix image-stripping bug 2026-04-07 10:18:59 +07:00
Anurag Saxena
a53ccf1343 fix: strip reasoning_content from non-streaming responses (closes #509) (#517) 2026-04-07 09:46:28 +07:00
decolua
67e0db77da Fix : Updated Anthropic-Beta header. 2026-04-05 07:46:26 +07:00
kwanLeeFrmVi
666aecfc7c feat(translator): lossless passthrough via CLI tool + provider pairing
Add clientDetector utility to identify CLI tools (Claude Code, Gemini CLI,
Antigravity, Codex) from request headers. When the CLI tool and provider
are a native pair, skip all translation — only swap model and Bearer token.

Made-with: Cursor
2026-04-04 23:48:58 +07:00
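An illustrative sketch of the detect-then-passthrough idea; the headers inspected and the pairing table below are assumptions, not clientDetector's actual logic.

```js
// Hypothetical client detection from request headers; the real clientDetector
// may inspect different headers. The native-pair table is an assumption.
const NATIVE_PAIRS = { "claude-code": "claude", "gemini-cli": "gemini", codex: "openai" };

function detectClient(headers) {
  const ua = (headers.get("user-agent") || "").toLowerCase();
  if (ua.includes("claude-code")) return "claude-code";
  if (ua.includes("gemini-cli")) return "gemini-cli";
  if (ua.includes("codex")) return "codex";
  return null;
}

function isLosslessPassthrough(headers, provider) {
  // Native pair: skip translation downstream, only swap model + Bearer token.
  const client = detectClient(headers);
  return client !== null && NATIVE_PAIRS[client] === provider;
}
```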
decolua
333e704b2a MODEL_CAPS 2026-04-04 23:24:24 +07:00
Anurag Saxena
e3a7733a08 fix: strip functionCall/functionResponse id and synthetic thoughtSignature for Vertex AI (closes #388) (#414) 2026-03-27 10:46:47 +07:00
Liam
01e4a28f0a fix: normalize finish_reason to 'tool_calls' when tool calls are present (#379)
Some upstream providers (e.g. Antigravity) return non-standard finish_reason
values like 'other' instead of the OpenAI-standard 'tool_calls' when the
model invokes tools. This causes downstream consumers (e.g. OpenClaw) to
fail to execute tool calls, breaking agentic sub-agent workflows.

Changes:
- nonStreamingHandler: post-translation guard that normalizes finish_reason
  to 'tool_calls' when message.tool_calls is present
- sseToJsonHandler: accumulate tool_calls from streaming deltas in
  parseSSEToOpenAIResponse; extract function_call items from Responses API
  output in handleForcedSSEToJson
- openai-responses translator: use toolCallIndex to choose between
  'tool_calls' and 'stop' in flush and response.completed events

Tested: 7 scenarios (non-stream text, single/multiple tool calls, stream
text/tool calls, multi-turn tool conversation, tools present but unused)
2026-03-23 09:35:25 +07:00
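A minimal sketch of the post-translation guard described in the first bullet:

```js
// If the translated message carries tool_calls, force the OpenAI-standard
// finish_reason regardless of what the upstream provider reported.
function normalizeFinishReason(choice) {
  if (choice?.message?.tool_calls?.length) {
    choice.finish_reason = "tool_calls";
  }
  return choice;
}
```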
decolua
3d4dbdc0e7 fix(chat): pick last non-empty message for Codex Responses SSE
Root cause: Codex/OpenAI Responses streams multiple alternating reasoning and
message output items. The first message block often has empty output_text; the
visible answer lives in a later message. Previous code used output.find() which
always picked the first (empty) message block.

Fix: walk message items from end and use the last message whose extracted text
is non-empty; fall back to final message if all are empty.

Note: Removed debug logging code from original PR #383 to keep implementation clean.

Co-authored-by: lokinh <locnh@uniultra.xyz>
Made-with: Cursor
2026-03-23 09:29:31 +07:00
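A minimal sketch of the described fix, walking message items from the end; extractText stands in for whatever text extraction the handler already uses.

```js
// Pick the last message output item whose extracted text is non-empty;
// fall back to the final message if every one is empty.
function pickVisibleMessage(output, extractText) {
  const messages = output.filter((item) => item.type === "message");
  for (let i = messages.length - 1; i >= 0; i--) {
    if (extractText(messages[i]).trim()) return messages[i];
  }
  return messages[messages.length - 1] ?? null;
}
```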
decolua
adae2605bf Feat : Auto restart after crash 2026-03-14 09:37:29 +07:00
Nick Roth
d12b14f411 feat: AI SDK compatibility - Accept header & JSON markdown stripping
- Respect Accept: application/json header to return non-streaming JSON
  instead of SSE, fixing AI SDK generateObject/generateText compatibility
- Strip markdown code block markers (```json...```) from Claude
  non-streaming responses to prevent JSON parse errors

Cherry-picked and adapted from PR #290 by @rothnic
https://github.com/decolua/9router/pull/290

Made-with: Cursor
2026-03-13 10:00:47 +07:00
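Illustrative sketches of the two behaviors; the regex and header check are assumptions, not the PR's exact code.

```js
// AI SDK sends Accept: application/json for generateObject/generateText,
// so honor it by returning plain JSON instead of SSE.
function wantsJson(req) {
  return (req.headers.get("accept") || "").includes("application/json");
}

// Remove a leading ```json (or bare ```) fence and a trailing ``` fence
// so strict JSON parsers do not choke on Claude's markdown wrapping.
function stripJsonFences(text) {
  return text.replace(/^\s*```(?:json)?\s*\n?/, "").replace(/\n?```\s*$/, "");
}
```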
decolua
373b10ebb5 feat(chat): Enhance bypass handling and introduce CC filter naming feature
Fix : Ollama provider response
2026-03-13 09:41:40 +07:00
decolua
b0c6b61398 Refactor config 2026-03-12 16:20:46 +07:00
decolua
83d94daa82 feat(ollama): Enhance Ollama support by adding new models, updating API format handling, and integrating translation functionality. 2026-03-12 15:24:10 +07:00
decolua
880f4eca91 feat(proxy): add proxy pool and per-connection binding + strictProxy support
- Centralize proxy management with reusable proxy pools
- Per-connection proxy binding with legacy fallback
- Add strictProxy option: fail hard instead of silently falling back to direct
- Resolve alicode-intl conflict: keep alicode-intl support + proxy support

Made-with: Cursor
2026-03-09 15:46:06 +07:00
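A sketch of how strictProxy resolution might look; connection.proxyUrl, pool.next(), and where the option lives are all assumptions.

```js
// Resolve a proxy: per-connection binding first, then the shared pool,
// then either direct connection or a hard failure under strictProxy.
function resolveProxy(connection, pool, { strictProxy = false } = {}) {
  const proxy = connection.proxyUrl ?? pool?.next() ?? null;
  if (!proxy && strictProxy) {
    // strictProxy: never silently fall back to a direct connection.
    throw new Error("strictProxy is set but no proxy is available");
  }
  return proxy; // null means direct (allowed only when not strict)
}
```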
decolua
4903a9b2cb Feat : console log 2026-03-02 09:31:16 +07:00
decolua
5954b8f4eb - Refactor chatCore.js to streamline imports and remove unused functions.
- Fix streaming /v1/responses
2026-02-27 11:15:12 +07:00
decolua
d21f7aaadc Fix Tunnel bug 2026-02-22 21:44:11 +07:00
decolua
0baa299722 feat :
- Added tunnel
- Removed cloud feature
2026-02-21 16:42:46 +07:00
decolua
adf57aa0c9 Fixed Codex 2026-02-21 14:36:06 +07:00
Thiên Toán
806bd4ae14 feat: add API endpoint dimension to usage statistics dashboard (#152)
- Tracks endpoints like /v1/chat/completions, /v1/messages, /v1/responses
- New sortable/groupable table in usage dashboard with expandable groups
- Enhanced usage database aggregation by endpoint + model + provider
- Added endpoint tracking to all saveRequestUsage/saveRequestDetail calls
- Maintains backward compatibility with existing data structure
2026-02-20 15:03:18 +07:00
Hồ Xuân Dũng
a57a8ce206 feat: add Gemini embeddings support + Letta compatibility fixes
Cherry-picked from decolua/9router#148 (author: xuandung38 / Hồ Xuân Dũng <me@hxd.vn>)

- Add Google AI (Gemini) embeddings support for /v1/embeddings endpoint
- Add Gemini embedding models: gemini-embedding-001, text-embedding-005, text-embedding-004
- Inject missing object/created fields for Letta and strict OpenAI clients
- Strip Azure-specific fields (prompt_filter_results, content_filter_results) from responses
- Fix Dockerfile: copy open-sse directory into Docker runner stage

Skipped: whitelist message field stripping (commit 3/7/8) — too aggressive for all providers
Skipped: default stream=false change (commit 9) — behavior change needs further review
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-20 15:01:10 +07:00
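A minimal sketch of the compatibility shim described in the middle bullets; the field names follow the commit message, while the function itself is hypothetical.

```js
// Inject the object/created fields strict OpenAI clients (e.g. Letta)
// expect, and drop Azure-specific content-filter fields.
function normalizeForStrictClients(response) {
  response.object ??= "chat.completion";
  response.created ??= Math.floor(Date.now() / 1000);
  delete response.prompt_filter_results;
  for (const choice of response.choices ?? []) delete choice.content_filter_results;
  return response;
}
```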
Thiên Toán
9fbd6e619d fix: correct token extraction for Claude non-streaming responses (#131)
- Add response logging for non-streaming requests (5_res_provider.json, 7_res_client.json)
- Fix extractUsageFromResponse() to check Claude format before OpenAI format
- Prevents format misidentification that caused tokens to show as 0
- Claude uses input_tokens/output_tokens vs OpenAI's prompt_tokens/completion_tokens

Fixes dashboard Details tab showing 0 tokens for Claude requests
2026-02-20 14:24:21 +07:00
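A sketch of the corrected ordering: Claude's usage keys must be checked before OpenAI's, since both formats share a top-level usage object and a wrong first match reports 0 tokens.

```js
// Check Claude's input_tokens/output_tokens before falling back to
// OpenAI's prompt_tokens/completion_tokens.
function extractUsageFromResponse(body) {
  const u = body?.usage;
  if (!u) return { inputTokens: 0, outputTokens: 0 };
  if (u.input_tokens !== undefined) {
    return { inputTokens: u.input_tokens, outputTokens: u.output_tokens ?? 0 }; // Claude
  }
  return { inputTokens: u.prompt_tokens ?? 0, outputTokens: u.completion_tokens ?? 0 }; // OpenAI
}
```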
HXD.VN
e1b836168a feat: add /v1/embeddings endpoint (OpenAI-compatible) (#146)
* feat: implement /v1/embeddings endpoint (#117)

Add OpenAI-compatible POST /v1/embeddings endpoint that routes through
the existing provider credential + fallback infrastructure.

Changes:
- open-sse/handlers/embeddingsCore.js: core handler (handleEmbeddingsCore)
  * Validates input (string or array), encoding_format
  * Builds provider-specific URL and headers for openai, openrouter,
    and openai-compatible providers
  * Handles 401/403 token refresh via executor.refreshCredentials
  * Returns normalized OpenAI-format response { object: 'list', data, model, usage }
- cloud/src/handlers/embeddings.js: cloud Worker handler (handleEmbeddings)
  * Auth + machineId resolution identical to handleChat
  * Provider credential fallback loop with rate-limit tracking
- cloud/src/index.js: wire new routes
  * POST /v1/embeddings  (new format — machineId from API key)
  * POST /{machineId}/v1/embeddings  (old format — machineId from URL)

* test: add unit tests for /v1/embeddings endpoint

- Setup vitest as test framework (tests/ directory)
- embeddingsCore.test.js (36 tests):
  - buildEmbeddingsBody: single string, array, encoding_format, default float
  - buildEmbeddingsUrl: openai, openrouter, openai-compatible-*, unsupported
  - buildEmbeddingsHeaders: per-provider headers, accessToken fallback
  - handleEmbeddingsCore: input validation, success path, provider errors,
    network errors, invalid JSON, token refresh 401 handling
- embeddings.cloud.test.js (23 tests):
  - CORS OPTIONS preflight
  - Auth: missing/invalid/old-format/wrong key → 401/400
  - Body validation: bad JSON, missing model, missing input, bad model → 400
  - Happy path: single string, array, delegation, CORS header, machineId override
  - Rate limiting: all-rate-limited → 429 + Retry-After, no credentials → 400
  - Error propagation: non-fallback errors, 429 exhausts accounts

Total: 59/59 tests passing
Framework: vitest v4.0.18, Node v22.22.0

* feat: add Next.js API route for /v1/embeddings endpoint

Wire the embeddings handler into Next.js App Router.

- src/app/api/v1/embeddings/route.js: Next.js API route (POST + OPTIONS)
- src/sse/handlers/embeddings.js: SSE-layer handler mirroring chat.js pattern

Uses handleEmbeddingsCore from open-sse/handlers/embeddingsCore.js with
the same auth, credential fallback, and token refresh logic as the chat
handler. Supports REQUIRE_API_KEY env var, provider fallback loop, and
consistent logging.
2026-02-18 13:24:02 +07:00
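A hypothetical request against the new endpoint; the base URL, key, and model are placeholders (text-embedding-004 is one of the Gemini models listed above). The optional dimensions field arrives later, in 0b8bed5793.

```js
// POST /v1/embeddings with an array input; input may also be a single string.
const res = await fetch("http://localhost:3000/v1/embeddings", {
  method: "POST",
  headers: {
    Authorization: "Bearer sk-...",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "text-embedding-004",
    input: ["first chunk", "second chunk"],
    encoding_format: "float",
  }),
});
const { data } = await res.json(); // { object: 'list', data, model, usage }
console.log(data[0].embedding.length);
```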
decolua
e2db638982 feat: enhance request handling and error management in chatCore and streamToJsonConverter
- Added detailed request logging and latency tracking in handleChatCore.
- Improved error handling for SSE to JSON conversion and JSON parsing in streamToJsonConverter.
- Introduced a safe JSON parsing utility to handle potential parsing errors gracefully in requestDetailsDb.

Co-authored-by: zx <me@char.moe>
2026-02-15 12:02:53 +07:00
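A minimal safe-parse utility in the spirit of the one this commit describes:

```js
// Return a fallback instead of throwing when JSON is malformed, so a bad
// record cannot crash the request pipeline.
function safeJsonParse(text, fallback = null) {
  try {
    return JSON.parse(text);
  } catch {
    return fallback;
  }
}
```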
zx07
3d29b86d44 feat: enhance disconnect handling and request tracking in chatCore.js (#126)
Co-authored-by: zx <me@char.moe>
2026-02-15 11:51:37 +07:00
apeltekci
ac7cedd27e feat(responses): respect client streaming preference + string input support (#121)
- Remove forced stream=true from responsesHandler
- Add stream-to-JSON converter for non-streaming clients (Codex)
- Accept string input in Responses API (normalize to array)
- Codex SSE header fallback for missing Content-Type
- Refactor: extract shared normalizeResponsesInput()

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-15 11:47:55 +07:00
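A sketch of the shared helper named in the last bullet; the exact message shape is assumed from the Responses API convention.

```js
// Accept a bare string and wrap it in the Responses API message-array shape;
// pass arrays through unchanged.
function normalizeResponsesInput(input) {
  if (typeof input === "string") {
    return [{ role: "user", content: [{ type: "input_text", text: input }] }];
  }
  return input;
}
```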
Blade096
1ae4e311b7 feat: add GLM Coding (China) provider and Usage by API Keys statistics
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-11 15:44:08 +07:00
Blade
85b7a0b136 Feature/ai observability dashboard (#79)
* feat: add AI request details feature with latency tracking

Add comprehensive request history and debugging capability to the Usage dashboard:

**Storage Layer** (usageDb.js):
- Add saveRequestDetail() for storing full request/response details
- Implement FIFO queue with 1000-record limit in request-details.json
- Auto-sanitize sensitive headers (authorization, api-key, cookie, token)
- Add getRequestDetails() with pagination and filtering support
- Add getRequestDetailById() for single record lookup

**Pipeline Integration** (chatCore.js):
- Track request start time and calculate total latency
- Record TTFT (Time To First Token) and total latency for all requests
- Capture full request details (messages, model, parameters)
- Save response content for non-streaming, mark streaming responses
- Handle error cases with detailed error information
- Async non-blocking saves to avoid impacting request performance

**API Layer** (/api/usage/request-details):
- GET endpoint with pagination (page, pageSize: 1-100)
- Filter by provider, model, connectionId, status, date range
- Returns { details: [...], pagination: {...} } format

**UI Components**:
- Drawer.js: Right slide-out panel with backdrop blur and ESC close
- Pagination.js: Full pagination with page size selector (10/20/50)
- RequestDetailsTab.js: Complete table view with filters and detail drawer

**Dashboard Integration**:
- Add "Details" tab to Usage page (4th tab after Overview/Logger/Limits)
- Table columns: Timestamp, Model, Provider, Input Tokens, Output Tokens, Latency (TTFT/Total), Action
- Provider filter dropdown (9 providers supported)
- Date range filters (start/end datetime)
- Click "Detail" button to view full request/response JSON in slide-out drawer

**Features**:
- Real-time latency monitoring (TTFT & Total)
- Complete request/response inspection for debugging
- Filterable and searchable request history
- Responsive design with mobile-friendly filters
- Data security with automatic header sanitization
- Performance: async saves don't block request pipeline

**Files Created/Modified**:
- src/lib/usageDb.js (modified)
- open-sse/handlers/chatCore.js (modified)
- src/app/api/usage/request-details/route.js (new)
- src/shared/components/Drawer.js (new)
- src/shared/components/Pagination.js (new)
- src/app/(dashboard)/dashboard/usage/components/RequestDetailsTab.js (new)
- src/app/(dashboard)/dashboard/usage/page.js (modified)

Closes: AI Observability Dashboard feature

* feat: enhance request details with full config and streaming content capture

Improve Request Details feature to capture comprehensive request parameters
and actual streaming response content:

**Request Configuration Enhancement** (chatCore.js):
- Add extractRequestConfig() helper function to capture all request parameters
- Include temperature controls: temperature, top_p, top_k
- Include token limits: max_tokens, max_completion_tokens
- Include thinking/reasoning modes: thinking, reasoning, enable_thinking
- Include OpenAI parameters: presence_penalty, frequency_penalty, seed, stop,
  tools, tool_choice, response_format, n, logprobs, top_logprobs, logit_bias,
  user, parallel_tool_calls, prediction, store, metadata
- Apply to all request types: non-streaming, streaming, and error cases

**Streaming Content Capture** (chatCore.js & stream.js):
- Add onStreamComplete callback mechanism to stream processors
- Accumulate content from all formats: OpenAI, Claude, Gemini
- Track content from delta.content, delta.reasoning_content, delta.text,
  delta.thinking, and Gemini content.parts
- Save initial record with "[Streaming in progress...]" marker
- Update record with actual content when stream completes
- Include usage tokens when available from stream

**Files Modified**:
- open-sse/handlers/chatCore.js - extractRequestConfig() + streaming capture
- open-sse/utils/stream.js - onStreamComplete callback + content accumulation

**Benefits**:
- View complete request configuration in Request Details (thinking mode, etc.)
- See actual streaming response content instead of placeholder
- Better debugging and observability for AI requests

Refs: #request-details-enhancement

* feat: separate thinking/reasoning content from response content

Improve Request Details to display thinking process separately from final response:

**Backend Changes**:
- stream.js: Capture content and thinking separately in streaming mode
  - Add accumulatedThinking variable alongside accumulatedContent
  - Route delta.content to content, delta.reasoning_content to thinking
  - Support OpenAI (reasoning_content), Claude (thinking), Gemini (part.thought)
  - Update onStreamComplete callback to return { content, thinking } object

- chatCore.js: Update response structure to include thinking field
  - Non-streaming: Extract thinking from reasoning_content field
  - Streaming: Receive { content, thinking } from stream callback
  - Error responses: Include thinking: null
  - Initial streaming save: Include thinking: null

**Frontend Changes**:
- RequestDetailsTab.js: Display thinking and content in separate sections
  - Add amber/yellow themed "Thinking Process" section with psychology icon
  - Show "Final Response" label when thinking is present
  - Use distinct visual styling for thinking (amber bg) vs content (gray bg)
  - Only show thinking section when thinking content exists

**Benefits**:
- Users can clearly see model's reasoning process vs final answer
- Better debugging for models with thinking capabilities (Claude, o1, etc.)
- Visual distinction makes it easy to identify thinking vs response

Refs: #thinking-content-separation

* fix: map Claude thinking to reasoning_content field

Fix Claude thinking content to be properly captured as reasoning_content
instead of regular content, enabling separate display in Request Details:

**Changes**:
- claude-to-openai.js: Use reasoning_content field for thinking blocks
  - thinking start: send { reasoning_content: "" } instead of { content: "```\n```" }
  - thinking delta: map to reasoning_content instead of content
  - thinking stop: send { reasoning_content: "" } instead of { content: "```\n```" }

**Why This Matters**:
- Previously Claude thinking was sent as `content` field, mixed with actual response
- Now thinking uses `reasoning_content` field, matching OpenAI's o1 format
- stream.js can now properly route thinking to accumulatedThinking variable
- Request Details UI will show Claude thinking in separate "Thinking Process" section

**Supported Thinking Formats**:
- OpenAI: delta.reasoning_content → thinking
- Claude: delta.thinking → reasoning_content (now fixed)
- Gemini: part.thought === true → thinking

Refs: #claude-thinking-fix

* feat(observability): capture and display full 4-layer request chain

Capture complete request/response chain in AI Request Details:
- Add providerRequest field (translated request sent to provider)
- Add providerResponse field (raw provider response, streaming indicator)
- Update chatCore.js at all 5 saveRequestDetail() call sites
- Reorganize UI into 4 collapsible sections with Material icons
- Preserve backward compatibility for old records
- Add distinct styling for streaming indicator

* fix(observability): resolve React duplicate key warning in request details table

- Use composite key (detail.id + index) to ensure unique keys
- Prevents React warnings when database contains duplicate IDs from old ID generation

* fix(observability): display actual content in streaming request details

Change providerResponse field for streaming requests from placeholder
"[Streaming - raw response not captured]" to actual final content.

This improves debugging experience by showing the real AI response
in the "Provider Response (Raw)" section instead of a confusing
placeholder message.

Files changed:
- open-sse/handlers/chatCore.js: Save contentObj.content to providerResponse
- src/app/.../RequestDetailsTab.js: Remove special handling for placeholder

* refactor(observability): migrate request details to SQLite for improved concurrency

- Replace LowDB JSON storage with better-sqlite3
- Enable WAL mode for true concurrent read/write support
- Add 5 indexes to accelerate queries (timestamp, provider, model, connection_id, status)
- Perform pagination at the database level to reduce memory footprint
- Maintain 1000 record limit with automatic cleanup of old data
- Ensure API compatibility via re-exports, requiring no caller changes

Performance improvements:
- Concurrent Writes: Lock-free WAL mode prevents data contention
- Query Efficiency: Index-based searches replace full dataset loading
- Data Integrity: Atomic operations prevent file corruption

* fix(observability): resolve pagination statistics display issues

- Fix issue where totalItems=0 showed 'Showing 1 to 0 of 0 results'
- Hide pagination controls when totalItems=0 or totalPages<=1
- Standardize API response fields: pagination.total -> pagination.totalItems

Before: Incorrect stats shown for empty data, and pager visible even for single-page results
After: Stats hidden for empty data, pager hidden when navigation is unnecessary

* feat(observability): display friendly provider names in request details

- Add /api/usage/providers endpoint to dynamically fetch provider list with names
- Replace hardcoded provider options with dynamic loading from database
- Display friendly provider names instead of IDs in both table and detail drawer
- Support custom provider nodes (e.g., OpenAI-compatible) with user-defined names
- Add provider name caching to optimize performance

* fix(observability): use INSERT OR REPLACE for request details to handle streaming updates

* fix(observability): resolve zero-token display issue by ensuring streaming usage capture and fixing key mismatch

* fix(observability): separate TTFT and total latency calculation for streaming requests

* feat(observability): implement SQLite write queue and JSON size limits

- Added in-memory buffer and batch writing for SQLite to prevent lock contention
- Implemented  with configurable 1MB limit to prevent DB bloat
- Added dashboard UI for observability performance and data management settings
- Integrated graceful shutdown handlers to prevent data loss

* fix(observability): resolve ReferenceError by declaring dbInstance
2026-02-09 10:30:42 +07:00
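Two pieces of this PR lend themselves to a short sketch, assuming better-sqlite3 as the commit states: enabling WAL mode for concurrent reads/writes, and sanitizing sensitive headers before a record is saved. The database filename and redaction marker are assumptions.

```js
// WAL mode: readers no longer block the single writer.
import Database from "better-sqlite3";

const db = new Database("request-details.db");
db.pragma("journal_mode = WAL");

// Redact authorization/api-key/cookie/token headers before persisting.
const SENSITIVE = ["authorization", "api-key", "cookie", "token"];

function sanitizeHeaders(headers) {
  const out = {};
  for (const [k, v] of Object.entries(headers)) {
    out[k] = SENSITIVE.some((s) => k.toLowerCase().includes(s)) ? "[REDACTED]" : v;
  }
  return out;
}
```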
decolua
388389c972 Revert "feat(request-details): implement observability settings and enhance request detail tracking"
This reverts commit cbabf5547c.
2026-02-09 10:29:38 +07:00
decolua
cbabf5547c feat(request-details): implement observability settings and enhance request detail tracking
- Added new observability settings in the dashboard for max records, batch size, flush interval, and max JSON size.
- Introduced `extractRequestConfig` function to capture full request configurations.
- Enhanced error handling by saving detailed request information on failures.
- Updated usage tracking to include new token metrics.
- Modified streaming functions to support detailed content and reasoning tracking.
2026-02-09 10:20:24 +07:00