Files
9router/skills/9router-tts/SKILL.md
decolua 5c62e73cc6 - Cowork: ComboFormModal
- BaseUrlSelect: add cloud endpoint option, custom URL local state, always
  default to first option; new cliEndpointMatch helper; CLI tool cards refactor
- API: new /v1/audio/voices and /v1/models/info; /v1/models filters disabled
  models, drop unused timestamp
- initializeApp: guard tunnel/tailscale auto-resume to once-per-process
- geminiHelper: ensureObjectType for schemas with properties but no type
- skills: minor SKILL.md tweaks (chat/embeddings/image/stt/tts/web-*)
2026-05-07 15:45:09 +07:00

3.0 KiB

name, description
name description
9router-tts Text-to-speech via 9Router /v1/audio/speech using OpenAI / ElevenLabs / Deepgram / Edge TTS / Google TTS / Hyperbolic / Inworld voices. Use when the user wants to convert text to speech, generate audio, voiceover, narrate, or read text aloud.

9Router — Text-to-Speech

Requires NINEROUTER_URL (and NINEROUTER_KEY if auth enabled). See https://raw.githubusercontent.com/decolua/9router/refs/heads/master/skills/9router/SKILL.md for setup.

Discover

# 1) List models
curl $NINEROUTER_URL/v1/models/tts | jq '.data[].id'
# 2) Per-model metadata (params, voicesUrl if voice-by-id)
curl "$NINEROUTER_URL/v1/models/info?id=el/eleven_multilingual_v2"
# 3) List voices (elevenlabs, edge-tts, deepgram, inworld, local-device). Optional ?lang=vi
curl "$NINEROUTER_URL/v1/audio/voices?provider=edge-tts&lang=vi" | jq '.data[].model'

model field in /v1/audio/speech = voice ID directly (e.g. edge-tts/vi-VN-HoaiMyNeural, el/<voice_id>, or openai/tts-1 model+default voice).

Endpoint

POST $NINEROUTER_URL/v1/audio/speech

Field Required Notes
model yes voice ID from /v1/models/tts
input yes text to speak

Query ?response_format=mp3 (default, raw bytes) or ?response_format=json ({audio: base64, format}).

Examples

Save MP3:

curl -X POST "$NINEROUTER_URL/v1/audio/speech" \
  -H "Authorization: Bearer $NINEROUTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/tts-1","input":"Hello world"}' \
  --output speech.mp3

JS (save file):

import { writeFile } from "node:fs/promises";
const r = await fetch(`${process.env.NINEROUTER_URL}/v1/audio/speech`, {
  method: "POST",
  headers: { "Authorization": `Bearer ${process.env.NINEROUTER_KEY}`, "Content-Type": "application/json" },
  body: JSON.stringify({ model: "el/eleven_multilingual_v2", input: "Xin chào" }),
});
await writeFile("speech.mp3", Buffer.from(await r.arrayBuffer()));

Response shape

Default → raw audio bytes (Content-Type audio/mp3).

?response_format=json:

{ "audio": "SUQzBAAAA...", "format": "mp3" }

Provider quirks (model format)

Provider model format Notes
openai tts-1/alloy (model/voice) or just voice Default model gpt-4o-mini-tts
elevenlabs <model_id>/<voice_id> or <voice_id> Default model eleven_flash_v2_5; list voices in Dashboard
openrouter openai/gpt-4o-mini-tts/alloy Streamed via chat-completions audio modality
edge-tts voice id e.g. vi-VN-HoaiMyNeural noAuth; default vi-VN-HoaiMyNeural
google-tts language code e.g. en, vi noAuth
local-device OS voice name (say -v ? / SAPI) noAuth; needs ffmpeg
deepgram aura-asteria-en etc Token auth
nvidia, inworld, cartesia, playht model/voice Provider-specific auth header
coqui, tortoise speaker / voice id Localhost noAuth
hyperbolic model id Body = {text} only