Introducing Max: Describe Any Voice in Plain English

What if you could describe a voice the way you'd describe it to a friend?

"A warm British narrator in his 50s, slightly raspy, like he's telling a story by the fireplace."

That's a real voice instruction. And it works.

from leanvox import Leanvox

client = Leanvox()
result = client.generate(
    text="The kingdom had stood for a thousand years. Tonight, it would fall.",
    model="max",
    voice_instructions="A warm British narrator in his 50s, slightly raspy, like he's telling a story by the fireplace"
)
print(result.audio_url)

No voice IDs. No uploading samples. Just words.

# Three Tiers, One API

LeanVox now has three models, each designed for different needs:

| Tier | Model | Price per 1K chars | Best for |
| --- | --- | --- | --- |
| Standard | Kokoro | $0.005 | High volume, low cost |
| Pro | Chatterbox | $0.01 | Voice cloning, emotion control, 40 curated voices |
| Max | Qwen3-TTS | $0.03 | Instruction-based voice design, creative control |

All three work with the same API, the same SDKs, and the same MCP server. Just change the model parameter.
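To get a feel for how the tiers compare on cost, here is a small sketch that applies the per-1K-character prices from the table to a given text length. The `cost_by_tier` helper is ours for illustration and is not part of the LeanVox SDK.

```python
# Prices per 1,000 characters, taken from the tier table above.
PRICE_PER_1K_CHARS = {"Standard": 0.005, "Pro": 0.01, "Max": 0.03}

def cost_by_tier(num_chars: int) -> dict:
    """Return the estimated cost in USD for each tier (illustrative helper)."""
    return {
        tier: round(num_chars / 1000 * price, 4)
        for tier, price in PRICE_PER_1K_CHARS.items()
    }

# Compare the three tiers for a 10,000-character script.
for tier, cost in cost_by_tier(10_000).items():
    print(f"{tier}: ${cost:.2f}")
```

For a 10,000-character script, that works out to $0.05 on Standard, $0.10 on Pro, and $0.30 on Max.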

# How It Works

Instead of picking from a preset voice catalog, you write a description:

{
  "text": "Welcome back to the show!",
  "model": "max",
  "voice_instructions": "Energetic female podcast host, mid-30s, American accent, upbeat and conversational"
}

The model interprets your instructions and generates a unique voice. Every generation returns a generated_voice_id you can reuse for consistency across multiple calls.

# Voice Instructions That Work

Be specific. The more detail you give, the better the result:

Gender & age: "Young woman in her 20s" or "Elderly man, 70s"
Accent: "British RP", "Southern American", "Australian"
Tone: "Warm and reassuring", "Cold and clinical", "Excited and breathless"
Character: "Like a nature documentary narrator", "Like a bedtime story reader"
Quality: "Slightly raspy", "Crystal clear", "Deep and resonant"
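If you generate descriptions programmatically, it can help to keep those five attributes as separate fields and join them at the end. This is a minimal sketch. The `VoiceSpec` class is our own illustration and not something the SDK provides, and the comma-joined format is just one reasonable way to phrase the instruction.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VoiceSpec:
    """Hypothetical container for the five attributes listed above."""
    gender_age: Optional[str] = None
    accent: Optional[str] = None
    tone: Optional[str] = None
    character: Optional[str] = None
    quality: Optional[str] = None

    def to_instructions(self) -> str:
        # Join only the attributes that were actually provided.
        parts = [self.gender_age, self.accent, self.tone, self.character, self.quality]
        return ", ".join(p.strip() for p in parts if p and p.strip())

spec = VoiceSpec(
    gender_age="Elderly man, 70s",
    accent="British RP",
    tone="Warm and reassuring",
    quality="Deep and resonant",
)
print(spec.to_instructions())
```

The resulting string can be passed as `voice_instructions` in a request.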

# Multi-Speaker Dialogue

Max works with the dialogue endpoint too. Each speaker gets their own instructions:

episode = client.dialogue(
    model="max",
    lines=[
        {
            "text": "So what actually happened in production last night?",
            "voice_instructions": "Calm female tech lead, direct and analytical"
        },
        {
            "text": "Honestly? A config typo took down the whole cluster.",
            "voice_instructions": "Sheepish junior engineer, male, nervous laugh energy"
        }
    ],
    gap_ms=500
)

Two distinct voices, one API call, no presets needed.
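For longer scripts, repeating the same description on every line gets tedious. One option is to define each speaker once and expand a script into the `lines` list shown above. The helper below is a sketch of that idea. `build_dialogue_lines` is a hypothetical name, and it only builds plain dictionaries in the same shape as the example.

```python
from typing import Dict, List, Tuple

def build_dialogue_lines(
    script: List[Tuple[str, str]], voices: Dict[str, str]
) -> List[dict]:
    """Expand (speaker, text) pairs into dialogue lines with voice instructions."""
    lines = []
    for speaker, text in script:
        if speaker not in voices:
            raise KeyError(f"No voice instructions defined for speaker {speaker!r}")
        lines.append({"text": text, "voice_instructions": voices[speaker]})
    return lines

voices = {
    "lead": "Calm female tech lead, direct and analytical",
    "junior": "Sheepish junior engineer, male, nervous laugh energy",
}
script = [
    ("lead", "So what actually happened in production last night?"),
    ("junior", "Honestly? A config typo took down the whole cluster."),
    ("lead", "And the rollback?"),
]
lines = build_dialogue_lines(script, voices)
```

The `lines` value can then be passed to the dialogue call in place of the hand-written list.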

# Works Everywhere

Python:

pip install leanvox

Node.js/TypeScript:

npm install leanvox

MCP (Claude, Cursor, VS Code):

npx leanvox-mcp

Same instruction-based voice design across every integration.

# Pricing

$0.03 per 1,000 characters. No subscription. Credits never expire.

A 500-word narration costs ~$0.09. A 5-minute podcast episode costs ~$0.15. Still cheaper than every competitor's base tier.
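The arithmetic behind those estimates can be sketched in a few lines. The characters-per-word and words-per-minute figures below are our assumptions, not published LeanVox numbers, so adjust them to your own content.

```python
MAX_PRICE_PER_1K_CHARS = 0.03  # Max tier price from this section
CHARS_PER_WORD = 6             # ASSUMPTION: ~5 letters plus a space
WORDS_PER_MINUTE = 150         # ASSUMPTION: typical narration pace

def cost_for_words(words: int) -> float:
    """Estimate the Max tier cost in USD for a text of the given word count."""
    chars = words * CHARS_PER_WORD
    return chars / 1000 * MAX_PRICE_PER_1K_CHARS

def cost_for_minutes(minutes: float) -> float:
    """Estimate the Max tier cost in USD for audio of the given duration."""
    return cost_for_words(round(minutes * WORDS_PER_MINUTE))

print(f"500 words: ${cost_for_words(500):.2f}")
print(f"5 minutes: ${cost_for_minutes(5):.3f}")
```

Under these assumptions, 500 words comes to about 3,000 characters and $0.09. Five minutes comes to about 4,500 characters and roughly $0.135, in the same ballpark as the ~$0.15 quoted above. A faster speaking pace pushes the figure up.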

Free signup credit covers 200+ minutes of audio and can be spent on Max tier voice design too: roughly 50 generations to experiment with, no credit card required.

# When to Use Each Tier

Standard ($0.005) — Bulk narration, IVR, notifications. Pick a voice ID, go.
Pro ($0.01) — Podcasts, audiobooks, apps. Clone your voice, use emotion tags, 40 curated voices.
Max ($0.03) — Creative projects, prototyping, dynamic characters. Describe any voice you can imagine.

→ Get your API key · Docs · Python SDK · Node SDK