AI Voice for Podcast Narration: Generate Professional Audio for $0.05/Episode

How podcast creators and developers are using LeanVox to generate consistent, professional-sounding narration without a recording studio.

podcast · tts · audio · developer-tools

You don't need a recording booth, a microphone, or a consistent daily schedule to make a podcast anymore.

With a TTS API and a decent script, you can generate a 20-minute episode in under 60 seconds for about $0.05. Here's exactly how to do it.

The use case

Podcast narration with AI voice works best for:

  • Solo narration podcasts — news roundups, book summaries, tech explainers, daily briefings
  • Multi-host formats — interview-style shows with two or more distinct voices
  • Branded audio content — consistent voice across hundreds of episodes
  • Translated episodes — same script, different language, same production cost

The key requirement: a consistent voice across every episode. That's where voice cloning or curated voice selection comes in.

Picking a voice

LeanVox ships with a curated library of 238+ pro voices, organized by use case. For podcast narration, the standout categories:

  • Podcast hosts — warm, conversational, sounds like someone you'd listen to for an hour
  • Narrators — authoritative, measured, great for explainer content
  • News anchors — crisp, professional, high credibility

Or clone your own voice. Record 30 seconds, upload it, and every episode sounds like you — even when you're not recording.

Single-host episode

For a solo narration show, it's one API call:

from leanvox import Leanvox

client = Leanvox(api_key="lv_live_...")

# Generate a full episode from script
result = client.generate(
    text=episode_script,  # up to 10,000 chars per call
    model="pro",
    voice="podcast_conversational_female",
    speed=1.0,
)

print(result.audio_url)  # CDN URL, ready to publish

For longer episodes (over 10,000 characters), use async jobs — no manual splitting or stitching required:

# Generate a long episode from a script file — handles any length automatically
lvox generate --use-async \
  --model pro \
  --voice podcast_conversational_female \
  --file episode_script.txt \
  --output episode_042.mp3

Or via the Python SDK, reusing the client created above:

import requests

# Submit async job — no chunking needed
job = client.generate_async(
    text=long_episode_script,
    model="pro",
    voice="podcast_conversational_female",
)

result = job.wait()  # polls until complete

audio = requests.get(result.audio_url).content
with open("episode_042.mp3", "wb") as f:
    f.write(audio)
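
The 10,000-character limit suggests a simple routing rule: short scripts go through the synchronous call, longer ones through an async job. Here is a minimal sketch of that rule, assuming the limit counts raw characters and is inclusive. `choose_mode`, `generate_episode`, and `SYNC_CHAR_LIMIT` are hypothetical names for illustration, not part of the LeanVox SDK; only `generate`, `generate_async`, `wait`, and `audio_url` come from the snippets above.

```python
SYNC_CHAR_LIMIT = 10_000  # per-call limit quoted above; assumed to count raw characters


def choose_mode(script: str, limit: int = SYNC_CHAR_LIMIT) -> str:
    """Return "sync" if the script fits in one generate() call, else "async"."""
    return "sync" if len(script) <= limit else "async"


def generate_episode(client, script: str, voice: str, model: str = "pro") -> str:
    """Generate audio with whichever call fits the script and return its URL.

    `client` is expected to expose generate() and generate_async() as used
    in the snippets above.
    """
    if choose_mode(script) == "sync":
        return client.generate(text=script, model=model, voice=voice).audio_url
    job = client.generate_async(text=script, model=model, voice=voice)
    return job.wait().audio_url  # wait() polls until the job completes
```

With this in place, a publishing script can call `generate_episode(client, script, "podcast_conversational_female")` without caring how long the script is.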

Two-host conversation format

The dialogue endpoint handles multi-speaker shows in a single call — no stitching required:

episode = client.dialogue(
    model="pro",
    gap_ms=500,  # natural pause between speakers
    lines=[
        {
            "text": "Welcome back to Syntax Error. I'm your host, and today we're talking about LLM inference costs.",
            "voice": "podcast_conversational_female",
            "language": "en"
        },
        {
            "text": "It's a topic I have strong opinions about. GPU costs have dropped 70% in two years and nobody's passing that on to developers.",
            "voice": "podcast_casual_male",
            "language": "en",
            "exaggeration": 0.6  # slightly more animated delivery
        },
        {
            "text": "Strong take. Let's break down the numbers.",
            "voice": "podcast_conversational_female",
            "language": "en"
        },
    ],
)

print(episode.audio_url)

One call, one MP3, both voices. The gap_ms parameter controls the silence between speakers; 400–600 ms sounds natural for conversation.
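
Writing the `lines` list by hand gets tedious for a full episode. If the script is kept as plain text with speaker labels, it can be converted into that shape automatically. The sketch below assumes a "SPEAKER: text" format with one turn per line; the `HOST`/`GUEST` labels, the `VOICES` mapping, and `script_to_lines` are hypothetical, while the dictionary keys match the dialogue call above.

```python
# Hypothetical mapping from speaker labels in the script to voice IDs.
VOICES = {
    "HOST": "podcast_conversational_female",
    "GUEST": "podcast_casual_male",
}


def script_to_lines(script: str, voices: dict[str, str], language: str = "en") -> list[dict]:
    """Turn "SPEAKER: text" lines into the list-of-dicts shape used above."""
    lines = []
    for raw in script.splitlines():
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines between turns
        # Split on the first colon only, so colons inside the text survive.
        speaker, sep, text = raw.partition(":")
        speaker = speaker.strip()
        if not sep or speaker not in voices:
            raise ValueError(f"Unrecognized line: {raw!r}")
        lines.append({
            "text": text.strip(),
            "voice": voices[speaker],
            "language": language,
        })
    return lines
```

The result can be passed straight through, for example `client.dialogue(model="pro", gap_ms=500, lines=script_to_lines(script, VOICES))`. Failing loudly on an unlabeled line is a deliberate choice here: a silent skip would drop dialogue from the finished episode.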

Clone your voice for brand consistency

If you're building a podcast brand, cloning your own voice locks in consistency across every episode:

# One-time setup: clone your voice
with open("my_voice_sample.wav", "rb") as f:
    voice = client.voices.clone(
        name="My Podcast Voice",
        audio=f
    )

client.voices.unlock(voice.voice_id)  # make available for generation

# Every episode from here forward
result = client.generate(
    text=episode_script,
    model="pro",
    voice=voice.voice_id,
)

Record a clean 30-second sample (quiet room, no background noise) and the clone captures your timbre, pace, and tone.
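
It can be worth checking the sample's length locally before uploading. The sketch below uses Python's standard `wave` module, so it only handles uncompressed WAV files. The 30-second threshold is the guideline from this post; whether the API enforces a minimum is not something this example assumes. `wav_duration_seconds` and `is_long_enough` are hypothetical helper names.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_long_enough(path: str, minimum_seconds: float = 30.0) -> bool:
    """Check a sample against the 30-second guideline mentioned above."""
    return wav_duration_seconds(path) >= minimum_seconds
```

For example, `is_long_enough("my_voice_sample.wav")` can gate the clone call so a short take gets re-recorded instead of uploaded.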

Automate the full pipeline

The real power is chaining this with an LLM. Write a script, generate audio, publish — fully automated:

import anthropic
from leanvox import Leanvox

claude = anthropic.Anthropic()
leanvox = Leanvox(api_key="lv_live_...")

def generate_daily_briefing(topic: str) -> str:
    # Step 1: Write the script with Claude
    response = claude.messages.create(
        model="claude-opus-4-5",
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": f"Write a 2-minute podcast script about: {topic}. Professional, conversational tone. No stage directions."
        }]
    )
    script = response.content[0].text

    # Step 2: Generate audio with LeanVox
    result = leanvox.generate(
        text=script,
        model="pro",
        voice="podcast_conversational_female",
    )

    return result.audio_url

# Run daily
url = generate_daily_briefing("The latest in open-source AI models")
print(f"Today's episode: {url}")
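
Language models do not reliably hit a requested running time, so an automated pipeline may want to check the script's length before paying for audio. This is a rough sketch that assumes a narration pace of 150 words per minute, which is an assumption rather than a LeanVox figure and will shift with the voice and `speed` setting. `estimated_minutes` and `within_target` are hypothetical helpers.

```python
WORDS_PER_MINUTE = 150  # assumed average narration pace; tune for your voice and speed


def estimated_minutes(script: str, wpm: int = WORDS_PER_MINUTE) -> float:
    """Estimate spoken duration from a whitespace-based word count."""
    return len(script.split()) / wpm


def within_target(script: str, target_minutes: float, tolerance: float = 0.25) -> bool:
    """True if the estimate is within +/- tolerance (as a fraction) of the target."""
    estimate = estimated_minutes(script)
    return abs(estimate - target_minutes) <= target_minutes * tolerance
```

Inside `generate_daily_briefing`, a check such as `within_target(script, 2.0)` could decide whether to ask the model for a rewrite before calling the TTS step.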

What does it cost?

For a long-form episode of about 40,000 characters (~8,000 words, or roughly 50 minutes of audio at a typical 150 words per minute):

Tier                 Rate              Episode cost   100 episodes/mo
Standard             $0.005/1K chars   $0.20          $20
Pro (with cloning)   $0.01/1K chars    $0.40          $40
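
The figures in the table follow directly from the per-character rates, and the same arithmetic works for any script length. The sketch below assumes billing is linear in character count with no minimum charge or per-request rounding, which this post does not confirm. `RATES` and `episode_cost` are hypothetical names; the rates themselves are copied from the table.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rates per 1,000 characters, taken from the table above.
RATES = {"standard": Decimal("0.005"), "pro": Decimal("0.01")}


def episode_cost(char_count: int, tier: str) -> Decimal:
    """Cost in dollars for one episode, rounded to the nearest cent."""
    cost = Decimal(char_count) / Decimal(1000) * RATES[tier]
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(episode_cost(40_000, "standard"))       # 0.20
print(episode_cost(40_000, "pro"))            # 0.40
print(episode_cost(40_000, "standard") * 100) # 20.00
```

`Decimal` is used instead of floats so that small per-episode amounts do not pick up rounding noise when multiplied across a month of episodes.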

Compare: a voice actor typically charges $200–$500 per hour, and even a 10-minute episode might take 30–60 minutes of studio time to record and edit.

Try it

Browse the voice library — preview 238+ voices by category. Your $1.00 signup credit covers your first 2–3 episodes.

Get your API key · Docs

