Daily Digest Podcast - PRD & Roadmap

Current State (What We Have)

Built & Working

  • TTS generation (3 providers: OpenAI, Piper, macOS say)
  • Audio storage in Supabase (podcast-audio bucket)
  • RSS feed endpoint (/api/podcast/rss)
  • Database schema (audio_url, audio_duration)
  • Daily digest workflow at 7am CST

⚠️ Not Working / Disabled

  • TTS generation is OFF (ENABLE_TTS=false)
  • No music layering yet

TODO: Music Layering Feature

What We Need

  1. Intro/Outro Music Files

    • Store MP3 files somewhere (Supabase Storage or local)
    • Need: 5-10 sec intro, 5-10 sec outro
  2. Audio Mixing with ffmpeg

    • Layer: Intro → TTS Speech → Outro
    • Optional: Background music under speech (lower volume)
  3. Environment Config

    • INTRO_MUSIC_URL - Path to intro file
    • OUTRO_MUSIC_URL - Path to outro file
    • BACKGROUND_MUSIC_URL - Optional background track
    • MUSIC_VOLUME - Background volume (0.1-0.3)

Implementation Plan

Phase 1: Basic TTS (Quick Win)

  • Enable TTS in .env.production
  • Test with OpenAI provider
  • Verify audio appears on blog

Phase 2: Music Files

  • Source or create intro/outro music
  • Upload to Supabase Storage bucket
  • Add URLs to environment config

Phase 3: Audio Mixing

  • Add ffmpeg dependency
  • Create mixing function in tts.ts
  • Mix: Intro + TTS + Outro
  • Optional: Background music layer
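
The mixing step in Phase 3 could be sketched as a function that only builds the ffmpeg argument list, which keeps it testable without ffmpeg installed. This is a minimal sketch, not the project's implementation: `MixInputs` and `buildMixArgs` are hypothetical names, and it assumes all inputs are local files ffmpeg can read. It uses ffmpeg's `concat`, `amix`, and `volume` audio filters.

```typescript
// Hypothetical input shape; none of these names exist in the codebase yet.
interface MixInputs {
  introPath: string;
  speechPath: string;
  outroPath: string;
  outputPath: string;
  backgroundPath?: string; // optional music bed under the speech
  musicVolume?: number; // linear gain for the background, e.g. 0.2
}

// Build ffmpeg arguments for: Intro -> TTS speech -> Outro,
// optionally mixing a quieter background track under the speech.
export function buildMixArgs(m: MixInputs): string[] {
  // "-y" overwrites the output file if it already exists.
  const args = ["-y", "-i", m.introPath, "-i", m.speechPath, "-i", m.outroPath];
  let filter: string;

  if (m.backgroundPath) {
    const vol = m.musicVolume ?? 0.2;
    args.push("-i", m.backgroundPath); // becomes input index 3
    filter =
      // Lower the background volume.
      `[3:a]volume=${vol}[bg];` +
      // Mix speech with background; duration=first stops when the speech ends.
      `[1:a][bg]amix=inputs=2:duration=first[speech];` +
      // Join the three segments end to end (audio only).
      `[0:a][speech][2:a]concat=n=3:v=0:a=1[out]`;
  } else {
    filter = `[0:a][1:a][2:a]concat=n=3:v=0:a=1[out]`;
  }

  args.push("-filter_complex", filter, "-map", "[out]", m.outputPath);
  return args;
}
```

The returned array could be passed to something like `spawn("ffmpeg", buildMixArgs(inputs))`. Note that `amix` may attenuate its inputs depending on the ffmpeg version, so the speech level should be checked by ear before shipping.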

Phase 4: Production

  • Deploy with music mixing enabled
  • Test full pipeline
  • Verify RSS includes mixed audio

Files Reference

| File | Purpose |
|------|---------|
| src/lib/tts.ts | TTS generation (add mixing here) |
| src/lib/storage.ts | Audio file upload/download |
| src/app/api/tts/route.ts | TTS API endpoint |
| src/app/api/digest/route.ts | Digest API endpoint |
| .env.production | TTS config (ENABLE_TTS, TTS_PROVIDER, etc.) |

Configuration Variables Needed

# Current (in .env.production)
ENABLE_TTS=true
TTS_PROVIDER=openai
TTS_VOICE=alloy
OPENAI_API_KEY=sk-...

# New (for music layering)
INTRO_MUSIC_URL=https://.../intro.mp3
OUTRO_MUSIC_URL=https://.../outro.mp3
BACKGROUND_MUSIC_URL=https://.../bg.mp3  # optional
MUSIC_VOLUME=0.2
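
One way the new variables might be read at startup is sketched below. `MusicConfig` and `loadMusicConfig` are hypothetical names, and the clamping to the 0.1-0.3 range and the 0.2 default are assumptions taken from the values listed in this document, not existing behavior.

```typescript
// Hypothetical config shape for the music layering feature.
interface MusicConfig {
  introUrl: string;
  outroUrl: string;
  backgroundUrl?: string;
  musicVolume: number;
}

// Returns null when intro or outro is missing, so callers can skip mixing.
export function loadMusicConfig(
  env: Record<string, string | undefined>,
): MusicConfig | null {
  const introUrl = env["INTRO_MUSIC_URL"];
  const outroUrl = env["OUTRO_MUSIC_URL"];
  if (!introUrl || !outroUrl) return null;

  // Treat an unset, empty, or non-numeric MUSIC_VOLUME as the 0.2 default.
  const rawText = env["MUSIC_VOLUME"]?.trim();
  const raw = rawText ? Number(rawText) : NaN;
  // Assumption: clamp to the 0.1-0.3 range suggested for background volume.
  const musicVolume = Number.isFinite(raw)
    ? Math.min(0.3, Math.max(0.1, raw))
    : 0.2;

  return {
    introUrl,
    outroUrl,
    backgroundUrl: env["BACKGROUND_MUSIC_URL"] || undefined,
    musicVolume,
  };
}
```

In a Node runtime this would typically be called as `loadMusicConfig(process.env)`.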

Blockers

  1. No intro/outro music files - Need to source or create
  2. ffmpeg not installed on Vercel - May need local-only generation or custom build
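
For the second blocker, one possible approach is to probe for ffmpeg at runtime and fall back to the unmixed TTS audio when it is missing. The sketch below is an assumption about how that check might look; `isFfmpegAvailable` and the `Probe` type are hypothetical, and the probe is injectable so the logic can be tested without ffmpeg.

```typescript
import { spawnSync } from "node:child_process";

// Minimal result shape we need from a process run (subset of spawnSync's result).
type Probe = (
  cmd: string,
  args: string[],
) => { error?: Error; status: number | null };

// Default probe: run the command synchronously and discard its output.
const defaultProbe: Probe = (cmd, args) =>
  spawnSync(cmd, args, { stdio: "ignore" });

// True when `ffmpeg -version` starts and exits with status 0.
// A missing binary shows up as `error` being set rather than a thrown exception.
export function isFfmpegAvailable(probe: Probe = defaultProbe): boolean {
  const result = probe("ffmpeg", ["-version"]);
  return !result.error && result.status === 0;
}
```

A caller could use this to decide between mixing and uploading the raw TTS file, which would keep the daily digest working on hosts without ffmpeg.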

Phase 1: Basic TTS - COMPLETE

  • Enable TTS in .env.production
  • Using macOS say (built-in, no external API)
  • Verify audio appears on blog

Questions to Answer

  1. Do you have intro/outro music already, or should we source it?
  2. Prefer OpenAI TTS or Piper (local/free)?
  3. Want background music under speech, or just intro/outro?