# TTS Options Research for Daily Digest Podcast

## Executive Summary

After evaluating multiple TTS solutions, **Piper TTS** emerges as the best choice for a daily digest workflow, offering very good quality at zero cost with full local control.

---

## Option Comparison

### 1. **Piper TTS** ⭐ RECOMMENDED

- **Cost**: FREE (open source)
- **Quality**: ⭐⭐⭐⭐ Very good (neural voices, natural sounding)
- **Setup**: Easy-Medium (binary download + voice model)
- **Platform**: macOS, Linux, Windows
- **Automation**: CLI tool, easily scripted
- **Pros**:
  - Completely free, no API limits
  - Runs locally (privacy, no internet needed)
  - Fast inference on CPU
  - Multiple high-quality voices available
  - Active development (GitHub: rhasspy/piper)
- **Cons**:
  - Requires downloading voice models (~50-100MB each)
  - Not quite as expressive as premium APIs
  - Outputs WAV; MP3 requires a separate conversion step (for example with ffmpeg)
- **Integration**:

```bash
echo "Your digest content" | piper --model en_US-lessac-medium.onnx --output_file digest.wav
```

### 2. **macOS say Command**

- **Cost**: FREE (built-in)
- **Quality**: ⭐⭐ Basic (functional but robotic)
- **Setup**: None (pre-installed)
- **Platform**: macOS only
- **Automation**: CLI, easily scripted
- **Pros**:
  - Zero setup required
  - Native macOS integration
  - Multiple built-in voices
- **Cons**:
  - Quality is noticeably robotic
  - Limited voice options
  - No neural/AI voices
- **Integration**:

```bash
say -v Samantha -o digest.aiff "Your digest content"
```

### 3. **ElevenLabs Free Tier**

- **Cost**: FREE tier: 10,000 characters/month (~10 min audio)
- **Quality**: ⭐⭐⭐⭐⭐ Excellent (best-in-class natural voices)
- **Setup**: Easy (API key signup)
- **Platform**: API-based (any platform)
- **Automation**: REST API or Python SDK
- **Pros**:
  - Exceptional voice quality
  - Voice cloning available (paid)
  - Multiple languages
- **Cons**:
  - 10K char limit is very restrictive for a daily digest (about two 5-minute episodes per month)
  - Paid tier starts at $5/month for 30K chars
  - Requires internet, API dependency
- **Integration**: Python SDK or curl to API

### 4. **OpenAI TTS API**

- **Cost**: $0.015 per 1,000 characters (~$0.015/minute at roughly 1,000 characters per minute)
- **Quality**: ⭐⭐⭐⭐⭐ Excellent (natural, expressive)
- **Setup**: Easy (API key)
- **Platform**: API-based
- **Automation**: REST API
- **Pros**:
  - High quality voices (alloy, echo, fable, etc.)
  - Fast, reliable API
  - Good for moderate usage
- **Cons**:
  - Not free - costs add up (~$2-4/month for daily digest)
  - Requires internet connection
  - Rate limits apply
- **Cost Estimate**: Daily 5-min digest ≈ $2-4/month

### 5. **Coqui TTS**

- **Cost**: FREE (open source)
- **Quality**: ⭐⭐⭐⭐ Good (varies by model)
- **Setup**: Hard (Python environment, dependencies)
- **Platform**: macOS, Linux, Windows
- **Automation**: Python scripts
- **Pros**:
  - Free and open source
  - Multiple voice models available
  - Voice cloning capability
- **Cons**:
  - Complex setup (conda/pip, GPU recommended)
  - Heavier resource usage than Piper
  - Project maintenance has slowed (team laid off)
- **Integration**: Python script with TTS library

### 6. **Google Cloud TTS**

- **Cost**: FREE tier: 1M characters/month (WaveNet), then $16 per 1M (Standard voices: $4 per 1M)
- **Quality**: ⭐⭐⭐⭐ Very good (WaveNet voices)
- **Setup**: Medium (GCP account, API setup)
- **Platform**: API-based
- **Automation**: REST API or SDK
- **Pros**:
  - Generous free tier
  - Multiple voice options
  - Reliable infrastructure
- **Cons**:
  - Requires GCP account
  - API complexity
  - Privacy concerns (sends text to cloud)
- **Integration**: gcloud CLI or API calls

### 7. **Amazon Polly**

- **Cost**: FREE tier: 5M characters/month for 12 months, then ~$4 per 1M (standard voices)
- **Quality**: ⭐⭐⭐⭐ Good (Neural voices available)
- **Setup**: Medium (AWS account)
- **Platform**: API-based
- **Automation**: AWS CLI or SDK
- **Pros**:
  - Generous free tier initially
  - Neural voices sound natural
- **Cons**:
  - Requires AWS account
  - Complexity of AWS ecosystem
- **Integration**: AWS CLI or boto3

---

## Recommendation

**Primary Choice: Piper TTS**

- Best balance of quality, cost (free), and ease of automation
- Local processing means the digest text never leaves the machine
- No rate limits or API keys to manage
- Well suited to daily scheduled digest generation

**Alternative if quality is paramount: OpenAI TTS**

- Use if the ~$2-4/month cost is acceptable
- Slightly better voice quality
- Simpler than maintaining local models

**Avoid for this use case:**

- ElevenLabs free tier (too limiting for daily use)
- macOS say (quality too low for podcast format)
- Coqui (setup complexity not worth it vs Piper)

---

## Suggested Integration Workflow

```bash
#!/bin/bash
# Daily Digest TTS Script
set -euo pipefail

OUT="digest_$(date +%Y-%m-%d)"

# 1. Fetch or read markdown content
CONTENT=$(cat digest.md)

# 2. Convert markdown to plain text (strip formatting)
PLAIN_TEXT=$(printf '%s\n' "$CONTENT" | pandoc -f markdown -t plain)

# 3. Generate audio with Piper (Piper writes WAV; the matching
#    .onnx.json config must sit next to the model file)
piper \
  --model ~/.local/share/piper/en_US-lessac-medium.onnx \
  --output_file "$OUT.wav" \
  <<< "$PLAIN_TEXT"

# 4. Convert to MP3 for podcast distribution (assumes ffmpeg is installed)
ffmpeg -y -i "$OUT.wav" -codec:a libmp3lame -qscale:a 4 "$OUT.mp3"

# 5. Optional: Upload to podcast host or serve locally
```

---

## Voice Model Recommendations for Piper

| Voice | Style | Best For |
|-------|-------|----------|
| lessac | Neutral, clear | News/digest content |
| libritts | Natural, varied | Long-form content |
| ljspeech | Classic TTS | Short announcements |
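
---

## Cost Estimate Sketch

The monthly cost figures quoted for the character-priced APIs in the comparison can be sanity-checked with a few lines of shell arithmetic. The sketch below is illustrative only: the function names `monthly_chars` and `monthly_cost` are hypothetical helpers, and the default of 1,000 characters per minute of audio is an assumption taken from the "10,000 characters ≈ 10 min" ratio in the ElevenLabs entry. Real speaking rates and current provider pricing will vary.

```bash
#!/bin/bash
# Rough monthly cost estimate for a character-priced TTS API.
# Assumption: ~1,000 characters per minute of audio (not a measured value).

# monthly_chars MINUTES_PER_DAY [DAYS] [CHARS_PER_MINUTE]
# Prints the estimated number of characters synthesized per month.
monthly_chars() {
  local minutes_per_day=$1 days=${2:-30} chars_per_minute=${3:-1000}
  echo $(( minutes_per_day * days * chars_per_minute ))
}

# monthly_cost CHARS PRICE_PER_1K_CHARS [FREE_CHARS]
# Prints the cost in dollars after subtracting any free monthly allowance.
monthly_cost() {
  local chars=$1 price_per_1k=$2 free_chars=${3:-0}
  awk -v c="$chars" -v p="$price_per_1k" -v f="$free_chars" \
    'BEGIN { billable = (c > f) ? c - f : 0; printf "%.2f\n", billable * p / 1000 }'
}

# Example: a 5-minute digest every day for 30 days at $0.015 per 1,000 characters
chars=$(monthly_chars 5)
echo "Characters per month: $chars"
echo "Estimated cost: \$$(monthly_cost "$chars" 0.015)"
```

With these defaults, a 5-minute daily digest comes to 150,000 characters per month, or about $2.25 at $0.015 per 1,000 characters, which falls inside the $2-4/month range quoted for OpenAI TTS. Passing a third argument to `monthly_cost` models a free allowance, so the same helper can be used to see when a free tier would be exhausted.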