# 🎤 Karaoke Video Downloader – PRD (v2.1)

## ✅ Overview

A Python-based Windows CLI tool to download karaoke videos from YouTube channels/playlists using `yt-dlp.exe`, with advanced tracking, songlist prioritization, and flexible configuration.

---

## 📋 Goals

- Download karaoke videos from YouTube channels or playlists.
- Organize downloads by channel (or playlist) in subfolders.
- Avoid re-downloading the same videos (robust tracking).
- Prioritize and track a custom songlist across channels.
- Allow flexible, user-friendly configuration.

---

## 🧑‍💻 Target Users

- Karaoke DJs, home karaoke users, event hosts, or anyone needing offline karaoke video libraries.
- Users comfortable with command-line tools.

---

## ⚙️ Platform & Stack

- **Platform:** Windows
- **Interface:** Command-line (CLI)
- **Tech Stack:** Python 3.7+, yt-dlp.exe, mutagen (for ID3 tagging)

---

## 📥 Input

- YouTube channel or playlist URLs (e.g. `https://www.youtube.com/@SingKingKaraoke/videos`)
- Optional: `data/channels.txt` file with multiple channel URLs (one per line)
- Optional: `data/songList.json` for prioritized song downloads

### Example Usage

```bash
python download_karaoke.py https://www.youtube.com/@SingKingKaraoke/videos
python download_karaoke.py --file data/channels.txt
python download_karaoke.py --songlist-only
python download_karaoke.py --reset-channel SingKingKaraoke --reset-songlist
python download_karaoke.py --clear-cache SingKingKaraoke
```

---

## 📤 Output

- MP4 files in per-channel subfolders under `downloads/`
- All videos tracked in `data/karaoke_tracking.json`
- Songlist progress tracked in `data/songlist_tracking.json`
- Logs in `logs/`

---

## 🛠️ Features

- ✅ Channel-based downloads (with per-channel folders)
- ✅ Robust JSON tracking (downloaded, partial, failed, etc.)
- ✅ Batch saving and channel video caching for performance
- ✅ Configurable download resolution and yt-dlp options (`data/config.json`)
- ✅ Songlist integration: prioritize and track custom songlists
- ✅ Songlist-only mode: download only songs from the songlist
- ✅ Global songlist tracking to avoid duplicates across channels
- ✅ ID3 tagging for artist/title in MP4 files (mutagen)
- ✅ Real-time progress and detailed logging
- ✅ Automatic cleanup of extra yt-dlp files
- ✅ **Reset/clear channel tracking and files via CLI**
- ✅ **Clear channel cache via CLI**
- ✅ **Download plan pre-scan and caching**: Before downloading, the tool pre-scans all channels for songlist matches, builds a download plan, and prints stats. The plan is cached for 1 day in `data/download_plan_cache.json` for fast resuming and reliability. Use `--force-download-plan` to force a refresh.
- ✅ **Latest-per-channel download**: Download the latest N videos from each channel in a single batch, with a per-channel download plan, robust resume, and unique plan cache. Use `--latest-per-channel` and `--limit N`.
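
The one-day download plan cache listed above could be handled roughly as in the sketch below. Only the cache path, the one-day lifetime, and the forced-refresh behavior of `--force-download-plan` come from this document; the function names `load_cached_plan` and `save_plan`, the `force_refresh` parameter, and the `created_at`/`plan` file layout are hypothetical and chosen purely for illustration. When `load_cached_plan` returns `None`, the caller would re-scan the channels and call `save_plan` with the fresh plan.

```python
import json
import time
from pathlib import Path

PLAN_CACHE = Path("data/download_plan_cache.json")  # cache path named in this PRD
MAX_AGE_SECONDS = 24 * 60 * 60  # "cached for 1 day"


def load_cached_plan(path=PLAN_CACHE, force_refresh=False, now=None):
    """Return the cached plan, or None if it is missing, stale, or bypassed.

    force_refresh stands in for the --force-download-plan flag (assumed wiring).
    """
    if force_refresh or not path.exists():
        return None
    now = time.time() if now is None else now
    try:
        payload = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None  # treat an unreadable cache the same as a missing one
    if not isinstance(payload, dict):
        return None
    # Hypothetical layout: {"created_at": <unix seconds>, "plan": [...]}
    if now - payload.get("created_at", 0) > MAX_AGE_SECONDS:
        return None
    return payload.get("plan")


def save_plan(plan, path=PLAN_CACHE, now=None):
    """Write the plan together with its creation time."""
    now = time.time() if now is None else now
    path.parent.mkdir(parents=True, exist_ok=True)
    payload = {"created_at": now, "plan": plan}
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
```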

---

## 📂 Folder Structure

```
KaroakeVideoDownloader/
├── karaoke_downloader/        # All core Python code and utilities
│   ├── downloader.py          # Main downloader class
│   ├── cli.py                 # CLI entry point
│   ├── id3_utils.py           # ID3 tagging helpers
│   ├── songlist_manager.py    # Songlist logic
│   ├── youtube_utils.py       # YouTube helpers
│   ├── tracking_manager.py    # Tracking logic
│   ├── check_resolution.py    # Resolution checker utility
│   ├── resolution_cli.py      # Resolution config CLI
│   └── tracking_cli.py        # Tracking management CLI
├── data/                      # All config, tracking, cache, and songlist files
│   ├── config.json
│   ├── karaoke_tracking.json
│   ├── songlist_tracking.json
│   ├── channel_cache.json
│   ├── channels.txt
│   └── songList.json
├── downloads/                 # All video output
│   └── [ChannelName]/         # Per-channel folders
├── logs/                      # Download logs
├── downloader/yt-dlp.exe      # yt-dlp binary
├── tests/                     # Diagnostic and test scripts
│   └── test_installation.py
├── download_karaoke.py        # Main entry point (thin wrapper)
├── README.md
├── PRD.md
├── requirements.txt
└── download_karaoke.bat       # (optional Windows launcher)
```

---

## 🚦 CLI Options (Summary)

- `--file `: Download from a list of channels
- `--songlist-priority`: Prioritize songlist songs in download queue
- `--songlist-only`: Download only songs from the songlist
- `--songlist-status`: Show songlist download progress
- `--limit `: Limit number of downloads
- `--resolution <720p|1080p|...>`: Override resolution
- `--status`: Show download/tracking status
- `--reset-channel `: **Reset all tracking and files for a channel**
- `--reset-songlist`: **When used with `--reset-channel`, also reset songlist songs for this channel**
- `--clear-cache `: **Clear channel video cache for a specific channel or all**
- `--force-download-plan`: **Force refresh the download plan cache (re-scan all channels for matches)**
- `--latest-per-channel`: **Download the latest N videos from each channel (use with `--limit`)**

---

## 🧠 Logic Highlights

- **Tracking:** All downloads, statuses, and formats are tracked in JSON files for reliability and deduplication.
- **Songlist:** Loads and normalizes `data/songList.json`, matches against available videos, and prioritizes or restricts downloads accordingly.
- **Batch/Caching:** Channel video lists are cached to minimize API calls; tracking is batch-saved for performance.
- **ID3 Tagging:** Artist/title extracted from video title and embedded in MP4 files.
- **Cleanup:** Extra files from yt-dlp (e.g., `.info.json`) are automatically removed after download.
- **Reset/Clear:** Use `--reset-channel` to reset all tracking and files for a channel (optionally including songlist songs with `--reset-songlist`). Use `--clear-cache` to clear cached video lists for a channel or all channels.
- **Download plan pre-scan:** Before downloading, the tool scans all channels for songlist matches, builds a download plan, and prints stats (matches, unmatched, per-channel breakdown). The plan is cached for 1 day and reused unless `--force-download-plan` is set.
- **Latest-per-channel plan:** Download the latest N videos from each channel, with a per-channel plan and robust resume. Each channel is removed from the plan as it completes. Plan cache is deleted when all channels are done.

---

## 🚀 Future Enhancements

- [ ] Web UI for easier management
- [ ] More advanced song matching (fuzzy, multi-language)
- [ ] Download scheduling and retry logic
- [ ] More granular status reporting
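
As a reference point for the songlist matching and artist/title extraction described under Logic Highlights, here is a minimal sketch of exact matching after normalization. It rests on several assumptions that this document does not confirm: that songlist entries are objects with `artist` and `title` keys, that video titles follow an `Artist - Title (extra notes)` pattern, and that the helper names `normalize`, `split_video_title`, and `match_songlist` exist only for this illustration. It is not the tool's actual matching logic.

```python
import re


def normalize(text):
    """Lowercase, drop bracketed notes and punctuation, collapse whitespace."""
    text = re.sub(r"[\(\[].*?[\)\]]", " ", text.lower())  # e.g. "(Karaoke Version)"
    text = re.sub(r"[^\w\s]+", " ", text)  # keeps Unicode letters and digits
    return " ".join(text.split())


def split_video_title(video_title):
    """Split 'Artist - Title ...' into (artist, title); None if no ' - ' is found."""
    artist, sep, title = video_title.partition(" - ")
    if not sep:
        return None
    return artist.strip(), title.strip()


def match_songlist(songlist, video_titles):
    """Return (matches, unmatched) for an assumed list of {'artist', 'title'} dicts.

    matches is a list of (song, video_title) pairs; each song matches at most
    once, which mirrors the goal of avoiding duplicates across channels.
    """
    wanted = {(normalize(s["artist"]), normalize(s["title"])): s for s in songlist}
    found = set()
    matches = []
    for video_title in video_titles:
        parsed = split_video_title(video_title)
        if parsed is None:
            continue  # title does not follow the assumed pattern
        key = (normalize(parsed[0]), normalize(parsed[1]))
        if key in wanted and key not in found:
            found.add(key)
            matches.append((wanted[key], video_title))
    unmatched = [song for key, song in wanted.items() if key not in found]
    return matches, unmatched
```

Exact matching on normalized keys is the simplest option and will miss titles with different word order or spelling, which is the gap the fuzzy matching item in Future Enhancements would address.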