# 🎤 Karaoke Video Downloader PRD (v2.0)
## ✅ Overview
A Python-based Windows CLI tool to download karaoke videos from YouTube channels/playlists using `yt-dlp.exe`, with advanced tracking, songlist prioritization, and flexible configuration.
---
## 📋 Goals
- Download karaoke videos from YouTube channels or playlists.
- Organize downloads by channel (or playlist) in subfolders.
- Avoid re-downloading the same videos (robust tracking).
- Prioritize and track a custom songlist across channels.
- Allow flexible, user-friendly configuration.
---
## 🧑‍💻 Target Users
- Karaoke DJs, home karaoke users, event hosts, or anyone needing offline karaoke video libraries.
- Users comfortable with command-line tools.
---
## ⚙️ Platform & Stack
- **Platform:** Windows
- **Interface:** Command-line (CLI)
- **Tech Stack:** Python 3.7+, yt-dlp.exe, mutagen (for ID3 tagging)
---
## 📥 Input
- YouTube channel or playlist URLs (e.g. `https://www.youtube.com/@SingKingKaraoke/videos`)
- Optional: `channels.txt` file with multiple channel URLs (one per line)
- Optional: `docs/songList.json` for prioritized song downloads
### Example Usage
```bash
python download_karaoke.py https://www.youtube.com/@SingKingKaraoke/videos
python download_karaoke.py --file channels.txt
python download_karaoke.py --songlist-only
```
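As a rough illustration of the `--file channels.txt` input, the sketch below reads one URL per line. The function name `load_channel_urls` and the handling of blank lines and `#` comments are assumptions made for this example, not documented behavior of the tool.

```python
from pathlib import Path
from typing import List


def load_channel_urls(path: str) -> List[str]:
    """Return the channel URLs listed in a text file, one per line."""
    urls = []
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Assumption: blank lines and lines starting with '#' are skipped.
        if not line or line.startswith("#"):
            continue
        urls.append(line)
    return urls
```

Skipping comment lines would let users temporarily disable a channel without deleting it from the file.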
---
## 📤 Output
- MP4 files in `downloads/<ChannelName>/` subfolders
- All videos tracked in `karaoke_tracking.json`
- Songlist progress tracked in `songlist_tracking.json`
- Logs in `logs/`
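To make the tracking files more concrete, here is a minimal sketch of a JSON-backed tracker that skips already-downloaded videos and writes to disk in batches. The class name `TrackingStore`, the schema (a dict keyed by video ID with a `status` field), and the default batch size are all hypothetical; the real `karaoke_tracking.json` layout may differ.

```python
import json
from pathlib import Path


class TrackingStore:
    """Minimal JSON-backed tracker keyed by video ID (hypothetical schema)."""

    def __init__(self, path, batch_size=10):
        self.path = Path(path)
        self.batch_size = batch_size
        self._pending = 0  # updates made since the last save
        if self.path.exists():
            self.videos = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.videos = {}

    def is_downloaded(self, video_id):
        """True if this video was already downloaded successfully."""
        return self.videos.get(video_id, {}).get("status") == "downloaded"

    def mark(self, video_id, status, **extra):
        """Record a status (e.g. downloaded, partial, failed) for a video."""
        entry = self.videos.setdefault(video_id, {})
        entry.update(status=status, **extra)
        self._pending += 1
        # Batch saving: only touch the disk every `batch_size` updates.
        if self._pending >= self.batch_size:
            self.save()

    def save(self):
        """Write to a temp file first, then swap it into place."""
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(self.videos, indent=2), encoding="utf-8")
        tmp.replace(self.path)
        self._pending = 0
```

A caller would check `is_downloaded(video_id)` before queuing a video and call `save()` once more at exit so that a partially filled batch is not lost.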
---
## 🛠️ Features
- ✅ Channel-based downloads (with per-channel folders)
- ✅ Robust JSON tracking (downloaded, partial, failed, etc.)
- ✅ Batch saving and channel video caching for performance
- ✅ Configurable download resolution and yt-dlp options (`config.json`)
- ✅ Songlist integration: prioritize and track custom songlists
- ✅ Songlist-only mode: download only songs from the songlist
- ✅ Global songlist tracking to avoid duplicates across channels
- ✅ ID3 tagging for artist/title in MP4 files (mutagen)
- ✅ Real-time progress and detailed logging
- ✅ Automatic cleanup of extra yt-dlp files
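The cleanup feature could look roughly like the sketch below. The function name `cleanup_extra_files` is hypothetical, and apart from `.info.json` the suffix list is an assumption about which side files count as "extra".

```python
from pathlib import Path

# Assumption: only '.info.json' is named in this document; the other
# suffixes are illustrative guesses at side files worth removing.
EXTRA_SUFFIXES = (".info.json", ".description", ".ytdl")


def cleanup_extra_files(folder):
    """Delete leftover side files in a channel folder and return their names."""
    removed = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.name.endswith(EXTRA_SUFFIXES):
            path.unlink()
            removed.append(path.name)
    return removed
```

Matching on the full file name rather than `Path.suffix` matters here, because `.info.json` is a compound suffix and `Path.suffix` would only report `.json`.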
---
## 📂 Folder Structure
```
KaraokeVideoDownloader/
├── download_karaoke.py # Main script
├── tracking_manager.py # Tracking logic
├── manage_tracking.py # Tracking management utility
├── update_resolution.py # Resolution config utility
├── config.json # Main config
├── downloader/yt-dlp.exe # yt-dlp binary
├── downloads/ # All video output
│ └── [ChannelName]/ # Per-channel folders
├── logs/ # Download logs
├── karaoke_tracking.json # Main tracking DB
├── songlist_tracking.json # Songlist tracking DB
└── docs/songList.json # Songlist for prioritization
```
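Given this layout, a yt-dlp invocation that honors the configured resolution and writes into the per-channel folder might be assembled as follows. This is a sketch under assumptions: the helper names `parse_height` and `build_command`, the format selector, and the output template are illustrative choices, not the project's actual settings. It only builds the argument list and does not run anything.

```python
from pathlib import Path

# Location of the bundled binary, as shown in the folder structure.
YT_DLP = Path("downloader") / "yt-dlp.exe"


def parse_height(resolution):
    """Convert a label such as '720p' into the integer 720."""
    return int(resolution.lower().rstrip("p"))


def build_command(url, channel_name, resolution="720p"):
    """Build the argument list for one yt-dlp run (hypothetical helper)."""
    height = parse_height(resolution)
    # Prefer separate video+audio capped at the height, else a combined file.
    fmt = f"bestvideo[height<={height}]+bestaudio/best[height<={height}]"
    output = str(Path("downloads") / channel_name / "%(title)s.%(ext)s")
    return [
        str(YT_DLP),
        "-f", fmt,
        "--merge-output-format", "mp4",
        "-o", output,
        url,
    ]
```

The resulting list can be passed directly to `subprocess.run`, which avoids shell quoting problems with titles and URLs.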
---
## 🚦 CLI Options (Summary)
- `--file <channels.txt>`: Download from a list of channels
- `--songlist-priority`: Prioritize songlist songs in download queue
- `--songlist-only`: Download only songs from the songlist
- `--songlist-status`: Show songlist download progress
- `--limit <N>`: Limit number of downloads
- `--resolution <720p|1080p|...>`: Override resolution
- `--status`: Show download/tracking status
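The options above map naturally onto `argparse`. The following is a minimal sketch of such a parser; the function name `build_parser`, the help strings, and the choice to make the URL an optional positional argument are assumptions for illustration.

```python
import argparse


def build_parser():
    """Build a parser covering the options summarized above."""
    parser = argparse.ArgumentParser(
        prog="download_karaoke.py",
        description="Download karaoke videos from YouTube channels or playlists.",
    )
    # Assumption: the URL is optional because --file or --songlist-only
    # can be used without one, as in the usage examples.
    parser.add_argument("url", nargs="?", help="Channel or playlist URL")
    parser.add_argument("--file", metavar="channels.txt",
                        help="Download from a list of channels")
    parser.add_argument("--songlist-priority", action="store_true",
                        help="Prioritize songlist songs in the download queue")
    parser.add_argument("--songlist-only", action="store_true",
                        help="Download only songs from the songlist")
    parser.add_argument("--songlist-status", action="store_true",
                        help="Show songlist download progress")
    parser.add_argument("--limit", type=int, metavar="N",
                        help="Limit number of downloads")
    parser.add_argument("--resolution", metavar="RES",
                        help="Override resolution, e.g. 720p or 1080p")
    parser.add_argument("--status", action="store_true",
                        help="Show download/tracking status")
    return parser
```

For example, `build_parser().parse_args(["--songlist-only", "--limit", "5"])` yields a namespace with `songlist_only=True` and `limit=5`.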
---
## 🧠 Logic Highlights
- **Tracking:** All downloads, statuses, and formats are tracked in JSON files for reliability and deduplication.
- **Songlist:** Loads and normalizes `docs/songList.json`, matches against available videos, and prioritizes or restricts downloads accordingly.
- **Batch/Caching:** Channel video lists are cached so they are not re-fetched on every run; tracking updates are saved in batches rather than after every video.
- **ID3 Tagging:** Artist/title extracted from video title and embedded in MP4 files as metadata tags (MP4 containers store these as iTunes-style atoms rather than true ID3 frames).
- **Cleanup:** Extra files from yt-dlp (e.g., `.info.json`) are automatically removed after download.
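The songlist matching and artist/title extraction described above can be sketched as follows. Several things here are assumptions: that video titles follow an `Artist - Title (Karaoke Version)` pattern, that songlist entries are dicts with `artist` and `title` keys, and the helper names `normalize`, `split_artist_title`, and `prioritize`. The normalization is ASCII-only and exact-match, so it is a starting point rather than the fuzzy or multi-language matching listed under future enhancements.

```python
import re


def normalize(text):
    """Lowercase, drop bracketed suffixes and punctuation, collapse spaces."""
    text = re.sub(r"[\(\[].*?[\)\]]", " ", text.lower())
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return text.strip()


def split_artist_title(video_title):
    """Split 'Artist - Title (Karaoke Version)' into (artist, title).

    Returns None when the title does not contain ' - '.
    """
    cleaned = re.sub(r"[\(\[].*?[\)\]]", "", video_title)
    if " - " not in cleaned:
        return None
    artist, title = cleaned.split(" - ", 1)
    return artist.strip(), title.strip()


def prioritize(video_titles, songlist):
    """Return the titles with songlist matches first, order otherwise kept."""
    wanted = {(normalize(s["artist"]), normalize(s["title"])) for s in songlist}

    def is_wanted(video_title):
        parts = split_artist_title(video_title)
        if parts is None:
            return False
        return (normalize(parts[0]), normalize(parts[1])) in wanted

    # sorted() is stable and False sorts before True, so matches come first.
    return sorted(video_titles, key=lambda t: not is_wanted(t))
```

A songlist-only mode could reuse the same `is_wanted` test as a filter instead of a sort key, and the `(artist, title)` pair from `split_artist_title` is what would be written into the file's tags.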
---
## 🚀 Future Enhancements
- [ ] Web UI for easier management
- [ ] More advanced song matching (fuzzy, multi-language)
- [ ] Download scheduling and retry logic
- [ ] More granular status reporting