# 🎤 Karaoke Video Downloader PRD (v2.1)
## ✅ Overview
A Python-based Windows CLI tool to download karaoke videos from YouTube channels/playlists using `yt-dlp.exe`, with advanced tracking, songlist prioritization, and flexible configuration.

---
## 📋 Goals
- Download karaoke videos from YouTube channels or playlists.
- Organize downloads by channel (or playlist) in subfolders.
- Avoid re-downloading the same videos (robust tracking).
- Prioritize and track a custom songlist across channels.
- Allow flexible, user-friendly configuration.
---
## 🧑‍💻 Target Users
- Karaoke DJs, home karaoke users, event hosts, or anyone needing offline karaoke video libraries.
- Users comfortable with command-line tools.
---
## ⚙️ Platform & Stack
- **Platform:** Windows
- **Interface:** Command-line (CLI)
- **Tech Stack:** Python 3.7+, yt-dlp.exe, mutagen (for ID3 tagging)
---
## 📥 Input
- YouTube channel or playlist URLs (e.g. `https://www.youtube.com/@SingKingKaraoke/videos`)
- Optional: `data/channels.txt` file with multiple channel URLs (one per line)
- Optional: `data/songList.json` for prioritized song downloads
### Example Usage
```bash
python download_karaoke.py https://www.youtube.com/@SingKingKaraoke/videos
python download_karaoke.py --file data/channels.txt
python download_karaoke.py --songlist-only
python download_karaoke.py --reset-channel SingKingKaraoke --reset-songlist
python download_karaoke.py --clear-cache SingKingKaraoke
```
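
The `--file` form reads one channel URL per line. A minimal sketch of how such a file could be parsed is shown here; the function names are hypothetical, and skipping blank lines, `#` comments, and duplicate URLs is an assumption rather than documented behavior.

```python
from pathlib import Path
from typing import List


def load_channel_urls(text: str) -> List[str]:
    """Return the URLs found in the contents of a channels file.

    Assumptions (not confirmed by the project): blank lines and lines
    starting with '#' are ignored, and repeated URLs are kept only once.
    """
    urls: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line not in urls:
            urls.append(line)
    return urls


def read_channels_file(path: str = "data/channels.txt") -> List[str]:
    """Read a channels file from disk and parse it."""
    return load_channel_urls(Path(path).read_text(encoding="utf-8"))
```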
---
## 📤 Output
- MP4 files in `downloads/<ChannelName>/` subfolders
- All videos tracked in `data/karaoke_tracking.json`
- Songlist progress tracked in `data/songlist_tracking.json`
- Logs in `logs/`
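
To illustrate how a tracking file such as `data/karaoke_tracking.json` can prevent re-downloads, here is a rough sketch. The JSON layout (a top-level `videos` object keyed by video ID) and all function names are assumptions for illustration; the project's actual schema may differ. The `reset_channel` helper only mirrors the tracking side of a channel reset and does not delete any files.

```python
import json
from pathlib import Path


def load_tracking(path):
    """Load the tracking file, or start empty if it does not exist yet."""
    p = Path(path)
    if not p.exists():
        return {"videos": {}}
    return json.loads(p.read_text(encoding="utf-8"))


def record_status(tracking, video_id, channel, status):
    """Store a status such as 'downloaded', 'partial', or 'failed'."""
    tracking["videos"][video_id] = {"channel": channel, "status": status}


def needs_download(tracking, video_id):
    """A video is skipped only when it is already marked 'downloaded'."""
    entry = tracking["videos"].get(video_id)
    return entry is None or entry["status"] != "downloaded"


def reset_channel(tracking, channel):
    """Remove every entry for one channel; return how many were removed."""
    ids = [vid for vid, e in tracking["videos"].items() if e["channel"] == channel]
    for vid in ids:
        del tracking["videos"][vid]
    return len(ids)


def save_tracking(tracking, path):
    """Write to a temporary file first, then swap it into place."""
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    tmp = p.with_suffix(".tmp")
    tmp.write_text(json.dumps(tracking, indent=2), encoding="utf-8")
    tmp.replace(p)
```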
---
## 🛠️ Features
- ✅ Channel-based downloads (with per-channel folders)
- ✅ Robust JSON tracking (downloaded, partial, failed, etc.)
- ✅ Batch saving and channel video caching for performance
- ✅ Configurable download resolution and yt-dlp options (`data/config.json`)
- ✅ Songlist integration: prioritize and track custom songlists
- ✅ Songlist-only mode: download only songs from the songlist
- ✅ Global songlist tracking to avoid duplicates across channels
- ✅ ID3 tagging for artist/title in MP4 files (mutagen)
- ✅ Real-time progress and detailed logging
- ✅ Automatic cleanup of extra yt-dlp files
- **Reset/clear channel tracking and files via CLI**
- **Clear channel cache via CLI**
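
As an illustration of the configurable resolution and yt-dlp options, the sketch below turns a config dictionary into a yt-dlp command line. The config keys (`resolution`, `yt_dlp_path`, `extra_args`) and the `build_command` helper are hypothetical, since the real `data/config.json` schema is not specified here. The yt-dlp flags used (`-f`, `--merge-output-format`, `-o`) are standard yt-dlp options.

```python
# Hypothetical defaults; the real data/config.json may use different keys.
DEFAULT_CONFIG = {
    "resolution": "720p",
    "yt_dlp_path": "downloader/yt-dlp.exe",
    "extra_args": [],
}


def build_command(config, url, output_dir):
    """Build the yt-dlp argument list for one channel or playlist URL."""
    # "720p" -> 720, used as an upper bound on video height.
    height = int(config["resolution"].rstrip("p"))
    fmt = f"bestvideo[height<={height}]+bestaudio/best[height<={height}]"
    return [
        config["yt_dlp_path"],
        "-f", fmt,
        "--merge-output-format", "mp4",
        "-o", f"{output_dir}/%(title)s.%(ext)s",
        *config["extra_args"],
        url,
    ]


if __name__ == "__main__":
    cmd = build_command(
        DEFAULT_CONFIG,
        "https://www.youtube.com/@SingKingKaraoke/videos",
        "downloads/SingKingKaraoke",
    )
    print(" ".join(cmd))
```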
---
## 📂 Folder Structure
```
KaraokeVideoDownloader/
├── karaoke_downloader/ # All core Python code and utilities
│ ├── downloader.py # Main downloader class
│ ├── cli.py # CLI entry point
│ ├── id3_utils.py # ID3 tagging helpers
│ ├── songlist_manager.py # Songlist logic
│ ├── youtube_utils.py # YouTube helpers
│ ├── tracking_manager.py # Tracking logic
│ ├── check_resolution.py # Resolution checker utility
│ ├── resolution_cli.py # Resolution config CLI
│ └── tracking_cli.py # Tracking management CLI
├── data/ # All config, tracking, cache, and songlist files
│ ├── config.json
│ ├── karaoke_tracking.json
│ ├── songlist_tracking.json
│ ├── channel_cache.json
│ ├── channels.txt
│ └── songList.json
├── downloads/ # All video output
│ └── [ChannelName]/ # Per-channel folders
├── logs/ # Download logs
├── downloader/yt-dlp.exe # yt-dlp binary
├── tests/ # Diagnostic and test scripts
│ └── test_installation.py
├── download_karaoke.py # Main entry point (thin wrapper)
├── README.md
├── PRD.md
├── requirements.txt
└── download_karaoke.bat # (optional Windows launcher)
```
---
## 🚦 CLI Options (Summary)
- `--file <data/channels.txt>`: Download from a list of channels
- `--songlist-priority`: Prioritize songlist songs in download queue
- `--songlist-only`: Download only songs from the songlist
- `--songlist-status`: Show songlist download progress
- `--limit <N>`: Limit number of downloads
- `--resolution <720p|1080p|...>`: Override resolution
- `--status`: Show download/tracking status
- `--reset-channel <CHANNEL_NAME>`: **Reset all tracking and files for a channel**
- `--reset-songlist`: **When used with --reset-channel, also reset songlist songs for this channel**
- `--clear-cache <CHANNEL_ID|all>`: **Clear channel video cache for a specific channel or all**
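
The options listed above could be declared with `argparse` roughly as follows. This is a sketch, not the project's actual `cli.py`: the flag names come from this document, while the `build_parser` function, the help texts, and the optional positional `url` argument are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the CLI options summarized in this PRD."""
    parser = argparse.ArgumentParser(
        prog="download_karaoke.py",
        description="Download karaoke videos from YouTube channels or playlists.",
    )
    # Optional so that modes like --songlist-only or --status work without a URL.
    parser.add_argument("url", nargs="?", help="YouTube channel or playlist URL")
    parser.add_argument("--file", metavar="PATH", help="file with one channel URL per line")
    parser.add_argument("--songlist-priority", action="store_true",
                        help="put songlist songs first in the download queue")
    parser.add_argument("--songlist-only", action="store_true",
                        help="download only songs from the songlist")
    parser.add_argument("--songlist-status", action="store_true",
                        help="show songlist download progress")
    parser.add_argument("--limit", type=int, metavar="N", help="limit number of downloads")
    parser.add_argument("--resolution", metavar="RES", help="override resolution, e.g. 720p")
    parser.add_argument("--status", action="store_true", help="show download/tracking status")
    parser.add_argument("--reset-channel", metavar="CHANNEL_NAME",
                        help="reset all tracking and files for a channel")
    parser.add_argument("--reset-songlist", action="store_true",
                        help="with --reset-channel, also reset songlist songs")
    parser.add_argument("--clear-cache", metavar="CHANNEL_ID|all",
                        help="clear cached video list for one channel or all")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```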
---
## 🧠 Logic Highlights
- **Tracking:** All downloads, statuses, and formats are tracked in JSON files for reliability and deduplication.
- **Songlist:** Loads and normalizes `data/songList.json`, matches against available videos, and prioritizes or restricts downloads accordingly.
- **Batch/Caching:** Channel video lists are cached to minimize API calls; tracking is batch-saved for performance.
- **ID3 Tagging:** Artist and title are parsed from the video title and embedded as metadata tags in the MP4 files using mutagen.
- **Cleanup:** Extra files from yt-dlp (e.g., `.info.json`) are automatically removed after download.
- **Reset/Clear:** Use `--reset-channel` to reset all tracking and files for a channel (optionally including songlist songs with `--reset-songlist`). Use `--clear-cache` to clear cached video lists for a channel or all channels.
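
The songlist matching and artist/title extraction described above can be sketched as follows. Several details are assumptions made for illustration: that video titles look like `Artist - Title (Karaoke Version)`, that `data/songList.json` entries carry `artist` and `title` keys, and that matching is an exact comparison after normalization. All function names are hypothetical, and the ASCII-only normalization would need extending for non-English titles.

```python
import re

# Matches bracketed suffixes such as "(Karaoke Version)" or "[HD]".
_BRACKETS = re.compile(r"\(.*?\)|\[.*?\]")


def normalize(text):
    """Lowercase, drop bracketed suffixes and punctuation, collapse spaces."""
    text = _BRACKETS.sub(" ", text.lower())
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return " ".join(text.split())


def parse_video_title(video_title):
    """Split 'Artist - Title (Karaoke Version)' into (artist, title).

    Returns None when the title has no ' - ' separator.
    """
    cleaned = _BRACKETS.sub("", video_title).strip()
    if " - " not in cleaned:
        return None
    artist, title = cleaned.split(" - ", 1)
    return artist.strip(), title.strip()


def match_songlist(songs, video_titles):
    """Return the video titles whose artist/title pair is on the songlist."""
    wanted = {(normalize(s["artist"]), normalize(s["title"])) for s in songs}
    matches = []
    for video_title in video_titles:
        parsed = parse_video_title(video_title)
        if parsed is None:
            continue
        if (normalize(parsed[0]), normalize(parsed[1])) in wanted:
            matches.append(video_title)
    return matches


if __name__ == "__main__":
    songs = [{"artist": "Queen", "title": "Bohemian Rhapsody"}]
    videos = [
        "Queen - Bohemian Rhapsody (Karaoke Version)",
        "ABBA - Dancing Queen (Karaoke Version)",
    ]
    print(match_songlist(songs, videos))
```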
---
## 🚀 Future Enhancements
- [ ] Web UI for easier management
- [ ] More advanced song matching (fuzzy, multi-language)
- [ ] Download scheduling and retry logic
- [ ] More granular status reporting