# 🎤 Karaoke Video Downloader

A Python-based Windows CLI tool to download karaoke videos from YouTube channels/playlists using `yt-dlp.exe`, with advanced tracking, songlist prioritization, and flexible configuration.

## ✨ Features
- 🎵 **Channel & Playlist Downloads**: Download all videos from a YouTube channel or playlist
- 📂 **Organized Storage**: Each channel gets its own folder in `downloads/`
- 📝 **Robust Tracking**: Tracks all downloads, statuses, and formats in JSON
- 🏆 **Songlist Prioritization**: Prioritize or restrict downloads to a custom songlist
- 🔄 **Batch Saving & Caching**: Efficient, minimizes API calls
- 🏷️ **ID3 Tagging**: Adds artist/title metadata to MP4 files
- 🧹 **Automatic Cleanup**: Removes extra yt-dlp files
- 📈 **Real-Time Progress**: Detailed console and log output
- 🧹 **Reset/Clear Channel**: Reset all tracking and files for a channel, or clear channel cache via CLI
- 🗂️ **Latest-per-channel download**: Download the latest N videos from each channel in a single batch, with server deduplication, fuzzy matching support, per-channel download plan, robust resume, and unique plan cache. Use `--latest-per-channel` and `--limit N`.
- 🧩 **Enhanced Fuzzy Matching**: Advanced fuzzy string matching for songlist-to-video matching with improved video title parsing (handles multiple title formats like "Title Karaoke | Artist Karaoke Version")
- ⚡ **Fast Mode with Early Exit**: When a limit is set, scans channels and songs in order, downloads immediately when a match is found, and stops as soon as the limit is reached with successful downloads
- 🔄 **Deduplication Across Channels**: Ensures the same song is not downloaded from multiple channels, even if it appears in more than one channel's video list
- 📋 **Default Channel File**: Automatically uses `data/channels.txt` as the default channel list for songlist modes (no need to specify `--file` every time)
- 🛡️ **Robust Interruption Handling**: Progress is saved after each download, preventing re-downloads if the process is interrupted
- ⚡ **Optimized Scanning**: High-performance channel scanning with O(n×m) complexity, pre-processed lookups, and early termination for faster matching
- 🏷️ **Server Duplicates Tracking**: Automatically checks against local `songs.json` file and marks duplicates for future skipping, preventing re-downloads of songs already on the server
- ⚡ **Parallel Downloads**: Enable concurrent downloads with `--parallel --workers N` for significantly faster batch downloads (3-5x speedup)
- 📊 **Unmatched Songs Reports**: Generate detailed reports of songs that couldn't be found in any channel with `--generate-unmatched-report`
- 🛡️ **Duplicate File Prevention**: Automatically detects and prevents duplicate files with `(2)`, `(3)` suffixes, with cleanup utility for existing duplicates
- 🏷️ **Consistent Metadata**: Filename and ID3 tag use identical artist/title format for clear file identification

## 🏗️ Architecture
The codebase has been comprehensively refactored into a modular architecture with centralized utilities for improved maintainability, error handling, and code reuse:

### Core Modules:
- **`downloader.py`**: Main orchestrator and CLI interface
- **`video_downloader.py`**: Core video download execution and orchestration
- **`tracking_manager.py`**: Download tracking and status management
- **`download_planner.py`**: Download plan building and channel scanning
- **`cache_manager.py`**: Cache operations and file I/O management
- **`channel_manager.py`**: Channel and file management operations
- **`songlist_manager.py`**: Songlist operations and tracking
- **`server_manager.py`**: Server song availability checking
- **`fuzzy_matcher.py`**: Fuzzy matching logic and similarity functions

### Utility Modules (v3.2):
- **`youtube_utils.py`**: Centralized YouTube operations and yt-dlp command generation
- **`error_utils.py`**: Standardized error handling and formatting
- **`download_pipeline.py`**: Abstracted download → verify → tag → track pipeline
- **`id3_utils.py`**: ID3 tagging utilities
- **`config_manager.py`**: Configuration management
- **`resolution_cli.py`**: Resolution checking utilities
- **`tracking_cli.py`**: Tracking management CLI

### New Utility Modules (v3.3):
- **`parallel_downloader.py`**: Parallel download management with thread-safe operations
  - `ParallelDownloader` class: Manages concurrent downloads with configurable workers
  - `DownloadTask` and `DownloadResult` dataclasses: Structured task and result management
  - Thread-safe progress tracking and error handling
  - Automatic retry mechanism for failed downloads
- **`file_utils.py`**: Centralized file operations, filename sanitization, and file validation
  - `sanitize_filename()`: Create safe filenames from artist/title
  - `generate_possible_filenames()`: Generate filename patterns for different modes
  - `check_file_exists_with_patterns()`: Check for existing files using multiple patterns
  - `is_valid_mp4_file()`: Validate MP4 files with header checking
  - `cleanup_temp_files()`: Remove temporary yt-dlp files
  - `ensure_directory_exists()`: Safe directory creation
- **`song_validator.py`**: Centralized song validation logic
  - `SongValidator` class: Unified logic for checking if songs should be downloaded
  - `should_skip_song()`: Comprehensive validation with multiple criteria
  - `mark_song_failed()`: Consistent failure tracking
  - `handle_download_failure()`: Standardized error handling
- **Enhanced `config_manager.py`**: Robust configuration management with dataclasses
  - `ConfigManager` class: Type-safe configuration loading and caching
  - `DownloadSettings`, `FolderStructure`, `LoggingConfig` dataclasses
  - Configuration validation and merging with defaults
  - Dynamic resolution updates

### Benefits:
- **Centralized Utilities**: Common operations (file operations, song validation, yt-dlp commands, error handling) are centralized
- **Reduced Duplication**: Eliminated ~150 lines of code duplication across modules
- **Consistency**: Standardized error messages and processing pipelines
- **Maintainability**: Changes isolated to specific modules
- **Testability**: Modular components can be tested independently
- **Type Safety**: Comprehensive type hints across all new modules

## 🔧 Recent Improvements (v3.4.1)

### **Enhanced Fuzzy Matching**
- **Improved video title parsing**: The `extract_artist_title` function now handles multiple title formats:
  - `"Title Karaoke | Artist Karaoke Version"` → Artist: "38 Special", Title: "Hold On Loosely"
  - `"Title Artist KARAOKE"` → Attempts to extract artist from complex titles
  - `"Artist - Title"` → Standard format (unchanged)
- **Consolidated parsing logic**: All modules now use the same `extract_artist_title` function from `fuzzy_matcher.py`
- **Better matching accuracy**: Reduced false negatives for songs with non-standard title formats

### **Fixed --limit Parameter**
- **Correct limit application**: The `--limit` parameter now properly limits the scanning phase, not just downloads
- **Improved performance**: When using `--limit N`, only the first N songs are scanned, significantly reducing processing time
- **Accurate logging**: Logging messages now show the correct counts for songs that will actually be processed when using `--limit`

### **Code Quality Improvements**
- **Eliminated duplicate functions**: Removed duplicate `extract_artist_title` implementations
- **Fixed import conflicts**: Resolved inconsistencies between different parsing implementations
- **Single source of truth**: All title parsing logic is now centralized in `fuzzy_matcher.py`

## 🛡️ Duplicate File Prevention & Filename Consistency (v3.4.2)

### **Duplicate File Prevention**
- **Enhanced file existence checking**: Now detects files with `(2)`, `(3)`, etc. suffixes that yt-dlp creates
- **Automatic duplicate prevention**: Skips downloads when files already exist (including duplicates)
- **Updated yt-dlp configuration**: Set `"nooverwrites": false` to prevent yt-dlp from creating duplicate files
- **Cleanup utility**: `data/cleanup_duplicate_files.py` helps identify and remove existing duplicate files

### **Filename vs ID3 Tag Consistency**
- **Consistent metadata**: Filename and ID3 tag now use identical artist/title format
- **Removed extra suffixes**: No more "(Karaoke Version)" in ID3 tags that don't match filenames
- **Unified parsing**: Both filename generation and ID3 tagging use the same artist/title extraction

### **Benefits**
- ✅ **No more duplicate files** with `(2)`, `(3)` suffixes
- ✅ **Consistent metadata** between filename and ID3 tags
- ✅ **Efficient disk usage** by preventing unnecessary downloads
- ✅ **Clear file identification** with consistent naming

### **Clean Up Existing Duplicates**
```bash
# Run the cleanup utility to find and remove existing duplicates
python data/cleanup_duplicate_files.py

# Choose option 1 for dry run (recommended first)
# Choose option 2 to actually delete duplicates
```

## 📋 Requirements
- **Windows 10/11**
- **Python 3.7+**
- **yt-dlp.exe** (in `downloader/`)
- **mutagen** (for ID3 tagging, optional)
- **ffmpeg/ffprobe** (for video validation, optional but recommended)
- **rapidfuzz** (for fuzzy matching, optional, falls back to difflib)

## 🚀 Quick Start

> **💡 Pro Tip**: For a complete list of all available commands, see `commands.txt` - you can copy/paste any command directly into your terminal!

### Download a Channel
```bash
python download_karaoke.py https://www.youtube.com/@SingKingKaraoke/videos
```

### Download ALL Videos from a Channel (Not Just Songlist Matches)
```bash
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos
```

### Download ALL Videos with Parallel Processing
```bash
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --parallel --workers 10
```

### Download ALL Videos with Limit
```bash
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --limit 100
```

### Download Only Songlist Songs (Fast Mode)
```bash
python download_karaoke.py --songlist-only --limit 5
```

### Download with Parallel Processing
```bash
python download_karaoke.py --parallel --songlist-only --limit 10
```

### Focus on Specific Playlists by Title
```bash
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" "2024 - Billboard Hot 100"
```

### Focus on Specific Playlists from Custom File
```bash
python download_karaoke.py --songlist-focus "CCKaraoke" --songlist-file "data/my_custom_songlist.json"
```

### Force Download from Channels (Bypass All Existing File Checks)
```bash
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --force
```

### Download with Fuzzy Matching
```bash
python download_karaoke.py --songlist-only --limit 10 --fuzzy-match --fuzzy-threshold 85
```

### Download Latest N Videos Per Channel
```bash
python download_karaoke.py --latest-per-channel --limit 5
```

### Download Latest N Videos Per Channel (with fuzzy matching)
```bash
python download_karaoke.py --latest-per-channel --limit 5 --fuzzy-match --fuzzy-threshold 85
```

### Prioritize Songlist in Download Queue
```bash
python download_karaoke.py --songlist-priority
```

### Show Songlist Download Progress
```bash
python download_karaoke.py --songlist-status
```

### Limit Number of Downloads
```bash
python download_karaoke.py --limit 5
```

### Override Resolution
```bash
python download_karaoke.py --resolution 1080p
```

### **Reset/Start Over for a Channel**
```bash
python download_karaoke.py --reset-channel SingKingKaraoke
```

### **Reset Channel and Songlist Songs**
```bash
python download_karaoke.py --reset-channel SingKingKaraoke --reset-songlist
```

### **Clear Channel Cache**
```bash
python download_karaoke.py --clear-cache SingKingKaraoke
python download_karaoke.py --clear-cache all
```

## 🧠 Songlist Integration
- Place your prioritized song list in `data/songList.json` (see example format below).
- The tool will match and prioritize these songs across all available channel videos.
- Use `--songlist-only` to download only these songs, or `--songlist-priority` to prioritize them in the queue.
- Use `--songlist-focus` to download only songs from specific playlists by title (e.g., `--songlist-focus "2025 - Apple Top 50" "2024 - Billboard Hot 100"`).
- Download progress for the songlist is tracked globally in `data/songlist_tracking.json`.
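
As a rough illustration of how `--songlist-focus` could narrow the songlist, here is a minimal Python sketch. It assumes the playlist/songs layout this README uses for `data/songList.json`; the function name `load_focused_songs` and the exact-title comparison are hypothetical and are not taken from the project's `songlist_manager.py`.

```python
import json
from pathlib import Path
from typing import Dict, List, Optional


def load_focused_songs(
    songlist_path: str, focus_titles: Optional[List[str]] = None
) -> List[Dict[str, str]]:
    """Flatten playlists into one song list, optionally keeping only focused playlists.

    Hypothetical helper: assumes a JSON array of playlists, each with a
    "title" and a "songs" array of {"artist", "title", ...} objects.
    """
    playlists = json.loads(Path(songlist_path).read_text(encoding="utf-8"))
    # Exact playlist-title matching is an assumption made for this sketch.
    wanted = set(focus_titles) if focus_titles else None

    songs: List[Dict[str, str]] = []
    for playlist in playlists:
        if wanted is not None and playlist["title"] not in wanted:
            continue  # playlist is not one of the focused titles
        for song in playlist.get("songs", []):
            songs.append(
                {
                    "artist": song["artist"],
                    "title": song["title"],
                    "playlist": playlist["title"],
                }
            )
    return songs
```

Calling `load_focused_songs("data/songList.json", ["2025 - Apple Top 50"])` would return only the songs from that playlist, while omitting the second argument returns every song in the file.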

#### Example `data/songList.json`
```json
[
  {
    "title": "2025 - Apple Top 50",
    "songs": [
      {
        "artist": "Kendrick Lamar & SZA",
        "title": "luther",
        "position": 1
      },
      {
        "artist": "Kendrick Lamar",
        "title": "Not Like Us",
        "position": 2
      }
    ]
  },
  {
    "title": "2024 - Billboard Hot 100",
    "songs": [
      {
        "artist": "Taylor Swift",
        "title": "Cruel Summer",
        "position": 1
      },
      {
        "artist": "Billie Eilish",
        "title": "Happier Than Ever",
        "position": 2
      }
    ]
  }
]
```

## 🛠️ Tracking & Caching
- **data/karaoke_tracking.json**: Tracks all downloads, statuses, and formats
- **data/songlist_tracking.json**: Tracks global songlist download progress
- **data/server_duplicates_tracking.json**: Tracks songs found to be duplicates on the server for future skipping
- **data/channel_cache.json**: Caches channel video lists for performance

## 📂 Folder Structure
```
KaroakeVideoDownloader/
├── commands.txt               # Complete CLI commands reference (copy/paste ready)
├── karaoke_downloader/        # All core Python code and utilities
│   ├── downloader.py          # Main orchestrator and CLI interface
│   ├── cli.py                 # CLI entry point
│   ├── video_downloader.py    # Core video download execution and orchestration
│   ├── tracking_manager.py    # Download tracking and status management
│   ├── download_planner.py    # Download plan building and channel scanning
│   ├── cache_manager.py       # Cache operations and file I/O management
│   ├── channel_manager.py     # Channel and file management operations
│   ├── songlist_manager.py    # Songlist operations and tracking
│   ├── server_manager.py      # Server song availability checking
│   ├── fuzzy_matcher.py       # Fuzzy matching logic and similarity functions
│   ├── youtube_utils.py       # Centralized YouTube operations and yt-dlp commands
│   ├── error_utils.py         # Standardized error handling and formatting
│   ├── download_pipeline.py   # Abstracted download → verify → tag → track pipeline
│   ├── id3_utils.py           # ID3 tagging utilities
│   ├── config_manager.py      # Configuration management with dataclasses
│   ├── file_utils.py          # Centralized file operations and filename handling
│   ├── song_validator.py      # Centralized song validation logic
│   ├── check_resolution.py    # Resolution checker utility
│   ├── resolution_cli.py      # Resolution config CLI
│   └── tracking_cli.py        # Tracking management CLI
├── data/                      # All config, tracking, cache, and songlist files
│   ├── config.json
│   ├── karaoke_tracking.json
│   ├── songlist_tracking.json
│   ├── channel_cache.json
│   ├── channels.txt
│   └── songList.json
├── downloads/                 # All video output
│   └── [ChannelName]/         # Per-channel folders
├── logs/                      # Download logs
├── downloader/yt-dlp.exe      # yt-dlp binary
├── tests/                     # Diagnostic and test scripts
│   └── test_installation.py
├── download_karaoke.py        # Main entry point (thin wrapper)
├── README.md
├── PRD.md
├── requirements.txt
└── download_karaoke.bat       # (optional Windows launcher)
```

## 🚦 CLI Options

> **📋 Complete Command Reference**: See `commands.txt` for all available commands with examples - perfect for copy/paste!

### Key Options:
- `--file`: Download from a list of channels (optional, defaults to `data/channels.txt` for songlist modes)
- `--songlist-priority`: Prioritize songlist songs in download queue
- `--songlist-only`: Download only songs from the songlist
- `--songlist-focus ...`: Focus on specific playlists by title (e.g., `--songlist-focus "2025 - Apple Top 50" "2024 - Billboard Hot 100"`)
- `--songlist-file`: Custom songlist file path to use with `--songlist-focus` (default: `data/songList.json`)
- `--songlist-status`: Show songlist download progress
- `--limit`: Limit number of downloads (enables fast mode with early exit)
- `--resolution <720p|1080p|...>`: Override resolution
- `--status`: Show download/tracking status
- `--reset-channel`: **Reset all tracking and files for a channel**
- `--reset-songlist`: **When used with `--reset-channel`, also reset songlist songs for this channel**
- `--clear-cache`: **Clear channel video cache for a specific channel or all**
- `--clear-server-duplicates`: **Clear server duplicates tracking (allows re-checking songs against server)**
- `--latest-per-channel`: **Download the latest N videos from each channel (use with `--limit`)**
- `--fuzzy-match`: Enable fuzzy matching for songlist-to-video matching (uses rapidfuzz if available)
- `--fuzzy-threshold`: Fuzzy match threshold (0-100, default 85)
- `--parallel`: Enable parallel downloads for improved speed (defaults to 3 workers)
- `--workers`: Number of parallel download workers (1-10, default: 3, only used with `--parallel`)
- `--generate-songlist ...`: **Generate song list from MP4 files with ID3 tags in specified directories**
- `--no-append-songlist`: **Create a new song list instead of appending when using `--generate-songlist`**
- `--force`: **Force download from channels, bypassing all existing file checks and re-downloading if necessary**
- `--channel-focus`: **Download from a specific channel by name (e.g., 'SingKingKaraoke')**
- `--all-videos`: **Download all videos from channel (not just songlist matches), skipping existing files**

## 📝 Example Usage

> **💡 For complete examples**: See `commands.txt` for all command variations with explanations!

```bash
# Fast mode with fuzzy matching (no need to specify --file)
python download_karaoke.py --songlist-only --limit 10 --fuzzy-match --fuzzy-threshold 85

# Parallel downloads for faster processing
python download_karaoke.py --parallel --songlist-only --limit 10

# Latest videos per channel with parallel downloads
python download_karaoke.py --parallel --latest-per-channel --limit 5

# Traditional full scan (no limit)
python download_karaoke.py --songlist-only

# Focused fuzzy matching (target specific playlists with flexible matching)
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --fuzzy-match --fuzzy-threshold 80 --limit 10

# Focus on specific playlists from a custom file
python download_karaoke.py --songlist-focus "CCKaraoke" --songlist-file "data/my_custom_songlist.json" --limit 10

# Force download with fuzzy matching (bypass all existing file checks)
python download_karaoke.py --songlist-focus "2025 - Apple Top 50" --force --fuzzy-match --fuzzy-threshold 80 --limit 10

# Channel-specific operations
python download_karaoke.py --reset-channel SingKingKaraoke
python download_karaoke.py --reset-channel SingKingKaraoke --reset-songlist
python download_karaoke.py --clear-cache all
python download_karaoke.py --clear-server-duplicates

# Download ALL videos from a specific channel
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --parallel --workers 10
python download_karaoke.py --channel-focus SingKingKaraoke --all-videos --limit 100

# Song list generation from MP4 files
python download_karaoke.py --generate-songlist /path/to/mp4/directory
python download_karaoke.py --generate-songlist /path/to/dir1 /path/to/dir2 --no-append-songlist

# Generate report of songs that couldn't be found
python download_karaoke.py --generate-unmatched-report
python download_karaoke.py --generate-unmatched-report --fuzzy-match --fuzzy-threshold 85
```

## 🏷️ ID3 Tagging
- Adds artist/title/album/genre to MP4 files using mutagen (if installed)

## 📋 Song List Generation
- **Generate song lists from existing MP4 files**: Use `--generate-songlist` to create song lists from directories containing MP4 files with ID3 tags
- **Automatic ID3 extraction**: Extracts artist and title from MP4 files' ID3 tags
- **Directory-based organization**: Each directory becomes a playlist with the directory name as the title
- **Position tracking**: Songs are numbered starting from 1 based on file order
- **Append or replace**: Choose to append to existing song list or create a new one with `--no-append-songlist`
- **Multiple directories**: Process multiple directories in a single command

## 🧹 Cleanup
- Removes `.info.json` and `.meta` files after download

## 🛠️ Configuration
- All options are in `data/config.json` (format, resolution, metadata, etc.)
- You can edit this file or use CLI flags to override

## 📋 Command Reference File

**`commands.txt`** contains a comprehensive list of all CLI commands with explanations. This file is designed for easy copy/paste usage and includes:

- All basic download commands
- Songlist operations
- Latest-per-channel downloads
- Cache and tracking management
- Reset and cleanup operations
- Advanced combinations
- Common workflows
- Troubleshooting commands

> **🔄 Maintenance Note**: The `commands.txt` file should be kept up to date with any CLI changes. When adding new command-line options or modifying existing ones, update this file to reflect all available commands and their usage.

## 📚 Documentation Standards

### **Documentation Location**
- **All changes, refactoring, and improvements should be documented in the PRD.md and README.md files**
- **Do NOT create separate .md files for documenting changes, refactoring, or improvements**
- **Use the existing sections in PRD.md and README.md to track all project evolution**

### **Where to Document Changes**
- **PRD.md**: Technical details, architecture changes, bug fixes, and implementation specifics
- **README.md**: User-facing features, usage instructions, and high-level improvements
- **CHANGELOG.md**: Version-specific release notes and change summaries

### **Documentation Requirements**
- **All new features must be documented in both PRD.md and README.md**
- **All refactoring efforts must be documented in the appropriate sections**
- **All bug fixes must be documented with technical details**
- **Version numbers and dates should be clearly marked**
- **Benefits and improvements should be explicitly stated**

### **Maintenance Responsibility**
- **Keep PRD.md and README.md synchronized with code changes**
- **Update documentation immediately when implementing new features**
- **Remove outdated information and consolidate related changes**
- **Ensure all CLI options and features are documented in both files**

## 🔧 Refactoring Improvements (v3.3)
The codebase has been comprehensively refactored to improve maintainability and reduce code duplication.
Recent improvements have enhanced reliability, performance, and code organization:

### **New Utility Modules (v3.3)**
- **`file_utils.py`**: Centralized file operations, filename sanitization, and file validation
  - `sanitize_filename()`: Create safe filenames from artist/title
  - `generate_possible_filenames()`: Generate filename patterns for different modes
  - `check_file_exists_with_patterns()`: Check for existing files using multiple patterns
  - `is_valid_mp4_file()`: Validate MP4 files with header checking
  - `cleanup_temp_files()`: Remove temporary yt-dlp files
  - `ensure_directory_exists()`: Safe directory creation
- **`song_validator.py`**: Centralized song validation logic
  - `SongValidator` class: Unified logic for checking if songs should be downloaded
  - `should_skip_song()`: Comprehensive validation with multiple criteria
  - `mark_song_failed()`: Consistent failure tracking
  - `handle_download_failure()`: Standardized error handling
- **Enhanced `config_manager.py`**: Robust configuration management with dataclasses
  - `ConfigManager` class: Type-safe configuration loading and caching
  - `DownloadSettings`, `FolderStructure`, `LoggingConfig` dataclasses
  - Configuration validation and merging with defaults
  - Dynamic resolution updates

### **Benefits Achieved**
- **Eliminated Code Duplication**: ~150 lines of duplicate code removed across modules
- **Centralized File Operations**: Single source of truth for filename handling and file validation
- **Unified Song Validation**: Consistent logic for checking if songs should be downloaded
- **Enhanced Type Safety**: Comprehensive type hints across all new modules
- **Improved Configuration Management**: Structured configuration with validation and caching
- **Better Error Handling**: Consistent patterns via centralized utilities
- **Enhanced Maintainability**: Changes to file operations or song validation only require updates in one place
- **Improved Testability**: Modular components can be tested independently
- **Better Developer Experience**: Clear function signatures and comprehensive documentation

### **New Parallel Download System (v3.4)**
- **Parallel downloader module:** `parallel_downloader.py` provides thread-safe concurrent download management
- **Configurable concurrency:** Use `--parallel` to enable parallel downloads with 3 workers by default, or `--parallel --workers N` for custom worker count (1-10)
- **Thread-safe operations:** All tracking, caching, and progress operations are thread-safe
- **Real-time progress tracking:** Shows active downloads, completion status, and overall progress
- **Automatic retry mechanism:** Failed downloads are automatically retried with reduced concurrency
- **Backward compatibility:** Sequential downloads remain the default when `--parallel` is not used
- **Performance improvements:** Significantly faster downloads for large batches (3-5x speedup with 3-5 workers)
- **Integrated with all modes:** Works with both songlist-across-channels and latest-per-channel download modes

### **Previous Improvements (v3.2)**
- **Centralized yt-dlp Command Generation**: Standardized command building and execution across all download operations
- **Enhanced Error Handling**: Structured exception hierarchy with consistent error messages and formatting
- **Abstracted Download Pipeline**: Reusable download → verify → tag → track process for consistent processing
- **Download plan pre-scan:** Before downloading, the tool scans all channels for songlist matches, builds a download plan, and prints stats (matches, unmatched, per-channel breakdown). The plan is cached for 1 day and reused unless `--force-download-plan` is set.
- **Latest-per-channel plan:** Download the latest N videos from each channel, with a per-channel plan and robust resume. Each channel is removed from the plan as it completes. Plan cache is deleted when all channels are done.
- **Fast mode with early exit:** When a limit is set, the tool scans channels and songs in order, downloads immediately when a match is found, and stops as soon as the limit is reached with successful downloads. This provides much faster performance for small limits compared to the full pre-scan approach.
- **Deduplication across channels:** Tracks unique song keys (artist + normalized title) to ensure the same song is not downloaded from multiple channels, even if it appears in more than one channel's video list.
- **Fuzzy matching:** Uses string similarity algorithms to find approximate matches between songlist entries and video titles, tolerating minor differences, typos, or extra words like "Karaoke" or "Official Video".
- **Default channel file:** For songlist-only and latest-per-channel modes, if no `--file` is specified, automatically uses `data/channels.txt` as the default channel list, reducing the need to specify the file path repeatedly.
- **Robust interruption handling:** Progress is saved after each download, and files are checked for existence before downloading to prevent re-downloads if the process is interrupted.
- **Optimized scanning algorithm:** High-performance channel scanning with O(n×m) complexity, pre-processed song lookups using sets and dictionaries, and early termination for faster matching of large songlists and channels.
- **Enhanced cache management:** Improved channel cache key handling for better cache hit rates and reduced YouTube API calls.
- **Robust download plan execution:** Fixed index management in download plan execution to prevent errors during interrupted downloads.
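
The fuzzy matching and cross-channel deduplication described above can be sketched as follows. This is a minimal illustration, not the project's `fuzzy_matcher.py`: the helper names (`normalize`, `song_key`, `pick_matches`), the noise-word list, and the `artist|title` key format are all assumptions. It uses `rapidfuzz` when installed and falls back to `difflib`, mirroring the optional dependency listed under Requirements.

```python
import re
from difflib import SequenceMatcher
from typing import List, Set, Tuple

try:  # rapidfuzz is optional; difflib is the stdlib fallback
    from rapidfuzz import fuzz

    def similarity(a: str, b: str) -> float:
        """Return a 0-100 similarity score using rapidfuzz."""
        return float(fuzz.ratio(a, b))

except ImportError:

    def similarity(a: str, b: str) -> float:
        """Return a 0-100 similarity score using difflib."""
        return SequenceMatcher(None, a, b).ratio() * 100


# Hypothetical noise words; the real tool may strip a different set.
NOISE = re.compile(r"\b(karaoke|version|official|video)\b")


def normalize(text: str) -> str:
    """Lowercase, drop noise words and punctuation, collapse whitespace."""
    text = NOISE.sub(" ", text.lower())
    text = re.sub(r"[^\w\s]+", " ", text)
    return " ".join(text.split())


def song_key(artist: str, title: str) -> str:
    """Build a deduplication key from normalized artist and title (assumed format)."""
    return f"{normalize(artist)}|{normalize(title)}"


def pick_matches(
    songs: List[Tuple[str, str]],
    videos: List[Tuple[str, str]],
    threshold: float = 85.0,
) -> List[Tuple[str, str, str]]:
    """Plan at most one download per song.

    songs holds (artist, title) pairs; videos holds (channel, video_title) pairs.
    """
    seen: Set[str] = set()
    plan: List[Tuple[str, str, str]] = []
    for artist, title in songs:
        key = song_key(artist, title)
        if key in seen:
            continue  # already planned, e.g. listed in two playlists
        target = normalize(f"{artist} {title}")
        for channel, video_title in videos:
            if similarity(target, normalize(video_title)) >= threshold:
                plan.append((key, channel, video_title))
                seen.add(key)
                break  # first matching channel wins; later channels are skipped
    return plan
```

Breaking out of the inner loop on the first hit is what keeps a song from being planned twice when several channels carry it, and the `seen` set covers the case where the same song appears in more than one playlist.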
## ๐Ÿž Troubleshooting - Ensure `yt-dlp.exe` is in the `downloader/` folder - Check `logs/` for error details - Use `python -m karaoke_downloader.check_resolution` to verify video quality - If you see errors about ffmpeg/ffprobe, install [ffmpeg](https://ffmpeg.org/download.html) and ensure it is in your PATH - For best fuzzy matching, install rapidfuzz: `pip install rapidfuzz` (otherwise falls back to slower, less accurate difflib) --- **Happy Karaoke! ๐ŸŽค**